1
Iron uptake and transport by the carboxymycobactin-mycobactin siderophore machinery of Mycobacterium tuberculosis is dependent on the iron-regulated protein HupB. Biometals 2021; 34:511-528. [PMID: 33609202 DOI: 10.1007/s10534-021-00292-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 06/25/2020] [Accepted: 02/07/2021] [Indexed: 12/27/2022]
Abstract
Iron-starved Mycobacterium tuberculosis utilises the carboxymycobactin-mycobactin siderophore machinery to acquire iron. These two siderophores have high affinity for ferric iron and can withdraw the metal ion from insoluble iron hydroxides and iron-binding proteins. We first reported HupB, a multi-functional mycobacterial protein, to be associated with iron acquisition in M. tuberculosis. This 28 kDa cell wall protein, upregulated upon iron limitation, functions as a transcriptional activator of mycobactin biosynthesis and is essential for the pathogen to survive inside macrophages. The focus of this study is to understand the role of HupB in iron uptake and transport by the carboxymycobactin-mycobactin siderophore machinery in M. tuberculosis. Experimental approaches included radiolabelled iron uptake studies by viable organisms and protein-ligand binding studies using the purified HupB and the two siderophores. Uptake of 55Fe-carboxymycobactin by wild type M. tuberculosis (WT M.tb.H37Rv) and not by the hupB KO mutant (M.tb.ΔhupB) showed that HupB is necessary for the uptake of ferri-carboxymycobactin. Additionally, the radiolabel recovery was high in HupB-incorporated liposomes upon addition of the labelled siderophore. Bioinformatic and experimental studies using spectrofluorimetry, CD analysis and surface plasmon resonance confirmed the binding of HupB not only with ferri-carboxymycobactin and ferri-mycobactin but also with free iron. In conclusion, HupB is established as a ferri-carboxymycobactin receptor and, by virtue of its ability to bind ferric iron, functions as a transporter of ferric iron from the extracellular siderophore to mycobactin within the cell envelope.
2
Studer G, Tauriello G, Bienert S, Biasini M, Johner N, Schwede T. ProMod3-A versatile homology modelling toolbox. PLoS Comput Biol 2021; 17:e1008667. [PMID: 33507980 PMCID: PMC7872268 DOI: 10.1371/journal.pcbi.1008667] [Citation(s) in RCA: 142] [Impact Index Per Article: 47.3] [Received: 07/16/2020] [Revised: 02/09/2021] [Accepted: 01/03/2021] [Indexed: 11/18/2022]
Abstract
Computational methods for protein structure modelling are routinely used to complement experimental structure determination and thus help to address a broad spectrum of scientific questions in biomedical research. The most accurate methods today are based on homology modelling, i.e. detecting a homologue to the desired target sequence that can be used as a template for modelling. Here we present a versatile open source homology modelling toolbox as a foundation for flexible and computationally efficient modelling workflows. ProMod3 is a fully scriptable software platform that can perform all steps required to generate a protein model by homology. Its modular design aims at fast prototyping of novel algorithms and implementing flexible modelling pipelines. Common modelling tasks, such as loop modelling, sidechain modelling or generating a full protein model by homology, are provided as production-ready pipelines, forming the starting point for users' own developments and enhancements. ProMod3 is the central software component of the widely used SWISS-MODEL web-server.
Affiliation(s)
- Gabriel Studer
  - Biozentrum, University of Basel, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Gerardo Tauriello
  - Biozentrum, University of Basel, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Stefan Bienert
  - Biozentrum, University of Basel, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Marco Biasini
  - Biozentrum, University of Basel, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Niklaus Johner
  - Biozentrum, University of Basel, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Torsten Schwede
  - Biozentrum, University of Basel, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
3
Energy propagation along polypeptide α-helix: Experimental data and ab initio zone structure. Biosystems 2019; 185:104016. [DOI: 10.1016/j.biosystems.2019.104016] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 05/25/2019] [Revised: 08/09/2019] [Accepted: 08/09/2019] [Indexed: 12/11/2022]
4
Kundert K, Kortemme T. Computational design of structured loops for new protein functions. Biol Chem 2019; 400:275-288. [PMID: 30676995 PMCID: PMC6530579 DOI: 10.1515/hsz-2018-0348] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 08/16/2018] [Accepted: 12/18/2018] [Indexed: 12/20/2022]
Abstract
The ability to engineer the precise geometries, fine-tuned energetics and subtle dynamics that are characteristic of functional proteins is a major unsolved challenge in the field of computational protein design. In natural proteins, functional sites exhibiting these properties often feature structured loops. However, unlike the elements of secondary structures that comprise idealized protein folds, structured loops have been difficult to design computationally. Addressing this shortcoming in a general way is a necessary first step towards the routine design of protein function. In this perspective, we will describe the progress that has been made on this problem and discuss how recent advances in the field of loop structure prediction can be harnessed and applied to the inverse problem of computational loop design.
Affiliation(s)
- Kale Kundert
  - Graduate Group in Biophysics, University of California San Francisco, San Francisco, CA 94158, USA
  - Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA 94158, USA
- Tanja Kortemme
  - Graduate Group in Biophysics, University of California San Francisco, San Francisco, CA 94158, USA
  - Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA 94158, USA
  - Chan Zuckerberg Biohub, 499 Illinois St, San Francisco, CA 94158, USA
5
Choudhury TP, Gupta L, Kumar S. Identification, characterization and expression analysis of Anopheles stephensi double peroxidase. Acta Trop 2019; 190:210-219. [PMID: 30352205 DOI: 10.1016/j.actatropica.2018.10.008] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 07/22/2018] [Revised: 10/13/2018] [Accepted: 10/14/2018] [Indexed: 11/28/2022]
Abstract
Peroxidases catalyze the reduction of peroxides and, in turn, oxidize various substrates. They have been widely reported to play an important role in mosquito innate immunity against various pathogens. Here, we have characterized the double heme peroxidase (AsDBLOX) gene from the Indian malaria vector Anopheles stephensi. It is a true ortholog of An. gambiae DBLOX. This 4209 bp AsDBLOX gene encodes a protein of 1402 amino acids that has two duplicated peroxidase domains, domain I (amino acids 61 to 527) and domain II (amino acids 714 to 1252). The first domain has only substrate binding sites and lacks all other motifs of a functional heme peroxidase (e.g. heme binding site, calcium binding site and homodimer interface). Instead, it has two integrin binding motifs, LDV (Leu-Asp-Val) and RGD (Arg-Gly-Asp). The second peroxidase domain, however, has all the features of a complete heme peroxidase along with an integrin binding motif, LDI (Leu-Asp-Ile). Thus, AsDBLOX is a unique type of peroxinectin, as this group of proteins is characterized by integrin binding motifs along with a heme peroxidase domain. We also observed that the AsDBLOX gene is expressed in all life cycle stages of the mosquito and is highly induced in the pupal stage, which indicates a possible role in development.
Affiliation(s)
- Tania Pal Choudhury
  - Molecular Parasitology and Vector Biology Laboratory, Department of Biological Sciences, Birla Institute of Technology and Science (BITS), Pilani, Rajasthan, India
- Lalita Gupta
  - Molecular Parasitology and Vector Biology Laboratory, Department of Biological Sciences, Birla Institute of Technology and Science (BITS), Pilani, Rajasthan, India
  - Department of Zoology, Ch. Bansi Lal University, Bhiwani, Haryana, India
- Sanjeev Kumar
  - Molecular Parasitology and Vector Biology Laboratory, Department of Biological Sciences, Birla Institute of Technology and Science (BITS), Pilani, Rajasthan, India
  - Department of Biotechnology, Ch. Bansi Lal University, Bhiwani, Haryana, India
6
Hooper WF, Walcott BD, Wang X, Bystroff C. Fast design of arbitrary length loops in proteins using InteractiveRosetta. BMC Bioinformatics 2018; 19:337. [PMID: 30249181 PMCID: PMC6154894 DOI: 10.1186/s12859-018-2345-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 05/17/2018] [Accepted: 08/29/2018] [Indexed: 11/10/2022]
Abstract
Background: With increasing interest in ab initio protein design, there is a desire to be able to fully explore the design space of insertions and deletions. Nature inserts and deletes residues to optimize energy and function, but allowing variable length indels in the context of an interactive protein design session presents challenges with regard to speed and accuracy. Results: Here we present a new module (INDEL) for InteractiveRosetta which allows the user to specify a range of lengths for a desired indel, and which returns a set of low energy backbones in a matter of seconds. To make the loop search fast, loop anchor points are geometrically hashed using Cα-Cα and Cβ-Cβ distances, and the hash is mapped to start and end points in a pre-compiled random access file of non-redundant protein backbone coordinates. Loops with superposable anchors are filtered for collisions and returned to InteractiveRosetta as poly-alanine for display and selective incorporation into the design template. Sidechains can then be added using RosettaDesign tools. Conclusions: INDEL was able to find viable loops in 100% of 500 attempts for all lengths from 3 to 20 residues. INDEL has been applied to the task of designing a domain-swapping loop for T7-endonuclease I, changing its specificity from Holliday junctions to paranemic crossover (PX) DNA.
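The anchor-hashing idea in this abstract can be illustrated with a minimal Python sketch. It is not the INDEL implementation: the function names (`anchor_key`, `build_index`, `lookup`), the 0.5 Å bin width, and the in-memory dictionary (standing in for the pre-compiled random access file) are all assumptions made for illustration.

```python
import math
from collections import defaultdict

# Hypothetical sketch: anchors are four (x, y, z) points, the C-alpha and
# C-beta atoms of the residues flanking a loop.

def anchor_key(ca_start, ca_end, cb_start, cb_end, bin_width=0.5):
    """Quantise the Ca-Ca and Cb-Cb anchor distances into an integer pair."""
    d_ca = math.dist(ca_start, ca_end)
    d_cb = math.dist(cb_start, cb_end)
    return (int(d_ca // bin_width), int(d_cb // bin_width))

def build_index(fragments, bin_width=0.5):
    """Map each distance key to the ids of fragments whose anchors share it."""
    index = defaultdict(list)
    for frag_id, anchors in fragments.items():
        index[anchor_key(*anchors, bin_width=bin_width)].append(frag_id)
    return index

def lookup(index, query_anchors, bin_width=0.5):
    """Return candidate fragment ids for a gap with the given anchors."""
    return index.get(anchor_key(*query_anchors, bin_width=bin_width), [])

# Toy library: fragment "a" spans about 5.2 A, fragment "b" about 9.1 A.
fragments = {
    "a": ((0, 0, 0), (5.2, 0, 0), (0, 1, 0), (5.2, 1.3, 0)),
    "b": ((0, 0, 0), (9.1, 0, 0), (0, 1, 0), (9.1, 1, 0)),
}
index = build_index(fragments)
query = ((1, 1, 1), (6.3, 1, 1), (1, 2, 1), (6.3, 2, 1))
candidates = lookup(index, query)
```

A fixed-width bin misses fragments whose distances fall just across a bin boundary, so a practical search would probably also probe neighbouring keys before superposing anchors and filtering for collisions.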
Affiliation(s)
- William F Hooper
  - Emmes Corporation, Rockville, Washington, MD, USA
  - Department of Biology, Rensselaer Polytechnic Institute, Troy, NY, USA
- Xing Wang
  - Department of Chemistry and Chemical Biology, Rensselaer Polytechnic Institute, Troy, NY, USA
- Christopher Bystroff
  - Department of Biology, Rensselaer Polytechnic Institute, Troy, NY, USA
  - Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY, USA
7
Marks C, Nowak J, Klostermann S, Georges G, Dunbar J, Shi J, Kelm S, Deane CM. Sphinx: merging knowledge-based and ab initio approaches to improve protein loop prediction. Bioinformatics 2018; 33:1346-1353. [PMID: 28453681 PMCID: PMC5408792 DOI: 10.1093/bioinformatics/btw823] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 07/25/2016] [Accepted: 01/09/2017] [Indexed: 01/31/2023]
Abstract
Motivation: Loops are often vital for protein function; however, their irregular structures make them difficult to model accurately. Current loop modelling algorithms can mostly be divided into two categories: knowledge-based, where databases of fragments are searched to find suitable conformations, and ab initio, where conformations are generated computationally. Existing knowledge-based methods only use fragments that are the same length as the target, even though loops of slightly different lengths may adopt similar conformations. Here, we present a novel method, Sphinx, which combines ab initio techniques with the potential extra structural information contained within loops of a different length to improve structure prediction. Results: We show that Sphinx is able to generate high-accuracy predictions and decoy sets enriched with near-native loop conformations, performing better than the ab initio algorithm on which it is based. In addition, it is able to provide predictions for every target, unlike some knowledge-based methods. Sphinx can be used successfully for the difficult problem of antibody H3 prediction, outperforming RosettaAntibody, one of the leading H3-specific ab initio methods, both in accuracy and speed. Availability and Implementation: Sphinx is available at http://opig.stats.ox.ac.uk/webapps/sphinx. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Claire Marks
  - Department of Statistics, University of Oxford, Oxford, UK
- Jaroslaw Nowak
  - Department of Statistics, University of Oxford, Oxford, UK
- Guy Georges
  - Pharma Research and Early Development, Large Molecule Research, Roche Innovation Center Munich, Penzberg, DE, Germany
- James Dunbar
  - Pharma Research and Early Development, Large Molecule Research, Roche Innovation Center Munich, Penzberg, DE, Germany
- Jiye Shi
  - Department of Informatics, UCB Pharma, Slough, UK
8
Wong SWK, Liu JS, Kou SC. Fast de novo discovery of low-energy protein loop conformations. Proteins 2017; 85:1402-1412. [PMID: 28378911 DOI: 10.1002/prot.25300] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 01/29/2017] [Revised: 03/19/2017] [Accepted: 03/27/2017] [Indexed: 12/25/2022]
Abstract
In the prediction of protein structure from amino acid sequence, loops are challenging regions for computational methods. Since loops are often located on the protein surface, they can have significant roles in determining protein functions and binding properties. Loop prediction without the aid of a structural template requires extensive conformational sampling and energy minimization, which are computationally difficult. In this article we present a new de novo loop sampling method, the Parallely filtered Energy Targeted All-atom Loop Sampler (PETALS) to rapidly locate low energy conformations. PETALS explores both backbone and side-chain positions of the loop region simultaneously according to the energy function selected by the user, and constructs a nonredundant ensemble of low energy loop conformations using filtering criteria. The method is illustrated with the DFIRE potential and DiSGro energy function for loops, and shown to be highly effective at discovering conformations with near-native (or better) energy. Using the same energy function as the DiSGro algorithm, PETALS samples conformations with both lower RMSDs and lower energies. PETALS is also useful for assessing the accuracy of different energy functions. PETALS runs rapidly, requiring an average time cost of 10 minutes for a length 12 loop on a single 3.2 GHz processor core, comparable to the fastest existing de novo methods for generating an ensemble of conformations. Proteins 2017; 85:1402-1412. © 2017 Wiley Periodicals, Inc.
Affiliation(s)
- Samuel W K Wong
  - Department of Statistics, University of Florida, Gainesville, Florida, 32611
- Jun S Liu
  - Department of Statistics, Harvard University, Cambridge, Massachusetts, 02138
- S C Kou
  - Department of Statistics, Harvard University, Cambridge, Massachusetts, 02138
9
Heo S, Lee J, Joo K, Shin HC, Lee J. Protein Loop Structure Prediction Using Conformational Space Annealing. J Chem Inf Model 2017; 57:1068-1078. [DOI: 10.1021/acs.jcim.6b00742] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 11/28/2022]
Affiliation(s)
- Seungryong Heo
  - School of Systems Biomedical Science, Soongsil University, Seoul 06978, Korea
- Juyong Lee
  - Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Hang-Cheol Shin
  - School of Systems Biomedical Science, Soongsil University, Seoul 06978, Korea
10
Marks C, Deane C. Antibody H3 Structure Prediction. Comput Struct Biotechnol J 2017; 15:222-231. [PMID: 28228926 PMCID: PMC5312500 DOI: 10.1016/j.csbj.2017.01.010] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 12/16/2016] [Revised: 01/24/2017] [Accepted: 01/27/2017] [Indexed: 01/20/2023]
Abstract
Antibodies are proteins of the immune system that are able to bind to a huge variety of different substances, making them attractive candidates for therapeutic applications. Antibody structures have the potential to be useful during drug development, allowing the implementation of rational design procedures. The most challenging part of the antibody structure to experimentally determine or model is the H3 loop, which in addition is often the most important region in an antibody's binding site. This review summarises the approaches used so far in the pursuit of accurate computational H3 structure prediction.
Affiliation(s)
- C. Marks
  - Department of Statistics, University of Oxford, 24-29 St Giles', Oxford OX1 3LB, United Kingdom
11
Abstract
Comparative protein structure modeling predicts the three-dimensional structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and how to use the ModBase database of such models, and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described. © 2016 by John Wiley & Sons, Inc.
Affiliation(s)
- Benjamin Webb
  - University of California at San Francisco, San Francisco, California
- Andrej Sali
  - University of California at San Francisco, San Francisco, California
12
Webb B, Sali A. Comparative Protein Structure Modeling Using MODELLER. Current Protocols in Bioinformatics 2016; 54:5.6.1-5.6.37. [PMID: 27322406 PMCID: PMC5031415 DOI: 10.1002/cpbi.3] [Citation(s) in RCA: 1920] [Impact Index Per Article: 240.0] [Indexed: 12/22/2022]
Abstract
Comparative protein structure modeling predicts the three-dimensional structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and how to use the ModBase database of such models, and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described. © 2016 by John Wiley & Sons, Inc.
Affiliation(s)
- Benjamin Webb
  - University of California at San Francisco, San Francisco, California
- Andrej Sali
  - University of California at San Francisco, San Francisco, California
13
Abstract
Functional characterization of a protein sequence is one of the most frequent problems in biology. This task is usually facilitated by accurate three-dimensional (3-D) structure of the studied protein. In the absence of an experimentally determined structure, comparative or homology modeling can sometimes provide a useful 3-D model for a protein that is related to at least one known protein structure. Comparative modeling predicts the 3-D structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described.
Affiliation(s)
- Benjamin Webb
  - University of California at San Francisco, San Francisco, California
14
Tang K, Zhang J, Liang J. Fast protein loop sampling and structure prediction using distance-guided sequential chain-growth Monte Carlo method. PLoS Comput Biol 2014; 10:e1003539. [PMID: 24763317 PMCID: PMC3998890 DOI: 10.1371/journal.pcbi.1003539] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Received: 08/29/2013] [Accepted: 02/01/2014] [Indexed: 11/18/2022]
Abstract
Loops in proteins are flexible regions connecting regular secondary structures. They are often involved in protein functions through interacting with other molecules. The irregularity and flexibility of loops make their structures difficult to determine experimentally and challenging to model computationally. Conformation sampling and energy evaluation are the two key components in loop modeling. We have developed a new method for loop conformation sampling and prediction based on a chain growth sequential Monte Carlo sampling strategy, called Distance-guided Sequential chain-Growth Monte Carlo (DISGRO). With an energy function designed specifically for loops, our method can efficiently generate high quality loop conformations with low energy that are enriched with near-native loop structures. The average minimum global backbone RMSD for 1,000 conformations of 12-residue loops is 1.53 Å, with a lowest energy RMSD of 2.99 Å, and an average ensemble RMSD of 5.23 Å. A novel geometric criterion is applied to speed up calculations. The computational cost of generating 1,000 conformations for each of the x loops in a benchmark dataset is only about 10 CPU minutes for 12-residue loops, compared to ca. 180 CPU minutes using the FALCm method. Test results on benchmark datasets show that DISGRO performs comparably or better than previous successful methods, while requiring far less computing time. DISGRO is especially effective in modeling longer loops (10-17 residues).
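The three accuracy figures quoted in this abstract (minimum RMSD, RMSD of the lowest-energy conformation, and average ensemble RMSD) can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not the DISGRO code: the names `rmsd` and `summarise_ensemble` are hypothetical, coordinates are plain (x, y, z) tuples, and "global" RMSD is taken to mean that the loop coordinates are compared in the fixed frame of the rest of the protein, with no superposition of the loop itself.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long atom lists,
    computed without superposition (coordinates share one frame)."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    sq_sum = sum(
        (x - y) ** 2
        for p, q in zip(coords_a, coords_b)
        for x, y in zip(p, q)
    )
    return math.sqrt(sq_sum / len(coords_a))

def summarise_ensemble(native, decoys):
    """decoys is a list of (energy, coords) pairs for one loop target."""
    rmsds = [rmsd(native, coords) for _, coords in decoys]
    lowest_energy = min(range(len(decoys)), key=lambda i: decoys[i][0])
    return {
        "min_rmsd": min(rmsds),                        # best sampled conformation
        "lowest_energy_rmsd": rmsds[lowest_energy],    # what the energy function picks
        "mean_rmsd": sum(rmsds) / len(rmsds),          # average over the ensemble
    }

# Toy target with two atoms and three decoys shifted along z by 2, 1 and 3 A.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
decoys = [
    (-5.0, [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]),
    (-3.0, [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]),
    (-1.0, [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]),
]
summary = summarise_ensemble(native, decoys)
```

The gap between `min_rmsd` and `lowest_energy_rmsd` separates sampling quality from energy-function quality, which is why abstracts of this kind tend to report both.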
Affiliation(s)
- Ke Tang
  - Department of Bioengineering, University of Illinois at Chicago, Chicago, Illinois, United States of America
- Jinfeng Zhang
  - Department of Statistics, Florida State University, Tallahassee, Florida, United States of America
- Jie Liang
  - Department of Bioengineering, University of Illinois at Chicago, Chicago, Illinois, United States of America
15
Abstract
Structural proteomics aims to understand the structural basis of protein interactions and functions. A prerequisite for this is the availability of 3D protein structures that mediate the biochemical interactions. The explosion in the number of available gene sequences set the stage for the next step in genome-scale projects -- to obtain 3D structures for each protein. To achieve this ambitious goal, the slow and costly structure determination experiments are supplemented with theoretical approaches. The current state and recent advances in structure modeling approaches are reviewed here, with special emphasis on comparative protein structure modeling techniques.
Affiliation(s)
- András Fiser
  - Department of Biochemistry, Seaver Foundation Center for Bioinformatics, Albert Einstein College of Medicine, 1300 Morris Park Ave., Bronx, NY 10461, USA
16
Webb B, Eswar N, Fan H, Khuri N, Pieper U, Dong G, Sali A. Comparative Modeling of Drug Target Proteins. Reference Module in Chemistry, Molecular Sciences and Chemical Engineering 2014. [PMCID: PMC7157477 DOI: 10.1016/b978-0-12-409547-2.11133-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
Abstract
In this perspective, we begin by describing the comparative protein structure modeling technique and the accuracy of the corresponding models. We then discuss the significant role that comparative prediction plays in drug discovery. We focus on virtual ligand screening against comparative models and illustrate the state-of-the-art by a number of specific examples.
17
Zhang Y, Hauser K. Unbiased, scalable sampling of protein loop conformations from probabilistic priors. BMC Structural Biology 2013; 13 Suppl 1:S9. [PMID: 24565175 PMCID: PMC3953323 DOI: 10.1186/1472-6807-13-s1-s9] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Protein loops are flexible structures that are intimately tied to function, but understanding loop motion and generating loop conformation ensembles remain significant computational challenges. Discrete search techniques scale poorly to large loops, optimization and molecular dynamics techniques are prone to local minima, and inverse kinematics techniques can only incorporate structural preferences in an ad hoc fashion. This paper presents Sub-Loop Inverse Kinematics Monte Carlo (SLIKMC), a new Markov chain Monte Carlo algorithm for generating conformations of closed loops according to experimentally available, heterogeneous structural preferences. RESULTS: Our simulation experiments demonstrate that the method computes high-scoring conformations of large loops (>10 residues) orders of magnitude faster than standard Monte Carlo and discrete search techniques. Two new developments contribute to the scalability of the new method. First, structural preferences are specified via a probabilistic graphical model (PGM) that links conformation variables, spatial variables (e.g., atom positions), constraints and prior information in a unified framework. The method uses a sparse PGM that exploits locality of interactions between atoms and residues. Second, a novel method for sampling sub-loops is developed to generate statistically unbiased samples of probability densities restricted by loop-closure constraints. CONCLUSION: Numerical experiments confirm that SLIKMC generates conformation ensembles that are statistically consistent with specified structural preferences. Protein conformations with 100+ residues are sampled on standard PC hardware in seconds. Application to proteins involved in ion-binding demonstrates its potential as a tool for loop ensemble generation and missing structure completion.
Affiliation(s)
- Yajia Zhang
  - School of Informatics and Computing, Indiana University, Bloomington, Indiana, USA
- Kris Hauser
  - School of Informatics and Computing, Indiana University, Bloomington, Indiana, USA
18
Kelm S, Vangone A, Choi Y, Ebejer JP, Shi J, Deane CM. Fragment-based modeling of membrane protein loops: successes, failures, and prospects for the future. Proteins 2013; 82:175-86. [PMID: 23589399 DOI: 10.1002/prot.24299] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 11/08/2012] [Revised: 02/22/2013] [Accepted: 03/26/2013] [Indexed: 11/12/2022]
Abstract
Membrane proteins (MPs) have become a major focus in structure prediction, due to their medical importance. There is, however, a lack of fast and reliable methods that specialize in the modeling of MP loops. Often methods designed for soluble proteins (SPs) are applied directly to MPs. In this article, we investigate the validity of such an approach in the realm of fragment-based methods. We also examined the differences in membrane and soluble protein loops that might affect accuracy. We test our ability to predict soluble and MP loops with the previously published method FREAD. We show that it is possible to predict accurately the structure of MP loops using a database of MP fragments (0.5-1 Å median root-mean-square deviation). The presence of homologous proteins in the database helps prediction accuracy. However, even when homologues are removed better results are still achieved using fragments of MPs (0.8-1.6 Å) rather than SPs (1-4 Å) to model MP loops. We find that many fragments of SPs have shapes similar to their MP counterparts but have very different sequences; however, they do not appear to differ in their substitution patterns. Our findings may allow further improvements to fragment-based loop modeling algorithms for MPs. The current version of our proof-of-concept loop modeling protocol produces high-accuracy loop models for MPs and is available as a web server at http://medeller.info/fread.
Affiliation(s)
- Sebastian Kelm
  - Department of Statistics, University of Oxford, Oxford, United Kingdom
19
MacDonald JT, Kelley LA, Freemont PS. Validating a Coarse-Grained Potential Energy Function through Protein Loop Modelling. PLoS One 2013; 8:e65770. [PMID: 23824634 PMCID: PMC3688807 DOI: 10.1371/journal.pone.0065770] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 03/06/2013] [Accepted: 04/26/2013] [Indexed: 12/02/2022]
Abstract
Coarse-grained (CG) methods for sampling protein conformational space have the potential to increase computational efficiency by reducing the degrees of freedom. The gain in computational efficiency of CG methods often comes at the expense of non-protein like local conformational features. This could cause problems when transitioning to full atom models in a hierarchical framework. Here, a CG potential energy function was validated by applying it to the problem of loop prediction. A novel method to sample the conformational space of backbone atoms was benchmarked using a standard test set consisting of 351 distinct loops. This method used a sequence-independent CG potential energy function representing the protein using α-carbon positions only and sampling conformations with a Monte Carlo simulated annealing based protocol. Backbone atoms were added using a method previously described and then gradient minimised in the Rosetta force field. Despite the CG potential energy function being sequence-independent, the method performed similarly to methods that explicitly use either fragments of known protein backbones with similar sequences or residue-specific φ/ψ-maps to restrict the search space. The method was also able to predict with sub-Angstrom accuracy two out of seven loops from recently solved crystal structures of proteins with low sequence and structure similarity to previously deposited structures in the PDB. The ability to sample realistic loop conformations directly from a potential energy function enables the incorporation of additional geometric restraints and the use of more advanced sampling methods in a way that is not easily possible with fragment replacement methods, and also enables multi-scale simulations for protein design and protein structure prediction. These restraints could be derived from experimental data or could be design restraints in the case of computational protein design.
C++ source code is available for download from http://www.sbg.bio.ic.ac.uk/phyre2/PD2/.
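As a rough illustration of the Monte Carlo simulated annealing idea mentioned in this abstract, the following Python sketch anneals a small set of torsion-like angles on a toy energy surface. It is not the authors' CG potential or their C++ code: `toy_energy`, `simulated_annealing`, the geometric cooling schedule, the step size and all parameter values are assumptions chosen only for illustration.

```python
import math
import random


def toy_energy(angles):
    """Hypothetical smooth energy over angles in radians; minimum at 0 for each angle."""
    return sum(1.0 - math.cos(a) for a in angles)


def simulated_annealing(energy, start, steps=5000, t_start=2.0, t_end=0.01, seed=0):
    """Metropolis Monte Carlo with a geometric cooling schedule (assumed, not from the paper)."""
    rng = random.Random(seed)
    current = list(start)
    e_current = energy(current)
    best, e_best = list(current), e_current
    # Factor that takes the temperature from t_start to t_end over the run.
    cooling = (t_end / t_start) ** (1.0 / max(steps - 1, 1))
    temperature = t_start
    for _ in range(steps):
        # Propose a small random change to one randomly chosen angle.
        trial = list(current)
        i = rng.randrange(len(trial))
        trial[i] += rng.gauss(0.0, 0.3)
        e_trial = energy(trial)
        # Metropolis criterion: always accept downhill moves, sometimes accept uphill ones.
        delta = e_trial - e_current
        if delta <= 0.0 or rng.random() < math.exp(-delta / temperature):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = list(current), e_current
        temperature *= cooling
    return best, e_best


if __name__ == "__main__":
    start = [2.5, -1.8, 3.0, 0.9]
    best, e_best = simulated_annealing(toy_energy, start)
    print(f"start energy {toy_energy(start):.3f}, best energy {e_best:.3f}")
```

In a loop-modelling setting the toy energy would be replaced by the potential of interest and the move set by one that respects loop closure; neither is attempted here.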
Affiliation(s)
- James T. MacDonald
- Division of Molecular Biosciences, Imperial College London, London, United Kingdom
- Lawrence A. Kelley
- Division of Molecular Biosciences, Imperial College London, London, United Kingdom
- Paul S. Freemont
- Division of Molecular Biosciences, Imperial College London, London, United Kingdom
20
Li Y. Conformational sampling in template-free protein loop structure modeling: an overview. Comput Struct Biotechnol J 2013; 5:e201302003. [PMID: 24688696 PMCID: PMC3962101 DOI: 10.5936/csbj.201302003] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 10/19/2012] [Revised: 01/23/2013] [Accepted: 01/28/2013] [Indexed: 01/04/2023]
Abstract
Accurately modeling protein loops is an important step in predicting three-dimensional structures and in understanding the functions of many proteins. Because of their high flexibility, modeling the three-dimensional structures of loops is difficult and is usually treated as a "mini protein folding problem" under geometric constraints. In the past decade, there has been remarkable progress in template-free loop structure modeling due to advances in computational methods as well as the steadily increasing number of known structures available in the PDB. This mini review provides an overview of recent computational approaches for loop structure modeling. In particular, we focus on approaches for sampling loop conformation space, which is a critical step in obtaining high-resolution models in template-free methods. We review the potential energy functions for loop modeling, loop buildup mechanisms to satisfy geometric constraints, and loop conformation sampling algorithms. Recent loop modeling results are also summarized.
Affiliation(s)
- Yaohang Li
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
21
Miller EB, Murrett CS, Zhu K, Zhao S, Goldfeld DA, Bylund JH, Friesner RA. Prediction of Long Loops with Embedded Secondary Structure using the Protein Local Optimization Program. J Chem Theory Comput 2013; 9:1846-1864. [PMID: 23814507 DOI: 10.1021/ct301083q] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 12/21/2022]
Abstract
Robust homology modeling to atomic-level accuracy requires, in the general case, successful prediction of protein loops containing small segments of secondary structure. Further, as loop prediction advances to success with larger loops, the exclusion of loops containing secondary structure becomes awkward. Here, we extend the applicability of the Protein Local Optimization Program (PLOP) to loops up to 17 residues in length that contain either helical or hairpin segments. In general, PLOP hierarchically samples conformational space and ranks candidate loops with a high-quality molecular mechanics force field. For loops identified to possess α-helical segments, we employ an alternative dihedral library composed of (ϕ,ψ) angles commonly found in helices. The alternative library is searched over a user-specified range of residues that define the helical bounds. The source of these helical bounds can be from popular secondary structure prediction software or from analysis of past loop predictions where a propensity to form a helix is observed. Due to the maturity of our energy model, the lowest energy loop across all experiments can be selected with an accuracy of sub-Ångström RMSD in 80% of cases, 1.0 to 1.5 Å RMSD in 14% of cases, and poorer than 1.5 Å RMSD in 6% of cases. The effectiveness of our current methods in predicting hairpin-containing loops is explored with hairpins up to 13 residues in length, again reaching an accuracy of sub-Ångström RMSD in 83% of cases, 1.0 to 1.5 Å RMSD in 10% of cases, and poorer than 1.5 Å RMSD in 7% of cases. Finally, we explore the effect of an imprecise surrounding environment, in which side chains, but not the backbone, are initially in perturbed geometries. In these cases, loops perturbed to 3 Å RMSD from the native environment were restored to their native conformation with sub-Ångström RMSD.
Affiliation(s)
- Edward B Miller
- Department of Chemistry, Columbia University, New York, New York
22
Cheng J, Eickholt J, Wang Z, Deng X. Recursive protein modeling: a divide and conquer strategy for Protein Structure Prediction and its case study in CASP9. J Bioinform Comput Biol 2012; 10:1242003. [PMID: 22809379 DOI: 10.1142/s0219720012420036] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 11/18/2022]
Abstract
After decades of research, protein structure prediction remains a very challenging problem. In order to address the different levels of complexity of structural modeling, two types of modeling techniques, template-based modeling and template-free modeling, have been developed. Template-based modeling can often generate a moderate- to high-resolution model when a similar, homologous template structure is found for a query protein but fails if no template or only incorrect templates are found. Template-free modeling, such as fragment-based assembly, may generate models of moderate resolution for small proteins of low topological complexity. The two techniques have seldom been integrated to improve protein modeling. Here we develop a recursive protein modeling approach to selectively and collaboratively apply template-based and template-free modeling methods to model template-covered (i.e. certain) and template-free (i.e. uncertain) regions of a protein. A preliminary implementation of the approach was tested on a number of hard modeling cases during the 9th Critical Assessment of Techniques for Protein Structure Prediction (CASP9) and successfully improved the quality of modeling in most of these cases. Recursive modeling can significantly reduce the complexity of protein structure modeling and integrate template-based and template-free modeling to improve the quality and efficiency of protein structure prediction.
Affiliation(s)
- Jianlin Cheng
- Department of Computer Science, University of Missouri, Columbia, MO 65211, USA.
23
Abstract
The prediction of loop structures is considered one of the main challenges in the protein folding problem. Regardless of the dependence of the overall algorithm on the protein data bank, the flexibility of loop regions dictates the need for special attention to their structures. In this article, we present algorithms for loop structure prediction with fixed stem and flexible stem geometry. In the flexible stem geometry problem, only the secondary structure of three stem residues on either side of the loop is known. In the fixed stem geometry problem, the structure of the three stem residues on either side of the loop is also known. Initial loop structures are generated using a probability database for the flexible stem geometry problem, and using torsion angle dynamics for the fixed stem geometry problem. Three rotamer optimization algorithms are introduced to alleviate steric clashes between the generated backbone structures and the side chain rotamers. The structures are optimized by energy minimization using an all-atom force field. The optimized structures are clustered using a traveling salesman problem-based clustering algorithm. The structures in the densest clusters are then utilized to refine dihedral angle bounds on all amino acids in the loop. The entire procedure is carried out for a number of iterations, leading to improved structure prediction and refined dihedral angle bounds. The algorithms presented in this article have been tested on 3190 loops from the PDBSelect25 data set and on targets from the recently concluded CASP9 community-wide experiment.
Affiliation(s)
- A. Subramani
- Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544-5263, U.S.A
- C. A. Floudas
- Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544-5263, U.S.A
24
St-Pierre JF, Mousseau N. Large loop conformation sampling using the activation relaxation technique, ART-nouveau method. Proteins 2012; 80:1883-94. [PMID: 22488731 DOI: 10.1002/prot.24085] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 11/29/2011] [Revised: 03/19/2011] [Accepted: 03/30/2012] [Indexed: 12/25/2022]
Abstract
We present an adaptation of the ART-nouveau energy surface sampling method to the problem of loop structure prediction. This method, previously used to study protein folding pathways and peptide aggregation, is well suited to the problem of sampling the conformation space of large loops because it targets probable folding pathways instead of exhaustively sampling that space. The number of sampled conformations needed by ART-nouveau to find the global energy minimum for a loop was found to scale linearly with the sequence length of the loop for loops between 8 and about 20 amino acids. Because the computational cost of sampling each new conformation also scales linearly with the loop sequence length, we estimate that the total computational cost of sampling larger loops scales quadratically, compared to the exponential scaling of exhaustive search methods.
Affiliation(s)
- Jean-François St-Pierre
- Département de Physique and Regroupement Québécois sur les Matériaux de Pointe, Université de Montréal, CP 6128, Succursale Centre-Ville, Montréal, Québec, Canada H3C 3J7
25
Liang S, Zhang C, Sarmiento J, Standley DM. Protein Loop Modeling with Optimized Backbone Potential Functions. J Chem Theory Comput 2012; 8:1820-7. [PMID: 26593673 DOI: 10.1021/ct300131p] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Indexed: 11/29/2022]
Abstract
We represented the protein backbone potential as a Fourier series. The parameters of the backbone dihedral potential were initialized to random values and optimized by Monte Carlo simulations so that generated native-like loop decoys had a lower energy than non-native decoys. The low energy regions of the optimized backbone potential were consistent with observed Ramachandran plots derived from crystal structures. The backbone potential was then used for the prediction of loop conformations (OSCAR-loop) in combination with the previously described OSCAR force field, which has been shown to be very accurate in side chain modeling. The accuracy of OSCAR-loop was further improved by local energy minimization based on the complete force field. The average accuracies were 0.40, 0.70, 1.10, 2.08, and 3.58 Å for 4, 6, 8, 10, and 12-residue loops, respectively, with each size being represented by 325 to 2809 targets. The accuracy was better than that of other loop modeling algorithms for short loops (<10 residues). For longer loops, the prediction accuracy was improved by concurrently sampling with a fragment-based method, Spanner. OSCAR-loop is available for download at http://sysimm.ifrec.osaka-u.ac.jp/OSCAR/.
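To make the idea of a dihedral potential written as a Fourier series concrete, here is a minimal Python sketch that evaluates a truncated one-dimensional series in a single angle. The function name `fourier_potential`, the one-dimensional form and the coefficient values are assumptions for illustration; the abstract does not give the actual functional form or parameters used in OSCAR-loop.

```python
import math


def fourier_potential(phi, a0, cos_coeffs, sin_coeffs):
    """Truncated Fourier series in one dihedral angle phi (radians).

    E(phi) = a0 + sum over k = 1..K of [a_k * cos(k * phi) + b_k * sin(k * phi)]
    where cos_coeffs holds a_1..a_K and sin_coeffs holds b_1..b_K.
    """
    energy = a0
    for k, (a_k, b_k) in enumerate(zip(cos_coeffs, sin_coeffs), start=1):
        energy += a_k * math.cos(k * phi) + b_k * math.sin(k * phi)
    return energy


if __name__ == "__main__":
    # Hypothetical coefficients, not fitted to any data.
    a0, a, b = 0.5, [1.0, 0.0], [0.0, 2.0]
    for degrees in (-180, -90, 0, 90, 180):
        phi = math.radians(degrees)
        print(degrees, round(fourier_potential(phi, a0, a, b), 4))
```

Because every term is periodic in 2π, the potential is automatically continuous across the -180°/180° boundary, which is one practical reason to parameterize a dihedral term this way.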
Affiliation(s)
- Shide Liang
- Systems Immunology Lab, Immunology Frontier Research Center, Osaka University, Suita, Osaka, 565-0871, Japan
- Chi Zhang
- School of Biological Sciences, Center for Plant Science and Innovation, University of Nebraska, Lincoln, Nebraska 68588, United States
- Jamica Sarmiento
- Systems Immunology Lab, Immunology Frontier Research Center, Osaka University, Suita, Osaka, 565-0871, Japan
- Daron M Standley
- Systems Immunology Lab, Immunology Frontier Research Center, Osaka University, Suita, Osaka, 565-0871, Japan
26
Gipson B, Hsu D, Kavraki LE, Latombe JC. Computational models of protein kinematics and dynamics: beyond simulation. Annu Rev Anal Chem (Palo Alto Calif) 2012; 5:273-91. [PMID: 22524225 PMCID: PMC4866812 DOI: 10.1146/annurev-anchem-062011-143024] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Indexed: 05/31/2023]
Abstract
Physics-based simulation represents a powerful method for investigating the time-varying behavior of dynamic protein systems at high spatial and temporal resolution. Such simulations, however, can be prohibitively difficult or lengthy for large proteins or when probing the lower-resolution, long-timescale behaviors of proteins generally. Importantly, not all questions about a protein system require full space and time resolution to produce an informative answer. For instance, by avoiding the simulation of uncorrelated, high-frequency atomic movements, a larger, domain-level picture of protein dynamics can be revealed. The purpose of this review is to highlight the growing body of complementary work that goes beyond simulation. In particular, this review focuses on methods that address kinematics and dynamics, as well as those that address larger organizational questions and can quickly yield useful information about the long-timescale behavior of a protein.
Affiliation(s)
- Bryant Gipson
- Computer Science Department, Rice University, Houston, Texas 77005, USA.
27
28
Choi Y, Deane CM. Predicting antibody complementarity determining region structures without classification. Mol Biosyst 2011; 7:3327-34. [PMID: 22011953 DOI: 10.1039/c1mb05223c] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.6] [Indexed: 01/12/2023]
Abstract
Antibodies are used extensively in medical and biological research. Their complementarity determining regions (CDRs) define the majority of their antigen binding functionality. CDR structures have been intensively studied and classified (canonical structures). Here we show that CDR structure prediction is no different from the standard loop structure prediction problem and predict them without classification. FREAD, a successful database loop prediction technique, is able to produce accurate predictions for all CDR loops (0.81, 0.42, 0.96, 0.98, 0.88 and 2.25 Å RMSD for CDR-L1 to CDR-H3). In order to overcome the relatively poor predictions of CDR-H3, we developed two variants of FREAD, one focused on sequence similarity (FREAD-S) and another which includes contact information (ConFREAD). Both of the methods improve accuracy for CDR-H3 to 1.34 Å and 1.23 Å respectively. The FREAD variants are also tested on homology models and compared to RosettaAntibody (CDR-H3 prediction on models: 1.98 and 2.62 Å for ConFREAD and RosettaAntibody respectively). CDRs are known to change their structural conformations upon binding the antigen. Traditional CDR classifications are based on sequence similarity and do not account for such environment changes. Using a set of antigen-free and antigen-bound structures, we compared our FREAD variants. ConFREAD which includes contact information successfully discriminates the bound and unbound CDR structures and achieves an accuracy of 1.35 Å for bound structures of CDR-H3.
Affiliation(s)
- Yoonjoo Choi
- Department of Statistics, Oxford University, 1 South Parks Road, Oxford OX1 3TG, UK
29
Joo H, Chavan AG, Day R, Lennox KP, Sukhanov P, Dahl DB, Vannucci M, Tsai J. Near-native protein loop sampling using nonparametric density estimation accommodating sparcity. PLoS Comput Biol 2011; 7:e1002234. [PMID: 22028638 PMCID: PMC3197639 DOI: 10.1371/journal.pcbi.1002234] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 03/03/2011] [Accepted: 09/01/2011] [Indexed: 11/29/2022]
Abstract
In contrast to the core structural elements of a protein, such as regular secondary structure, loop regions are difficult for template-based modeling (TBM) because of their variability in sequence and structure as well as the sparse sampling from a limited number of homologous templates. We present a novel, knowledge-based method for loop sampling that leverages homologous torsion angle information to estimate a continuous joint backbone dihedral angle density at each loop position. The φ,ψ distributions are estimated via a Dirichlet process mixture of hidden Markov models (DPM-HMM). Models are quickly generated based on samples from these distributions and are enriched using an end-to-end distance filter. The performance of the DPM-HMM method was evaluated against a diverse test set in a leave-one-out approach. Candidates as low as 0.45 Å RMSD and with a worst case of 3.66 Å were produced. For canonical loops like the immunoglobulin complementarity-determining regions (mean RMSD <2.0 Å), the DPM-HMM method performs as well as or better than the best templates, demonstrating that our automated method recaptures these canonical loops without inclusion of any IgG-specific terms or manual intervention. In cases with poor or few good templates (mean RMSD >7.0 Å), this sampling method produces a population of loop structures to around 3.66 Å for loops up to 17 residues. In a direct sampling comparison with the Loopy algorithm, our method demonstrates the ability to sample nearer native structures for both the canonical CDRH1 and non-canonical CDRH3 loops. Lastly, in the realistic test conditions of the CASP9 experiment, successful application of DPM-HMM for 90 loops from 45 TBM targets shows the general applicability of our sampling method to the loop modeling problem. These results demonstrate that our DPM-HMM produces an advantage by consistently sampling near native loop structure.
The software used in this analysis is available for download at http://www.stat.tamu.edu/~dahl/software/cortorgles/.
A protein's structure consists of elements of regular secondary structure connected by less regular stretches of loop segments. The irregularity of the loop structure makes loop modeling quite challenging. More accurate sampling of these loop conformations has a direct impact on protein modeling, design, function classification, as well as protein interactions. A method has been developed that extends a more comprehensive knowledge-based approach to producing models of the loop regions of protein structure. Most physical models cannot adequately sample the large conformational space, while the more discrete knowledge-based libraries are conformationally limited. To address both of these problems, we introduce a novel statistical method that produces a continuous yet weighted estimation of loop conformational space from a discrete library of structures by using a Dirichlet process mixture of hidden Markov models (DPM-HMM). Applied to loop structure sampling, the results of a number of tests demonstrate that our approach quickly generates large numbers of candidates with near native loop conformations. Most significantly, in the cases where the template sampling is sparse and/or far from native conformations, the DPM-HMM method samples close to the native space and produces a population of accurate loop structures.
Affiliation(s)
- Hyun Joo
- Department of Chemistry, University of the Pacific, Stockton, California, United States of America
- Archana G. Chavan
- Department of Chemistry, University of the Pacific, Stockton, California, United States of America
- Ryan Day
- Department of Chemistry, University of the Pacific, Stockton, California, United States of America
- Kristin P. Lennox
- Department of Statistics, Texas A&M University, College Station, Texas, United States of America
- Paul Sukhanov
- Department of Chemistry, University of the Pacific, Stockton, California, United States of America
- David B. Dahl
- Department of Statistics, Texas A&M University, College Station, Texas, United States of America
- Marina Vannucci
- Department of Statistics, Rice University, Houston, Texas, United States of America
- Jerry Tsai
- Department of Chemistry, University of the Pacific, Stockton, California, United States of America
30
Zhao S, Zhu K, Li J, Friesner RA. Progress in super long loop prediction. Proteins 2011; 79:2920-35. [PMID: 21905115 DOI: 10.1002/prot.23129] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.9] [Received: 11/09/2010] [Revised: 05/06/2011] [Accepted: 06/15/2011] [Indexed: 11/07/2022]
Abstract
Sampling errors are very common in super long loop (referring here to loops that have more than thirteen residues) prediction, simply because the sampling space is vast. We have developed a dipeptide segment sampling algorithm to solve this problem. As a first step in evaluating the performance of this algorithm, it was applied to the problem of reconstructing loops in native protein structures. With a newly constructed test set of 89 loops ranging from 14 to 17 residues, this method obtains average/median global backbone root-mean-square deviations (RMSDs) to the native structure (superimposing the body of the protein, not the loop itself) of 1.46/0.68 Å. Specifically, results for loops of various lengths are 1.19/0.67 Å for 36 fourteen-residue loops, 1.55/0.75 Å for 30 fifteen-residue loops, 1.43/0.80 Å for 14 sixteen-residue loops, and 2.30/1.92 Å for nine seventeen-residue loops. In the vast majority of cases, the method locates energy minima that are lower than or equal to that of the minimized native loop, thus indicating that the new sampling method is successful and rarely limits prediction accuracy. Median RMSDs are substantially lower than the averages because of a small number of outliers. The causes of these failures are examined in some detail, and some can be attributed to flaws in the energy function, such as π-π interactions not being accurately accounted for by the OPLS-AA force field we employed in this study. By introducing a new energy model which has a superior description of π-π interactions, significantly better results were achieved for quite a few former outliers. Crystal packing is explicitly included in order to provide a fair comparison with crystal structures.
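The "global" RMSD used in this abstract is computed with the protein body superimposed and the loop left where it falls, so any rigid displacement of the loop counts as error. A minimal Python sketch of that calculation follows; it assumes both coordinate sets are already expressed in the same frame (the body superposition itself is not shown), and the function name `global_rmsd` and the coordinates are hypothetical.

```python
import math


def global_rmsd(predicted, native):
    """RMSD over matched atoms with NO re-superposition of the loop itself.

    Both inputs are sequences of (x, y, z) tuples assumed to share one frame,
    e.g. after the rest of the protein has been superimposed.
    """
    if len(predicted) != len(native) or not predicted:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (px, py, pz), (nx, ny, nz) in zip(predicted, native):
        total += (px - nx) ** 2 + (py - ny) ** 2 + (pz - nz) ** 2
    return math.sqrt(total / len(predicted))


if __name__ == "__main__":
    native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    # The same loop shape rigidly shifted by 1 Å along z: the internal geometry
    # is identical, yet the global RMSD reports the full 1 Å displacement.
    shifted = [(x, y, z + 1.0) for x, y, z in native]
    print(global_rmsd(shifted, native))
```

A loop-only superposition would score the shifted example as a perfect match, which is why global and local RMSD values from different studies are not directly comparable.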
Affiliation(s)
- Suwen Zhao
- Department of Chemistry, Columbia University, New York, New York 10027, USA
31
Ko J, Lee D, Park H, Coutsias EA, Lee J, Seok C. The FALC-Loop web server for protein loop modeling. Nucleic Acids Res 2011; 39:W210-4. [PMID: 21576220 PMCID: PMC3125760 DOI: 10.1093/nar/gkr352] [Citation(s) in RCA: 76] [Impact Index Per Article: 5.8] [Indexed: 12/02/2022]
Abstract
The FALC-Loop web server provides an online interface for protein loop modeling by employing an ab initio loop modeling method called FALC (fragment assembly and analytical loop closure). The server may be used to construct loop regions in homology modeling, to refine unreliable loop regions in experimental structures or to model segments of designed sequences. The FALC method is computationally less expensive than typical ab initio methods because the conformational search space is effectively reduced by the use of fragments derived from a structure database. The analytical loop closure algorithm allows efficient search for loop conformations that fit into the protein framework starting from the fragment-assembled structures. The FALC method shows prediction accuracy comparable to other state-of-the-art loop modeling methods. Top-ranked model structures can be visualized on the web server, and an ensemble of loop structures can be downloaded for further analysis. The web server can be freely accessed at http://falc-loop.seoklab.org/.
Affiliation(s)
- Junsu Ko
- Department of Chemistry, Seoul National University, Seoul 151-747, Korea
32
Liang S, Zhang C, Standley DM. Protein loop selection using orientation-dependent force fields derived by parameter optimization. Proteins 2011; 79:2260-7. [DOI: 10.1002/prot.23051] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 01/27/2011] [Revised: 03/21/2011] [Accepted: 03/31/2011] [Indexed: 12/25/2022]
33
Cruz V, Ramos J, Martínez-Salazar J. Water-Mediated Conformations of the Alanine Dipeptide as Revealed by Distributed Umbrella Sampling Simulations, Quantum Mechanics Based Calculations, and Experimental Data. J Phys Chem B 2011; 115:4880-6. [DOI: 10.1021/jp2022727] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Indexed: 11/29/2022]
Affiliation(s)
- Víctor Cruz
- BIOPHYM, Instituto de Estructura de la Materia, CSIC, Serrano 113bis, 28006, Madrid, Spain
- Javier Ramos
- BIOPHYM, Instituto de Estructura de la Materia, CSIC, Serrano 113bis, 28006, Madrid, Spain
34
Cortés J, Barbe S, Erard M, Siméon T. Encoding molecular motions in voxel maps. IEEE/ACM Trans Comput Biol Bioinform 2011; 8:557-563. [PMID: 20421686 DOI: 10.1109/tcbb.2010.23] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 05/29/2023]
Abstract
This paper builds on the combination of robotic path planning algorithms and molecular modeling methods for computing large-amplitude molecular motions, and introduces voxel maps as a computational tool to encode and represent such motions. We investigate several applications and show results that illustrate the value of such a representation.
Affiliation(s)
- Juan Cortés
- LAAS-CNRS, 7 avenue du Colonel Roche, F-31077 Toulouse, France.
35
Arnautova YA, Abagyan RA, Totrov M. Development of a new physics-based internal coordinate mechanics force field and its application to protein loop modeling. Proteins 2011; 79:477-98. [PMID: 21069716 PMCID: PMC3057902 DOI: 10.1002/prot.22896] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.7] [Indexed: 12/25/2022]
Abstract
We report the development of the internal coordinate mechanics force field (ICMFF), a new force field parameterized using a combination of experimental data for crystals of small molecules and quantum mechanics calculations. The main features of ICMFF include: (a) parameterization for the dielectric constant relevant to the condensed state (ε = 2) instead of vacuum, (b) an improved description of hydrogen-bond interactions using duplicate sets of van der Waals parameters for heavy atom-hydrogen interactions, and (c) improved backbone covalent geometry and energetics achieved using novel backbone torsional potentials and inclusion of the bond angles at the Cα atoms into the internal variable set. The performance of ICMFF was evaluated through loop modeling simulations for 4-13 residue loops. ICMFF was combined with a solvent-accessible surface area solvation model optimized using a large set of loop decoys. Conformational sampling was carried out using the biased probability Monte Carlo method. Average/median backbone root-mean-square deviations of the lowest energy conformations from the native structures were 0.25/0.21 Å for four-residue loops, 0.84/0.46 Å for eight-residue loops, and 1.16/0.73 Å for 12-residue loops. To our knowledge, these results are significantly better than or comparable with those reported to date for any loop modeling method that does not take crystal packing into account. Moreover, the accuracy of our method is on par with the best previously reported results obtained considering the crystal environment. We attribute this success to the high accuracy of the new ICM force field achieved by meticulous parameterization, to the optimized solvent model, and to the efficiency of the search method.
Affiliation(s)
- Yelena A Arnautova
- Molsoft LLC, 3366 North Torrey Pines Court, Suite 300, La Jolla, California 92037, USA
36
Abstract
Loop modeling is crucial for high-quality homology model construction outside conserved secondary structure elements. Dozens of loop modeling protocols involving a range of database and ab initio search algorithms and a variety of scoring functions have been proposed. Knowledge-based loop modeling methods are very fast and some can successfully and reliably predict loops up to about eight residues long. Several recent ab initio loop simulation methods can be used to construct accurate models of loops up to 12-13 residues long, albeit at a substantial computational cost. Major current challenges are the simulations of loops longer than 12-13 residues, the modeling of multiple interacting flexible loops, and the sensitivity of the loop predictions to the accuracy of the loop environment.
37
Lee J, Lee D, Park H, Coutsias EA, Seok C. Protein loop modeling by using fragment assembly and analytical loop closure. Proteins 2010; 78:3428-36. [PMID: 20872556 PMCID: PMC2976774 DOI: 10.1002/prot.22849] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.9] [Received: 05/19/2010] [Revised: 07/16/2010] [Accepted: 07/31/2010] [Indexed: 12/27/2022]
Abstract
Protein loops are often involved in important biological functions such as molecular recognition, signal transduction, or enzymatic action. The three dimensional structures of loops can provide essential information for understanding molecular mechanisms behind protein functions. In this article, we develop a novel method for protein loop modeling, where the loop conformations are generated by fragment assembly and analytical loop closure. The fragment assembly method reduces the conformational space drastically, and the analytical loop closure method finds the geometrically consistent loop conformations efficiently. We also derive an analytic formula for the gradient of any analytical function of dihedral angles in the space of closed loops. The gradient can be used to optimize various restraints derived from experiments or databases, for example restraints for preferential interactions between specific residues or for preferred backbone angles. We demonstrate that the current loop modeling method outperforms previous methods that employ residue-based torsion angle maps or different loop closure strategies when tested on two sets of loop targets of lengths ranging from 4 to 12.
Collapse
Affiliation(s)
- Julian Lee
- Department of Bioinformatics and Life Science, Soongsil University, Seoul 156-743, Korea
- Dongseon Lee
- Department of Chemistry, Seoul National University, Seoul 151-747, Korea
- Hahnbeom Park
- Department of Chemistry, Seoul National University, Seoul 151-747, Korea
- Evangelos A. Coutsias
- Department of Mathematics and Statistics, University of New Mexico, Albuquerque, NM 87131, USA
- Chaok Seok
- Department of Chemistry, Seoul National University, Seoul 151-747, Korea
38
Ramya L, Nehru Viji S, Arun Prasad P, Kanagasabai V, Gautham N. MOLS sampling and its applications in structural biophysics. Biophys Rev 2010; 2:169-179. [PMID: 28510038 DOI: 10.1007/s12551-010-0039-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2010] [Accepted: 10/19/2010] [Indexed: 12/01/2022] Open
Abstract
This review describes the MOLS method and its applications. This computational method has been developed in our laboratory primarily to explore the conformational space of small peptides and identify features of interest, particularly the minima, i.e., the low energy conformations. A systematic "brute-force" search through the vast conformational space for such features faces the insurmountable problem of combinatorial explosion, whilst other techniques, e.g., Monte Carlo searches, are somewhat limited in their region of exploration and may be considered inexhaustive. The MOLS method, on the other hand, uses a sampling technique commonly employed in experimental design theory to identify a small sample of the conformational space that nevertheless retains information about the entire space. The information is extracted using a technique that is a variant of the self-consistent mean field technique, which has been used to identify, for example, the optimal set of side-chain conformations in a protein. Applications of the MOLS method to understand peptide structure, predict the structures of loops in proteins, predict three-dimensional structures of small proteins, and arrive at the best conformation, orientation, and positions of a small molecule ligand in a protein receptor site have all yielded satisfactory results.
Affiliation(s)
- L Ramya
- Centre of Advanced Study in Crystallography and Biophysics, University of Madras, Chennai, 600025, India
- Shankaran Nehru Viji
- Centre of Advanced Study in Crystallography and Biophysics, University of Madras, Chennai, 600025, India
- Pandurangan Arun Prasad
- Institute of Structural and Molecular Biology and Crystallography, Department of Biological Sciences, Birkbeck College, University of London, London, UK
- Vadivel Kanagasabai
- Department of Orthopaedic Surgery, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
- Namasivayam Gautham
- Centre of Advanced Study in Crystallography and Biophysics, University of Madras, Chennai, 600025, India
39
Chen S, Yang Z. Molecular Dynamics Simulations of a β-Hairpin Fragment of Protein G by Means of Atom-Bond Electronegativity Equalization Method Fused into Molecular Mechanics (ABEEMδπ/MM). Chin J Chem 2010. [DOI: 10.1002/cjoc.201090350] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 12/27/2022]
40
Crystal structure of the cysteine protease inhibitor 2 from Entamoeba histolytica: functional convergence of a common protein fold. Gene 2010; 471:45-52. [PMID: 20951777 DOI: 10.1016/j.gene.2010.10.006] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Received: 06/09/2010] [Revised: 10/08/2010] [Accepted: 10/08/2010] [Indexed: 11/22/2022]
Abstract
Cysteine proteases (CP) are key pathogenesis and virulence determinants of protozoan parasites. Entamoeba histolytica contains at least 50 cysteine proteases; however, only three (EhCP1, EhCP2 and EhCP5) are responsible for approximately 90% of the cysteine protease activity in this parasite. CPs are expressed as inactive zymogens. Because the processed proteases are potentially cytotoxic, protozoan parasites have developed mechanisms to regulate their activity. Inhibitors of cysteine proteases (ICP) of the chagasin-like inhibitor family (MEROPS family I42) were recently identified in bacteria and protozoan parasites. E. histolytica contains two ICP-encoding genes of the chagasin-like inhibitor family. EhICP1 localizes to the cytosol, whereas EhICP2 is targeted to phagosomes. Herein, we report two crystal structures of EhICP2. The overall structure of EhICP2 consists of eight β-strands and closely resembles the immunoglobulin fold. A comparison between the two crystal forms of EhICP2 indicates that the conserved BC, DE and FG loops form a flexible wedge that may block the active site of CPs. The positively charged surface of the wedge-forming loops in EhICP2 contrasts with the neutral surface of the wedge-forming loops in chagasin. We postulate that the flexibility and positive charge observed in the DE and FG loops of EhICP2 may be important to facilitate the initial binding of this inhibitor to the battery of CPs present in E. histolytica.
41
Danielson ML, Lill MA. New computational method for prediction of interacting protein loop regions. Proteins 2010; 78:1748-59. [PMID: 20186974 DOI: 10.1002/prot.22690] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 12/26/2022]
Abstract
Flexible loop regions of proteins play a crucial role in many biological functions such as protein-ligand recognition, enzymatic catalysis, and protein-protein association. To date, most computational methods that predict the conformational states of loops only focus on individual loop regions. However, loop regions are often spatially in close proximity to one another and their mutual interactions stabilize their conformations. We have developed a new method, titled CorLps, capable of simultaneously predicting such interacting loop regions. First, an ensemble of individual loop conformations is generated for each loop region. The members of the individual ensembles are combined and are accepted or rejected based on a steric clash filter. After a subsequent side-chain optimization step, the resulting conformations of the interacting loops are ranked by the statistical scoring function DFIRE that originated from protein structure prediction. Our results show that predicting interacting loops with CorLps is superior to sequential prediction of the two interacting loop regions, and our method is comparable in accuracy to single loop predictions. Furthermore, improved predictive accuracy of the top-ranked solution is achieved for 12-residue length loop regions by diversifying the initial pool of individual loop conformations using a quality threshold clustering algorithm.
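The combination step described in this abstract (pairing members of the individual loop ensembles and rejecting pairs that fail a steric clash filter) can be illustrated with a minimal Python sketch. This is not the CorLps implementation: the function names, the toy coordinates, and the 3.0 Å cutoff are assumptions made for illustration, and the side-chain optimization and DFIRE ranking steps are omitted.

```python
from itertools import product
from math import dist

# Hypothetical cutoff (in Å): atom pairs closer than this count as a clash.
CLASH_CUTOFF = 3.0


def has_clash(loop_a, loop_b, cutoff=CLASH_CUTOFF):
    """Return True if any atom of loop_a is closer than cutoff to any atom of loop_b."""
    return any(dist(p, q) < cutoff for p in loop_a for q in loop_b)


def combine_ensembles(ensemble_a, ensemble_b, cutoff=CLASH_CUTOFF):
    """Pair every conformation of loop A with every conformation of loop B
    and keep the index pairs that pass the steric clash filter."""
    return [
        (i, j)
        for (i, conf_a), (j, conf_b) in product(
            enumerate(ensemble_a), enumerate(ensemble_b)
        )
        if not has_clash(conf_a, conf_b, cutoff)
    ]


# Toy ensembles: each conformation is a list of (x, y, z) atom coordinates.
ensemble_a = [
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    [(0.0, 5.0, 0.0), (1.5, 5.0, 0.0)],
]
ensemble_b = [
    [(0.0, 1.0, 0.0), (1.5, 1.0, 0.0)],
    [(0.0, 10.0, 0.0), (1.5, 10.0, 0.0)],
]

accepted = combine_ensembles(ensemble_a, ensemble_b)
print(accepted)  # index pairs (i, j) of mutually compatible conformations
```

In a fuller pipeline, the accepted pairs would then be passed on to side-chain optimization and ranked by a scoring function, as the abstract describes.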
Affiliation(s)
- Matthew L Danielson
- Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, Indiana 47907, USA
42
Cortés J, Le DT, Iehl R, Siméon T. Simulating ligand-induced conformational changes in proteins using a mechanical disassembly method. Phys Chem Chem Phys 2010; 12:8268-76. [PMID: 20526495 DOI: 10.1039/c002811h] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Indexed: 11/21/2022]
Abstract
Simulating protein conformational changes induced or required by the internal diffusion of a ligand is important for the understanding of their interaction mechanisms. Such simulations are challenging for currently available computational methods. In this paper, the problem is formulated as a mechanical disassembly problem where the protein and the ligand are modeled like articulated mechanisms, and an efficient method for computing molecular disassembly paths is described. The method extends recent techniques developed in the framework of robot motion planning. Results illustrating the capacities of the approach are presented on two biologically interesting systems involving ligand-induced conformational changes: lactose permease (LacY), and the β2-adrenergic receptor.
Affiliation(s)
- Juan Cortés
- CNRS, LAAS, 7 avenue du colonel Roche, F-31077 Toulouse, France.
43
Choi Y, Deane CM. FREAD revisited: Accurate loop structure prediction using a database search algorithm. Proteins 2010; 78:1431-40. [PMID: 20034110 DOI: 10.1002/prot.22658] [Citation(s) in RCA: 121] [Impact Index Per Article: 8.6] [Indexed: 12/25/2022]
Abstract
Loops are the most variable regions of protein structure and are, in general, the least accurately predicted. Their prediction has been approached in two ways, ab initio and database search. In recent years, it has been thought that ab initio methods are more powerful. In light of the continued rapid expansion in the number of known protein structures, we have re-evaluated FREAD, a database search method, and demonstrate that the power of database search methods may have been underestimated. We found that sequence similarity as quantified by environment-specific substitution scores can be used to significantly improve prediction. In fact, FREAD performs appreciably better for an identifiable subset of loops (two thirds of shorter loops and half of the longer loops tested) than the ab initio methods of MODELLER, PLOP, and RAPPER. Within this subset, FREAD's predictive ability is length independent, in general producing results within 2 Å RMSD, compared to an average of over 10 Å for loop length 20 for any of the other tested methods. We also benchmarked the prediction protocols on a set of 212 loops from the model structures in CASP 7 and 8. An extended version of FREAD is able to make predictions for 127 of these; it gives the best prediction of the methods tested in 61 of these cases. In examining FREAD's ability to predict in the model environment, we found that whole structure quality did not affect the quality of loop predictions.
Affiliation(s)
- Yoonjoo Choi
- Department of Statistics, Oxford University, United Kingdom.
44
Lin YL, Gao J. Internal proton transfer in the external pyridoxal 5'-phosphate Schiff base in dopa decarboxylase. Biochemistry 2010; 49:84-94. [PMID: 19938875 DOI: 10.1021/bi901790e] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Indexed: 11/30/2022]
Abstract
Combined quantum mechanical and molecular mechanical (QM/MM) simulations of dopa decarboxylase have been carried out to elucidate the factors that contribute to the tautomeric equilibrium of the intramolecular proton transfer in the external PLP-L-dopa Schiff base. The presence of a carboxylate anion on the alpha-carbon of the Schiff base stabilizes the zwitterions and shifts the equilibrium in favor of the oxoenamine tautomer (protonated Schiff base). Moreover, protonation of the PLP pyridine nitrogen further drives the equilibrium toward the oxoenamine direction. On the other hand, solvent effects favor the hydroxyimine configuration, although the equilibrium favors the oxoenamine isomer with a methyl group as the substituent on the imino nitrogen. In dopa decarboxylase, the hydroxyimine form of the PLP(H+)-L-dopa Schiff base is predicted to be the major isomer with a relative free energy of -1.3 kcal/mol over that of the oxoenamine isomer. Both Asp271 and Lys303 stabilize the hydroxyimine configuration through hydrogen-bonding interactions with the pyridine nitrogen of the PLP and the imino nitrogen of the Schiff base, respectively. Interestingly, Thr246 plays a double role in the intramolecular proton transfer process, in which it initially donates a hydrogen bond to the phenolate oxygen in the oxoenamine configuration and then switches to a hydrogen bond acceptor from the phenolic hydroxyl group in the hydroxyimine tautomer.
Affiliation(s)
- Yen-lin Lin
- Department of Chemistry and Digital Technology Center, Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota 55455, USA
45
Abstract
Functional characterization of a protein is often facilitated by its 3D structure. However, the fraction of experimentally known 3D models is currently less than 1% due to the inherently time-consuming and complicated nature of structure determination techniques. Computational approaches are employed to bridge the gap between the number of known sequences and that of 3D models. Template-based protein structure modeling techniques rely on the study of principles that dictate the 3D structure of natural proteins from the theory of evolution viewpoint. Strategies for template-based structure modeling will be discussed with a focus on comparative modeling, by reviewing techniques available for all the major steps involved in the comparative modeling pipeline.
Affiliation(s)
- Andras Fiser
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, NY, USA
46
Liu P, Zhu F, Rassokhin DN, Agrafiotis DK. A self-organizing algorithm for modeling protein loops. PLoS Comput Biol 2009; 5:e1000478. [PMID: 19696883 PMCID: PMC2719875 DOI: 10.1371/journal.pcbi.1000478] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Received: 03/18/2009] [Accepted: 07/20/2009] [Indexed: 11/19/2022] Open
Abstract
Protein loops, the flexible short segments connecting two stable secondary structural units in proteins, play a critical role in protein structure and function. Constructing chemically sensible conformations of protein loops that seamlessly bridge the gap between the anchor points without introducing any steric collisions remains an open challenge. A variety of algorithms have been developed to tackle the loop closure problem, ranging from inverse kinematics to knowledge-based approaches that utilize pre-existing fragments extracted from known protein structures. However, many of these approaches focus on the generation of conformations that mainly satisfy the fixed end point condition, leaving the steric constraints to be resolved in subsequent post-processing steps. In the present work, we describe a simple solution that simultaneously satisfies not only the end point and steric conditions, but also chirality and planarity constraints. Starting from random initial atomic coordinates, each individual conformation is generated independently by using a simple alternating scheme of pairwise distance adjustments of randomly chosen atoms, followed by fast geometric matching of the conformationally rigid components of the constituent amino acids. The method is conceptually simple, numerically stable and computationally efficient. Very importantly, additional constraints, such as those derived from NMR experiments, hydrogen bonds or salt bridges, can be incorporated into the algorithm in a straightforward and inexpensive way, making the method ideal for solving more complex multi-loop problems. The remarkable performance and robustness of the algorithm are demonstrated on a set of protein loops of length 4, 8, and 12 that have been used in previous studies.
Protein loops play an important role in protein function, such as ligand binding, recognition, and allosteric regulation. However, due to their flexibility, it is notoriously difficult to determine their 3D structures using traditional experimental techniques. As a result, one can often find protein structures with missing loops in the Protein Data Bank. Their sequence variability also presents a particular challenge for homology modeling methods, which can only yield good overall structures given sufficient sequence identity and good experimental reference structures. Despite extensive research, the construction of protein loop 3D structures remains an open problem, since a sensible conformation should seamlessly bridge the anchor points without introducing steric clashes within the loop itself or between the loop and its surrounding environment. Here, we present a conceptually simple, mathematically straightforward, numerically robust and computationally efficient approach for building protein loop conformations that simultaneously satisfy end-point, steric, planar and chiral constraints. More importantly, additional constraints derived from experimental sources can be incorporated in a straightforward manner, allowing the processing of more complex structures involving multiple interlocking loops.
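The "pairwise distance adjustments of randomly chosen atoms" mentioned in this abstract can be illustrated with a simplified Python sketch. It is an assumption-laden toy, not the authors' algorithm: the function name, the three-atom example, the target distances, and the rule of splitting each correction equally between the two atoms are all hypothetical, and the geometric matching of rigid components is left out.

```python
import random
from math import dist


def adjust_pairwise(coords, targets, steps=2000, seed=0):
    """Repeatedly pick a random constrained atom pair and move both atoms
    along their connecting line so the pair reaches its target distance."""
    rng = random.Random(seed)
    coords = [list(p) for p in coords]
    pairs = list(targets)
    for _ in range(steps):
        i, j = rng.choice(pairs)
        d = dist(coords[i], coords[j])
        if d == 0.0:
            continue  # coincident atoms give no direction to move along
        # Each atom absorbs half of the distance error (assumed update rule).
        scale = 0.5 * (targets[(i, j)] - d) / d
        for k in range(3):
            delta = scale * (coords[i][k] - coords[j][k])
            coords[i][k] += delta
            coords[j][k] -= delta
    return coords


# Toy example: three atoms with hypothetical target distances (in Å).
start = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.5, 0.0)]
targets = {(0, 1): 1.5, (1, 2): 1.5, (0, 2): 2.5}

result = adjust_pairwise(start, targets)
for (i, j), target in targets.items():
    print((i, j), round(dist(result[i], result[j]), 3), target)
```

Additional restraints, such as the experimentally derived ones the abstract mentions, would in this sketch simply be extra entries in `targets`.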
Affiliation(s)
- Pu Liu
- Johnson & Johnson Pharmaceutical Research and Development, Exton, Pennsylvania, United States of America
- * E-mail: (PL); (DKA)
- Fangqiang Zhu
- Johnson & Johnson Pharmaceutical Research and Development, Exton, Pennsylvania, United States of America
- Dmitrii N. Rassokhin
- Johnson & Johnson Pharmaceutical Research and Development, Exton, Pennsylvania, United States of America
- Dimitris K. Agrafiotis
- Johnson & Johnson Pharmaceutical Research and Development, Exton, Pennsylvania, United States of America
- * E-mail: (PL); (DKA)
47
Montalvao RW, Cavalli A, Salvatella X, Blundell TL, Vendruscolo M. Structure Determination of Protein-Protein Complexes Using NMR Chemical Shifts: Case of an Endonuclease Colicin-Immunity Protein Complex. J Am Chem Soc 2008; 130:15990-6. [DOI: 10.1021/ja805258z] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.1] [Indexed: 11/28/2022]
Affiliation(s)
- Rinaldo W. Montalvao
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K., and Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1GA, U.K.
- Andrea Cavalli
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K., and Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1GA, U.K.
- Xavier Salvatella
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K., and Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1GA, U.K.
- Tom L. Blundell
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K., and Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1GA, U.K.
- Michele Vendruscolo
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K., and Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1GA, U.K.
48
Yao P, Dhanik A, Marz N, Propper R, Kou C, Liu G, van den Bedem H, Latombe JC, Halperin-Landsberg I, Altman RB. Efficient algorithms to explore conformation spaces of flexible protein loops. IEEE/ACM Trans Comput Biol Bioinform 2008; 5:534-45. [PMID: 18989041 PMCID: PMC2794838 DOI: 10.1109/tcbb.2008.96] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Indexed: 05/27/2023]
Abstract
Several applications in biology - e.g., incorporation of protein flexibility in ligand docking algorithms, interpretation of fuzzy X-ray crystallographic data, and homology modeling - require computing the internal parameters of a flexible fragment (usually, a loop) of a protein in order to connect its termini to the rest of the protein without causing any steric clash. One must often sample many such conformations in order to explore and adequately represent the conformational range of the studied loop. While sampling must be fast, it is made difficult by the fact that two conflicting constraints - kinematic closure and clash avoidance - must be satisfied concurrently. This paper describes two efficient and complementary sampling algorithms to explore the space of closed clash-free conformations of a flexible protein loop. The "seed sampling" algorithm samples broadly from this space, while the "deformation sampling" algorithm uses seed conformations as starting points to explore the conformation space around them at a finer grain. Computational results are presented for various loops ranging from 5 to 25 residues. More specific results also show that the combination of the sampling algorithms with a functional site prediction software (FEATURE) makes it possible to compute and recognize calcium-binding loop conformations. The sampling algorithms are implemented in a toolkit (LoopTK), which is available at https://simtk.org/home/looptk.
Affiliation(s)
- Peggy Yao
- The Computer Science and Biomedical Informatics Departments, Stanford University, S240 Clark Center, 318 Campus Drive, Stanford, CA 94305.
- Ankur Dhanik
- The Computer Science and Mechanical Engineering Departments, Stanford University, S245 Clark Center, 318 Campus Drive, Stanford, CA 94305.
- Nathan Marz
- The Computer Science Department, Stanford University, S245 Clark Center, 318 Campus Drive, Stanford, CA 94305.
- Ryan Propper
- The Computer Science Department, Stanford University, S245 Clark Center, 318 Campus Drive, Stanford, CA 94305.
- Charles Kou
- The Computer Science Department, Stanford University, S245 Clark Center, 318 Campus Drive, Stanford, CA 94305.
- Guanfeng Liu
- The Computer Science Department, Stanford University, S245 Clark Center, 318 Campus Drive, Stanford, CA 94305.
- Henry van den Bedem
- The Stanford Linear Accelerator Center, SSRL/Joint Center for Structural Genomics, MS 69, 2575 Sand Hill Road, Menlo Park, CA 94025.
- Jean-Claude Latombe
- The Computer Science Department, Stanford University, S245 Clark Center, 318 Campus Drive, Stanford, CA 94305.
- Inbal Halperin-Landsberg
- The Department of Genetics, Stanford University, S240 Clark Center, 318 Campus Drive, Stanford, CA 94305.
- Russ Biagio Altman
- The Department of Bioengineering, Stanford University, 318 Campus Drive S172, Stanford, CA 94305-5444.
49
Eswar N, Webb B, Marti-Renom MA, Madhusudhan MS, Eramian D, Shen MY, Pieper U, Sali A. Comparative protein structure modeling using MODELLER. Curr Protoc Protein Sci 2008; Chapter 2:Unit 2.9. [PMID: 18429317 DOI: 10.1002/0471140864.ps0209s50] [Citation(s) in RCA: 754] [Impact Index Per Article: 47.1] [Indexed: 12/25/2022]
Abstract
Functional characterization of a protein sequence is a common goal in biology, and is usually facilitated by having an accurate three-dimensional (3-D) structure of the studied protein. In the absence of an experimentally determined structure, comparative or homology modeling can sometimes provide a useful 3-D model for a protein that is related to at least one known protein structure. Comparative modeling predicts the 3-D structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described.
Affiliation(s)
- Narayanan Eswar
- University of California at San Francisco, San Francisco, California, USA
50
Abstract
MOTIVATION: The 3D structure of a protein sequence can be assembled from the substructures corresponding to small segments of this sequence. For each small sequence segment, there are only a few likely substructures. We call them the 'structural alphabet' for this segment. Classical approaches such as ROSETTA used sequence profile and secondary structure information to predict structural fragments. In contrast, we utilize more structural information, such as solvent accessibility and contact capacity, for finding structural fragments. RESULTS: An integer linear programming technique is applied to derive the best combination of these sequence and structural information items. This approach generates significantly more accurate and succinct structural alphabets, with more than 50% improvement over the previous accuracies. With these novel structural alphabets, we are able to construct more accurate protein structures than state-of-the-art ab initio protein structure prediction programs such as ROSETTA. We are also able to reduce the size of Kolodny's library by a factor of 8, at the same accuracy. AVAILABILITY: The online FRazor server is under construction.
Affiliation(s)
- Shuai Cheng Li
- David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Ont. N2L 3G1, Canada.