1. Stevenson GA, Kirshner D, Bennion BJ, Yang Y, Zhang X, Zemla A, Torres MW, Epstein A, Jones D, Kim H, Bennett WFD, Wong SE, Allen JE, Lightstone FC. Clustering Protein Binding Pockets and Identifying Potential Drug Interactions: A Novel Ligand-Based Featurization Method. J Chem Inf Model 2023; 63:6655-6666. [PMID: 37847557; PMCID: PMC10647021; DOI: 10.1021/acs.jcim.3c00722] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2023] [Indexed: 10/18/2023]
Abstract
Protein-ligand interactions are essential to drug discovery and drug development efforts. Desirable on-target or multitarget interactions are the first step in finding an effective therapeutic, while undesirable off-target interactions are the first step in assessing safety. In this work, we introduce a novel ligand-based featurization and mapping of human protein pockets to identify closely related protein targets and to project novel drugs into a hybrid protein-ligand feature space to identify their likely protein interactions. Using structure-based template matches from PDB, protein pockets are featured by the ligands that bind to their best co-complex template matches. The simplicity and interpretability of this approach provide a granular characterization of the human proteome at the protein-pocket level instead of the traditional protein-level characterization by family, function, or pathway. We demonstrate the power of this featurization method by clustering a subset of the human proteome and evaluating the predicted cluster associations of over 7000 compounds.
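As a rough illustration of the featurization idea in this abstract, the sketch below describes each pocket by the set of ligands bound to its template matches and compares pockets by set overlap. The pocket names, the ligand identifiers, and the choice of Jaccard similarity are assumptions made for illustration; the abstract does not specify the features or clustering procedure actually used in the paper.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity between two ligand sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical pockets, each described by the ligand IDs found in the
# co-complex templates that match it (illustrative values only).
pocket_ligands = {
    "pocket_A": {"ATP", "ADP", "STU"},
    "pocket_B": {"ATP", "ADP"},
    "pocket_C": {"HEM"},
}


def pairwise_similarity(features: dict) -> dict:
    """Similarity for every unordered pair of pockets, keyed by sorted names."""
    return {
        (p, q): jaccard(features[p], features[q])
        for p, q in combinations(sorted(features), 2)
    }


similarities = pairwise_similarity(pocket_ligands)
```

A similarity table like this could feed any standard clustering routine; pockets that share most of their ligands would end up close together regardless of protein family.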
Affiliation(s)
- Garrett A. Stevenson: Computational Engineering Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Dan Kirshner: Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Brian J. Bennion: Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Yue Yang: Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Xiaohua Zhang: Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Adam Zemla: Global Security Computing Applications Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Marisa W. Torres: Global Security Computing Applications Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Aidan Epstein: Global Security Computing Applications Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Derek Jones: Global Security Computing Applications Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States; Department of Computer Science and Engineering, University of California, San Diego, La Jolla, California 92093, United States
- Hyojin Kim: Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- W. F. Drew Bennett: Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Sergio E. Wong: Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Jonathan E. Allen: Global Security Computing Applications Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Felice C. Lightstone: Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
2. Moussad B, Roche R, Bhattacharya D. The transformative power of transformers in protein structure prediction. Proc Natl Acad Sci U S A 2023; 120:e2303499120. [PMID: 37523536; PMCID: PMC10410766; DOI: 10.1073/pnas.2303499120] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 03/24/2023] [Accepted: 06/27/2023] [Indexed: 08/02/2023] [Open Access]
Abstract
Transformer neural networks have revolutionized structural biology with the ability to predict protein structures at unprecedented high accuracy. Here, we report the predictive modeling performance of the state-of-the-art protein structure prediction methods built on transformers for 69 protein targets from the recently concluded 15th Critical Assessment of Structure Prediction (CASP15) challenge. Our study shows the power of transformers in protein structure modeling and highlights future areas of improvement.
Affiliation(s)
- Bernard Moussad: Department of Computer Science, Virginia Tech, Blacksburg, VA 24061
3. Sandholtz SH, Drocco JA, Zemla AT, Torres MW, Silva MS, Allen JE. A Computational Pipeline to Identify and Characterize Binding Sites and Interacting Chemotypes in SARS-CoV-2. ACS Omega 2023; 8:21871-21884. [PMID: 37309388; PMCID: PMC10254058; DOI: 10.1021/acsomega.3c01621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2023] [Accepted: 05/17/2023] [Indexed: 06/14/2023]
Abstract
Minimizing the human and economic costs of the COVID-19 pandemic and future pandemics requires the ability to develop and deploy effective treatments for novel pathogens as soon as possible after they emerge. To this end, we introduce a new computational pipeline for the rapid identification and characterization of binding sites in viral proteins along with the key chemical features, which we call chemotypes, of the compounds predicted to interact with those same sites. The composition of source organisms for the structural models associated with an individual binding site is used to assess the site's degree of structural conservation across different species, including other viruses and humans. We propose a search strategy for novel therapeutics that involves the selection of molecules preferentially containing the most structurally rich chemotypes identified by our algorithm. While we demonstrate the pipeline on SARS-CoV-2, it is generalizable to any new virus, as long as either experimentally solved structures for its proteins are available or sufficiently accurate predicted structures can be constructed.
Affiliation(s)
- Sarah H. Sandholtz: Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States of America
- Jeffrey A. Drocco: Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States of America
- Adam T. Zemla: Global Security Computing Applications Division, Computing Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States of America
- Marisa W. Torres: Global Security Computing Applications Division, Computing Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States of America
- Mary S. Silva: Global Security Computing Applications Division, Computing Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States of America
- Jonathan E. Allen: Global Security Computing Applications Division, Computing Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States of America
4. Zemla AT, Allen JE, Kirshner D, Lightstone FC. PDBspheres: a method for finding 3D similarities in local regions in proteins. NAR Genom Bioinform 2022; 4:lqac078. [PMID: 36225529; PMCID: PMC9549786; DOI: 10.1093/nargab/lqac078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2022] [Revised: 08/06/2022] [Accepted: 09/29/2022] [Indexed: 11/05/2022] [Open Access]
Abstract
We present a structure-based method for finding and evaluating structural similarities in protein regions relevant to ligand binding. PDBspheres comprises an exhaustive library of protein structure regions ('spheres') adjacent to complexed ligands derived from the Protein Data Bank (PDB), along with methods to find and evaluate structural matches between a protein of interest and spheres in the library. PDBspheres uses the LGA (Local-Global Alignment) structure alignment algorithm as the main engine for detecting structural similarities between the protein of interest and template spheres from the library, which currently contains >2 million spheres. To assess confidence in structural matches, an all-atom-based similarity metric takes side chain placement into account. Here, we describe the PDBspheres method, demonstrate its ability to detect and characterize binding sites in protein structures, show how PDBspheres, a strictly structure-based method, performs on a curated dataset of 2528 ligand-bound and ligand-free crystal structures, and use PDBspheres to cluster pockets and assess structural similarities among protein binding sites of 4876 structures in the 'refined set' of the PDBbind 2019 dataset.
Affiliation(s)
- Adam T Zemla: To whom correspondence should be addressed. Tel: +1 925 423 5571; Fax: +1 925 423 6437
- Jonathan E Allen: Global Security Computing Applications, Lawrence Livermore National Laboratory, Livermore, CA 94550, USA
- Dan Kirshner: Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, CA 94550, USA
- Felice C Lightstone: Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, CA 94550, USA
5. Kinch LN, Pei J, Kryshtafovych A, Schaeffer RD, Grishin NV. Topology evaluation of models for difficult targets in the 14th round of the critical assessment of protein structure prediction. Proteins 2021; 89:1673-1686. [PMID: 34240477; DOI: 10.1002/prot.26172] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 05/01/2021] [Revised: 06/28/2021] [Accepted: 07/01/2021] [Indexed: 12/25/2022]
Abstract
This report describes the tertiary structure prediction assessment of difficult modeling targets in the 14th round of the Critical Assessment of Structure Prediction (CASP14). We implemented an official ranking scheme that used the same scores as the previous CASP topology-based assessment, but combined these scores with one that emphasized physically realistic models. The top performing AlphaFold2 group outperformed the rest of the prediction community on all but two of the difficult targets considered in this assessment. They provided high quality models for most of the targets (86% over GDT_TS 70), including larger targets above 150 residues, and they correctly predicted the topology of almost all the rest. AlphaFold2 performance was followed by two manual Baker methods, a Feig method that refined Zhang-server models, two notable automated Zhang server methods (QUARK and Zhang-server), and a Zhang manual group. Despite the remarkable progress in protein structure prediction of difficult targets, both the prediction community and AlphaFold2, to a lesser extent, faced challenges with flexible regions and obligate oligomeric assemblies. The official ranking of top-performing methods was supported by performance generated PCA and heatmap clusters that gave insight into target difficulties and the most successful state-of-the-art structure prediction methodologies.
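The GDT_TS threshold quoted in this abstract can be made concrete with a small sketch. GDT_TS is the average, over distance cutoffs of 1, 2, 4, and 8 Å, of the percentage of Cα atoms in the model that lie within the cutoff of their position in the target. The version below is a simplification: it assumes the per-residue distances have already been measured after a single fixed superposition, whereas the full procedure searches for the best superposition at each cutoff. The function name and example distances are hypothetical.

```python
def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Approximate GDT_TS (0-100) from per-residue CA distances in angstroms.

    Assumes one fixed superposition; the full method optimizes the
    superposition separately for each cutoff, so this can underestimate.
    """
    if not distances:
        raise ValueError("at least one residue distance is required")
    n = len(distances)
    # Fraction of residues within each cutoff, then averaged over cutoffs.
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


# Hypothetical distances for a four-residue model.
example_score = gdt_ts([0.5, 1.5, 3.0, 9.0])
```

Under this definition, a score above 70 means that, averaged over the four cutoffs, more than 70% of residues fall within the cutoff distance.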
Affiliation(s)
- Lisa N Kinch: Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Jimin Pei: Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- R Dustin Schaeffer: Department of Biophysics and Biochemistry, University of Texas Southwestern Medical Center, Dallas, Texas, USA
- Nick V Grishin: Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, USA; Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, USA; Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, Texas, USA
6. Toth JM, DePietro PJ, Haas J, McLaughlin WA. ResiRole: residue-level functional site predictions to gauge the accuracies of protein structure prediction techniques. Bioinformatics 2021; 37:351-359. [PMID: 32780798; PMCID: PMC8058773; DOI: 10.1093/bioinformatics/btaa712] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/12/2019] [Revised: 07/31/2020] [Accepted: 08/05/2020] [Indexed: 11/25/2022] [Open Access]
Abstract
Motivation: Methods to assess the quality of protein structure models are needed for user applications. To aid with the selection of structure models and further inform the development of structure prediction techniques, we describe the ResiRole method for the assessment of the quality of structure models. Results: Structure prediction techniques are ranked according to the results of round-robin, head-to-head comparisons using difference scores. Each difference score was defined as the absolute value of the cumulative probability for a functional site prediction made with the FEATURE program for the reference structure minus that for the structure model. Overall, the difference scores correlate well with other model quality metrics, and based on benchmarking studies with NaïveBLAST, they are found to detect additional local structural similarities between the structure models and reference structures. Availability and implementation: Automated analyses of models addressed in CAMEO are available via the ResiRole server, URL http://protein.som.geisinger.edu/ResiRole/. Interactive analyses with user-provided models and reference structures are also enabled. Code is available at github.com/wamclaughlin/ResiRole. Supplementary information: Supplementary data are available at Bioinformatics online.
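The difference score defined in this abstract is simple enough to sketch directly: for one predicted functional site, take the cumulative probability for the reference structure, subtract that for the structure model, and keep the absolute value. The code below is an illustration only. The dictionary layout, site labels, and probability values are hypothetical, and averaging the scores per model is an assumed summary for the example, not necessarily the statistic used by ResiRole.

```python
def difference_score(p_reference: float, p_model: float) -> float:
    """Absolute difference between cumulative probabilities for one site."""
    return abs(p_reference - p_model)


def mean_difference(reference: dict, model: dict) -> float:
    """Average difference score over the sites present in both structures."""
    shared = sorted(reference.keys() & model.keys())
    if not shared:
        raise ValueError("no functional sites in common")
    total = sum(difference_score(reference[s], model[s]) for s in shared)
    return total / len(shared)


# Hypothetical cumulative probabilities keyed by (residue number, site type).
reference = {(45, "ATP_binding"): 0.90, (112, "zinc_binding"): 0.60}
model_a = {(45, "ATP_binding"): 0.80, (112, "zinc_binding"): 0.55}
model_b = {(45, "ATP_binding"): 0.40, (112, "zinc_binding"): 0.70}

score_a = mean_difference(reference, model_a)
score_b = mean_difference(reference, model_b)
```

In a head-to-head comparison of this kind, the model with the smaller score (here `model_a`) preserves the predicted functional sites of the reference more faithfully.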
Affiliation(s)
- Joshua M Toth: Department of Medical Education, Geisinger Commonwealth School of Medicine, Scranton, PA 18510, USA
- Paul J DePietro: Department of Medical Education, Geisinger Commonwealth School of Medicine, Scranton, PA 18510, USA
- Juergen Haas: Biozentrum, University of Basel and SIB Swiss Institute of Bioinformatics, CH-4056 Basel, Switzerland
- William A McLaughlin: Department of Medical Education, Geisinger Commonwealth School of Medicine, Scranton, PA 18510, USA
7. Jones D, Kim H, Zhang X, Zemla A, Stevenson G, Bennett WFD, Kirshner D, Wong SE, Lightstone FC, Allen JE. Improved Protein-Ligand Binding Affinity Prediction with Structure-Based Deep Fusion Inference. J Chem Inf Model 2021; 61:1583-1592. [PMID: 33754707; DOI: 10.1021/acs.jcim.0c01306] [Citation(s) in RCA: 93] [Impact Index Per Article: 31.0] [Indexed: 11/28/2022]
Abstract
Predicting accurate protein-ligand binding affinities is an important task in drug discovery but remains a challenge even with computationally expensive biophysics-based energy scoring methods and state-of-the-art deep learning approaches. Despite the recent advances in the application of deep convolutional and graph neural network-based approaches, it remains unclear what the relative advantages of each approach are and how they compare with physics-based methodologies that have found more mainstream success in virtual screening pipelines. We present fusion models that combine features and inference from complementary representations to improve binding affinity prediction. This, to our knowledge, is the first comprehensive study that uses a common series of evaluations to directly compare the performance of three-dimensional (3D)-convolutional neural networks (3D-CNNs), spatial graph neural networks (SG-CNNs), and their fusion. We use temporal and structure-based splits to assess performance on novel protein targets. To test the practical applicability of our models, we examine their performance in cases that assume that the crystal structure is not available. In these cases, binding free energies are predicted using docking pose coordinates as the inputs to each model. In addition, we compare these deep learning approaches to predictions based on docking scores and molecular mechanics/generalized Born surface area (MM/GBSA) calculations. Our results show that the fusion models make more accurate predictions than their constituent neural network models as well as docking scoring and MM/GBSA rescoring, with the benefit of greater computational efficiency than the MM/GBSA method. Finally, we provide the code to reproduce our results and the parameter files of the trained models used in this work. The software is available as open source at https://github.com/llnl/fast. Model parameter files are available at ftp://gdo-bioinformatics.ucllnl.org/fast/pdbbind2016_model_checkpoints/.
Affiliation(s)
- Derek Jones: Global Security Computing Applications Division, Lawrence Livermore National Laboratory, 7000 East Avenue, Livermore, California 94550, United States
- Hyojin Kim: Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, 7000 East Avenue, Livermore, California 94550, United States
- Xiaohua Zhang: Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, 7000 East Avenue, Livermore, California 94550, United States
- Adam Zemla: Global Security Computing Applications Division, Lawrence Livermore National Laboratory, 7000 East Avenue, Livermore, California 94550, United States
- Garrett Stevenson: Computational Engineering Division, Lawrence Livermore National Laboratory, 7000 East Avenue, Livermore, California 94550, United States
- W F Drew Bennett: Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, 7000 East Avenue, Livermore, California 94550, United States
- Daniel Kirshner: Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, 7000 East Avenue, Livermore, California 94550, United States
- Sergio E Wong: Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, 7000 East Avenue, Livermore, California 94550, United States
- Felice C Lightstone: Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, 7000 East Avenue, Livermore, California 94550, United States
- Jonathan E Allen: Global Security Computing Applications Division, Lawrence Livermore National Laboratory, 7000 East Avenue, Livermore, California 94550, United States
8. Kulik M, Mori T, Sugita Y. Multi-Scale Flexible Fitting of Proteins to Cryo-EM Density Maps at Medium Resolution. Front Mol Biosci 2021; 8:631854. [PMID: 33842541; PMCID: PMC8025875; DOI: 10.3389/fmolb.2021.631854] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 11/21/2020] [Accepted: 01/26/2021] [Indexed: 11/13/2022] [Open Access]
Abstract
Structure determination using cryo-electron microscopy (cryo-EM) medium-resolution density maps is often facilitated by flexible fitting. Avoiding overfitting, adjusting force constants driving the structure to the density map, and emulating complex conformational transitions are major concerns in the fitting. To address them, we develop a new method based on a three-step multi-scale protocol. First, flexible fitting molecular dynamics (MD) simulations with coarse-grained structure-based force field and replica-exchange scheme between different force constants replicas are performed. Second, fitted Cα atom positions guide the all-atom structure in targeted MD. Finally, the all-atom flexible fitting refinement in implicit solvent adjusts the positions of the side chains in the density map. Final models obtained via the multi-scale protocol are significantly better resolved and more reliable in comparison with long all-atom flexible fitting simulations. The protocol is useful for multi-domain systems with intricate structural transitions as it preserves the secondary structure of single domains.
Affiliation(s)
- Marta Kulik: Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako-shi, Japan
- Takaharu Mori: Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako-shi, Japan
- Yuji Sugita: Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako-shi, Japan; RIKEN Center for Computational Science, Kobe, Japan; RIKEN Center for Biosystems Dynamics Research, Kobe, Japan
9. DeepTracer for fast de novo cryo-EM protein structure modeling and special studies on CoV-related complexes. Proc Natl Acad Sci U S A 2021; 118:e2017525118. [PMID: 33361332; PMCID: PMC7812826; DOI: 10.1073/pnas.2017525118] [Citation(s) in RCA: 102] [Impact Index Per Article: 34.0] [Indexed: 01/18/2023] [Open Access]
Abstract
Electron cryomicroscopy (cryo-EM), a 2017 Nobel prize-awarded technology, provides direct 3D maps of macromolecules and explains the shape and interactions of protein complexes such as SARS-CoV-2 viral proteins and human cell receptors. This understanding can be combined with detailed structural information gathered using other technologies to form the basis for modeling course of diseases and for designing therapeutic drugs. However, ab initio modeling of protein complex structure remains a challenging problem. Here, we present DeepTracer, a fully automated and robust tool that determines the all-atom structure of a protein complex based solely on its cryo-EM map and amino acid sequence, with improved accuracy and efficiency compared to previous methods. We also provide a web service for global access.
Information about macromolecular structure of protein complexes and related cellular and molecular mechanisms can assist the search for vaccines and drug development processes. To obtain such structural information, we present DeepTracer, a fully automated deep learning-based method for fast de novo multichain protein complex structure determination from high-resolution cryoelectron microscopy (cryo-EM) maps. We applied DeepTracer on a previously published set of 476 raw experimental cryo-EM maps and compared the results with a current state of the art method. The residue coverage increased by over 30% using DeepTracer, and the rmsd value improved from 1.29 Å to 1.18 Å. Additionally, we applied DeepTracer on a set of 62 coronavirus-related cryo-EM maps, among them 10 with no deposited structure available in EMDataResource. We observed an average residue match of 84% with the deposited structures and an average rmsd of 0.93 Å. Additional tests with related methods further exemplify DeepTracer’s competitive accuracy and efficiency of structure modeling. DeepTracer allows for exceptionally fast computations, making it possible to trace around 60,000 residues in 350 chains within only 2 h. The web service is globally accessible at https://deeptracer.uw.edu.
10. Fowler NJ, Sljoka A, Williamson MP. A method for validating the accuracy of NMR protein structures. Nat Commun 2020; 11:6321. [PMID: 33339822; PMCID: PMC7749147; DOI: 10.1038/s41467-020-20177-1] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 09/25/2020] [Accepted: 11/13/2020] [Indexed: 01/13/2023] [Open Access]
Abstract
We present a method that measures the accuracy of NMR protein structures. It compares random coil index [RCI] against local rigidity predicted by mathematical rigidity theory, calculated from NMR structures [FIRST], using a correlation score (which assesses secondary structure), and an RMSD score (which measures overall rigidity). We test its performance using: structures refined in explicit solvent, which are much better than unrefined structures; decoy structures generated for 89 NMR structures; and conventional predictors of accuracy such as number of restraints per residue, restraint violations, energy of structure, ensemble RMSD, Ramachandran distribution, and clashscore. Restraint violations and RMSD are poor measures of accuracy. Comparisons of NMR to crystal structures show that secondary structure is equally accurate, but crystal structures are typically too rigid in loops, whereas NMR structures are typically too floppy overall. We show that the method is a useful addition to existing measures of accuracy.
Affiliation(s)
- Nicholas J Fowler: Dept of Molecular Biology and Biotechnology, University of Sheffield, Sheffield, UK
- Adnan Sljoka: RIKEN Center for Advanced Intelligence Project, RIKEN, 1-4-1 Nihombashi, Chuo-ku, Tokyo, 103-0027, Japan; Dept of Chemistry, University of Toronto, UTM, 3359 Mississauga Road North, Mississauga, ON, L5L 1C6, Canada
- Mike P Williamson: Dept of Molecular Biology and Biotechnology, University of Sheffield, Sheffield, UK
11. Amera GM, Khan RJ, Pathak A, Jha RK, Muthukumaran J, Singh AK. Screening of promising molecules against MurG as drug target in multi-drug-resistant-Acinetobacter baumannii - insights from comparative protein modeling, molecular docking and molecular dynamics simulation. J Biomol Struct Dyn 2019; 38:5230-5252. [DOI: 10.1080/07391102.2019.1700167] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 01/03/2023]
Affiliation(s)
- Gizachew Muluneh Amera: Department of Biotechnology, School of Engineering and Technology, Sharda University, Greater Noida, Uttar Pradesh, India
- Rameez Jabeer Khan: Department of Biotechnology, School of Engineering and Technology, Sharda University, Greater Noida, Uttar Pradesh, India
- Amita Pathak: Department of Chemistry, Indian Institute of Technology Delhi, New Delhi, India
- Rajat Kumar Jha: Department of Biotechnology, School of Engineering and Technology, Sharda University, Greater Noida, Uttar Pradesh, India
- Jayaraman Muthukumaran: Department of Biotechnology, School of Engineering and Technology, Sharda University, Greater Noida, Uttar Pradesh, India
- Amit Kumar Singh: Department of Biotechnology, School of Engineering and Technology, Sharda University, Greater Noida, Uttar Pradesh, India
12. Croll TI, Sammito MD, Kryshtafovych A, Read RJ. Evaluation of template-based modeling in CASP13. Proteins 2019; 87:1113-1127. [PMID: 31407380; PMCID: PMC6851432; DOI: 10.1002/prot.25800] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 05/10/2019] [Revised: 07/29/2019] [Accepted: 08/08/2019] [Indexed: 12/12/2022]
Abstract
Performance in the template‐based modeling (TBM) category of CASP13 is assessed here, using a variety of metrics. Performance of the predictor groups that participated is ranked using the primary ranking score that was developed by the assessors for CASP12. This reveals that the best results are obtained by groups that include contact predictions or inter‐residue distance predictions derived from deep multiple sequence alignments. In cases where there is a good homolog in the wwPDB (TBM‐easy category), the best results are obtained by modifying a template. However, for cases with poorer homologs (TBM‐hard), very good results can be obtained without using an explicit template, by deep learning algorithms trained on the wwPDB. Alternative metrics are introduced, to allow testing of aspects of structural models that are not addressed by traditional CASP metrics. These include comparisons to the main‐chain and side‐chain torsion angles of the target, and the utility of models for solving crystal structures by the molecular replacement method. The alternative metrics are poorly correlated with the traditional metrics, and it is proposed that modeling has reached a sufficient level of maturity that the best models should be expected to satisfy this wider range of criteria.
Affiliation(s)
- Tristan I Croll: Department of Haematology, University of Cambridge, Cambridge Institute for Medical Research, Cambridge, UK
- Massimo D Sammito: Department of Haematology, University of Cambridge, Cambridge Institute for Medical Research, Cambridge, UK
- Randy J Read: Department of Haematology, University of Cambridge, Cambridge Institute for Medical Research, Cambridge, UK
13. Heo L, Arbour CF, Feig M. Driven to near-experimental accuracy by refinement via molecular dynamics simulations. Proteins 2019; 87:1263-1275. [PMID: 31197841; DOI: 10.1002/prot.25759] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 04/19/2019] [Revised: 06/01/2019] [Accepted: 06/07/2019] [Indexed: 12/17/2022]
Abstract
Protein model refinement has been an essential part of successful protein structure prediction. Molecular dynamics simulation-based refinement methods have shown consistent improvement of protein models. There had been progress in the extent of refinement for a few years since the idea of ensemble averaging of sampled conformations emerged. There was little progress in CASP12 because conformational sampling was not sufficiently diverse due to harmonic restraints. During CASP13, a new refinement method was tested that achieved significant improvements over CASP12. The new method intended to address previous bottlenecks in the refinement problem by introducing new features. Flat-bottom harmonic restraints replaced harmonic restraints, sampling was performed iteratively, and a new scoring function and selection criteria were used. The new protocol expanded conformational sampling at reduced computational costs. In addition to overall improvements, some models were refined significantly to near-experimental accuracy.
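The contrast between harmonic and flat-bottom harmonic restraints mentioned in this abstract can be shown with a short sketch. A harmonic restraint penalizes any displacement from the reference position, while a flat-bottom version applies no penalty inside a tolerance window and grows quadratically only beyond it, which leaves room for wider conformational sampling. The functional form without a factor of 1/2, the force constant, and the window width below are assumptions for illustration; the abstract does not give the parameters used in the paper.

```python
def harmonic(d: float, d0: float, k: float) -> float:
    """Harmonic restraint energy: penalizes every deviation from d0."""
    return k * (d - d0) ** 2


def flat_bottom_harmonic(d: float, d0: float, k: float, width: float) -> float:
    """Flat-bottom restraint: zero within +/- width of d0, quadratic beyond."""
    excess = abs(d - d0) - width
    return k * excess ** 2 if excess > 0 else 0.0


# Hypothetical values: displacement in angstroms, arbitrary force constant.
inside_window = flat_bottom_harmonic(1.5, 0.0, 2.0, 2.0)
outside_window = flat_bottom_harmonic(3.0, 0.0, 2.0, 2.0)
plain_harmonic = harmonic(3.0, 0.0, 2.0)
```

With these numbers, a 3 Å displacement costs far less under the flat-bottom form than under the plain harmonic form, and a 1.5 Å displacement costs nothing at all.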
Affiliation(s)
- Lim Heo: Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan
- Collin F Arbour: Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan
- Michael Feig: Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan
14. Structure based in-silico study on UDP-N-acetylmuramoyl-L-alanyl-D-glutamate-2,6-diaminopimelate ligase (MurE) from Acinetobacter baumannii as a drug target against nosocomial infections. Informatics in Medicine Unlocked 2019. [DOI: 10.1016/j.imu.2019.100216] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 11/24/2022] [Open Access]
|
15
|
Kryshtafovych A, Adams PD, Lawson CL, Chiu W. Evaluation system and web infrastructure for the second cryo-EM model challenge. J Struct Biol 2018; 204:96-108. [PMID: 30017700 DOI: 10.1016/j.jsb.2018.07.006] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 04/21/2018] [Revised: 07/06/2018] [Accepted: 07/10/2018] [Indexed: 01/01/2023]
Abstract
An evaluation system and a web infrastructure were developed for the second cryo-EM model challenge. The evaluation system includes tools to validate stereo-chemical plausibility of submitted models, check their fit to the corresponding density maps, estimate their overall and per-residue accuracy, and assess their similarity to reference cryo-EM or X-ray structures as well as other models submitted in this challenge. The web infrastructure provides a convenient interface for analyzing models at different levels of detail. It includes interactively sortable tables of evaluation scores for different subsets of models and different sublevels of structure organization, and a suite of visualization tools facilitating model analysis. The results are publicly accessible at http://model-compare.emdatabank.org.
Affiliation(s)
- Andriy Kryshtafovych
- Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, CA 95616, USA.
| | - Paul D Adams
- Molecular Biophysics & Integrated Bioimaging, LBNL, CA 94720, USA; Department of Bioengineering, University of California Berkeley, CA 94720, USA
| | - Catherine L Lawson
- Institute for Quantitative Biomedicine and Research Collaboratory for Structural Bioinformatics, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
| | - Wah Chiu
- Departments of Bioengineering and Microbiology & Immunology, Stanford University, Stanford, CA 94305-5447, USA; Division of CryoEM and Bioimaging, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA
|
16
|
Williams CJ, Headd JJ, Moriarty NW, Prisant MG, Videau LL, Deis LN, Verma V, Keedy DA, Hintze BJ, Chen VB, Jain S, Lewis SM, Arendall WB, Snoeyink J, Adams PD, Lovell SC, Richardson JS, Richardson DC. MolProbity: More and better reference data for improved all-atom structure validation. Protein Sci 2018; 27:293-315. [PMID: 29067766 PMCID: PMC5734394 DOI: 10.1002/pro.3330] [Citation(s) in RCA: 2364] [Impact Index Per Article: 394.0] [Received: 10/06/2017] [Revised: 10/19/2017] [Accepted: 10/23/2017] [Indexed: 12/27/2022]
Abstract
This paper describes the current update on macromolecular model validation services that are provided at the MolProbity website, emphasizing changes and additions since the previous review in 2010. There have been many infrastructure improvements, including rewrite of previous Java utilities to now use existing or newly written Python utilities in the open-source CCTBX portion of the Phenix software system. This improves long-term maintainability and enhances the thorough integration of MolProbity-style validation within Phenix. There is now a complete MolProbity mirror site at http://molprobity.manchester.ac.uk. GitHub serves our open-source code, reference datasets, and the resulting multi-dimensional distributions that define most validation criteria. Coordinate output after Asn/Gln/His "flip" correction is now more idealized, since the post-refinement step has apparently often been skipped in the past. Two distinct sets of heavy-atom-to-hydrogen distances and accompanying van der Waals radii have been researched and improved in accuracy, one for the electron-cloud-center positions suitable for X-ray crystallography and one for nuclear positions. New validations include messages at input about problem-causing format irregularities, updates of Ramachandran and rotamer criteria from the million quality-filtered residues in a new reference dataset, the CaBLAM Cα-CO virtual-angle analysis of backbone and secondary structure for cryoEM or low-resolution X-ray, and flagging of the very rare cis-nonProline and twisted peptides which have recently been greatly overused. Due to wide application of MolProbity validation and corrections by the research community, in Phenix, and at the worldwide Protein Data Bank, newly deposited structures have continued to improve greatly as measured by MolProbity's unique all-atom clashscore.
Affiliation(s)
| | - Jeffrey J. Headd
- Department of Biochemistry, Duke University, Durham, NC 27710, USA
- Present address:
Janssen Research and Development, Spring House, PA 19477, USA
| | - Nigel W. Moriarty
- Molecular Biosciences and Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | | | | | - Lindsay N. Deis
- Department of Biochemistry, Duke University, Durham, NC 27710, USA
- Present address:
Department of Biochemistry, Stanford University, Stanford, CA 95126, USA
| | - Vishal Verma
- Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599, USA
| | - Daniel A. Keedy
- Department of Biochemistry, Duke University, Durham, NC 27710, USA
- Present address:
Structural Biology Initiative and Department of Chemistry & Biochemistry, CUNY Advanced Science Research Center, City University of New York, New York, NY 10031, USA
| | | | | | - Swati Jain
- Department of Biochemistry, Duke University, Durham, NC 27710, USA
- Present address:
Department of Chemistry, New York University, New York, NY, USA
| | - Steven M. Lewis
- Department of Biochemistry, Duke University, Durham, NC 27710, USA
- Present address:
Cyrus Biotechnology, 500 Union Street, Suite 320, Seattle, WA 98101, USA
| | | | - Jack Snoeyink
- Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599, USA
| | - Paul D. Adams
- Molecular Biosciences and Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Simon C. Lovell
- School of Biological Sciences, University of Manchester, Manchester M13 9PT, UK
|
17
|
Kryshtafovych A, Monastyrskyy B, Fidelis K, Moult J, Schwede T, Tramontano A. Evaluation of the template-based modeling in CASP12. Proteins 2017; 86 Suppl 1:321-334. [PMID: 29159950 DOI: 10.1002/prot.25425] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Received: 07/28/2017] [Revised: 10/22/2017] [Accepted: 11/16/2017] [Indexed: 01/29/2023]
Abstract
The article describes results of numerical evaluation of CASP12 models submitted on targets for which structural templates could be identified and for which servers produced models of relatively high accuracy. The emphasis is on analysis of details of models, and how well the models compete with experimental structures. Performance of contributing research groups is measured in terms of backbone accuracy, all-atom local geometry, and the ability to estimate local errors in models. Separate analyses for all participating groups and automatic servers were carried out. Compared with the last CASP, two years ago, there have been significant improvements in a number of areas, particularly the accuracy of protein backbone atoms, accuracy of sequence alignment between models and available structures, increased accuracy over that which can be obtained from simple copying of a closest template, and accuracy of modeling of sub-structures not present in the closest template. These advancements are likely associated with more effective strategies to build non-template regions of the targets ab initio, better algorithms to combine information from multiple templates, enhanced refinement methods, and better methods for estimating model accuracy.
Affiliation(s)
- Andriy Kryshtafovych
- Protein Structure Prediction Center, Genome Center, University of California, Davis, California
| | - Bohdan Monastyrskyy
- Protein Structure Prediction Center, Genome Center, University of California, Davis, California
| | - Krzysztof Fidelis
- Protein Structure Prediction Center, Genome Center, University of California, Davis, California
| | - John Moult
- Institute for Bioscience and Biotechnology Research and Department of Cell Biology and Molecular Genetics, University of Maryland, Maryland
| | - Torsten Schwede
- Biozentrum, University of Basel, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Anna Tramontano
- Department of Biochemical Sciences, Sapienza - University of Rome, P. le A. Moro, 5, Rome, 00185
|
18
|
Ghosh S, Gadiyaram V, Vishveshwara S. Validation of protein structure models using network similarity score. Proteins 2017; 85:1759-1776. [PMID: 28598579 DOI: 10.1002/prot.25332] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 10/12/2016] [Revised: 05/07/2017] [Accepted: 06/07/2017] [Indexed: 12/27/2022]
Abstract
Accurate structural validation of proteins is of extreme importance in studies like protein structure prediction, analysis of molecular dynamics simulation trajectories and finding subtle changes in very similar structures. The benchmarks for today's structure validation are scoring methods like global distance test-total structure (GDT-TS), TM-score and root mean square deviations (RMSD). However, there is a lack of methods that look at both the protein backbone and side-chain structures at the global connectivity level and provide information about the differences in connectivity. To address this gap, a graph spectral based method (NSS-network similarity score), which has been recently developed to rigorously compare networks in diverse fields, is adopted to compare protein structures both at the backbone and at the side-chain noncovalent connectivity levels. In this study, we validate the performance of NSS by investigating protein structures from X-ray structures, modeling (including CASP models), and molecular dynamics simulations. Further, we systematically identify the local and the global regions of the structures contributing to the difference in NSS, through the components of the score, a feature unique to this spectral based scoring scheme. It is demonstrated that the method can quantify subtle differences in connectivity compared to a reference protein structure and can form a robust basis for protein structure comparison. Additionally, we have also introduced a network-based method to analyze fluctuations in side chain interactions (edge-weights) in an ensemble of structures, which can be a useful tool for the analysis of MD trajectories.
Affiliation(s)
- Sambit Ghosh
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, Karnataka, India; Department of Mathematics, Indian Institute of Science, Bangalore, Karnataka, India
| | - Vasundhara Gadiyaram
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, Karnataka, India; Department of Mathematics, Indian Institute of Science, Bangalore, Karnataka, India
|
19
|
Gadzała M, Kalinowska B, Banach M, Konieczny L, Roterman I. Determining protein similarity by comparing hydrophobic core structure. Heliyon 2017; 3:e00235. [PMID: 28217749 PMCID: PMC5300504 DOI: 10.1016/j.heliyon.2017.e00235] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 10/01/2016] [Revised: 12/06/2016] [Accepted: 01/19/2017] [Indexed: 12/19/2022] Open
Abstract
Formal assessment of structural similarity is - next to protein structure prediction - arguably the most important unsolved problem in proteomics. In this paper we propose a similarity criterion based on commonalities between the proteins' hydrophobic cores. The hydrophobic core emerges as a result of conformational changes through which each residue reaches its intended position in the protein body. A quantitative criterion based on this phenomenon has been proposed in the framework of the CASP challenge. The structure of the hydrophobic core - including the placement and scope of any deviations from the idealized model - may indirectly point to areas of importance from the point of view of the protein's biological function. Our analysis focuses on an arbitrarily selected target from the CASP11 challenge. The proposed measure, while compliant with CASP criteria (70-80% correlation), involves certain adjustments which acknowledge the presence of factors other than simple spatial arrangement of solids.
Affiliation(s)
- M. Gadzała
- AGH - Academic Computer Center − Cyfronet, Nawojki 11, Kraków 30-950, Poland
| | - B. Kalinowska
- Faculty of Physics, Astronomy, Applied Computer Science − Jagiellonian University, Łojasiewicza 11, Kraków 30-348, Poland
| | - M. Banach
- Department of Bioinformatics and Telemedicine, Jagiellonian University − Medical College, Łazarza 16, Krakow 31-530, Poland
| | - L. Konieczny
- Chair of Medical Biochemistry, Jagiellonian University − Medical College, Kopernika 7, Kraków 31-034, Poland
| | - I. Roterman
- Department of Bioinformatics and Telemedicine, Jagiellonian University − Medical College, Łazarza 16, Krakow 31-530, Poland
|
20
|
Pang YP. FF12MC: A revised AMBER forcefield and new protein simulation protocol. Proteins 2016; 84:1490-516. [PMID: 27348292 PMCID: PMC5129589 DOI: 10.1002/prot.25094] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 04/11/2016] [Revised: 06/16/2016] [Accepted: 06/18/2016] [Indexed: 12/25/2022]
Abstract
Specialized to simulate proteins in molecular dynamics (MD) simulations with explicit solvation, FF12MC is a combination of a new protein simulation protocol employing uniformly reduced atomic masses by tenfold and a revised AMBER forcefield FF99 with (i) shortened CH bonds, (ii) removal of torsions involving a nonperipheral sp³ atom, and (iii) reduced 1-4 interaction scaling factors of torsions ϕ and ψ. This article reports that in multiple, distinct, independent, unrestricted, unbiased, isobaric-isothermal, and classical MD simulations FF12MC can (i) simulate the experimentally observed flipping between left- and right-handed configurations for C14-C38 of BPTI in solution, (ii) autonomously fold chignolin, CLN025, and Trp-cage with folding times that agree with the experimental values, (iii) simulate subsequent unfolding and refolding of these miniproteins, and (iv) achieve a robust Z score of 1.33 for refining protein models TMR01, TMR04, and TMR07. By comparison, the latest general-purpose AMBER forcefield FF14SB locks the C14-C38 bond to the right-handed configuration in solution under the same protein simulation conditions. Statistical survival analysis shows that FF12MC folds chignolin and CLN025 in isobaric-isothermal MD simulations 2-4 times faster than FF14SB under the same protein simulation conditions. These results suggest that FF12MC may be used for protein simulations to study kinetics and thermodynamics of miniprotein folding as well as protein structure and dynamics. Proteins 2016; 84:1490-1516. © 2016 The Authors Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.
Affiliation(s)
- Yuan-Ping Pang
- Computer-Aided Molecular Design Laboratory, Mayo Clinic, Rochester, MN, 55905, USA.
|
21
|
Kryshtafovych A, Monastyrskyy B, Fidelis K. CASP11 statistics and the prediction center evaluation system. Proteins 2016; 84 Suppl 1:15-9. [PMID: 26857434 PMCID: PMC5479680 DOI: 10.1002/prot.25005] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 11/19/2015] [Revised: 01/18/2016] [Accepted: 02/04/2016] [Indexed: 01/10/2023]
Abstract
We outline the role of the Protein Structure Prediction Center (predictioncenter.org) in conducting the CASP11 and CASP ROLL experiments, discuss the experiment statistics, and provide an overview of the present CASP infrastructure. The biggest changes compared to the previous CASPs are the implementation of the evaluation system incorporating practically all evaluation measures, statistical tests, and visualization tools historically used by the CASP assessors, the expansion of the infrastructure to incorporate new categories of contact-assisted and multimeric predictions, and the redesign of the assessors' web-workspace enabling assessments based on multiple measures for different group categories and target sets. Proteins 2016; 84(Suppl 1):15-19. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Andriy Kryshtafovych
- Protein Structure Prediction Center, Genome and Biomedical Sciences Facilities, University of California, Davis, California, 95616
| | - Bohdan Monastyrskyy
- Protein Structure Prediction Center, Genome and Biomedical Sciences Facilities, University of California, Davis, California, 95616
| | - Krzysztof Fidelis
- Protein Structure Prediction Center, Genome and Biomedical Sciences Facilities, University of California, Davis, California, 95616.
|
22
|
Modi V, Dunbrack RL. Assessment of refinement of template-based models in CASP11. Proteins 2016; 84 Suppl 1:260-81. [PMID: 27081793 DOI: 10.1002/prot.25048] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 12/24/2015] [Revised: 03/13/2016] [Accepted: 04/11/2016] [Indexed: 12/26/2022]
Abstract
CASP11 (the 11th Meeting on the Critical Assessment of Protein Structure Prediction) ran a blind experiment in the refinement of protein structure predictions, the fourth such experiment since CASP8. As with the previous experiments, the predictors were provided with one starting structure from the server models of each of a selected set of template-based modeling targets and asked to refine the coordinates of the starting structure toward native. We assessed the refined structures with the Z-scores of the standard CASP measures, which compare the model-target similarities of the models from all the predictors. Furthermore, we assessed the refined structures with "relative measures," which compare the improvement in accuracy of each model with respect to the starting structure. The latter provides an assessment of the extent to which each predictor group is able to improve the starting structures toward native. We utilized heat maps to display improvements in the Calpha-Calpha distance matrix for each model. The heat maps labeled with each element of secondary structure helped us to identify regions of refinement toward native in each model. Most positively scoring models show modest improvements in multiple regions of the structure, while in some models we were able to identify significant repositioning of N/C-terminal segments and internal elements of secondary structure. The best groups were able to improve more than 70% of the targets from the starting models, and by an average of 3-5% in the standard CASP measures. Proteins 2016; 84(Suppl 1):260-281. © 2016 Wiley Periodicals, Inc.
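The two kinds of measure described in the abstract can be sketched as follows: a Z-score places one model's score relative to all predictors' scores on the same target, while a relative measure compares a refined model with its own starting structure. This is a minimal illustration only. The use of the population standard deviation, the definition of the relative measure as a simple score difference, and the sample numbers are assumptions, not the assessors' actual procedure.

```python
from statistics import mean, pstdev


def z_scores(scores):
    """Standardize per-target scores across predictors (population std assumed)."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All predictors scored identically, so no model stands out.
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


def relative_improvement(refined_score, starting_score):
    """Hypothetical relative measure: gain of the refined model over its start."""
    return refined_score - starting_score


if __name__ == "__main__":
    # Made-up scores for three refined models of one target.
    refined = [61.0, 62.0, 63.0]
    starting = 60.0
    print(z_scores(refined))
    print([relative_improvement(s, starting) for s in refined])
```

A model can have a negative Z-score and still show a positive relative improvement, which is why the two views answer different questions.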
Affiliation(s)
- Vivek Modi
- Fox Chase Cancer Center, Philadelphia, Pennsylvania, 19111
|
23
|
Modi V, Xu Q, Adhikari S, Dunbrack RL. Assessment of template-based modeling of protein structure in CASP11. Proteins 2016; 84 Suppl 1:200-20. [PMID: 27081927 DOI: 10.1002/prot.25049] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 01/24/2016] [Revised: 04/04/2016] [Accepted: 04/11/2016] [Indexed: 12/27/2022]
Abstract
We present the assessment of predictions submitted in the template-based modeling (TBM) category of CASP11 (Critical Assessment of Protein Structure Prediction). Model quality was judged on the basis of global and local measures of accuracy on all atoms including side chains. The top groups on 39 human-server targets based on model 1 predictions were LEER, Zhang, LEE, MULTICOM, and Zhang-Server. The top groups on 81 targets by server groups based on model 1 predictions were Zhang-Server, nns, BAKER-ROSETTASERVER, QUARK, and myprotein-me. In CASP11, the best models for most targets were equal to or better than the best template available in the Protein Data Bank, even for targets with poor templates. The overall performance in CASP11 is similar to the performance of predictors in CASP10 with slightly better performance on the hardest targets. For most targets, assessment measures exhibited bimodal probability density distributions. Multi-dimensional scaling of an RMSD matrix for each target typically revealed a single cluster with models similar to the target structure, with a mode in the GDT-TS density between 40 and 90, and a wide distribution of models highly divergent from each other and from the experimental structure, with density mode at a GDT-TS value of ∼20. The models in this peak in the density were either compact models with entirely the wrong fold, or highly non-compact models. The results argue for a density-driven approach in future CASP TBM assessments that accounts for the bimodal nature of these distributions instead of Z scores, which assume a unimodal, Gaussian distribution. Proteins 2016; 84(Suppl 1):200-220. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Vivek Modi
- Fox Chase Cancer Center, Institute for Cancer Research, Philadelphia, Pennsylvania, 19111
| | - Qifang Xu
- Fox Chase Cancer Center, Institute for Cancer Research, Philadelphia, Pennsylvania, 19111
| | - Sam Adhikari
- Fox Chase Cancer Center, Institute for Cancer Research, Philadelphia, Pennsylvania, 19111
| | - Roland L Dunbrack
- Fox Chase Cancer Center, Institute for Cancer Research, Philadelphia, Pennsylvania, 19111.
|
24
|
Figueroa M, Sleutel M, Vandevenne M, Parvizi G, Attout S, Jacquin O, Vandenameele J, Fischer AW, Damblon C, Goormaghtigh E, Valerio-Lepiniec M, Urvoas A, Durand D, Pardon E, Steyaert J, Minard P, Maes D, Meiler J, Matagne A, Martial JA, Van de Weerdt C. The unexpected structure of the designed protein Octarellin V.1 forms a challenge for protein structure prediction tools. J Struct Biol 2016; 195:19-30. [PMID: 27181418 DOI: 10.1016/j.jsb.2016.05.004] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 02/15/2016] [Revised: 04/19/2016] [Accepted: 05/12/2016] [Indexed: 12/26/2022]
Abstract
Despite impressive successes in protein design, designing a well-folded protein of more than 100 amino acids de novo remains a formidable challenge. Exploiting the promising biophysical features of the artificial protein Octarellin V, we improved this protein by directed evolution, thus creating a more stable and soluble protein: Octarellin V.1. Next, we obtained crystals of Octarellin V.1 in complex with crystallization chaperones and determined the tertiary structure. The experimental structure of Octarellin V.1 differs from its in silico design: the (αβα) sandwich architecture bears some resemblance to a Rossmann-like fold instead of the intended TIM-barrel fold. This surprising result gave us a unique and attractive opportunity to test the state of the art in protein structure prediction, using this artificial protein free of any natural selection. We tested 13 automated webservers for protein structure prediction and found none of them to predict the actual structure. More than 50% of them predicted a TIM-barrel fold, i.e. the structure we set out to design more than 10 years ago. In addition, local software runs that are human operated can sample a structure similar to the experimental one but fail in selecting it, suggesting that the scoring and ranking functions should be improved. We propose that artificial proteins could be used as tools to test the accuracy of protein structure prediction algorithms, because of their lack of evolutionary pressure and their unique sequence features.
Affiliation(s)
- Maximiliano Figueroa
- GIGA-Research, Molecular Biomimetics and Protein Engineering, University of Liège, Liège, Belgium.
| | - Mike Sleutel
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium
| | - Marylene Vandevenne
- GIGA-Research, Molecular Biomimetics and Protein Engineering, University of Liège, Liège, Belgium
| | - Gregory Parvizi
- GIGA-Research, Molecular Biomimetics and Protein Engineering, University of Liège, Liège, Belgium
| | - Sophie Attout
- GIGA-Research, Molecular Biomimetics and Protein Engineering, University of Liège, Liège, Belgium
| | - Olivier Jacquin
- GIGA-Research, Molecular Biomimetics and Protein Engineering, University of Liège, Liège, Belgium
| | - Julie Vandenameele
- Laboratoire d'Enzymologie et Repliement des Protéines, Centre for Protein Engineering, University of Liège, Liège, Belgium
| | - Axel W Fischer
- Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
| | | | - Erik Goormaghtigh
- Laboratory for the Structure and Function of Biological Membranes, Center for Structural Biology and Bioinformatics, Université Libre de Bruxelles, Brussels, Belgium
| | - Marie Valerio-Lepiniec
- Institute for Integrative Biology of the Cell (I2BC), UMT 9198, CEA, CNRS, Université Paris-Sud, Orsay, France
| | - Agathe Urvoas
- Institute for Integrative Biology of the Cell (I2BC), UMT 9198, CEA, CNRS, Université Paris-Sud, Orsay, France
| | - Dominique Durand
- Institute for Integrative Biology of the Cell (I2BC), UMT 9198, CEA, CNRS, Université Paris-Sud, Orsay, France
| | - Els Pardon
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium; Structural Biology Research Center, VIB, Pleinlaan 2, 1050 Brussels, Belgium
| | - Jan Steyaert
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium; Structural Biology Research Center, VIB, Pleinlaan 2, 1050 Brussels, Belgium
| | - Philippe Minard
- Institute for Integrative Biology of the Cell (I2BC), UMT 9198, CEA, CNRS, Université Paris-Sud, Orsay, France
| | - Dominique Maes
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium
| | - Jens Meiler
- Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
| | - André Matagne
- Laboratoire d'Enzymologie et Repliement des Protéines, Centre for Protein Engineering, University of Liège, Liège, Belgium
| | - Joseph A Martial
- GIGA-Research, Molecular Biomimetics and Protein Engineering, University of Liège, Liège, Belgium
| | - Cécile Van de Weerdt
- GIGA-Research, Molecular Biomimetics and Protein Engineering, University of Liège, Liège, Belgium.
|
25
|
Virrueta A, O'Hern CS, Regan L. Understanding the physical basis for the side‐chain conformational preferences of methionine. Proteins 2016; 84:900-11. [DOI: 10.1002/prot.25026] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 11/12/2015] [Revised: 01/17/2016] [Accepted: 02/03/2016] [Indexed: 01/15/2023]
Affiliation(s)
- Alejandro Virrueta
- Department of Mechanical Engineering & Materials Science, Yale University, New Haven, Connecticut
- Integrated Graduate Program in Physical & Engineering Biology, Yale University, New Haven, Connecticut
| | - Corey S. O'Hern
- Department of Mechanical Engineering & Materials Science, Yale University, New Haven, Connecticut
- Integrated Graduate Program in Physical & Engineering Biology, Yale University, New Haven, Connecticut
- Department of Physics, Yale University, New Haven, Connecticut
- Department of Applied Physics, Yale University, New Haven, Connecticut
| | - Lynne Regan
- Integrated Graduate Program in Physical & Engineering Biology, Yale University, New Haven, Connecticut
- Department of Molecular Biophysics & Biochemistry, Yale University, New Haven, Connecticut
- Department of Chemistry, Yale University, New Haven, Connecticut
|
26
|
Yang J, Zhang W, He B, Walker SE, Zhang H, Govindarajoo B, Virtanen J, Xue Z, Shen HB, Zhang Y. Template-based protein structure prediction in CASP11 and retrospect of I-TASSER in the last decade. Proteins 2015; 84 Suppl 1:233-46. [PMID: 26343917 DOI: 10.1002/prot.24918] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.7] [Received: 05/27/2015] [Revised: 08/13/2015] [Accepted: 08/31/2015] [Indexed: 01/26/2023]
Abstract
We report the structure prediction results of a new composite pipeline for template-based modeling (TBM) in the 11th CASP experiment. Starting from multiple structure templates identified by LOMETS based meta-threading programs, the QUARK ab initio folding program is extended to generate initial full-length models under strong constraints from template alignments. The final atomic models are then constructed by I-TASSER based fragment reassembly simulations, followed by the fragment-guided molecular dynamic simulation and the MQAP-based model selection. It was found that the inclusion of QUARK-TBM simulations as an intermediate modeling step could help improve the quality of the I-TASSER models for both Easy and Hard TBM targets. Overall, the average TM-score of the first I-TASSER model is 12% higher than that of the best LOMETS templates, with the RMSD in the same threading-aligned regions reduced from 5.8 to 4.7 Å. Nevertheless, there are nearly 18% of TBM domains with the templates deteriorated by the structure assembly pipeline, which may be attributed to the errors of secondary structure and domain orientation predictions that propagate through and degrade the procedures of template identification and final model selections. To examine the record of progress, we made a retrospective report of the I-TASSER pipeline in the last five CASP experiments (CASP7-11). The data show no clear progress of the LOMETS threading programs over PSI-BLAST; but obvious progress on structural improvement relative to threading templates was witnessed in recent CASP experiments, which is probably attributed to the integration of the extended ab initio folding simulation with the threading assembly pipeline and the introduction of atomic-level structure refinements following the reduced modeling simulations. Proteins 2016; 84(Suppl 1):233-246. © 2015 Wiley Periodicals, Inc.
Affiliation(s)
- Jianyi Yang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Wenxuan Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Baoji He
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Sara Elizabeth Walker
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Hongjiu Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Brandon Govindarajoo
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Jouko Virtanen
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Zhidong Xue
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Hong-Bin Shen
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109.
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109.
|
27
|
Joo K, Joung I, Lee SY, Kim JY, Cheng Q, Manavalan B, Joung JY, Heo S, Lee J, Nam M, Lee IH, Lee SJ, Lee J. Template based protein structure modeling by global optimization in CASP11. Proteins 2015; 84 Suppl 1:221-32. [PMID: 26329522 DOI: 10.1002/prot.24917] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 05/15/2015] [Revised: 08/04/2015] [Accepted: 08/21/2015] [Indexed: 11/11/2022]
Abstract
For the template-based modeling (TBM) of CASP11 targets, we have developed three new protein modeling protocols (nns for server prediction and LEE and LEER for human prediction) by improving upon our previous CASP protocols (CASP7 through CASP10). We applied the powerful global optimization method of conformational space annealing to three stages of optimization, including multiple sequence-structure alignment, three-dimensional (3D) chain building, and side-chain remodeling. For more successful fold recognition, a new alignment method called CRFalign was developed. It can incorporate sensitive positional and environmental dependence in alignment scores as well as strong nonlinear correlations among various features. Modifications and adjustments were made to the form of the energy function and weight parameters pertaining to the chain building procedure. For the side-chain remodeling step, residue-type dependence was introduced to the cutoff value that determines the entry of a rotamer to the side-chain modeling library. The improved performance of the nns server method is attributed to successful fold recognition achieved by combining several methods including CRFalign and to the current modeling formulation that can incorporate native-like structural aspects present in multiple templates. The LEE protocol is identical to the nns one except that CASP11-released server models are used as templates. The success of LEE in utilizing CASP11 server models indicates that proper template screening and template clustering, assisted by appropriate cluster ranking, promise a new direction to enhance protein 3D modeling. Proteins 2016; 84(Suppl 1):221-232. © 2015 Wiley Periodicals, Inc.
Affiliation(s)
- Keehyoung Joo
- Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea.,Center for Advanced Computation, Korea Institute for Advanced Study, Seoul, 130-722, Korea
| | - InSuk Joung
- Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea.,School of Computational Sciences, Korea Institute for Advanced Study, Seoul, 130-722, Korea
| | - Sun Young Lee
- Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea
| | - Jong Yun Kim
- Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea
| | - Qianyi Cheng
- Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea.,School of Computational Sciences, Korea Institute for Advanced Study, Seoul, 130-722, Korea
| | - Balachandran Manavalan
- Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea.,School of Computational Sciences, Korea Institute for Advanced Study, Seoul, 130-722, Korea
| | - Jong Young Joung
- School of Computational Sciences, Korea Institute for Advanced Study, Seoul, 130-722, Korea
| | - Seungryong Heo
- Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea
| | - Juyong Lee
- Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, 20852
| | - Mikyung Nam
- Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea
| | - In-Ho Lee
- Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea.,Korea Research Institute of Standards and Science (KRISS), Seoul, 305-600, Korea
| | - Sung Jong Lee
- Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea.,Department of Physics, University of Suwon, Hwaseong-Si, Gyeonggi-Do, 445-743, Korea
| | - Jooyoung Lee
- Center for in Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea.,Center for Advanced Computation, Korea Institute for Advanced Study, Seoul, 130-722, Korea.,School of Computational Sciences, Korea Institute for Advanced Study, Seoul, 130-722, Korea
|
28
|
Aibara S, Valkov E, Lamers MH, Dimitrova L, Hurt E, Stewart M. Structural characterization of the principal mRNA-export factor Mex67-Mtr2 from Chaetomium thermophilum. Acta Crystallogr F Struct Biol Commun 2015; 71:876-88. [PMID: 26144233 PMCID: PMC4498709 DOI: 10.1107/s2053230x15008766] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 01/19/2015] [Accepted: 05/05/2015] [Indexed: 12/11/2022]
Abstract
Members of the Mex67-Mtr2/NXF-NXT1 family are the principal mediators of the nuclear export of mRNA. Mex67/NXF1 has a modular structure based on four domains (RRM, LRR, NTF2-like and UBA) that are thought to be present across species, although the level of sequence conservation between organisms, especially in lower eukaryotes, is low. Here, the crystal structures of these domains from the thermophilic fungus Chaetomium thermophilum are presented together with small-angle X-ray scattering (SAXS) and in vitro RNA-binding data that indicate that, notwithstanding the limited sequence conservation between different NXF family members, the molecules retain similar structural and RNA-binding properties. Moreover, the resolution of crystal structures obtained with the C. thermophilum domains was often higher than that obtained previously and, when combined with solution and biochemical studies, provided insight into the structural organization, self-association and RNA-binding properties of Mex67-Mtr2 that facilitate mRNA nuclear export.
Affiliation(s)
- Shintaro Aibara
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, England
| | - Eugene Valkov
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, England
| | - Meindert H. Lamers
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, England
| | - Lyudmila Dimitrova
- Biochemie-Zentrum der Universität Heidelberg, Im Neuenheimer Feld 328, 69120 Heidelberg, Germany
| | - Ed Hurt
- Biochemie-Zentrum der Universität Heidelberg, Im Neuenheimer Feld 328, 69120 Heidelberg, Germany
| | - Murray Stewart
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, England
|
29
|
Huang YJ, Mao B, Aramini JM, Montelione GT. Assessment of template-based protein structure predictions in CASP10. Proteins 2014; 82 Suppl 2:43-56. [PMID: 24323734 DOI: 10.1002/prot.24488] [Citation(s) in RCA: 82] [Impact Index Per Article: 8.2] [Received: 07/30/2013] [Revised: 11/10/2013] [Accepted: 11/19/2013] [Indexed: 12/27/2022]
Abstract
Template-based modeling (TBM) is a major component of the critical assessment of protein structure prediction (CASP). In CASP10, some 41,740 predicted models submitted by 150 predictor groups were assessed as TBM predictions. The accuracy of protein structure prediction was assessed by geometric comparison with experimental X-ray crystal and NMR structures using a composite score that included both global alignment metrics and distance-matrix-based metrics. These included GDT-HA and GDC-all global alignment scores, and the superimposition-independent LDDT distance-matrix-based score. In addition, a superimposition-independent RPF metric, similar to that described previously for comparing protein models against experimental NMR data, was used for comparing predicted protein structure models against experimental protein structures. To score well on all four of these metrics, models must feature accurate predictions of both backbone and side-chain conformations. Performance rankings were determined independently for server and the combined server plus human-curated predictor groups. Final rankings were made using paired head-to-head Student's t-test analysis of raw metric scores among the top 25 performing groups in each category.
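As a rough illustration of the global alignment scores mentioned in this abstract, the sketch below computes a GDT-style score from per-residue deviations. It assumes the usual convention that GDT-HA averages the percentage of residues within 0.5, 1, 2 and 4 Å and that GDT-TS uses 1, 2, 4 and 8 Å; these cutoffs are not stated in the abstract. The function name and input format are hypothetical, and the sketch evaluates a single given superposition instead of searching for the best superposition per cutoff.

```python
GDT_HA_CUTOFFS = (0.5, 1.0, 2.0, 4.0)  # assumed high-accuracy cutoffs, angstroms
GDT_TS_CUTOFFS = (1.0, 2.0, 4.0, 8.0)  # assumed total-score cutoffs, angstroms


def gdt_sketch(distances, cutoffs=GDT_HA_CUTOFFS):
    """Average percentage of residues within each distance cutoff.

    distances: per-residue C-alpha deviations (angstroms) between model
    and reference after one fixed superposition.
    """
    if not distances:
        raise ValueError("distances must not be empty")
    n = len(distances)
    fractions = [sum(1 for d in distances if d <= c) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)
```

For the same deviations, the tighter GDT-HA cutoffs can only give a score equal to or lower than GDT-TS, which is why GDT-HA is the more demanding of the two.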
Affiliation(s)
- Yuanpeng J Huang
- Center for Advanced Biotechnology and Medicine and Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, New Jersey, 08854; Department of Biochemistry and Molecular Biology, Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway, New Jersey, 08854; Northeast Structural Genomics Consortium, Rutgers, The State University of New Jersey, Piscataway, New Jersey, 08854
|
30
|
Zhou AQ, Caballero D, O'Hern CS, Regan L. New insights into the interdependence between amino acid stereochemistry and protein structure. Biophys J 2014; 105:2403-11. [PMID: 24268152 DOI: 10.1016/j.bpj.2013.09.018] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 06/10/2013] [Revised: 07/30/2013] [Accepted: 09/16/2013] [Indexed: 12/29/2022]
Abstract
To successfully design new proteins and understand the effects of mutations in natural proteins, we must understand the geometric and physicochemical principles underlying protein structure. The side chains of amino acids in peptides and proteins adopt specific dihedral angle combinations; however, we still do not have a fundamental quantitative understanding of why some side-chain dihedral angle combinations are highly populated and others are not. Here we employ a hard-sphere plus stereochemical constraint model of dipeptide mimetics to enumerate the side-chain dihedral angles of leucine (Leu) and isoleucine (Ile), and identify those conformations that are sterically allowed versus those that are not as a function of the backbone dihedral angles ϕ and ψ. We compare our results with the observed distributions of side-chain dihedral angles in proteins of known structure. With the hard-sphere plus stereochemical constraint model, we obtain agreement between the model predictions and the observed side-chain dihedral angle distributions for Leu and Ile. These results quantify the extent to which local, geometrical constraints determine protein side-chain conformations.
Affiliation(s)
- Alice Qinhua Zhou
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut; Integrated Graduate Program in Physical and Engineering Biology, Yale University, New Haven, Connecticut
|
31
|
Nugent T, Cozzetto D, Jones DT. Evaluation of predictions in the CASP10 model refinement category. Proteins 2014; 82 Suppl 2:98-111. [PMID: 23900810 PMCID: PMC4282348 DOI: 10.1002/prot.24377] [Citation(s) in RCA: 86] [Impact Index Per Article: 8.6] [Received: 04/10/2013] [Revised: 06/19/2013] [Accepted: 06/28/2013] [Indexed: 12/24/2022]
Abstract
Here we report on the assessment results of the third experiment to evaluate the state of the art in protein model refinement, where participants were invited to improve the accuracy of initial protein models for 27 targets. Using an array of complementary evaluation measures, we find that five groups performed better than the naïve (null) method—a marked improvement over CASP9, although only three were significantly better. The leading groups also demonstrated the ability to consistently improve both backbone and side chain positioning, while other groups reliably enhanced other aspects of protein physicality. The top-ranked group succeeded in improving the backbone conformation in almost 90% of targets, suggesting a strategy that for the first time in CASP refinement is successful in a clear majority of cases. A number of issues remain unsolved: the majority of groups still fail to improve the quality of the starting models; even successful groups are only able to make modest improvements; and no prediction is more similar to the native structure than to the starting model. Successful refinement attempts also often go unrecognized, as suggested by the relatively larger improvements when predictions not submitted as model 1 are also considered. Proteins 2014; 82(Suppl 2):98–111.
Affiliation(s)
- Timothy Nugent
- Department of Computer Science Bioinformatics Group, University College London, London, WC1E 6BT, United Kingdom
|
32
|
Kryshtafovych A, Monastyrskyy B, Fidelis K. CASP prediction center infrastructure and evaluation measures in CASP10 and CASP ROLL. Proteins 2013; 82 Suppl 2:7-13. [PMID: 24038551 DOI: 10.1002/prot.24399] [Citation(s) in RCA: 74] [Impact Index Per Article: 6.7] [Received: 06/15/2013] [Revised: 08/08/2013] [Accepted: 08/14/2013] [Indexed: 12/27/2022]
Abstract
The Protein Structure Prediction Center at the University of California, Davis, supports the CASP experiments by identifying prediction targets, accepting predictions, performing standard evaluations, assisting independent CASP assessors, presenting and archiving results, and facilitating information exchange relating to CASP and structure prediction in general. We provide an overview of the CASP infrastructure implemented at the Center, and summarize standard measures used for evaluating predictions in the latest round of CASP. Several components were introduced or significantly redesigned for CASP10, in particular an improved assessors' common web-workspace; a Sphere Grinder visualization tool for analyzing local accuracy of predictions; brand new blocks for evaluating contact prediction and contact-assisted structure prediction; and expanded evaluation and visualization tools for tertiary structure, refinement and quality assessment. Technical aspects of conducting the CASP10 and CASP ROLL experiments and relevant statistics are also provided.
|
33
|
Taylor TJ, Bai H, Tai CH, Lee B. Assessment of CASP10 contact-assisted predictions. Proteins 2013; 82 Suppl 2:84-97. [PMID: 23873510 DOI: 10.1002/prot.24367] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 05/29/2013] [Accepted: 07/09/2013] [Indexed: 11/08/2022]
Abstract
In CASP10, for the first time, contact-assisted structure predictions have been assessed. Sets of pairs of contacting residues from target structures were provided to predictors for a second round of prediction after the initial round in which they were given only sequences. The objective of the experiment was to measure model quality improvement resulting from the added contact information and thereby assess and help develop so-called hybrid prediction methods--methods where some experimentally determined distance constraints are used to augment de novo computational prediction methods. The results of the experiment were, overall, quite promising.
Affiliation(s)
- Todd J Taylor
- Laboratory of Molecular Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland
|
34
|
Song Y, DiMaio F, Wang RYR, Kim D, Miles C, Brunette T, Thompson J, Baker D. High-resolution comparative modeling with RosettaCM. Structure 2013; 21:1735-42. [PMID: 24035711 DOI: 10.1016/j.str.2013.08.005] [Citation(s) in RCA: 808] [Impact Index Per Article: 73.5] [Received: 03/15/2013] [Revised: 07/28/2013] [Accepted: 08/02/2013] [Indexed: 10/26/2022]
Abstract
We describe an improved method for comparative modeling, RosettaCM, which optimizes a physically realistic all-atom energy function over the conformational space defined by homologous structures. Given a set of sequence alignments, RosettaCM assembles topologies by recombining aligned segments in Cartesian space and building unaligned regions de novo in torsion space. The junctions between segments are regularized using a loop closure method combining fragment superposition with gradient-based minimization. The energies of the resulting models are optimized by all-atom refinement, and the most representative low-energy model is selected. The CASP10 experiment suggests that RosettaCM yields models with more accurate side-chain and backbone conformations than other methods when the sequence identity to the templates is greater than ∼15%.
Affiliation(s)
- Yifan Song
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
|
35
|
Mariani V, Biasini M, Barbato A, Schwede T. lDDT: a local superposition-free score for comparing protein structures and models using distance difference tests. Bioinformatics 2013; 29:2722-8. [PMID: 23986568 PMCID: PMC3799472 DOI: 10.1093/bioinformatics/btt473] [Citation(s) in RCA: 500] [Impact Index Per Article: 45.5] [Indexed: 01/27/2023]
Abstract
Motivation: The assessment of protein structure prediction techniques requires objective criteria to measure the similarity between a computational model and the experimentally determined reference structure. Conventional similarity measures based on a global superposition of carbon α atoms are strongly influenced by domain motions and do not assess the accuracy of local atomic details in the model. Results: The Local Distance Difference Test (lDDT) is a superposition-free score that evaluates local distance differences of all atoms in a model, including validation of stereochemical plausibility. The reference can be a single structure, or an ensemble of equivalent structures. We demonstrate that lDDT is well suited to assess local model quality, even in the presence of domain movements, while maintaining good correlation with global measures. These properties make lDDT a robust tool for the automated assessment of structure prediction servers without manual intervention. Availability and implementation: Source code, binaries for Linux and MacOSX, and an interactive web server are available at http://swissmodel.expasy.org/lddt Contact: torsten.schwede@unibas.ch Supplementary information: Supplementary data are available at Bioinformatics online.
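The superposition-free idea behind lDDT can be sketched in a few lines. This is a simplified illustration, not the published implementation: it uses one point per residue instead of all atoms, omits the stereochemical checks, and assumes the commonly quoted 15 Å inclusion radius and 0.5, 1, 2 and 4 Å tolerance thresholds, none of which appear in the abstract. The function name and input format are hypothetical.

```python
import math
from itertools import combinations


def lddt_sketch(ref, model, radius=15.0, thresholds=(0.5, 1.0, 2.0, 4.0)):
    """Fraction of local reference distances preserved in the model.

    ref, model: equal-length lists of (x, y, z) tuples, one point per
    residue in the same order. No superposition is performed, because
    only within-structure distances are compared.
    """
    if len(ref) != len(model):
        raise ValueError("ref and model must have the same length")
    # Only pairs that are close together in the reference are scored.
    pairs = [(i, j) for i, j in combinations(range(len(ref)), 2)
             if math.dist(ref[i], ref[j]) < radius]
    if not pairs:
        return 0.0
    total = 0.0
    for tol in thresholds:
        kept = sum(
            1 for i, j in pairs
            if abs(math.dist(ref[i], ref[j]) - math.dist(model[i], model[j])) <= tol
        )
        total += kept / len(pairs)
    return total / len(thresholds)
```

Because only internal distances enter the score, translating or rotating the whole model leaves the result unchanged, which is the property that makes the measure robust to domain movements.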
Affiliation(s)
- Valerio Mariani
- Biozentrum, Universität Basel, Klingelbergstrasse 50-70 and Computational Structural Biology, SIB Swiss Institute of Bioinformatics, 4056 Basel, Switzerland
|
36
|
Bhattacharya D, Cheng J. i3Drefine software for protein 3D structure refinement and its assessment in CASP10. PLoS One 2013; 8:e69648. [PMID: 23894517 PMCID: PMC3716612 DOI: 10.1371/journal.pone.0069648] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Received: 04/25/2013] [Accepted: 06/13/2013] [Indexed: 12/25/2022]
Abstract
Protein structure refinement refers to the process of improving the quality of protein structures during structure modeling to bring them closer to their native states. Structure refinement has been drawing increasing attention in the community-wide Critical Assessment of techniques for Protein Structure prediction (CASP) experiments since its addition in the 8th CASP experiment. During the 9th and recently concluded 10th CASP experiments, a consistent growth in the number of refinement targets and participating groups has been witnessed. Yet, protein structure refinement still remains a largely unsolved problem, with the majority of participating groups in the CASP refinement category failing to consistently improve the quality of structures issued for refinement. To address this need, we developed a completely automated and computationally efficient protein 3D structure refinement method, i3Drefine, based on an iterative and highly convergent energy minimization algorithm with powerful all-atom composite physics and knowledge-based force fields and a hydrogen bonding (HB) network optimization technique. In the recent community-wide blind experiment, CASP10, i3Drefine (as ‘MULTICOM-CONSTRUCT’) was ranked as the best method in the server section according to the official assessment of the CASP10 experiment. Here we provide the community with free access to the i3Drefine software and systematically analyse the performance of i3Drefine in strict blind mode on the refinement targets issued in the CASP10 refinement category and compare it with other state-of-the-art refinement methods participating in CASP10. Our analysis demonstrates that i3Drefine is the only fully automated server participating in CASP10 that exhibits consistent improvement over the initial structures in both global and local structural quality metrics. An executable version of i3Drefine is freely available at http://protein.rnet.missouri.edu/i3drefine/.
Affiliation(s)
- Debswapna Bhattacharya
- Department of Computer Science, University of Missouri, Columbia, Missouri, United States of America
| | - Jianlin Cheng
- Department of Computer Science, Informatics Institute, Bond Life Science Center, University of Missouri, Columbia, Missouri, United States of America
|
37
|
Heo L, Park H, Seok C. GalaxyRefine: Protein structure refinement driven by side-chain repacking. Nucleic Acids Res 2013; 41:W384-8. [PMID: 23737448 PMCID: PMC3692086 DOI: 10.1093/nar/gkt458] [Citation(s) in RCA: 663] [Impact Index Per Article: 60.3] [Indexed: 11/18/2022]
Abstract
The quality of model structures generated by contemporary protein structure prediction methods strongly depends on the degree of similarity between the target and available template structures. Therefore, the importance of improving template-based model structures beyond the accuracy available from template information has been emphasized in the structure prediction community. The GalaxyRefine web server, freely available at http://galaxy.seoklab.org/refine, is based on a refinement method that has been successfully tested in CASP10. The method first rebuilds side chains and performs side-chain repacking and subsequent overall structure relaxation by molecular dynamics simulation. According to the CASP10 assessment, this method showed the best performance in improving the local structure quality. The method can improve both global and local structure quality on average, when used for refining the models generated by state-of-the-art protein structure prediction servers.
Affiliation(s)
- Lim Heo
- Department of Chemistry, Seoul National University, Seoul 151-747, Korea
|
38
|
Lukasiak P, Antczak M, Ratajczak T, Bujnicki JM, Szachniuk M, Adamiak RW, Popenda M, Blazewicz J. RNAlyzer--novel approach for quality analysis of RNA structural models. Nucleic Acids Res 2013; 41:5978-90. [PMID: 23620294 PMCID: PMC3695499 DOI: 10.1093/nar/gkt318] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 11/13/2022]
Abstract
The continuously increasing amount of RNA sequence and experimentally determined 3D structure data drives the development of computational methods supporting exploration of these data. Contemporary functional analysis of RNA molecules, such as ribozymes or riboswitches, covers various issues, among which tertiary structure modeling becomes more and more important. A growing number of tools to model and predict RNA structure calls for an evaluation of these tools and the quality of the outcomes they produce. Thus, the development of reliable methods designed to meet this need is relevant in the context of RNA tertiary structure analysis and can highly influence the quality and usefulness of RNA tertiary structure prediction in the near future. Here, we present RNAlyzer—a computational method for comparison of RNA 3D models with the reference structure and for discrimination between the correct and incorrect models. Our approach is based on the idea of local neighborhood, defined as a set of atoms included in the sphere centered around a user-defined atom. A unique feature of the RNAlyzer is the simultaneous visualization of the model-reference structure distance at different levels of detail, from the individual residues to the entire molecules.
Affiliation(s)
- Piotr Lukasiak
- Institute of Computing Science, Poznan University of Technology, 60-965 Poznan, Poland.
|
39
|
Day R, Joo H, Chavan AC, Lennox KP, Chen YA, Dahl DB, Vannucci M, Tsai JW. Understanding the general packing rearrangements required for successful template based modeling of protein structure from a CASP experiment. Comput Biol Chem 2013; 42:40-8. [DOI: 10.1016/j.compbiolchem.2012.10.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2012] [Revised: 10/30/2012] [Accepted: 10/31/2012] [Indexed: 11/16/2022]
|
40
|
Bhattacharya D, Cheng J. 3Drefine: consistent protein structure refinement by optimizing hydrogen bonding network and atomic-level energy minimization. Proteins 2013; 81:119-31. [PMID: 22927229 PMCID: PMC3634918 DOI: 10.1002/prot.24167] [Citation(s) in RCA: 130] [Impact Index Per Article: 11.8] [Received: 05/02/2012] [Revised: 07/26/2012] [Accepted: 08/17/2012] [Indexed: 12/27/2022]
Abstract
One of the major limitations of computational protein structure prediction is the deviation of predicted models from their experimentally derived true, native structures. The limitations often hinder the possibility of applying computational protein structure prediction methods in biochemical assignment and drug design that are very sensitive to structural details. Refinement of these low-resolution predicted models to high-resolution structures close to the native state, however, has proven to be extremely challenging. Thus, protein structure refinement remains a largely unsolved problem. Critical assessment of techniques for protein structure prediction (CASP) specifically indicated that most predictors participating in the refinement category still did not consistently improve model quality. Here, we propose a two-step refinement protocol, called 3Drefine, to consistently bring the initial model closer to the native structure. The first step is based on optimization of the hydrogen bonding (HB) network and the second step applies atomic-level energy minimization on the optimized model using composite physics and knowledge-based force fields. The approach has been evaluated on the CASP benchmark data and it exhibits consistent improvement over the initial structure in both global and local structural quality measures. The 3Drefine method is also computationally inexpensive, consuming only a few minutes of CPU time to refine a protein of typical length (300 residues). The 3Drefine web server is freely available at http://sysbio.rnet.missouri.edu/3Drefine/.
Affiliation(s)
| | - Jianlin Cheng
- Department of Computer Science, University of Missouri, Columbia, MO 65211, USA
- Informatics Institute, University of Missouri, Columbia, MO 65211, USA
- Bond Life Science Center, University of Missouri, Columbia, MO 65211, USA
|
41
|
Crystal structure and computational modeling of the fab fragment from a protective anti-ricin monoclonal antibody. PLoS One 2012; 7:e52613. [PMID: 23285112 PMCID: PMC3526572 DOI: 10.1371/journal.pone.0052613] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 07/06/2012] [Accepted: 11/20/2012] [Indexed: 12/04/2022]
Abstract
Background: Many antibody crystal structures have been solved. Structural modeling programs have been developed that utilize this information to predict 3-D structures of an antibody based upon its sequence. Because of the problem of self-reference, the accuracy and utility of these predictions can only be tested when a new structure has not yet been deposited in the Protein Data Bank. Methods: We have solved the crystal structure of the Fab fragment of RAC18, a protective anti-ricin mAb, to 1.9 Å resolution. We have also modeled the Fv structure of RAC18 using the publicly available Ab modeling tools Prediction of Immunoglobulin Structures (PIGS), RosettaAntibody, and Web Antibody Modeling (WAM). The model structures underwent energy minimization. We compared results to the crystal structure on the basis of root-mean-square deviation (RMSD), template modeling score (TM-score), Z-score, and MolProbity analysis. Findings: The crystal structure showed a pocket formed mainly by AA residues in each of the heavy chain complementarity determining regions (CDRs). There were differences between the crystal structure and structures predicted by the modeling tools, particularly in the CDRs. There were also differences among the predicted models, although the differences were small and within experimental error. No one modeling program was clearly superior to the others. In some cases, choosing structures based only on sequence homology to the crystallized Ab yielded RMSDs comparable to the models. Conclusions: Molecular modeling programs accurately predict the structure of most regions of antibody variable domains of RAC18. The hypervariable CDRs proved most difficult to model, particularly H chain CDR3. Because CDR3 is most often involved in contact with antigen, this defect must be considered when using models to identify potential contacts between antibody and antigen. Because this study represents only a single case, the results cannot be generalized. Rather, they highlight the utility and limitations of modeling programs.
|
42
|
Olechnovič K, Kulberkytė E, Venclovas C. CAD-score: a new contact area difference-based function for evaluation of protein structural models. Proteins 2012; 81:149-62. [PMID: 22933340 DOI: 10.1002/prot.24172] [Citation(s) in RCA: 97] [Impact Index Per Article: 8.1] [Received: 04/05/2012] [Revised: 08/09/2012] [Accepted: 08/25/2012] [Indexed: 12/17/2022]
Abstract
Evaluation of protein models against the native structure is essential for the development and benchmarking of protein structure prediction methods. Although a number of evaluation scores have been proposed to date, many aspects of model assessment still lack desired robustness. In this study we present CAD-score, a new evaluation function quantifying differences between physical contacts in a model and the reference structure. The new score uses the concept of residue-residue contact area difference (CAD) introduced by Abagyan and Totrov (J Mol Biol 1997; 268:678-685). Contact areas, the underlying basis of the score, are derived using the Voronoi tessellation of protein structure. The newly introduced CAD-score is a continuous function, confined within fixed limits, free of any arbitrary thresholds or parameters. The built-in logic for treatment of missing residues allows consistent ranking of models of any degree of completeness. We tested CAD-score on a large set of diverse models and compared it to GDT-TS, a widely accepted measure of model accuracy. Similarly to GDT-TS, CAD-score showed a robust performance on single-domain proteins, but displayed a stronger preference for physically more realistic models. Unlike GDT-TS, the new score revealed a balanced assessment of domain rearrangement, removing the necessity for different treatment of single-domain, multi-domain, and multi-subunit structures. Moreover, CAD-score makes it possible to assess the accuracy of inter-domain or inter-subunit interfaces directly. In addition, the approach offers an alternative to the superposition-based model clustering. The CAD-score implementation is available both as a web server and a standalone software package at http://www.ibt.lt/bioinformatics/cad-score/.
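As an illustration of the kind of score the abstract describes, the following is a minimal sketch, not the authors' implementation. It assumes residue-residue contact areas have already been computed for both structures (the Voronoi tessellation step is not shown), and it assumes that each per-contact difference is capped at the reference contact area so that the result stays between 0 and 1. The function name `cad_score` and the dictionary layout are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical representation: (residue i, residue j) -> contact area in Å^2.
Contact = Tuple[int, int]


def cad_score(ref_areas: Dict[Contact, float],
              model_areas: Dict[Contact, float]) -> float:
    """Return 1 - (bounded contact-area difference / total reference area).

    Assumption: only contacts present in the reference are scored, and each
    difference is capped at the reference area, which keeps the score in [0, 1].
    """
    total_ref = sum(ref_areas.values())
    if total_ref <= 0:
        raise ValueError("reference structure has no contact area")

    bounded_diff = 0.0
    for pair, ref_area in ref_areas.items():
        # A contact missing from the model is treated as area 0.
        model_area = model_areas.get(pair, 0.0)
        bounded_diff += min(abs(ref_area - model_area), ref_area)

    return 1.0 - bounded_diff / total_ref
```

Under these assumptions, a model that reproduces every reference contact area exactly scores 1, and a model with none of the reference contacts scores 0.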
Affiliation(s)
- Kliment Olechnovič
- Institute of Biotechnology, Vilnius University, Graičiūno 8, LT-02241 Vilnius, Lithuania
|
43
|
Ko J, Park H, Heo L, Seok C. GalaxyWEB server for protein structure prediction and refinement. Nucleic Acids Res 2012; 40:W294-7. [PMID: 22649060 PMCID: PMC3394311 DOI: 10.1093/nar/gks493] [Citation(s) in RCA: 522] [Impact Index Per Article: 43.5] [Indexed: 12/02/2022] Open
Abstract
Three-dimensional protein structures provide invaluable information for understanding and regulating biological functions of proteins. The GalaxyWEB server predicts protein structure from sequence by template-based modeling and refines loop or terminus regions by ab initio modeling. This web server is based on the method tested in CASP9 (9th Critical Assessment of techniques for protein Structure Prediction) as ‘Seok-server’, which was assessed to be among the top-performing template-based modeling servers. The method generates reliable core structures from multiple templates and rebuilds unreliable loops or termini by using an optimization-based refinement method. In addition to structure prediction, a user can also submit a refinement-only job by providing a starting model structure and the locations of loops or termini to refine. The web server can be freely accessed at http://galaxy.seoklab.org/.
Affiliation(s)
- Junsu Ko
- Department of Chemistry, Seoul National University, Seoul 151-747, Korea
|
44
|
Atomic-level protein structure refinement using fragment-guided molecular dynamics conformation sampling. Structure 2012; 19:1784-95. [PMID: 22153501 DOI: 10.1016/j.str.2011.09.022] [Citation(s) in RCA: 248] [Impact Index Per Article: 20.7] [Received: 07/16/2011] [Revised: 09/19/2011] [Accepted: 09/24/2011] [Indexed: 11/22/2022]
Abstract
One of the critical difficulties of molecular dynamics (MD) simulations in protein structure refinement is that the physics-based energy landscape lacks a middle-range funnel to guide nonnative conformations toward near-native states. We propose to use the target model as a probe to identify fragmental analogs from PDB. The distance maps are then used to reshape the MD energy funnel. The protocol was tested on 181 benchmarking and 26 CASP targets. It was found that structure models of correct folds with TM-score >0.5 can often be pulled closer to native with higher GDT-HA score, but improvements for the models of incorrect folds (TM-score <0.5) are much less pronounced. These data indicate that template-based fragmental distance maps essentially reshaped the MD energy landscape from golf-course-like to funnel-like in the successfully refined targets with a radius of TM-score ∼0.5. These results demonstrate a new avenue to improve high-resolution structures by combining knowledge-based template information with physics-based MD simulations.
|
45
|
Cadag E, Vitalis E, Lennox KP, Zhou CLE, Zemla AT. Computational analysis of pathogen-borne metallo β-lactamases reveals discriminating structural features between B1 types. BMC Res Notes 2012; 5:96. [PMID: 22333139 PMCID: PMC3293060 DOI: 10.1186/1756-0500-5-96] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 09/21/2011] [Accepted: 02/14/2012] [Indexed: 01/25/2023] Open
Abstract
Background: Genes conferring antibiotic resistance to groups of bacterial pathogens are cause for considerable concern, as many once-reliable antibiotics continue to see a reduction in efficacy. The recent discovery of the metallo β-lactamase blaNDM-1 gene, which appears to grant antibiotic resistance to a variety of Enterobacteriaceae via a mobile plasmid, is one example of this distressing trend. The following work describes a computational analysis of pathogen-borne MBLs that focuses on the structural aspects of characterized proteins. Results: Using both sequence and structural analyses, we examine residues and structural features specific to various pathogen-borne MBL types. This analysis identifies a linker region within MBL-like folds that may act as a discriminating structural feature between these proteins, and specifically resistance-associated acquirable MBLs. Recently released crystal structures of the newly emerged NDM-1 protein were aligned against related MBL structures using a variety of global and local structural alignment methods, and the overall fold conformation is examined for structural conservation. Conservation appears to be present in most areas of the protein, yet is strikingly absent within a linker region, making NDM-1 unique with respect to a linker-based classification scheme. Variability analysis of the NDM-1 crystal structure highlights unique residues in key regions as well as identifying several characteristics shared with other transferable MBLs. Conclusions: A discriminating linker region identified in MBL proteins is highlighted and examined in the context of NDM-1 and primarily three other MBL types: IMP-1, VIM-2 and ccrA. The presence of an unusual linker region variant and uncommon amino acid composition at specific structurally important sites may help to explain the unusually broad kinetic profile of NDM-1 and may aid in directing research attention to areas of this protein, and possibly other MBLs, that may be targeted for inactivation or attenuation of enzymatic activity.
Affiliation(s)
- Eithon Cadag
- Global Security Computing Applications Division, Lawrence Livermore National Laboratory, Livermore, 94550 CA, USA.
|
46
|
Xu D, Zhang Y. Improving the physical realism and structural accuracy of protein models by a two-step atomic-level energy minimization. Biophys J 2011; 101:2525-34. [PMID: 22098752 DOI: 10.1016/j.bpj.2011.10.024] [Citation(s) in RCA: 700] [Impact Index Per Article: 53.8] [Received: 05/12/2011] [Revised: 09/20/2011] [Accepted: 10/21/2011] [Indexed: 11/15/2022] Open
Abstract
Most protein structural prediction algorithms assemble structures as reduced models that represent amino acids by a reduced number of atoms to speed up the conformational search. Building accurate full-atom models from these reduced models is a necessary step toward a detailed function analysis. However, it is difficult to ensure that the atomic models retain the desired global topology while maintaining a sound local atomic geometry because the reduced models often have unphysical local distortions. To address this issue, we developed a new program, called ModRefiner, to construct and refine protein structures from Cα traces based on a two-step, atomic-level energy minimization. The main-chain structures are first constructed from initial Cα traces and the side-chain rotamers are then refined together with the backbone atoms with the use of a composite physics- and knowledge-based force field. We tested the method by performing an atomic structure refinement of 261 proteins with the initial models constructed from both ab initio and template-based structure assemblies. Compared with other state-of-the-art programs, ModRefiner shows improvements in both global and local structures, which have more accurate side-chain positions, better hydrogen-bonding networks, and fewer atomic overlaps. ModRefiner is freely available at http://zhanglab.ccmb.med.umich.edu/ModRefiner.
Affiliation(s)
- Dong Xu
- Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
|
47
|
Mariani V, Kiefer F, Schmidt T, Haas J, Schwede T. Assessment of template based protein structure predictions in CASP9. Proteins 2011; 79 Suppl 10:37-58. [PMID: 22002823 DOI: 10.1002/prot.23177] [Citation(s) in RCA: 132] [Impact Index Per Article: 10.2] [Received: 06/24/2011] [Revised: 09/01/2011] [Accepted: 09/04/2011] [Indexed: 12/29/2022]
Abstract
In the Ninth Edition of the Critical Assessment of Techniques for Protein Structure Prediction (CASP9), 61,665 models submitted by 176 groups were assessed for their accuracy in the template based modeling category. The models were evaluated numerically in comparison to their experimental control structures using two global measures (GDT and GDC), and a novel local score evaluating the correct modeling of local interactions (lDDT). Overall, the state of the art of template based modeling in CASP9 is high, with many groups performing well. Among the methods registered as prediction "servers", six independent groups are performing on average better than the rest. The submissions by "human" groups are dominated by meta-predictors, with one group performing noticeably better than the others. Most of the participating groups failed to assign realistic confidence estimates to their predictions, and only a very small fraction of the assessed methods have provided highly accurate models and realistic error estimates at the same time. Also, the accuracy of predictions for homo-oligomeric assemblies was overall poor, and only one group performed better than a naïve control predictor. Here, we present the results of our assessment of the CASP9 predictions in the category of template based modeling, documenting the state of the art and highlighting areas for future developments.
Affiliation(s)
- Valerio Mariani
- Biozentrum University of Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Basel, Switzerland
|
48
|
Almagro JC, Beavers MP, Hernandez-Guzman F, Maier J, Shaulsky J, Butenhof K, Labute P, Thorsteinson N, Kelly K, Teplyakov A, Luo J, Sweet R, Gilliland GL. Antibody modeling assessment. Proteins 2011; 79:3050-66. [PMID: 21935986 DOI: 10.1002/prot.23130] [Citation(s) in RCA: 95] [Impact Index Per Article: 7.3] [Received: 02/22/2011] [Revised: 06/09/2011] [Accepted: 07/03/2011] [Indexed: 12/27/2022]
Abstract
A blinded study to assess the state of the art in three-dimensional structure modeling of the variable region (Fv) of antibodies was conducted. Nine unpublished high-resolution x-ray Fab crystal structures covering a wide range of antigen-binding site conformations were used as a benchmark to compare Fv models generated by four structure prediction methodologies. The methodologies included two homology modeling strategies independently developed by CCG (Chemical Computing Group) and Accelrys Inc, and two fully automated antibody modeling servers: PIGS (Prediction of ImmunoGlobulin Structure), based on the canonical structure model, and Rosetta Antibody Modeling, based on homology modeling and Rosetta structure prediction methodology. The benchmark structure sequences were submitted to Accelrys and CCG and a set of models for each of the nine antibody structures was generated. PIGS and Rosetta models were obtained using the default parameters of the servers. In most cases, we found good agreement between the models and x-ray structures. The average rmsd (root mean square deviation) values calculated over the backbone atoms between the models and structures were fairly consistent, around 1.2 Å. Average rmsd values of the framework and hypervariable loops with canonical structures (L1, L2, L3, H1, and H2) were close to 1.0 Å. H3 prediction yielded rmsd values around 3.0 Å for most of the models. Quality assessment of the models and the relative strengths and weaknesses of the methods are discussed. We hope this initiative will serve as a model of scientific partnership and look forward to future antibody modeling assessments.
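For readers unfamiliar with the rmsd values quoted in this abstract, a minimal sketch of the calculation follows. It assumes the model and the x-ray structure have already been superposed and that the two coordinate lists pair up the same backbone atoms in the same order; the superposition step is not shown, and the function name `rmsd` and the tuple layout are hypothetical rather than taken from any of the tools named above.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]  # x, y, z in Å


def rmsd(model: Sequence[Point], reference: Sequence[Point]) -> float:
    """Root mean square deviation over paired atoms.

    Assumption: both structures are already superposed and atom i in
    `model` corresponds to atom i in `reference`.
    """
    if len(model) != len(reference) or not model:
        raise ValueError("coordinate lists must be non-empty and equal in length")

    # Sum of squared distances between corresponding atoms.
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model, reference)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model))
```

Restricting the input lists to a single loop (for example, only H3 backbone atoms) gives a per-region value of the kind reported in the abstract, provided the superposition convention is the same.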
|
49
|
MacCallum JL, Pérez A, Schnieders MJ, Hua L, Jacobson MP, Dill KA. Assessment of protein structure refinement in CASP9. Proteins 2011; 79 Suppl 10:74-90. [PMID: 22069034 DOI: 10.1002/prot.23131] [Citation(s) in RCA: 78] [Impact Index Per Article: 6.0] [Received: 03/31/2011] [Revised: 06/15/2011] [Accepted: 07/03/2011] [Indexed: 11/06/2022]
Abstract
We assess performance in the structure refinement category in CASP9. Two years after CASP8, the performance of the best groups has not improved. There are few groups that improve any of our assessment scores with statistical significance. Some predictors, however, are able to consistently improve the physicality of the models. Although we cannot identify any clear bottleneck in improving refinement, several points arise: (1) The refinement portion of CASP has too few targets to make many statistically meaningful conclusions. (2) Predictors are usually very conservative, limiting the possibility of large improvements in models. (3) No group is actually able to correctly rank their five submissions, indicating that potentially better models may be discarded. (4) Different sampling strategies work better for different refinement problems; there is no single strategy that works on all targets. In general, conservative strategies do better, while the greatest improvements come from more adventurous sampling, at the cost of consistency. Comparison with experimental data reveals aspects not captured by comparison to a single structure. In particular, we show that improvement in backbone geometry does not always mean better agreement with experimental data. Finally, we demonstrate that even given the current challenges facing refinement, the refined models are useful for solving the crystallographic phase problem through molecular replacement.
Affiliation(s)
- Justin L MacCallum
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA.
|
50
|
Kalia M, Kukol A. Structure and dynamics of the kinase IKK-β--A key regulator of the NF-kappa B transcription factor. J Struct Biol 2011; 176:133-42. [PMID: 21820058 DOI: 10.1016/j.jsb.2011.07.012] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 01/25/2011] [Revised: 07/19/2011] [Accepted: 07/20/2011] [Indexed: 12/29/2022]
Abstract
The inhibitor κB kinase-β (IKK-β) phosphorylates the NF-κB inhibitor protein IκB leading to the translocation of the transcription factor NF-κB to the nucleus. The transcription factor NF-κB and consequently IKK-β are central to signal transduction pathways of mammalian cells. The purpose of this research was to develop a 3D structural model of the IKK-β kinase domain with its ATP cofactor and investigate its dynamics and ligand binding potential. Through a combination of comparative modelling and simulated heating/annealing molecular dynamics (SAMD) simulation in explicit water the model accuracy could be substantially improved compared to comparative modelling on its own as shown by model validation measures. The structure revealed the details of ATP/Mg(2+) binding indicating hydrophobic interactions with the adenine base and a significant contribution of Mg(2+) as a bridge between ATP phosphate groups and negatively charged side chains. The molecular dynamics trajectories of the ATP-bound and free enzyme showed two conformations in each case, which contributed to the majority of the trajectory. The ATP-free enzyme revealed a novel binding site distant from the ATP binding site that was not encountered in the ATP bound enzyme. Based on the overall structural flexibility, it is suggested that a truncated version of the kinase domain from Ala14 to Leu265 should be subjected to crystallisation trials. The 3D structure of this enzyme will enable rational design of new ligands and analysis of protein-protein interactions. Furthermore, our results may provide a new impetus for wet-lab based structural investigation focussing on a truncated kinase domain.
Affiliation(s)
- Munishikha Kalia
- School of Life Sciences, University of Hertfordshire, Hatfield AL10 9AB, United Kingdom
|