1
Jiang D, Zhang J, Shen W, Sun Y, Wang Z, Wang J, Zhang J, Zhang G, Zhang G, Wang Y, Cai S, Zhang J, Wang Y, Liu R, Bai T, Sun Y, Yang S, Ma Z, Li Z, Li J, Ma C, Cheng L, Sun B, Yang K. DNA Vaccines Encoding HTNV GP-Derived Th Epitopes Benefited from a LAMP-Targeting Strategy and Established Cellular Immunoprotection. Vaccines (Basel) 2024; 12:928. [PMID: 39204051 PMCID: PMC11359959 DOI: 10.3390/vaccines12080928] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2024] [Revised: 08/06/2024] [Accepted: 08/14/2024] [Indexed: 09/03/2024]
Abstract
Vaccines have long been the focus of antiviral immunotherapy research. Viral epitopes are thought to be useful biomarkers for immunotherapy (both antibody-based and cellular). In this study, we designed a novel vaccine molecule, the Hantaan virus (HTNV) glycoprotein (GP) tandem Th epitope molecule (named the Gnc molecule), in silico. Subsequently, computational analysis was used to study the properties of the molecule and its predicted effects as a vaccine molecule in the body. The Gnc molecule was designed for DNA vaccines and optimized with a lysosomal-targeting membrane protein (LAMP) strategy. The effects of GP-derived Th epitopes and multiepitope vaccines were initially verified in animals. Our research has resulted in the design of two vaccines based on effective antiviral immune targets. The effectiveness of these molecular therapies has also been preliminarily demonstrated in silico and in laboratory animals, which lays a foundation for the application of a vaccine strategy in the field of antivirals.
Affiliation(s)
- Dongbo Jiang
- Department of Immunology, The Key Laboratory of Bio-Hazard Damage and Prevention Medicine, Basic Medicine School, Air-Force Medical University (the Fourth Military Medical University), Xi’an 710032, China
- Department of Microbiology, Basic Medicine School, Air-Force Medical University (the Fourth Military Medical University), Xi’an 710032, China
- Junqi Zhang
- Department of Immunology, The Key Laboratory of Bio-Hazard Damage and Prevention Medicine, Basic Medicine School, Air-Force Medical University (the Fourth Military Medical University), Xi’an 710032, China
- Wenyang Shen
- Department of Immunology, The Key Laboratory of Bio-Hazard Damage and Prevention Medicine, Basic Medicine School, Air-Force Medical University (the Fourth Military Medical University), Xi’an 710032, China
- Yubo Sun
- Department of Immunology, The Key Laboratory of Bio-Hazard Damage and Prevention Medicine, Basic Medicine School, Air-Force Medical University (the Fourth Military Medical University), Xi’an 710032, China
- Zhenjie Wang
- Department of Immunology, The Key Laboratory of Bio-Hazard Damage and Prevention Medicine, Basic Medicine School, Air-Force Medical University (the Fourth Military Medical University), Xi’an 710032, China
- Jiawei Wang
- Department of Immunology, The Key Laboratory of Bio-Hazard Damage and Prevention Medicine, Basic Medicine School, Air-Force Medical University (the Fourth Military Medical University), Xi’an 710032, China
- Jinpeng Zhang
- Department of Immunology, The Key Laboratory of Bio-Hazard Damage and Prevention Medicine, Basic Medicine School, Air-Force Medical University (the Fourth Military Medical University), Xi’an 710032, China
- Guanwen Zhang
- Department of Immunology, The Key Laboratory of Bio-Hazard Damage and Prevention Medicine, Basic Medicine School, Air-Force Medical University (the Fourth Military Medical University), Xi’an 710032, China
- Gefei Zhang
- Department of Immunology, The Key Laboratory of Bio-Hazard Damage and Prevention Medicine, Basic Medicine School, Air-Force Medical University (the Fourth Military Medical University), Xi’an 710032, China
- Yueyue Wang
- Department of Immunology, The Key Laboratory of Bio-Hazard Damage and Prevention Medicine, Basic Medicine School, Air-Force Medical University (the Fourth Military Medical University), Xi’an 710032, China
- Sirui Cai
- Department of Immunology, The Key Laboratory of Bio-Hazard Damage and Prevention Medicine, Basic Medicine School, Air-Force Medical University (the Fourth Military Medical University), Xi’an 710032, China
- Jiaxing Zhang
- Department of Immunology, The Key Laboratory of Bio-Hazard Damage and Prevention Medicine, Basic Medicine School, Air-Force Medical University (the Fourth Military Medical University), Xi’an 710032, China
- Yongkai Wang
- Department of Immunology, The Key Laboratory of Bio-Hazard Damage and Prevention Medicine, Basic Medicine School, Air-Force Medical University (the Fourth Military Medical University), Xi’an 710032, China
- Ruibo Liu
- Department of Immunology, The Key Laboratory of Bio-Hazard Damage and Prevention Medicine, Basic Medicine School, Air-Force Medical University (the Fourth Military Medical University), Xi’an 710032, China
- Tianyuan Bai
- Department of Immunology, The Key Laboratory of Bio-Hazard Damage and Prevention Medicine, Basic Medicine School, Air-Force Medical University (the Fourth Military Medical University), Xi’an 710032, China
- Yuanjie Sun
- Department of Immunology, The Key Laboratory of Bio-Hazard Damage and Prevention Medicine, Basic Medicine School, Air-Force Medical University (the Fourth Military Medical University), Xi’an 710032, China
- Shuya Yang
- Department of Immunology, The Key Laboratory of Bio-Hazard Damage and Prevention Medicine, Basic Medicine School, Air-Force Medical University (the Fourth Military Medical University), Xi’an 710032, China
- Zilu Ma
- Department of Immunology, The Key Laboratory of Bio-Hazard Damage and Prevention Medicine, Basic Medicine School, Air-Force Medical University (the Fourth Military Medical University), Xi’an 710032, China
- Zhikui Li
- Department of Immunology, The Key Laboratory of Bio-Hazard Damage and Prevention Medicine, Basic Medicine School, Air-Force Medical University (the Fourth Military Medical University), Xi’an 710032, China
- Jijin Li
- Department of Immunology, The Key Laboratory of Bio-Hazard Damage and Prevention Medicine, Basic Medicine School, Air-Force Medical University (the Fourth Military Medical University), Xi’an 710032, China
- Chenjin Ma
- Department of Immunology, The Key Laboratory of Bio-Hazard Damage and Prevention Medicine, Basic Medicine School, Air-Force Medical University (the Fourth Military Medical University), Xi’an 710032, China
- Linfeng Cheng
- Department of Microbiology, Basic Medicine School, Air-Force Medical University (the Fourth Military Medical University), Xi’an 710032, China
- Baozeng Sun
- Department of Immunology, The Key Laboratory of Bio-Hazard Damage and Prevention Medicine, Basic Medicine School, Air-Force Medical University (the Fourth Military Medical University), Xi’an 710032, China
- Yingtan Detachment, Jiangxi General Hospital, Chinese People’s Armed Police Force, Nanchang 330001, China
- Kun Yang
- Department of Immunology, The Key Laboratory of Bio-Hazard Damage and Prevention Medicine, Basic Medicine School, Air-Force Medical University (the Fourth Military Medical University), Xi’an 710032, China
2
De Salis SKF, Chen JZ, Skarratt KK, Fuller SJ, Balle T. Deep learning structural insights into heterotrimeric alternatively spliced P2X7 receptors. Purinergic Signal 2024; 20:431-447. [PMID: 38032425 DOI: 10.1007/s11302-023-09978-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Accepted: 10/31/2023] [Indexed: 12/01/2023]
Abstract
P2X7 receptors (P2X7Rs) are membrane-bound ATP-gated ion channels that are composed of three subunits. Different subunit structures may be expressed due to alternative splicing of the P2RX7 gene, altering the receptor's function when combined with the wild-type P2X7A subunits. In this study, the application of the deep-learning method AlphaFold2-Multimer (AF2M) for the generation of trimeric P2X7Rs was validated by comparing an AF2M-generated rat wild-type P2X7A receptor with a structure determined by cryogenic electron microscopy (cryo-EM) (Protein Data Bank Identification: 6U9V). The results suggested that AF2M could, firstly, accurately predict the structures of P2X7Rs and, secondly, accurately identify the highest-quality model through its ranking system. Subsequently, AF2M was used to generate models of heterotrimeric alternatively spliced P2X7Rs consisting of one or two wild-type P2X7A subunits in combination with one or two P2X7B, P2X7E, P2X7J, and P2X7L splice variant subunits. The top-ranking models were deemed valid based on AF2M's confidence measures, stability in molecular dynamics simulations, and consistent flexibility of the conserved regions between the models. The structures of the heterotrimeric receptors, which were missing key residues in the ATP binding sites and carboxyl terminal domains (CTDs) compared to the wild-type receptor, help to explain their observed functions. Overall, the models produced in this study (available as supplementary material) unlock the possibility of structure-based studies into the heterotrimeric P2X7Rs.
Affiliation(s)
- Sophie K F De Salis
- Brain and Mind Centre, The University of Sydney, Camperdown, NSW, 2050, Australia
- Sydney Pharmacy School, The University of Sydney, Camperdown, NSW, 2050, Australia
- Jake Zheng Chen
- Brain and Mind Centre, The University of Sydney, Camperdown, NSW, 2050, Australia
- Sydney Pharmacy School, The University of Sydney, Camperdown, NSW, 2050, Australia
- Kristen K Skarratt
- The University of Sydney, Nepean Clinical School, Kingswood, NSW, 2747, Australia
- Stephen J Fuller
- The University of Sydney, Nepean Clinical School, Kingswood, NSW, 2747, Australia
- Thomas Balle
- Brain and Mind Centre, The University of Sydney, Camperdown, NSW, 2050, Australia
- Sydney Pharmacy School, The University of Sydney, Camperdown, NSW, 2050, Australia
3
Simpkin AJ, Mesdaghi S, Sánchez Rodríguez F, Elliott L, Murphy DL, Kryshtafovych A, Keegan RM, Rigden DJ. Tertiary structure assessment at CASP15. Proteins 2023; 91:1616-1635. [PMID: 37746927 PMCID: PMC10792517 DOI: 10.1002/prot.26593] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 05/24/2023] [Revised: 08/25/2023] [Accepted: 09/07/2023] [Indexed: 09/26/2023]
Abstract
The results of tertiary structure assessment at CASP15 are reported. For the first time, recognizing the outstanding performance of AlphaFold 2 (AF2) at CASP14, all single-chain predictions were assessed together, irrespective of whether a template was available. At CASP15, there was no single stand-out group, with most of the best-scoring groups (led by PEZYFoldings, UM-TBM, and Yang Server) employing AF2 in one way or another. Many top groups paid special attention to generating deep Multiple Sequence Alignments (MSAs) and testing variant MSAs, thereby allowing them to successfully address some of the hardest targets. Such difficult targets, as well as lacking templates, were typically proteins with few homologues. Local divergence between prediction and target correlated with localization at crystal lattice or chain interfaces, and with regions exhibiting high B-factors in crystal structure targets, and should not necessarily be considered as representing error in the prediction. However, analysis of exposed and buried side chain accuracy showed room for improvement even in the latter. Nevertheless, a majority of groups produced high-quality predictions for most targets, which are valuable for experimental structure determination, functional analysis, and many other tasks across biology. These include those applying methods similar to those used to generate major resources such as the AlphaFold Protein Structure Database and the ESM Metagenomic Atlas: the confidence estimates of the former were also notably accurate.
Affiliation(s)
- Adam J. Simpkin
- Department of Biochemistry, Cell and Systems Biology, Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
- Shahram Mesdaghi
- Department of Biochemistry, Cell and Systems Biology, Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
- Computational Biology Facility, MerseyBio, University of Liverpool, Liverpool, UK
- Filomeno Sánchez Rodríguez
- Department of Biochemistry, Cell and Systems Biology, Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
- Life Science, Diamond Light Source, Harwell Science and Innovation Campus, Oxfordshire, UK
- Department of Chemistry, York Structural Biology Laboratory, University of York, York, UK
- Luc Elliott
- Department of Biochemistry, Cell and Systems Biology, Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
- David L. Murphy
- Department of Biochemistry, Cell and Systems Biology, Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
- Ronan M. Keegan
- UKRI‐STFC, Rutherford Appleton Laboratory, Research Complex at Harwell, Didcot, UK
- Daniel J. Rigden
- Department of Biochemistry, Cell and Systems Biology, Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
4
Roy RS, Liu J, Giri N, Guo Z, Cheng J. Combining pairwise structural similarity and deep learning interface contact prediction to estimate protein complex model accuracy in CASP15. Proteins 2023; 91:1889-1902. [PMID: 37357816 PMCID: PMC10749984 DOI: 10.1002/prot.26542] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 03/11/2023] [Revised: 06/07/2023] [Accepted: 06/08/2023] [Indexed: 06/27/2023]
Abstract
Estimating the accuracy of quaternary structural models of protein complexes and assemblies (EMA) is important for predicting quaternary structures and applying them to studying protein function and interaction. The pairwise similarity between structural models has proven useful for estimating the quality of protein tertiary structural models, but it has rarely been applied to predicting the quality of quaternary structural models. Moreover, the pairwise similarity approach often fails when many structural models are of low quality and similar to each other. To address the gap, we developed a hybrid method (MULTICOM_qa) combining a pairwise similarity score (PSS) and an interface contact probability score (ICPS) based on the deep learning inter-chain contact prediction for estimating protein complex model accuracy. It blindly participated in the 15th Critical Assessment of Techniques for Protein Structure Prediction (CASP15) in 2022 and performed very well in estimating the global structure accuracy of assembly models. The average per-target correlation coefficient between the model quality scores predicted by MULTICOM_qa and the true quality scores of the models of CASP15 assembly targets is 0.66. The average per-target ranking loss in using the predicted quality scores to rank the models is 0.14. It was able to select good models for most targets. Moreover, several key factors (i.e., target difficulty, model sampling difficulty, skewness of model quality, and similarity between good/bad models) for EMA are identified and analyzed. The results demonstrate that combining the multi-model method (PSS) with the complementary single-model method (ICPS) is a promising approach to EMA.
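The two evaluation figures quoted in this abstract (average per-target correlation and average per-target ranking loss) can be illustrated with a minimal Python sketch. It assumes the convention commonly used in model accuracy estimation, in which the ranking loss for a target is the true score of the best available model minus the true score of the model ranked first by the predictor; the abstract itself does not spell out this definition. The function names and the input layout are hypothetical and are not taken from MULTICOM_qa.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for constant scores; report 0
    return cov / (var_x * var_y) ** 0.5


def ranking_loss(predicted, true):
    """True score of the best model minus true score of the top-ranked model.

    Assumed definition (common EMA convention), not quoted from the abstract.
    """
    top = max(range(len(predicted)), key=predicted.__getitem__)
    return max(true) - true[top]


def per_target_summary(targets):
    """Average correlation and ranking loss over targets.

    `targets` maps a target name to (predicted_scores, true_scores).
    """
    corrs = [pearson(p, t) for p, t in targets.values()]
    losses = [ranking_loss(p, t) for p, t in targets.values()]
    return mean(corrs), mean(losses)
```

A loss of 0 for a target means the predictor placed the truly best model first; the averages are taken per target so that targets with many models do not dominate.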
Affiliation(s)
- Raj S. Roy
- Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
- Jian Liu
- Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
- Nabin Giri
- Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
- Zhiye Guo
- Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
- Jianlin Cheng
- Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
5
Huang GJ, Parry TK, McLaughlin WA. Assessment of the Performances of the Protein Modeling Techniques Participating in CASP15 Using a Structure-Based Functional Site Prediction Approach: ResiRole. Bioengineering (Basel) 2023; 10:1377. [PMID: 38135968 PMCID: PMC10740689 DOI: 10.3390/bioengineering10121377] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 11/27/2023] [Accepted: 11/28/2023] [Indexed: 12/24/2023]
Abstract
BACKGROUND: Model quality assessments via computational methods, which entail comparisons of the modeled structures to the experimentally determined structures, are essential in the field of protein structure prediction. The assessments provide means to benchmark the accuracies of the modeling techniques and to aid with their development. We previously described the ResiRole method to gauge model quality principally based on the preservation of the structural characteristics described in SeqFEATURE functional site prediction models.
METHODS: We apply ResiRole to benchmark modeling group performances in the Critical Assessment of Structure Prediction experiment, round 15. To gauge model quality, a normalized Predicted Functional site Similarity Score (PFSS) was calculated as the average of one minus the absolute values of the differences of the functional site prediction probabilities, as found for the experimental structures versus those found at the corresponding sites in the structure models.
RESULTS: The average PFSS per modeling group (gPFSS) correlates with standard quality metrics, and can effectively be used to rank the accuracies of the groups. For the free modeling (FM) category, correlation coefficients of the Local Distance Difference Test (LDDT) and Global Distance Test-Total Score (GDT-TS) metrics with gPFSS were 0.98239 and 0.87691, respectively. An example finding for a specific group is that the gPFSS for EMBER3D was higher than expected based on the predictive relationship between gPFSS and LDDT. We infer the result is due to the use of constraints imprinted by function that are a part of the EMBER3D methodology. Also, we find functional site predictions that may guide further functional characterizations of the respective proteins.
CONCLUSION: The gPFSS metric provides an effective means to assess and rank the performances of the structure prediction techniques according to their abilities to accurately reproduce the structural features at predicted functional sites.
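The PFSS definition given in the methods can be written out as a short Python sketch: for each functional site, take one minus the absolute difference between the prediction probability on the experimental structure and on the model, then average. The function names and the list-based input layout are assumptions made for illustration and are not part of ResiRole; the gPFSS helper simply averages per-model scores for one group, following the wording of the abstract.

```python
def pfss(p_experimental, p_model):
    """Average of 1 - |p_exp - p_model| over corresponding functional sites.

    Both arguments are sequences of functional site prediction
    probabilities in [0, 1], aligned site by site (assumed layout).
    """
    if len(p_experimental) != len(p_model) or not p_experimental:
        raise ValueError("need two non-empty, equally long probability lists")
    total = sum(1.0 - abs(pe - pm) for pe, pm in zip(p_experimental, p_model))
    return total / len(p_experimental)


def gpfss(model_scores):
    """Average PFSS over all models submitted by one modeling group."""
    if not model_scores:
        raise ValueError("group has no scored models")
    return sum(model_scores) / len(model_scores)
```

A score of 1 means the model yields the same functional site probabilities as the experimental structure at every site; larger probability differences pull the score toward 0.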
Affiliation(s)
- William A. McLaughlin
- Department of Medical Education, Geisinger Commonwealth School of Medicine, 525 Pine Street, Scranton, PA 18509, USA
6
Roy RS, Liu J, Giri N, Guo Z, Cheng J. Combining pairwise structural similarity and deep learning interface contact prediction to estimate protein complex model accuracy in CASP15. bioRxiv 2023:2023.03.08.531814. [PMID: 36945536 PMCID: PMC10028888 DOI: 10.1101/2023.03.08.531814] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Abstract
Estimating the accuracy of quaternary structural models of protein complexes and assemblies (EMA) is important for predicting quaternary structures and applying them to studying protein function and interaction. The pairwise similarity between structural models has proven useful for estimating the quality of protein tertiary structural models, but it has rarely been applied to predicting the quality of quaternary structural models. Moreover, the pairwise similarity approach often fails when many structural models are of low quality and similar to each other. To address the gap, we developed a hybrid method (MULTICOM_qa) combining a pairwise similarity score (PSS) and an interface contact probability score (ICPS) based on the deep learning inter-chain contact prediction for estimating protein complex model accuracy. It blindly participated in the 15th Critical Assessment of Techniques for Protein Structure Prediction (CASP15) in 2022 and ranked first out of 24 predictors in estimating the global accuracy of assembly models. The average per-target correlation coefficient between the model quality scores predicted by MULTICOM_qa and the true quality scores of the models of CASP15 assembly targets is 0.66. The average per-target ranking loss in using the predicted quality scores to rank the models is 0.14. It was able to select good models for most targets. Moreover, several key factors (i.e., target difficulty, model sampling difficulty, skewness of model quality, and similarity between good/bad models) for EMA are identified and analyzed. The results demonstrate that combining the multi-model method (PSS) with the complementary single-model method (ICPS) is a promising approach to EMA. The source code of MULTICOM_qa is available at https://github.com/BioinfoMachineLearning/MULTICOM_qa.
Affiliation(s)
- Raj S. Roy
- Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
- Jian Liu
- Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
- Nabin Giri
- Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
- Zhiye Guo
- Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
- Jianlin Cheng
- Department of Electrical Engineering and Computer Science, NextGen Precision Health, University of Missouri, Columbia, MO 65211, USA
7
Moafinejad SN, Pandaranadar Jeyeram IPN, Jaryani F, Shirvanizadeh N, Baulin EF, Bujnicki JM. 1D2DSimScore: A novel method for comparing contacts in biomacromolecules and their complexes. Protein Sci 2023; 32:e4503. [PMID: 36369832 PMCID: PMC9795538 DOI: 10.1002/pro.4503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2022] [Revised: 10/28/2022] [Accepted: 11/01/2022] [Indexed: 11/13/2022]
Abstract
The biologically relevant structures of proteins and nucleic acids and their complexes are dynamic. They include a combination of regions ranging from rigid structural segments to structural switches to regions that are almost always disordered, which interact with each other in various ways. Comparing conformational changes and variation in contacts between different conformational states is essential to understand the biological functions of proteins, nucleic acids, and their complexes. Here, we describe a new computational tool, 1D2DSimScore, for comparing contacts and contact interfaces in all kinds of macromolecules and macromolecular complexes, including proteins, nucleic acids, and other molecules. 1D2DSimScore can be used to compare structural features of macromolecular models between alternative structures obtained in a particular experiment or to score various predictions against a defined "ideal" reference structure. Comparisons at the level of contacts are particularly useful for flexible molecules, for which comparisons in 3D that require rigid-body superpositions are difficult, and in biological systems where the formation of specific inter-residue contacts is more relevant for the biological function than the maintenance of a specific global 3D structure. Similarity/dissimilarity scores calculated by 1D2DSimScore can be used to complement scores describing 3D structural similarity measures calculated by the existing tools.
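To make the idea of comparing structures at the level of contacts concrete, here is a minimal Python sketch that needs no 3D superposition. It is not the 1D2DSimScore algorithm or its scoring scheme: the single representative point per residue, the 8 Å distance cutoff, and the Jaccard index as the similarity measure are all assumptions chosen for illustration, and the function names are hypothetical.

```python
from itertools import combinations
from math import dist


def contact_set(coords, cutoff=8.0, min_separation=1):
    """Residue index pairs whose representative points lie within `cutoff`.

    `coords` holds one (x, y, z) tuple per residue; the 8.0 cutoff is an
    illustrative assumption, not a value from the abstract.
    """
    return {
        (i, j)
        for i, j in combinations(range(len(coords)), 2)
        if j - i >= min_separation and dist(coords[i], coords[j]) <= cutoff
    }


def jaccard(contacts_a, contacts_b):
    """Shared contacts divided by all contacts seen in either structure."""
    if not contacts_a and not contacts_b:
        return 1.0  # two contact-free structures are treated as identical
    return len(contacts_a & contacts_b) / len(contacts_a | contacts_b)
```

Because only the sets of contacting pairs are compared, the score is unaffected by rigid-body motions of the whole molecule, which is the property the abstract highlights for flexible systems.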
Affiliation(s)
- S. Naeim Moafinejad
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Farhang Jaryani
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Niloofar Shirvanizadeh
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Eugene F. Baulin
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Janusz M. Bujnicki
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
8
Roy RS, Quadir F, Soltanikazemi E, Cheng J. OUP accepted manuscript. Bioinformatics 2022; 38:1904-1910. [PMID: 35134816 PMCID: PMC8963319 DOI: 10.1093/bioinformatics/btac063] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 09/18/2021] [Revised: 01/17/2022] [Accepted: 01/31/2022] [Indexed: 11/23/2022]
Abstract
Motivation: Deep learning has revolutionized protein tertiary structure prediction recently. The cutting-edge deep learning methods such as AlphaFold can predict high-accuracy tertiary structures for most individual protein chains. However, the accuracy of predicting quaternary structures of protein complexes consisting of multiple chains is still relatively low due to the lack of advanced deep learning methods in the field. Because interchain residue–residue contacts can be used as distance restraints to guide quaternary structure modeling, here we develop a deep dilated convolutional residual network method (DRCon) to predict interchain residue–residue contacts in homodimers from residue–residue co-evolutionary signals derived from multiple sequence alignments of monomers, intrachain residue–residue contacts of monomers extracted from true/predicted tertiary structures or predicted by deep learning, and other sequence and structural features.
Results: Tested on three homodimer test datasets (Homo_std dataset, DeepHomo dataset and CASP-CAPRI dataset), the precision of DRCon for top L/5 interchain contact predictions (L: length of monomer in a homodimer) is 43.46%, 47.10% and 33.50%, respectively, at a 6 Å contact threshold, which is substantially better than DeepHomo and DNCON2_inter and similar to Glinter. Moreover, our experiments demonstrate that using predicted tertiary structure or intrachain contacts of monomers in the unbound state as input, DRCon still performs well, even though its accuracy is lower than when true tertiary structures in the bound state are used as input. Finally, our case study shows that good interchain contact predictions can be used to build high-accuracy quaternary structure models of homodimers.
Availability and implementation: The source code of DRCon is available at https://github.com/jianlin-cheng/DRCon. The datasets are available at https://zenodo.org/record/5998532#.YgF70vXMKsB.
Supplementary information: Supplementary data are available at Bioinformatics online.
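The top L/5 precision quoted in this abstract can be illustrated with a short sketch. This is not DRCon's evaluation code: the function name `top_fraction_precision`, the list-of-lists inputs, and the use of `<=` at the 6 Å threshold are assumptions made for illustration. The sketch ranks all interchain residue pairs by predicted contact probability, keeps the top L/5, and reports the fraction whose true interchain distance falls within the threshold.

```python
def top_fraction_precision(pred_prob, true_dist, divisor=5, threshold=6.0):
    """Precision of the top L/divisor predicted interchain contacts.

    pred_prob[i][j]: predicted contact probability between residue i of
                     chain A and residue j of chain B (hypothetical input).
    true_dist[i][j]: true interchain distance in angstroms for the same pair.
    Both matrices are L x L because the two chains of a homodimer have the
    same length L.
    """
    L = len(pred_prob)
    k = max(1, L // divisor)  # number of top-ranked pairs to evaluate

    # Rank every residue pair by predicted probability, highest first.
    ranked = sorted(
        ((pred_prob[i][j], i, j) for i in range(L) for j in range(L)),
        key=lambda item: item[0],
        reverse=True,
    )

    # Count how many of the top-k pairs are true contacts (assumed: <= threshold).
    hits = sum(1 for _, i, j in ranked[:k] if true_dist[i][j] <= threshold)
    return hits / k


# Toy homodimer with L = 10, so the top L/5 = 2 pairs are evaluated.
L = 10
pred = [[0.0] * L for _ in range(L)]
dist = [[20.0] * L for _ in range(L)]
pred[0][1], pred[2][3] = 0.9, 0.8   # the two most confident predictions
dist[0][1] = 4.0                    # only the first one is a true contact
precision = top_fraction_precision(pred, dist)
```

In this toy case one of the two top-ranked pairs is a true contact, so the function returns a precision of one half.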
Affiliation(s)
- Raj S Roy
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
- Farhan Quadir
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
- Elham Soltanikazemi
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, USA
|
9
|
Cragnolini T, Kryshtafovych A, Topf M. Cryo-EM targets in CASP14. Proteins 2021; 89:1949-1958. [PMID: 34398978 PMCID: PMC8630773 DOI: 10.1002/prot.26216] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/07/2021] [Revised: 07/27/2021] [Accepted: 08/06/2021] [Indexed: 11/22/2022]
Abstract
Structures of seven CASP14 targets were determined using the cryo-electron microscopy (cryo-EM) technique, with resolutions between 2.1 and 3.8 Å. We provide an evaluation of the submitted models versus the experimental data (cryo-EM density maps) and experimental reference structures built into the maps. The accuracy of models is measured in terms of coordinate-to-density and coordinate-to-coordinate fit. A-posteriori refinement of the most accurate models in their corresponding cryo-EM density resulted in structures that are close to the reference structure, including some regions with better fit to the density. Regions that were found to be less "refineable" correlate well with regions of high diversity between the CASP models and low goodness-of-fit to density in the reference structure.
Affiliation(s)
- Tristan Cragnolini
- Institute of Structural and Molecular Biology, Birkbeck, University College London, London, UK
- Maya Topf
- Center for Structural Systems Biology, Leibniz-Institut für Experimentelle Virologie and Universitätsklinikum Hamburg-Eppendorf (UKE), Hamburg, Germany
|
10
|
Simpkin AJ, Rodríguez FS, Mesdaghi S, Kryshtafovych A, Rigden DJ. Evaluation of model refinement in CASP14. Proteins 2021; 89:1852-1869. [PMID: 34288138 PMCID: PMC8616799 DOI: 10.1002/prot.26185] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 04/30/2021] [Revised: 06/19/2021] [Accepted: 07/11/2021] [Indexed: 12/15/2022]
Abstract
We report here an assessment of the model refinement category of the 14th round of Critical Assessment of Structure Prediction (CASP14). As before, predictors submitted up to five ranked refinements, along with associated residue-level error estimates, for targets that had a wide range of starting quality. The ability of groups to accurately rank their submissions and to predict coordinate error varied widely. Overall, only four groups out-performed a "naïve predictor" corresponding to the resubmission of the starting model. Among the top groups, there are interesting differences of approach and in the spread of improvements seen: some methods are more conservative, others more adventurous. Some targets were "double-barreled" for which predictors were offered a high-quality AlphaFold 2 (AF2)-derived prediction alongside another of lower quality. The AF2-derived models were largely unimprovable, many of their apparent errors being found to reside at domain and, especially, crystal lattice contacts. Refinement is shown to have a mixed impact overall on structure-based function annotation methods to predict nucleic acid binding, spot catalytic sites, and dock protein structures.
Affiliation(s)
- Adam J. Simpkin
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
- Filomeno Sánchez Rodríguez
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
- Life Science, Diamond Light Source, Harwell Science and Innovation Campus, Didcot, Oxfordshire OX11 0DE, England
- Shahram Mesdaghi
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
- Daniel J. Rigden
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
|
11
|
Kryshtafovych A, Moult J, Billings WM, Della Corte D, Fidelis K, Kwon S, Olechnovič K, Seok C, Venclovas Č, Won J. Modeling SARS-CoV-2 proteins in the CASP-commons experiment. Proteins 2021; 89:1987-1996. [PMID: 34462960 PMCID: PMC8616790 DOI: 10.1002/prot.26231] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 05/05/2021] [Revised: 08/23/2021] [Accepted: 08/26/2021] [Indexed: 01/21/2023]
Abstract
Critical Assessment of Structure Prediction (CASP) is an organization aimed at advancing the state of the art in computing protein structure from sequence. In the spring of 2020, CASP launched a community project to compute the structures of the most structurally challenging proteins coded for in the SARS-CoV-2 genome. Forty-seven research groups submitted over 3000 three-dimensional models and 700 sets of accuracy estimates on 10 proteins. The resulting models were released to the public. CASP community members also worked together to provide estimates of local and global accuracy and identify structure-based domain boundaries for some proteins. Subsequently, two of these structures (ORF3a and ORF8) have been solved experimentally, allowing assessment of both model quality and the accuracy estimates. Models from the AlphaFold2 group were found to have good agreement with the experimental structures, with main chain GDT_TS accuracy scores ranging from 63 (a correct topology) to 87 (competitive with experiment).
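For readers unfamiliar with the GDT_TS scores quoted in this abstract, the sketch below shows the usual definition: the mean, over cutoffs of 1, 2, 4 and 8 Å, of the percentage of Cα atoms lying within the cutoff of their reference positions. It is a simplification that assumes the per-residue distances come from one fixed superposition, whereas the standard GDT procedure searches over many superpositions for each cutoff. The function name `gdt_ts` and the input format are hypothetical.

```python
def gdt_ts(ca_distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS on a 0-100 scale.

    ca_distances: per-residue C-alpha distances (angstroms) between model and
                  reference after a single, already-chosen superposition
                  (an assumption; the full method optimizes the superposition
                  separately for each cutoff).
    """
    n = len(ca_distances)
    if n == 0:
        raise ValueError("at least one residue distance is required")

    # Fraction of residues within each distance cutoff.
    fractions = [sum(1 for d in ca_distances if d <= c) / n for c in cutoffs]

    # GDT_TS is the average of those fractions, expressed as a percentage.
    return 100.0 * sum(fractions) / len(cutoffs)


# Five residues with increasing deviation from the reference structure.
score = gdt_ts([0.5, 1.5, 3.0, 6.0, 10.0])
```

Under this definition a model with every Cα within 1 Å of the reference scores 100, and one with no Cα within 8 Å scores 0.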
Affiliation(s)
- John Moult
- Department of Cell Biology and Molecular genetics, Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland, USA
- Wendy M Billings
- Department of Physics & Astronomy, Brigham Young University, Provo, Utah, USA
- Dennis Della Corte
- Department of Physics & Astronomy, Brigham Young University, Provo, Utah, USA
- Krzysztof Fidelis
- Genome Center, University of California, Davis, Davis, California, USA
- Sohee Kwon
- Department of Chemistry, Seoul National University, Seoul, South Korea
- Kliment Olechnovič
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Chaok Seok
- Department of Chemistry, Seoul National University, Seoul, South Korea
- Česlovas Venclovas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Jonghun Won
- Department of Chemistry, Seoul National University, Seoul, South Korea
|
12
|
Kwon S, Won J, Kryshtafovych A, Seok C. Assessment of protein model structure accuracy estimation in CASP14: Old and new challenges. Proteins 2021; 89:1940-1948. [PMID: 34324227 DOI: 10.1002/prot.26192] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 04/30/2021] [Revised: 07/17/2021] [Accepted: 07/22/2021] [Indexed: 12/27/2022]
Abstract
In CASP, blind testing of model accuracy estimation methods has been conducted on models submitted by tertiary structure prediction servers. In CASP14, model accuracy estimation results were evaluated in terms of both global and local structure accuracy, as in the previous CASPs. Unlike the previous CASPs that did not show pronounced improvements in performance, the best single-model method (from the Baker group) showed an improved performance in CASP14, particularly in evaluating global structure accuracy when compared to both the best single-model methods in previous CASPs and the best multi-model methods in the current CASP. Although the CASP14 experiment on model accuracy estimation did not deal with the structures generated by AlphaFold2, new challenges that have arisen due to the success of AlphaFold2 are discussed.
Affiliation(s)
- Sohee Kwon
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Jonghun Won
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Galux Inc., Seoul, Republic of Korea
- Chaok Seok
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Galux Inc., Seoul, Republic of Korea
|
13
|
Pereira J, Simpkin AJ, Hartmann MD, Rigden DJ, Keegan RM, Lupas AN. High-accuracy protein structure prediction in CASP14. Proteins 2021; 89:1687-1699. [PMID: 34218458 DOI: 10.1002/prot.26171] [Citation(s) in RCA: 182] [Impact Index Per Article: 60.7] [Received: 05/05/2021] [Revised: 06/16/2021] [Accepted: 06/23/2021] [Indexed: 12/25/2022]
Abstract
The application of state-of-the-art deep-learning approaches to the protein modeling problem has expanded the "high-accuracy" category in CASP14 to encompass all targets. Building on the metrics used for high-accuracy assessment in previous CASPs, we evaluated the performance of all groups that submitted models for at least 10 targets across all difficulty classes, and judged the usefulness of those produced by AlphaFold2 (AF2) as molecular replacement search models with AMPLE. Driven by the qualitative diversity of the targets submitted to CASP, we also introduce DipDiff as a new measure for the improvement in backbone geometry provided by a model versus available templates. Although a large leap in high-accuracy is seen due to AF2, the second-best method in CASP14 out-performed the best in CASP13, illustrating the role of community-based benchmarking in the development and evolution of the protein structure prediction field.
Affiliation(s)
- Joana Pereira
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Adam J Simpkin
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
- Marcus D Hartmann
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Daniel J Rigden
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, UK
- Ronan M Keegan
- Department of Scientific Computing, Science and Technologies Facilities Council, UK Research and Innovation, Didcot, Oxfordshire, UK
- Andrei N Lupas
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany
|
14
|
Shuvo MH, Gulfam M, Bhattacharya D. DeepRefiner: high-accuracy protein structure refinement by deep network calibration. Nucleic Acids Res 2021; 49:W147-W152. [PMID: 33999209 PMCID: PMC8262753 DOI: 10.1093/nar/gkab361] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 03/08/2021] [Revised: 04/18/2021] [Accepted: 04/23/2021] [Indexed: 12/20/2022] Open
Abstract
The DeepRefiner webserver, freely available at http://watson.cse.eng.auburn.edu/DeepRefiner/, is an interactive and fully configurable online system for high-accuracy protein structure refinement. Fuelled by deep learning, DeepRefiner offers the ability to leverage cutting-edge deep neural network architectures which can be calibrated for on-demand selection of adventurous or conservative refinement modes targeted at degree or consistency of refinement. The method has been extensively tested in the Critical Assessment of Techniques for Protein Structure Prediction (CASP) experiments under the group name 'Bhattacharya-Server' and was officially ranked as the No. 2 refinement server in CASP13 (second only to 'Seok-server' and outperforming all other refinement servers) and No. 2 refinement server in CASP14 (second only to 'FEIG-S' and outperforming all other refinement servers including 'Seok-server'). The DeepRefiner web interface offers a number of convenient features, including (i) fully customizable refinement job submission and validation; (ii) automated job status update, tracking, and notifications; (iii) interactive and interpretable web-based results retrieval with quantitative and visual analysis; and (iv) extensive help information on job submission and results interpretation via web-based tutorial and help tooltips.
Affiliation(s)
- Md Hossain Shuvo
- Department of Computer Science and Software Engineering, Auburn University, Auburn, AL 36849, USA
- Muhammad Gulfam
- Department of Computer Science and Software Engineering, Auburn University, Auburn, AL 36849, USA
- Debswapna Bhattacharya
- Department of Computer Science and Software Engineering, Auburn University, Auburn, AL 36849, USA
- Department of Biological Sciences, Auburn University, Auburn, AL 36849, USA
|
15
|
Runthala A. Probabilistic divergence of a template-based modelling methodology from the ideal protocol. J Mol Model 2021; 27:25. [PMID: 33411019 DOI: 10.1007/s00894-020-04640-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/19/2020] [Accepted: 12/09/2020] [Indexed: 12/27/2022]
Abstract
Protein structural information is essential for the detailed mapping of a functional protein network. For a higher modelling accuracy and quicker implementation, template-based algorithms have been extensively deployed and redefined. The methods only assess the predicted structure against its native state/template and do not estimate the accuracy for each modelling step. A divergence measure is therefore postulated to estimate the modelling accuracy against its theoretical optimal benchmark. By freezing the domain boundaries, the divergence measures are predicted for the most crucial steps of a modelling algorithm. To precisely refine the score using weighting constants, big data analysis could further be deployed.
Affiliation(s)
- Ashish Runthala
- Department of Biotechnology, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur, Andhra Pradesh, 522502, India
|
16
|
Lawson CL, Kryshtafovych A, Adams PD, Afonine PV, Baker ML, Barad BA, Bond P, Burnley T, Cao R, Cheng J, Chojnowski G, Cowtan K, Dill KA, DiMaio F, Farrell DP, Fraser JS, Herzik MA, Hoh SW, Hou J, Hung LW, Igaev M, Joseph AP, Kihara D, Kumar D, Mittal S, Monastyrskyy B, Olek M, Palmer CM, Patwardhan A, Perez A, Pfab J, Pintilie GD, Richardson JS, Rosenthal PB, Sarkar D, Schäfer LU, Schmid MF, Schröder GF, Shekhar M, Si D, Singharoy A, Terashi G, Terwilliger TC, Vaiana A, Wang L, Wang Z, Wankowicz SA, Williams CJ, Winn M, Wu T, Yu X, Zhang K, Berman HM, Chiu W. Cryo-EM model validation recommendations based on outcomes of the 2019 EMDataResource challenge. Nat Methods 2021; 18:156-164. [PMID: 33542514 PMCID: PMC7864804 DOI: 10.1038/s41592-020-01051-w] [Citation(s) in RCA: 66] [Impact Index Per Article: 22.0] [Received: 06/11/2020] [Accepted: 12/21/2020] [Indexed: 01/30/2023]
Abstract
This paper describes outcomes of the 2019 Cryo-EM Model Challenge. The goals were to (1) assess the quality of models that can be produced from cryogenic electron microscopy (cryo-EM) maps using current modeling software, (2) evaluate reproducibility of modeling results from different software developers and users and (3) compare performance of current metrics used for model evaluation, particularly Fit-to-Map metrics, with focus on near-atomic resolution. Our findings demonstrate the relatively high accuracy and reproducibility of cryo-EM models derived by 13 participating teams from four benchmark maps, including three forming a resolution series (1.8 to 3.1 Å). The results permit specific recommendations to be made about validating near-atomic cryo-EM structures both in the context of individual experiments and structure data archives such as the Protein Data Bank. We recommend the adoption of multiple scoring parameters to provide full and objective annotation and assessment of the model, reflective of the observed cryo-EM map density.
Affiliation(s)
- Catherine L. Lawson
- Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ USA
- Andriy Kryshtafovych
- Genome Center, University of California, Davis, CA USA
- Paul D. Adams
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA USA
- Department of Bioengineering, University of California Berkeley, Berkeley, CA USA
- Pavel V. Afonine
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA USA
- Matthew L. Baker
- Department of Biochemistry and Molecular Biology, The University of Texas Health Science Center at Houston, Houston, TX USA
- Benjamin A. Barad
- Department of Integrated Computational Structural Biology, The Scripps Research Institute, La Jolla, CA USA
- Paul Bond
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
- Tom Burnley
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
- Renzhi Cao
- Department of Computer Science, Pacific Lutheran University, Tacoma, WA USA
- Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO USA
- Grzegorz Chojnowski
- European Molecular Biology Laboratory, c/o DESY, Hamburg, Germany
- Kevin Cowtan
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
- Ken A. Dill
- Laufer Center, Stony Brook University, Stony Brook, NY USA
- Frank DiMaio
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, WA USA
- Daniel P. Farrell
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, WA USA
- James S. Fraser
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA USA
- Mark A. Herzik
- Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA USA
- Soon Wen Hoh
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
- Jie Hou
- Department of Computer Science, Saint Louis University, St. Louis, MO USA
- Li-Wei Hung
- Los Alamos National Laboratory, Los Alamos, NM USA
- Maxim Igaev
- Theoretical and Computational Biophysics, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
- Agnel P. Joseph
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
- Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, IN USA
- Department of Computer Science, Purdue University, West Lafayette, IN USA
- Dilip Kumar
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX USA
- Sumit Mittal
- Biodesign Institute, Arizona State University, Tempe, AZ USA
- School of Advanced Sciences and Languages, VIT Bhopal University, Bhopal, India
- Bohdan Monastyrskyy
- Genome Center, University of California, Davis, CA USA
- Mateusz Olek
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
- Colin M. Palmer
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
- Ardan Patwardhan
- The European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
- Alberto Perez
- Department of Chemistry, University of Florida, Gainesville, FL USA
- Jonas Pfab
- Division of Computing & Software Systems, University of Washington, Bothell, WA USA
- Grigore D. Pintilie
- Department of Bioengineering, Stanford University, Stanford, CA USA
- Jane S. Richardson
- Department of Biochemistry, Duke University, Durham, NC USA
- Peter B. Rosenthal
- Structural Biology of Cells and Viruses Laboratory, Francis Crick Institute, London, UK
- Daipayan Sarkar
- Department of Biological Sciences, Purdue University, West Lafayette, IN USA
- Biodesign Institute, Arizona State University, Tempe, AZ USA
- Luisa U. Schäfer
- Institute of Biological Information Processing (IBI-7: Structural Biochemistry) and Jülich Centre for Structural Biology (JuStruct), Forschungszentrum Jülich, Jülich, Germany
- Michael F. Schmid
- Division of CryoEM and Biomaging, SSRL, SLAC National Accelerator Laboratory, Stanford University, Menlo Park, CA USA
- Gunnar F. Schröder
- Institute of Biological Information Processing (IBI-7: Structural Biochemistry) and Jülich Centre for Structural Biology (JuStruct), Forschungszentrum Jülich, Jülich, Germany
- Physics Department, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Mrinal Shekhar
- Biodesign Institute, Arizona State University, Tempe, AZ USA
- Center for Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA USA
- Dong Si
- Division of Computing & Software Systems, University of Washington, Bothell, WA USA
- Abishek Singharoy
- Biodesign Institute, Arizona State University, Tempe, AZ USA
- Genki Terashi
- Theoretical and Computational Biophysics, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
- Andrea Vaiana
- Theoretical and Computational Biophysics, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
- Liguo Wang
- Department of Biological Structure, University of Washington, Seattle, WA USA
- Zhe Wang
- The European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
- Stephanie A. Wankowicz
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA USA
- Biophysics Graduate Program, University of California, San Francisco, CA USA
- Martyn Winn
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
- Tianqi Wu
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO USA
- Xiaodi Yu
- SMPS, Janssen Research and Development, Spring House, PA USA
- Kaiming Zhang
- Department of Bioengineering, Stanford University, Stanford, CA USA
- Helen M. Berman
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ USA
- Department of Biological Sciences and Bridge Institute, University of Southern California, Los Angeles, CA USA
- Wah Chiu
- Department of Bioengineering, Stanford University, Stanford, CA USA
- Division of CryoEM and Biomaging, SSRL, SLAC National Accelerator Laboratory, Stanford University, Menlo Park, CA USA
|
17
|
Hameduh T, Haddad Y, Adam V, Heger Z. Homology modeling in the time of collective and artificial intelligence. Comput Struct Biotechnol J 2020; 18:3494-3506. [PMID: 33304450 PMCID: PMC7695898 DOI: 10.1016/j.csbj.2020.11.007] [Citation(s) in RCA: 53] [Impact Index Per Article: 13.3] [Received: 08/14/2020] [Revised: 11/04/2020] [Accepted: 11/04/2020] [Indexed: 12/12/2022] Open
Abstract
Homology modeling is a method for building protein 3D structures using protein primary sequence and utilizing prior knowledge gained from structural similarities with other proteins. The homology modeling process is done in sequential steps where sequence/structure alignment is optimized, then a backbone is built and later, side-chains are added. Once the low-homology loops are modeled, the whole 3D structure is optimized and validated. In the past three decades, a few collective and collaborative initiatives allowed for continuous progress in both homology and ab initio modeling. Critical Assessment of protein Structure Prediction (CASP) is a worldwide community experiment that has historically recorded the progress in this field. Folding@Home and Rosetta@Home are examples of crowd-sourcing initiatives where the community is sharing computational resources, whereas RosettaCommons is an example of an initiative where a community is sharing a codebase for the development of computational algorithms. Foldit is another initiative where participants compete with each other in a protein folding video game to predict 3D structure. In the past few years, deep machine learning on contact maps was introduced to the 3D structure prediction process, adding more information and increasing the accuracy of models significantly. In this review, we will take the reader on a journey of exploration from the beginnings to the most recent turnabouts, which have revolutionized the field of homology modeling. Moreover, we discuss the new trends emerging in this rapidly growing field.
Affiliation(s)
- Tareq Hameduh
- Department of Chemistry and Biochemistry, Mendel University in Brno, Zemedelska 1, CZ-613 00 Brno, Czech Republic
- Yazan Haddad
- Department of Chemistry and Biochemistry, Mendel University in Brno, Zemedelska 1, CZ-613 00 Brno, Czech Republic
- Central European Institute of Technology, Brno University of Technology, Purkynova 656/123, 612 00 Brno, Czech Republic
- Vojtech Adam
- Department of Chemistry and Biochemistry, Mendel University in Brno, Zemedelska 1, CZ-613 00 Brno, Czech Republic
- Central European Institute of Technology, Brno University of Technology, Purkynova 656/123, 612 00 Brno, Czech Republic
- Zbynek Heger
- Department of Chemistry and Biochemistry, Mendel University in Brno, Zemedelska 1, CZ-613 00 Brno, Czech Republic
- Central European Institute of Technology, Brno University of Technology, Purkynova 656/123, 612 00 Brno, Czech Republic
|
18
|
Alapati R, Shuvo MH, Bhattacharya D. SPECS: Integration of side-chain orientation and global distance-based measures for improved evaluation of protein structural models. PLoS One 2020; 15:e0228245. [PMID: 32053611 PMCID: PMC7018003 DOI: 10.1371/journal.pone.0228245] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/22/2019] [Accepted: 01/11/2020] [Indexed: 12/23/2022] Open
Abstract
Significant advancements in the field of protein structure prediction have created the need for objective and robust evaluation of protein structural models by comparing predicted models against the experimentally determined native structures to quantitate their structural similarities. Existing protein model versus native similarity metrics consider the distances between either alpha carbon (Cα) or side-chain atoms for computing the similarity. However, side-chain orientation of a protein plays a critical role in defining its conformation at the atomic level. Despite its importance, inclusion of side-chain orientation in structural similarity evaluation has not yet been addressed. Here, we present SPECS, a side-chain-orientation-included protein model-native similarity metric for improved evaluation of protein structural models. SPECS combines side-chain orientation and global distance-based measures in an integrated framework using the united-residue model of polypeptide conformation for computing model-native similarity. Experimental results demonstrate that SPECS is a reliable measure for evaluating structural similarity at the global level including and beyond the accuracy of Cα positioning. Moreover, SPECS delivers superior performance in capturing local quality aspects compared to popular global Cα positioning-based metrics, ranging from models at near-experimental accuracies to models with correct overall folds, making it a robust measure suitable for both high- and moderate-resolution models. Finally, SPECS is sensitive to minute variations in side-chain χ angles even for models with perfect Cα trace, revealing the power of including side-chain orientation. Collectively, SPECS is a versatile evaluation metric covering a wide spectrum of protein modeling scenarios and simultaneously captures complementary aspects of structural similarities at multiple levels of granularity. SPECS is freely available at http://watson.cse.eng.auburn.edu/SPECS/.
Affiliation(s)
- Rahul Alapati
- Department of Computer Science and Software Engineering, Auburn University, Auburn, Alabama, United States of America
- Md. Hossain Shuvo
- Department of Computer Science and Software Engineering, Auburn University, Auburn, Alabama, United States of America
- Debswapna Bhattacharya
- Department of Computer Science and Software Engineering, Auburn University, Auburn, Alabama, United States of America
- Department of Biological Sciences, Auburn University, Auburn, Alabama, United States of America
|
19
|
Yazhini A, Srinivasan N. How good are comparative models in the understanding of protein dynamics? Proteins 2020; 88:874-888. [PMID: 31999374 DOI: 10.1002/prot.25879] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/16/2019] [Revised: 01/04/2020] [Accepted: 01/25/2020] [Indexed: 12/27/2022]
Abstract
The 3D structure of a protein is essential to understanding protein dynamics. If an experimentally determined structure is unavailable, comparative models can be used to infer dynamics. However, the effectiveness of comparative models, compared to experimental structures, in inferring dynamics is not clear. To address this, we compared dynamics features of ~800 comparative models with their crystal structures using normal mode analysis. Average similarity in magnitude, direction, and correlation of residue motions is >0.8 (where a value of 1 is identical), indicating that the dynamics of models and crystal structures are highly similar. Accuracy of 3D structure and dynamics is significantly higher for models built on multiple and/or high sequence identity templates (>40%). Three-dimensional (3D) structure and residue fluctuations of models are closer to those of crystal structures than to those of templates (TM score 0.9 vs 0.7 and square inner product 0.92 vs 0.88). Furthermore, long-range molecular dynamics simulations on comparative models of RNase 1 and Angiogenin showed significant differences in the conformational sampling of conserved active-site residues that characterize differences in their activity levels. Similar analyses on two EGFR kinase variant models highlight the effect of mutations on the functional state-specific αC helix motions, and these results are consistent with previous experimental observations. Thus, our study adds confidence to the use of comparative models in understanding protein dynamics.
Affiliation(s)
- Arangasamy Yazhini
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, Karnataka, India
20
|
Olechnovič K, Venclovas Č. Contact Area-Based Structural Analysis of Proteins and Their Complexes Using CAD-Score. Methods Mol Biol 2020; 2112:75-90. [PMID: 32006279 DOI: 10.1007/978-1-0716-0270-6_6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/10/2023]
Abstract
Quantifying discrepancies between computationally derived and native (reference) structures is an essential step in the development and comparison of protein modeling and protein-protein docking methods. Measuring conformational differences of proteins or protein complexes is also important in other areas of structural biology such as molecular dynamics and crystallography. Multiple scores exist for this purpose. However, nearly all of them, whether superposition-based (e.g., RMSD) or superposition-free, use distances to measure similarity. CAD-score is conceptually different as it uses physical contacts represented as contact areas. Such a representation makes it possible to quantify differences of both structures and surfaces (e.g., protein-protein interfaces and binding sites) using the same framework. A number of studies have found CAD-score to be among the most robust scores. The method is implemented both as a web server and as standalone software available at http://bioinformatics.lt/software/cad-score. Here, we describe how to use the standalone CAD-score software for comparison and analysis of protein structures, interfaces, and binding sites.
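The contact-area idea in the abstract above can be illustrated with a minimal sketch. It assumes the commonly cited form of the score, in which each reference contact contributes the absolute area difference capped at the reference area, and the sum is normalised by the total reference contact area. The function name `cad_score`, the dictionary layout, and the example areas are hypothetical and are not part of the CAD-score software, which derives its areas from a tessellation of the structure.

```python
from typing import Dict, Tuple

Contact = Tuple[int, int]  # hypothetical key: (residue index i, residue index j)


def cad_score(target_areas: Dict[Contact, float],
              model_areas: Dict[Contact, float]) -> float:
    """Contact-area-difference style score in [0, 1]; 1 means identical contacts.

    Assumption: only contacts present in the reference (target) are scored,
    and each difference is capped at the reference area.
    """
    total = sum(target_areas.values())
    if total <= 0:
        raise ValueError("reference structure has no contact area")
    capped_diff = 0.0
    for contact, t_area in target_areas.items():
        m_area = model_areas.get(contact, 0.0)  # a missing contact counts as zero area
        capped_diff += min(abs(t_area - m_area), t_area)
    return 1.0 - capped_diff / total


# Made-up contact areas (in Å^2) for illustration only.
target = {(1, 2): 20.0, (1, 5): 10.0, (2, 3): 30.0}
model = {(1, 2): 20.0, (1, 5): 0.0, (2, 3): 45.0, (3, 7): 12.0}
print(round(cad_score(target, model), 4))
```

Because the score is built from contacts rather than from superimposed coordinates, the same function could in principle be applied to a subset of contacts, such as those across an interface, without changing its form.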
Affiliation(s)
- Kliment Olechnovič
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Česlovas Venclovas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
21
|
Olechnovič K, Monastyrskyy B, Kryshtafovych A, Venclovas Č. Comparative analysis of methods for evaluation of protein models against native structures. Bioinformatics 2019; 35:937-944. [PMID: 30169622 DOI: 10.1093/bioinformatics/bty760] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 03/30/2018] [Revised: 08/04/2018] [Accepted: 08/28/2018] [Indexed: 12/17/2022] Open
Abstract
MOTIVATION: Measuring discrepancies between protein models and native structures is at the heart of the development of protein structure prediction methods and the comparison of their performance. A number of different evaluation methods have been developed; however, a comprehensive and unbiased comparison of them has not been performed. RESULTS: We carried out a comparative analysis of several popular model assessment methods (RMSD, TM-score, GDT, QCS, CAD-score, LDDT, SphereGrinder and RPF) to reveal their relative strengths and weaknesses. The analysis, performed on a large and diverse model set derived in the course of the three latest community-wide CASP experiments (CASP10-12), had two major directions. First, we looked at general differences between the scores by analyzing distribution, correspondence and correlation of their values as well as differences in selecting best models. Second, we examined the score differences taking into account various structural properties of models (stereochemistry, hydrogen bonds, packing of domains and chain fragments, missing residues, protein length and secondary structure). Our results provide a solid basis for an informed selection of the most appropriate score or combination of scores depending on the task at hand. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
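To make the distance-based scores named in the abstract above concrete, here is a simplified sketch of RMSD, a GDT_TS-like score and a TM-score-like value. It assumes the per-residue Cα distances come from one fixed superposition that is already given; the actual GDT and TM-score programs search over superpositions, which this sketch does not attempt. The function names, the `d0` floor of 0.5 Å and the example distances are illustrative assumptions.

```python
import math
from typing import Sequence


def rmsd(distances: Sequence[float]) -> float:
    """Root-mean-square deviation over aligned residue pairs (Å)."""
    if not distances:
        raise ValueError("no aligned residues")
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def gdt_ts(distances: Sequence[float],
           cutoffs: Sequence[float] = (1.0, 2.0, 4.0, 8.0)) -> float:
    """GDT_TS-like score (0-100): mean fraction of residues within each cutoff."""
    if not distances:
        raise ValueError("no aligned residues")
    n = len(distances)
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


def tm_score(distances: Sequence[float], target_length: int) -> float:
    """TM-score-like value, normalised by the length of the native structure."""
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    if target_length > 15:
        # Length-dependent distance scale; the 0.5 Å floor is an assumption.
        d0 = max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)
    else:
        d0 = 0.5
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# Made-up distances (Å) for four aligned residues.
example = [0.5, 1.5, 3.0, 9.0]
print(rmsd(example), gdt_ts(example))
```

The example shows why the scores can disagree: a single residue 9 Å away dominates the RMSD, while the threshold-based score only registers it as one residue outside every cutoff.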
Affiliation(s)
- Kliment Olechnovič
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Saulėtekio 7, Vilnius, Lithuania
- Česlovas Venclovas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Saulėtekio 7, Vilnius, Lithuania
22
|
Kryshtafovych A, Malhotra S, Monastyrskyy B, Cragnolini T, Joseph AP, Chiu W, Topf M. Cryo-electron microscopy targets in CASP13: Overview and evaluation of results. Proteins 2019; 87:1128-1140. [PMID: 31576602 PMCID: PMC7197460 DOI: 10.1002/prot.25817] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 04/17/2019] [Revised: 08/30/2019] [Accepted: 09/13/2019] [Indexed: 11/07/2022]
Abstract
Structures of seven CASP13 targets were determined using the cryo-electron microscopy (cryo-EM) technique at resolutions between 3.0 and 4.0 Å. We provide an overview of the experimentally derived structures and describe the results of the numerical evaluation of the submitted models. The evaluation is carried out by comparing coordinates of models to those of reference structures (CASP-style evaluation), as well as by checking the goodness-of-fit of modeled structures to the cryo-EM density maps. The performance of contributing research groups in the CASP-style evaluation is measured in terms of backbone accuracy, all-atom local geometry and similarity of inter-subunit interfaces. The results on the cryo-EM targets are compared with those on the whole set of eighty CASP13 targets. A posteriori refinement of the best models in their corresponding cryo-EM density maps resulted in structures that are very close to the reference structure, including some regions with a better fit to the density.
Affiliation(s)
- Andriy Kryshtafovych
- Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, CA 95616, USA
- Sony Malhotra
- Institute of Structural and Molecular Biology, Birkbeck, University College London, Malet Street, London WC1E 7HX, UK
- Bohdan Monastyrskyy
- Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, CA 95616, USA
- Tristan Cragnolini
- Institute of Structural and Molecular Biology, Birkbeck, University College London, Malet Street, London WC1E 7HX, UK
- Agnel-Praveen Joseph
- Institute of Structural and Molecular Biology, Birkbeck, University College London, Malet Street, London WC1E 7HX, UK
- Wah Chiu
- Department of Bioengineering, Microbiology and Immunology and Photon Science, Stanford University, James H. Clark Center, MC5447, 318 Campus Drive, Stanford, CA 94305, USA
- Maya Topf
- Institute of Structural and Molecular Biology, Birkbeck, University College London, Malet Street, London WC1E 7HX, UK
23
|
Heo L, Feig M. High-accuracy protein structures by combining machine-learning with physics-based refinement. Proteins 2019; 88:637-642. [PMID: 31693199 DOI: 10.1002/prot.25847] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Received: 09/02/2019] [Revised: 10/05/2019] [Accepted: 11/03/2019] [Indexed: 12/16/2022]
Abstract
Protein structure prediction has long been available as an alternative to experimental structure determination, especially via homology modeling based on templates from related sequences. Recently, models based on distance restraints derived from coevolutionary analysis via machine learning have significantly expanded the ability to predict structures for sequences without templates. One such method, AlphaFold, also performs well on sequences where templates are available but without using such information directly. Here we show that combining machine-learning-based models from AlphaFold with state-of-the-art physics-based refinement via molecular dynamics simulations further improves predictions to outperform any other prediction method tested during the latest round of CASP. The resulting models have highly accurate global and local structures, including high accuracy at functionally important interface residues, and they are highly suitable as initial models for crystal structure determination via molecular replacement.
Affiliation(s)
- Lim Heo
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan
- Michael Feig
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan
24
|
Sala D, Huang YJ, Cole CA, Snyder DA, Liu G, Ishida Y, Swapna GVT, Brock KP, Sander C, Fidelis K, Kryshtafovych A, Inouye M, Tejero R, Valafar H, Rosato A, Montelione GT. Protein structure prediction assisted with sparse NMR data in CASP13. Proteins 2019; 87:1315-1332. [PMID: 31603581 DOI: 10.1002/prot.25837] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 09/09/2019] [Revised: 09/26/2019] [Accepted: 09/27/2019] [Indexed: 01/05/2023]
Abstract
CASP13 has investigated the impact of sparse NMR data on the accuracy of protein structure prediction. NOESY and 15N-1H residual dipolar coupling data, typical of that obtained for 15N,13C-enriched, perdeuterated proteins up to about 40 kDa, were simulated for 11 CASP13 targets ranging in size from 80 to 326 residues. For several targets, two prediction groups generated models that are more accurate than those produced using baseline methods. Real NMR data collected for a de novo designed protein were also provided to predictors, including one data set in which only backbone resonance assignments were available. Some NMR-assisted prediction groups also did very well with these data. CASP13 also assessed whether incorporation of sparse NMR data improves the accuracy of protein structure prediction relative to nonassisted regular methods. In most cases, incorporation of sparse, noisy NMR data results in models with higher accuracy. The best NMR-assisted models were also compared with the best regular predictions of any CASP13 group for the same target. For six of 13 targets, the most accurate model provided by any NMR-assisted prediction group was more accurate than the most accurate model provided by any regular prediction group; however, for the remaining seven targets, one or more regular prediction methods provided a more accurate model than even the best NMR-assisted model. These results suggest a novel approach for protein structure determination, in which advanced prediction methods are first used to generate structural models, and sparse NMR data are then used to validate and/or refine these models.
Affiliation(s)
- Davide Sala
- Magnetic Resonance Center, University of Florence, Sesto Fiorentino, Italy; Department of Chemistry, University of Florence, Sesto Fiorentino, Italy
- Yuanpeng Janet Huang
- Center for Advanced Biotechnology and Medicine, and Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, New Jersey; Department of Chemistry and Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, New York
- Casey A Cole
- Department of Computer Science & Engineering, University of South Carolina, Columbia, South Carolina
- David A Snyder
- Department of Chemistry, College of Science and Health, William Paterson University, Wayne, New Jersey
- Gaohua Liu
- Center for Advanced Biotechnology and Medicine, and Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, New Jersey; Nexomics Biosciences, Bordentown, New Jersey
- Yojiro Ishida
- Center for Advanced Biotechnology and Medicine, and Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, New Jersey; Department of Biochemistry and Molecular Biology, The Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway, New Jersey
- G V T Swapna
- Center for Advanced Biotechnology and Medicine, and Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, New Jersey
- Kelly P Brock
- Department of Systems Biology, Harvard Medical School, Boston, Massachusetts
- Chris Sander
- Department of Cell Biology, Harvard Medical School, Boston, Massachusetts; cBio Center, Dana-Farber Cancer Institute, Boston, Massachusetts
- Masayori Inouye
- Department of Biochemistry and Molecular Biology, The Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway, New Jersey
- Roberto Tejero
- Departamento de Quimica Fisica, Universidad de Valencia, Valencia, Spain
- Homayoun Valafar
- Department of Computer Science & Engineering, University of South Carolina, Columbia, South Carolina
- Antonio Rosato
- Magnetic Resonance Center, University of Florence, Sesto Fiorentino, Italy; Department of Chemistry, University of Florence, Sesto Fiorentino, Italy
- Gaetano T Montelione
- Center for Advanced Biotechnology and Medicine, and Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, New Jersey; Department of Chemistry and Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, New York; Department of Biochemistry and Molecular Biology, The Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway, New Jersey
25
|
Mirzaei S, Sidi T, Keasar C, Crivelli S. Purely Structural Protein Scoring Functions Using Support Vector Machine and Ensemble Learning. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2019; 16:1515-1523. [PMID: 28113636 DOI: 10.1109/tcbb.2016.2602269] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 06/06/2023]
Abstract
The function of a protein is determined by its structure, which creates a need for efficient methods of protein structure determination to advance scientific and medical research. Because current experimental structure determination methods are expensive, computational predictions are highly desirable. Given a protein sequence, computational methods produce numerous 3D structures known as decoys. Selection of the best-quality decoys is both challenging and essential, as end users can handle only a few. Therefore, scoring functions are central to decoy selection. They combine measurable features into a single-number indicator of decoy quality. Unfortunately, current scoring functions do not consistently select the best decoys. Machine learning techniques offer great potential to improve decoy scoring. This paper presents two machine-learning-based scoring functions to predict the quality of protein structures, i.e., the similarity between the predicted structure and the experimental one without knowing the latter. We use different metrics to compare these scoring functions against three state-of-the-art scores. This is a first attempt at comparing different scoring functions using the same non-redundant dataset for training and testing and the same features. The results show that adding informative features may be more significant than the method used.
26
|
Won J, Baek M, Monastyrskyy B, Kryshtafovych A, Seok C. Assessment of protein model structure accuracy estimation in CASP13: Challenges in the era of deep learning. Proteins 2019; 87:1351-1360. [PMID: 31436360 DOI: 10.1002/prot.25804] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 05/13/2019] [Revised: 08/08/2019] [Accepted: 08/19/2019] [Indexed: 12/20/2022]
Abstract
Scoring model structure is an essential component of protein structure prediction that can affect the prediction accuracy tremendously. Users of protein structure prediction results also need to score models to select the best models for their application studies. In Critical Assessment of techniques for protein Structure Prediction (CASP), model accuracy estimation methods have been tested in a blind fashion by providing models submitted by the tertiary structure prediction servers for scoring. In CASP13, model accuracy estimation results were evaluated in terms of both global and local structure accuracy. Global structure accuracy estimation was evaluated by the quality of the models selected by the global structure scores and by the absolute estimates of the global scores. Residue-wise, local structure accuracy estimations were evaluated by three different measures. A new measure introduced in CASP13 evaluates the ability to predict inaccurately modeled regions that may be improved by refinement. An intensive comparative analysis on CASP13 and the previous CASPs revealed that the tertiary structure models generated by the CASP13 servers show very distinct features. Higher consensus toward models of higher global accuracy appeared even for free modeling targets, and many models of high global accuracy were not well optimized at the atomic level. This is related to the new technology in CASP13, deep learning for tertiary contact prediction. The tertiary model structures generated by deep learning pose a new challenge for EMA (estimation of model accuracy) method developers. Model accuracy estimation itself is also an area where deep learning can potentially have an impact, although current EMA methods have not fully explored that direction.
Affiliation(s)
- Jonghun Won
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Minkyung Baek
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Chaok Seok
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
27
|
Read RJ, Sammito MD, Kryshtafovych A, Croll TI. Evaluation of model refinement in CASP13. Proteins 2019; 87:1249-1262. [PMID: 31365160 PMCID: PMC6851427 DOI: 10.1002/prot.25794] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 05/09/2019] [Revised: 07/03/2019] [Accepted: 07/27/2019] [Indexed: 12/25/2022]
Abstract
Performance in the model refinement category of the 13th round of Critical Assessment of Structure Prediction (CASP13) is assessed, showing that some groups consistently improve most starting models whereas the majority of participants continue to degrade the starting model on average. Using the ranking formula developed for CASP12, it is shown that only 7 of 32 groups perform better than a “naïve predictor” who just submits the starting model. Common features in their approaches include a dependence on physics‐based force fields to judge alternative conformations and the use of molecular dynamics to relax models to local minima, usually with some restraints to prevent excessively large movements. In addition to the traditional CASP metrics that focus largely on the quality of the overall fold, alternative metrics are evaluated, including comparisons of the main‐chain and side‐chain torsion angles, and the utility of the models for solving crystal structures by the molecular replacement method. It is proposed that the introduction of these metrics, as well as consideration of the accuracy of coordinate error estimates, would improve the discrimination between good and very good models.
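The torsion-angle comparisons mentioned in the abstract above depend on one detail that is easy to get wrong: angles are periodic, so 170° and -170° differ by 20°, not 340°. The sketch below shows a wrap-around difference and a mean deviation over paired angles. It is an illustrative assumption about how such a comparison could be set up, not the assessors' actual metric, and the function names and example angles are hypothetical.

```python
from typing import Sequence


def angular_difference(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


def mean_torsion_deviation(model_deg: Sequence[float],
                           target_deg: Sequence[float]) -> float:
    """Mean wrap-around deviation over paired torsion angles (degrees)."""
    if len(model_deg) != len(target_deg) or len(model_deg) == 0:
        raise ValueError("angle lists must be non-empty and of equal length")
    total = sum(angular_difference(m, t) for m, t in zip(model_deg, target_deg))
    return total / len(model_deg)


# Made-up torsion angles (degrees) for two residues in a model and its target.
model_angles = [170.0, -60.0]
target_angles = [-170.0, -65.0]
print(mean_torsion_deviation(model_angles, target_angles))
```

A measure of this kind is local by construction, which is one reason it can rank models differently from fold-level scores.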
Affiliation(s)
- Randy J Read
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK
- Massimo D Sammito
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK
- Tristan I Croll
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK
28
|
Croll TI, Sammito MD, Kryshtafovych A, Read RJ. Evaluation of template-based modeling in CASP13. Proteins 2019; 87:1113-1127. [PMID: 31407380 PMCID: PMC6851432 DOI: 10.1002/prot.25800] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 05/10/2019] [Revised: 07/29/2019] [Accepted: 08/08/2019] [Indexed: 12/12/2022]
Abstract
Performance in the template‐based modeling (TBM) category of CASP13 is assessed here, using a variety of metrics. Performance of the predictor groups that participated is ranked using the primary ranking score that was developed by the assessors for CASP12. This reveals that the best results are obtained by groups that include contact predictions or inter‐residue distance predictions derived from deep multiple sequence alignments. In cases where there is a good homolog in the wwPDB (TBM‐easy category), the best results are obtained by modifying a template. However, for cases with poorer homologs (TBM‐hard), very good results can be obtained without using an explicit template, by deep learning algorithms trained on the wwPDB. Alternative metrics are introduced, to allow testing of aspects of structural models that are not addressed by traditional CASP metrics. These include comparisons to the main‐chain and side‐chain torsion angles of the target, and the utility of models for solving crystal structures by the molecular replacement method. The alternative metrics are poorly correlated with the traditional metrics, and it is proposed that modeling has reached a sufficient level of maturity that the best models should be expected to satisfy this wider range of criteria.
Affiliation(s)
- Tristan I Croll
- Department of Haematology, University of Cambridge, Cambridge Institute for Medical Research, Cambridge, UK
- Massimo D Sammito
- Department of Haematology, University of Cambridge, Cambridge Institute for Medical Research, Cambridge, UK
- Randy J Read
- Department of Haematology, University of Cambridge, Cambridge Institute for Medical Research, Cambridge, UK
29
|
Methods for the Refinement of Protein Structure 3D Models. Int J Mol Sci 2019; 20:ijms20092301. [PMID: 31075942 PMCID: PMC6539982 DOI: 10.3390/ijms20092301] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 04/02/2019] [Revised: 04/24/2019] [Accepted: 05/07/2019] [Indexed: 12/25/2022] Open
Abstract
The refinement of predicted 3D protein models is crucial in bringing them closer to experimental accuracy for further computational studies. Refinement approaches can be divided into two main stages: the sampling and scoring stages. Sampling strategies, such as the popular Molecular Dynamics (MD)-based protocols, aim to generate improved 3D models. However, generating 3D models that are closer to the native structure than the initial model remains challenging, as structural deviations from the native basin can be encountered due to force-field inaccuracies. Therefore, different restraint strategies have been applied in order to avoid deviations away from the native structure. For example, the accurate prediction of local errors and/or contacts in the initial models can be used to guide restraints. MD-based protocols, using physics-based force fields and smart restraints, have made significant progress towards a more consistent refinement of 3D models. In the scoring stage, energy functions and Model Quality Assessment Programs (MQAPs) are used to discriminate near-native conformations from non-native conformations. Nevertheless, there are often very small differences among the 3D models generated in refinement pipelines, which makes model discrimination and selection problematic. For this reason, the identification of the most native-like conformations remains a major challenge.
30
|
Kryshtafovych A, Monastyrskyy B, Adams PD, Lawson CL, Chiu W. Distribution of evaluation scores for the models submitted to the second cryo-EM model challenge. Data Brief 2018; 20:1629-1638. [PMID: 30263915 PMCID: PMC6157618 DOI: 10.1016/j.dib.2018.08.214] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 07/24/2018] [Revised: 08/24/2018] [Accepted: 08/31/2018] [Indexed: 01/02/2023] Open
Abstract
142 protein structure models were submitted to the second cryo-EM model challenge (2015–2016). The accuracy of the models was evaluated with 54 evaluation scores. Results of the descriptive statistical analysis of the scores are provided in this article.
Affiliation(s)
- Andriy Kryshtafovych
- Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, CA 95616, USA
- Bohdan Monastyrskyy
- Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, CA 95616, USA
- Paul D Adams
- Molecular Biophysics & Integrated Bioimaging, LBNL, CA 94720, USA; Department of Bioengineering, University of California Berkeley, CA 94720, USA
- Catherine L Lawson
- Institute for Quantitative Biomedicine and Research Collaboratory for Structural Bioinformatics, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Wah Chiu
- Department of Bioengineering, Microbiology and Immunology and Photon Science, Stanford University, James H. Clark Center, MC5447, 318 Campus Drive, Stanford, CA 94305-5447, USA
31
|
Deng H, Jia Y, Zhang Y. Protein structure prediction. International Journal of Modern Physics B 2018; 32:1840009. [PMID: 30853739 PMCID: PMC6407873 DOI: 10.1142/s021797921840009x] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Indexed: 05/30/2023]
Abstract
Predicting the 3D structure of a protein from its amino acid sequence is one of the most important unsolved problems in biophysics and computational biology. This paper attempts to give a comprehensive introduction to the most recent efforts and progress in protein structure prediction. Following the general flowchart of structure prediction, related concepts and methods are presented and discussed. Moreover, brief introductions are given to several widely used prediction methods and to the community-wide critical assessment of protein structure prediction (CASP) experiments.
Affiliation(s)
- Haiyou Deng
- College of Science, Huazhong Agricultural University, Wuhan 430070, P. R. China
- Ya Jia
- College of Physical Science and Technology, Central China Normal University, Wuhan 430079, P. R. China
- Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 45108, USA
32
|
Kryshtafovych A, Adams PD, Lawson CL, Chiu W. Evaluation system and web infrastructure for the second cryo-EM model challenge. J Struct Biol 2018; 204:96-108. [PMID: 30017700 DOI: 10.1016/j.jsb.2018.07.006] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 04/21/2018] [Revised: 07/06/2018] [Accepted: 07/10/2018] [Indexed: 01/01/2023]
Abstract
An evaluation system and a web infrastructure were developed for the second cryo-EM model challenge. The evaluation system includes tools to validate stereo-chemical plausibility of submitted models, check their fit to the corresponding density maps, estimate their overall and per-residue accuracy, and assess their similarity to reference cryo-EM or X-ray structures as well as other models submitted in this challenge. The web infrastructure provides a convenient interface for analyzing models at different levels of detail. It includes interactively sortable tables of evaluation scores for different subsets of models and different sublevels of structure organization, and a suite of visualization tools facilitating model analysis. The results are publicly accessible at http://model-compare.emdatabank.org.
Affiliation(s)
- Andriy Kryshtafovych
- Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, CA 95616, USA
- Paul D Adams
- Molecular Biophysics & Integrated Bioimaging, LBNL, CA 94720, USA; Department of Bioengineering, University of California Berkeley, CA 94720, USA
- Catherine L Lawson
- Institute for Quantitative Biomedicine and Research Collaboratory for Structural Bioinformatics, Rutgers, The State University of New Jersey, 174 Frelinghuysen Road, Piscataway, NJ 08854, USA
- Wah Chiu
- Departments of Bioengineering and Microbiology & Immunology, Stanford University, Stanford, CA 94305-5447, USA; Division of CryoEM and Bioimaging, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA
33
|
Kryshtafovych A, Albrecht R, Baslé A, Bule P, Caputo AT, Carvalho AL, Chao KL, Diskin R, Fidelis K, Fontes CMGA, Fredslund F, Gilbert HJ, Goulding CW, Hartmann MD, Hayes CS, Herzberg O, Hill JC, Joachimiak A, Kohring GW, Koning RI, Lo Leggio L, Mangiagalli M, Michalska K, Moult J, Najmudin S, Nardini M, Nardone V, Ndeh D, Nguyen TH, Pintacuda G, Postel S, van Raaij MJ, Roversi P, Shimon A, Singh AK, Sundberg EJ, Tars K, Zitzmann N, Schwede T. Target highlights from the first post-PSI CASP experiment (CASP12, May-August 2016). Proteins 2018; 86 Suppl 1:27-50. [PMID: 28960539 PMCID: PMC5820184 DOI: 10.1002/prot.25392] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 07/06/2017] [Revised: 09/19/2017] [Accepted: 09/25/2017] [Indexed: 12/27/2022]
Abstract
The functional and biological significance of the selected CASP12 targets is described by the authors of the structures. The crystallographers discuss the most interesting structural features of the target proteins and assess whether these features were correctly reproduced in the predictions submitted to the CASP12 experiment.
Affiliation(s)
- Andriy Kryshtafovych
- Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, California, 95616
| | - Reinhard Albrecht
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, 72076, Germany
| | - Arnaud Baslé
- Institute for Cell and Molecular Biosciences, University of Newcastle, Newcastle upon Tyne NE2 4HH, United Kingdom
| | - Pedro Bule
- CIISA - Faculdade de Medicina Veterinária, Universidade de Lisboa, Avenida da Universidade Técnica, 1300-477, Portugal, Lisboa
| | - Alessandro T Caputo
- Oxford Glycobiology Institute, Department of Biochemistry, University of Oxford, South Parks Road, Oxford OX1 3QU, England, United Kingdom
| | - Ana Luisa Carvalho
- UCIBIO, REQUIMTE, Departamento de Química, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, Caparica, 2829-516, Portugal
| | - Kinlin L Chao
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland, 20850
| | - Ron Diskin
- Department of Structural Biology, Weizmann Institute of Science, Rehovot, Israel
| | - Krzysztof Fidelis
- Genome Center, University of California, Davis, 451 Health Sciences Drive, Davis, California, 95616
| | - Carlos M G A Fontes
- CIISA - Faculdade de Medicina Veterinária, Universidade de Lisboa, Avenida da Universidade Técnica, 1300-477, Portugal, Lisboa
| | - Folmer Fredslund
- Department of Chemistry, University of Copenhagen, Universitetsparken 5, 2100 Copenhagen Ø, Denmark
| | - Harry J Gilbert
- Institute for Cell and Molecular Biosciences, University of Newcastle, Newcastle upon Tyne NE2 4HH, United Kingdom
| | - Celia W Goulding
- Department of Molecular Biology and Biochemistry/Pharmaceutical Sciences, University of California Irvine, Irvine, California, 92697
| | - Marcus D Hartmann
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, 72076, Germany
| | - Christopher S Hayes
- Department of Molecular, Cellular and Developmental Biology/Biomolecular Science and Engineering Program, University of California, Santa Barbara, Santa Barbara, California, 93106
| | - Osnat Herzberg
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland, 20850
- Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland, 20742
| | - Johan C Hill
- Oxford Glycobiology Institute, Department of Biochemistry, University of Oxford, South Parks Road, Oxford OX1 3QU, England, United Kingdom
| | - Andrzej Joachimiak
- Argonne National Laboratory, Midwest Center for Structural Genomics/Structural Biology Center, Biosciences Division, Argonne, Illinois, 60439
- Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois, 60637
| | - Gert-Wieland Kohring
- Microbiology, Saarland University, Campus Building A1.5, Saarbrücken, Saarland, D-66123, Germany
| | - Roman I Koning
- Netherlands Centre for Electron Nanoscopy, Institute of Biology Leiden, Leiden University, 2333, CC Leiden, The Netherlands
- Department of Molecular Cell Biology, Leiden University Medical Center, 2300 RC, Leiden, The Netherlands
| | - Leila Lo Leggio
- Department of Chemistry, University of Copenhagen, Universitetsparken 5, 2100 Copenhagen Ø, Denmark
| | - Marco Mangiagalli
- Department of Biotechnology and Biosciences, University of Milano-Bicocca, Milano, 20126, Italy
| | - Karolina Michalska
- Argonne National Laboratory, Midwest Center for Structural Genomics/Structural Biology Center, Biosciences Division, Argonne, Illinois, 60439
| | - John Moult
- Department of Cell Biology and Molecular Genetics, University of Maryland, 9600 Gudelsky Drive, Institute for Bioscience and Biotechnology Research, Rockville, Maryland, 20850
| | - Shabir Najmudin
- CIISA - Faculdade de Medicina Veterinária, Universidade de Lisboa, Avenida da Universidade Técnica, 1300-477, Portugal, Lisboa
| | - Marco Nardini
- Department of Biosciences, University of Milano, Milano, 20133, Italy
| | - Valentina Nardone
- Department of Biosciences, University of Milano, Milano, 20133, Italy
| | - Didier Ndeh
- Institute for Cell and Molecular Biosciences, University of Newcastle, Newcastle upon Tyne NE2 4HH, United Kingdom
| | - Thanh-Hong Nguyen
- Department of Macromolecular Structures, Centro Nacional de Biotecnologia (CSIC), calle Darwin 3, Madrid, 28049, Spain
| | - Guido Pintacuda
- Université de Lyon, Centre de RMN à Très Hauts Champs, Institut des Sciences Analytiques (UMR 5280 - CNRS, ENS Lyon, UCB Lyon 1), Villeurbanne, 69100, France
| | - Sandra Postel
- University of Maryland School of Medicine, Institute of Human Virology, Baltimore, Maryland, 21201
| | - Mark J van Raaij
- Department of Macromolecular Structures, Centro Nacional de Biotecnologia (CSIC), calle Darwin 3, Madrid, 28049, Spain
| | - Pietro Roversi
- Oxford Glycobiology Institute, Department of Biochemistry, University of Oxford, South Parks Road, Oxford OX1 3QU, England, United Kingdom
- Leicester Institute of Structural and Chemical Biology, Department of Molecular and Cell Biology, University of Leicester, Henry Wellcome Building, University Road, Leicester, LE1 7RN, UK
| | - Amir Shimon
- Department of Structural Biology, Weizmann Institute of Science, Rehovot, Israel
| | - Abhimanyu K Singh
- School of Biosciences, University of Kent, Canterbury, Kent, CT2 7NJ, United Kingdom
| | - Eric J Sundberg
- Department of Medicine and Department of Microbiology and Immunology, University of Maryland School of Medicine, Institute of Human Virology, Baltimore, Maryland, 21201
| | - Kaspars Tars
- Latvian Biomedical Research and Study Center, Rātsupītes 1, Riga, LV1067, Latvia
- Faculty of Biology, Department of Molecular Biology, University of Latvia, Jelgavas 1, Riga, LV-1004, Latvia
| | - Nicole Zitzmann
- Oxford Glycobiology Institute, Department of Biochemistry, University of Oxford, South Parks Road, Oxford OX1 3QU, England, United Kingdom
| | - Torsten Schwede
- Biozentrum/SIB Swiss Institute of Bioinformatics, Klingelbergstrasse 50, Basel, 4056, Switzerland
| |
|
34
|
Kryshtafovych A, Monastyrskyy B, Fidelis K, Moult J, Schwede T, Tramontano A. Evaluation of the template-based modeling in CASP12. Proteins 2017; 86 Suppl 1:321-334. [PMID: 29159950 DOI: 10.1002/prot.25425] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Received: 07/28/2017] [Revised: 10/22/2017] [Accepted: 11/16/2017] [Indexed: 01/29/2023]
Abstract
The article describes results of numerical evaluation of CASP12 models submitted on targets for which structural templates could be identified and for which servers produced models of relatively high accuracy. The emphasis is on analysis of details of models, and how well the models compete with experimental structures. Performance of contributing research groups is measured in terms of backbone accuracy, all-atom local geometry, and the ability to estimate local errors in models. Separate analyses for all participating groups and automatic servers were carried out. Compared with the last CASP, two years ago, there have been significant improvements in a number of areas, particularly the accuracy of protein backbone atoms, accuracy of sequence alignment between models and available structures, increased accuracy over that which can be obtained from simple copying of a closest template, and accuracy of modeling of sub-structures not present in the closest template. These advancements are likely associated with more effective strategies to build non-template regions of the targets ab initio, better algorithms to combine information from multiple templates, enhanced refinement methods, and better methods for estimating model accuracy.
Affiliation(s)
- Andriy Kryshtafovych
- Protein Structure Prediction Center, Genome Center, University of California, Davis, California
| | - Bohdan Monastyrskyy
- Protein Structure Prediction Center, Genome Center, University of California, Davis, California
| | - Krzysztof Fidelis
- Protein Structure Prediction Center, Genome Center, University of California, Davis, California
| | - John Moult
- Institute for Bioscience and Biotechnology Research and Department of Cell Biology and Molecular Genetics, University of Maryland, Maryland
| | - Torsten Schwede
- Biozentrum, University of Basel, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Anna Tramontano
- Department of Biochemical Sciences, Sapienza - University of Rome, P. le A. Moro, 5, Rome, 00185
| |
|
35
|
Hovan L, Oleinikovas V, Yalinca H, Kryshtafovych A, Saladino G, Gervasio FL. Assessment of the model refinement category in CASP12. Proteins 2017; 86 Suppl 1:152-167. [PMID: 29071750 DOI: 10.1002/prot.25409] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 06/19/2017] [Revised: 10/03/2017] [Accepted: 10/24/2017] [Indexed: 01/07/2023]
Abstract
We here report on the assessment of the model refinement predictions submitted to the 12th Experiment on the Critical Assessment of Protein Structure Prediction (CASP12). This is the fifth refinement experiment since CASP8 (2008) and, as with the previous experiments, the predictors were invited to refine selected server models received in the regular (nonrefinement) stage of the CASP experiment. We assessed the submitted models using a combination of standard CASP measures. The coefficients for the linear combination of Z-scores (the CASP12 score) were obtained by a machine learning algorithm trained on the results of visual inspection. We identified eight groups that improve both the backbone conformation and the side chain positioning for the majority of targets. Although the top methods adopted distinctly different approaches, their overall performance was almost indistinguishable, with each of them excelling in different scores or target subsets. Moreover, a few novel approaches, while doing worse than average in most cases, provided the best refinements for a few targets, showing significant latitude for further innovation in the field.
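The idea of ranking models by a linear combination of Z-scores can be illustrated with a minimal sketch. It assumes per-metric raw scores are standardized across models and then summed with fixed coefficients; the metric names, values, and weights below are hypothetical placeholders, not the coefficients used in the CASP12 assessment.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize raw scores: (x - mean) / population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All models scored identically, so no model stands out on this metric.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def combined_score(metrics, weights):
    """Weighted sum of per-metric Z-scores for each model.

    metrics: {metric_name: [raw score per model]} (higher is better)
    weights: {metric_name: coefficient} (hypothetical values)
    """
    n_models = len(next(iter(metrics.values())))
    total = [0.0] * n_models
    for name, values in metrics.items():
        for i, z in enumerate(z_scores(values)):
            total[i] += weights[name] * z
    return total


# Hypothetical example: three refined models scored on two measures.
metrics = {"backbone": [50.0, 60.0, 70.0], "local": [0.6, 0.5, 0.7]}
weights = {"backbone": 0.5, "local": 0.5}
scores = combined_score(metrics, weights)
best_model = scores.index(max(scores))
```

Standardizing before combining puts measures with different units on a common scale, so a coefficient expresses the relative importance of a measure rather than its numeric range.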
Affiliation(s)
- Ladislav Hovan
- Department of Chemistry, University College London, WC1E 6BT, United Kingdom
| | | | - Havva Yalinca
- Department of Chemistry, University College London, WC1E 6BT, United Kingdom
| | | | - Giorgio Saladino
- Department of Chemistry, University College London, WC1E 6BT, United Kingdom
| | - Francesco Luigi Gervasio
- Department of Chemistry, University College London, WC1E 6BT, United Kingdom; Institute of Structural and Molecular Biology, University College London, London, WC1E 6BT, United Kingdom
| |
|
36
|
Lee GR, Heo L, Seok C. Simultaneous refinement of inaccurate local regions and overall structure in the CASP12 protein model refinement experiment. Proteins 2017; 86 Suppl 1:168-176. [PMID: 29044810 DOI: 10.1002/prot.25404] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 07/03/2017] [Revised: 10/09/2017] [Accepted: 10/11/2017] [Indexed: 12/15/2022]
Abstract
Advances in protein model refinement techniques are required as diverse sources of protein structure information are available from low-resolution experiments or informatics-based computations such as cryo-EM, NMR, homology models, or predicted residue contacts. Given semi-reliable or incomplete structural information, the structure quality of a protein model has to be improved by ab initio methods such as energy-based simulation. In this study, we describe a new automatic refinement server method designed to improve locally inaccurate regions and overall structure simultaneously. Locally inaccurate regions may occur in protein structures due to non-convergent or missing information in template structures used in homology modeling or due to intrinsic structural flexibilities not resolved by experimental techniques. However, such variable or dynamic regions often play important functional roles by participating in interactions with other biomolecules or in transitions between different functional states. The new refinement method introduced here utilizes diverse types of geometric operators which drive both local and global changes, and the effects of structure changes and relaxations are accumulated. This resulted in consistent refinement of both local and global structural features. Performance of this method in CASP12 is discussed.
Affiliation(s)
- Gyu Rie Lee
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
| | - Lim Heo
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
| | - Chaok Seok
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
| |
|
37
|
Kryshtafovych A, Monastyrskyy B, Fidelis K, Schwede T, Tramontano A. Assessment of model accuracy estimations in CASP12. Proteins 2017; 86 Suppl 1:345-360. [PMID: 28833563 DOI: 10.1002/prot.25371] [Citation(s) in RCA: 52] [Impact Index Per Article: 7.4] [Received: 04/26/2017] [Revised: 07/28/2017] [Accepted: 08/14/2017] [Indexed: 12/27/2022]
Abstract
A record-high 42 model accuracy estimation methods were tested in CASP12. The paper presents results of the assessment of these methods in the whole-model and per-residue accuracy modes. Scores from four different model evaluation packages were used as the "ground truth" for assessing the accuracy of the methods' estimates. They include a rigid-body score, GDT_TS, and three local-structure-based scores: LDDT, CAD, and SphereGrinder. The ability of methods to identify the best models from among several available, predict a model's absolute accuracy score, distinguish between good and bad models, predict the accuracy of the coordinate error self-estimates, and discriminate between reliable and unreliable regions in the models was assessed. Single-model methods advanced to the point where they are better than clustering methods in picking the best models from decoy sets. On the other hand, consensus methods, taking advantage of the availability of a large number of models for the same target protein, are still better in distinguishing between good and bad models and predicting the local accuracy of models. The best accuracy estimation methods were shown to perform better with respect to the frozen-in-time reference clustering method and the results of the best method in the corresponding class of methods from the previous CASP. Top-performing single-model methods were shown to do better than all but three CASP12 tertiary structure predictors when evaluated as model selectors.
Affiliation(s)
| | | | | | - Torsten Schwede
- Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology Group, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Anna Tramontano
- Department of Physics, Sapienza University of Rome, Rome, Italy
| |
|
38
|
Terashi G, Kihara D. Protein structure model refinement in CASP12 using short and long molecular dynamics simulations in implicit solvent. Proteins 2017; 86 Suppl 1:189-201. [PMID: 28833585 DOI: 10.1002/prot.25373] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 06/21/2017] [Revised: 08/01/2017] [Accepted: 08/18/2017] [Indexed: 12/21/2022]
Abstract
Protein structure prediction methods have matured over the years, particularly those that use structure templates for building a model. They can build a model with the correct overall conformation in cases where appropriate templates are available. Models with the correct topology can be practically useful for limited purposes that need residue-level accuracy, but further improvement of the models can allow the models to be used in tasks that need detailed structures, such as molecular replacement in X-ray crystallography or structure-based drug screening. Thus, model refinement is an important final step in protein structure prediction to bridge predictions to real-life applications. Model refinement is one of the categories in recent rounds of critical assessment of techniques in protein structure prediction (CASP) and has recently been drawing more attention due to its realized importance. Here we report our group's performance in the refinement category in CASP12. Our method is based on inexpensive short molecular dynamics (MD) simulations in implicit solvent. Our performance in CASP12 was among the top, which was consistent with the previous round, CASP11. Our method with short MD runs achieved performance comparable to that of other methods that used longer simulations. Detailed analyses found that improvements typically occurred in entire regions of a structure rather than only in flexible loop regions. The remaining challenges in structure refinement include large conformational refinement, which involves substantial motions of secondary structure elements or domains.
Affiliation(s)
- Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907
| | - Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907; Department of Computer Science, Purdue University, West Lafayette, Indiana, 47907
| |
|
39
|
Park H, Bradley P, Greisen P, Liu Y, Mulligan VK, Kim DE, Baker D, DiMaio F. Simultaneous Optimization of Biomolecular Energy Functions on Features from Small Molecules and Macromolecules. J Chem Theory Comput 2016; 12:6201-6212. [PMID: 27766851 PMCID: PMC5515585 DOI: 10.1021/acs.jctc.6b00819] [Citation(s) in RCA: 302] [Impact Index Per Article: 37.8] [Indexed: 01/29/2023]
Abstract
Most biomolecular modeling energy functions for structure prediction, sequence design, and molecular docking have been parametrized using existing macromolecular structural data; this contrasts with molecular mechanics force fields, which are largely optimized using small-molecule data. In this study, we describe an integrated method that enables optimization of a biomolecular modeling energy function simultaneously against small-molecule thermodynamic data and high-resolution macromolecular structural data. We use this approach to develop a next-generation Rosetta energy function that utilizes a new anisotropic implicit solvation model, and an improved electrostatics and Lennard-Jones model, illustrating how energy functions can be considerably improved in their ability to describe large-scale energy landscapes by incorporating both small-molecule and macromolecule data. The energy function improves performance in a wide range of protein structure prediction challenges, including monomeric structure prediction, protein-protein and protein-ligand docking, protein sequence design, and prediction of the free energy changes by mutation, while reasonably recapitulating small-molecule thermodynamic properties.
Affiliation(s)
- Hahnbeom Park
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, USA
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, USA
| | - Philip Bradley
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, USA
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue N., Seattle, Washington 98019, USA
| | - Per Greisen
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, USA
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, USA
| | - Yuan Liu
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, USA
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, USA
| | - Vikram Khipple Mulligan
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, USA
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, USA
| | - David E. Kim
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, USA
- Howard Hughes Medical Institute, University of Washington, Box 357370, Seattle, Washington 98195, USA
| | - David Baker
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, USA
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, USA
- Howard Hughes Medical Institute, University of Washington, Box 357370, Seattle, Washington 98195, USA
| | - Frank DiMaio
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, USA
- Institute for Protein Design, University of Washington, Seattle, Washington 98195, USA
| |
|
40
|
Pang YP. FF12MC: A revised AMBER forcefield and new protein simulation protocol. Proteins 2016; 84:1490-516. [PMID: 27348292 PMCID: PMC5129589 DOI: 10.1002/prot.25094] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 04/11/2016] [Revised: 06/16/2016] [Accepted: 06/18/2016] [Indexed: 12/25/2022]
Abstract
Specialized to simulate proteins in molecular dynamics (MD) simulations with explicit solvation, FF12MC is a combination of a new protein simulation protocol employing uniformly reduced atomic masses by tenfold and a revised AMBER forcefield FF99 with (i) shortened CH bonds, (ii) removal of torsions involving a nonperipheral sp³ atom, and (iii) reduced 1-4 interaction scaling factors of torsions ϕ and ψ. This article reports that in multiple, distinct, independent, unrestricted, unbiased, isobaric-isothermal, and classical MD simulations FF12MC can (i) simulate the experimentally observed flipping between left- and right-handed configurations for C14-C38 of BPTI in solution, (ii) autonomously fold chignolin, CLN025, and Trp-cage with folding times that agree with the experimental values, (iii) simulate subsequent unfolding and refolding of these miniproteins, and (iv) achieve a robust Z score of 1.33 for refining protein models TMR01, TMR04, and TMR07. By comparison, the latest general-purpose AMBER forcefield FF14SB locks the C14-C38 bond to the right-handed configuration in solution under the same protein simulation conditions. Statistical survival analysis shows that FF12MC folds chignolin and CLN025 in isobaric-isothermal MD simulations 2-4 times faster than FF14SB under the same protein simulation conditions. These results suggest that FF12MC may be used for protein simulations to study kinetics and thermodynamics of miniprotein folding as well as protein structure and dynamics. Proteins 2016; 84:1490-1516. © 2016 The Authors Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.
Affiliation(s)
- Yuan-Ping Pang
- Computer-Aided Molecular Design Laboratory, Mayo Clinic, Rochester, MN, 55905, USA.
| |
|
41
|
Monastyrskyy B, D'Andrea D, Fidelis K, Tramontano A, Kryshtafovych A. New encouraging developments in contact prediction: Assessment of the CASP11 results. Proteins 2016; 84 Suppl 1:131-44. [PMID: 26474083 PMCID: PMC4834069 DOI: 10.1002/prot.24943] [Citation(s) in RCA: 69] [Impact Index Per Article: 8.6] [Received: 05/18/2015] [Revised: 09/15/2015] [Accepted: 10/11/2015] [Indexed: 12/27/2022]
Abstract
This article provides a report on the state-of-the-art in the prediction of intra-molecular residue-residue contacts in proteins based on the assessment of the predictions submitted to the CASP11 experiment. The assessment emphasis is placed on the accuracy in predicting long-range contacts. Twenty-nine groups participated in contact prediction in CASP11. At least eight of them used the recently developed evolutionary coupling techniques, with the top group (CONSIP2) reaching precision of 27% on target proteins that could not be modeled by homology. This result indicates a breakthrough in the development of methods based on the correlated mutation approach. Successful prediction of contacts was shown to be practically helpful in modeling three-dimensional structures; in particular target T0806 was modeled exceedingly well with accuracy not yet seen for ab initio targets of this size (>250 residues). Proteins 2016; 84(Suppl 1):131-144. © 2015 Wiley Periodicals, Inc.
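Precision on long-range contacts, the headline measure in the abstract above, can be illustrated with a minimal sketch. It assumes that a long-range contact is a residue pair separated by at least 24 positions in sequence and that precision is computed over the top-ranked predictions; the function name, the threshold default, and the example data are illustrative assumptions, not the assessors' code.

```python
def contact_precision(predicted, true_contacts, top_n, min_separation=24):
    """Fraction of the top_n highest-probability long-range predictions that are real contacts.

    predicted: list of (i, j, probability) residue-pair predictions
    true_contacts: set of (i, j) pairs with i < j observed in the experimental structure
    min_separation: sequence separation defining "long-range" (assumed value)
    """
    # Keep only long-range pairs and normalize the index order to i < j.
    long_range = [
        (min(i, j), max(i, j), p)
        for i, j, p in predicted
        if abs(i - j) >= min_separation
    ]
    # Rank by predicted probability and keep the top_n pairs.
    ranked = sorted(long_range, key=lambda c: c[2], reverse=True)[:top_n]
    if not ranked:
        return 0.0
    hits = sum(1 for i, j, _ in ranked if (i, j) in true_contacts)
    return hits / len(ranked)


# Hypothetical predictions: (2, 10) is short-range and is therefore ignored.
predicted = [(1, 30, 0.9), (5, 40, 0.8), (2, 10, 0.95), (3, 50, 0.4), (7, 60, 0.3)]
true_contacts = {(1, 30), (3, 50)}
precision_top4 = contact_precision(predicted, true_contacts, top_n=4)
precision_top1 = contact_precision(predicted, true_contacts, top_n=1)
```

In this toy case two of the four ranked long-range pairs are true contacts, so the top-4 precision is 0.5, while the single most confident long-range prediction is correct.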
Affiliation(s)
| | - Daniel D'Andrea
- Department of Physics, Sapienza-University of Rome, Rome, 00185, Italy
| | | | - Anna Tramontano
- Department of Physics, Sapienza-University of Rome, Rome, 00185, Italy
- Istituto Pasteur-Fondazione Cenci Bolognetti-University of Rome, Rome, 00185, Italy
| | | |
|
42
|
Lensink MF, Velankar S, Kryshtafovych A, Huang SY, Schneidman-Duhovny D, Sali A, Segura J, Fernandez-Fuentes N, Viswanath S, Elber R, Grudinin S, Popov P, Neveu E, Lee H, Baek M, Park S, Heo L, Rie Lee G, Seok C, Qin S, Zhou HX, Ritchie DW, Maigret B, Devignes MD, Ghoorah A, Torchala M, Chaleil RAG, Bates PA, Ben-Zeev E, Eisenstein M, Negi SS, Weng Z, Vreven T, Pierce BG, Borrman TM, Yu J, Ochsenbein F, Guerois R, Vangone A, Rodrigues JPGLM, van Zundert G, Nellen M, Xue L, Karaca E, Melquiond ASJ, Visscher K, Kastritis PL, Bonvin AMJJ, Xu X, Qiu L, Yan C, Li J, Ma Z, Cheng J, Zou X, Shen Y, Peterson LX, Kim HR, Roy A, Han X, Esquivel-Rodriguez J, Kihara D, Yu X, Bruce NJ, Fuller JC, Wade RC, Anishchenko I, Kundrotas PJ, Vakser IA, Imai K, Yamada K, Oda T, Nakamura T, Tomii K, Pallara C, Romero-Durana M, Jiménez-García B, Moal IH, Férnandez-Recio J, Joung JY, Kim JY, Joo K, Lee J, Kozakov D, Vajda S, Mottarella S, Hall DR, Beglov D, Mamonov A, Xia B, Bohnuud T, Del Carpio CA, Ichiishi E, Marze N, Kuroda D, Roy Burman SS, Gray JJ, Chermak E, Cavallo L, Oliva R, Tovchigrechko A, Wodak SJ. Prediction of homoprotein and heteroprotein complexes by protein docking and template-based modeling: A CASP-CAPRI experiment. Proteins 2016; 84 Suppl 1:323-48. [PMID: 27122118 PMCID: PMC5030136 DOI: 10.1002/prot.25007] [Citation(s) in RCA: 119] [Impact Index Per Article: 14.9] [Received: 05/29/2015] [Revised: 12/30/2015] [Accepted: 02/02/2016] [Indexed: 12/26/2022]
Abstract
We present the results for CAPRI Round 30, the first joint CASP-CAPRI experiment, which brought together experts from the protein structure prediction and protein-protein docking communities. The Round comprised 25 targets from amongst those submitted for the CASP11 prediction experiment of 2014. The targets included mostly homodimers, a few homotetramers, and two heterodimers, and comprised protein chains that could readily be modeled using templates from the Protein Data Bank. On average 24 CAPRI groups and 7 CASP groups submitted docking predictions for each target, and 12 CAPRI groups per target participated in the CAPRI scoring experiment. In total more than 9500 models were assessed against the 3D structures of the corresponding target complexes. Results show that the prediction of homodimer assemblies by homology modeling techniques and docking calculations is quite successful for targets featuring large enough subunit interfaces to represent stable associations. Targets with ambiguous or inaccurate oligomeric state assignments, often featuring crystal contact-sized interfaces, represented a confounding factor. For those, a much poorer prediction performance was achieved, while nonetheless often providing helpful clues on the correct oligomeric state of the protein. The prediction performance was very poor for genuine tetrameric targets, where the inaccuracy of the homology-built subunit models and the smaller pair-wise interfaces severely limited the ability to derive the correct assembly mode. Our analysis also shows that docking procedures tend to perform better than standard homology modeling techniques and that highly accurate models of the protein components are not always required to identify their association modes with acceptable accuracy. Proteins 2016; 84(Suppl 1):323-348. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Marc F Lensink
- University Lille, CNRS UMR8576 UGSF, Lille, F-59000, France.
| | - Sameer Velankar
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
| | | | - Shen-You Huang
- Research Support Computing, University of Missouri Bioinformatics Consortium, and Department of Computer Science, University of Missouri, Columbia, Missouri, 65211
| | - Dina Schneidman-Duhovny
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, California, 94158
- Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, California, 94158
| | - Andrej Sali
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, California, 94158
- Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, California, 94158
- California Institute for Quantitative Biosciences (QB3), University of California San Francisco, San Francisco, California, 94158
| | - Joan Segura
- GN7 of the National Institute for Bioinformatics (INB) and Biocomputing Unit, National Center of Biotechnology (CSIC), Madrid, 28049, Spain
| | - Narcis Fernandez-Fuentes
- Institute of Biological, Environmental and Rural Sciences (IBERS), Aberystwyth University, Aberystwyth, SY233FG, United Kingdom
| | - Shruthi Viswanath
- Department of Computer Science, University of Texas at Austin, Austin, Texas, 78712
- Institute for Computational Engineering and Sciences, University of Texas at Austin, Austin, Texas, 78712
| | - Ron Elber
- Institute for Computational Engineering and Sciences, University of Texas at Austin, Austin, Texas, 78712
- Department of Chemistry, University of Texas at Austin, Austin, Texas, 78712
| | - Sergei Grudinin
- LJK, University Grenoble Alpes, CNRS, Grenoble, 38000, France
- INRIA, Grenoble, 38000, France
| | - Petr Popov
- LJK, University Grenoble Alpes, CNRS, Grenoble, 38000, France
- INRIA, Grenoble, 38000, France
- Moscow Institute of Physics and Technology, Dolgoprudniy, Russia
| | - Emilie Neveu
- LJK, University Grenoble Alpes, CNRS, Grenoble, 38000, France
- INRIA, Grenoble, 38000, France
| | - Hasup Lee
- Department of Chemistry, Seoul National University, Seoul, 151-747, Republic of Korea
| | - Minkyung Baek
- Department of Chemistry, Seoul National University, Seoul, 151-747, Republic of Korea
| | - Sangwoo Park
- Department of Chemistry, Seoul National University, Seoul, 151-747, Republic of Korea
| | - Lim Heo
- Department of Chemistry, Seoul National University, Seoul, 151-747, Republic of Korea
| | - Gyu Rie Lee
- Department of Chemistry, Seoul National University, Seoul, 151-747, Republic of Korea
| | - Chaok Seok
- Department of Chemistry, Seoul National University, Seoul, 151-747, Republic of Korea
| | - Sanbo Qin
- Department of Physics and Institute of Molecular Biophysics, Florida State University, Tallahassee, Florida, 32306, USA
| | - Huan-Xiang Zhou
- Department of Physics and Institute of Molecular Biophysics, Florida State University, Tallahassee, Florida, 32306, USA
| | | | - Bernard Maigret
- CNRS, LORIA, Campus Scientifique, BP 239, Vandœuvre-lès-Nancy, 54506, France
| | | | - Anisah Ghoorah
- Department of Computer Science and Engineering, University of Mauritius, Reduit, Mauritius
| | - Mieczyslaw Torchala
- Biomolecular Modelling Laboratory, the Francis Crick Institute, Lincoln's Inn Fields Laboratory, London, WC2A 3LY, United Kingdom
| | - Raphaël A G Chaleil
- Biomolecular Modelling Laboratory, the Francis Crick Institute, Lincoln's Inn Fields Laboratory, London, WC2A 3LY, United Kingdom
| | - Paul A Bates
- Biomolecular Modelling Laboratory, the Francis Crick Institute, Lincoln's Inn Fields Laboratory, London, WC2A 3LY, United Kingdom
| | - Efrat Ben-Zeev
- G-INCPM, Weizmann Institute of Science, Rehovot, 7610001, Israel
| | - Miriam Eisenstein
- Department of Chemical Research Support, Weizmann Institute of Science, Rehovot, 7610001, Israel
| | - Surendra S Negi
- Sealy Center for Structural Biology and Molecular Biophysics, University of Texas Medical Branch, 301 University Boulevard, Galveston, Texas, 77555-0857
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts, 01605
| | - Thom Vreven
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts, 01605
| | - Brian G Pierce
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts, 01605
| | - Tyler M Borrman
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts, 01605
| | - Jinchao Yu
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, University Paris-Saclay, CEA-Saclay, Gif-sur-Yvette, 91191, France
| | - Françoise Ochsenbein
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, University Paris-Saclay, CEA-Saclay, Gif-sur-Yvette, 91191, France
| | - Raphaël Guerois
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, University Paris-Saclay, CEA-Saclay, Gif-sur-Yvette, 91191, France
| | - Anna Vangone
- Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, Utrecht, 3584 CH, The Netherlands
| | - João P G L M Rodrigues
- Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, Utrecht, 3584 CH, The Netherlands
| | - Gydo van Zundert
- Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, Utrecht, 3584 CH, The Netherlands
| | - Mehdi Nellen
- Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, Utrecht, 3584 CH, The Netherlands
| | - Li Xue
- Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, Utrecht, 3584 CH, The Netherlands
| | - Ezgi Karaca
- Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, Utrecht, 3584 CH, The Netherlands
| | - Adrien S J Melquiond
- Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, Utrecht, 3584 CH, The Netherlands
| | - Koen Visscher
- Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, Utrecht, 3584 CH, The Netherlands
| | - Panagiotis L Kastritis
- Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, Utrecht, 3584 CH, The Netherlands
| | - Alexandre M J J Bonvin
- Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, Utrecht, 3584 CH, The Netherlands
| | - Xianjin Xu
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, 65211
| | - Liming Qiu
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, 65211
| | - Chengfei Yan
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, 65211
- Department of Physics and Astronomy, University of Missouri, Columbia, Missouri, 65211
| | - Jilong Li
- Department of Computer Science, University of Missouri, Columbia, Missouri, 65211
| | - Zhiwei Ma
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, 65211
- Department of Physics and Astronomy, University of Missouri, Columbia, Missouri, 65211
| | - Jianlin Cheng
- Department of Computer Science, University of Missouri, Columbia, Missouri, 65211
- Informatics Institute, University of Missouri, Columbia, Missouri, 65211
| | - Xiaoqin Zou
- Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri, 65211
- Department of Physics and Astronomy, University of Missouri, Columbia, Missouri, 65211
- Informatics Institute, University of Missouri, Columbia, Missouri, 65211
- Department of Biochemistry, University of Missouri, Columbia, Missouri, 65211
| | - Yang Shen
- Toyota Technological Institute at Chicago, 6045 S Kenwood Avenue, Chicago, Illinois, 60637
| | - Lenna X Peterson
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907
| | - Hyung-Rae Kim
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907
| | - Amit Roy
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907
- Bioinformatics and Computational Biosciences Branch, Rocky Mountain Laboratories, National Institutes of Health, Hamilton, Montana, 59840
| | - Xusi Han
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907
| | | | - Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana, 47907
- Department of Computer Science, Purdue University, West Lafayette, IN, USA, 47907
| | - Xiaofeng Yu
- Molecular and Cellular Modeling Group, Heidelberg Institute for Theoretical Studies (HITS), Heidelberg, Germany
| | - Neil J Bruce
- Molecular and Cellular Modeling Group, Heidelberg Institute for Theoretical Studies (HITS), Heidelberg, Germany
| | - Jonathan C Fuller
- Molecular and Cellular Modeling Group, Heidelberg Institute for Theoretical Studies (HITS), Heidelberg, Germany
| | - Rebecca C Wade
- Molecular and Cellular Modeling Group, Heidelberg Institute for Theoretical Studies (HITS), Heidelberg, Germany
- Center for Molecular Biology (ZMBH), DKFZ-ZMBH Alliance, Heidelberg University, Heidelberg, Germany
- Interdisciplinary Center for Scientific Computing (IWR), Heidelberg University, Heidelberg, Germany
| | - Ivan Anishchenko
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66047
| | - Petras J Kundrotas
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66047
| | - Ilya A Vakser
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66047
- Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, 66047
| | - Kenichiro Imai
- Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), Koto-Ku, Japan
| | - Kazunori Yamada
- Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), Koto-Ku, Japan
| | - Toshiyuki Oda
- Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), Koto-Ku, Japan
| | - Tsukasa Nakamura
- Graduate School of Frontier Sciences, the University of Tokyo, Kashiwa, Japan
| | - Kentaro Tomii
- Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), Koto-Ku, Japan
- Graduate School of Frontier Sciences, the University of Tokyo, Kashiwa, Japan
| | - Chiara Pallara
- Joint BSC-CRG-IRB Research Program in Computational Biology, Barcelona Supercomputing Center, C/Jordi Girona 29, Barcelona, 08034, Spain
| | - Miguel Romero-Durana
- Joint BSC-CRG-IRB Research Program in Computational Biology, Barcelona Supercomputing Center, C/Jordi Girona 29, Barcelona, 08034, Spain
| | - Brian Jiménez-García
- Joint BSC-CRG-IRB Research Program in Computational Biology, Barcelona Supercomputing Center, C/Jordi Girona 29, Barcelona, 08034, Spain
| | - Iain H Moal
- Joint BSC-CRG-IRB Research Program in Computational Biology, Barcelona Supercomputing Center, C/Jordi Girona 29, Barcelona, 08034, Spain
| | - Juan Fernández-Recio
- Joint BSC-CRG-IRB Research Program in Computational Biology, Barcelona Supercomputing Center, C/Jordi Girona 29, Barcelona, 08034, Spain
| | - Jong Young Joung
- Center for in-Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea
| | - Jong Yun Kim
- Center for in-Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea
| | - Keehyoung Joo
- Center for in-Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea
- Center for Advanced Computation, Korea Institute for Advanced Study, Seoul, 130-722, Korea
| | - Jooyoung Lee
- Center for in-Silico Protein Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea
- School of Computational Science, Korea Institute for Advanced Study, Seoul, 130-722, Korea
| | - Dima Kozakov
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts
| | - Sandor Vajda
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts
- Department of Chemistry, Boston University, Boston, Massachusetts
| | - Scott Mottarella
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts
| | - David R Hall
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts
| | - Dmitri Beglov
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts
| | - Artem Mamonov
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts
| | - Bing Xia
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts
| | - Tanggis Bohnuud
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts
| | - Carlos A Del Carpio
- Institute of Biological Diversity, International Pacific Institute of Indiana, Bloomington, Indiana, 47401
- Drosophila Genetic Resource Center, Kyoto Institute of Technology, Ukyo-Ku, 616-8354, Japan
| | - Eichiro Ichiishi
- International University of Health and Welfare Hospital (IUHW Hospital), Nasushiobara-City, Tochigi Prefecture, 329-2763, Japan
| | - Nicholas Marze
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, 21218
| | - Daisuke Kuroda
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, 21218
| | - Shourya S Roy Burman
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, 21218
| | - Jeffrey J Gray
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, 21218
- Program in Molecular Biophysics, Johns Hopkins University, Baltimore, Maryland, 21218
| | - Edrisse Chermak
- King Abdullah University of Science and Technology, Saudi Arabia
| | - Luigi Cavallo
- King Abdullah University of Science and Technology, Saudi Arabia
| | - Romina Oliva
- University of Naples "Parthenope", Napoli, Italy
| | - Andrey Tovchigrechko
- J. Craig Venter Institute, 9704 Medical Center Drive, Rockville, Maryland, 20850
| | - Shoshana J Wodak
- Departments of Biochemistry and Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.
- VIB Structural Biology Research Center, VUB Pleinlaan 2, Brussels, 1050, Belgium.
|
43
|
Kryshtafovych A, Monastyrskyy B, Fidelis K. CASP11 statistics and the prediction center evaluation system. Proteins 2016; 84 Suppl 1:15-9. [PMID: 26857434 PMCID: PMC5479680 DOI: 10.1002/prot.25005] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 11/19/2015] [Revised: 01/18/2016] [Accepted: 02/04/2016] [Indexed: 01/10/2023]
Abstract
We outline the role of the Protein Structure Prediction Center (predictioncenter.org) in conducting the CASP11 and CASP ROLL experiments, discuss the experiment statistics, and provide an overview of the present CASP infrastructure. The biggest changes compared to the previous CASPs are the implementation of the evaluation system incorporating practically all evaluation measures, statistical tests, and visualization tools historically used by the CASP assessors, the expansion of the infrastructure to incorporate new categories of contact-assisted and multimeric predictions, and the redesign of the assessors' web-workspace enabling assessments based on multiple measures for different group categories and target sets. Proteins 2016; 84(Suppl 1):15-19. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Andriy Kryshtafovych
- Protein Structure Prediction Center, Genome and Biomedical Sciences Facilities, University of California, Davis, California, 95616
| | - Bohdan Monastyrskyy
- Protein Structure Prediction Center, Genome and Biomedical Sciences Facilities, University of California, Davis, California, 95616
| | - Krzysztof Fidelis
- Protein Structure Prediction Center, Genome and Biomedical Sciences Facilities, University of California, Davis, California, 95616.
|
44
|
Modi V, Dunbrack RL. Assessment of refinement of template-based models in CASP11. Proteins 2016; 84 Suppl 1:260-81. [PMID: 27081793 DOI: 10.1002/prot.25048] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 12/24/2015] [Revised: 03/13/2016] [Accepted: 04/11/2016] [Indexed: 12/26/2022]
Abstract
CASP11 (the 11th Meeting on the Critical Assessment of Protein Structure Prediction) ran a blind experiment in the refinement of protein structure predictions, the fourth such experiment since CASP8. As with the previous experiments, the predictors were provided with one starting structure from the server models of each of a selected set of template-based modeling targets and asked to refine the coordinates of the starting structure toward native. We assessed the refined structures with the Z-scores of the standard CASP measures, which compare the model-target similarities of the models from all the predictors. Furthermore, we assessed the refined structures with "relative measures," which compare the improvement in accuracy of each model with respect to the starting structure. The latter provides an assessment of the extent to which each predictor group is able to improve the starting structures toward native. We utilized heat maps to display improvements in the Calpha-Calpha distance matrix for each model. The heat maps labeled with each element of secondary structure helped us to identify regions of refinement toward native in each model. Most positively scoring models show modest improvements in multiple regions of the structure, while in some models we were able to identify significant repositioning of N/C-terminal segments and internal elements of secondary structure. The best groups were able to improve more than 70% of the targets from the starting models, and by an average of 3-5% in the standard CASP measures. Proteins 2016; 84(Suppl 1):260-281. © 2016 Wiley Periodicals, Inc.
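The two kinds of measure contrasted in this abstract can be illustrated with a minimal sketch. It assumes a single target, a GDT-TS-like score on a 0-100 scale, and made-up values; the group names, the numbers, and the helper functions `z_scores` and `relative_improvement` are hypothetical and are not the assessors' actual formulas, which the abstract does not give.

```python
from statistics import mean, pstdev

def z_scores(scores):
    """Standardize scores across all predictors for one target."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All models scored identically, so no model stands out.
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]

def relative_improvement(refined, starting):
    """Change in score of a refined model with respect to the starting model."""
    return refined - starting

# Hypothetical scores for one refinement target (illustrative values only).
starting_gdt = 60.0
refined_gdt = {"group_A": 63.0, "group_B": 58.0, "group_C": 60.0, "group_D": 65.0}

# Z-score: how each model compares with the other submitted models.
z = dict(zip(refined_gdt, z_scores(list(refined_gdt.values()))))

# Relative measure: how each model compares with the starting structure.
delta = {g: relative_improvement(v, starting_gdt) for g, v in refined_gdt.items()}
```

In this toy data, `group_C` has a negative Z-score even though its model is no worse than the starting structure, which is the kind of distinction the relative measures are meant to expose.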
Affiliation(s)
- Vivek Modi
- Fox Chase Cancer Center, Philadelphia, Pennsylvania, 19111
|
45
|
Modi V, Xu Q, Adhikari S, Dunbrack RL. Assessment of template-based modeling of protein structure in CASP11. Proteins 2016; 84 Suppl 1:200-20. [PMID: 27081927 DOI: 10.1002/prot.25049] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 01/24/2016] [Revised: 04/04/2016] [Accepted: 04/11/2016] [Indexed: 12/27/2022]
Abstract
We present the assessment of predictions submitted in the template-based modeling (TBM) category of CASP11 (Critical Assessment of Protein Structure Prediction). Model quality was judged on the basis of global and local measures of accuracy on all atoms including side chains. The top groups on 39 human-server targets based on model 1 predictions were LEER, Zhang, LEE, MULTICOM, and Zhang-Server. The top groups on 81 targets by server groups based on model 1 predictions were Zhang-Server, nns, BAKER-ROSETTASERVER, QUARK, and myprotein-me. In CASP11, the best models for most targets were equal to or better than the best template available in the Protein Data Bank, even for targets with poor templates. The overall performance in CASP11 is similar to the performance of predictors in CASP10 with slightly better performance on the hardest targets. For most targets, assessment measures exhibited bimodal probability density distributions. Multi-dimensional scaling of an RMSD matrix for each target typically revealed a single cluster with models similar to the target structure, with a mode in the GDT-TS density between 40 and 90, and a wide distribution of models highly divergent from each other and from the experimental structure, with density mode at a GDT-TS value of ∼20. The models in this peak in the density were either compact models with entirely the wrong fold, or highly non-compact models. The results argue for a density-driven approach in future CASP TBM assessments that accounts for the bimodal nature of these distributions instead of Z scores, which assume a unimodal, Gaussian distribution. Proteins 2016; 84(Suppl 1):200-220. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Vivek Modi
- Fox Chase Cancer Center, Institute for Cancer Research, Philadelphia, Pennsylvania, 19111
| | - Qifang Xu
- Fox Chase Cancer Center, Institute for Cancer Research, Philadelphia, Pennsylvania, 19111
| | - Sam Adhikari
- Fox Chase Cancer Center, Institute for Cancer Research, Philadelphia, Pennsylvania, 19111
| | - Roland L Dunbrack
- Fox Chase Cancer Center, Institute for Cancer Research, Philadelphia, Pennsylvania, 19111.
|
46
|
Addressing the Role of Conformational Diversity in Protein Structure Prediction. PLoS One 2016; 11:e0154923. [PMID: 27159429 PMCID: PMC4861349 DOI: 10.1371/journal.pone.0154923] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 12/11/2015] [Accepted: 04/21/2016] [Indexed: 11/19/2022] Open
Abstract
Computational modeling of tertiary structures has become standard practice for studying proteins that lack experimental characterization. Unfortunately, 3D structure prediction methods and model quality assessment programs often overlook that an ensemble of conformers in equilibrium populates the native state of proteins. In this work we collected sets of publicly available protein models and the corresponding experimentally solved target structures and studied how they describe the conformational diversity of the protein. For each protein, we assessed the quality of the models against known conformers by several standard measures and identified those models ranked best. We found that model rankings are defined by both the selected target conformer and the similarity measure used. In 70% of the proteins in our datasets, different models are structurally closest to different conformers of the same protein target. We observed that model building protocols such as template-based or ab initio approaches describe the conformational diversity of the protein in similar ways, although for template-based methods this description may depend on the sequence similarity between target and template sequences. Taken together, our results support the idea that protein structure modeling could help to identify members of the native ensemble, highlight the importance of considering conformational diversity in protein 3D quality evaluations, and endorse the study of the variability of the native structure for a meaningful biological analysis.
|
47
|
Kinch LN, Li W, Monastyrskyy B, Kryshtafovych A, Grishin NV. Evaluation of free modeling targets in CASP11 and ROLL. Proteins 2016; 84 Suppl 1:51-66. [PMID: 26677002 DOI: 10.1002/prot.24973] [Citation(s) in RCA: 65] [Impact Index Per Article: 8.1] [Received: 09/28/2015] [Accepted: 12/12/2015] [Indexed: 12/25/2022]
Abstract
We present an assessment of 'template-free modeling' (FM) in CASP11 and ROLL. Community-wide server performance suggested the use of automated scores similar to previous CASPs would provide a good system of evaluating performance, even in the absence of comprehensive manual assessment. The CASP11 FM category included several outstanding examples, including successful prediction by the Baker group of a 256-residue target (T0806-D1) that lacked sequence similarity to any existing template. The top server model prediction by Zhang's Quark, which was apparently selected and refined by several manual groups, encompassed the entire fold of target T0837-D1. Methods from the same two groups tended to dominate overall CASP11 FM and ROLL rankings. Comparison of top FM predictions with those from the previous CASP experiment revealed progress in the category, particularly reflected in high prediction accuracy for larger protein domains. FM prediction models for two cases were sufficient to provide functional insights that were otherwise not obtainable by traditional sequence analysis methods. Importantly, CASP11 abstracts revealed that alignment-based contact prediction methods brought about much of the CASP11 progress, producing both of the functionally relevant models as well as several of the other outstanding structure predictions. These methodological advances enabled de novo modeling of much larger domain structures than was previously possible and allowed prediction of functional sites. Proteins 2016; 84(Suppl 1):51-66. © 2015 Wiley Periodicals, Inc.
Affiliation(s)
- Lisa N Kinch
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center at Dallas, 6001 Forest Park Road, Dallas, Texas 75390-9050.
| | - Wenlin Li
- Department of Biophysics and Department of Biochemistry, University of Texas Southwestern Medical Center at Dallas, 6001 Forest Park Road, Dallas, Texas 75390-9050
| | - Bohdan Monastyrskyy
- Genome Center, University of California, 451 Health Sciences Drive, Davis, California 95616
| | - Andriy Kryshtafovych
- Genome Center, University of California, 451 Health Sciences Drive, Davis, California 95616
| | - Nick V Grishin
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center at Dallas, 6001 Forest Park Road, Dallas, Texas 75390-9050
- Department of Biophysics and Department of Biochemistry, University of Texas Southwestern Medical Center at Dallas, 6001 Forest Park Road, Dallas, Texas 75390-9050
|
48
|
Kryshtafovych A, Moult J, Baslé A, Burgin A, Craig TK, Edwards RA, Fass D, Hartmann MD, Korycinski M, Lewis RJ, Lorimer D, Lupas AN, Newman J, Peat TS, Piepenbrink KH, Prahlad J, van Raaij MJ, Rohwer F, Segall AM, Seguritan V, Sundberg EJ, Singh AK, Wilson MA, Schwede T. Some of the most interesting CASP11 targets through the eyes of their authors. Proteins 2015; 84 Suppl 1:34-50. [PMID: 26473983 PMCID: PMC4834066 DOI: 10.1002/prot.24942] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 06/23/2015] [Revised: 09/17/2015] [Accepted: 10/11/2015] [Indexed: 11/17/2022]
Abstract
The Critical Assessment of protein Structure Prediction (CASP) experiment would not have been possible without the prediction targets provided by the experimental structural biology community. In this article, selected crystallographers providing targets for the CASP11 experiment discuss the functional and biological significance of the target proteins, highlight their most interesting structural features, and assess whether these features were correctly reproduced in the predictions submitted to CASP11. Proteins 2016; 84(Suppl 1):34–50. © 2015 The Authors. Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.
Affiliation(s)
| | - John Moult
- Department of Cell Biology and Molecular Genetics, Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland, 20850
| | - Arnaud Baslé
- Institute for Cell and Molecular Biosciences, University of Newcastle, Newcastle upon Tyne, NE2 4HH, United Kingdom
| | - Alex Burgin
- Broad Institute, Cambridge, Massachusetts, 02142
| | | | - Robert A Edwards
- Department of Biology, San Diego State University, San Diego, California, 92182
- Department of Computer Science, San Diego State University, San Diego, California, 92182
| | - Deborah Fass
- Department of Structural Biology, Weizmann Institute of Science, Rehovot, 76100, Israel
| | - Marcus D Hartmann
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, 72076, Germany
| | - Mateusz Korycinski
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, 72076, Germany
| | - Richard J Lewis
- Institute for Cell and Molecular Biosciences, University of Newcastle, Newcastle upon Tyne, NE2 4HH, United Kingdom
| | | | - Andrei N Lupas
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, 72076, Germany
| | - Janet Newman
- Biomedical Manufacturing Program, CSIRO, Parkville, VIC, Australia
| | - Thomas S Peat
- Biomedical Manufacturing Program, CSIRO, Parkville, VIC, Australia
| | - Kurt H Piepenbrink
- Institute of Human Virology, University of Maryland School of Medicine, Baltimore, Maryland, 21201
| | - Janani Prahlad
- Department of Biochemistry and Redox Biology Center, University of Nebraska-Lincoln, Lincoln, Nebraska, 68588
| | - Mark J van Raaij
- Centro Nacional de Biotecnología (CNB-CSIC), Madrid, E-28049, Spain
| | - Forest Rohwer
- Department of Biology and Viral Information Institute, San Diego State University, San Diego, California, 92182
| | - Anca M Segall
- Department of Biology and Viral Information Institute, San Diego State University, San Diego, California, 92182
| | | | - Eric J Sundberg
- Institute of Human Virology, University of Maryland School of Medicine, Baltimore, Maryland, 21201
- Department of Medicine, University of Maryland School of Medicine, Baltimore, Maryland, 21201
- Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, Maryland, 21201
| | - Abhimanyu K Singh
- School of Biosciences, University of Kent, Canterbury, Kent, United Kingdom
| | - Mark A Wilson
- Department of Biochemistry and Redox Biology Center, University of Nebraska-Lincoln, Lincoln, Nebraska, 68588
| | - Torsten Schwede
- Biozentrum, University of Basel, Basel, 4056, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, 4056, Switzerland
|
49
|
Iacoangeli A, Marcatili P, Tramontano A. Exploiting Homology Information in Nontemplate Based Prediction of Protein Structures. J Chem Theory Comput 2015; 11:5045-51. [DOI: 10.1021/acs.jctc.5b00371] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/29/2022]
Affiliation(s)
- Alfredo Iacoangeli
- Department of Physics, Sapienza University of Rome, P.le A. Moro 4, 00185 Rome, Italy
| | - Paolo Marcatili
- Department of Physics, Sapienza University of Rome, P.le A. Moro 4, 00185 Rome, Italy
| | - Anna Tramontano
- Department of Physics, Sapienza University of Rome, P.le A. Moro 4, 00185 Rome, Italy
- Istituto Pasteur Fondazione Cenci Bolognetti, Sapienza University of Rome, P.le A. Moro 4, 00185 Rome, Italy
|
50
|
Cao R, Bhattacharya D, Adhikari B, Li J, Cheng J. Massive integration of diverse protein quality assessment methods to improve template based modeling in CASP11. Proteins 2015; 84 Suppl 1:247-59. [PMID: 26369671 DOI: 10.1002/prot.24924] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 05/14/2015] [Revised: 08/21/2015] [Accepted: 09/10/2015] [Indexed: 12/28/2022]
Abstract
Model evaluation and selection is an important step and a big challenge in template-based protein structure prediction. Individual model quality assessment methods designed for recognizing some specific properties of protein structures often fail to consistently select good models from a model pool because of their limitations. Therefore, combining multiple complementary quality assessment methods is useful for improving model ranking and consequently tertiary structure prediction. Here, we report the performance and analysis of our human tertiary structure predictor (MULTICOM) based on the massive integration of 14 diverse complementary quality assessment methods that was successfully benchmarked in the 11th Critical Assessment of Techniques for Protein Structure Prediction (CASP11). The predictions of MULTICOM for 39 template-based domains were rigorously assessed by six scoring metrics covering global topology of Cα trace, local all-atom fitness, side chain quality, and physical reasonableness of the model. The results show that the massive integration of complementary, diverse single-model and multi-model quality assessment methods can effectively leverage the strength of single-model methods in distinguishing quality variation among similar good models and the advantage of multi-model quality assessment methods of identifying reasonable average-quality models. The overall excellent performance of the MULTICOM predictor demonstrates that integrating a large number of model quality assessment methods in conjunction with model clustering is a useful approach to improve the accuracy, diversity, and consequently robustness of template-based protein structure prediction. Proteins 2016; 84(Suppl 1):247-259. © 2015 Wiley Periodicals, Inc.
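The general idea of combining several quality assessment (QA) methods into one ranking can be sketched as a simple rank average. This is only an illustration under stated assumptions: the abstract does not say how MULTICOM combines its 14 methods, so the averaging scheme, the function `average_rank`, the method and model names, and all scores below are hypothetical.

```python
def average_rank(scores_by_method):
    """Average each model's rank over all QA methods (rank 1 = best score)."""
    # Assumes every method scored the same set of models.
    models = list(next(iter(scores_by_method.values())))
    totals = {m: 0.0 for m in models}
    for scores in scores_by_method.values():
        # Higher score is treated as better for every method in this sketch.
        ordered = sorted(models, key=lambda m: scores[m], reverse=True)
        for rank, m in enumerate(ordered, start=1):
            totals[m] += rank
    n_methods = len(scores_by_method)
    return {m: totals[m] / n_methods for m in models}

# Hypothetical predicted-quality scores from three QA methods.
qa_scores = {
    "qa_method_1": {"model_a": 0.71, "model_b": 0.65, "model_c": 0.40},
    "qa_method_2": {"model_a": 0.60, "model_b": 0.68, "model_c": 0.35},
    "qa_method_3": {"model_a": 0.80, "model_b": 0.62, "model_c": 0.50},
}

consensus = average_rank(qa_scores)
# The model with the lowest average rank is the consensus pick.
best_model = min(consensus, key=consensus.get)
```

Averaging ranks rather than raw scores avoids having to put differently scaled QA scores on a common scale, at the cost of discarding how large the score gaps are.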
Affiliation(s)
- Renzhi Cao
- Department of Computer Science, University of Missouri, Columbia, Missouri, 65211
| | | | - Badri Adhikari
- Department of Computer Science, University of Missouri, Columbia, Missouri, 65211
| | - Jilong Li
- Department of Computer Science, University of Missouri, Columbia, Missouri, 65211
| | - Jianlin Cheng
- Department of Computer Science, University of Missouri, Columbia, Missouri, 65211
- Informatics Institute, University of Missouri, Columbia, Missouri, 65211
|