1
Ibrahim MT, Lee J, Tao P. Homology modeling of Forkhead box protein C2: identification of potential inhibitors using ligand and structure-based virtual screening. Mol Divers 2023; 27:1661-1674. [PMID: 36048303 PMCID: PMC9975119 DOI: 10.1007/s11030-022-10519-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2022] [Accepted: 08/19/2022] [Indexed: 12/01/2022]
Abstract
Overexpression of Forkhead box protein C2 (FOXC2) has been associated with different types of carcinomas. FOXC2 plays an important role in the initiation and maintenance of the epithelial-mesenchymal transition (EMT) process, which is essential for the development of higher-grade tumors with an enhanced ability for metastasis. Thus, FOXC2 has become a therapeutic target for the development of anticancer drugs. MC-1-F2, the only identified experimental inhibitor of FOXC2, interacts with the full length of FOXC2. However, only the DNA-binding domain (DBD) of FOXC2 has a resolved crystal structure. In this work, a three-dimensional (3D) structure of the full-length FOXC2 was developed using homology modeling and used for structure-based drug design (SBDD). The quality of this 3D model of the full-length FOXC2 was evaluated using the MolProbity, ERRAT, and ProSA modules. Molecular dynamics (MD) simulation was also carried out to verify its stability. Ligand-based drug design (LBDD) was carried out to identify similar analogues of MC-1-F2 among 15 million compounds from the ChEMBL and ZINC databases. 792 molecules were retrieved from this similarity search. De novo SBDD was performed against the homology-modeled full-length 3D structure of FOXC2 to identify novel inhibitors. The combination of LBDD and SBDD helped in gaining better insight into the binding of MC-1-F2 and its analogues to the full length of FOXC2. The binding free energy of the top hits was further investigated using MD simulations and MM/GBSA calculations, yielding eight promising hits as lead compounds targeting FOXC2.
Affiliation(s)
- Mayar Tarek Ibrahim
  - Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, TX, USA
- Jiyong Lee
  - Department of Chemistry and Biochemistry, The University of Texas at Tyler, Tyler, TX, USA
- Peng Tao
  - Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, TX, USA
2
Cai H, Yu J, Qiao Y, Ma Y, Zheng J, Lin M, Yan Q, Huang L. Effect of the Type VI Secretion System Secreted Protein Hcp on the Virulence of Aeromonas salmonicida. Microorganisms 2022; 10:2307. [PMID: 36557560 PMCID: PMC9784854 DOI: 10.3390/microorganisms10122307] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/28/2022] [Revised: 11/16/2022] [Accepted: 11/19/2022] [Indexed: 11/23/2022]
Abstract
Aeromonas salmonicida, a psychrophilic bacterial pathogen, is widely distributed in marine and freshwater environments, causing serious economic losses to major salmon farming areas in the world. At present, it is still one of the most important pathogens threatening salmon farming. Hcp (haemolysin-coregulated protein) is an effector protein in the type VI secretion system (T6SS), which is secreted by T6SS and functions as its structural component. The results of our previous genomic sequencing showed that hcp existed in the mesophilic A. salmonicida SRW-OG1 isolated from naturally infected Epinephelus coioides. To further explore the role of Hcp in A. salmonicida SRW-OG1, we constructed an hcp-RNAi strain and verified its effect on the virulence of A. salmonicida. The results showed that, compared with the wild strain, the hcp-RNAi strain suffered from different degrees of decreased adhesion, growth, biofilm formation, extracellular product secretion, and virulence. This suggests that hcp may be an important virulence gene of A. salmonicida SRW-OG1.
Affiliation(s)
- Hongyan Cai
  - Key Laboratory of Healthy Mariculture for the East China Sea, Fisheries College, Ministry of Agriculture, Jimei University, Xiamen 361021, China
- Jiaying Yu
  - Key Laboratory of Healthy Mariculture for the East China Sea, Fisheries College, Ministry of Agriculture, Jimei University, Xiamen 361021, China
- Ying Qiao
  - Fourth Institute of Oceanography, Ministry of Natural Resources, No. 26, New Century Avenue, Beihai 536000, China
- Ying Ma
  - Key Laboratory of Healthy Mariculture for the East China Sea, Fisheries College, Ministry of Agriculture, Jimei University, Xiamen 361021, China
- Jiang Zheng
  - Key Laboratory of Healthy Mariculture for the East China Sea, Fisheries College, Ministry of Agriculture, Jimei University, Xiamen 361021, China
- Mao Lin
  - Key Laboratory of Healthy Mariculture for the East China Sea, Fisheries College, Ministry of Agriculture, Jimei University, Xiamen 361021, China
- Qingpi Yan
  - Key Laboratory of Healthy Mariculture for the East China Sea, Fisheries College, Ministry of Agriculture, Jimei University, Xiamen 361021, China
  - Correspondence: (Q.Y.); (L.H.)
- Lixing Huang
  - Key Laboratory of Healthy Mariculture for the East China Sea, Fisheries College, Ministry of Agriculture, Jimei University, Xiamen 361021, China
  - Correspondence: (Q.Y.); (L.H.)
3
Yadav NS, Kumar P, Singh I. Structural and functional analysis of protein. Bioinformatics 2022. [DOI: 10.1016/b978-0-323-89775-4.00026-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
4
Peng CX, Zhou XG, Zhang GJ. De novo Protein Structure Prediction by Coupling Contact With Distance Profile. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:395-406. [PMID: 32750861 DOI: 10.1109/tcbb.2020.3000758] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 06/11/2023]
Abstract
De novo protein structure prediction is a challenging problem that requires both an accurate energy function and an efficient conformation sampling method. In this study, a de novo structure prediction method, named CoDiFold, is proposed. In CoDiFold, contacts and distance profiles are organically combined into the Rosetta low-resolution energy function to improve the accuracy of the energy function. As a result, the correlation between energy and root mean square deviation (RMSD) is improved. In addition, a population-based multi-mutation strategy is designed to balance the exploration and exploitation of conformation space sampling. The average RMSD of the models generated by the proposed protocol is decreased by 49.24 and 45.21 percent on a test set of 43 proteins compared with those of the Rosetta and QUARK de novo protocols, respectively. The results also demonstrate that the structures predicted by the proposed CoDiFold are comparable to those of state-of-the-art methods for the 10 FM targets of CASP13. The source code and executable versions are freely available at http://github.com/iobio-zjut/CoDiFold.
5
Behera SK, Sabarinath T, Mishra PKK, Deneke Y, Kumar A, ChandraSekar S, Senthilkumar K, Verma M, Ganesh B, Gurav A, Hota A. Immunoinformatic Study of Recombinant LigA/BCon1-5 Antigen and Evaluation of Its Diagnostic Potential in Primary and Secondary Binding Tests for Serodiagnosis of Porcine Leptospirosis. Pathogens 2021; 10:1082. [PMID: 34578116 PMCID: PMC8466556 DOI: 10.3390/pathogens10091082] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/22/2021] [Revised: 08/05/2021] [Accepted: 08/16/2021] [Indexed: 11/20/2022]
Abstract
Leptospirosis is responsible for hampering the productivity of swine husbandry worldwide. The aim of this study was to assess the efficacy of bioinformatics tools in predicting the three-dimensional structure and immunogenicity of the recombinant LigA/BCon1-5 (rLigA/BCon1-5) antigen. A battery of bioinformatics tools such as I-TASSER, ProSA and SAVES v6.0 were used for the prediction and assessment of the predicted structure of the rLigA/BCon1-5 antigen. The Bepipred-2.0, DiscoTope v2.0 and ElliPro servers were used to predict linear and conformational epitopes, while T-cell epitopes were predicted using NetMHCpan 4.1 and the IEDB recommended 2.22 method for MHC Class I and II peptides, respectively. The results obtained using the various in silico methods were then compared with wet lab experiments comprising both primary (IgG Dot ELISA Dipstick test) and secondary binding assays (Latex Agglutination Test [LAT]) to screen 1153 porcine serum samples. The three-dimensional structure of the rLigA/BCon1-5 protein as predicted by I-TASSER was found to be reliable by Ramachandran plot and ProSA. The ElliPro server suggested 10 and three potential linear and conformational B-cell epitopes, respectively, on the peptide backbone of the rLigA/BCon1-5 protein. The DiscoTope prediction server suggested 47 amino acid residues to be part of the B-cell antigen. Ten of the most efficient peptides for MHC-I and II grooves were predicted by NetMHCpan 4.1 and the IEDB recommended 2.22 method, respectively. Of these, three peptides can serve dual functions, as they can fit both MHC I and II grooves, thereby eliciting both humoral and cell-mediated immune responses. The predictions of these computational approaches proved to be reliable, since the rLigA/BCon1-5 antigen-based IgG Dot ELISA Dipstick test and LAT gave results in concordance with the gold standard test, the Microscopic Agglutination Test (MAT), for serodiagnosis of leptospirosis. Both the IgG Dot ELISA Dipstick test and LAT were serodiagnostic assays ideally suited to the peripheral level of the animal health care system as "point of care" tests for the detection of porcine leptospirosis.
Affiliation(s)
- Sujit Kumar Behera
  - Department of Epidemiology & Public Health, Central University of Tamil Nadu, Tiruvarur 610001, India
- Thankappan Sabarinath
  - Clinical Bacteriological Laboratory, Indian Council of Agricultural Research—Indian Veterinary Research Institute, Mukteshwar, Nainital 263138, India
- Prasanta Kumar K. Mishra
  - Faculty of Veterinary and Animal Sciences, Rajiv Gandhi South Campus, Banaras Hindu University, Mirzapur 231001, India
- Yosef Deneke
  - School of Veterinary Medicine, Jimma University, Jimma 378, Ethiopia
- Ashok Kumar
  - Krishi Bhawan, Indian Council of Agricultural Research, New Delhi 110001, India
- Shanmugam ChandraSekar
  - Biochemistry Laboratory, Indian Council of Agricultural Research—Indian Veterinary Research Institute, Mukteshwar, Nainital 263138, India
- Kuppusamy Senthilkumar
  - Zoonoses Research Laboratory, Tamil Nadu Veterinary and Animal Sciences University, Chennai 600051, India
- MedRam Verma
  - Livestock Economics & Statistics Division, Indian Council of Agricultural Research—Indian Veterinary Research Institute, Bareilly 243122, India
- Amol Gurav
  - Temperate Animal Husbandry Division, ICAR—Indian Veterinary Research Institute (IVRI), Mukteshwar, Nainital 263138, India
- Abhishek Hota
  - Department of Animal Science, Centurion University of Technology and Management, Paralakhemundi 761211, India
6
Xia YH, Peng CX, Zhou XG, Zhang GJ. A Sequential Niche Multimodal Conformational Sampling Algorithm for Protein Structure Prediction. Bioinformatics 2021; 37:4357-4365. [PMID: 34245242 DOI: 10.1093/bioinformatics/btab500] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/22/2020] [Revised: 06/23/2021] [Accepted: 07/05/2021] [Indexed: 11/13/2022]
Abstract
MOTIVATION Massive local minima on the protein energy landscape often cause traditional conformational sampling algorithms to be easily trapped in local basin regions, because they find it difficult to overcome high-energy barriers. Also, the lowest energy conformation may not correspond to the native structure due to the inaccuracy of energy models. This study investigates whether these two problems can be alleviated by a sequential niche technique without loss of accuracy. RESULTS A sequential niche multimodal conformational sampling algorithm for protein structure prediction (SNfold) is proposed in this study. In SNfold, a derating function is designed based on the knowledge learned from the previous sampling and used to construct a series of sampling-guided energy functions. These functions then help the sampling algorithm overcome high-energy barriers and avoid the re-sampling of the explored regions. In inaccurate protein energy models, the high-energy conformation that may correspond to the native structure can be sampled with successively updated sampling-guided energy functions. The proposed SNfold is tested on 300 benchmark proteins, 24 CASP13 and 19 CASP14 FM targets. Results show that SNfold correctly folds (TM-score ≥ 0.5) 231 out of 300 proteins. In particular, compared with Rosetta restrained by distance (Rosetta-dist), SNfold achieves higher average TM-score and improves the sampling efficiency by more than 100 times. On several CASP FM targets, SNfold also shows good performance compared with four state-of-the-art servers in CASP. As a plug-in conformational sampling algorithm, SNfold can be extended to other protein structure prediction methods. AVAILABILITY The source code and executable versions are freely available at https://github.com/iobio-zjut/SNfold. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yu-Hao Xia
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou, 310023, China
- Chun-Xiang Peng
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou, 310023, China
- Xiao-Gen Zhou
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, 48109-2218, USA
- Gui-Jun Zhang
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou, 310023, China
7
Pearce R, Zhang Y. Toward the solution of the protein structure prediction problem. J Biol Chem 2021; 297:100870. [PMID: 34119522 PMCID: PMC8254035 DOI: 10.1016/j.jbc.2021.100870] [Citation(s) in RCA: 56] [Impact Index Per Article: 18.7] [Received: 04/20/2021] [Revised: 06/07/2021] [Accepted: 06/09/2021] [Indexed: 11/20/2022]
Abstract
Since Anfinsen demonstrated that the information encoded in a protein's amino acid sequence determines its structure in 1973, solving the protein structure prediction problem has been the Holy Grail of structural biology. The goal of protein structure prediction approaches is to utilize computational modeling to determine the spatial location of every atom in a protein molecule starting from only its amino acid sequence. Depending on whether homologous structures can be found in the Protein Data Bank (PDB), structure prediction methods have been historically categorized as template-based modeling (TBM) or template-free modeling (FM) approaches. Until recently, TBM has been the most reliable approach to predicting protein structures, and in the absence of reliable templates, the modeling accuracy sharply declines. Nevertheless, the results of the most recent community-wide assessment of protein structure prediction experiment (CASP14) have demonstrated that the protein structure prediction problem can be largely solved through the use of end-to-end deep machine learning techniques, where correct folds could be built for nearly all single-domain proteins without using the PDB templates. Critically, the model quality exhibited little correlation with the quality of available template structures, as well as the number of sequence homologs detected for a given target protein. Thus, the implementation of deep-learning techniques has essentially broken through the 50-year-old modeling border between TBM and FM approaches and has made the success of high-resolution structure prediction significantly less dependent on template availability in the PDB library.
Affiliation(s)
- Robin Pearce
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
- Yang Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
  - Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, USA
8
Zhao KL, Liu J, Zhou XG, Su JZ, Zhang Y, Zhang GJ. MMpred: a distance-assisted multimodal conformation sampling for de novo protein structure prediction. Bioinformatics 2021; 37:4350-4356. [PMID: 34185079 DOI: 10.1093/bioinformatics/btab484] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 02/01/2021] [Revised: 06/22/2021] [Accepted: 06/28/2021] [Indexed: 11/12/2022]
Abstract
MOTIVATION The mathematically optimal solution in computational protein folding simulations does not always correspond to the native structure, due to the imperfection of the energy force fields. There is therefore a need to search for more diverse suboptimal solutions in order to identify the states close to the native. We propose a novel multimodal optimization protocol to improve the conformation sampling efficiency and modeling accuracy of de novo protein structure folding simulations. RESULTS A distance-assisted multimodal optimization sampling algorithm, MMpred, is proposed for de novo protein structure prediction. The protocol consists of three stages. In the first modal exploration stage, a structural similarity evaluation model DMscore is designed to control the diversity of conformations, generating a population of diverse structures in different low-energy basins. In the second modal maintaining stage, an adaptive clustering algorithm MNDcluster is proposed to divide the populations and merge the modal by adjusting the annealing temperature to locate the promising basins. In the last stage of modal exploitation, a greedy search strategy is used to accelerate the convergence of the modal. Distance constraint information is used to construct the conformation scoring model to guide sampling. MMpred is tested on 320 non-redundant proteins, where MMpred obtains models with TM-score ≥ 0.5 on 268 cases, which is 20.3% higher than that of Rosetta guided with the same distance constraints. In addition, on 320 benchmark proteins, the average TM-score of the enhanced version of MMpred (E-MMpred) is 0.732 on the best model, which is comparable to trRosetta (0.730). AVAILABILITY The source code and executable are freely available at https://github.com/iobio-zjut/MMpred. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Kai-Long Zhao
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Jun Liu
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Xiao-Gen Zhou
  - Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw, Ann Arbor, MI 48109-2218, USA
- Jian-Zhong Su
  - School of Biomedical Engineering, School of Ophthalmology and Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325011, Zhejiang, China
- Yang Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw, Ann Arbor, MI 48109-2218, USA
- Gui-Jun Zhang
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
9
Abstract
For two decades, Rosetta has consistently been at the forefront of protein structure prediction. While it has become a very large package comprising programs, scripts, and tools for different types of macromolecular modelling such as ligand docking, protein-protein docking, protein design, and loop modelling, it started as the implementation of an algorithm for ab initio protein structure prediction. The term 'Rosetta' appeared for the first time twenty years ago in the literature to describe that algorithm and its contribution to the third edition of the community-wide Critical Assessment of techniques for protein Structure Prediction (CASP3). Similar to the Rosetta Stone that allowed deciphering the ancient Egyptian civilisation, David Baker and his co-workers have been contributing to deciphering 'the second half of the genetic code'. Although the focus of Baker's team has expanded to de novo protein design in the past few years, Rosetta's 'fame' is associated with its fragment-assembly protein structure prediction approach. Following a presentation of the main concepts underpinning its foundation, especially sequence-structure correlation and usage of fragments, we review the main stages of its development and highlight the milestones it has achieved in terms of protein structure prediction, particularly in CASP.
Affiliation(s)
- Jad Abbass
  - Department of Computer Science, Lebanese International University, Bekaa, Lebanon
- Jean-Christophe Nebel
  - Faculty of Science, Engineering and Computing, Kingston University, London, KT1 2EE, United Kingdom
10
Lin M, Wang F, Zhu Y. Modeled structure-based computational redesign of a glycosyltransferase for the synthesis of rebaudioside D from rebaudioside A. Biochem Eng J 2020. [DOI: 10.1016/j.bej.2020.107626] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 12/11/2022]
11
Abbass J, Nebel JC. Enhancing fragment-based protein structure prediction by customising fragment cardinality according to local secondary structure. BMC Bioinformatics 2020; 21:170. [PMID: 32357827 PMCID: PMC7195757 DOI: 10.1186/s12859-020-3491-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 09/30/2019] [Accepted: 04/13/2020] [Indexed: 11/10/2022]
Abstract
BACKGROUND Whenever suitable template structures are not available, usage of fragment-based protein structure prediction becomes the only practical alternative, as pure ab initio techniques require massive computational resources even for very small proteins. However, the inaccuracy of their energy functions and their stochastic nature require the generation of a large number of decoys to adequately explore the solution space, limiting their usage to small proteins. Taking advantage of the uneven complexity of the sequence-structure relationship of short fragments, we adjusted the fragment insertion process by customising the number of available fragment templates according to the expected complexity of the predicted local secondary structure. Whereas the number of fragments is kept to its default value for coil regions, important and dramatic reductions are proposed for beta sheet and alpha helical regions, respectively. RESULTS The evaluation of our fragment selection approach was conducted using an enhanced version of the popular Rosetta fragment-based protein structure prediction tool. It was modified so that the number of fragment candidates used in Rosetta could be adjusted based on the local secondary structure. Compared to Rosetta's standard predictions, our strategy delivered improved first models, +24% and +6% in terms of GDT, when using 2000 and 20,000 decoys, respectively, while significantly reducing the number of fragment candidates. Furthermore, our enhanced version of Rosetta is able to deliver with 2000 decoys a performance equivalent to that produced by standard Rosetta using 20,000 decoys. We hypothesise that, as the fragment insertion process focuses on the most challenging regions, such as coils, fewer decoys are needed to explore conformation spaces satisfactorily.
CONCLUSIONS Taking advantage of the high accuracy of sequence-based secondary structure predictions, we showed the value of that information to customise the number of candidates used during the fragment insertion process of fragment-based protein structure prediction. Experiments conducted using standard Rosetta showed that, when using the recommended number of decoys, i.e. 20,000, our strategy produces better results. Alternatively, similar results can be achieved using only 2000 decoys. Consequently, we recommend the adoption of this strategy to either significantly improve model quality or reduce processing times by a factor of 10.
Affiliation(s)
- Jad Abbass
  - Faculty of Science, Engineering and Computing, Kingston University, London, KT1 2EE UK
  - Department of Computer Science, Lebanese International University, Bekaa, Lebanon
- Jean-Christophe Nebel
  - Faculty of Science, Engineering and Computing, Kingston University, London, KT1 2EE UK
12
Zheng W, Li Y, Zhang C, Pearce R, Mortuza SM, Zhang Y. Deep-learning contact-map guided protein structure prediction in CASP13. Proteins 2019; 87:1149-1164. [PMID: 31365149 PMCID: PMC6851476 DOI: 10.1002/prot.25792] [Citation(s) in RCA: 128] [Impact Index Per Article: 25.6] [Received: 04/30/2019] [Revised: 07/14/2019] [Accepted: 07/27/2019] [Indexed: 12/28/2022]
Abstract
We report the results of two fully automated structure prediction pipelines, "Zhang-Server" and "QUARK", in CASP13. The pipelines were built upon the C-I-TASSER and C-QUARK programs, which in turn are based on I-TASSER and QUARK but with three new modules: (a) a novel multiple sequence alignment (MSA) generation protocol to construct deep sequence-profiles for contact prediction; (b) an improved meta-method, NeBcon, which combines multiple contact predictors, including ResPRE that predicts contact-maps by coupling precision-matrices with deep residual convolutional neural-networks; and (c) an optimized contact potential to guide structure assembly simulations. For 50 CASP13 FM domains that lacked homologous templates, average TM-scores of the first models produced by C-I-TASSER and C-QUARK were 28% and 56% higher than those constructed by I-TASSER and QUARK, respectively. For the first time, contact-map predictions demonstrated usefulness on TBM domains with close homologous templates, where TM-scores of C-I-TASSER models were significantly higher than those of I-TASSER models with a P-value <.05. Detailed data analyses showed that the success of C-I-TASSER and C-QUARK was mainly due to the increased accuracy of deep-learning-based contact-maps, as well as the careful balance between sequence-based contact restraints, threading templates, and generic knowledge-based potentials. Nevertheless, challenges still remain for predicting quaternary structure of multi-domain proteins, due to the difficulties in domain partitioning and domain reassembly. In addition, contact prediction in terminal regions was often unsatisfactory due to the sparsity of MSAs. Development of new contact-based domain partitioning and assembly methods and training contact models on sparse MSAs may help address these issues.
Affiliation(s)
- Wei Zheng
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
- Yang Li
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
- Chengxin Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
- Robin Pearce
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
- S M Mortuza
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
- Yang Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
  - Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan
13
Wang Y, Virtanen J, Xue Z, Zhang Y. I-TASSER-MR: automated molecular replacement for distant-homology proteins using iterative fragment assembly and progressive sequence truncation. Nucleic Acids Res 2017; 45:W429-W434. [PMID: 28472524 PMCID: PMC5793832 DOI: 10.1093/nar/gkx349] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 02/04/2017] [Accepted: 04/20/2017] [Indexed: 11/16/2022]
Abstract
Molecular replacement (MR) is one of the most common techniques used for solving the phase problem in X-ray crystal diffraction. The success rate of MR however drops quickly when the sequence identity between query and templates is reduced, while the I-TASSER-MR server is designed to solve the phase problem for proteins that lack close homologous templates. Starting from a sequence, it first generates full-length models using I-TASSER by iterative structural fragment reassembly. A progressive sequence truncation procedure is then used for editing the models based on local variations of the structural assembly simulations. Next, the edited models are submitted to MR-REX to search for optimal placements in the crystal unit-cells through replica-exchange Monte Carlo simulations, with the phasing results used by CNS for final atomic model refinement and selection. The I-TASSER-MR algorithm was tested in large-scale benchmark datasets and solved 36% more targets compared to using the best threading templates. The server takes primary sequence and raw crystal diffraction data as input, with output containing annotated phase information and refined structure models. It also allows users to choose between different methods for setting B-factors and the number of models used for phasing. The online server is freely available at http://zhanglab.ccmb.med.umich.edu/I-TASSER-MR.
Collapse
Affiliation(s)
- Yan Wang
- Key Laboratory of Molecular Biophysics of the Ministry of Education, School of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China.,Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Jouko Virtanen
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Zhidong Xue
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA.,School of Software Engineering, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA.,Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, USA
| |
|
14
|
Malik A, Afaq S, Gamal BE, Ellatif MA, Hassan WN, Dera A, Noor R, Tarique M. Molecular docking and pharmacokinetic evaluation of natural compounds as targeted inhibitors against Crz1 protein in Rhizoctonia solani. Bioinformation 2019; 15:277-286. [PMID: 31285645 PMCID: PMC6599437 DOI: 10.6026/97320630015277] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/20/2019] [Accepted: 03/27/2019] [Indexed: 11/29/2022] Open
Abstract
Crz1p regulates Calcineurin, a serine-threonine-specific protein phosphatase, in Rhizoctonia solani. It has attracted consideration as a novel target of antifungal therapy based on studies in numerous pathogenic fungi, including Cryptococcus neoformans, Candida albicans and Aspergillus fumigatus. The aim was to investigate whether Calcineurin can be a useful target for the treatment of Crz1 protein in R. solani, which causes wet root rot in chickpea. The work presented here reports in silico studies of Crz1 protein against natural compounds. This study comprises quantitative structure-toxicity relationship (QSTR) and quantitative structure-activity relationship (QSAR) analyses. All compounds showed high binding energy for Crz1 protein through molecular docking. Further, a pharmacokinetic study revealed that these compounds had minimal side effects. Biological activity spectrum prediction of these compounds showed potential antifungal properties by showing significant interaction with Crz1. Hence, these compounds can be used for the prevention and treatment of wet root rot in chickpea.
Affiliation(s)
- Ajit Malik
- Department of Clinical Biochemistry, College of Medicine, King Khalid University, Abha, Saudi Arabia
| | - Sarah Afaq
- Department of Clinical Biochemistry, College of Medicine, King Khalid University, Abha, Saudi Arabia
| | - Basiouny El Gamal
- Department of Clinical Biochemistry, College of Medicine, King Khalid University, Abha, Saudi Arabia
| | - Mohamed Abd Ellatif
- Department of Clinical Biochemistry, College of Medicine, King Khalid University, Abha, Saudi Arabia
- Department of Medical Biochemistry, Faculty of Medicine, Mansoura University, Mansoura, Egypt
| | - Waleed N Hassan
- Department of Clinical Biochemistry, College of Medicine, King Khalid University, Abha, Saudi Arabia
| | - Ayed Dera
- Departments of Clinical Laboratory Science, College of Applied Medical Science, King Khalid University, Abha, Saudi Arabia
| | - Rana Noor
- Department of Biochemistry, Faculty of Dentistry, Jamia Millia Islamia, New Delhi-110025, India
| | - Mohammed Tarique
- Center for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, Jamia Nagar, New Delhi-110025, India
| |
|
15
|
AIMOES: Archive information assisted multi-objective evolutionary strategy for ab initio protein structure prediction. Knowl Based Syst 2018. [DOI: 10.1016/j.knosys.2018.01.028] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/02/2023]
|
16
|
Assessing Exhaustiveness of Stochastic Sampling for Integrative Modeling of Macromolecular Structures. Biophys J 2018; 113:2344-2353. [PMID: 29211988 DOI: 10.1016/j.bpj.2017.10.005] [Citation(s) in RCA: 53] [Impact Index Per Article: 8.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2017] [Revised: 09/22/2017] [Accepted: 10/02/2017] [Indexed: 12/22/2022] Open
Abstract
Modeling of macromolecular structures involves structural sampling guided by a scoring function, resulting in an ensemble of good-scoring models. By necessity, the sampling is often stochastic, and must be exhaustive at a precision sufficient for accurate modeling and assessment of model uncertainty. Therefore, the very first step in analyzing the ensemble is an estimation of the highest precision at which the sampling is exhaustive. Here, we present an objective and automated method for this task. As a proxy for sampling exhaustiveness, we evaluate whether two independently and stochastically generated sets of models are sufficiently similar. The protocol includes testing 1) convergence of the model score, 2) whether model scores for the two samples were drawn from the same parent distribution, 3) whether each structural cluster includes models from each sample proportionally to its size, and 4) whether there is sufficient structural similarity between the two model samples in each cluster. The evaluation also provides the sampling precision, defined as the smallest clustering threshold that satisfies the third, most stringent test. We validate the protocol with the aid of enumerated good-scoring models for five illustrative cases of binary protein complexes. Passing the proposed four tests is necessary, but not sufficient for thorough sampling. The protocol is general in nature and can be applied to the stochastic sampling of any set of models, not just structural models. In addition, the tests can be used to stop stochastic sampling as soon as exhaustiveness at desired precision is reached, thereby improving sampling efficiency; they may also help in selecting a model representation that is sufficiently detailed to be informative, yet also sufficiently coarse for sampling to be exhaustive.
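The second test in the protocol above asks whether the model scores of the two independent samples come from the same parent distribution. One common way to quantify this is a two-sample Kolmogorov-Smirnov statistic. The sketch below assumes that choice (the abstract does not name the statistic), and the function name and inputs are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(scores_a, scores_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute difference between the empirical CDFs of
    the two score samples: 0.0 for identical samples, 1.0 for samples that
    do not overlap at all.
    """
    a = sorted(scores_a)
    b = sorted(scores_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    largest_gap = 0.0
    # The empirical CDFs only change at observed values, so it is enough
    # to compare them at every observed score.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap
```

A small statistic is consistent with the two samples sharing a score distribution. Turning it into a pass/fail decision needs a significance threshold, which is left out of this sketch.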
|
17
|
Lima Leite A, Silva Fernandes M, Charone S, Whitford GM, Everett ET, Buzalaf MAR. Proteomic Mapping of Dental Enamel Matrix from Inbred Mouse Strains: Unraveling Potential New Players in Enamel. Caries Res 2017; 52:78-87. [PMID: 29248934 DOI: 10.1159/000479039] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/27/2016] [Accepted: 06/23/2017] [Indexed: 01/21/2023] Open
Abstract
Enamel formation is a complex 2-step process by which proteins are secreted to form an extracellular matrix, followed by massive protein degradation and subsequent mineralization. Excessive systemic exposure to fluoride can disrupt this process and lead to a condition known as dental fluorosis. The genetic background influences the responses of mineralized tissues to fluoride, such as dental fluorosis, observed in A/J and 129P3/J mice. The aim of the present study was to map the protein profile of enamel matrix from A/J and 129P3/J strains. Enamel matrix samples were obtained from A/J and 129P3/J mice and analyzed by 2-dimensional electrophoresis and liquid chromatography coupled with mass spectrometry. A total of 120 proteins were identified, and 7 of them were classified as putative uncharacterized proteins and analyzed in silico for structural and functional characterization. An interesting finding was the possibility of the uncharacterized sequence Q8BIS2 being an enzyme involved in the degradation of matrix proteins. Thus, the results provide a comprehensive view of the structure and function for putative uncharacterized proteins found in the enamel matrix that could help to elucidate the mechanisms involved in enamel biomineralization and genetic susceptibility to dental fluorosis.
Affiliation(s)
- Aline Lima Leite
- Department of Biological Sciences, Bauru Dental School, University of São Paulo, São Paulo, Brazil
| | | | | | | | | | | |
|
18
|
Leonetti A, Cervoni L, Polticelli F, Kanamori Y, Yurtsever ZN, Agostinelli E, Mariottini P, Stano P, Cervelli M. Spectroscopic and calorimetric characterization of spermine oxidase and its association forms. Biochem J 2017; 474:4253-4268. [PMID: 29138259 DOI: 10.1042/bcj20170744] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/27/2017] [Revised: 11/07/2017] [Accepted: 11/13/2017] [Indexed: 12/11/2022]
Abstract
Spermine oxidase (SMOX) is a flavin-containing enzyme that oxidizes spermine to produce spermidine, 3-aminopropanaldehyde, and hydrogen peroxide. SMOX has been shown to play key roles in inflammation and carcinogenesis; indeed, it is differentially expressed in several human cancer types. Our previous investigation has revealed that SMOX purified after heterologous expression in Escherichia coli actually consists of monomers, covalent homodimers, and other higher-order forms. All association forms oxidize spermine and, after treatment with dithiothreitol, revert to SMOX monomer. Here, we report a detailed investigation on the thermal denaturation of SMOX and its association forms in native and reducing conditions. By combining spectroscopic methods (circular dichroism, fluorescence) and thermal methods (differential scanning calorimetry), we provide new insights into the structure, the transformation, and the stability of SMOX. While the crystal structure of this protein is not available yet, experimental results are interpreted also on the basis of a novel SMOX structural model, obtained in silico exploiting the recently solved acetylspermine oxidase crystal structure. We conclude that while at least one specific intermolecular disulfide bond links two SMOX molecules to form the homodimer, the thermal denaturation profiles can be justified by the presence of at least one intramolecular disulfide bond, which also plays a critical role in the stabilization of the overall three-dimensional SMOX structure, and in particular of its flavin adenine dinucleotide-containing active site.
Affiliation(s)
- Alessia Leonetti
- Department of Sciences, Roma Tre University, Viale Guglielmo Marconi 446, Rome I-00146, Italy
| | - Laura Cervoni
- Department of Biochemical Sciences 'A. Rossi Fanelli', University of 'La Sapienza', Piazzale Aldo Moro 5, Rome I-00185, Italy
| | - Fabio Polticelli
- Department of Sciences, Roma Tre University, Viale Guglielmo Marconi 446, Rome I-00146, Italy
- National Institute of Nuclear Physics, Roma Tre Section, Via della Vasca Navale 84, Rome I-00146, Italy
| | - Yuta Kanamori
- Department of Biochemical Sciences 'A. Rossi Fanelli', University of 'La Sapienza', Piazzale Aldo Moro 5, Rome I-00185, Italy
| | - Zuleyha Nihan Yurtsever
- Department of Biochemical Sciences 'A. Rossi Fanelli', University of 'La Sapienza', Piazzale Aldo Moro 5, Rome I-00185, Italy
| | - Enzo Agostinelli
- Department of Biochemical Sciences 'A. Rossi Fanelli', University of 'La Sapienza', Piazzale Aldo Moro 5, Rome I-00185, Italy
| | - Paolo Mariottini
- Department of Sciences, Roma Tre University, Viale Guglielmo Marconi 446, Rome I-00146, Italy
| | - Pasquale Stano
- Department of Sciences, Roma Tre University, Viale Guglielmo Marconi 446, Rome I-00146, Italy
| | - Manuela Cervelli
- Department of Sciences, Roma Tre University, Viale Guglielmo Marconi 446, Rome I-00146, Italy
| |
|
19
|
Zhang C, Mortuza SM, He B, Wang Y, Zhang Y. Template-based and free modeling of I-TASSER and QUARK pipelines using predicted contact maps in CASP12. Proteins 2017; 86 Suppl 1:136-151. [PMID: 29082551 DOI: 10.1002/prot.25414] [Citation(s) in RCA: 64] [Impact Index Per Article: 9.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/10/2017] [Revised: 10/09/2017] [Accepted: 10/27/2017] [Indexed: 12/26/2022]
Abstract
We develop two complementary pipelines, "Zhang-Server" and "QUARK", based on I-TASSER and QUARK pipelines for template-based modeling (TBM) and free modeling (FM), and test them in the CASP12 experiment. The combination of I-TASSER and QUARK successfully folds three medium-size FM targets that have more than 150 residues, even though the interplay between the two pipelines still awaits further optimization. Newly developed sequence-based contact prediction by NeBcon plays a critical role to enhance the quality of models, particularly for FM targets, by the new pipelines. The inclusion of NeBcon predicted contacts as restraints in the QUARK simulations results in an average TM-score of 0.41 for the best in top five predicted models, which is 37% higher than that by the QUARK simulations without contacts. In particular, there are seven targets that are converted from non-foldable to foldable (TM-score >0.5) due to the use of contact restraints in the simulations. Another additional feature in the current pipelines is the local structure quality prediction by ResQ, which provides a robust residue-level modeling error estimation. Despite the success, significant challenges still remain in ab initio modeling of multi-domain proteins and folding of β-proteins with complicated topologies bound by long-range strand-strand interactions. Improvements on domain boundary and long-range contact prediction, as well as optimal use of the predicted contacts and multiple threading alignments, are critical to address these issues seen in the CASP12 experiment.
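The abstract above uses a TM-score above 0.5 as its foldability threshold. For readers unfamiliar with the measure, a minimal sketch of the TM-score formula for a single, already computed superposition is given below. The function names are hypothetical, the full TM-score also maximizes over superpositions (omitted here), and the 0.5 Å floor on d0 is an assumed convention for very short chains.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (in Å) used by the TM-score."""
    if target_length <= 15:
        raise ValueError("d0 is undefined for chains of 15 residues or fewer")
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    # Assumption: clamp d0 to 0.5 Å so short chains keep a positive scale.
    return max(0.5, d0)


def tm_score(pair_distances, target_length):
    """TM-score for one fixed superposition.

    pair_distances: distances (Å) between aligned residue pairs.
    target_length: number of residues in the target (native) structure.
    """
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in pair_distances)
    return total / target_length
```

Because the sum is normalized by the target length, unaligned residues lower the score, and a model with every residue placed exactly scores 1.0.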
Affiliation(s)
- Chengxin Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
| | - S M Mortuza
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
| | - Baoji He
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan.,Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing, China
| | - Yanting Wang
- Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing, China
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan.,Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan
| |
|
20
|
Zhang GJ, Zhou XG, Yu XF, Hao XH, Yu L. Enhancing Protein Conformational Space Sampling Using Distance Profile-Guided Differential Evolution. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2017; 14:1288-1301. [PMID: 28113726 DOI: 10.1109/tcbb.2016.2566617] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
De novo protein structure prediction aims to search for low-energy conformations as it follows the thermodynamics hypothesis that places native conformations at the global minimum of the protein energy surface. However, the native conformation is not necessarily located in the lowest-energy regions owing to the inaccuracies of the energy model. This study presents a differential evolution algorithm using distance profile-based selection strategy to sample conformations with reasonable structure effectively. In the proposed algorithm, besides energy, the residue-residue distance is considered another measure of the conformation. The average distance errors of decoys between the distance of each residue pair and the corresponding distance in the distance profiles are first calculated when the trial conformation yields a larger energy value than that of the target. Then, the distance acceptance probability of the trial conformation is designed based on distance profiles if the trial conformation obtains a lower average distance error compared with that of the target conformation. The trial conformation is accepted to the next generation in accordance with its distance acceptance probability. By using the dual constraints of energy and distance in guiding sampling, the algorithm can sample conformations with lower energies and more reasonable structures. Experimental results of 28 benchmark proteins show that the proposed algorithm can effectively predict near-native protein structures.
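The selection step described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the abstract does not give the form of the distance acceptance probability, so the expression `1 - err_trial / err_target` is a placeholder, and the dictionary keys and function names are hypothetical.

```python
import random


def mean_distance_error(distances, profile):
    """Average absolute deviation of residue-pair distances from the profile."""
    return sum(abs(d - p) for d, p in zip(distances, profile)) / len(profile)


def select(target, trial, profile, rng=random.random):
    """Choose which conformation survives to the next generation.

    target, trial: dicts with 'energy' (float) and 'distances' (list of
    residue-pair distances in the same order as `profile`).
    """
    # Energy criterion: a trial that is not higher in energy always wins.
    if trial["energy"] <= target["energy"]:
        return trial
    # Distance criterion: only consulted when the trial has higher energy.
    err_trial = mean_distance_error(trial["distances"], profile)
    err_target = mean_distance_error(target["distances"], profile)
    if err_trial < err_target:
        # Placeholder probability (assumption): grows as the trial's
        # distance error shrinks relative to the target's.
        p_accept = 1.0 - err_trial / err_target
        if rng() < p_accept:
            return trial
    return target
```

The point of the second criterion is that a higher-energy trial can still survive when it agrees better with the distance profile, which compensates for inaccuracies in the energy model.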
|
21
|
Hao XH, Zhang GJ, Zhou XG. Conformational Space Sampling Method Using Multi-Subpopulation Differential Evolution for De novo Protein Structure Prediction. IEEE Trans Nanobioscience 2017; 16:618-633. [DOI: 10.1109/tnb.2017.2749243] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
|
22
|
Hao XH, Zhang GJ, Zhou XG, Yu XF. A Novel Method Using Abstract Convex Underestimation in Ab-Initio Protein Structure Prediction for Guiding Search in Conformational Feature Space. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2016; 13:887-900. [PMID: 26552093 DOI: 10.1109/tcbb.2015.2497226] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]
Abstract
To address the searching problem of protein conformational space in ab-initio protein structure prediction, a novel method using abstract convex underestimation (ACUE) based on the framework of evolutionary algorithm was proposed. Computing such conformations, essential to associate structural and functional information with gene sequences, is challenging due to the high-dimensionality and rugged energy surface of the protein conformational space. As a consequence, the dimension of protein conformational space should be reduced to a proper level. In this paper, the high-dimensionality original conformational space was converted into feature space whose dimension is considerably reduced by feature extraction technique. And, the underestimate space could be constructed according to abstract convex theory. Thus, the entropy effect caused by searching in the high-dimensionality conformational space could be avoided through such conversion. The tight lower bound estimate information was obtained to guide the searching direction, and the invalid searching area in which the global optimal solution is not located could be eliminated in advance. Moreover, instead of expensively calculating the energy of conformations in the original conformational space, the estimate value is employed to judge if the conformation is worth exploring to reduce the evaluation time, thereby making computational cost lower and the searching process more efficient. Additionally, fragment assembly and the Monte Carlo method are combined to generate a series of metastable conformations by sampling in the conformational space. The proposed method provides a novel technique to solve the searching problem of protein conformational space. Twenty small-to-medium structurally diverse proteins were tested, and the proposed ACUE method was compared with ItFix, HEA, Rosetta and the developed method LEDE without underestimate information. Test results show that the ACUE method can more rapidly and more efficiently obtain the near-native protein structure.
|
23
|
Lindert S, McCammon JA. Improved cryoEM-Guided Iterative Molecular Dynamics--Rosetta Protein Structure Refinement Protocol for High Precision Protein Structure Prediction. J Chem Theory Comput 2016; 11:1337-46. [PMID: 25883538 PMCID: PMC4393324 DOI: 10.1021/ct500995d] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2014] [Indexed: 12/13/2022]
Abstract
Many excellent methods exist that incorporate cryo-electron microscopy (cryoEM) data to constrain computational protein structure prediction and refinement. Previously, it was shown that iteration of two such orthogonal sampling and scoring methods – Rosetta and molecular dynamics (MD) simulations – facilitated exploration of conformational space in principle. Here, we go beyond a proof-of-concept study and address significant remaining limitations of the iterative MD–Rosetta protein structure refinement protocol. Specifically, all parts of the iterative refinement protocol are now guided by medium-resolution cryoEM density maps, and previous knowledge about the native structure of the protein is no longer necessary. Models are identified solely based on score or simulation time. All four benchmark proteins showed substantial improvement through three rounds of the iterative refinement protocol. The best-scoring final models of two proteins had sub-Ångstrom RMSD to the native structure over residues in secondary structure elements. Molecular dynamics was most efficient in refining secondary structure elements and was thus highly complementary to the Rosetta refinement which is most powerful in refining side chains and loop regions.
|
24
|
Wang Y, Virtanen J, Xue Z, Tesmer JJG, Zhang Y. Using iterative fragment assembly and progressive sequence truncation to facilitate phasing and crystal structure determination of distantly related proteins. Acta Crystallogr D Struct Biol 2016; 72:616-28. [PMID: 27139625 PMCID: PMC4931812 DOI: 10.1107/s2059798316003016] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2015] [Accepted: 02/19/2016] [Indexed: 04/15/2023] Open
Abstract
Molecular replacement (MR) often requires templates with high homology to solve the phase problem in X-ray crystallography. I-TASSER-MR has been developed to test whether the success rate for structure determination of distant-homology proteins could be improved by a combination of iterative fragmental structure-assembly simulations with progressive sequence truncation designed to trim regions with high variation. The pipeline was tested on two independent protein sets consisting of 61 proteins from CASP8 and 100 high-resolution proteins from the PDB. After excluding homologous templates, I-TASSER generated full-length models with an average TM-score of 0.773, which is 12% higher than the best threading templates. Using these as search models, I-TASSER-MR found correct MR solutions for 95 of 161 targets as judged by having a TFZ of >8 or with the final structure closer to the native than the initial search models. The success rate was 16% higher than when using the best threading templates. I-TASSER-MR was also applied to 14 protein targets from structure genomics centers. Seven of these were successfully solved by I-TASSER-MR. These results confirm that advanced structure assembly and progressive structural editing can significantly improve the success rate of MR for targets with distant homology to proteins of known structure.
Affiliation(s)
- Yan Wang
- Key Laboratory of Molecular Biophysics of the Ministry of Education, School of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, People’s Republic of China
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Jouko Virtanen
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Zhidong Xue
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- School of Software Engineering, Huazhong University of Science and Technology, Wuhan, Hubei 430074, People’s Republic of China
| | - John J. G. Tesmer
- Departments of Pharmacology and Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, USA
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, USA
| |
|
25
|
Maximova T, Moffatt R, Ma B, Nussinov R, Shehu A. Principles and Overview of Sampling Methods for Modeling Macromolecular Structure and Dynamics. PLoS Comput Biol 2016; 12:e1004619. [PMID: 27124275 PMCID: PMC4849799 DOI: 10.1371/journal.pcbi.1004619] [Citation(s) in RCA: 132] [Impact Index Per Article: 16.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/08/2023] Open
Abstract
Investigation of macromolecular structure and dynamics is fundamental to understanding how macromolecules carry out their functions in the cell. Significant advances have been made toward this end in silico, with a growing number of computational methods proposed yearly to study and simulate various aspects of macromolecular structure and dynamics. This review aims to provide an overview of recent advances, focusing primarily on methods proposed for exploring the structure space of macromolecules in isolation and in assemblies for the purpose of characterizing equilibrium structure and dynamics. In addition to surveying recent applications that showcase current capabilities of computational methods, this review highlights state-of-the-art algorithmic techniques proposed to overcome challenges posed in silico by the disparate spatial and time scales accessed by dynamic macromolecules. This review is not meant to be exhaustive, as such an endeavor is impossible, but rather aims to balance breadth and depth of strategies for modeling macromolecular structure and dynamics for a broad audience of novices and experts.
Affiliation(s)
- Tatiana Maximova
- Department of Computer Science, George Mason University, Fairfax, Virginia, United States of America
| | - Ryan Moffatt
- Department of Computer Science, George Mason University, Fairfax, Virginia, United States of America
| | - Buyong Ma
- Basic Science Program, Leidos Biomedical Research, Inc. Cancer and Inflammation Program, National Cancer Institute, Frederick, Maryland, United States of America
| | - Ruth Nussinov
- Basic Science Program, Leidos Biomedical Research, Inc. Cancer and Inflammation Program, National Cancer Institute, Frederick, Maryland, United States of America
- Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
| | - Amarda Shehu
- Department of Computer Science, George Mason University, Fairfax, Virginia, United States of America
- Department of Bioengineering, George Mason University, Fairfax, Virginia, United States of America
- School of Systems Biology, George Mason University, Manassas, Virginia, United States of America
| |
|
26
|
Iacoangeli A, Marcatili P, Tramontano A. Exploiting Homology Information in Nontemplate Based Prediction of Protein Structures. J Chem Theory Comput 2015; 11:5045-51. [DOI: 10.1021/acs.jctc.5b00371] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Affiliation(s)
- Alfredo Iacoangeli
- Department of Physics, Sapienza University of Rome, P.le A. Moro 4, 00185 Rome, Italy
| | - Paolo Marcatili
- Department of Physics, Sapienza University of Rome, P.le A. Moro 4, 00185 Rome, Italy
| | - Anna Tramontano
- Department of Physics, Sapienza University of Rome, P.le A. Moro 4, 00185 Rome, Italy
- Istituto Pasteur Fondazione Cenci Bolognetti, Sapienza University of Rome, P.le A. Moro 4, 00185 Rome, Italy
| |
|
27
|
Zhang W, Yang J, He B, Walker SE, Zhang H, Govindarajoo B, Virtanen J, Xue Z, Shen HB, Zhang Y. Integration of QUARK and I-TASSER for Ab Initio Protein Structure Prediction in CASP11. Proteins 2015; 84 Suppl 1:76-86. [PMID: 26370505 DOI: 10.1002/prot.24930] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/28/2015] [Revised: 08/26/2015] [Accepted: 09/10/2015] [Indexed: 11/12/2022]
Abstract
We tested two pipelines developed for template-free protein structure prediction in the CASP11 experiment. First, the QUARK pipeline constructs structure models by reassembling fragments of continuously distributed lengths excised from unrelated proteins. Five free-modeling (FM) targets have the model successfully constructed by QUARK with a TM-score above 0.4, including the first model of T0837-D1, which has a TM-score = 0.736 and RMSD = 2.9 Å to the native. Detailed analysis showed that the success is partly attributed to the high-resolution contact map prediction derived from fragment-based distance-profiles, which are mainly located between regular secondary structure elements and loops/turns and help guide the orientation of secondary structure assembly. In the Zhang-Server pipeline, weakly scoring threading templates are re-ordered by the structural similarity to the ab initio folding models, which are then reassembled by I-TASSER based structure assembly simulations; 60% more domains with length up to 204 residues, compared to the QUARK pipeline, were successfully modeled by the I-TASSER pipeline with a TM-score above 0.4. The robustness of the I-TASSER pipeline can stem from the composite fragment-assembly simulations that combine structures from both ab initio folding and threading template refinements. Despite the promising cases, challenges still exist in long-range beta-strand folding, domain parsing, and the uncertainty of secondary structure prediction; the latter of which was found to affect nearly all aspects of FM structure predictions, from fragment identification, target classification, structure assembly, to final model selection. Significant efforts are needed to solve these problems before real progress on FM could be made. Proteins 2016; 84(Suppl 1):76-86. © 2015 Wiley Periodicals, Inc.
Affiliation(s)
- Wenxuan Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109.,Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Jianyi Yang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109.,Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Baoji He
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109.,Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Sara Elizabeth Walker
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109.,Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Hongjiu Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109.,Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Brandon Govindarajoo
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109.,Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Jouko Virtanen
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109.,Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Zhidong Xue
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109.,Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Hong-Bin Shen
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109.,Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109.,Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| |
|
28
|
Jang R, Wang Y, Xue Z, Zhang Y. NMR data-driven structure determination using NMR-I-TASSER in the CASD-NMR experiment. JOURNAL OF BIOMOLECULAR NMR 2015; 62:511-525. [PMID: 25737244 PMCID: PMC4560687 DOI: 10.1007/s10858-015-9914-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/28/2014] [Accepted: 02/21/2015] [Indexed: 05/30/2023]
Abstract
NMR-I-TASSER, an adaption of the I-TASSER algorithm combining NMR data for protein structure determination, recently joined the second round of the CASD-NMR experiment. Unlike many molecular dynamics-based methods, NMR-I-TASSER takes a molecular replacement-like approach to the problem by first threading the target through the PDB to identify structural templates which are then used for iterative NOE assignments and fragment structure assembly refinements. The employment of multiple templates allows NMR-I-TASSER to sample different topologies while convergence to a single structure is not required. Retroactive and blind tests of the CASD-NMR targets from Rounds 1 and 2 demonstrate that even without using NOE peak lists I-TASSER can generate correct structure topology with 15 of 20 targets having a TM-score above 0.5. With the addition of NOE-based distance restraints, NMR-I-TASSER significantly improved the I-TASSER models with all models having the TM-score above 0.5. The average RMSD was reduced from 5.29 to 2.14 Å in Round 1 and 3.18 to 1.71 Å in Round 2. There is no obvious difference in the modeling results with using raw and refined peak lists, indicating robustness of the pipeline to the NOE assignment errors. Overall, despite the low-resolution modeling the current NMR-I-TASSER pipeline provides a coarse-grained structure folding approach complementary to traditional molecular dynamics simulations, which can produce fast near-native frameworks for atomic-level structural refinement.
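The abstract above reports results against a TM-score threshold of 0.5. As a reading aid, the following minimal sketch evaluates the standard TM-score formula for one fixed superposition of a model onto a native structure. The function names and the 0.5 Å floor on d0 are illustrative assumptions and are not taken from the cited paper; the full TM-score additionally maximises this quantity over superpositions, which this sketch does not attempt.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (in angstroms) used by TM-score.

    Assumption: d0 is floored at 0.5 A for very short chains, where the
    cube-root expression would otherwise become tiny or undefined.
    """
    if target_length <= 15:
        return 0.5
    return max(1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8, 0.5)


def tm_score_fixed(distances, target_length):
    """TM-score of a single, already chosen superposition.

    distances: distance in angstroms for each aligned residue pair.
    target_length: number of residues in the target (native) structure.
    Unaligned residues contribute nothing, so the score lies in (0, 1].
    """
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


if __name__ == "__main__":
    # Hypothetical example: 100-residue target, every pair 2 A apart.
    print(round(tm_score_fixed([2.0] * 100, 100), 3))
```

A perfect match (all distances zero, all residues aligned) scores 1, and a model whose aligned pairs all sit exactly d0 apart scores 0.5.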
Affiliation(s)
- Richard Jang
- School of Software Engineering, Huazhong University of Science and Technology, Wuhan, 430074, Hubei, China
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI, 48109-2218, USA
- Yan Wang
- School of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, Hubei, China
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI, 48109-2218, USA
- Zhidong Xue
- School of Software Engineering, Huazhong University of Science and Technology, Wuhan, 430074, Hubei, China
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI, 48109-2218, USA
- Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI, 48109-2218, USA
- Department of Biological Chemistry, University of Michigan, Ann Arbor, MI, 48109, USA
|
29
|
Abbass J, Nebel JC. Customised fragments libraries for protein structure prediction based on structural class annotations. BMC Bioinformatics 2015; 16:136. [PMID: 25925397 PMCID: PMC4419399 DOI: 10.1186/s12859-015-0576-2] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 12/08/2014] [Accepted: 04/17/2015] [Indexed: 12/05/2022] Open
Abstract
Background: Since experimental techniques are time-consuming and costly, in silico protein structure prediction is essential to produce conformations of protein targets. When homologous structures are not available, fragment-based protein structure prediction has become the approach of choice. However, it still has many issues, including poor performance when targets’ lengths are above 100 residues, excessive running times and sub-optimal energy functions. Taking advantage of the reliable performance of structural class prediction software, we propose to address some of the limitations of fragment-based methods by integrating structural constraints in their fragment selection process. Results: Using Rosetta, a state-of-the-art fragment-based protein structure prediction package, we evaluated our proposed pipeline on 70 former CASP targets containing up to 150 amino acids. Using either CATH or SCOP-based structural class annotations, enhancement of structure prediction performance is highly significant in terms of both GDT_TS (at least +2.6, p-values < 0.0005) and RMSD (−0.4, p-values < 0.005). Although CATH and SCOP classifications are different, they perform similarly. Moreover, proteins from all structural classes benefit from the proposed methodology. Further analysis also shows that methods relying on class-based fragments produce conformations which are more relevant to the user and converge quicker towards the best model as estimated by GDT_TS (up to 10% on average). This substantiates our hypothesis that usage of structurally relevant templates leads not only to reducing the size of the conformation space to be explored, but also to focusing on a more relevant area. Conclusions: Since our methodology produces models the quality of which is up to 7% higher on average than those generated by a standard fragment-based predictor, we believe it should be considered before conducting any fragment-based protein structure prediction.
Despite such progress, ab initio prediction remains a challenging task, especially for proteins of average and large sizes. Apart from improving search strategies and energy functions, integration of additional constraints seems a promising route, especially if they can be accurately predicted from sequence alone.
Affiliation(s)
- Jad Abbass
- Faculty of Science, Engineering and Computing, Kingston University, London, KT1 2EE, UK.
- Jean-Christophe Nebel
- Faculty of Science, Engineering and Computing, Kingston University, London, KT1 2EE, UK.
|
30
|
Du H, Brender JR, Zhang J, Zhang Y. Protein structure prediction provides comparable performance to crystallographic structures in docking-based virtual screening. Methods 2015; 71:77-84. [PMID: 25220914 PMCID: PMC4431978 DOI: 10.1016/j.ymeth.2014.08.017] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 06/02/2014] [Revised: 08/14/2014] [Accepted: 08/31/2014] [Indexed: 11/26/2022] Open
Abstract
Structure based virtual screening has largely been limited to protein targets for which either an experimental structure is available or a strongly homologous template exists so that a high-resolution model can be constructed. The performance of state of the art protein structure predictions in virtual screening in systems where only weakly homologous templates are available is largely untested. Using the challenging DUD database of structural decoys, we show here that even using templates with only weak sequence homology (<30% sequence identity) structural models can be constructed by I-TASSER which achieve comparable enrichment rates to using the experimental bound crystal structure in the majority of the cases studied. For 65% of the targets, the I-TASSER models, which are constructed essentially in the apo conformations, reached 70% of the virtual screening performance of using the holo-crystal structures. A correlation was observed between the success of I-TASSER in modeling the global fold and local structures in the binding pockets of the proteins versus the relative success in virtual screening. The virtual screening performance can be further improved by the recognition of chemical features of the ligand compounds. These results suggest that the combination of structure-based docking and advanced protein structure modeling methods should be a valuable approach to the large-scale drug screening and discovery studies, especially for the proteins lacking crystallographic structures.
Affiliation(s)
- Hongying Du
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA; Department of Public Health, Lanzhou University, Lanzhou 730000, China
- Jeffrey R Brender
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
- Jian Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
- Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
|
31
|
Three-dimensional protein structure prediction: Methods and computational strategies. Comput Biol Chem 2014; 53PB:251-276. [DOI: 10.1016/j.compbiolchem.2014.10.001] [Citation(s) in RCA: 121] [Impact Index Per Article: 12.1] [Received: 06/12/2014] [Revised: 10/03/2014] [Accepted: 10/07/2014] [Indexed: 01/01/2023]
|
32
|
Ab Initio structure prediction for Escherichia coli: towards genome-wide protein structure modeling and fold assignment. Sci Rep 2014; 3:1895. [PMID: 23719418 PMCID: PMC3667494 DOI: 10.1038/srep01895] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Received: 03/21/2013] [Accepted: 05/08/2013] [Indexed: 11/09/2022] Open
Abstract
Genome-wide protein structure prediction and structure-based function annotation have been a long-term goal in molecular biology but have not yet become possible due to difficulties in modeling distant-homology targets. We developed a hybrid pipeline combining ab initio folding and template-based modeling for genome-wide structure prediction applied to the Escherichia coli genome. The pipeline was tested on 43 known sequences, where QUARK-based ab initio folding simulation generated models with a TM-score 17% higher than that of traditional comparative modeling methods. For 495 unknown hard sequences, 72 are predicted to have a correct fold (TM-score > 0.5) and 321 have a substantial portion of structure correctly modeled (TM-score > 0.35). 317 sequences can be reliably assigned to a SCOP fold family based on structural analogy to existing proteins in PDB. The presented results, as a case study of E. coli, represent promising progress towards genome-wide structure modeling and fold family assignment using state-of-the-art ab initio folding algorithms.
|
33
|
Reconstructing protein structures by neural network pairwise interaction fields and iterative decoy set construction. Biomolecules 2014; 4:160-80. [PMID: 24970210 PMCID: PMC4030983 DOI: 10.3390/biom4010160] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 12/24/2013] [Revised: 01/22/2014] [Accepted: 01/30/2014] [Indexed: 11/17/2022] Open
Abstract
Predicting the fold of a protein from its amino acid sequence is one of the grand problems in computational biology. While there has been progress towards a solution, especially when a protein can be modelled based on one or more known structures (templates), in the absence of templates, even the best predictions are generally much less reliable. In this paper, we present an approach for predicting the three-dimensional structure of a protein from the sequence alone, when templates of known structure are not available. This approach relies on a simple reconstruction procedure guided by a novel knowledge-based evaluation function implemented as a class of artificial neural networks that we have designed: Neural Network Pairwise Interaction Fields (NNPIF). This evaluation function takes into account the contextual information for each residue and is trained to identify native-like conformations from non-native-like ones by using large sets of decoys as a training set. The training set is generated and then iteratively expanded during successive folding simulations. As NNPIF are fast at evaluating conformations, thousands of models can be processed in a short amount of time, and clustering techniques can be adopted for model selection. Although the results we present here are very preliminary, we consider them to be promising, with predictions being generated at state-of-the-art levels in some of the cases.
|
34
|
In silico prediction of structure and functions for some proteins of male-specific region of the human Y chromosome. Interdiscip Sci 2014; 5:258-69. [PMID: 24402818 DOI: 10.1007/s12539-013-0178-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2012] [Revised: 09/03/2012] [Accepted: 11/08/2012] [Indexed: 10/25/2022]
Abstract
The male-specific region of the human Y chromosome (MSY) comprises 95% of the chromosome's length and is functionally active. This portion is inherited as a block from father to male offspring. Most of the genes in the MSY are involved in male-specific functions, such as sex determination and spermatogenesis; the region also contains genes probably involved in other cellular functions. However, a detailed characterization of numerous MSY-encoded proteins remains to be done. In this study, 12 uncharacterized proteins of the MSY were analyzed with bioinformatics tools for structural and functional characterization. Within these 12 proteins, a total of 55 domains were found, with the DnaJ domain signature being the most frequent (11%), followed by the FAD-dependent pyridine nucleotide reductase signature and the fumarate lyase superfamily signature (9% each). The 3D structures of the selected proteins were built using homology modeling and protein threading approaches. Stereochemical checks of these predicted structures indicated models of reasonably good quality. Furthermore, the predicted functions of the proteins and their predicted interaction partners suggest their biological roles and mechanisms of action at the molecular level. The results of these structure-function annotations provide a comprehensive view of the proteins encoded by the MSY, which sheds light on their biological functions and molecular mechanisms. The data presented in this study may assist in future prognosis of several human diseases such as Turner syndrome, gonadal sex reversal, spermatogenic failure, and gonadoblastoma.
|
35
|
Petrella RJ. Optimization bias in energy-based structure prediction. J Theor Comput Chem 2013; 12:1341014. [PMID: 25552783 PMCID: PMC4278582 DOI: 10.1142/s0219633613410149] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/18/2022]
Abstract
Physics-based computational approaches to predicting the structure of macromolecules such as proteins are gaining increased use, but there are remaining challenges. In the current work, it is demonstrated that in energy-based prediction methods, the degree of optimization of the sampled structures can influence the prediction results. In particular, discrepancies in the degree of local sampling can bias the predictions in favor of the oversampled structures by shifting the local probability distributions of the minimum sampled energies. In simple systems, it is shown that the magnitude of the errors can be calculated from the energy surface, and for certain model systems, derived analytically. Further, it is shown that for energy wells whose forms differ only by a randomly assigned energy shift, the optimal accuracy of prediction is achieved when the sampling around each structure is equal. Energy correction terms can be used in cases of unequal sampling to reproduce the total probabilities that would occur under equal sampling, but optimal corrections only partially restore the prediction accuracy lost to unequal sampling. For multiwell systems, the determination of the correction terms is a multibody problem; it is shown that the involved cross-correlation multiple integrals can be reduced to simpler integrals. The possible implications of the current analysis for macromolecular structure prediction are discussed.
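The sampling bias described in the abstract above can be illustrated with a toy experiment. The sketch below is an illustration only and not the paper's model or derivation: it assumes two structures whose sampled energies follow the same Gaussian distribution, samples one more heavily than the other, and counts how often the oversampled structure yields the lower minimum energy. All function and parameter names are hypothetical. For identically distributed continuous energies, the oversampled structure is expected to win with probability n_over / (n_over + n_under), even though neither well is truly lower.

```python
import random


def min_sampled_energy(rng, n_samples, mean=0.0, sigma=1.0):
    """Lowest energy found after n_samples independent Gaussian draws."""
    return min(rng.gauss(mean, sigma) for _ in range(n_samples))


def oversampled_win_rate(n_over, n_under, trials=2000, seed=0):
    """Fraction of trials in which the more heavily sampled structure
    reports the lower minimum energy, although both structures share
    the same underlying energy distribution."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        e_over = min_sampled_energy(rng, n_over)
        e_under = min_sampled_energy(rng, n_under)
        if e_over < e_under:
            wins += 1
    return wins / trials


if __name__ == "__main__":
    # Unequal sampling favours the oversampled structure (expected ~100/110).
    print(oversampled_win_rate(100, 10))
    # Equal sampling removes the bias (expected ~0.5).
    print(oversampled_win_rate(50, 50))
```

Under these assumptions, equal sampling gives each structure an even chance of being selected, which matches the abstract's statement that prediction accuracy is optimal when sampling around each structure is equal.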
Affiliation(s)
- Robert J. Petrella
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138, USA
- Department of Medicine, Harvard Medical School, 25 Shattuck Street, Boston, MA 02115, USA
|
36
|
Molloy K, Shehu A. Elucidating the ensemble of functionally-relevant transitions in protein systems with a robotics-inspired method. BMC Struct Biol 2013; 13 Suppl 1:S8. [PMID: 24565158 PMCID: PMC3952944 DOI: 10.1186/1472-6807-13-s1-s8] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Indexed: 12/17/2022]
Abstract
Background: Many proteins tune their biological function by transitioning between different functional states, effectively acting as dynamic molecular machines. Detailed structural characterization of transition trajectories is central to understanding the relationship between protein dynamics and function. Computational approaches that build on the Molecular Dynamics framework are in principle able to model transition trajectories in great detail but also at considerable computational cost. Methods that delay consideration of dynamics and focus instead on elucidating energetically-credible conformational paths connecting two functionally-relevant structures provide a complementary approach. Effective sampling-based path planning methods originating in robotics have been recently proposed to produce conformational paths. These methods largely model short peptides or address large proteins by simplifying conformational space. Methods: We propose a robotics-inspired method that connects two given structures of a protein by sampling conformational paths. The method focuses on small- to medium-size proteins, efficiently modeling structural deformations through the use of the molecular fragment replacement technique. In particular, the method grows a tree in conformational space rooted at the start structure, steering the tree to a goal region defined around the goal structure. We investigate various bias schemes over a progress coordinate for balance between coverage of conformational space and progress towards the goal. A geometric projection layer promotes path diversity. A reactive temperature scheme allows sampling of rare paths that cross energy barriers. Results and conclusions: Experiments are conducted on small- to medium-size proteins of length up to 214 amino acids and with multiple known functionally-relevant states, some of which are more than 13 Å apart from each other.
Analysis reveals that the method effectively obtains conformational paths connecting structural states that are significantly different. A detailed analysis of the depth and breadth of the tree suggests that a soft global bias over the progress coordinate enhances sampling and results in higher path diversity. The explicit geometric projection layer that biases the exploration away from over-sampled regions further increases coverage, often improving proximity to the goal by forcing the exploration to find new paths. The reactive temperature scheme is shown to be effective in increasing path diversity, particularly in difficult structural transitions with known high-energy barriers.
|
37
|
Mutation induced structural variation in membrane proteins. Chem Res Chin Univ 2013. [DOI: 10.1007/s40242-013-2427-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022]
|
38
|
Structural modelling and dynamics of proteins for insights into drug interactions. Adv Drug Deliv Rev 2012; 64:323-43. [PMID: 22155026 DOI: 10.1016/j.addr.2011.11.011] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 07/13/2011] [Revised: 11/17/2011] [Accepted: 11/24/2011] [Indexed: 12/27/2022]
Abstract
Proteins are the workhorses of biomolecules and their function is affected by their structure and their structural rearrangements during ligand entry, ligand binding and protein-protein interactions. Hence, the knowledge of protein structure and, importantly, the dynamic behaviour of the structure are critical for understanding how the protein performs its function. The predictions of the structure and the dynamic behaviour can be performed by combinations of structure modelling and molecular dynamics simulations. The simulations also need to be sensitive to the constraints of the environment in which the protein resides. Standard computational methods now exist in this field to support the experimental effort of solving protein structures. This review presents a comprehensive overview of the basis of the calculations and the well-established computational methods used to generate and understand protein structure and function and the study of their dynamic behaviour with the reference to lung-related targets.
|
39
|
Dal Palú A, Spyrakis F, Cozzini P. A new approach for investigating protein flexibility based on Constraint Logic Programming. The first application in the case of the estrogen receptor. Eur J Med Chem 2012; 49:127-40. [PMID: 22277571 DOI: 10.1016/j.ejmech.2012.01.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 05/02/2011] [Revised: 01/05/2012] [Accepted: 01/05/2012] [Indexed: 12/01/2022]
Abstract
We describe the potential of a novel method, based on Constraint Logic Programming (CLP), developed for an exhaustive sampling of protein conformational space. The CLP framework proposed here has been tested and applied to the estrogen receptor, whose activity and function are strictly related to its intrinsic, and well known, dynamics. We have investigated in particular the flexibility of H12, focusing on the pathways followed by the helix when moving from one stable crystallographic conformation to the others. Millions of geometrically feasible conformations were generated and selected, and the traces connecting the different forms were determined by using a shortest path algorithm. The preliminary analyses showed a marked agreement between the crystallographic agonist-like, antagonist-like and hypothetical apo forms, and the corresponding conformations identified by the CLP framework. These promising results, together with the short computational time required to perform the analyses, make this constraint-based approach a valuable tool for the study of protein folding prediction. The CLP framework enables one to consider various structural and energetic scenarios without changing the core algorithm. To show the feasibility of the method, we intentionally chose a purely geometric setting, neglecting the energetic evaluation of the poses, in order to be independent of a specific force field and to provide the possibility of comparing different behaviours associated with various energy models.
|
40
|
Srivastava M, Gupta SK, Abhilash PC, Singh N. Structure prediction and binding sites analysis of curcin protein of Jatropha curcas using computational approaches. J Mol Model 2011; 18:2971-9. [PMID: 22146985 DOI: 10.1007/s00894-011-1320-0] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Received: 06/09/2011] [Accepted: 11/22/2011] [Indexed: 11/29/2022]
Abstract
Ribosome inactivating proteins (RIPs) are defense proteins in a number of higher-plant species that are directly targeted toward herbivores. Jatropha curcas is one of the biodiesel plants having RIPs. The Jatropha seed meal, after extraction of oil, is rich in curcin, a highly toxic RIP similar to ricin, which makes it unsuitable for animal feed. Although the toxicity of curcin is well documented in the literature, the detailed toxic properties and the 3D structure of curcin have not been determined by X-ray crystallography, NMR spectroscopy or any in silico techniques to date. In this pursuit, the structure of curcin was modeled by a composite approach of 3D structure prediction using threading and ab initio modeling. Model quality was assessed by methods which include Ramachandran plot analysis and Qmean score estimation. Further, we applied the protein-ligand docking approach to identify the r-RNA binding residue of curcin. The present work provides the first structural insight into the binding mode of r-RNA adenine to the curcin protein and forms the basis for designing future inhibitors of curcin. Cloning of a future peptide inhibitor within J. curcas can produce non-toxic varieties of J. curcas, which would make the seed-cake suitable as animal feed without curcin detoxification.
Affiliation(s)
- Mugdha Srivastava
- Eco-Auditing Laboratory, National Botanical Research Institute, CSIR, Lucknow, 226001 Uttar Pradesh, India.
|
41
|
Abstract
Genome sequencing projects have deciphered millions of protein sequences, which require knowledge of their structure and function to improve the understanding of their biological role. Although experimental methods can provide detailed information for a small fraction of these proteins, computational modeling is needed for the majority of protein molecules which are experimentally uncharacterized. The I-TASSER server is an on-line workbench for high-resolution modeling of protein structure and function. Given a protein sequence, a typical output from the I-TASSER server includes secondary structure prediction, predicted solvent accessibility of each residue, homologous template proteins detected by threading and structure alignments, up to five full-length tertiary structural models, and structure-based functional annotations for enzyme classification, Gene Ontology terms and protein-ligand binding sites. All the predictions are tagged with a confidence score which tells how accurate the predictions are without knowing the experimental data. To facilitate the special requests of end users, the server provides channels to accept user-specified inter-residue distance and contact maps to interactively change the I-TASSER modeling; it also allows users to specify any proteins as templates, or to exclude any template proteins during the structure assembly simulations. The structural information could be collected by the users based on experimental evidence or biological insights with the purpose of improving the quality of I-TASSER predictions. The server was evaluated as the best program for protein structure and function predictions in the recent community-wide CASP experiments. There are currently >20,000 registered scientists from over 100 countries who are using the on-line I-TASSER server.
Affiliation(s)
- Ambrish Roy
- Center for Computational Medicine and Bioinformatics, University of Michigan, USA
|
42
|
Lee HS, Zhang Y. BSP-SLIM: a blind low-resolution ligand-protein docking approach using predicted protein structures. Proteins 2011; 80:93-110. [PMID: 21971880 DOI: 10.1002/prot.23165] [Citation(s) in RCA: 67] [Impact Index Per Article: 5.2] [Received: 03/21/2011] [Revised: 06/30/2011] [Accepted: 08/04/2011] [Indexed: 01/19/2023]
Abstract
We developed BSP-SLIM, a new method for ligand-protein blind docking using low-resolution protein structures. For a given sequence, protein structures are first predicted by I-TASSER; putative ligand binding sites are transferred from holo-template structures which are analogous to the I-TASSER models; ligand-protein docking conformations are then constructed by shape and chemical match of ligand with the negative image of binding pockets. BSP-SLIM was tested on 71 ligand-protein complexes from the Astex diverse set where the protein structures were predicted by I-TASSER with an average RMSD of 2.92 Å on the binding residues. Using I-TASSER models, the median ligand RMSD of BSP-SLIM docking is 3.99 Å, which is 5.94 Å lower than that by AutoDock; the median binding-site error by BSP-SLIM is 1.77 Å, which is 6.23 Å lower than that by AutoDock and 3.43 Å lower than that by LIGSITE(CSC). Compared to the models using crystal protein structures, the median ligand RMSD by BSP-SLIM using I-TASSER models increases by 0.87 Å, while that by AutoDock increases by 8.41 Å; the median binding-site error by BSP-SLIM increases by 0.69 Å, while that by AutoDock and LIGSITE(CSC) increases by 7.31 Å and 1.41 Å, respectively. As case studies, BSP-SLIM was used in virtual screening for six target proteins, which prioritized actives of 25% and 50% in the top 9.2% and 17% of the library on average, respectively. These results demonstrate the usefulness of the template-based coarse-grained algorithms in the low-resolution ligand-protein docking and drug-screening. An on-line BSP-SLIM server is freely available at http://zhanglab.ccmb.med.umich.edu/BSP-SLIM.
Affiliation(s)
- Hui Sun Lee
- Department of Biological Chemistry, Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
|
43
|
Kifer I, Nussinov R, Wolfson HJ. Protein structure prediction using a docking-based hierarchical folding scheme. Proteins 2011; 79:1759-73. [PMID: 21445943 PMCID: PMC3092838 DOI: 10.1002/prot.22999] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 10/25/2010] [Revised: 01/02/2011] [Accepted: 01/18/2011] [Indexed: 12/13/2022]
Abstract
The pathways by which proteins fold into their specific native structure are still an unsolved mystery. Currently, many methods for protein structure prediction are available, and most of them tackle the problem by relying on the vast amounts of data collected from known protein structures. These methods are often not concerned with the route the protein follows to reach its final fold. This work is based on the premise that proteins fold in a hierarchical manner. We present FOBIA, an automated method for predicting a protein structure. FOBIA consists of two main stages: the first finds matches between parts of the target sequence and independently folding structural units using profile-profile comparison. The second assembles these units into a 3D structure by searching and ranking their possible orientations toward each other using a docking-based approach. We have previously reported an application of an initial version of this strategy to homology-based targets. Since then we have considerably enhanced our method's abilities to allow it to address the more difficult template-based target category. This allows us to now apply FOBIA to the template-based targets of CASP8 and to show that it is both very efficient and promising. Our method can provide an alternative for template-based structure prediction, and in particular, the docking-based ranking technique presented here can be incorporated into any profile-profile comparison based method.
Affiliation(s)
- Ilona Kifer
- School of Computer Science, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 69978, Israel.
|
44
|
Lee J, Lee J, Sasaki TN, Sasai M, Seok C, Lee J. De novo protein structure prediction by dynamic fragment assembly and conformational space annealing. Proteins 2011; 79:2403-17. [DOI: 10.1002/prot.23059] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/17/2010] [Revised: 03/24/2011] [Accepted: 04/12/2011] [Indexed: 12/25/2022]
|
45
|
Rankin CA, Roy A, Zhang Y, Richter M. Parkin, A Top Level Manager in the Cell's Sanitation Department. Open Biochem J 2011; 5:9-26. [PMID: 21633666 PMCID: PMC3104551 DOI: 10.2174/1874091x01105010009] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/30/2010] [Revised: 01/25/2011] [Accepted: 01/31/2011] [Indexed: 01/31/2023] Open
Abstract
Parkin belongs to a class of multiple RING domain proteins designated as RBR (RING, in between RING, RING) proteins. In this review we examine what is known regarding the structure/function relationship of the Parkin protein. Parkin contains three RING domains plus a ubiquitin-like domain and an in-between-RING (IBR) domain. RING domains are rich in cysteine amino acids that act as ligands to bind zinc ions. RING domains may interact with DNA or with other proteins and perform a wide range of functions. Some function as E3 ubiquitin ligases, participating in attachment of ubiquitin chains to signal proteasome degradation; however, ubiquitin may be attached for purposes other than proteasome degradation. It was determined that the C-terminal most RING, RING2, is essential for Parkin to function as an E3 ubiquitin ligase and a number of substrates have been identified. However, Parkin also participates in a number of other functions, such as DNA repair, microtubule stabilization, and formation of aggresomes. Some functions, such as participation in a multi-protein complex implicated in NMDA activity at the post synaptic density, do not require ubiquitination of substrate molecules. Recent observations of RING proteins suggest their function may be regulated by zinc ion binding. We have modeled the three RING domains of Parkin and have identified a new set of RING2 ligands. This set allows for binding of two rather than just one zinc ion, opening the possibility that the number of zinc ions bound acts as a molecular switch to modulate Parkin function.
Affiliation(s)
- Carolyn A Rankin
- Molecular Biosciences Department, University of Kansas, Lawrence KS 66045, USA
|
46
|
Bazzoli A, Tettamanzi AGB, Zhang Y. Computational protein design and large-scale assessment by I-TASSER structure assembly simulations. J Mol Biol 2011; 407:764-76. [PMID: 21329699 DOI: 10.1016/j.jmb.2011.02.017] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/23/2010] [Revised: 01/30/2011] [Accepted: 02/05/2011] [Indexed: 10/18/2022]
Abstract
Protein design aims at designing new protein molecules of desired structure and functionality. One of the major obstacles to large-scale protein design is the extensive time and manpower required for experimental validation of designed sequences. Recent advances in protein structure prediction have opened the possibility of an automated assessment of the designed sequences via folding simulations. We present a new protocol for protein design and validation. The sequence space is initially searched by Monte Carlo sampling guided by a public atomic potential, with candidate sequences selected by the clustering of sequence decoys. The designed sequences are then assessed by I-TASSER folding simulations, which generate full-length atomic structural models by the iterative assembly of threading fragments. The protocol is tested on 52 nonhomologous single-domain proteins, with an average sequence identity of 24% between the designed sequences and the native sequences. Despite this low sequence identity, three-dimensional models predicted for the first designed sequence have an RMSD of <2 Å to the target structure in 62% of cases. This percentage increases to 77% if we consider the three-dimensional models from the top 10 designed sequences. Such a striking consistency between the target structure and the structural prediction from nonhomologous sequences, despite the fact that the design and folding algorithms adopt completely different force fields, indicates that the design algorithm captures the features essential to the global fold of the target. On average, the designed sequences have a free energy that is 0.39 kcal/(mol residue) lower than in the native sequences, potentially affording a greater stability to synthesized target folds.
Affiliation(s)
- Andrea Bazzoli
- Center for Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109-2218, USA
|
47
|
Lee SY, Skolnick J. TASSER_WT: a protein structure prediction algorithm with accurate predicted contact restraints for difficult protein targets. Biophys J 2011; 99:3066-75. [PMID: 21044605 DOI: 10.1016/j.bpj.2010.09.007] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/30/2010] [Revised: 08/29/2010] [Accepted: 09/07/2010] [Indexed: 12/29/2022] Open
Abstract
To improve the prediction accuracy in the regime where template alignment quality is poor, an updated version of TASSER_2.0, namely TASSER_WT, was developed. TASSER_WT incorporates more accurate contact restraints from a new method, COMBCON. COMBCON uses confidence-weighted contacts from PROSPECTOR_3.5, the latest version, PROSPECTOR_4, and a new local structural fragment-based threading algorithm, STITCH, implemented in two variants depending on expected fragment prediction accuracy. TASSER_WT is tested on 622 Hard proteins, the most difficult targets (incorrect alignments and/or templates and incorrect side-chain contact restraints) in a comprehensive benchmark of 2591 nonhomologous, single domain proteins ≤ 200 residues that cover the PDB at 35% pairwise sequence identity. For 454 of 622 Hard targets, COMBCON provides contact restraints with higher accuracy and number of contacts per residue. TASSER_WT models improve as contact coverage with confidence weight ≥ 3 (F(wt ≥ 3)(cov)) increases. When F(wt ≥ 3)(cov) > 1.0 and > 0.4, the average root mean-square deviation of TASSER_WT (TASSER_2.0) models is 4.11 Å (6.72 Å) and 5.03 Å (6.40 Å), respectively. Regarding a structure prediction as successful when a model has a TM-score to the native structure ≥ 0.4, when F(wt ≥ 3)(cov) > 1.0 and > 0.4, the success rate of TASSER_WT (TASSER_2.0) is 98.8% (76.2%) and 93.7% (81.1%), respectively.
Affiliation(s)
- Seung Yup Lee
- Center for Study of Systems Biology, Georgia Institute of Technology, Atlanta, Georgia, USA
|
48
|
Yuan X, Shao Y, Bystroff C. Ab initio protein structure prediction using pathway models. Comp Funct Genomics 2010; 4:397-401. [PMID: 18629080 PMCID: PMC2447365 DOI: 10.1002/cfg.305] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/22/2003] [Revised: 05/30/2003] [Accepted: 06/02/2003] [Indexed: 11/13/2022] Open
Abstract
Ab initio prediction is the challenging attempt to predict protein structures based only on sequence information and without using templates. It is often divided into two distinct sub-problems: (a) the scoring function that can distinguish native, or native-like structures, from non-native ones; and (b) the method of searching the conformational space. Currently, there is no reliable scoring function that can always drive a search to the native fold, and there is no general search method that can guarantee a significant sampling of near-natives. Pathway models combine the scoring function and the search. In this short review, we explore some of the ways pathway models are used in folding, in published works since 2001, and present a new pathway model, HMMSTR-CM, that uses a fragment library and a set of nucleation/propagation-based rules. The new method was used for ab initio predictions as part of CASP5. This work was presented at the Winter School in Bioinformatics, Bologna, Italy, 10–14 February 2003.
Affiliation(s)
- Xin Yuan
- Department of Biology, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
|
49
|
Roy A, Kucukural A, Zhang Y. I-TASSER: a unified platform for automated protein structure and function prediction. Nat Protoc 2010; 5:725-38. [PMID: 20360767 PMCID: PMC2849174 DOI: 10.1038/nprot.2010.5] [Citation(s) in RCA: 4715] [Impact Index Per Article: 336.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
The iterative threading assembly refinement (I-TASSER) server is an integrated platform for automated protein structure and function prediction based on the sequence-to-structure-to-function paradigm. Starting from an amino acid sequence, I-TASSER first generates three-dimensional (3D) atomic models from multiple threading alignments and iterative structural assembly simulations. The function of the protein is then inferred by structurally matching the 3D models with other known proteins. The output from a typical server run contains full-length secondary and tertiary structure predictions, and functional annotations on ligand-binding sites, Enzyme Commission numbers and Gene Ontology terms. An estimate of accuracy of the predictions is provided based on the confidence score of the modeling. This protocol provides new insights and guidelines for designing online server systems for state-of-the-art protein structure and function predictions. The server is available at http://zhanglab.ccmb.med.umich.edu/I-TASSER.
Affiliation(s)
- Ambrish Roy
- Center for Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Ave, Ann Arbor, MI 48109, USA
- Center for Bioinformatics and Department of Molecular Bioscience, University of Kansas, 2030 Becker Dr, Lawrence, KS 66047, USA
- Alper Kucukural
- Center for Bioinformatics and Department of Molecular Bioscience, University of Kansas, 2030 Becker Dr, Lawrence, KS 66047, USA
- Yang Zhang
- Center for Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Ave, Ann Arbor, MI 48109, USA
- Center for Bioinformatics and Department of Molecular Bioscience, University of Kansas, 2030 Becker Dr, Lawrence, KS 66047, USA
|
50
|
Li Y, Zhang Y. REMO: A new protocol to refine full atomic protein models from C-alpha traces by optimizing hydrogen-bonding networks. Proteins 2009; 76:665-76. [PMID: 19274737 PMCID: PMC2771173 DOI: 10.1002/prot.22380] [Citation(s) in RCA: 99] [Impact Index Per Article: 6.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
Abstract
Protein structure prediction approaches usually perform modeling simulations based on reduced representations of protein structures. For biological applications, it is an important step to construct full atomic models from the reduced structure decoys. Most current full atomic model reconstruction procedures have defects: they either cannot completely remove the steric clashes among backbone atoms or generate final atomic models with worse topology similarity relative to the native structures than the reduced models. In this work, we develop a new protocol, called REMO, to generate full atomic protein models by optimizing the hydrogen-bonding network with basic fragments matched from a newly constructed backbone isomer library of solved protein structures. The algorithm is benchmarked on 230 nonhomologous proteins with reduced structure decoys generated by I-TASSER simulations. The results show that REMO has a significant ability to remove steric clashes while retaining the good topology of the reduced model. The hydrogen-bonding network of the final models is dramatically improved during the procedure. The REMO algorithm was used in the recent CASP8 experiment, which demonstrated significant improvements of the I-TASSER models in both atomic-level structural refinement and hydrogen-bonding network construction.
Affiliation(s)
- Yunqi Li
- Center for Bioinformatics and Department of Molecular Bioscience, University of Kansas, 2030 Becker Dr, Lawrence, KS 66047, USA
- Yang Zhang
- Center for Bioinformatics and Department of Molecular Bioscience, University of Kansas, 2030 Becker Dr, Lawrence, KS 66047, USA
|