1
Jung Y, Geng C, Bonvin AMJJ, Xue LC, Honavar VG. MetaScore: A Novel Machine-Learning-Based Approach to Improve Traditional Scoring Functions for Scoring Protein-Protein Docking Conformations. Biomolecules 2023; 13:121. [PMID: 36671507 PMCID: PMC9855734 DOI: 10.3390/biom13010121] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/01/2022] [Revised: 12/22/2022] [Accepted: 12/26/2022] [Indexed: 01/11/2023] Open
Abstract
Protein-protein interactions play a ubiquitous role in biological function. Knowledge of the three-dimensional (3D) structures of the complexes they form is essential for understanding the structural basis of those interactions and how they orchestrate key cellular processes. Computational docking has become an indispensable alternative to the expensive and time-consuming experimental approaches for determining the 3D structures of protein complexes. Despite recent progress, identifying near-native models from a large set of conformations sampled by docking (the so-called scoring problem) still has considerable room for improvement. We present MetaScore, a new machine-learning-based approach to improve the scoring of docked conformations. MetaScore utilizes a random forest (RF) classifier trained to distinguish near-native from non-native conformations using their protein-protein interfacial features. The features include physicochemical properties, energy terms, interaction-propensity-based features, geometric properties, interface topology features, evolutionary conservation, and scores produced by traditional scoring functions (SFs). MetaScore scores docked conformations by simply averaging the score produced by the RF classifier with that produced by any traditional SF. We demonstrate that (i) MetaScore consistently outperforms each of the nine traditional SFs included in this work in terms of success rate and hit rate evaluated over conformations ranked among the top 10; (ii) an ensemble method, MetaScore-Ensemble, that combines 10 variants of MetaScore obtained by combining the RF score with each of the traditional SFs outperforms each of the MetaScore variants. We conclude that the performance of traditional SFs can be improved upon by using machine learning to judiciously leverage protein-protein interfacial features and by using ensemble methods to combine multiple scoring functions.
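The averaging step described in this abstract can be illustrated with a minimal Python sketch. The abstract states only that the RF score is averaged with the score of a traditional SF; the min-max rescaling, the `sf_lower_is_better` flag, and every function name below are assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence


def min_max(values: Sequence[float]) -> List[float]:
    """Rescale values to [0, 1]; a constant list maps to 0.5 (assumed convention)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def meta_scores(
    rf_probs: Sequence[float],
    sf_scores: Sequence[float],
    sf_lower_is_better: bool = True,
) -> List[float]:
    """Average an RF near-native probability with a rescaled traditional SF score.

    Hypothetical helper: the rescaling makes the two scores comparable before
    averaging, so that a higher combined value always means "more near-native".
    """
    if len(rf_probs) != len(sf_scores):
        raise ValueError("rf_probs and sf_scores must have the same length")
    sf_norm = min_max(sf_scores)
    if sf_lower_is_better:
        # Energy-like scores: flip so that the best (lowest) score becomes 1.0.
        sf_norm = [1.0 - s for s in sf_norm]
    return [(p + s) / 2.0 for p, s in zip(rf_probs, sf_norm)]


def rank(scores: Sequence[float]) -> List[int]:
    """Return conformation indices ordered from best to worst combined score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy data: three docked conformations.
rf = [0.9, 0.2, 0.6]          # RF probability of being near-native
sf = [-10.0, -30.0, -20.0]    # energy-like SF score, lower is better
combined = meta_scores(rf, sf)
order = rank(combined)
```

In this toy case the second conformation ranks first because its strong SF score outweighs its low RF probability, which shows how a plain average lets either component dominate.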
Affiliation(s)
- Yong Jung
  - Bioinformatics & Genomics Graduate Program, Pennsylvania State University, University Park, PA 16802, USA
  - Artificial Intelligence Research Laboratory, Pennsylvania State University, University Park, PA 16802, USA
  - Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802, USA
- Cunliang Geng
  - Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
- Alexandre M. J. J. Bonvin
  - Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
- Li C. Xue
  - Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
  - Center for Molecular and Biomolecular Informatics, Radboudumc, Geert Grooteplein 26-28, 6525 GA Nijmegen, The Netherlands
- Vasant G. Honavar
  - Bioinformatics & Genomics Graduate Program, Pennsylvania State University, University Park, PA 16802, USA
  - Artificial Intelligence Research Laboratory, Pennsylvania State University, University Park, PA 16802, USA
  - Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802, USA
  - Clinical and Translational Sciences Institute, Pennsylvania State University, University Park, PA 16802, USA
  - College of Information Sciences & Technology, Pennsylvania State University, University Park, PA 16802, USA
  - Institute for Computational and Data Sciences, Pennsylvania State University, University Park, PA 16802, USA
  - Center for Big Data Analytics and Discovery Informatics, Pennsylvania State University, University Park, PA 16823, USA
2
Su W, Wu S, Yang Y, Guo Y, Zhang H, Su J, Chen L, Mao Z, Lan R, Cao R, Wang C, Xu H, Zhang C, Li S, Gao M, Chen X, Zheng Z, Wang B, Liu Y, Liu Z, Wang Z, Liu B, Fan X, Zhang X, Guan Y. Phosphorylation of 17β-hydroxysteroid dehydrogenase 13 at serine 33 attenuates nonalcoholic fatty liver disease in mice. Nat Commun 2022; 13:6577. [PMID: 36323699 PMCID: PMC9630536 DOI: 10.1038/s41467-022-34299-1] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 01/20/2022] [Accepted: 10/20/2022] [Indexed: 11/06/2022] Open
Abstract
17β-hydroxysteroid dehydrogenase-13 (17β-HSD13) is a hepatocyte-specific, lipid droplet-associated protein. A common loss-of-function variant of HSD17B13 (rs72613567: TA) protects patients against non-alcoholic fatty liver disease, but the underlying mechanism is incompletely understood. In the present study, we identify serine 33 of 17β-HSD13 as an evolutionarily conserved PKA target site whose phosphorylation facilitates lipolysis by promoting the interaction of 17β-HSD13 with ATGL on lipid droplets. Targeted mutation of Ser33 to Ala (S33A) decreases ATGL-dependent lipolysis in cultured hepatocytes by reducing CGI-58-mediated ATGL activation. Importantly, a transgenic knock-in mouse strain carrying the HSD17B13 S33A mutation (HSD17B13^33A/A) spontaneously develops hepatic steatosis with reduced lipolysis and increased inflammation. Moreover, Hsd17B13^33A/A mice are more susceptible to high-fat diet-induced nonalcoholic steatohepatitis (NASH). Finally, we find that reproterol, a potential 17β-HSD13 modulator and FDA-approved drug, confers protection against nonalcoholic steatohepatitis via PKA-mediated Ser33 phosphorylation of 17β-HSD13. Therefore, targeting the Ser33 phosphorylation site could represent a potential approach to treat NASH.
Affiliation(s)
- Wen Su
  - Department of Pathophysiology, Shenzhen University, Shenzhen, 518060 China
  - Shenzhen University Health Science Center, Shenzhen University, Shenzhen, 518060 China
- Sijin Wu
  - State Key Laboratory of Molecular Reaction Dynamics, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, 116024 China
- Yongliang Yang
  - Laboratory of Innovative Drug Discovery, School of Bioengineering, Dalian University of Technology, Dalian, 116023 China
- Yanlin Guo
  - Health Science Center, East China Normal University, Shanghai, 200241 China
- Haibo Zhang
  - Advanced Institute for Medical Sciences, Dalian Medical University, Dalian, 116044 China
- Jie Su
  - Department of Pathophysiology, Shenzhen University, Shenzhen, 518060 China
  - Shenzhen University Health Science Center, Shenzhen University, Shenzhen, 518060 China
- Lei Chen
  - Department of Pathophysiology, Shenzhen University, Shenzhen, 518060 China
  - Shenzhen University Health Science Center, Shenzhen University, Shenzhen, 518060 China
- Zhuo Mao
  - Shenzhen University Health Science Center, Shenzhen University, Shenzhen, 518060 China
- Rongfeng Lan
  - Shenzhen University Health Science Center, Shenzhen University, Shenzhen, 518060 China
- Rong Cao
  - Department of Nephrology, The First Affiliated Hospital of Shenzhen University, Shenzhen, 518035 China
- Chunjiong Wang
  - Department of Physiology and Pathophysiology, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Medical University, Tianjin, China
- Hu Xu
  - Advanced Institute for Medical Sciences, Dalian Medical University, Dalian, 116044 China
- Cong Zhang
  - Advanced Institute for Medical Sciences, Dalian Medical University, Dalian, 116044 China
- Sha Li
  - Medical College, Hebei University of Engineering, Handan, China
- Min Gao
  - Shenzhen University Health Science Center, Shenzhen University, Shenzhen, 518060 China
- Xiaocong Chen
  - Shenzhen University Health Science Center, Shenzhen University, Shenzhen, 518060 China
- Zhiyou Zheng
  - Shenzhen University Health Science Center, Shenzhen University, Shenzhen, 518060 China
- Bing Wang
  - Advanced Institute for Medical Sciences, Dalian Medical University, Dalian, 116044 China
- Yi’ao Liu
  - Shenzhen University Health Science Center, Shenzhen University, Shenzhen, 518060 China
- Zuojun Liu
  - Shenzhen University Health Science Center, Shenzhen University, Shenzhen, 518060 China
- Zimei Wang
  - Shenzhen University Health Science Center, Shenzhen University, Shenzhen, 518060 China
- Baohua Liu
  - Shenzhen University Health Science Center, Shenzhen University, Shenzhen, 518060 China
- Xinmin Fan
  - Department of Pathophysiology, Shenzhen University, Shenzhen, 518060 China
  - Shenzhen University Health Science Center, Shenzhen University, Shenzhen, 518060 China
- Xiaoyan Zhang
  - Health Science Center, East China Normal University, Shanghai, 200241 China
- Youfei Guan
  - Advanced Institute for Medical Sciences, Dalian Medical University, Dalian, 116044 China
  - Department of Physiology and Pathophysiology, School of Basic Medical Sciences, Dalian Medical University, Dalian, 116044 China
3
Yamamori Y, Tomii K. Application of Homology Modeling by Enhanced Profile-Profile Alignment and Flexible-Fitting Simulation to Cryo-EM Based Structure Determination. Int J Mol Sci 2022; 23:1977. [PMID: 35216093 PMCID: PMC8879198 DOI: 10.3390/ijms23041977] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2021] [Revised: 02/07/2022] [Accepted: 02/09/2022] [Indexed: 12/03/2022] Open
Abstract
Application of cryo-electron microscopy (cryo-EM) is crucially important for ascertaining the atomic structure of large biomolecules such as ribosomes and protein complexes in membranes. Advances in cryo-EM technology and software have made it possible to obtain data with near-atomic resolution, but the method is still often capable of producing only a density map with up to medium resolution, either partially or entirely. Therefore, bridging the gap separating the density map and the atomic model is necessary. Herein, we propose a methodology for constructing atomic structure models based on cryo-EM maps with low-to-medium resolution. The method is a combination of sensitive and accurate homology modeling using our profile-profile alignment method with a flexible-fitting method using molecular dynamics simulation. As described herein, this study used benchmark applications to evaluate the model constructions of human two-pore channel 2 (one target protein in CASP13 with its structure determined using cryo-EM data) and the overall structure of Enterococcus hirae V-ATPase complex.
Affiliation(s)
- Yu Yamamori
  - Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
- Kentaro Tomii
  - Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
  - AIST-Tokyo Tech Real World Big-Data Computation Open Innovation Laboratory (RWBC-OIL), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
4
Narykov O, Johnson NT, Korkin D. Predicting protein interaction network perturbation by alternative splicing with semi-supervised learning. Cell Rep 2021; 37:110045. [PMID: 34818539 DOI: 10.1016/j.celrep.2021.110045] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/01/2020] [Revised: 07/21/2021] [Accepted: 11/02/2021] [Indexed: 10/19/2022] Open
Abstract
Alternative splicing introduces an additional layer of protein diversity and complexity in regulating cellular functions that can be specific to the tissue and cell type, physiological state of a cell, or disease phenotype. Recent high-throughput experimental studies have illuminated the functional role of splicing events through rewiring protein-protein interactions; however, the extent to which the macromolecular interactions are affected by alternative splicing has yet to be fully understood. In silico methods provide a fast and cheap alternative to interrogating functional characteristics of thousands of alternatively spliced isoforms. Here, we develop an accurate feature-based machine learning approach that predicts whether a protein-protein interaction carried out by a reference isoform is perturbed by an alternatively spliced isoform. Our method, called the alternatively spliced interactions prediction (ALT-IN) tool, is compared with the state-of-the-art PPI prediction tools and shows superior performance, achieving 0.92 in precision and recall values.
Affiliation(s)
- Oleksandr Narykov
  - Department of Computer Science, and Bioinformatics and Computational Biology Program, Worcester Polytechnic Institute, Worcester, MA, USA
- Nathan T Johnson
  - Department of Computer Science, and Bioinformatics and Computational Biology Program, Worcester Polytechnic Institute, Worcester, MA, USA
  - Harvard Program in Therapeutic Sciences, Harvard Medical School, and Breast Tumor Immunology Laboratory, Dana-Farber Cancer Institute, Boston, MA, USA
- Dmitry Korkin
  - Department of Computer Science, and Bioinformatics and Computational Biology Program, Worcester Polytechnic Institute, Worcester, MA, USA
5
Redesigning an antibody H3 loop by virtual screening of a small library of human germline-derived sequences. Sci Rep 2021; 11:21362. [PMID: 34725391 PMCID: PMC8560851 DOI: 10.1038/s41598-021-00669-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/15/2021] [Accepted: 10/05/2021] [Indexed: 01/01/2023] Open
Abstract
The design of superior biologic therapeutics, including antibodies and engineered proteins, involves optimizing their specific ability to bind to disease-related molecular targets. Previously, we developed and applied the Assisted Design of Antibody and Protein Therapeutics (ADAPT) platform for virtual affinity maturation of antibodies (Vivcharuk et al. in PLoS One 12(7):e0181490, 10.1371/journal.pone.0181490, 2017). However, ADAPT is limited to point mutations of hot-spot residues in existing CDR loops. In this study, we explore the possibility of wholesale replacement of the entire H3 loop with no restriction to maintain the parental loop length. This complements other currently published studies that sample replacements for the CDR loops L1, L2, L3, H1 and H2. Given the immense sequence space theoretically available to H3, we focused on the virtual grafting of over 5000 human germline-derived H3 sequences from the IMGT/LIGM database, increasing the diversity of the sequence space when compared to using crystallized H3 loop sequences. H3 loop conformations are generated and scored to identify optimized H3 sequences. Experimental testing of high-ranking H3 sequences grafted into the framework of the bH1 antibody against human VEGF-A led to the discovery of multiple hits, some of which had similar or better affinities relative to the parental antibody. In over 75% of the tested designs, the re-designed H3 loop contributed favorably to overall binding affinity. The hits also demonstrated good developability attributes such as high thermal stability and no aggregation. Crystal structures of select re-designed H3 variants were solved and indicated that although some deviations from predicted structures were seen in the more solvent accessible regions of the H3 loop, they did not significantly affect predicted affinity scores.
6
Postic G, Janel N, Moroy G. Representations of protein structure for exploring the conformational space: A speed-accuracy trade-off. Comput Struct Biotechnol J 2021; 19:2618-2625. [PMID: 34025948 PMCID: PMC8120936 DOI: 10.1016/j.csbj.2021.04.049] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/03/2021] [Revised: 04/19/2021] [Accepted: 04/20/2021] [Indexed: 11/25/2022] Open
Abstract
We compare ten structural representations, either atomistic or coarse-grained. Thus, ten distance-dependent statistical potentials of mean force (PMF) were built. The Cβ-only and Cα + Cβ representations provide the best speed–accuracy trade-off. Including glycines through Cα, in a Cβ-only representation, yields a higher accuracy. We generalize the conclusions to the total information gain (TIG) scoring function.
The recent breakthrough in the field of protein structure prediction shows the relevance of using knowledge-based scoring functions in combination with a low-resolution 3D representation of protein macromolecules. The choice of not using all atoms is barely supported by any data in the literature, and is mostly motivated by empirical and practical reasons, such as the computational cost of assessing the numerous folds of the protein conformational space. Here, we present a comprehensive study, carried out on a large and balanced benchmark of predicted protein structures, to see how different types of structural representations rank in either accuracy or calculation speed, and which ones offer the best compromise between these two criteria. We tested ten representations, including low-resolution, high-resolution, and coarse-grained approaches. We also investigated whether the findings generalize to formalisms other than the widely used “potential of mean force” (PMF) method. Thus, we observed that representing protein structures by their β carbons, combined or not with Cα, provides the best speed–accuracy trade-off when using a “total information gain” scoring function. For statistical PMFs, using MARTINI backbone and side-chain beads is the best option. Finally, we also demonstrated the necessity of training the reference state on all atom types, and of including the Cα atoms of glycine residues in a Cβ-based representation.
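For readers unfamiliar with distance-dependent statistical potentials, the general idea can be sketched with the standard inverse-Boltzmann relation, E(d) = -kT ln(P_obs(d) / P_ref(d)). The Python sketch below is a generic illustration under that assumption; the binning, the pseudocount, the function name, and the toy counts are hypothetical and are not taken from the study.

```python
import math
from typing import List, Sequence


def pmf_from_counts(
    observed: Sequence[float],
    reference: Sequence[float],
    kT: float = 1.0,
    pseudocount: float = 1e-6,
) -> List[float]:
    """Turn per-distance-bin pair counts into a statistical potential.

    observed  : counts of an atom-type pair per distance bin in known structures
    reference : counts per bin in the chosen reference state
    The small pseudocount (an assumed convention) avoids log(0) for empty bins.
    """
    if len(observed) != len(reference):
        raise ValueError("observed and reference must have the same number of bins")
    obs_total, ref_total = sum(observed), sum(reference)
    if obs_total <= 0 or ref_total <= 0:
        raise ValueError("count totals must be positive")
    energies = []
    for o, r in zip(observed, reference):
        p_obs = o / obs_total + pseudocount
        p_ref = r / ref_total + pseudocount
        # Bins seen more often than expected get a negative (favorable) energy.
        energies.append(-kT * math.log(p_obs / p_ref))
    return energies


# Toy example with three distance bins.
energies = pmf_from_counts(observed=[10, 30, 60], reference=[20, 30, 50])
```

Here the first bin is under-represented relative to the reference and receives a positive energy, the second matches the reference and is neutral, and the third is over-represented and receives a negative energy.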
Affiliation(s)
- Guillaume Postic
  - Université de Paris, BFA, UMR 8251, CNRS, ERL U1133, Inserm, F-75013 Paris, France
  - Corresponding author.
- Nathalie Janel
  - Université de Paris, BFA, UMR 8251, CNRS, F-75013 Paris, France
- Gautier Moroy
  - Université de Paris, BFA, UMR 8251, CNRS, ERL U1133, Inserm, F-75013 Paris, France
7
Wang Q, Chen F, Liu P, Mu Y, Sun S, Yuan X, Shang P, Ji B. Scaffold-based analysis of nonpeptide oncogenic FTase inhibitors using multiple similarity matching, binding affinity scoring and enzyme inhibition assay. J Mol Graph Model 2021; 105:107898. [PMID: 33784524 DOI: 10.1016/j.jmgm.2021.107898] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/23/2020] [Revised: 02/25/2021] [Accepted: 03/05/2021] [Indexed: 10/21/2022]
Abstract
Oncogenic protein farnesyltransferase (FTase) is a key enzyme responsible for the lipid modification of a large number of important proteins, including Ras, and has been recognized as a druggable target in diverse cancers. Here, we report a systematic scaffold-based analysis to investigate the affinity, selectivity and cross-reactivity of nonpeptide inhibitors across ontology-enriched, disease-associated FTase mutants, by integrating multiple similarity matching, binding affinity scoring and enzyme inhibition assay. The analysis reveals that nonpeptide inhibitors are generally insensitive to FTase mutations; many of them cannot clearly select for the wild-type target over mutant enzymes. Off-target binding is therefore a common, unintended consequence of targeted therapies based on FTase inhibition. This is not unexpected, considering that the enzyme active site is highly conserved in composition, configuration and function. On the one hand, the off-target effect exposes nonpeptide inhibitors to adverse drug reactions; on the other hand, it makes the inhibitors promising candidates for new uses of old drugs. To pursue the latter, a number of unexpected mutant-inhibitor interactions involved in cancer signaling pathways are uncovered in the created profile, from which several nonpeptide inhibitors are identified as insensitive to a drug-resistant mutation. Structural analysis suggests that the inhibitor ligands can bind to the mutant active site in a manner similar to the wild-type target, although their nonbonded interactions appear to be moderately impaired by the mutation.
Affiliation(s)
- Qifei Wang
  - Department of Chest Surgery, The Second Affiliated Hospital of Shandong First Medical University, Taian, 271000, China
- Fei Chen
  - Department of Gastroenterology, The Second Affiliated Hospital of Shandong First Medical University, Taian, 271000, China
- Peng Liu
  - Department of Chest Surgery, Ningyang First People's Hospital, Taian, 271400, China
- Yushu Mu
  - Department of Chest Surgery, The Second Affiliated Hospital of Shandong First Medical University, Taian, 271000, China
- Shibin Sun
  - Department of Chest Surgery, The Second Affiliated Hospital of Shandong First Medical University, Taian, 271000, China
- Xulong Yuan
  - Department of Chest Surgery, The Second Affiliated Hospital of Shandong First Medical University, Taian, 271000, China
- Pan Shang
  - Department of Chest Surgery, The Second Affiliated Hospital of Shandong First Medical University, Taian, 271000, China
- Bo Ji
  - Department of Chest Surgery, The Second Affiliated Hospital of Shandong First Medical University, Taian, 271000, China
8
Li T, Kong L, Li X, Wu S, Attri KS, Li Y, Gong W, Zhao B, Li L, Herring LE, Asara JM, Xu L, Luo X, Lei YL, Ma Q, Seveau S, Gunn JS, Cheng X, Singh PK, Green DR, Wang H, Wen H. Listeria monocytogenes upregulates mitochondrial calcium signalling to inhibit LC3-associated phagocytosis as a survival strategy. Nat Microbiol 2021; 6:366-379. [PMID: 33462436 PMCID: PMC8323152 DOI: 10.1038/s41564-020-00843-2] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 08/13/2019] [Accepted: 11/27/2020] [Indexed: 01/29/2023]
Abstract
Mitochondria are believed to have originated ~2.5 billion years ago. As well as energy generation in cells, mitochondria have a role in defence against bacterial pathogens. Despite profound changes in mitochondrial morphology and functions following bacterial challenge, whether intracellular bacteria can hijack mitochondria to promote their survival remains elusive. We report that Listeria monocytogenes, an intracellular bacterial pathogen, suppresses LC3-associated phagocytosis (LAP) by modulation of mitochondrial Ca2+ (mtCa2+) signalling in order to survive inside cells. Invasion of macrophages by L. monocytogenes induced mtCa2+ uptake through the mtCa2+ uniporter (MCU), which in turn increased acetyl-coenzyme A (acetyl-CoA) production by pyruvate dehydrogenase. Acetylation of the LAP effector Rubicon with acetyl-CoA decreased LAP formation. Genetic ablation of MCU attenuated intracellular bacterial growth due to increased LAP formation. Our data show that modulation of mtCa2+ signalling can increase bacterial survival inside cells, and highlight the importance of mitochondrial metabolism in host-microbial interactions.
Affiliation(s)
- Tianliang Li
  - Department of Microbial Infection and Immunity, The Ohio State University, Columbus, OH, USA
- Ligang Kong
  - Shandong Institute of Otolaryngology, Department of Otolaryngology-Head and Neck Surgery, Shandong Provincial ENT Hospital, Shandong University, Jinan, China
- Xinghui Li
  - Department of Microbial Infection and Immunity, The Ohio State University, Columbus, OH, USA
- Sijin Wu
  - College of Pharmacy, Medicinal Chemistry and Pharmacognosy, The Ohio State University, Columbus, OH, USA
- Kuldeep S Attri
  - Eppley Institute for Research in Cancer and Allied Diseases, University of Nebraska Medical Center, Omaha, NE, USA
- Yan Li
  - Department of Physiology and Cell Biology, The Ohio State University, Columbus, OH, USA
- Weipeng Gong
  - Department of Microbial Infection and Immunity, The Ohio State University, Columbus, OH, USA
- Bao Zhao
  - Department of Microbial Infection and Immunity, The Ohio State University, Columbus, OH, USA
- Lupeng Li
  - Department of Microbiology and Immunology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Laura E Herring
  - Proteomics Core Facility, Department of Pharmacology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- John M Asara
  - Division of Signal Transduction, Beth Israel Deaconess Medical Center and Department of Medicine, Harvard Medical School, Boston, MA, USA
- Lei Xu
  - Shandong Institute of Otolaryngology, Department of Otolaryngology-Head and Neck Surgery, Shandong Provincial ENT Hospital, Shandong University, Jinan, China
- Xiaobo Luo
  - Department of Periodontics and Oral Medicine, University of Michigan School of Dentistry, Ann Arbor, MI, USA
- Yu L Lei
  - Department of Periodontics and Oral Medicine, University of Michigan School of Dentistry, Ann Arbor, MI, USA
- Qin Ma
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, USA
- Stephanie Seveau
  - Department of Microbial Infection and Immunity, The Ohio State University, Columbus, OH, USA
- John S Gunn
  - Center for Microbial Pathogenesis, Abigail Wexner Research Institute at Nationwide Children's Hospital, Columbus, OH, USA
- Xiaolin Cheng
  - College of Pharmacy, Medicinal Chemistry and Pharmacognosy, The Ohio State University, Columbus, OH, USA
- Pankaj K Singh
  - Eppley Institute for Research in Cancer and Allied Diseases, University of Nebraska Medical Center, Omaha, NE, USA
- Douglas R Green
  - Department of Immunology, St. Jude Children's Research Hospital, Memphis, TN, USA
- Haibo Wang
  - Shandong Institute of Otolaryngology, Department of Otolaryngology-Head and Neck Surgery, Shandong Provincial ENT Hospital, Shandong University, Jinan, China
- Haitao Wen
  - Department of Microbial Infection and Immunity, The Ohio State University, Columbus, OH, USA
9
Stam MJ, Wood CW. DE-STRESS: a user-friendly web application for the evaluation of protein designs. Protein Eng Des Sel 2021; 34:gzab029. [PMID: 34908138 PMCID: PMC8672653 DOI: 10.1093/protein/gzab029] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/07/2021] [Revised: 10/11/2021] [Accepted: 10/25/2021] [Indexed: 11/16/2022] Open
Abstract
De novo protein design is a rapidly growing field, and there are now many interesting and useful examples of designed proteins in the literature. However, most designs could be classed as failures when characterised in the lab, usually as a result of low expression, misfolding, aggregation or lack of function. This high attrition rate makes protein design unreliable and costly. It is possible that some of these failures could be caught earlier in the design process if it were quick and easy to generate information and a set of high-quality metrics regarding designs, which could be used to make reproducible and data-driven decisions about which designs to characterise experimentally. We present DE-STRESS (DEsigned STRucture Evaluation ServiceS), a web application for evaluating structural models of designed and engineered proteins. DE-STRESS has been designed to be simple, intuitive to use and responsive. It provides a wealth of information regarding designs, as well as tools to help contextualise the results and formally describe the properties that a design requires to be fit for purpose.
Affiliation(s)
- Michael J Stam
  - School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, UK
- Christopher W Wood
  - School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3FF, UK
10
Hernandez R, Facelli JC. Understanding protein structural changes for oncogenic missense variants. Heliyon 2021; 7:e06013. [PMID: 33553733 PMCID: PMC7846930 DOI: 10.1016/j.heliyon.2021.e06013] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/14/2020] [Revised: 08/20/2020] [Accepted: 01/15/2021] [Indexed: 12/31/2022] Open
Abstract
Understanding and predicting the changes of protein structure and function upon mutation and their relationship to human health is a critical element to translate the genomic revolution into actionable interventions. Therefore, it is pertinent to explore how mutations result in structural changes leading to pathogenic proteins, but due to the protein structural knowledge gap, experimental approaches are lacking. Protein structure prediction methods, such as I-TASSER, have made it possible to predict the structure of a given amino acid sequence, thus opening a new way to explore protein structure changes upon mutations when experimental information is not available. Using known mutations from the Catalogue of Somatic Mutation in Cancer (COSMIC) and ClinVar databases, we compare predicted structure-derived properties from wild type (WT) and mutated proteins and find differences between the local and global 3D protein structures of the WT and the mutants. The studies in this relatively small sample reveal that the structural changes are quite diverse.
Affiliation(s)
- Rolando Hernandez
  - Department of Biomedical Informatics and Center for Clinical and Translational Science, The University of Utah, Salt Lake City, Utah, USA
- Julio C. Facelli
  - Department of Biomedical Informatics and Center for Clinical and Translational Science, The University of Utah, Salt Lake City, Utah, USA
11
Chen Y, Lu H, Zhang N, Zhu Z, Wang S, Li M. PremPS: Predicting the impact of missense mutations on protein stability. PLoS Comput Biol 2020; 16:e1008543. [PMID: 33378330 PMCID: PMC7802934 DOI: 10.1371/journal.pcbi.1008543] [Citation(s) in RCA: 93] [Impact Index Per Article: 23.3] [Received: 07/23/2020] [Revised: 01/12/2021] [Accepted: 11/16/2020] [Indexed: 12/12/2022] Open
Abstract
Computational methods that predict protein stability changes induced by missense mutations have made considerable progress over the past decades. Most of the available methods, however, have very limited accuracy in predicting stabilizing mutations because existing experimental sets are dominated by mutations that reduce protein stability. Moreover, few approaches perform consistently well across different test cases. To address these issues, we developed a new computational method, PremPS, to more accurately evaluate the effects of missense mutations on protein stability. The PremPS method is composed of only ten evolutionary- and structure-based features and is parameterized on a balanced dataset with an equal number of stabilizing and destabilizing mutations. A comprehensive comparison of the predictive performance of PremPS with other available methods on nine benchmark datasets confirms that our approach consistently outperforms other methods and shows considerable improvement in estimating the impacts of stabilizing mutations. A protein could have multiple structures available, and if another structure of the same protein is used, the predicted change in stability for structure-based methods might be different. Thus, we further estimated the impact of using different structures on prediction accuracy, and demonstrate that our method performs well across different types of structures except for low-resolution structures and models built based on templates with low sequence identity. PremPS can be used for finding functionally important variants, revealing the molecular mechanisms of functional influences, and protein design. PremPS is freely available at https://lilab.jysw.suda.edu.cn/research/PremPS/. It supports large-scale mutational scanning, takes about four minutes to perform calculations for a single mutation per protein with ~300 residues, and requires ~0.4 seconds for each additional mutation.
Affiliation(s)
- Yuting Chen
  - Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
- Haoyu Lu
  - Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
- Ning Zhang
  - Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
- Zefeng Zhu
  - Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
- Shuqin Wang
  - Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
- Minghui Li
  - Center for Systems Biology, Department of Bioinformatics, School of Biology and Basic Medical Sciences, Soochow University, Suzhou, China
|
12
|
Chang RL, Stanley JA, Robinson MC, Sher JW, Li Z, Chan YA, Omdahl AR, Wattiez R, Godzik A, Matallana-Surget S. Protein structure, amino acid composition and sequence determine proteome vulnerability to oxidation-induced damage. EMBO J 2020; 39:e104523. [PMID: 33073387 PMCID: PMC7705453 DOI: 10.15252/embj.2020104523] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 03/09/2020] [Revised: 09/16/2020] [Accepted: 09/22/2020] [Indexed: 02/05/2023] Open
Abstract
Oxidative stress alters cell viability, from microorganism irradiation sensitivity to human aging and neurodegeneration. Deleterious effects of protein carbonylation by reactive oxygen species (ROS) make understanding molecular properties determining ROS susceptibility essential. The radiation‐resistant bacterium Deinococcus radiodurans accumulates less carbonylation than sensitive organisms, making it a key model for deciphering properties governing oxidative stress resistance. We integrated shotgun redox proteomics, structural systems biology, and machine learning to resolve properties determining protein damage by γ‐irradiation in Escherichia coli and D. radiodurans at multiple scales. Local accessibility, charge, and lysine enrichment accurately predict ROS susceptibility. Lysine, methionine, and cysteine usage also contribute to ROS resistance of the D. radiodurans proteome. Our model predicts proteome maintenance machinery, and proteins protecting against ROS are more resistant in D. radiodurans. Our findings substantiate that protein‐intrinsic protection impacts oxidative stress resistance, identifying causal molecular properties.
Affiliation(s)
- Roger L Chang
  - Department of Systems Biology, Blavatnik Institute at Harvard Medical School, Boston, MA, USA
  - Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA
- Julian A Stanley
  - Department of Systems Biology, Blavatnik Institute at Harvard Medical School, Boston, MA, USA
- Matthew C Robinson
  - Department of Systems Biology, Blavatnik Institute at Harvard Medical School, Boston, MA, USA
- Joel W Sher
  - Department of Systems Biology, Blavatnik Institute at Harvard Medical School, Boston, MA, USA
- Zhanwen Li
  - Division of Biomedical Sciences, University of California Riverside School of Medicine, Riverside, CA, USA
- Yujia A Chan
  - Department of Systems Biology, Blavatnik Institute at Harvard Medical School, Boston, MA, USA
  - Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA, USA
- Ashton R Omdahl
  - Department of Systems Biology, Blavatnik Institute at Harvard Medical School, Boston, MA, USA
- Ruddy Wattiez
  - Department of Proteomics and Microbiology, Research Institute for Biosciences, University of Mons, Mons, Belgium
- Adam Godzik
  - Division of Biomedical Sciences, University of California Riverside School of Medicine, Riverside, CA, USA
- Sabine Matallana-Surget
  - Division of Biological and Environmental Sciences, Faculty of Natural Sciences, University of Stirling, Stirling, UK
|
13
|
Barozet A, Bianciotto M, Vaisset M, Siméon T, Minoux H, Cortés J. Protein loops with multiple meta-stable conformations: A challenge for sampling and scoring methods. Proteins 2020; 89:218-231. [PMID: 32920900 DOI: 10.1002/prot.26008] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/29/2019] [Revised: 08/10/2020] [Accepted: 08/25/2020] [Indexed: 12/25/2022]
Abstract
Flexible regions in proteins, such as loops, cannot be represented by a single conformation. Instead, conformational ensembles are needed to provide a more global picture. In this context, identifying statistically meaningful conformations within an ensemble generated by loop sampling techniques remains an open problem. The difficulty is primarily related to the lack of structural data about these flexible regions. With the majority of structural data coming from X-ray crystallography and ignoring plasticity, the conception and evaluation of loop scoring methods are challenging. In this work, we compare the performance of various scoring methods on a set of eight protein loops that are known to be flexible. The ability of each method to identify and select all of the known conformations is assessed, and the underlying energy landscapes are produced and projected to visualize the qualitative differences obtained when using the methods. Statistical potentials are found to provide considerable reliability despite being designed to trade off accuracy for lower computational cost. On a large pool of loop models, they are capable of filtering out statistically improbable states while retaining those that resemble known (and thus likely) conformations. However, computationally expensive methods are still required for more precise assessment and structural refinement. The results also highlight the importance of employing several scaffolds for the protein, because small structural rearrangements in the rest of the protein strongly influence the modeled energy landscape for the loop.
Affiliation(s)
- Amélie Barozet
  - LAAS-CNRS, Université de Toulouse, CNRS, Toulouse, France
  - Sanofi Recherche & Développement, Integrated Drug Discovery, Molecular Design Sciences, Vitry-sur-Seine, France
- Marc Bianciotto
  - Sanofi Recherche & Développement, Integrated Drug Discovery, Molecular Design Sciences, Vitry-sur-Seine, France
- Marc Vaisset
  - LAAS-CNRS, Université de Toulouse, CNRS, Toulouse, France
- Thierry Siméon
  - LAAS-CNRS, Université de Toulouse, CNRS, Toulouse, France
- Hervé Minoux
  - Sanofi Recherche & Développement, Integrated Drug Discovery, Molecular Design Sciences, Vitry-sur-Seine, France
- Juan Cortés
  - LAAS-CNRS, Université de Toulouse, CNRS, Toulouse, France
|
14
|
Postic G, Janel N, Tufféry P, Moroy G. An information gain-based approach for evaluating protein structure models. Comput Struct Biotechnol J 2020; 18:2228-2236. [PMID: 32837711 PMCID: PMC7431362 DOI: 10.1016/j.csbj.2020.08.013] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/01/2020] [Revised: 08/06/2020] [Accepted: 08/07/2020] [Indexed: 12/23/2022] Open
Abstract
For three decades now, knowledge-based scoring functions that operate through the "potential of mean force" (PMF) approach have continuously proven useful for studying protein structures. Although these statistical potentials are not to be confused with their physics-based counterparts of the same name (i.e., PMFs obtained by molecular dynamics simulations), their particular success in assessing the native-like character of protein structure predictions has led authors to consider the computed scores as approximations of the free energy. However, this physical justification has been a matter of controversy since the beginning. Alternative interpretations based on Bayes' theorem have been proposed, but the misleading formalism that invokes the inverse Boltzmann law remains recurrent in the literature. In this article, we present a conceptually new method for ranking protein structure models by quality, which is (i) independent of any physics-based explanation and (ii) relevant to statistics and to a general definition of information gain. The theoretical development described in this study provides new insights into how statistical PMFs work, in comparison with our approach. To prove the concept, we have built interatomic distance-dependent scoring functions, based on the former and new equations, and compared their performance on an independent benchmark of 60,000 protein structures. The results demonstrate that our new formalism outperforms statistical PMFs in evaluating the quality of protein structural decoys. Therefore, this original type of score offers a possibility to improve the success of statistical PMFs in the various fields of structural biology where they are applied. The open-source code is available for download at https://gitlab.rpbs.univ-paris-diderot.fr/src/ig-score.
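For readers unfamiliar with the conventional formalism that this abstract contrasts itself with, the following is a minimal sketch of a distance-dependent statistical PMF built from the inverse Boltzmann relation, score = -kT ln(p_obs / p_ref). It does not reproduce the paper's information-gain score, whose equation is not given in the abstract. The function name, the pseudocount, the kT value (roughly room temperature in kcal/mol), and the histograms are all illustrative assumptions.

```python
import math

def statistical_pmf(observed_counts, reference_counts, kT=0.593, pseudocount=1.0):
    """Per-bin inverse-Boltzmann score: -kT * ln(p_obs / p_ref)."""
    if len(observed_counts) != len(reference_counts):
        raise ValueError("histograms must have the same number of bins")
    n_bins = len(observed_counts)
    # Pseudocounts keep empty bins from producing log(0) or a division by zero.
    obs_total = sum(observed_counts) + pseudocount * n_bins
    ref_total = sum(reference_counts) + pseudocount * n_bins
    scores = []
    for obs, ref in zip(observed_counts, reference_counts):
        p_obs = (obs + pseudocount) / obs_total
        p_ref = (ref + pseudocount) / ref_total
        scores.append(-kT * math.log(p_obs / p_ref))
    return scores

# Hypothetical distance histograms for one atom-pair type (counts per distance bin).
observed = [5, 40, 30, 25]
reference = [25, 25, 25, 25]
print(statistical_pmf(observed, reference))
```

Bins that are observed more often than the reference predicts receive negative (favorable) scores, and under-represented bins receive positive ones. A model's total score under this scheme would be the sum of the per-bin scores over all of its atom pairs.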
Affiliation(s)
- Guillaume Postic
  - Université de Paris, BFA, UMR 8251, CNRS, ERL U1133, Inserm, F-75013 Paris, France
  - Université de Paris, BFA, UMR 8251, CNRS, F-75013 Paris, France
  - Institut Français de Bioinformatique (IFB), UMS 3601-CNRS, Université Paris-Saclay, Orsay, France
  - Ressource Parisienne en Bioinformatique Structurale (RPBS), Paris, France
- Nathalie Janel
  - Université de Paris, BFA, UMR 8251, CNRS, F-75013 Paris, France
- Pierre Tufféry
  - Université de Paris, BFA, UMR 8251, CNRS, ERL U1133, Inserm, F-75013 Paris, France
  - Ressource Parisienne en Bioinformatique Structurale (RPBS), Paris, France
- Gautier Moroy
  - Université de Paris, BFA, UMR 8251, CNRS, ERL U1133, Inserm, F-75013 Paris, France
|
15
|
Abstract
Atom pairwise potential functions make up an essential part of many scoring functions for protein decoy detection. With the development of machine learning (ML) tools, there are multiple ways to combine potential functions to create novel ML models and methods. Potential function parameters can be easily extracted; however, it is usually hard to directly obtain the calculated atom pairwise energies from scoring functions. Amber, as one of the most popular suites of modeling programs, has an extensive history and library of force field potential functions. In this work, we directly used the force field parameters in ff94 and ff14SB from Amber and encoded them to calculate atom pairwise energies for different interactions. Two sets of structures (a single amino acid set and a dipeptide set) were used to evaluate the performance of our encoded Amber potentials. Comparing the energy terms obtained from our encoding with those from Amber, we find energy differences within ±0.06 kcal/mol for all tested structures. Previously we have shown that the Random Forest (RF) model can help to emphasize more important atom pairwise interactions and ignore insignificant ones [Pei, J.; Zheng, Z.; Merz, K. M. J. Chem. Inf. Model. 2019, 59, 1919-1929]. Here, as an example of combining ML methods with traditional potential functions, we followed the same workflow to combine the RF models with force field potential functions from Amber. To determine the performance of our RF models with force field potential functions, 224 different protein native-decoy systems were used as our training and testing sets. We find that the RF models with ff94 and ff14SB force field parameters outperformed all other scoring functions (RF models with KECSA2, RWplus, DFIRE, dDFIRE, and GOAP) considered in this work for native structure detection, and they performed similarly in detecting the best decoy. Through inclusion of best decoy to decoy comparisons in building our RF models, we were able to generate models that outperformed the score functions tested herein both on accuracy and best decoy detection, again showing the performance and flexibility of our RF models to tackle this problem. Finally, the importance of the RF algorithm and force field parameters was also tested, and the comparison results suggest that both the RF algorithm and force field potentials are important, with the ML scoring function achieving its best performance only by combining them together. All code and data used in this work are available at https://github.com/JunPei000/FFENCODER_for_Protein_Folding_Pose_Selection.
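As a rough illustration of what an atom pairwise nonbonded energy looks like in an Amber-style force field, the sketch below evaluates a 12-6 Lennard-Jones term plus a Coulomb term for a single atom pair. It is not the authors' encoding (that is in the linked repository). The function names are hypothetical, the atom parameters are placeholders rather than values from ff94 or ff14SB, and the sketch ignores bonded terms, exclusions, and 1-4 scaling.

```python
import math

# Conversion factor so that charges in units of e and distances in Angstrom
# give energies in kcal/mol (the value conventionally used by Amber).
COULOMB_CONSTANT = 332.0522

def lennard_jones(r, rmin_half_i, rmin_half_j, eps_i, eps_j):
    """12-6 Lennard-Jones energy with Lorentz-Berthelot combining rules."""
    rmin = rmin_half_i + rmin_half_j      # arithmetic mean of the two Rmin values
    eps = math.sqrt(eps_i * eps_j)        # geometric mean of the well depths
    ratio6 = (rmin / r) ** 6
    return eps * (ratio6 ** 2 - 2.0 * ratio6)

def coulomb(r, q_i, q_j, dielectric=1.0):
    """Coulomb energy between two point charges."""
    return COULOMB_CONSTANT * q_i * q_j / (dielectric * r)

def pair_energy(r, atom_i, atom_j):
    """Sum of van der Waals and electrostatic energy for one atom pair."""
    vdw = lennard_jones(r, atom_i["rmin_half"], atom_j["rmin_half"],
                        atom_i["eps"], atom_j["eps"])
    elec = coulomb(r, atom_i["charge"], atom_j["charge"])
    return vdw + elec

# Placeholder parameters (illustrative only, not taken from any Amber parameter file).
atom_a = {"rmin_half": 1.9, "eps": 0.10, "charge": -0.4}
atom_b = {"rmin_half": 1.5, "eps": 0.02, "charge": 0.3}
print(pair_energy(4.0, atom_a, atom_b))
```

Per-pair energies of this kind, computed for every atom-type pair in a structure, are the sort of features the abstract describes feeding to the RF models.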
Affiliation(s)
- Jun Pei
  - Department of Chemistry and the Department of Biochemistry and Molecular Biology, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
- Lin Frank Song
  - Department of Chemistry and the Department of Biochemistry and Molecular Biology, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
- Kenneth M Merz
  - Department of Chemistry and the Department of Biochemistry and Molecular Biology, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
|
16
|
Bi J, Chen S, Zhao X, Nie Y, Xu Y. Computation-aided engineering of starch-debranching pullulanase from Bacillus thermoleovorans for enhanced thermostability. Appl Microbiol Biotechnol 2020; 104:7551-7562. [PMID: 32632476 DOI: 10.1007/s00253-020-10764-z] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 04/16/2020] [Revised: 06/17/2020] [Accepted: 06/30/2020] [Indexed: 12/26/2022]
Abstract
Pullulanases are widely used in food, medicine, and other industries because they specifically hydrolyze α-1,6-glycosidic linkages in starch and oligosaccharides. In addition, high-temperature thermostable pullulanase has multiple advantages, including decreased viscosity of the saccharification solution accompanied by enhanced mass transfer, and reduced microbial contamination in starch hydrolysis. However, thermophilic pullulanase availability remains limited. Additionally, most do not meet starch-manufacturing requirements due to weak thermostability. Here, we developed a computation-aided strategy to engineer the thermophilic pullulanase from Bacillus thermoleovorans. First, three computational design predictors (FoldX, I-Mutant 3.0, and dDFIRE) were combined to predict stability changes introduced by mutations. After excluding conserved and catalytic sites, 17 mutants were identified. After further experimental verification, we confirmed six positive mutants. Among them, the G692M mutant had the highest thermostability improvement, with a 3.8 °C increase in Tm and a 2.1-fold longer half-life than the wild type at 70 °C. We then characterized the mechanism underlying increased thermostability, such as rigidity enhancement, closer conformation, and strengthened motion correlation, using root mean square fluctuation (RMSF), principal component analysis (PCA), dynamic cross-correlation map (DCCM), and free energy landscape (FEL) analysis. KEY POINTS: • A computation-aided strategy was developed to engineer pullulanase thermostability. • Seventeen mutants were identified by combining three computational design predictors. • The G692M mutant was obtained with increased Tm and half-life at 70 °C.
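The abstract does not say how the three predictors were combined. One simple possibility, sketched here purely as an assumption, is a unanimous-vote filter that keeps a mutation only when every predictor calls it stabilizing. Because tools differ in whether a negative or a positive value means "stabilizing", the sign convention is passed in explicitly. The predictor names, mutation labels, and values below are hypothetical placeholders, not output from FoldX, I-Mutant 3.0, or dDFIRE.

```python
def consensus_stabilizing(predictions, stabilizing_if_negative):
    """Return mutations that every predictor classifies as stabilizing.

    predictions: {mutation: {predictor_name: predicted_value}}
    stabilizing_if_negative: {predictor_name: True if negative values mean
        stabilizing, False if positive values mean stabilizing}
    """
    selected = []
    for mutation, values in predictions.items():
        votes = []
        for predictor, negative_is_stabilizing in stabilizing_if_negative.items():
            value = values[predictor]
            votes.append(value < 0 if negative_is_stabilizing else value > 0)
        if all(votes):
            selected.append(mutation)
    return selected

# Hypothetical predictors and scores, for illustration only.
conventions = {"predictor_a": True, "predictor_b": False, "predictor_c": True}
scores = {
    "A10V": {"predictor_a": -1.2, "predictor_b": 0.8, "predictor_c": -0.5},
    "S25P": {"predictor_a": -0.4, "predictor_b": -0.3, "predictor_c": -0.9},
    "K40E": {"predictor_a": 0.7, "predictor_b": 0.2, "predictor_c": 0.1},
}
print(consensus_stabilizing(scores, conventions))
```

A unanimous vote trades recall for precision, which suits a workflow where each candidate must then be verified experimentally. A majority vote or a weighted score would be equally plausible alternatives.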
Affiliation(s)
- Jiahua Bi
  - School of Biotechnology and Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, Wuxi, 214122, China
- Shuhui Chen
  - School of Biotechnology and Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, Wuxi, 214122, China
- Xianghan Zhao
  - School of Biotechnology and Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, Wuxi, 214122, China
- Yao Nie
  - School of Biotechnology and Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, Wuxi, 214122, China
  - Suqian Industrial Technology Research Institute of Jiangnan University, Suqian, 223814, China
- Yan Xu
  - School of Biotechnology and Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, Wuxi, 214122, China
  - State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi, 214122, China
|
17
|
Broom A, Trainor K, Jacobi Z, Meiering EM. Computational Modeling of Protein Stability: Quantitative Analysis Reveals Solutions to Pervasive Problems. Structure 2020; 28:717-726.e3. [DOI: 10.1016/j.str.2020.04.003] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 12/22/2019] [Revised: 03/26/2020] [Accepted: 04/06/2020] [Indexed: 12/20/2022]
|
18
|
Olechnovič K, Venclovas Č. VoroMQA web server for assessing three-dimensional structures of proteins and protein complexes. Nucleic Acids Res 2020; 47:W437-W442. [PMID: 31073605 PMCID: PMC6602437 DOI: 10.1093/nar/gkz367] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 03/04/2019] [Revised: 04/19/2019] [Accepted: 05/05/2019] [Indexed: 01/12/2023] Open
Abstract
The VoroMQA (Voronoi tessellation-based Model Quality Assessment) web server is dedicated to the estimation of protein structure quality, a common step in selecting realistic and most accurate computational models and in validating experimental structures. As an input, the VoroMQA web server accepts one or more protein structures in PDB format. Input structures may be either monomeric proteins or multimeric protein complexes. For every input structure, the server provides both global and local (per-residue) scores. Visualization of the local scores along the protein chain is enhanced by providing secondary structure assignment and information on solvent accessibility. A unique feature of the VoroMQA server is the ability to directly assess protein-protein interaction interfaces. If this type of assessment is requested, the web server provides interface quality scores, interface energy estimates, and local scores for residues involved in inter-chain interfaces. VoroMQA, the underlying method of the web server, was extensively tested in recent community-wide CASP and CAPRI experiments. During these experiments VoroMQA showed outstanding performance both in model selection and in estimation of accuracy of local structural regions. The VoroMQA web server is available at http://bioinformatics.ibt.lt/wtsam/voromqa.
Affiliation(s)
- Kliment Olechnovič
  - Institute of Biotechnology, Life Sciences Center, Vilnius University, Saulėtekio av. 7, Vilnius LT-10257, Lithuania
- Česlovas Venclovas
  - Institute of Biotechnology, Life Sciences Center, Vilnius University, Saulėtekio av. 7, Vilnius LT-10257, Lithuania
|
19
|
Sen Gupta PS, Islam RNUI, Banerjee S, Nayek A, Rana MK, Bandyopadhyay AK. Screening and molecular characterization of lethal mutations of human homogentisate 1, 2 dioxigenase. J Biomol Struct Dyn 2020; 39:1661-1671. [PMID: 32107984 DOI: 10.1080/07391102.2020.1736158] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 01/29/2023]
Affiliation(s)
- Parth Sarthi Sen Gupta
  - Department of Biotechnology, The University of Burdwan, Bardhaman, West Bengal, India
  - Department of Chemical Sciences, Indian Institute of Science Education and Research (IISER) Berhampur, Ganjam, Odisha, India
- Rifat Nawaz UI Islam
  - Department of Biotechnology, The University of Burdwan, Bardhaman, West Bengal, India
- Sahini Banerjee
  - Department of Biological Sciences, Indian Statistical Institute, Kolkata, West Bengal, India
- Arnab Nayek
  - Department of Biotechnology, The University of Burdwan, Bardhaman, West Bengal, India
- Malay Kumar Rana
  - Department of Chemical Sciences, Indian Institute of Science Education and Research (IISER) Berhampur, Ganjam, Odisha, India
|
20
|
Chen S, Sun Z, Lin L, Liu Z, Liu X, Chong Y, Lu Y, Zhao H, Yang Y. To Improve Protein Sequence Profile Prediction through Image Captioning on Pairwise Residue Distance Map. J Chem Inf Model 2019; 60:391-399. [PMID: 31800243 DOI: 10.1021/acs.jcim.9b00438] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Indexed: 11/30/2022]
Abstract
Protein sequence profile prediction aims to generate multiple sequences from structural information to advance protein design. A protein sequence profile can be computationally predicted by energy-based or fragment-based methods. By integrating these methods with neural networks, our previous method, SPIN2, achieved a sequence recovery rate of 34%. However, SPIN2 employed only one-dimensional (1D) structural properties that are not sufficient to represent three-dimensional (3D) structures. In this study, we represented 3D structures by 2D maps of pairwise residue distances and developed a new method (SPROF) to predict protein sequence profiles based on an image captioning learning framework. To the best of our knowledge, this is the first method to employ a 2D distance map for predicting protein properties. SPROF achieved 39.8% in sequence recovery of residues on the independent test set, representing a 5.2% improvement over SPIN2. We also found that sequence recovery increased with the number of neighboring residues in 3D structural space, indicating that our method can effectively learn long-range information from the 2D distance map. Thus, such a network architecture using a 2D distance map is expected to be useful for other 3D structure-based applications, such as binding site prediction, protein function prediction, and protein interaction prediction. The online server and the source code are available at http://biomed.nscc-gz.cn and https://github.com/biomed-AI/SPROF, respectively.
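The two quantities this abstract relies on, a 2D map of pairwise residue distances and the sequence recovery rate, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not SPROF's preprocessing: it assumes one representative coordinate per residue (for example the Cα atom), and the function names and sample coordinates are hypothetical.

```python
import math

def distance_map(coords):
    """Symmetric matrix of Euclidean distances between residue coordinates."""
    n = len(coords)
    dmap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            dmap[i][j] = d
            dmap[j][i] = d
    return dmap

def sequence_recovery(predicted, native):
    """Fraction of positions where the predicted residue equals the native one."""
    if len(predicted) != len(native) or not native:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == n for p, n in zip(predicted, native))
    return matches / len(native)

# Hypothetical coordinates (in Angstrom) for a three-residue fragment.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
for row in distance_map(coords):
    print(row)
print(sequence_recovery("ACDE", "ACDF"))
```

Representing the structure this way makes it invariant to rotation and translation, which is one reason a distance map is a convenient image-like input for a neural network.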
Affiliation(s)
- Sheng Chen
  - School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510000, China
- Zhe Sun
  - School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510000, China
- Lihua Lin
  - School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510000, China
- Zifeng Liu
  - Third Affiliated Hospital of Sun Yat-sen University, Guangzhou 510000, China
- Xun Liu
  - Third Affiliated Hospital of Sun Yat-sen University, Guangzhou 510000, China
- Yutian Chong
  - Third Affiliated Hospital of Sun Yat-sen University, Guangzhou 510000, China
- Yutong Lu
  - School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510000, China
- Huiying Zhao
  - Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou 510000, China
- Yuedong Yang
  - School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510000, China
  - Key Laboratory of Machine Intelligence and Advanced Computing (Sun Yat-sen University) of the Ministry of Education, Guangzhou 510000, China
|
21
|
Cai Y, Li X, Sun Z, Lu Y, Zhao H, Hanson J, Paliwal K, Litfin T, Zhou Y, Yang Y. SPOT-Fold: Fragment-Free Protein Structure Prediction Guided by Predicted Backbone Structure and Contact Map. J Comput Chem 2019; 41:745-750. [PMID: 31845383 DOI: 10.1002/jcc.26132] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/22/2019] [Revised: 10/07/2019] [Accepted: 12/01/2019] [Indexed: 02/01/2023]
Abstract
Protein structure determination has been one of the most challenging problems in molecular biology for the past 60 years. Here we present an ab initio protein tertiary-structure prediction method assisted by predicted contact maps from SPOT-Contact and predicted dihedral angles from SPIDER 3. These predicted properties are fed to the Crystallography and NMR System (CNS) for restrained structure modeling. The resulting structures are first evaluated by the potential energy calculated by CNS, followed by the dDFIRE energy function for model selection. The method, called SPOT-Fold, has been tested on 241 CASP targets of between 67 and 670 amino acid residues and on 60 randomly selected globular proteins under 100 amino acids. The method has an accuracy comparable to that of other contact-map-based modeling techniques. © 2019 Wiley Periodicals, Inc.
Affiliation(s)
- Yufeng Cai
  - School of Data and Computer Science, Sun Yat-Sen University, 132 East Circle at University City, Guangzhou, 510006, China
- Xiongjun Li
  - School of Data and Computer Science, Sun Yat-Sen University, 132 East Circle at University City, Guangzhou, 510006, China
- Zhe Sun
  - School of Data and Computer Science, Sun Yat-Sen University, 132 East Circle at University City, Guangzhou, 510006, China
- Yutong Lu
  - School of Data and Computer Science, Sun Yat-Sen University, 132 East Circle at University City, Guangzhou, 510006, China
- Huiying Zhao
  - Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, 510000, China
- Jack Hanson
  - Signal Processing Laboratory, Griffith University, Brisbane, Queensland, 4122, Australia
- Kuldip Paliwal
  - Signal Processing Laboratory, Griffith University, Brisbane, Queensland, 4122, Australia
- Thomas Litfin
  - Institute for Glycomics and School of Information and Communication Technology, Griffith University, Southport, Queensland, 4222, Australia
- Yaoqi Zhou
  - Institute for Glycomics and School of Information and Communication Technology, Griffith University, Southport, Queensland, 4222, Australia
- Yuedong Yang
  - School of Data and Computer Science, Sun Yat-Sen University, 132 East Circle at University City, Guangzhou, 510006, China
|
22
|
Vianney YM, Tjoa SEE, Aditama R, Dwi Putra SE. Designing a less immunogenic nattokinase from Bacillus subtilis subsp. natto: a computational mutagenesis. J Mol Model 2019; 25:337. [DOI: 10.1007/s00894-019-4225-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 12/19/2018] [Accepted: 10/09/2019] [Indexed: 12/22/2022]
|
23
|
DLIGAND2: an improved knowledge-based energy function for protein-ligand interactions using the distance-scaled, finite, ideal-gas reference state. J Cheminform 2019; 11:52. [PMID: 31392430 PMCID: PMC6686496 DOI: 10.1186/s13321-019-0373-4] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 04/02/2019] [Accepted: 07/27/2019] [Indexed: 12/14/2022] Open
Abstract
Performance of structure-based molecular docking largely depends on the accuracy of scoring functions. One important type of scoring function is the knowledge-based potential derived from known three-dimensional structures of proteins and/or protein–ligand complexes. This study seeks to improve a knowledge-based protein–ligand potential based on a distance-scaled, finite, ideal-gas reference (DFIRE) state (DLIGAND) by expanding the representation of protein atoms from 13 mol2 atom types to 167 residue-specific atom types, and by employing a recently updated dataset containing 12,450 monomer protein chains for training. We found that the updated version, DLIGAND2, shows a consistent improvement over DLIGAND in predicting binding affinities for either native complex structures or docking-generated poses. More importantly, DLIGAND2 has a 52% increase over DLIGAND in enrichment factors in the top 1% of predictions based on the DUD-E decoy set, and consistently improves over AutoDock Vina and other statistical energy functions in all three benchmark tests. We further found that DLIGAND2 outperforms the empirical and machine-learning methods compared for virtual screening on new targets that are not homologous to the DUD-E training set. Given that it performs best among parameter-free statistical potentials and ranks among the best on all performance measures, DLIGAND2 should be useful for re-assessing the poses generated by docking software, or for acting as one term in other scoring functions. The program is available at https://github.com/sysu-yanglab/DLIGAND2.
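The "enrichment factor in the top 1%" reported above is a standard virtual-screening metric: the fraction of actives among the top-ranked compounds divided by the fraction of actives in the whole library. A minimal sketch follows. The function name, the rounding rule for the size of the top slice, and the toy data are assumptions for illustration, and the paper's exact evaluation protocol may differ.

```python
def enrichment_factor(scores, is_active, fraction=0.01, lower_is_better=True):
    """Enrichment of actives in the top-ranked fraction of a screened library."""
    if len(scores) != len(is_active) or not scores:
        raise ValueError("scores and labels must be non-empty and of equal length")
    total_actives = sum(1 for a in is_active if a)
    if total_actives == 0:
        raise ValueError("at least one active compound is required")
    # Energy-like scores rank best when lowest; flip the order for other scores.
    order = sorted(range(len(scores)), key=lambda i: scores[i],
                   reverse=not lower_is_better)
    n_top = max(1, int(len(scores) * fraction))
    top_actives = sum(1 for i in order[:n_top] if is_active[i])
    return (top_actives / n_top) / (total_actives / len(scores))

# Toy library: 100 compounds with hypothetical scores, 10 of them active.
toy_scores = [float(i) for i in range(100)]
toy_labels = [i < 5 or 50 <= i < 55 for i in range(100)]
print(enrichment_factor(toy_scores, toy_labels, fraction=0.1))
```

An enrichment factor of 1 means the ranking is no better than random selection. The maximum attainable value is limited by both the chosen fraction and the proportion of actives in the library.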
|
24
|
Methods for the Refinement of Protein Structure 3D Models. Int J Mol Sci 2019; 20:2301. [PMID: 31075942 PMCID: PMC6539982 DOI: 10.3390/ijms20092301] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 04/02/2019] [Revised: 04/24/2019] [Accepted: 05/07/2019] [Indexed: 12/25/2022] Open
Abstract
The refinement of predicted 3D protein models is crucial in bringing them closer to experimental accuracy for further computational studies. Refinement approaches can be divided into two main stages: the sampling and scoring stages. Sampling strategies, such as the popular Molecular Dynamics (MD)-based protocols, aim to generate improved 3D models. However, generating 3D models that are closer to the native structure than the initial model remains challenging, as structural deviations from the native basin can be encountered due to force-field inaccuracies. Therefore, different restraint strategies have been applied in order to avoid deviations away from the native structure. For example, the accurate prediction of local errors and/or contacts in the initial models can be used to guide restraints. MD-based protocols, using physics-based force fields and smart restraints, have made significant progress towards a more consistent refinement of 3D models. The scoring stage, which includes energy functions and Model Quality Assessment Programs (MQAPs), is used to discriminate near-native conformations from non-native conformations. Nevertheless, there are often very small differences among the 3D models generated in refinement pipelines, which makes model discrimination and selection problematic. For this reason, the identification of the most native-like conformations remains a major challenge.
25
Wang X, Huang SY. Integrating Bonded and Nonbonded Potentials in the Knowledge-Based Scoring Function for Protein Structure Prediction. J Chem Inf Model 2019; 59:3080-3090. [DOI: 10.1021/acs.jcim.9b00057] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 01/06/2023]
Affiliation(s)
- Xinxiang Wang
  - Institute of Biophysics, School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Sheng-You Huang
  - Institute of Biophysics, School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
26
Banerjee A, Ray S. Molecular interactions and mutational impact upon rhodopsin (G90→D90) for hindering dark adaptation of eye: A comparative structural level outlook for signaling mechanism in night blindness. Mutat Res 2019; 814:7-14. [PMID: 30659944 DOI: 10.1016/j.mrfmmm.2019.01.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2018] [Revised: 10/17/2018] [Accepted: 01/03/2019] [Indexed: 06/09/2023]
Abstract
For night blindness, a detailed structural exploration of the interactions among G-protein receptor rhodopsin, transducin and arrestin was performed. Rhodopsin is responsible for dim light vision while a point mutation (G90→D90) results in an adverse change in its photo-transduction. The validated 3D models of the three proteins were utilized, and upon mutation and interactions, rhodopsin attained higher stability (evaluated through thermodynamic energy calculations, electrostatic surface potential and solvent accessible area), thereby participating strongly with transducin. Conformational switches in mutated rhodopsin also depicted a firm conformation with few 3₁₀ helices accompanied by increased percentage of pure α-helices and sheets. All evaluations were corroborated through paired T-tests. Glu33 (glycosylated unit in the N-terminal zone) of rhodopsin plays a chief role in the overall interaction pattern. Arg69 and Glu33 from wild-type rhodopsin participated in ionic interactions, while the latter set of ionic interaction remained preserved even after mutation. Cys323 (C-terminal residue) and Arg69 formed H-bonds from the wild-type rhodopsin. Cys323 exceptionally supports cellular signaling pattern in the non-mutated situation and for the non-sufferers of night-blindness. Ser297 and Tyr43 from mutated rhodopsin reside in helices and interact with Thr32 of transducin, preserving the steady conformation in activated interacted state, even in the dark. Ser297 lies adjoined to Lys296 (retinal attachment site), which resides in NPXXY motif (an "activation switch" for signal transduction). Thus, the molecular facet for involvement of photo-transduction, which holds a paramount zone in ophthalmology, was dealt with. This might instigate the future prospect for drug discovery to prevent such mutations.
Affiliation(s)
- Arundhati Banerjee
  - Department of Biochemistry and Biophysics, University of Kalyani, Kalyani, Nadia, India.
- Sujay Ray
  - Amity Institute of Biotechnology, Amity University, Kolkata, India.
27
Wang CK, Craik DJ. Toward Structure Determination of Disulfide-Rich Peptides Using Chemical Shift-Based Methods. J Phys Chem B 2019; 123:1903-1912. [PMID: 30730741 DOI: 10.1021/acs.jpcb.8b10649] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/28/2022]
Abstract
Disulfide-rich peptides are a class of molecules for which NMR spectroscopy has been the primary tool for structural characterization. Here, we explore whether the process can be achieved by using structural information encoded in chemical shifts. We examine (i) a representative set of five cyclic disulfide-rich peptides that have high-resolution NMR and X-ray structures and (ii) a larger set of 100 disulfide-rich peptides from the PDB. Accuracy of the calculated structures was dependent on the methods used for searching through conformational space and for identifying native conformations. Although Hα chemical shifts could be predicted reasonably well using SHIFTX, agreement between predicted and experimental chemical shifts was sufficient for identifying native conformations for only some peptides in the representative set. Combining chemical shift data with the secondary structure information and potential energy calculations improved the ability to identify native conformations. Additional use of sparse distance restraints or homology information to restrict the search space also improved the resolution of the calculated structures. This study demonstrates that abbreviated methods have potential for elucidation of peptide structures to high resolution and further optimization of these methods, e.g., improvement in chemical shift prediction accuracy, will likely help transition these methods into the mainstream of disulfide-rich peptide structural biology.
Affiliation(s)
- Conan K Wang
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland 4072, Australia
- David J Craik
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland 4072, Australia
28
Pei J, Zheng Z, Merz KM. Random Forest Refinement of the KECSA2 Knowledge-Based Scoring Function for Protein Decoy Detection. J Chem Inf Model 2019; 59:1919-1929. [DOI: 10.1021/acs.jcim.8b00734] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 11/29/2022]
Affiliation(s)
- Jun Pei
  - Department of Chemistry, Michigan State University, 578 S. Shaw Lane, East Lansing, Michigan 48824, United States
- Zheng Zheng
  - Department of Chemistry, Michigan State University, 578 S. Shaw Lane, East Lansing, Michigan 48824, United States
- Kenneth M. Merz
  - Department of Chemistry, Michigan State University, 578 S. Shaw Lane, East Lansing, Michigan 48824, United States
  - Institute for Cyber Enabled Research, Michigan State University, 567 Wilson Road, East Lansing, Michigan 48824, United States
29
Zhan J, Jia H, Semchenko EA, Bian Y, Zhou AM, Li Z, Yang Y, Wang J, Sarkar S, Totsika M, Blanchard H, Jen FEC, Ye Q, Haselhorst T, Jennings MP, Seib KL, Zhou Y. Self-derived structure-disrupting peptides targeting methionine aminopeptidase in pathogenic bacteria: a new strategy to generate antimicrobial peptides. FASEB J 2019; 33:2095-2104. [PMID: 30260702 PMCID: PMC6338635 DOI: 10.1096/fj.201700613rr] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 06/29/2017] [Accepted: 08/27/2018] [Indexed: 11/11/2022]
Abstract
Bacterial infection is one of the leading causes of death in young, elderly, and immune-compromised patients. The rapid spread of multi-drug-resistant (MDR) bacteria is a global health emergency and there is a lack of new drugs to control MDR pathogens. We describe a heretofore-unexplored discovery pathway for novel antibiotics that is based on self-targeting, structure-disrupting peptides. We show that a helical peptide, KFF-EcH3, derived from the Escherichia coli methionine aminopeptidase can disrupt secondary and tertiary structure of this essential enzyme, thereby killing the bacterium (including MDR strains). Significantly, no detectable resistance developed against this peptide. Based on a computational analysis, our study predicted that peptide KFF-EcH3 has the strongest interaction with the structural core of the methionine aminopeptidase. We further used our approach to identify peptide KFF-NgH1 to target the same enzyme from Neisseria gonorrhoeae. This peptide inhibited bacterial growth and was able to treat a gonococcal infection in a human cervical epithelial cell model. These findings present an exciting new paradigm in antibiotic discovery using self-derived peptides that can be developed to target the structures of any essential bacterial proteins.
Affiliation(s)
- Jian Zhan
  - Institute for Glycomics, Griffith University, Queensland, Australia
- Husen Jia
  - Institute for Glycomics, Griffith University, Queensland, Australia
- Yunqiang Bian
  - Shandong Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, China
- Amy M. Zhou
  - Queensland Academies–Health Sciences, Southport, Queensland, Australia
- Zhixiu Li
  - Indiana University School of Informatics, Indiana University–Purdue University Indianapolis, Indianapolis, Indiana, USA
- Yuedong Yang
  - Institute for Glycomics, Griffith University, Queensland, Australia
- Jihua Wang
  - Shandong Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, China
- Sohinee Sarkar
  - Institute of Health and Biomedical Innovation, School of Biomedical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia
- Makrina Totsika
  - Institute of Health and Biomedical Innovation, School of Biomedical Sciences, Queensland University of Technology, Brisbane, Queensland, Australia
- Helen Blanchard
  - Institute for Glycomics, Griffith University, Queensland, Australia
- Freda E.-C. Jen
  - Institute for Glycomics, Griffith University, Queensland, Australia
- Qizhuang Ye
  - Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, Indiana, USA
  - School of Medicine, Shenzhen University, Shenzhen, China
- Kate L. Seib
  - Institute for Glycomics, Griffith University, Queensland, Australia
- Yaoqi Zhou
  - Institute for Glycomics, Griffith University, Queensland, Australia
  - Shandong Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou, China
  - Indiana University School of Informatics, Indiana University–Purdue University Indianapolis, Indianapolis, Indiana, USA
30
Litfin T, Yang Y, Zhou Y. SPOT-Peptide: Template-Based Prediction of Peptide-Binding Proteins and Peptide-Binding Sites. J Chem Inf Model 2019; 59:924-930. [DOI: 10.1021/acs.jcim.8b00777] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 01/01/2023]
Affiliation(s)
- Thomas Litfin
  - School of Information and Communication Technology, Griffith University, Southport, QLD 4222, Australia
- Yuedong Yang
  - School of Data and Computer Science, Sun-Yat Sen University, Guangzhou, Guangdong 510006, China
- Yaoqi Zhou
  - School of Information and Communication Technology, Griffith University, Southport, QLD 4222, Australia
  - Institute for Glycomics, Griffith University, Southport, QLD 4222, Australia
31
Faraggi E, Krupa P, Mozolewska MA, Liwo A, Kloczkowski A. Reoptimized UNRES Potential for Protein Model Quality Assessment. Genes (Basel) 2018; 9:genes9120601. [PMID: 30513992 PMCID: PMC6315818 DOI: 10.3390/genes9120601] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 10/15/2018] [Revised: 11/25/2018] [Accepted: 11/27/2018] [Indexed: 11/16/2022] Open
Abstract
Ranking protein structure models is an elusive problem in bioinformatics. These models are evaluated on both the degree of similarity to the native structure and the folding pathway. Here, we simulated the use of the coarse-grained UNited RESidue (UNRES) force field as a tool to choose the best protein structure models for a given protein sequence among a pool of candidate models, using server data from the CASP11 experiment. Because the original UNRES was optimized for Molecular Dynamics simulations, we reoptimized UNRES using a deep feed-forward neural network, and we show that introducing additional descriptive features can produce better results. Overall, we found that the reoptimized UNRES performs better in selecting the best structures and tracking protein unwinding from its native state. We also found a relatively poor correlation between UNRES values and the model’s Template Modeling Score (TMS). This is remedied by reoptimization. We discuss some cases where our reoptimization procedure is useful.
Affiliation(s)
- Eshel Faraggi
  - Research and Information Systems, LLC, Indianapolis, IN 46240, USA.
  - Department of Physics, Indiana University Purdue University Indianapolis, Indianapolis, IN 46202, USA.
  - Battelle Center for Mathematical Medicine, Nationwide Children's Hospital, Columbus, OH 43215, USA.
- Pawel Krupa
  - Battelle Center for Mathematical Medicine, Nationwide Children's Hospital, Columbus, OH 43215, USA.
  - Institute of Physics, Polish Academy of Sciences, Al. Lotnikow 32/46, PL-02-668 Warsaw, Poland.
- Magdalena A Mozolewska
  - Battelle Center for Mathematical Medicine, Nationwide Children's Hospital, Columbus, OH 43215, USA.
  - Institute of Computer Science, Polish Academy of Sciences, ul. Jana Kazimierza 5, 01-248 Warszawa, Poland.
- Adam Liwo
  - Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308 Gdańsk, Poland.
  - Center for In Silico Protein Structure and School of Computational Sciences, Korea Institute for Advanced Study, 85 Hoegiro, Dongdaemun-gu, Seoul 130-722, Korea.
- Andrzej Kloczkowski
  - Battelle Center for Mathematical Medicine, Nationwide Children's Hospital, Columbus, OH 43215, USA.
  - Department of Pediatrics, The Ohio State University, Columbus, OH 43215, USA.
  - Kavli Institute for Theoretical Physics China, Chinese Academy of Sciences, Beijing 100190, China.
32
Jiménez-García B, Roel-Touris J, Romero-Durana M, Vidal M, Jiménez-González D, Fernández-Recio J. LightDock: a new multi-scale approach to protein-protein docking. Bioinformatics 2018; 34:49-55. [PMID: 28968719 DOI: 10.1093/bioinformatics/btx555] [Citation(s) in RCA: 50] [Impact Index Per Article: 8.3] [Received: 05/25/2017] [Accepted: 09/01/2017] [Indexed: 12/18/2022] Open
Abstract
Motivation: Computational prediction of protein-protein complex structure by docking can provide structural and mechanistic insights for protein interactions of biomedical interest. However, current methods struggle with difficult cases, such as those involving flexible proteins, low-affinity complexes or transient interactions. A major challenge is how to efficiently sample the structural and energetic landscape of the association at different resolution levels, given that each scoring function is often highly coupled to a specific type of search method. Thus, new methodologies capable of accommodating multi-scale conformational flexibility and scoring are strongly needed. Results: We describe here a new multi-scale protein-protein docking methodology, LightDock, capable of accommodating conformational flexibility and a variety of scoring functions at different resolution levels. Implicit use of normal modes during the search and atomic/coarse-grained combined scoring functions yielded improved predictive results with respect to state-of-the-art rigid-body docking, especially in flexible cases. Availability and implementation: The source code of the software and installation instructions are available for download at https://life.bsc.es/pid/lightdock/. Contact: juanf@bsc.es. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Brian Jiménez-García
  - Life Sciences Department, Barcelona Supercomputing Center (BSC), 08034 Barcelona, Spain
- Jorge Roel-Touris
  - Life Sciences Department, Barcelona Supercomputing Center (BSC), 08034 Barcelona, Spain
- Miguel Romero-Durana
  - Life Sciences Department, Barcelona Supercomputing Center (BSC), 08034 Barcelona, Spain
- Miquel Vidal
  - Life Sciences Department, Barcelona Supercomputing Center (BSC), 08034 Barcelona, Spain
- Daniel Jiménez-González
  - Life Sciences Department, Barcelona Supercomputing Center (BSC), 08034 Barcelona, Spain
  - Department of Computer Architecture, Universitat Politècnica de Catalunya, 08034 Barcelona, Spain
- Juan Fernández-Recio
  - Life Sciences Department, Barcelona Supercomputing Center (BSC), 08034 Barcelona, Spain
  - Structural Biology Unit, IBMB-CSIC, 08028 Barcelona, Spain
33
Zhao H, Taherzadeh G, Zhou Y, Yang Y. Computational Prediction of Carbohydrate-Binding Proteins and Binding Sites. Curr Protoc Protein Sci 2018; 94:e75. [PMID: 30106511 DOI: 10.1002/cpps.75] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 12/27/2022]
Abstract
Protein-carbohydrate interaction is essential for biological systems, and carbohydrate-binding proteins (CBPs) are important targets when designing antiviral and anticancer drugs. Due to the high cost and difficulty associated with experimental approaches, many computational methods have been developed as complementary approaches to predict CBPs or carbohydrate-binding sites. However, most of these computational methods are not publicly available. Here, we provide a comprehensive review of related studies and demonstrate our two recently developed bioinformatics methods. The method SPOT-CBP is a template-based method for detecting CBPs based on structure through structural homology search combined with a knowledge-based scoring function. This method can yield model complex structure in addition to accurate prediction of CBPs. Furthermore, it has been observed that similarly accurate predictions can be made using structures from homology modeling, which has significantly expanded its applicability. The other method, SPRINT-CBH, is a de novo approach that predicts binding residues directly from protein sequences by using sequence information and predicted structural properties. This approach does not need structurally similar templates and thus is not limited by the current database of known protein-carbohydrate complex structures. These two complementary methods are available at https://sparks-lab.org. © 2018 by John Wiley & Sons, Inc.
Affiliation(s)
- Huiying Zhao
  - Sun Yat-Sen Memorial Hospital, Sun Yat-sen University, Guangzhou, China
- Ghazaleh Taherzadeh
  - School of Information and Communication Technology, Griffith University, Gold Coast, Queensland, Australia
- Yaoqi Zhou
  - School of Information and Communication Technology, Griffith University, Gold Coast, Queensland, Australia
  - Institute for Glycomics, Griffith University, Gold Coast, Queensland, Australia
- Yuedong Yang
  - School of Information and Communication Technology, Griffith University, Gold Coast, Queensland, Australia
  - Institute for Glycomics, Griffith University, Gold Coast, Queensland, Australia
  - School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China
34
Anishchenko I, Kundrotas PJ, Vakser IA. Contact Potential for Structure Prediction of Proteins and Protein Complexes from Potts Model. Biophys J 2018; 115:809-821. [PMID: 30122295 DOI: 10.1016/j.bpj.2018.07.035] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 03/07/2018] [Revised: 07/16/2018] [Accepted: 07/31/2018] [Indexed: 12/18/2022] Open
Abstract
The energy function is the key component of protein modeling methodology. This work presents a semianalytical approach to the development of contact potentials for protein structure modeling. Residue-residue and atom-atom contact energies were derived by maximizing the probability of observing native sequences in a nonredundant set of protein structures. The optimization task was formulated as an inverse statistical mechanics problem applied to the Potts model. Its solution by pseudolikelihood maximization provides consistent estimates of coupling constants at atomic and residue levels. The best performance was achieved when interacting atoms were grouped according to their physicochemical properties. For individual protein structures, the performance of the contact potentials in distinguishing near-native structures from the decoys is similar to the top-performing scoring functions. The potentials also yielded significant improvement in the protein docking success rates. The potentials recapitulated experimentally determined protein stability changes upon point mutations and protein-protein binding affinities. The approach offers a different perspective on knowledge-based potentials and may serve as the basis for their further development.
Affiliation(s)
- Ivan Anishchenko
  - Computational Biology Program and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas
- Petras J Kundrotas
  - Computational Biology Program and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas.
- Ilya A Vakser
  - Computational Biology Program and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas.
35
Raucci R, Laine E, Carbone A. Local Interaction Signal Analysis Predicts Protein-Protein Binding Affinity. Structure 2018; 26:905-915.e4. [PMID: 29779789 DOI: 10.1016/j.str.2018.04.006] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 10/26/2017] [Revised: 02/06/2018] [Accepted: 04/10/2018] [Indexed: 12/27/2022]
Abstract
Several models estimating the strength of the interaction between proteins in a complex have been proposed. By exploring the geometry of contact distribution at protein-protein interfaces, we provide an improved model of binding energy. Local interaction signal analysis (LISA) is a radial function based on terms describing favorable and non-favorable contacts obtained by density functional theory, the support-core-rim interface residue distribution, non-interacting charged residues and secondary structures contribution. The three-dimensional organization of the contacts and their contribution on localized hot-sites over the entire interaction surface were numerically evaluated. LISA achieves a correlation of 0.81 (and a root-mean-square error of 2.35 ± 0.38 kcal/mol) when tested on 125 complexes for which experimental measurements were realized. LISA's performance is stable for subsets defined by functional composition and extent of conformational changes upon complex formation. A large-scale comparison with 17 other functions demonstrated the power of the geometrical model in the understanding of complex binding.
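The two figures of merit quoted in this abstract, a correlation coefficient and a root-mean-square error between predicted and measured binding energies, can be computed as in the sketch below. It assumes the correlation is Pearson's r, which the abstract does not state explicitly; the function names and the sample values in the usage note are illustrative and are not data from the paper.

```python
import math


def pearson_r(predicted, observed):
    """Pearson correlation between two equally long sequences of numbers."""
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_o)


def rmse(predicted, observed):
    """Root-mean-square error, in the same units as the inputs (e.g. kcal/mol)."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(
        sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(predicted)
    )
```

For example, `pearson_r([1, 2, 3, 4], [2, 4, 6, 8])` evaluates to 1 for perfectly linear data, while `rmse` reports the typical size of the prediction error in the units of the measurements.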
Affiliation(s)
- Raffaele Raucci
  - Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 4 Place Jussieu, 75005 Paris, France
  - Sorbonne Université, Institut des Sciences du Calcul et des Données (ISCD), 75005 Paris, France
- Elodie Laine
  - Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 4 Place Jussieu, 75005 Paris, France
- Alessandra Carbone
  - Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 4 Place Jussieu, 75005 Paris, France
  - Institut Universitaire de France, 75005 Paris, France.
36
Manavalan B, Lee J. SVMQA: support-vector-machine-based protein single-model quality assessment. Bioinformatics 2018; 33:2496-2503. [PMID: 28419290 DOI: 10.1093/bioinformatics/btx222] [Citation(s) in RCA: 130] [Impact Index Per Article: 21.7] [Received: 12/15/2016] [Accepted: 04/12/2017] [Indexed: 01/03/2023] Open
Abstract
Motivation: The accurate ranking of predicted structural models and selecting the best model from a given candidate pool remain as open problems in the field of structural bioinformatics. The quality assessment (QA) methods used to address these problems can be grouped into two categories: consensus methods and single-model methods. Consensus methods in general perform better and attain higher correlation between predicted and true quality measures. However, these methods frequently fail to generate proper quality scores for native-like structures which are distinct from the rest of the pool. Conversely, single-model methods do not suffer from this drawback and are better suited for real-life applications where many models from various sources may not be readily available. Results: In this study, we developed a support-vector-machine-based single-model global quality assessment (SVMQA) method. For a given protein model, the SVMQA method predicts TM-score and GDT_TS score based on a feature vector containing statistical potential energy terms and consistency-based terms between the actual structural features (extracted from the three-dimensional coordinates) and predicted values (from primary sequence). We trained SVMQA using CASP8, CASP9 and CASP10 targets and determined the machine parameters by 10-fold cross-validation. We evaluated the performance of our SVMQA method on various benchmarking datasets. Results show that SVMQA outperformed the existing best single-model QA methods both in ranking provided protein models and in selecting the best model from the pool. According to the CASP12 assessment, SVMQA was the best method in selecting good-quality models from decoys in terms of GDTloss. Availability and implementation: SVMQA method can be freely downloaded from http://lee.kias.re.kr/SVMQA/SVMQA_eval.tar.gz. Contact: jlee@kias.re.kr. Supplementary information: Supplementary data are available at Bioinformatics online.
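The model-selection criterion mentioned at the end of this abstract can be illustrated with a short sketch. It assumes the usual reading of a selection loss: the true GDT_TS of the best model in the pool minus the true GDT_TS of the model that the QA method ranks first. The exact definition used in the CASP12 assessment is not given in the abstract, and the function name `top1_loss` and the numbers in the usage note are illustrative.

```python
def top1_loss(predicted_quality, true_quality):
    """Selection loss for one target.

    predicted_quality -- QA scores for each model in the pool (higher = better)
    true_quality      -- true quality of each model, e.g. GDT_TS against the native
    Returns how much true quality is lost by trusting the top-ranked prediction.
    """
    if len(predicted_quality) != len(true_quality) or not predicted_quality:
        raise ValueError("need two non-empty sequences of equal length")
    # Index of the model the QA method would select.
    selected = max(range(len(predicted_quality)),
                   key=lambda i: predicted_quality[i])
    # Zero loss means the method picked the best model available in the pool.
    return max(true_quality) - true_quality[selected]
```

With predicted scores `[0.4, 0.9, 0.7]` and true GDT_TS values `[55.0, 60.0, 72.5]`, the method selects the second model and the loss is 12.5 GDT_TS units. Averaging this quantity over targets gives a per-method summary.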
Affiliation(s)
- Balachandran Manavalan
  - Center for In Silico Protein Science and School of Computational Sciences, Korea Institute for Advanced Study, Seoul 130-722, Korea
- Jooyoung Lee
  - Center for In Silico Protein Science and School of Computational Sciences, Korea Institute for Advanced Study, Seoul 130-722, Korea
37
Wang X, Zhang D, Huang SY. New Knowledge-Based Scoring Function with Inclusion of Backbone Conformational Entropies from Protein Structures. J Chem Inf Model 2018; 58:724-732. [DOI: 10.1021/acs.jcim.7b00601] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 01/04/2023]
Affiliation(s)
- Xinxiang Wang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Di Zhang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
- Sheng-You Huang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P. R. China
38
Nakamura T, Oda T, Fukasawa Y, Tomii K. Template-based quaternary structure prediction of proteins using enhanced profile-profile alignments. Proteins 2017; 86 Suppl 1:274-282. [PMID: 29178285 PMCID: PMC5836938 DOI: 10.1002/prot.25432] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 07/18/2017] [Revised: 11/11/2017] [Accepted: 11/22/2017] [Indexed: 12/26/2022]
Abstract
Proteins often exist as their multimeric forms when they function as so‐called biological assemblies consisting of the specific number and arrangement of protein subunits. Consequently, elucidating biological assemblies is necessary to improve understanding of protein function. Template‐Based Modeling (TBM), based on known protein structures, has been used widely for protein structure prediction. Actually, TBM has become an increasingly useful approach in recent years because of the increased amounts of information related to protein amino acid sequences and three‐dimensional structures. An apparently similar situation exists for biological assembly structure prediction as protein complex structures in the PDB increase, although the inference of biological assemblies is not a trivial task. Many methods using TBM, including ours, have been developed for protein structure prediction. Using enhanced profile–profile alignments, we participated in the 12th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP12), as the FONT team (Group # 480). Herein, we present experimental procedures and results of retrospective analyses using our approach for the Quaternary Structure Prediction category of CASP12. We performed profile–profile alignments of several types, based on FORTE, our profile–profile alignment algorithm, to identify suitable templates. Results show that these alignment results enable us to find templates in almost all possible cases. Moreover, we have come to understand the necessity of developing a model selection method that provides improved accuracy. Results also demonstrate that, to some extent, finding templates of protein complexes is useful even for MEDIUM and HARD assembly prediction.
Affiliation(s)
- Tsukasa Nakamura
  - Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo, 135-0064, Japan
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa-shi, Chiba, 277-8562, Japan
- Toshiyuki Oda
  - Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo, 135-0064, Japan
- Yoshinori Fukasawa
  - Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo, 135-0064, Japan
- Kentaro Tomii
  - Artificial Intelligence Research Center (AIRC), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo, 135-0064, Japan
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa-shi, Chiba, 277-8562, Japan
  - Biotechnology Research Institute for Drug Discovery (BRD), National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo, 135-0064, Japan
  - AIST-Tokyo Tech Real World Big-Data Computation Open Innovation Laboratory (RWBC-OIL), 2-12-1 Ookayama, Meguro-ku, Tokyo, 152-8550, Japan
39
Structure-based cross-docking analysis of antibody-antigen interactions. Sci Rep 2017; 7:8145. [PMID: 28811664 PMCID: PMC5557897 DOI: 10.1038/s41598-017-08414-y] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 02/27/2017] [Accepted: 07/10/2017] [Indexed: 12/02/2022] Open
Abstract
Antibody–antigen interactions are critical to our immune response, and understanding the structure-based biophysical determinants for their binding specificity and affinity is of fundamental importance. We present a computational structure-based cross-docking study to test the identification of native antibody–antigen interaction pairs among cognate and non-cognate complexes. We picked a dataset of 17 antibody–antigen complexes of which 11 have both bound and unbound structures available, and we generated a representative ensemble of cognate and non-cognate complexes. Using the Rosetta interface score as a classifier, the cognate pair was the top-ranked model in 80% (14/17) of the antigen targets using bound monomer structures in docking, 35% (6/17) when using unbound, and 12% (2/17) when using the homology-modeled backbones to generate the complexes. Increasing rigid-body diversity of the models using RosettaDock’s local dock routine lowers the discrimination accuracy with the cognate antibody–antigen pair ranking in bound and unbound models but recovers additional top-ranked cognate complexes when using homology models. The study is the first structure-based cross-docking attempt aimed at distinguishing antibody–antigen binders from non-binders and demonstrates the challenges to address for the methods to be widely applicable to supplement high-throughput experimental antibody sequencing workflows.
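The ranking test described in this abstract can be sketched as follows: for each antigen, every candidate antibody receives an interface score, and the cognate pair counts as a hit only when it ranks first. This is a minimal illustration, not the authors' pipeline. The function name, the dictionary layout, and the toy scores are hypothetical, and the sketch assumes that a lower (more negative) score indicates a better interface.

```python
def cognate_top_rank_rate(scores, cognate):
    """Fraction of antigens whose cognate antibody has the best score.

    scores:  {antigen: {antibody: interface_score}}, lower is assumed better.
    cognate: {antigen: antibody} giving the known native partner.
    """
    hits = 0
    for antigen, per_antibody in scores.items():
        # Pick the antibody with the lowest (best) score for this antigen.
        top_ranked = min(per_antibody, key=per_antibody.get)
        if top_ranked == cognate[antigen]:
            hits += 1
    return hits / len(scores)


# Hypothetical toy data: three antigens, each docked against three antibodies.
toy_scores = {
    "agA": {"abA": -12.0, "abB": -7.5, "abC": -3.1},
    "agB": {"abA": -9.0, "abB": -8.2, "abC": -4.0},
    "agC": {"abA": -2.0, "abB": -5.5, "abC": -11.3},
}
toy_cognate = {"agA": "abA", "agB": "abB", "agC": "abC"}

rate = cognate_top_rank_rate(toy_scores, toy_cognate)
print(f"cognate pair top-ranked for {rate:.0%} of antigens")
```

In the toy data the cognate antibody wins for two of the three antigens; the abstract's reported rates are the same kind of ratio computed over 17 antigen targets.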
|
40
|
Broom A, Jacobi Z, Trainor K, Meiering EM. Computational tools help improve protein stability but with a solubility tradeoff. J Biol Chem 2017; 292:14349-14361. [PMID: 28710274 DOI: 10.1074/jbc.m117.784165] [Citation(s) in RCA: 77] [Impact Index Per Article: 11.0] [Received: 03/03/2017] [Revised: 07/11/2017] [Indexed: 01/18/2023] Open
Abstract
Accurately predicting changes in protein stability upon amino acid substitution is a much sought after goal. Destabilizing mutations are often implicated in disease, whereas stabilizing mutations are of great value for industrial and therapeutic biotechnology. Increasing protein stability is an especially challenging task, with random substitution yielding stabilizing mutations in only ∼2% of cases. To overcome this bottleneck, computational tools that aim to predict the effect of mutations have been developed; however, achieving accuracy and consistency remains challenging. Here, we combined 11 freely available tools into a meta-predictor (meieringlab.uwaterloo.ca/stabilitypredict/). Validation against ∼600 experimental mutations indicated that our meta-predictor has improved performance over any of the individual tools. The meta-predictor was then used to recommend 10 mutations in a previously designed protein of moderate thermodynamic stability, ThreeFoil. Experimental characterization showed that four mutations increased protein stability and could be amplified through ThreeFoil's structural symmetry to yield several multiple mutants with >2-kcal/mol stabilization. By avoiding residues within functional ties, we could maintain ThreeFoil's glycan-binding capacity. Despite successfully achieving substantial stabilization, however, almost all mutations decreased protein solubility, the most common cause of protein design failure. Examination of the 600-mutation data set revealed that stabilizing mutations on the protein surface tend to increase hydrophobicity and that the individual tools favor this approach to gain stability. Thus, whereas currently available tools can increase protein stability and combining them into a meta-predictor yields enhanced reliability, improvements to the potentials/force fields underlying these tools are needed to avoid gaining protein stability at the cost of solubility.
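The abstract does not say how the 11 tools are combined, so the following is only one simple possibility: averaging the per-tool predictions and recommending the mutations with the most favorable consensus. The tool names, mutation labels, and values are hypothetical, and the sketch assumes a sign convention in which a negative predicted change in folding free energy (ΔΔG, kcal/mol) means stabilization.

```python
from statistics import mean


def meta_predict(tool_predictions):
    """Consensus ddG for one mutation: the plain mean over all tools."""
    return mean(tool_predictions.values())


def recommend_stabilizing(candidates, top_n):
    """Return up to top_n mutations with the most negative consensus ddG.

    candidates: {mutation: {tool_name: predicted_ddG}}.
    Assumes negative ddG = stabilizing (conventions differ between tools).
    """
    consensus = {mut: meta_predict(preds) for mut, preds in candidates.items()}
    stabilizing = [mut for mut in consensus if consensus[mut] < 0]
    return sorted(stabilizing, key=consensus.get)[:top_n]


# Hypothetical predictions from three placeholder tools.
toy_candidates = {
    "A10V": {"tool1": -1.0, "tool2": -0.5, "tool3": -1.5},
    "G25P": {"tool1": 0.5, "tool2": 1.5, "tool3": 1.0},
    "S40L": {"tool1": -0.5, "tool2": 0.25, "tool3": -0.5},
}

picks = recommend_stabilizing(toy_candidates, top_n=2)
print(picks)
```

A real meta-predictor would first have to put the tools on a common scale and sign convention, and might weight them by validated accuracy rather than averaging them equally.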
Affiliation(s)
- Aron Broom
- Department of Chemistry, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
- Zachary Jacobi
- Department of Chemistry, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
- Kyle Trainor
- Department of Chemistry, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
|
41
|
Abstract
The pore domain of the human voltage-dependent cardiac sodium channel Nav1.5 (hNav1.5) is a crucial binding target for anti-arrhythmic drugs and some local anesthetics, but its three-dimensional structure is still lacking. This has hindered detailed studies of the binding features and mechanisms of these drugs. In this paper, we present a structural model of the open-state pore domain of hNav1.5, built using single-template ROSETTA-membrane homology modeling with the crystal structure of NavMs. The assembled structural models are evaluated by RosettaMP energy and the locations of binding sites. The modeled structures of the pore domain of hNav1.5 in the open state will help to explore the molecular mechanism of state-dependent drug binding and to design new drugs.
Affiliation(s)
- Xiaofeng Ji
- School of Physics and Key Laboratory of Molecular Biophysics of the Ministry of Education, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China; Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao, Shandong 266071, China
- Yi Xiao
- School of Physics and Key Laboratory of Molecular Biophysics of the Ministry of Education, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Shiyong Liu
- School of Physics and Key Laboratory of Molecular Biophysics of the Ministry of Education, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
|
42
|
Schwarte A, Genz M, Skalden L, Nobili A, Vickers C, Melse O, Kuipers R, Joosten HJ, Stourac J, Bendl J, Black J, Haase P, Baakman C, Damborsky J, Bornscheuer U, Vriend G, Venselaar H. NewProt – a protein engineering portal. Protein Eng Des Sel 2017; 30:441-447. [DOI: 10.1093/protein/gzx024] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 12/16/2016] [Accepted: 04/13/2017] [Indexed: 11/13/2022] Open
|
43
|
Abstract
Computational protein design (CPD), a still-evolving field, includes computer-aided engineering for partial or full de novo designs of proteins of interest. Designs are defined by a requested structure, function, or working environment. This chapter describes the birth and maturation of the field by presenting 101 CPD examples in chronological order, emphasizing achievements and pending challenges. Integrating these aspects presents the plethora of CPD approaches with the hope of providing a "CPD 101". These reflect on the broader structural bioinformatics and computational biophysics field and include: (1) integration of knowledge-based and energy-based methods, (2) hierarchical designated approach towards local, regional, and global motifs and the integration of high- and low-resolution design schemes that fit each such region, (3) systematic differential approaches towards different protein regions, (4) identification of key hot-spot residues and the relative effect of remote regions, (5) assessment of shape-complementarity, electrostatics and solvation effects, (6) integration of thermal plasticity and functional dynamics, (7) negative design, (8) systematic integration of experimental approaches, (9) objective cross-assessment of methods, and (10) successful ranking of potential designs. Future challenges also include dissemination of CPD software for general use by life-sciences researchers and an emphasis on success within an in vivo milieu. CPD increases our understanding of protein structure and function and the relationships between the two, along with the application of such know-how for the benefit of mankind. Applied aspects range from biological drugs, via healthier and tastier food products, to nanotechnology and environmentally friendly enzymes replacing toxic chemicals used in industry.
|
44
|
Cao R, Bhattacharya D, Adhikari B, Li J, Cheng J. Large-scale model quality assessment for improving protein tertiary structure prediction. Bioinformatics 2015; 31:i116-23. [PMID: 26072473 PMCID: PMC4553833 DOI: 10.1093/bioinformatics/btv235] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Indexed: 11/12/2022] Open
Abstract
Motivation: Sampling structural models and ranking them are the two major challenges of protein structure prediction. Traditional protein structure prediction methods generally use one or a few quality assessment (QA) methods to select the best-predicted models, which cannot consistently select the better models or rank a large number of models well. Results: Here, we develop a novel large-scale model QA method in conjunction with model clustering to rank and select protein structural models. It applied an unprecedented 14 model QA methods to generate consensus model rankings, followed by model refinement based on model combination (i.e. averaging). Our experiment demonstrates that the large-scale model QA approach is more consistent and robust in selecting models of better quality than any individual QA method. Our method was blindly tested during the 11th Critical Assessment of Techniques for Protein Structure Prediction (CASP11) as the MULTICOM group. It was officially ranked third out of all 143 human and server predictors according to the total scores of the first models predicted for 78 CASP11 protein domains and second according to the total scores of the best of the five models predicted for these domains. MULTICOM’s outstanding performance in the extremely competitive 2014 CASP11 experiment proves that our large-scale QA approach together with model clustering is a promising solution to one of the two major problems in protein structure modeling. Availability and implementation: The web server is available at: http://sysbio.rnet.missouri.edu/multicom_cluster/human/. Contact: chengji@missouri.edu
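The idea of a consensus ranking over many QA methods can be illustrated with a simple average-rank scheme. This is a generic sketch rather than the MULTICOM procedure, whose details the abstract does not give. The method names, model names, and scores are hypothetical, and the sketch assumes every QA method scores every model with higher meaning better.

```python
def consensus_ranking(qa_scores):
    """Order models by their mean rank across several QA methods.

    qa_scores: {qa_method: {model: score}}, higher score is assumed better.
    Returns the model names, best consensus first.
    """
    models = list(next(iter(qa_scores.values())))
    mean_rank = {model: 0.0 for model in models}
    for scores in qa_scores.values():
        # Rank 1 goes to the model this QA method scores highest.
        ordered = sorted(models, key=lambda m: scores[m], reverse=True)
        for rank, model in enumerate(ordered, start=1):
            mean_rank[model] += rank / len(qa_scores)
    return sorted(models, key=lambda m: mean_rank[m])


# Hypothetical scores from three QA methods for three candidate models.
toy_qa = {
    "qa1": {"modelA": 0.9, "modelB": 0.5, "modelC": 0.1},
    "qa2": {"modelA": 0.6, "modelB": 0.7, "modelC": 0.2},
    "qa3": {"modelA": 0.8, "modelB": 0.4, "modelC": 0.5},
}

ranking = consensus_ranking(toy_qa)
print(ranking)
```

Working on ranks rather than raw scores sidesteps the fact that different QA methods report values on different scales, and a single method that disagrees (here `qa2`) does not overturn the consensus.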
Affiliation(s)
- Renzhi Cao
- Computer Science Department, University of Missouri, Columbia, Missouri, 65211, USA; Informatics Institute, University of Missouri, Columbia, Missouri, 65211, USA; C. Bond Life Science Center, University of Missouri, Columbia, Missouri, 65211, USA
- Debswapna Bhattacharya
- Computer Science Department, University of Missouri, Columbia, Missouri, 65211, USA; Informatics Institute, University of Missouri, Columbia, Missouri, 65211, USA; C. Bond Life Science Center, University of Missouri, Columbia, Missouri, 65211, USA
- Badri Adhikari
- Computer Science Department, University of Missouri, Columbia, Missouri, 65211, USA; Informatics Institute, University of Missouri, Columbia, Missouri, 65211, USA; C. Bond Life Science Center, University of Missouri, Columbia, Missouri, 65211, USA
- Jilong Li
- Computer Science Department, University of Missouri, Columbia, Missouri, 65211, USA; Informatics Institute, University of Missouri, Columbia, Missouri, 65211, USA; C. Bond Life Science Center, University of Missouri, Columbia, Missouri, 65211, USA
- Jianlin Cheng
- Computer Science Department, University of Missouri, Columbia, Missouri, 65211, USA; Informatics Institute, University of Missouri, Columbia, Missouri, 65211, USA; C. Bond Life Science Center, University of Missouri, Columbia, Missouri, 65211, USA
|
45
|
Abstract
We report the performance of our approaches for protein-protein docking and interface analysis in CAPRI rounds 20-26. At the core of our pipeline was the ZDOCK program for rigid-body protein-protein docking. We then reranked the ZDOCK predictions using the ZRANK or IRAD scoring functions, pruned and analyzed energy landscapes using clustering, and analyzed the docking results using our interface prediction approach RCF. When possible, we used biological information from the literature to apply constraints to the search space during or after the ZDOCK runs. For approximately half of the standard docking challenges we made at least one prediction that was acceptable or better. For the scoring challenges we made acceptable or better predictions for all but one target. This indicates that our scoring functions are generally able to select the correct binding mode.
Affiliation(s)
- Thom Vreven
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts, 01605
|
46
|
A large-scale conformation sampling and evaluation server for protein tertiary structure prediction and its assessment in CASP11. BMC Bioinformatics 2015; 16:337. [PMID: 26493701 PMCID: PMC4619059 DOI: 10.1186/s12859-015-0775-x] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 05/29/2015] [Accepted: 10/14/2015] [Indexed: 11/10/2022] Open
Abstract
Background: With more and more protein sequences produced in the genomic era, predicting protein structures from sequences becomes very important for elucidating the molecular details and functions of these proteins for biomedical research. Traditional template-based protein structure prediction methods tend to focus on identifying the best templates, generating the best alignments, and applying the best energy function to rank models, which often cannot achieve the best performance because of the difficulty of obtaining the best templates, alignments, and models. Methods: We developed a large-scale conformation sampling and evaluation method and its servers to improve the reliability and robustness of protein structure prediction. In the first step, our method used a variety of alignment methods to sample relevant and complementary templates and to generate alternative and diverse target-template alignments, used a template and alignment combination protocol to combine alignments, and used template-based and template-free modeling methods to generate a pool of conformations for a target protein. In the second step, it used a large number of protein model quality assessment methods to evaluate and rank the models in the protein model pool, in conjunction with an exception handling strategy to deal with any additional failure in model ranking. Results: The method was implemented as two protein structure prediction servers, MULTICOM-CONSTRUCT and MULTICOM-CLUSTER, that participated in the 11th Critical Assessment of Techniques for Protein Structure Prediction (CASP11) in 2014. The two servers were ranked among the best 10 server predictors. Conclusions: The good performance of our servers in CASP11 demonstrates the effectiveness and robustness of the large-scale conformation sampling and evaluation. The MULTICOM server is available at: http://sysbio.rnet.missouri.edu/multicom_cluster/.
|
47
|
Zhang J, Barz B, Zhang J, Xu D, Kosztin I. Selective refinement and selection of near-native models in protein structure prediction. Proteins 2015; 83:1823-35. [PMID: 26214389 PMCID: PMC4700123 DOI: 10.1002/prot.24866] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 04/08/2015] [Revised: 06/22/2015] [Accepted: 07/21/2015] [Indexed: 11/07/2022]
Abstract
In recent years, in silico protein structure prediction has reached a level where fully automated servers can generate large pools of near-native structures. However, the identification and further refinement of the best structures from the pool of models remain problematic. To address these issues, we have developed (i) a target-specific selective refinement (SR) protocol; and (ii) a molecular dynamics (MD) simulation-based ranking (SMDR) method. In SR, the all-atom refinement of structures is accomplished via the Rosetta Relax protocol, subject to specific constraints determined by the size and complexity of the target. The best-refined models are selected with SMDR by testing their relative stability against gradual heating through all-atom MD simulations. Through extensive testing, we have found that Mufold-MD, our fully automated protein structure prediction server updated with the SR and SMDR modules, consistently outperformed its previous versions.
Affiliation(s)
- Jiong Zhang
- Department of Physics and Astronomy, University of Missouri, Columbia, Missouri 65211
- Bagdan Barz
- Department of Physics and Astronomy, University of Missouri, Columbia, Missouri 65211
- Jingfen Zhang
- Department of Computer Science, Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, Missouri 65211
- Dong Xu
- Department of Computer Science, Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, Missouri 65211
- Ioan Kosztin
- Department of Physics and Astronomy, University of Missouri, Columbia, Missouri 65211
|
48
|
Park H, DiMaio F, Baker D. The origin of consistent protein structure refinement from structural averaging. Structure 2015; 23:1123-8. [PMID: 25960407 PMCID: PMC4456269 DOI: 10.1016/j.str.2015.03.022] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 12/19/2014] [Revised: 03/03/2015] [Accepted: 03/26/2015] [Indexed: 11/27/2022]
Abstract
Recent studies have shown that explicit solvent molecular dynamics (MD) simulation followed by structural averaging can consistently improve protein structure models. We find that improvement upon averaging is not limited to explicit water MD simulation, as consistent improvements are also observed for more efficient implicit solvent MD or Monte Carlo minimization simulations. To determine the origin of these improvements, we examine the changes in model accuracy brought about by averaging at the individual residue level. We find that the improvement in model quality from averaging results from the superposition of two effects: a dampening of deviations from the correct structure in the least well modeled regions, and a reinforcement of consistent movements towards the correct structure in better modeled regions. These observations are consistent with an energy landscape model in which the magnitude of the energy gradient toward the native structure decreases with increasing distance from the native state.
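The structural averaging step discussed in this abstract amounts, at its simplest, to taking the per-atom mean of coordinates over simulation snapshots. The sketch below shows only that arithmetic and is not the authors' protocol. It assumes the snapshots have already been superimposed on a common reference frame and list the same atoms in the same order; the function name and coordinates are hypothetical.

```python
def average_structure(snapshots):
    """Per-atom mean coordinates over a list of snapshots.

    snapshots: list of frames, each a list of (x, y, z) tuples.
    Frames are assumed to be pre-superimposed and atom-aligned.
    """
    n_frames = len(snapshots)
    n_atoms = len(snapshots[0])
    return [
        tuple(
            sum(frame[i][axis] for frame in snapshots) / n_frames
            for axis in range(3)
        )
        for i in range(n_atoms)
    ]


# Hypothetical two-atom system sampled in two snapshots.
frames = [
    [(1.0, 0.0, 0.0), (4.0, 1.0, 0.0)],
    [(3.0, 2.0, 0.0), (4.0, 3.0, 2.0)],
]

averaged = average_structure(frames)
print(averaged)
```

Uncorrelated fluctuations around a common position tend to cancel in the mean, while displacements shared by most snapshots survive it, which matches the dampening and reinforcement effects the abstract describes. Averaged coordinates can have distorted bond geometry, so in practice some form of re-minimization would likely follow.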
Affiliation(s)
- Hahnbeom Park
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA; Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
- Frank DiMaio
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA; Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
- David Baker
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA; Institute for Protein Design, University of Washington, Seattle, WA 98195, USA; Howard Hughes Medical Institute, University of Washington, Box 357370, Seattle, WA 98195, USA
|
49
|
Zheng F, Zhang J, Grigoryan G. Tertiary Structural Propensities Reveal Fundamental Sequence/Structure Relationships. Structure 2015; 23:961-971. [DOI: 10.1016/j.str.2015.03.015] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 12/09/2014] [Revised: 03/02/2015] [Accepted: 03/22/2015] [Indexed: 02/08/2023]
|
50
|
Malhotra S, Mathew OK, Sowdhamini R. DOCKSCORE: a webserver for ranking protein-protein docked poses. BMC Bioinformatics 2015; 16:127. [PMID: 25902779 PMCID: PMC4414291 DOI: 10.1186/s12859-015-0572-6] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 07/22/2014] [Accepted: 04/13/2015] [Indexed: 11/28/2022] Open
Abstract
Background: Proteins interact with a variety of other molecules, such as nucleic acids, small molecules and other proteins, inside the cell. Structure determination of protein-protein complexes is challenging for several reasons, such as the large molecular weights of these macromolecular complexes, their dynamic nature, and difficulty in purification and sample preparation. Computational docking permits an early understanding of the feasibility and mode of protein-protein interactions. However, docking algorithms propose a number of solutions, and it is a challenging task to select the native or near-native pose(s) from this pool. DockScore is an objective scoring scheme that can be used to rank protein-protein docked poses. It considers several interface parameters, namely, surface area, evolutionary conservation, hydrophobicity, short contacts and spatial clustering at the interface for scoring. Results: We have implemented DockScore in the form of a webserver for use by the scientific community. The DockScore webserver can be employed, subsequent to docking, to score the docked solutions, starting from multiple poses as inputs. The results, with scores and ranks for all the poses, can be downloaded as a csv file, and a graphical view of the interface of the best-ranking poses is available. Conclusions: The webserver for DockScore is made freely available to the scientific community at: http://caps.ncbs.res.in/dockscore/.
Affiliation(s)
- Sony Malhotra
- National Centre for Biological Sciences (TIFR), UAS-GKVK Campus, Bellary Road, Bangalore, 560 065, India
- Oommen K Mathew
- National Centre for Biological Sciences (TIFR), UAS-GKVK Campus, Bellary Road, Bangalore, 560 065, India; SASTRA University, Tirumalaisamudram, Thanjavur, 613 401, Tamil Nadu, India
- Ramanathan Sowdhamini
- National Centre for Biological Sciences (TIFR), UAS-GKVK Campus, Bellary Road, Bangalore, 560 065, India
|