1
Automated Protein Secondary Structure Assignment from Cα Positions Using Neural Networks. Biomolecules 2022; 12:biom12060841. [PMID: 35740966 PMCID: PMC9220970 DOI: 10.3390/biom12060841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2022] [Revised: 06/10/2022] [Accepted: 06/14/2022] [Indexed: 11/17/2022]
Abstract
The assignment of secondary structure elements in protein conformations is necessary to interpret a protein model that has been established by computational methods. The process essentially involves labeling the amino acid residues with H (Helix), E (Strand), or C (Coil, also known as Loop). When particular atoms are absent from an input protein structure, the procedure becomes more complicated, especially when only the alpha carbon locations are known. Various techniques have been tested and applied to this problem during the last forty years. The application of machine learning techniques is the most recent trend. This contribution presents the HECA classifier, which uses neural networks to assign protein secondary structure types. The technique exclusively employs Cα coordinates. The Keras (TensorFlow) library was used to implement and train the neural network model. The BioShell toolkit was used to calculate the neural network input features from raw coordinates. The study's findings show that neural network-based methods may be successfully used to take on structure assignment challenges when only a Cα trace is available. Thanks to the careful selection of input features, our approach's accuracy (above 97%) exceeded that of the existing methods.
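The abstract does not list the input features, so the following is only a minimal sketch of the kind of Cα-only local geometry a classifier could be fed. It assumes NumPy rather than BioShell, and the function name `ca_local_features` and the choice of Cα(i) to Cα(i+2), Cα(i+3), Cα(i+4) distances are illustrative assumptions, not the HECA feature set.

```python
import numpy as np

def ca_local_features(ca, offsets=(2, 3, 4)):
    """Return an (n_residues, len(offsets)) array of Ca(i)-Ca(i+k) distances.

    Hypothetical helper: the offsets are an illustrative choice, not taken
    from the paper. Rows too close to the chain end are filled with NaN.
    """
    ca = np.asarray(ca, dtype=float)
    n = len(ca)
    feats = np.full((n, len(offsets)), np.nan)
    for col, k in enumerate(offsets):
        if n > k:
            # Distance between residue i and residue i+k for every valid i.
            feats[: n - k, col] = np.linalg.norm(ca[k:] - ca[:-k], axis=1)
    return feats

# Example: an idealized alpha-helix Ca trace
# (radius 2.3 A, rise 1.5 A per residue, 100 degrees per residue).
idx = np.arange(12)
angle = np.deg2rad(100.0) * idx
helix = np.column_stack([2.3 * np.cos(angle), 2.3 * np.sin(angle), 1.5 * idx])
feats = ca_local_features(helix)
print(np.round(feats[0], 2))
```

In a helix these distances stay short (roughly 5 to 6 Å), whereas in an extended strand they grow much faster with sequence separation, which is why such local distances can separate H from E without any other atoms.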
2
Prediction of Protein Tertiary Structure via Regularized Template Classification Techniques. Molecules 2020; 25:molecules25112467. [PMID: 32466409 PMCID: PMC7321371 DOI: 10.3390/molecules25112467] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/23/2020] [Revised: 05/21/2020] [Accepted: 05/22/2020] [Indexed: 11/24/2022]
Abstract
We discuss the use of regularized linear discriminant analysis (LDA) as a model reduction technique combined with particle swarm optimization (PSO) in protein tertiary structure prediction, followed by structure refinement based on singular value decomposition (SVD) and PSO. The algorithm presented in this paper belongs to the category of template-based modeling. It preselects protein templates before constructing a lower-dimensional subspace via a regularized LDA. The protein coordinates in the reduced space are sampled using a highly explorative optimization algorithm, regressive–regressive PSO (RR-PSO). The obtained structure is then projected onto a reduced space via singular value decomposition and further optimized via RR-PSO to carry out a structure refinement. The final structures are similar to those predicted by the best structure prediction tools, such as the Rosetta and Zhang servers. The main advantage of our methodology is that it alleviates the ill-posed character of protein structure prediction problems related to high-dimensional optimization. It is also capable of sampling a wide range of conformational space due to the application of a regularized linear discriminant analysis, which allows us to expand the differences over a reduced basis set.
3
Macnar JM, Szulc NA, Kryś JD, Badaczewska-Dawid AE, Gront D. BioShell 3.0: Library for Processing Structural Biology Data. Biomolecules 2020; 10:biom10030461. [PMID: 32188163 PMCID: PMC7175226 DOI: 10.3390/biom10030461] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 01/12/2020] [Revised: 03/05/2020] [Accepted: 03/10/2020] [Indexed: 01/11/2023]
Abstract
BioShell is an open-source package for processing biological data, particularly focused on structural applications. The package provides parsers, data structures and algorithms for handling and analyzing macromolecular sequences, structures and sequence profiles. The most frequently used routines are accessible through a set of easy-to-use command line utilities for a Linux environment. Using the full functionality of the package requires knowledge of C++ or Python to assemble an application from this software library. Since the last publication, which announced version 2.0, the package has been greatly expanded and rewritten in the C++11 standard to improve its modularity and efficiency. A new testing platform has been implemented to continuously test the correctness and integrity of the package. More than two hundred test programs have been published to provide simple examples that can be used as templates. This makes BioShell an easy-to-use library that greatly speeds up development of bioinformatics applications and web services without compromising computational efficiency.
Affiliation(s)
- Joanna M. Macnar
  - Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
  - College of Inter-Faculty Individual Studies in Mathematics and Natural Sciences, University of Warsaw, Stefana Banacha 2C, 02-097 Warsaw, Poland
- Natalia A. Szulc
  - Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
  - Laboratory of Protein Metabolism, International Institute of Molecular and Cell Biology in Warsaw, 4 Ks. Trojdena Street, 02-109 Warsaw, Poland
- Justyna D. Kryś
  - Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Aleksandra E. Badaczewska-Dawid
  - Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Dominik Gront
  - Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
  - Correspondence:
4
Kopeć K, Pędziwiatr M, Gront D, Sztatelman O, Sławski J, Łazicka M, Worch R, Zawada K, Makarova K, Nyk M, Grzyb J. Comparison of α-Helix and β-Sheet Structure Adaptation to a Quantum Dot Geometry: Toward the Identification of an Optimal Motif for a Protein Nanoparticle Cover. ACS Omega 2019; 4:13086-13099. [PMID: 31460436 PMCID: PMC6705085 DOI: 10.1021/acsomega.9b00505] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 04/05/2019] [Accepted: 07/23/2019] [Indexed: 05/31/2023]
Abstract
While quantum dots (QDs) are useful as fluorescent labels, their application in biosciences is limited due to the stability and hydrophobicity of their surface. In this study, we tested two types of proteins for use as a cover for spherical QDs, composed of cadmium selenide. Pumilio homology domain (Puf), which is mostly α-helical, and leucine-rich repeat (LRR) domain, which is rich in β-sheets, were selected to determine if there is a preference for one of these secondary structure types for nanoparticle covers. The protein sequences were optimized to improve their interaction with the surface of QDs. The solubilization of the apoproteins and their assembly with nanoparticles required the application of a detergent, which was removed in subsequent steps. Finally, only the Puf-based cover was successful enough as a QD hydrophilic cover. We showed that a single polypeptide dimer of Puf, PufPuf, can form a cover. We characterized the size and fluorescent properties of the obtained QD:protein assemblies. We showed that the secondary structure of the Puf proteins was not destroyed upon contact with the QDs. We demonstrated that these assemblies do not promote the formation of reactive oxygen species during illumination of the nanoparticles. The data represent advances in the effort to obtain a stable biocompatible cover for QDs.
Affiliation(s)
- Katarzyna Kopeć
  - Institute of Physics, Polish Academy of Sciences, Aleja Lotników 32/46, PL02668 Warsaw, Poland
- Marta Pędziwiatr
  - Institute of Physics, Polish Academy of Sciences, Aleja Lotników 32/46, PL02668 Warsaw, Poland
- Dominik Gront
  - Faculty of Chemistry, University of Warsaw, Pasteura 1, PL02093 Warsaw, Poland
- Olga Sztatelman
  - Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Pawińskiego 5a, PL02106 Warsaw, Poland
- Jakub Sławski
  - Department of Biophysics, Faculty of Biotechnology, University of Wrocław, F. Joliot-Curie Street 14a, PL50383 Wrocław, Poland
- Magdalena Łazicka
  - Department of Metabolic Regulation, Institute of Biochemistry, Faculty of Biology, University of Warsaw, Miecznikowa 1, PL02096 Warsaw, Poland
- Remigiusz Worch
  - Institute of Physics, Polish Academy of Sciences, Aleja Lotników 32/46, PL02668 Warsaw, Poland
- Katarzyna Zawada
  - Department of Physical Chemistry, Faculty of Pharmacy with the Laboratory Medicine Division, The Medical University of Warsaw, Banacha 1 Street, PL02097 Warsaw, Poland
- Katerina Makarova
  - Department of Physical Chemistry, Faculty of Pharmacy with the Laboratory Medicine Division, The Medical University of Warsaw, Banacha 1 Street, PL02097 Warsaw, Poland
- Marcin Nyk
  - Advanced Materials Engineering and Modelling Group, Faculty of Chemistry, Wrocław University of Science and Technology, Wybrzeże Wyspiańskiego 27, PL50370 Wrocław, Poland
- Joanna Grzyb
  - Department of Biophysics, Faculty of Biotechnology, University of Wrocław, F. Joliot-Curie Street 14a, PL50383 Wrocław, Poland
5
Álvarez Ó, Fernández-Martínez JL, Corbeanu AC, Fernández-Muñiz Z, Kloczkowski A. Predicting protein tertiary structure and its uncertainty analysis via particle swarm sampling. J Mol Model 2019; 25:79. [PMID: 30810816 PMCID: PMC7586042 DOI: 10.1007/s00894-019-3956-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 05/16/2018] [Accepted: 02/05/2019] [Indexed: 10/27/2022]
Abstract
We discuss the relationship between the problem of protein tertiary structure prediction from the amino acid sequence and the uncertainty analysis. The algorithm presented in this paper belongs to the category of decoy-based modeling, where different known protein models are used to establish a low dimensional space via principal component analysis. The low dimensional space is utilized to perform an energy optimization via a family of very explorative particle swarm optimizers to find the global minimum. The aim of this procedure is to get a representative sample of the nonlinear equivalent region, that is, protein models that have their energy lower than a certain energy bound. The posterior analysis of this family provides very valuable information about the backbone structure of the native conformation and its possible alternate states. This methodology has the advantage of being simple and fast and can help refine the tertiary protein structure. We comprehensively illustrate the performance of our algorithm on one protein from the CASP-9 protein structure prediction experiment. We also provide a theoretical analysis of the energy landscape found in the tertiary structure protein inverse problem, explaining why model reduction techniques (principal component analysis in this case) serve to alleviate the ill-posed character of this high dimensional optimization problem. In addition, we expand the computational benchmark with a summary of other CASP-9 proteins in the Appendix.
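The idea of sampling a low-dimensional space and retaining every model whose energy falls below a bound can be sketched with a plain global-best particle swarm optimizer. This is a generic textbook PSO, not the authors' family of explorative optimizers; the function `pso_sample`, its parameter values, and the quadratic toy energy are assumptions made purely for illustration.

```python
import numpy as np

def pso_sample(energy, dim, n_particles=30, n_iter=200, bound=None,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `energy` with a standard global-best PSO.

    Also collects every visited point whose energy is below `bound`,
    as a crude sample of the low-energy region. Hypothetical helper.
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=(n_particles, dim))  # reduced coordinates
    v = np.zeros_like(x)
    pbest = x.copy()
    pbest_e = np.array([energy(p) for p in x])
    gbest = pbest[np.argmin(pbest_e)].copy()
    kept = []
    for _ in range(n_iter):
        r1 = rng.random(x.shape)
        r2 = rng.random(x.shape)
        # Inertia + attraction to personal best + attraction to global best.
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = x + v
        e = np.array([energy(p) for p in x])
        improved = e < pbest_e
        pbest[improved] = x[improved]
        pbest_e[improved] = e[improved]
        gbest = pbest[np.argmin(pbest_e)].copy()
        if bound is not None:
            kept.extend(x[e < bound])
    return gbest, float(pbest_e.min()), np.array(kept)

# Toy stand-in for an energy function over 5 reduced coordinates.
target = np.full(5, 0.3)

def toy_energy(a):
    return float(np.sum((a - target) ** 2))

best, best_e, samples = pso_sample(toy_energy, dim=5, bound=0.5)
print(best_e, len(samples))
```

The retained `samples` play the role of the family of models below the energy bound; in a real application each reduced coordinate vector would be mapped back to atomic coordinates before the energy is evaluated.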
Affiliation(s)
- Óscar Álvarez
  - Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, C. Federico García Lorca, 18, 33007, Oviedo, Spain
- Juan Luis Fernández-Martínez
  - Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, C. Federico García Lorca, 18, 33007, Oviedo, Spain.
- Ana Cernea Corbeanu
  - Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, C. Federico García Lorca, 18, 33007, Oviedo, Spain
- Zulima Fernández-Muñiz
  - Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, C. Federico García Lorca, 18, 33007, Oviedo, Spain
- Andrzej Kloczkowski
  - Battelle Center for Mathematical Medicine, Nationwide Children's Hospital, Columbus, OH, USA
  - Department of Pediatrics, The Ohio State University, Columbus, OH, USA
6
Álvarez Ó, Fernández-Martínez JL, Fernández-Brillet C, Cernea A, Fernández-Muñiz Z, Kloczkowski A. Principal component analysis in protein tertiary structure prediction. J Bioinform Comput Biol 2018; 16:1850005. [PMID: 29566640 DOI: 10.1142/s0219720018500051] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/18/2022]
Abstract
We discuss the applicability of principal component analysis (PCA) for protein tertiary structure prediction from amino acid sequence. The algorithm presented in this paper belongs to the category of protein refinement models and involves establishing a low-dimensional space where the sampling (and optimization) is carried out via a particle swarm optimizer (PSO). The reduced space is found via PCA performed for a set of low-energy protein models previously found using different optimization techniques. A high frequency term is added into this expansion by projecting the best decoy into the PCA basis set and calculating the residual model. This term is aimed at providing high frequency details in the energy optimization. The goal of this research is to analyze how the dimensionality reduction affects the prediction capability of the PSO procedure. For that purpose, different proteins from the Critical Assessment of Techniques for Protein Structure Prediction experiments were modeled. In all the cases, both the energy of the best decoy and the distance to the native structure have decreased. Our analysis also shows how the predicted backbone structure of the native conformation and of alternative low energy states varies with respect to the PCA dimensionality. Generally speaking, the reconstruction can be successfully achieved with 10 principal components and the high frequency term. We also provide a computational analysis of the protein energy landscape for the inverse problem of reconstructing structure from the reduced number of principal components, showing that the dimensionality reduction alleviates the ill-posed character of this high-dimensional energy optimization problem. The procedure explained in this paper is very fast and allows testing different PCA expansions. Our results show that PSO improves the energy of the best decoy used in the PCA when an adequate number of PCA terms is considered.
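The expansion described above, a PCA basis built from decoys plus a residual "high frequency" term obtained from the best decoy, can be sketched as follows. This is a minimal illustration under stated assumptions: NumPy's SVD is used to obtain the principal directions, the decoys are random synthetic vectors rather than real protein models, and the helper names `pca_basis`, `residual_term`, and `reconstruct` are hypothetical.

```python
import numpy as np

def pca_basis(decoys, k):
    """Mean and the first k principal directions of a (n_decoys, n_coords) matrix."""
    mean = decoys.mean(axis=0)
    # Rows of vt are orthonormal principal directions of the centered data.
    _, _, vt = np.linalg.svd(decoys - mean, full_matrices=False)
    return mean, vt[:k]

def residual_term(best, mean, basis):
    """Part of the best decoy that the k-term PCA expansion cannot represent."""
    coeffs = basis @ (best - mean)
    return best - (mean + basis.T @ coeffs)

def reconstruct(coeffs, mean, basis, residual):
    """Model = mean + PCA expansion + fixed high-frequency residual."""
    return mean + coeffs @ basis + residual

# Synthetic stand-in: 20 "decoys" of a 10-atom model (30 flattened coordinates).
rng = np.random.default_rng(0)
decoys = rng.normal(size=(20, 30))
best = decoys[0]

mean, basis = pca_basis(decoys, k=10)
resid = residual_term(best, mean, basis)
coeffs = basis @ (best - mean)
rebuilt = reconstruct(coeffs, mean, basis, resid)
print(np.allclose(rebuilt, best))
```

With the residual held fixed, an optimizer only has to search over the `k` coefficients, yet the best decoy is still reproduced exactly at its own coefficients, which is the point of adding the term.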
Affiliation(s)
- Óscar Álvarez
  - Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, C. Federico García Lorca, 18, 33007 Oviedo, Spain
- Juan Luis Fernández-Martínez
  - Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, C. Federico García Lorca, 18, 33007 Oviedo, Spain
- Celia Fernández-Brillet
  - Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, C. Federico García Lorca, 18, 33007 Oviedo, Spain
- Ana Cernea
  - Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, C. Federico García Lorca, 18, 33007 Oviedo, Spain
- Zulima Fernández-Muñiz
  - Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, C. Federico García Lorca, 18, 33007 Oviedo, Spain
- Andrzej Kloczkowski
  - Battelle Center for Mathematical Medicine, Nationwide Children's Hospital, Columbus, OH, USA
  - Department of Pediatrics, The Ohio State University, Columbus, OH, USA
7
Matowane RG, Wieteska L, Bamal HD, Kgosiemang IKR, Van Wyk M, Manume NA, Abdalla SMH, Mashele SS, Gront D, Syed K. In silico analysis of cytochrome P450 monooxygenases in chronic granulomatous infectious fungus Sporothrix schenckii: Special focus on CYP51. Biochimica et Biophysica Acta - Proteins and Proteomics 2017; 1866:166-177. [PMID: 28989052 DOI: 10.1016/j.bbapap.2017.10.003] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 04/23/2017] [Revised: 09/29/2017] [Accepted: 10/02/2017] [Indexed: 01/19/2023]
Abstract
Sporotrichosis is an emerging chronic, granulomatous, subcutaneous, mycotic infection caused by Sporothrix species. Sporotrichosis is treated with the azole drug itraconazole, as ketoconazole is ineffective. It is well known that azole drugs act by inhibiting cytochrome P450 monooxygenases (P450s), which are heme-thiolate proteins. To date, nothing is known about P450s in Sporothrix schenckii or the molecular basis of its resistance to ketoconazole. Here we present genome-wide identification, annotation, phylogenetic analysis and comprehensive P450 family-level comparative analysis of S. schenckii P450s with pathogenic fungi P450s, along with a rationale for ketoconazole resistance by S. schenckii based on in silico structural analysis of CYP51. Genome data-mining of S. schenckii revealed 40 P450s in its genome that can be grouped into 32 P450 families and 39 P450 subfamilies. Comprehensive comparative analysis of P450s revealed that S. schenckii shares 11 P450 families with plant pathogenic fungi and has three unique P450 families: CYP5077, CYP5386 and CYP5696 (novel family). Among the P450s, CYP51, the main target of azole drugs, was also found in S. schenckii. 3D modeling of S. schenckii CYP51 revealed the presence of characteristic P450 motifs with an exceptionally large reductase interaction site 2. In silico analysis revealed a number of mutations that can be associated with ketoconazole resistance, especially at the channel entrance to the active site. One possible reason for the better stabilization of itraconazole, compared to ketoconazole, is that the more extended molecule of itraconazole may form a hydrogen bond with ASN-230. This in turn may explain its effectiveness against S. schenckii vis-à-vis the organism's resistance to ketoconazole. This article is part of a Special Issue entitled: Cytochrome P450 biodiversity and biotechnology, edited by Erika Plettner, Gianfranco Gilardi, Luet Wong, Vlada Urlacher, Jared Goldstone.
Affiliation(s)
- Retshedisitswe Godfrey Matowane
  - Unit for Drug Discovery Research, Department of Health Sciences, Faculty of Health and Environmental Sciences, Central University of Technology, Bloemfontein 9300, Free State, South Africa
- Lukasz Wieteska
  - Laboratory of Theory of Biopolymers, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Hans Denis Bamal
  - Unit for Drug Discovery Research, Department of Health Sciences, Faculty of Health and Environmental Sciences, Central University of Technology, Bloemfontein 9300, Free State, South Africa
- Ipeleng Kopano Rosinah Kgosiemang
  - Unit for Drug Discovery Research, Department of Health Sciences, Faculty of Health and Environmental Sciences, Central University of Technology, Bloemfontein 9300, Free State, South Africa
- Mari Van Wyk
  - Unit for Drug Discovery Research, Department of Health Sciences, Faculty of Health and Environmental Sciences, Central University of Technology, Bloemfontein 9300, Free State, South Africa
- Nessie Agnes Manume
  - Unit for Drug Discovery Research, Department of Health Sciences, Faculty of Health and Environmental Sciences, Central University of Technology, Bloemfontein 9300, Free State, South Africa
- Sara Mohamed Hasaan Abdalla
  - Unit for Drug Discovery Research, Department of Health Sciences, Faculty of Health and Environmental Sciences, Central University of Technology, Bloemfontein 9300, Free State, South Africa
- Samson Sitheni Mashele
  - Unit for Drug Discovery Research, Department of Health Sciences, Faculty of Health and Environmental Sciences, Central University of Technology, Bloemfontein 9300, Free State, South Africa
- Dominik Gront
  - Laboratory of Theory of Biopolymers, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Khajamohiddin Syed
  - Unit for Drug Discovery Research, Department of Health Sciences, Faculty of Health and Environmental Sciences, Central University of Technology, Bloemfontein 9300, Free State, South Africa.
8
Sluchanko NN, Beelen S, Kulikova AA, Weeks SD, Antson AA, Gusev NB, Strelkov SV. Structural Basis for the Interaction of a Human Small Heat Shock Protein with the 14-3-3 Universal Signaling Regulator. Structure 2017; 25:305-316. [PMID: 28089448 PMCID: PMC5321513 DOI: 10.1016/j.str.2016.12.005] [Citation(s) in RCA: 88] [Impact Index Per Article: 12.6] [Received: 09/19/2016] [Revised: 11/14/2016] [Accepted: 12/12/2016] [Indexed: 12/31/2022]
Abstract
By interacting with hundreds of protein partners, 14-3-3 proteins coordinate vital cellular processes. Phosphorylation of the small heat shock protein, HSPB6, within its intrinsically disordered N-terminal domain activates its interaction with 14-3-3, ultimately triggering smooth muscle relaxation. After analyzing the binding of an HSPB6-derived phosphopeptide to 14-3-3 using isothermal calorimetry and X-ray crystallography, we have determined the crystal structure of the complete assembly consisting of the 14-3-3 dimer and full-length HSPB6 dimer and further characterized this complex in solution using fluorescence spectroscopy, small-angle X-ray scattering, and limited proteolysis. We show that selected intrinsically disordered regions of HSPB6 are transformed into well-defined conformations upon the interaction, whereby an unexpectedly asymmetric structure is formed. This structure provides the first atomic resolution snapshot of a human small HSP in functional state, explains how 14-3-3 proteins sequester their regulatory partners, and can inform the design of small-molecule interaction modifiers to be used as myorelaxants.
Affiliation(s)
- Nikolai N Sluchanko
  - Laboratory of Structural Biochemistry of Proteins, A.N. Bach Institute of Biochemistry, Federal Research Center "Fundamentals of Biotechnology", Russian Academy of Sciences, 119071 Moscow, Russia.
- Steven Beelen
  - Laboratory for Biocrystallography, Department of Pharmaceutical and Pharmacological Sciences, KU Leuven, 3000 Leuven, Belgium
- Alexandra A Kulikova
  - Laboratory of Protein Conformational Polymorphism in Health and Disease, Engelhardt Institute of Molecular Biology, 119991 Moscow, Russia
- Stephen D Weeks
  - Laboratory for Biocrystallography, Department of Pharmaceutical and Pharmacological Sciences, KU Leuven, 3000 Leuven, Belgium
- Alfred A Antson
  - York Structural Biology Laboratory, Department of Chemistry, University of York, York YO10 5YW, UK
- Nikolai B Gusev
  - Department of Biochemistry, School of Biology, Moscow State University, 119991 Moscow, Russia
- Sergei V Strelkov
  - Laboratory for Biocrystallography, Department of Pharmaceutical and Pharmacological Sciences, KU Leuven, 3000 Leuven, Belgium.
9
Leelananda SP, Kloczkowski A, Jernigan RL. Fold-specific sequence scoring improves protein sequence matching. BMC Bioinformatics 2016; 17:328. [PMID: 27578239 PMCID: PMC5006591 DOI: 10.1186/s12859-016-1198-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 03/21/2016] [Accepted: 08/24/2016] [Indexed: 11/10/2022]
Abstract
Background: Sequence matching is extremely important for applications throughout biology, particularly for discovering information such as functional and evolutionary relationships, and also for discriminating between unimportant and disease mutants. At present the functions of a large fraction of genes are unknown; improvements in sequence matching will improve gene annotations. Universal amino acid substitution matrices such as Blosum62 are used to measure sequence similarities and to identify distant homologues, regardless of the structure class. However, such single matrices do not take into account important structural information evident within the different topologies of proteins, and they treat substitutions within all protein folds identically. Others have suggested that the use of structural information can lead to significant improvements in sequence matching, but this has not yet been very effective. Here we develop novel substitution matrices that include not only general sequence information but also a topology-specific component that is unique for each CATH topology. This novel feature of using a combination of sequence and structure information for each protein topology significantly improves the sequence matching scores for the sequence pairs tested. We have used a novel multi-structure alignment method for each homology level of CATH in order to extract topological information.
Results: We obtain statistically significant improved sequence matching scores for 73% of the alpha helical test cases. On average, 61% of the test cases showed improvements in homology detection when structure information was incorporated into the substitution matrices. On average, z-scores for homology detection are improved by more than 54% for all cases, and some individual cases have z-scores more than twice those obtained using generic matrices. Our topology-specific similarity matrices also outperform other traditional similarity matrices and single matrix based structure methods. When the default amino acid substitution matrix in the Psi-blast algorithm is replaced by our structure-based matrices, the structure matching is significantly improved over conventional Psi-blast. It also outperforms results obtained for the corresponding HMM profiles generated for each topology.
Conclusions: We show that by incorporating topology-specific structure information in addition to sequence information into specific amino acid substitution matrices, the sequence matching scores and homology detection are significantly improved. Our topology-specific similarity matrices outperform other traditional similarity matrices and single matrix based structure methods, and also show improvement over conventional Psi-blast and HMM profile based methods in sequence matching. The results support the discriminatory ability of the new amino acid similarity matrices to distinguish between distant homologs and structurally dissimilar pairs.
Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1198-z) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Sumudu P Leelananda
  - Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, 112 Office and Lab Building, Ames, IA, 50011-3020, USA
  - Laurence H. Baker Center for Bioinformatics and Biological Statistics, Iowa State University, 112 Office and Lab Building, Ames, IA, 50011-3020, USA
  - Present Address: 2120 Newman and Wolfrom Laboratory, The Ohio State University, 100 W 18th Ave, Columbus, OH, 43210, USA
  - Present Address: Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital, Columbus, OH, 43205, USA
- Andrzej Kloczkowski
  - Present Address: Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital, Columbus, OH, 43205, USA
  - Present Address: Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, 43205, USA
- Robert L Jernigan
  - Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, 112 Office and Lab Building, Ames, IA, 50011-3020, USA
  - Laurence H. Baker Center for Bioinformatics and Biological Statistics, Iowa State University, 112 Office and Lab Building, Ames, IA, 50011-3020, USA
10
Chowdhury SD, Sarkar AK, Lahiri A. Effect of Inactivating Mutations on Peptide Conformational Ensembles: The Plant Polypeptide Hormone Systemin. J Chem Inf Model 2016; 56:1267-81. [PMID: 27341535 DOI: 10.1021/acs.jcim.5b00666] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/12/2023]
Abstract
As part of their basal immune mechanism against insect/herbivore attacks, plants have evolved systemic response mechanisms. Such a systemic wound response in tomato was found to involve an 18 amino acid polypeptide called systemin, the first polypeptide hormone to be discovered in plants. Systematic alanine scanning and deletion studies showed differential modulation in its activity, particularly a major loss of function due to alanine substitution at positions 13 and 17 and a less extensive loss of function due to substitution at position 12. We have studied the conformational ensembles of wild-type systemin along with its 17 variants by carrying out a total of 5.76 μs of replica-exchange molecular dynamics simulation in an implicit solvent environment. In our simulations, wild-type systemin showed a lack of α-helical and β-sheet structures, in conformity with earlier circular dichroism and NMR data. On the other hand, two regions containing diproline segments showed a tendency to adopt polyproline II structures. Examination of conformational ensembles of the 17 variants revealed a change in the population distributions, suggesting a less flexible structure for alanine substitutions at positions 12 and 13 but not for position 17. Combined with the experimental observations that positions 1-14 of systemin are important for the formation of the peptide-receptor complex, this leads to the hypothesis that loss of conformational flexibility may play a role in the loss of activity of systemin due to the P12A and P13A substitutions, while T17A deactivation probably occurs for a different reason, most likely the loss of the threonine phosphorylation site. We also indicate possible structural reasons why the substitution of the prolines at positions 12 and 13 leads to a loss of conformational freedom in the peptide.
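Replica-exchange molecular dynamics periodically attempts to swap configurations between replicas at neighboring temperatures. The standard Metropolis acceptance rule for such a swap can be sketched as below; the abstract gives no temperatures, energies, or exchange schedule, so the numbers in the example and the function name `swap_probability` are illustrative assumptions only.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def swap_probability(e_i, e_j, t_i, t_j):
    """Metropolis probability of exchanging replicas i and j.

    e_i, e_j: potential energies (kcal/mol) of the configurations currently
    held at temperatures t_i and t_j (K).
    P = min(1, exp[(beta_i - beta_j) * (e_i - e_j)]), with beta = 1/(k_B*T).
    """
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    exponent = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if exponent >= 0.0 else math.exp(exponent)

# Hypothetical energies and temperatures, chosen only to exercise the rule.
p_downhill = swap_probability(-100.0, -105.0, 300.0, 320.0)
p_uphill = swap_probability(-105.0, -100.0, 300.0, 320.0)
print(p_downhill, round(p_uphill, 3))
```

A swap that moves the lower-energy configuration to the colder replica is always accepted, while the reverse is accepted only with a Boltzmann-weighted probability, which lets conformations trapped at low temperature escape through the hotter replicas.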
Affiliation(s)
- Saikat Dutta Chowdhury
  - Department of Biophysics, Molecular Biology and Bioinformatics, University of Calcutta, 92 Acharya Prafulla Chandra Road, Kolkata 700009, West Bengal, India
- Aditya K Sarkar
  - Department of Biophysics, Molecular Biology and Bioinformatics, University of Calcutta, 92 Acharya Prafulla Chandra Road, Kolkata 700009, West Bengal, India
- Ansuman Lahiri
  - Department of Biophysics, Molecular Biology and Bioinformatics, University of Calcutta, 92 Acharya Prafulla Chandra Road, Kolkata 700009, West Bengal, India
11
Kmiecik S, Gront D, Kolinski M, Wieteska L, Dawid AE, Kolinski A. Coarse-Grained Protein Models and Their Applications. Chem Rev 2016; 116:7898-936. [DOI: 10.1021/acs.chemrev.6b00163] [Citation(s) in RCA: 555] [Impact Index Per Article: 69.4] [Indexed: 02/07/2023]
Affiliation(s)
- Sebastian Kmiecik
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Dominik Gront
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Michal Kolinski
- Bioinformatics Laboratory, Mossakowski Medical Research Center of the Polish Academy of Sciences, Pawinskiego 5, 02-106 Warsaw, Poland
- Lukasz Wieteska
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Department of Medical Biochemistry, Medical University of Lodz, Mazowiecka 6/8, 92-215 Lodz, Poland
- Andrzej Kolinski
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
|
12
|
Geidl S, Svobodová Vařeková R, Bendová V, Petrusek L, Ionescu CM, Jurka Z, Abagyan R, Koča J. How Does the Methodology of 3D Structure Preparation Influence the Quality of pKa Prediction? J Chem Inf Model 2015; 55:1088-97. [PMID: 26010215 DOI: 10.1021/ci500758w] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 12/28/2022]
Abstract
The acid dissociation constant is an important molecular property, and it can be successfully predicted by Quantitative Structure-Property Relationship (QSPR) models, even for in silico designed molecules. We analyzed how the methodology of in silico 3D structure preparation influences the quality of QSPR models. Specifically, we evaluated and compared QSPR models based on six different 3D structure sources (DTP NCI, Pubchem, Balloon, Frog2, OpenBabel, and RDKit) combined with four different types of optimization. These analyses were performed for three classes of molecules (phenols, carboxylic acids, anilines), and the QSPR model descriptors were quantum mechanical (QM) and empirical partial atomic charges. Specifically, we developed 516 QSPR models and afterward systematically analyzed the influence of the 3D structure source and other factors on their quality. Our results confirmed that QSPR models based on partial atomic charges are able to predict pKa with high accuracy. We also confirmed that ab initio and semiempirical QM charges provide very accurate QSPR models and using empirical charges based on electronegativity equalization is also acceptable, as well as advantageous, because their calculation is very fast. On the other hand, Gasteiger-Marsili empirical charges are not applicable for pKa prediction. We later found that QSPR models for some classes of molecules (carboxylic acids) are less accurate. In this context, we compared the influence of different 3D structure sources. We found that an appropriate selection of 3D structure source and optimization method is essential for the successful QSPR modeling of pKa. Specifically, the 3D structures from the DTP NCI and Pubchem databases performed the best, as they provided very accurate QSPR models for all the tested molecular classes and charge calculation approaches, and they do not require optimization. Also, Frog2 performed very well. 
Other 3D structure sources can also be used but are not so robust, and an unfortunate combination of molecular class and charge calculation approach can produce weak QSPR models. Additionally, these 3D structures generally need optimization in order to produce good quality QSPR models.
Affiliation(s)
- Stanislav Geidl
- †National Centre for Biomolecular Research, Faculty of Science, and CEITEC - Central European Institute of Technology, Masaryk University Brno, Kamenice 5, 625 00 Brno-Bohunice, Czech Republic
- Radka Svobodová Vařeková
- †National Centre for Biomolecular Research, Faculty of Science, and CEITEC - Central European Institute of Technology, Masaryk University Brno, Kamenice 5, 625 00 Brno-Bohunice, Czech Republic
- Veronika Bendová
- †National Centre for Biomolecular Research, Faculty of Science, and CEITEC - Central European Institute of Technology, Masaryk University Brno, Kamenice 5, 625 00 Brno-Bohunice, Czech Republic
- Lukáš Petrusek
- †National Centre for Biomolecular Research, Faculty of Science, and CEITEC - Central European Institute of Technology, Masaryk University Brno, Kamenice 5, 625 00 Brno-Bohunice, Czech Republic
- Crina-Maria Ionescu
- †National Centre for Biomolecular Research, Faculty of Science, and CEITEC - Central European Institute of Technology, Masaryk University Brno, Kamenice 5, 625 00 Brno-Bohunice, Czech Republic
- Zdeněk Jurka
- †National Centre for Biomolecular Research, Faculty of Science, and CEITEC - Central European Institute of Technology, Masaryk University Brno, Kamenice 5, 625 00 Brno-Bohunice, Czech Republic
- Ruben Abagyan
- ‡Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, 9500 Gilman Drive, MC 0657, San Diego, California 92161, United States
- Jaroslav Koča
- †National Centre for Biomolecular Research, Faculty of Science, and CEITEC - Central European Institute of Technology, Masaryk University Brno, Kamenice 5, 625 00 Brno-Bohunice, Czech Republic
|
13
|
Improving thermal stability of thermophilic L-threonine aldolase from Thermotoga maritima. J Biotechnol 2015; 199:69-76. [DOI: 10.1016/j.jbiotec.2015.02.013] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 07/11/2014] [Revised: 02/10/2015] [Accepted: 02/11/2015] [Indexed: 11/20/2022]
|
14
|
Kmiecik S, Jamroz M, Kolinski M. Structure prediction of the second extracellular loop in G-protein-coupled receptors. Biophys J 2015; 106:2408-16. [PMID: 24896119 PMCID: PMC4052351 DOI: 10.1016/j.bpj.2014.04.022] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 01/27/2014] [Revised: 03/26/2014] [Accepted: 04/17/2014] [Indexed: 12/29/2022]
Abstract
G-protein-coupled receptors (GPCRs) play key roles in living organisms. Therefore, it is important to determine their functional structures. The second extracellular loop (ECL2) is a functionally important region of GPCRs, which poses a significant challenge for computational structure prediction methods. In this work, we evaluated CABS, a well-established protein modeling tool, for predicting ECL2 structure in 13 GPCRs. The ECL2s (with between 13 and 34 residues) are predicted in an environment in which the other extracellular loops are fully flexible and the transmembrane domain is fixed in its x-ray conformation. The modeling procedure used theoretical predictions of ECL2 secondary structure and experimental constraints on disulfide bridges. Our approach yielded ensembles of low-energy conformers and the most populated conformers that contained models close to the available x-ray structures. The level of similarity between the predicted models and x-ray structures is comparable to that of other state-of-the-art computational methods. Our results extend other studies by including newly crystallized GPCRs.
Affiliation(s)
- Sebastian Kmiecik
- University of Warsaw, Faculty of Chemistry, Laboratory of Theory of Biopolymers, Pasteura 1, 02-093 Warsaw, Poland
- Michal Jamroz
- University of Warsaw, Faculty of Chemistry, Laboratory of Theory of Biopolymers, Pasteura 1, 02-093 Warsaw, Poland
- Michal Kolinski
- Mossakowski Medical Research Center, Polish Academy of Sciences, Bioinformatics Laboratory, Pawinskiego 5, 02-106 Warsaw, Poland.
|
15
|
Kim H, Kihara D. Detecting local residue environment similarity for recognizing near-native structure models. Proteins 2014; 82:3255-72. [PMID: 25132526 PMCID: PMC4237674 DOI: 10.1002/prot.24658] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 01/15/2014] [Revised: 06/10/2014] [Accepted: 07/21/2014] [Indexed: 12/14/2022]
Abstract
We developed a new representation of local amino acid environments in protein structures called the Side-chain Depth Environment (SDE). An SDE defines a local structural environment of a residue considering the coordinates and the depth of amino acids located in the vicinity of the side-chain centroid of the residue. SDEs are general enough that similar SDEs are found in protein structures with globally different folds. Using SDEs, we developed a procedure called PRESCO (Protein Residue Environment SCOre) for selecting native or near-native models from a pool of computational models. The procedure searches for residue environments similar to those observed in a query model in a set of representative native protein structures to quantify how native-like the SDEs in the model are. When benchmarked on commonly used computational model datasets, PRESCO compared favorably with the other existing scoring functions in selecting native and near-native models.
Affiliation(s)
- Hyungrae Kim
- Department of Biological Sciences, Purdue University, West Lafayette IN, 47906, USA
- Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette IN, 47906, USA
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
|
16
|
Gniewek P, Kolinski A, Kloczkowski A, Gront D. BioShell-Threading: versatile Monte Carlo package for protein 3D threading. BMC Bioinformatics 2014; 15:22. [PMID: 24444459 PMCID: PMC3937128 DOI: 10.1186/1471-2105-15-22] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 11/25/2012] [Accepted: 11/18/2013] [Indexed: 11/26/2022]
Abstract
Background: The comparative modeling approach to protein structure prediction inherently relies on a template structure. Before building a model, such a template protein has to be found and aligned with the query sequence. Any error made at this stage may dramatically affect the quality of the result. There is a need, therefore, to develop accurate and sensitive alignment protocols. Results: BioShell threading software is a versatile tool for aligning protein structures, protein sequences or sequence profiles and query sequences to template structures. The software is also capable of sub-optimal alignment generation. It can be executed as an application from the UNIX command line, or as a set of Java classes called from a script or a Java application. The implemented Monte Carlo search engine greatly facilitates the development and benchmarking of new alignment scoring schemes even when the functions exhibit non-deterministic polynomial-time complexity. Conclusions: Numerical experiments indicate that the new threading application offers template detection abilities and provides much better alignments than other methods. The package along with documentation and examples is available at: http://bioshell.pl/threading3d.
Affiliation(s)
- Dominik Gront
- Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland.
|
17
|
Combining coarse-grained protein models with replica-exchange all-atom molecular dynamics. Int J Mol Sci 2013; 14:9893-905. [PMID: 23665897 PMCID: PMC3676820 DOI: 10.3390/ijms14059893] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 03/07/2013] [Revised: 04/09/2013] [Accepted: 04/24/2013] [Indexed: 01/30/2023]
Abstract
We describe a combination of all-atom simulations with CABS, a well-established coarse-grained protein modeling tool, into a single multiscale protocol. The simulation method has been tested on the C-terminal beta hairpin of protein G, a model system of protein folding. After reconstructing atomistic details, conformations derived from the CABS simulation were subjected to replica-exchange molecular dynamics simulations with OPLS-AA and AMBER99sb force fields in explicit solvent. Such a combination accelerates system convergence several times in comparison with all-atom simulations starting from the extended chain conformation, demonstrated by the analysis of melting curves, the number of native-like conformations as a function of time and secondary structure propagation. The results strongly suggest that the proposed multiscale method could be an efficient and accurate tool for high-resolution studies of protein folding dynamics in larger systems.
|
18
|
patGPCR: a multitemplate approach for improving 3D structure prediction of transmembrane helices of G-protein-coupled receptors. Computational and Mathematical Methods in Medicine 2013; 2013:486125. [PMID: 23554839 PMCID: PMC3608176 DOI: 10.1155/2013/486125] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 11/05/2012] [Revised: 01/10/2013] [Accepted: 01/16/2013] [Indexed: 11/17/2022]
Abstract
The structures of the seven transmembrane helices of G-protein-coupled receptors are critically involved in many aspects of these receptors, such as receptor stability, ligand docking, and molecular function. Most of the previous multitemplate approaches have built a "super" template with very little merging of aligned fragments from different templates. Here, we present a parallelized multitemplate approach, patGPCR, to predict the 3D structures of transmembrane helices of G-protein-coupled receptors. patGPCR, which employs a bundle-packing related energy function that extends on the RosettaMem energy, parallelizes eight pipelines for transmembrane helix refinement and exchanges the optimized helix structures from multiple templates. We have investigated the performance of patGPCR on a test set containing eight determined G-protein-coupled receptors. The results indicate that patGPCR improves the TM RMSD of the predicted models by 33.64% on average against a single-template method. Compared with other homology approaches, the best models for five of the eight targets built by patGPCR had a lower TM RMSD than that obtained from SWISS-MODEL; patGPCR also showed lower average TM RMSD than single-template and multiple-template MODELLER.
|
19
|
Assessing the accuracy of template-based structure prediction metaservers by comparison with structural genomics structures. J Struct Funct Genomics 2012; 13:213-25. [PMID: 23086054 DOI: 10.1007/s10969-012-9146-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 06/26/2012] [Accepted: 09/26/2012] [Indexed: 12/19/2022]
Abstract
The explosion of the size of the universe of known protein sequences has stimulated two complementary approaches to structural mapping of these sequences: theoretical structure prediction and experimental determination by structural genomics (SG). In this work, we assess the accuracy of structure prediction by two automated template-based structure prediction metaservers (genesilico.pl and bioinfo.pl) by measuring the structural similarity of the predicted models to corresponding experimental models determined a posteriori. Of 199 targets chosen from SG programs, the metaservers predicted the structures of about a fourth of them "correctly." (In this case, "correct" was defined as placing more than 70% of the alpha carbon atoms in the model within 2 Å of the experimentally determined positions.) Almost all of the targets that could be modeled to this accuracy were those with an available template in the Protein Data Bank (PDB) with more than 25% sequence identity. The majority of those SG targets with lower sequence identity to structures in the PDB were not predicted by the metaservers with this accuracy. We also compared metaserver results to CASP8 results, finding that the models obtained by participants in the CASP competition were significantly better than those produced by the metaservers.
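The "correct" criterion quoted in this abstract (more than 70% of alpha carbon atoms within 2 Å of the experimental positions) can be illustrated with a minimal Python sketch. It assumes the model and the experimental structure are already superimposed and that their Cα atoms are paired in the same order; the abstract does not describe the superposition procedure, and the function names `fraction_within` and `is_correct` are hypothetical, not taken from the paper.

```python
import math

def fraction_within(model, reference, cutoff=2.0):
    """Fraction of paired C-alpha atoms lying within `cutoff` (in Å) of each other.

    Assumes both coordinate lists are already superimposed and ordered identically.
    """
    if len(model) != len(reference) or not model:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    close = sum(1 for a, b in zip(model, reference) if math.dist(a, b) <= cutoff)
    return close / len(model)

def is_correct(model, reference, cutoff=2.0, min_fraction=0.70):
    """Apply the abstract's criterion: more than 70% of C-alpha atoms within 2 Å."""
    return fraction_within(model, reference, cutoff) > min_fraction

# Toy example: three of four atoms are 1 Å away, the fourth is 5 Å away.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0), (3.0, 0.0, 5.0)]
print(fraction_within(model, reference))
print(is_correct(model, reference))
```

In practice the result depends on how the two structures are superimposed, so the sketch only captures the counting step of the criterion.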
|
20
|
Lü Q, Xia XY, Chen R, Miao DJ, Chen SS, Quan LJ, Li HO. When the lowest energy does not induce native structures: parallel minimization of multi-energy values by hybridizing searching intelligences. PLoS One 2012; 7:e44967. [PMID: 23028708 PMCID: PMC3460973 DOI: 10.1371/journal.pone.0044967] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 02/02/2012] [Accepted: 08/16/2012] [Indexed: 12/03/2022]
Abstract
Background: Protein structure prediction (PSP), which is usually modeled as a computational optimization problem, remains one of the biggest challenges in computational biology. PSP encounters two difficult obstacles: the inaccurate energy function problem and the searching problem. Even if the searching procedure happens to find the lowest energy, the correct protein structure is not guaranteed to be obtained. Results: A general parallel metaheuristic approach is presented to tackle the above two problems. Multi-energy functions are employed to simultaneously guide the parallel searching threads. Searching trajectories are in fact controlled by the parameters of heuristic algorithms. The parallel approach allows the parameters to be perturbed while the searching threads are running in parallel, with each thread searching for the lowest energy value determined by an individual energy function. By hybridizing the intelligences of parallel ant colonies and Monte Carlo Metropolis search, this paper demonstrates an implementation of our parallel approach for PSP. Sixteen classical instances were tested to show that the parallel approach is competitive for solving the PSP problem. Conclusions: This parallel approach combines various sources of both searching intelligences and energy functions, and thus predicts protein conformations with good quality jointly determined by all the parallel searching threads and energy functions. It provides a framework to combine different searching intelligences embedded in heuristic algorithms. It also constructs a container to hybridize different not-so-accurate objective functions, which are usually derived from domain expertise.
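For readers unfamiliar with the Monte Carlo Metropolis search mentioned in this abstract, the following Python sketch shows the textbook Metropolis acceptance rule only. It is not the authors' implementation; the function name `metropolis_accept` and the convention that `temperature` is expressed in the same units as the energies are assumptions made for illustration.

```python
import math
import random

def metropolis_accept(e_old, e_new, temperature, rng=random):
    """Textbook Metropolis rule: always accept downhill moves; accept uphill
    moves with probability exp(-(e_new - e_old) / temperature).

    `temperature` is assumed to be in energy units (i.e. it stands for kT).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if e_new <= e_old:
        return True
    return rng.random() < math.exp(-(e_new - e_old) / temperature)

# A downhill move is always accepted; an uphill move is accepted only sometimes.
print(metropolis_accept(10.0, 8.0, temperature=1.0))
accepted = sum(metropolis_accept(0.0, 1.0, temperature=1.0) for _ in range(10_000))
print(accepted / 10_000)  # fluctuates around exp(-1)
```

In a multi-energy setting such as the one the abstract describes, each searching thread would evaluate its own energy function before applying a rule of this kind; how the paper couples the threads is not specified here.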
Affiliation(s)
- Qiang Lü
- School of Computer Science and Technology, Soochow University, Suzhou, China.
|
21
|
Gniewek P, Kolinski A, Gront D. Optimization of profile-to-profile alignment parameters for one-dimensional threading. J Comput Biol 2012; 19:879-86. [PMID: 22731622 DOI: 10.1089/cmb.2011.0307] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 12/22/2022]
Abstract
The development of automatic approaches for the comparison of protein sequences has become increasingly important. Methods that compare profiles allow for the use of information about whole protein families, resulting in more sensitive and accurate detection of distantly related sequences. In this contribution, we describe a thorough optimization and tests of a profile-to-profile alignment method. A number of different scoring schemes have been implemented and compared on the basis of their ability to identify a template protein from the same SCOP family as a query. In addition to sequence profiles, secondary structure profiles were used to increase the rate of successful detection. Our results show that a properly tuned one-dimensional threading method can recognize a correct template from the same SCOP family nearly as well as structural alignment. Our benchmark set, which might be useful in other similar studies, as well as the fold-recognition software we developed, may be downloaded (www.bioshell.pl/profile-alignments).
Affiliation(s)
- Pawel Gniewek
- Faculty of Chemistry, Warsaw University, Warsaw, Poland
|
22
|
Gront D, Blaszczyk M, Wojciechowski P, Kolinski A. BioShell Threader: protein homology detection based on sequence profiles and secondary structure profiles. Nucleic Acids Res 2012; 40:W257-62. [PMID: 22693216 PMCID: PMC3394251 DOI: 10.1093/nar/gks555] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/21/2022]
Abstract
The BioShell package has recently been extended with a web server for protein homology detection based on profile-to-profile alignment (known as 1D threading). Its aim is to assign structural templates to each domain of the query. The server uses sequence profiles that describe observed sequence variability and secondary structure profiles that provide the expected probability of a certain secondary structure type at a given position in a protein. Three independent predictors are used to increase the rate of successful predictions. Careful evaluation shows that there is a nearly 80% chance that the query sequence belongs to the same SCOP family as the top scoring template. The BioShell Threader server is freely available at: http://www.bioshell.pl/threader/.
Affiliation(s)
- Dominik Gront
- University of Warsaw, Faculty of Chemistry, Pasteura 1, 02-093 Warsaw, Poland.
|
23
|
Kmiecik S, Gront D, Kouza M, Kolinski A. From coarse-grained to atomic-level characterization of protein dynamics: transition state for the folding of B domain of protein A. J Phys Chem B 2012; 116:7026-32. [PMID: 22486297 DOI: 10.1021/jp301720w] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Indexed: 11/28/2022]
Abstract
Atomic-level molecular dynamics simulations are widely used for the characterization of the structural dynamics of proteins; however, they are limited to shorter time scales than the duration of most of the relevant biological processes. Properly designed coarse-grained models that trade atomic resolution for efficient sampling allow access to much longer time-scales. In-depth understanding of the structural dynamics, however, must involve atomic details. In this study, we tested a method for the rapid reconstruction of all-atom models from α carbon atom positions in the application to convert a coarse-grained folding trajectory of a well described model system: the B domain of protein A. The results show that the method and the spatial resolution of the resulting coarse-grained models enable computationally inexpensive reconstruction of realistic all-atom models. Additionally, by means of structural clustering, we determined the most persistent ensembles of the key folding step, the transition state. Importantly, the analysis of the overall structural topologies suggests a dominant folding pathway. This, together with the all-atom characterization of the obtained ensembles, in the form of contact maps, matches the experimental results well.
Affiliation(s)
- Sebastian Kmiecik
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland.
|
24
|
Gront D, Kmiecik S, Blaszczyk M, Ekonomiuk D, Koliński A. Optimization of protein models. Wiley Interdisciplinary Reviews: Computational Molecular Science 2012. [DOI: 10.1002/wcms.1090] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 12/25/2022]
Affiliation(s)
- Dominik Gront
- Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Sebastian Kmiecik
- Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Maciej Blaszczyk
- Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Dariusz Ekonomiuk
- Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Andrzej Koliński
- Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
|
25
|
Chruszcz M, Pomés A, Glesner J, Vailes LD, Osinski T, Porebski PJ, Majorek KA, Heymann PW, Platts-Mills TAE, Minor W, Chapman MD. Molecular determinants for antibody binding on group 1 house dust mite allergens. J Biol Chem 2011; 287:7388-98. [PMID: 22210776 DOI: 10.1074/jbc.m111.311159] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.0] [Indexed: 11/06/2022]
Abstract
House dust mites produce potent allergens, Der p 1 and Der f 1, that cause allergic sensitization and asthma. Der p 1 and Der f 1 are cysteine proteases that elicit IgE responses in 80% of mite-allergic subjects and have proinflammatory properties. Their antigenic structure is unknown. Here, we present crystal structures of natural Der p 1 and Der f 1 in complex with a monoclonal antibody, 4C1, which binds to a unique cross-reactive epitope on both allergens associated with IgE recognition. The 4C1 epitope is formed by almost identical amino acid sequences and contact residues. Mutations of the contact residues abrogate mAb 4C1 binding and reduce IgE antibody binding. These surface-exposed residues are molecular targets that can be exploited for development of recombinant allergen vaccines.
Affiliation(s)
- Maksymilian Chruszcz
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22908, USA
|
26
|
Plonska-Ocypa K, Sibilska I, Sicinski RR, Sicinska W, Plum LA, DeLuca HF. 13,13-Dimethyl-des-C,D analogues of (20S)-1α,25-dihydroxy-2-methylene-19-norvitamin D₃ (2MD): total synthesis, docking to the VDR, and biological evaluation. Bioorg Med Chem 2011; 19:7205-20. [PMID: 22018918 DOI: 10.1016/j.bmc.2011.09.048] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 07/20/2011] [Revised: 09/21/2011] [Accepted: 09/24/2011] [Indexed: 11/30/2022]
Abstract
As a continuation of our studies focused on the vitamin D compounds lacking the C,D-hydrindane system, 13,13-dimethyl-des-C,D analogues of (20S)-1α,25-dihydroxy-2-methylene-19-norvitamin D₃ (2, 2MD) were prepared by total synthesis. The known cyclohexanone 30, a precursor of the desired A-ring phosphine oxide 11, was synthesized starting with the keto acetal 13, whereas the aldehyde 12, constituting an acyclic 'upper' building block, was obtained from the isomeric esters 34, prepared previously in our laboratory. The commercial 1,4-cyclohexanedione monoethylene ketal (13) was enantioselectively α-hydroxylated utilizing the α-aminoxylation process catalyzed by l-proline, and the introduced hydroxy group was protected as a TBS, TPDPS, and SEM ether. Then the keto group in the obtained compounds 15-17 was methylenated and the allylic hydroxylation was performed with selenium dioxide and pyridine N-oxide. After separation of the isomers, the newly introduced hydroxy group was protected and the ketal group hydrolyzed to yield the corresponding protected (3R,5R)-3,5-dihydroxycyclohexanones 30-32. The esters 34, starting compounds for the C,D-fragment 12, were first α-methylated, then reduced, and the resulting primary alcohols 36 were deoxygenated using the Barton-McCombie protocol. The primary hydroxy group in the obtained diether 38 was deprotected and oxidized to furnish the aldehyde 12. The Wittig-Horner coupling of the latter with the anion of the phosphine oxide 11, followed by hydroxyl deprotection, furnished two isomeric 13,13-dimethyl-des-C,D analogues of 2MD (compounds 10 and 42) differing in the configuration of their 7,8-double bond. Pure vitamin D analogues were isolated by HPLC and their biological activity was examined. 
The in vitro tests indicated that, compared to the analogue 7, unsubstituted at C-13, the synthesized vitamin D analogue 10 showed markedly improved VDR binding ability, significantly enhanced HL-60 differentiation activity as well as increased transcriptional potency. Docking simulations provided a rational explanation for the observed binding affinity of these ligands to the VDR. Biological in vivo tests proved that des-C,D compound 10 retained some intestinal activity. Its geometrical isomer 42 was devoid of any biological activity.
Affiliation(s)
- Katarzyna Plonska-Ocypa
- Department of Biochemistry, University of Wisconsin-Madison, 433 Babcock Drive, Madison, WI 53706, USA
|
27
|
Gront D, Kulp DW, Vernon RM, Strauss CEM, Baker D. Generalized fragment picking in Rosetta: design, protocols and applications. PLoS One 2011; 6:e23294. [PMID: 21887241 PMCID: PMC3160850 DOI: 10.1371/journal.pone.0023294] [Citation(s) in RCA: 132] [Impact Index Per Article: 10.2] [Received: 04/22/2011] [Accepted: 07/12/2011] [Indexed: 11/21/2022]
Abstract
The Rosetta de novo structure prediction and loop modeling protocols begin with coarse grained Monte Carlo searches in which the moves are based on short fragments extracted from a database of known structures. Here we describe a new object oriented program for picking fragments that greatly extends the functionality of the previous program (nnmake) and opens the door for new approaches to structure modeling. We provide a detailed description of the code design and architecture, highlighting its modularity, and new features such as extensibility, total control over the fragment picking workflow and scoring system customization. We demonstrate that the program provides at least as good building blocks for ab-initio structure prediction as the previous program, and provide examples of the wide range of applications that are now accessible.
Affiliation(s)
- Dominik Gront
- Faculty of Chemistry, University of Warsaw, Warsaw, Poland.
|
28
|
Kmiecik S, Kolinski A. Simulation of chaperonin effect on protein folding: a shift from nucleation-condensation to framework mechanism. J Am Chem Soc 2011; 133:10283-9. [PMID: 21618995 PMCID: PMC3132998 DOI: 10.1021/ja203275f] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.8] [Indexed: 01/28/2023]
Abstract
The iterative annealing mechanism (IAM) of chaperonin-assisted protein folding is explored in the framework of a well-established coarse-grained protein modeling tool, which enables the study of protein dynamics on a time-scale well beyond classical all-atom molecular mechanics. The chaperonin mechanism of action is simulated for two paradigm systems of protein folding, B domain of protein A (BdpA) and B1 domain of protein G (GB1), and compared to chaperonin-free simulations presented here for BdpA and recently published for GB1. The prediction of the BdpA transition state ensemble (TSE) is in perfect agreement with experimental findings. It is shown that periodic distortion of the polypeptide chains by hydrophobic chaperonin interactions can promote rapid folding and leads to a decrease in folding temperature. It is also demonstrated how chaperonin action prevents kinetically trapped conformations and shifts the observed folding mechanism from nucleation-condensation to a more framework-like mechanism.
Affiliation(s)
- Sebastian Kmiecik
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland.
|
29
|
Klimecka MM, Chruszcz M, Font J, Skarina T, Shumilin I, Onopryienko O, Porebski PJ, Cymborowski M, Zimmerman MD, Hasseman J, Glomski IJ, Lebioda L, Savchenko A, Edwards A, Minor W. Structural analysis of a putative aminoglycoside N-acetyltransferase from Bacillus anthracis. J Mol Biol 2011; 410:411-23. [PMID: 21601576 DOI: 10.1016/j.jmb.2011.04.076] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 12/22/2010] [Revised: 04/04/2011] [Accepted: 04/29/2011] [Indexed: 11/19/2022]
Abstract
For the last decade, worldwide efforts for the treatment of anthrax infection have focused on developing effective vaccines. Patients that are already infected are still treated traditionally using different types of standard antimicrobial agents. The most popular are antibiotics such as tetracyclines and fluoroquinolones. While aminoglycosides appear to be less effective antimicrobial agents than other antibiotics, synthetic aminoglycosides have been shown to act as potent inhibitors of anthrax lethal factor and may have potential application as antitoxins. Here, we present a structural analysis of the BA2930 protein, a putative aminoglycoside acetyltransferase, which may be a component of the bacterium's aminoglycoside resistance mechanism. The determined structures revealed details of a fold characteristic only for one other protein structure in the Protein Data Bank, namely, YokD from Bacillus subtilis. Both BA2930 and YokD are members of the Antibiotic_NAT superfamily (PF02522). Sequential and structural analyses showed that residues conserved throughout the Antibiotic_NAT superfamily are responsible for the binding of the cofactor acetyl coenzyme A. The interaction of BA2930 with cofactors was characterized by both crystallographic and binding studies.
Affiliation(s)
- Maria M Klimecka
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22908, USA
|
30
|
Kurcinski M, Kolinski A. Theoretical study of molecular mechanism of binding TRAP220 coactivator to Retinoid X Receptor alpha, activated by 9-cis retinoic acid. J Steroid Biochem Mol Biol 2010; 121:124-9. [PMID: 20398753 PMCID: PMC2906686 DOI: 10.1016/j.jsbmb.2010.03.086] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 11/09/2009] [Accepted: 03/26/2010] [Indexed: 01/22/2023]
Abstract
A study of the molecular mechanism of conformational reorientation of the RXR-alpha ligand-binding domain is presented. We employed CABS, a reduced model of protein dynamics, to model folding pathways for the binding of 9-cis retinoic acid to the apo-RXR molecule and of the TRAP220 peptide fragment to the holo form. Based on the results obtained, we also propose a sequential model of RXR activation by 9-cis retinoic acid and the TRAP220 coactivator. The methodology presented here may be used to investigate the binding pathways of other NR/hormone/cofactor sets.
Affiliation(s)
- Mateusz Kurcinski
- Department of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland.
|
31
|
Wang S, Kirillova O, Chruszcz M, Gront D, Zimmerman MD, Cymborowski MT, Shumilin IA, Skarina T, Gorodichtchenskaia E, Savchenko A, Edwards AM, Minor W. The crystal structure of the AF2331 protein from Archaeoglobus fulgidus DSM 4304 forms an unusual interdigitated dimer with a new type of alpha + beta fold. Protein Sci 2009; 18:2410-9. [PMID: 19768810 PMCID: PMC2788295 DOI: 10.1002/pro.251] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 07/31/2009] [Accepted: 09/09/2009] [Indexed: 11/10/2022]
Abstract
The structure of AF2331, an 11-kDa orphan protein of unknown function from Archaeoglobus fulgidus, was solved by Se-Met MAD to 2.4 Å resolution. The structure consists of an alpha + beta fold formed by an unusual homodimer, where the two core beta-sheets are interdigitated, containing strands alternating from both subunits. The decrease in solvent-accessible surface area upon dimerization is unusually large (3960 Å²) for a protein of its size. The percentage of the total surface area buried in the interface (41.1%) is one of the largest observed in a nonredundant set of homodimers in the PDB and is above the mean for nearly all other types of homo-oligomers. AF2331 has no sequence homologs, and no structure similar to AF2331 could be found in the PDB using the CE, TM-align, DALI, or SSM packages. The protein has been identified in Pfam 23.0 as the archetype of a new superfamily and is topologically dissimilar to all other proteins with the "3-Layer (BBA) Sandwich" fold in CATH. Therefore, we propose that AF2331 forms a novel alpha + beta fold. AF2331 contains multiple negatively charged surface clusters and is located on the same operon as the basic protein AF2330. We hypothesize that AF2331 and AF2330 may form a charge-stabilized complex in vivo, though the role of the negatively charged surface clusters is not clear.
Affiliation(s)
- Shuren Wang
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22908
- Midwest Center for Structural Genomics, University of Toronto, Toronto, Ontario M5G 1L6, Canada
- Olga Kirillova
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22908
- Midwest Center for Structural Genomics, University of Toronto, Toronto, Ontario M5G 1L6, Canada
- Maksymilian Chruszcz
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22908
- Midwest Center for Structural Genomics, University of Toronto, Toronto, Ontario M5G 1L6, Canada
- Dominik Gront
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22908
- Midwest Center for Structural Genomics, University of Toronto, Toronto, Ontario M5G 1L6, Canada
- Matthew D Zimmerman
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22908
- Midwest Center for Structural Genomics, University of Toronto, Toronto, Ontario M5G 1L6, Canada
- Marcin T Cymborowski
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22908
- Midwest Center for Structural Genomics, University of Toronto, Toronto, Ontario M5G 1L6, Canada
- Igor A Shumilin
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22908
- Midwest Center for Structural Genomics, University of Toronto, Toronto, Ontario M5G 1L6, Canada
- Tatiana Skarina
- Midwest Center for Structural Genomics, University of Toronto, Toronto, Ontario M5G 1L6, Canada
- Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario M5G 1L6, Canada
- Elena Gorodichtchenskaia
- Midwest Center for Structural Genomics, University of Toronto, Toronto, Ontario M5G 1L6, Canada
- Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario M5G 1L6, Canada
- Alexei Savchenko
- Midwest Center for Structural Genomics, University of Toronto, Toronto, Ontario M5G 1L6, Canada
- Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario M5G 1L6, Canada
- Aled M Edwards
- Midwest Center for Structural Genomics, University of Toronto, Toronto, Ontario M5G 1L6, Canada
- Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario M5G 1L6, Canada
- Wladek Minor
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, Virginia 22908
- Midwest Center for Structural Genomics, University of Toronto, Toronto, Ontario M5G 1L6, Canada
|
32
|
Gront D, Kolinski A. Fast and accurate methods for predicting short-range constraints in protein models. J Comput Aided Mol Des 2008; 22:783-8. [PMID: 18415023 DOI: 10.1007/s10822-008-9213-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2007] [Accepted: 03/25/2008] [Indexed: 11/28/2022]
Abstract
Protein modeling tools utilize many kinds of structural information that may be predicted from the amino acid sequence of a target protein or obtained from experiments. Such data provide geometrical constraints in the modeling process. The main aim is to generate the best possible consensus structure. The quality of the models depends strictly on the imposed conditions. In this work we present an algorithm that predicts short-range distances between Cα atoms, as well as a set of short structural fragments that possibly share structural similarity with a query sequence. The only input of the method is a query sequence profile. The algorithm searches for short protein fragments with high sequence similarity. As a result, statistics of the distances observed in the similar fragments are returned. The method can also be used as a scoring function or as a short-range knowledge-based potential based on the computed statistics.
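The distance statistics mentioned in this abstract can be illustrated with a minimal sketch. It assumes that each retrieved fragment is already available as a list of Cα coordinates; the function name `distance_stats` and the choice of mean and standard deviation as the reported statistics are illustrative assumptions, not details taken from the paper.

```python
import math
from statistics import mean, pstdev

def distance_stats(fragments, i, j):
    """Mean and population standard deviation of the distance between
    positions i and j, collected over a set of fragments.

    fragments: list of fragments, each a list of (x, y, z) C-alpha
    coordinates (hypothetical input format, assumed for illustration).
    """
    dists = [math.dist(frag[i], frag[j]) for frag in fragments]
    return mean(dists), pstdev(dists)

# Two toy two-residue fragments with distances 5.0 and 7.0
frags = [
    [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 0.0, 7.0)],
]
avg, spread = distance_stats(frags, 0, 1)
```

A mean and a spread per residue pair is one simple form such statistics could take when used as a restraint or a knowledge-based scoring term.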
Affiliation(s)
- Dominik Gront
- Faculty of Chemistry, University of Warsaw, Warsaw, Poland.
|
33
|
|
34
|
Kmiecik S, Kolinski A. Folding pathway of the b1 domain of protein G explored by multiscale modeling. Biophys J 2007; 94:726-36. [PMID: 17890394 PMCID: PMC2186257 DOI: 10.1529/biophysj.107.116095] [Citation(s) in RCA: 84] [Impact Index Per Article: 4.9] [Indexed: 11/18/2022] Open
Abstract
The understanding of the folding mechanisms of single-domain proteins is an essential step in the understanding of protein folding in general. Recently, we developed a mesoscopic CA-CB side-chain protein model, which was successfully applied in protein structure prediction, studies of protein thermodynamics, and modeling of protein complexes. In this research, this model is employed in a detailed characterization of the folding process of a simple globular protein, the B1 domain of IgG-binding protein G (GB1). There is a vast body of experimental facts and theoretical findings for this protein. Performing unbiased, ab initio simulations, we demonstrated that the GB1 folding proceeds via the formation of an extended folding nucleus, followed by slow structure fine-tuning. Remarkably, a subset of native interactions drives the folding from the very beginning. The emerging comprehensive picture of GB1 folding perfectly matches and extends the previous experimental and theoretical studies.
Affiliation(s)
- Andrzej Kolinski
- Address reprint requests to Andrzej Kolinski, Faculty of Chemistry, University of Warsaw, L. Pasteura 1, 02-093 Warsaw, Poland. Tel.: 48-022-8220211 ext. 320; Fax: 48-022 820221.
|
35
|
Abstract
MOTIVATION: The number of known protein sequences is about a thousand times larger than the number of experimentally solved 3D structures. For more than half of the protein sequences, a close or distant structural analog could be identified. The key starting point in classical comparative modeling is to generate the best possible sequence alignment with a template or templates. With decreasing sequence similarity, the number of errors in the alignments increases, and these errors are the main cause of the decreasing accuracy of the molecular models generated. Here we propose a new approach to comparative modeling that does not require the implicit alignment: the model building phase explores geometric, evolutionary and physical properties of a template (or templates). RESULTS: The proposed method requires prior identification of a template, although the initial sequence alignment is ignored. The model is built using CABS, a very efficient reduced-representation search engine, to find the best possible superposition of the query protein onto the template, which is represented as a 3D multi-featured scaffold. The criteria used include sequence similarity, predicted secondary structure consistency, local geometric features and the hydrophobicity profile. For more difficult cases, the new method qualitatively outperforms existing schemes of comparative modeling. The algorithm unifies de novo modeling, 3D threading and sequence-based methods. The main idea is general and could easily be combined with other efficient modeling tools such as Rosetta, UNRES and others.
Affiliation(s)
- Andrzej Kolinski
- University of Warsaw, Faculty of Chemistry, Pasteura 1 02-093 Warsaw, Poland
|
36
|
Kmiecik S, Kolinski A. Characterization of protein-folding pathways by reduced-space modeling. Proc Natl Acad Sci U S A 2007; 104:12330-5. [PMID: 17636132 PMCID: PMC1941469 DOI: 10.1073/pnas.0702265104] [Citation(s) in RCA: 76] [Impact Index Per Article: 4.5] [Indexed: 11/18/2022] Open
Abstract
Ab initio simulations of folding pathways are currently limited to very small proteins. For larger proteins, some approximations or simplifications in protein models need to be introduced. Protein folding and unfolding are among the basic processes in the cell and are very difficult to characterize in detail by experiment or simulation. Chymotrypsin inhibitor 2 (CI2) and barnase are probably the best characterized experimentally in this respect. For these model systems, the initial folding stages were simulated using CA-CB-side chain (CABS), a reduced-space protein-modeling tool. CABS employs knowledge-based potentials that have proved to be very successful in protein structure prediction. With the use of isothermal Monte Carlo (MC) dynamics, initiation sites with a residual structure and weak tertiary interactions were identified. Such structures are essential for the initiation of the folding process through a sequential reduction of the protein conformational space, overcoming the Levinthal paradox in this manner. Furthermore, nucleation sites that initiate a network of tertiary interactions were located. The MC simulations correspond perfectly to the results of experimental and theoretical research and provide insight into the CI2 folding mechanism: an unambiguous sequence of folding events is reported, as well as cooperative substructures compatible with those obtained in recent molecular dynamics unfolding studies. The correspondence between simulation and experiment shows that knowledge-based potentials are not only useful in protein structure prediction but are also capable of reproducing folding pathways. Thus, the results of this work significantly extend the applicability range of reduced models in the theoretical study of proteins.
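For readers unfamiliar with isothermal Monte Carlo sampling, the acceptance step at a fixed temperature can be sketched as follows. This is the textbook Metropolis criterion, shown here under the assumption that it is representative of the scheme; the abstract does not describe the actual CABS move set or acceptance rule, and the function name and reduced units are illustrative.

```python
import math
import random

def metropolis_accept(delta_e, temperature, rng=random.random):
    """Textbook Metropolis criterion at a fixed (isothermal) temperature.

    delta_e: energy change of the proposed move, in units where k_B = 1.
    rng: callable returning a uniform number in [0, 1); injectable for tests.
    """
    if delta_e <= 0.0:
        return True  # downhill or neutral moves are always accepted
    # Uphill moves are accepted with probability exp(-dE / T)
    return rng() < math.exp(-delta_e / temperature)
```

Keeping the temperature constant throughout the run is what makes the trajectory isothermal, in contrast to annealing schedules in which the temperature is lowered over time.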
Affiliation(s)
- Sebastian Kmiecik
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093, Warsaw, Poland
- Andrzej Kolinski
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093, Warsaw, Poland
- To whom correspondence should be addressed. E-mail:
|
37
|
Towards the high-resolution protein structure prediction. Fast refinement of reduced models with all-atom force field. BMC Struct Biol 2007; 7:43. [PMID: 17603876 PMCID: PMC1933428 DOI: 10.1186/1472-6807-7-43] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.5] [Received: 03/30/2007] [Accepted: 06/29/2007] [Indexed: 12/03/2022]
Abstract
Background: Although experimental methods for determining protein structure provide high-resolution structures, they cannot keep pace with the rate at which amino acid sequences are resolved on the scale of entire genomes. For a considerable fraction of proteins whose structures will not be determined experimentally, computational methods can provide valuable information. The value of structural models in biological research depends critically on their quality. The development of high-accuracy computational methods that reliably generate structural models of near-experimental quality is an important, unsolved problem in protein structure modeling. Results: Large sets of structural decoys have been generated using CABS, a reduced conformational space protein modeling tool. Subsequently, the reduced models were subjected to all-atom reconstruction. The resulting detailed models were then energy-minimized using a state-of-the-art all-atom force field, assuming fixed positions of the alpha carbons. It has been shown that a very short minimization leads to the proper ranking of the quality of the models (distance from the native structure) when the all-atom energy is used as the ranking criterion. Additionally, we performed a test on medium- and low-accuracy decoys built via classical methods of comparative modeling. The test placed our model evaluation procedure among the state-of-the-art protein model assessment methods. Conclusion: These test computations show that large-scale, high-resolution protein structure prediction is possible, not only for small but also for large protein domains, and that it should be based on a hierarchical approach to the modeling protocol. We employed Molecular Mechanics with fixed alpha carbons to rank-order the all-atom models built on the scaffolds of the reduced models. Our tests show that a physics-based approach, usually considered computationally too demanding for large-scale applications, can be effectively used in such studies.
|
38
|
Gront D, Kmiecik S, Kolinski A. Backbone building from quadrilaterals: a fast and accurate algorithm for protein backbone reconstruction from alpha carbon coordinates. J Comput Chem 2007; 28:1593-1597. [PMID: 17342707 DOI: 10.1002/jcc.20624] [Citation(s) in RCA: 76] [Impact Index Per Article: 4.5] [Indexed: 11/08/2022]
Abstract
In this contribution, we present an algorithm for protein backbone reconstruction that combines very high computational efficiency with high accuracy. Reconstruction of the main chain atomic coordinates from the alpha carbon trace is a common task in protein modeling, including de novo structure prediction, comparative modeling, and processing of experimental data. The method employed in this work follows the main idea of some earlier approaches to the problem. The details and careful design of the present approach are new and lead to an algorithm that outperforms all commonly used earlier applications. The BBQ (Backbone Building from Quadrilaterals) program has been extensively tested both on native structures and on near-native decoy models, and compared with the available existing methods. The results obtained provide a comprehensive benchmark of existing tools and evaluate their applicability to large-scale modeling using a reduced representation of protein conformational space. The BBQ package is available for downloading from our website at http://biocomp.chem.uw.edu.pl/services/BBQ/. This webpage also provides a user manual that describes BBQ functions in detail.
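To make the idea of a Cα "quadrilateral" concrete, the sketch below computes one plausible geometric descriptor for four consecutive alpha carbons: the two 1-3 distances and a 1-4 distance signed by handedness. This choice of descriptor and all names in the code are assumptions made for illustration; the abstract does not specify how BBQ encodes a quadrilateral.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def quad_descriptor(ca0, ca1, ca2, ca3):
    """Return (r13, r24, signed r14) for four consecutive C-alpha atoms.

    The sign of r14 encodes chirality: negative when the triple product
    of the three virtual-bond vectors is negative.
    """
    v1, v2, v3 = _sub(ca1, ca0), _sub(ca2, ca1), _sub(ca3, ca2)
    r13 = math.dist(ca0, ca2)
    r24 = math.dist(ca1, ca3)
    r14 = math.dist(ca0, ca3)
    if _dot(_cross(v1, v2), v3) < 0.0:
        r14 = -r14
    return r13, r24, r14

# Toy right-handed and mirror-image quadrilaterals (not real protein geometry)
right = quad_descriptor((0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1))
left = quad_descriptor((0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, -1))
```

A descriptor of this kind could serve as a lookup key into a table of backbone atom positions derived from known structures, which is one way a reconstruction method can be made fast.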
Affiliation(s)
- Dominik Gront
- Faculty of Chemistry, Warsaw University, Pasteura 1 02-093, Warsaw
- Andrzej Kolinski
- Faculty of Chemistry, Warsaw University, Pasteura 1 02-093, Warsaw
|
39
|
Abstract
Molecular dynamics and Monte Carlo simulations, usually conducted in the canonical ensemble, deliver a plethora of biomolecular conformations. Proper analysis of the simulation data is a crucial part of biophysical and bioinformatics studies. The sequence alignment problem can also be formulated in terms of the Boltzmann distribution. Therefore, tools for efficient analysis of canonical ensemble data become extremely valuable. The T-Pile package presented here provides a user-friendly implementation of the most important algorithms, such as multihistogram analysis and the reweighting technique. The package can be used in studies of virtually any system governed by the Boltzmann distribution. AVAILABILITY: T-Pile can be downloaded from http://biocomp.chem.uw.edu.pl/services/tpile. These pages provide a comprehensive tutorial and documentation with illustrative examples of applications. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
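As an illustration of the reweighting technique named in this abstract, the sketch below estimates a canonical average at a target inverse temperature from samples generated at a different one, using the standard single-temperature reweighting formula. It is not T-Pile code; the function name and argument layout are hypothetical.

```python
import math

def reweight_mean(energies, observable, beta_sim, beta_target):
    """Estimate <observable> at beta_target from samples drawn at beta_sim.

    Each sample i receives the weight exp(-(beta_target - beta_sim) * E_i).
    """
    d_beta = beta_target - beta_sim
    exponents = [-d_beta * e for e in energies]
    shift = max(exponents)  # subtract the maximum for numerical stability
    weights = [math.exp(x - shift) for x in exponents]
    norm = sum(weights)
    return sum(w * a for w, a in zip(weights, observable)) / norm

# Toy data: the observable is the energy itself
energies = [1.0, 2.0, 3.0]
same_t = reweight_mean(energies, energies, beta_sim=1.0, beta_target=1.0)
colder = reweight_mean(energies, energies, beta_sim=1.0, beta_target=2.0)
```

When the target and simulation temperatures coincide, the estimate reduces to the plain sample mean; moving to a lower temperature (larger beta) shifts weight toward low-energy samples. The estimate is only reliable when the energy distributions at the two temperatures overlap substantially.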
Affiliation(s)
- Dominik Gront
- Warsaw University, Faculty of Chemistry, Pasteura 1 02-093 Warsaw, Poland.
|