1
|
Automated Protein Secondary Structure Assignment from C α Positions Using Neural Networks. Biomolecules 2022; 12:biom12060841. [PMID: 35740966 PMCID: PMC9220970 DOI: 10.3390/biom12060841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/25/2022] [Revised: 06/10/2022] [Accepted: 06/14/2022] [Indexed: 11/17/2022] Open
Abstract
The assignment of secondary structure elements in protein conformations is necessary to interpret a protein model that has been established by computational methods. The process essentially involves labeling the amino acid residues with H (Helix), E (Strand), or C (Coil, also known as Loop). When particular atoms are absent from an input protein structure, the procedure becomes more complicated, especially when only the alpha carbon locations are known. Various techniques have been tested and applied to this problem during the last forty years. The application of machine learning techniques is the most recent trend. This contribution presents the HECA classifier, which uses neural networks to assign protein secondary structure types. The technique exclusively employs Cα coordinates. The Keras (TensorFlow) library was used to implement and train the neural network model. The BioShell toolkit was used to calculate the neural network input features from raw coordinates. The study’s findings show that neural network-based methods may be successfully used to take on structure assignment challenges when only Cα trace is available. Thanks to the careful selection of input features, our approach’s accuracy (above 97%) exceeded that of the existing methods.
Collapse
|
2
|
Structure- and ligand- based studies to gain insight into the pharmacological implications of histamine H3 receptor. Struct Chem 2021. [DOI: 10.1007/s11224-020-01711-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/18/2023]
|
3
|
Acharya V, Chakraborty HJ, Rout AK, Balabantaray S, Behera BK, Das BK. Structural Characterization of Open Reading Frame-Encoded Functional Genes from Tilapia Lake Virus (TiLV). Mol Biotechnol 2020; 61:945-957. [PMID: 31664705 DOI: 10.1007/s12033-019-00217-y] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/25/2022]
Abstract
In recent years, large-scale mortalities are observed in tilapia due to infection with a novel orthomyxo-like virus named, tilapia lake virus (TiLV) which is marked to be a severe threat to universal tilapia industry. Currently, there are knowledge gaps relating to the antiviral peptide as well as there are no affordable vaccines or drugs available against TiLV yet. To understand the spreading of infection of TiLV in different organs of Oreochromis niloticus, RT-PCR analysis has been carried out. The gene segments of TiLV were retrieved from the NCBI database for computational biology analysis. The 14 functional genes were predicted from the 10 gene segments of TiLV. Phylogenetic analysis was employed to find out a better understanding for the evolution of tilapia lake virus genes. Out of 14 proteins, only six proteins show transmembrane helix region. Moreover, molecular modeling and molecular dynamics simulations of the predicted proteins revealed structural stability of the protein stabilized after 10-ns simulation. Overall, our study provided a basic bioinformatics on functional proteome of TiLV. Further, this study could be useful for development of novel peptide-based therapeutics to control TiLV infection.
Collapse
Affiliation(s)
- Varsha Acharya
- Biotechnology Laboratory, ICAR-Central Inland Fisheries Research Institute, Barrackpore, Kolkata, West Bengal, 700120, India
| | - Hirak Jyoti Chakraborty
- Biotechnology Laboratory, ICAR-Central Inland Fisheries Research Institute, Barrackpore, Kolkata, West Bengal, 700120, India
| | - Ajaya Kumar Rout
- Biotechnology Laboratory, ICAR-Central Inland Fisheries Research Institute, Barrackpore, Kolkata, West Bengal, 700120, India
| | - Sucharita Balabantaray
- Department of Bioinformatics, Odisha University of Agriculture and Technology, Bhubaneswar, Odisha, 751003, India
| | - Bijay Kumar Behera
- Biotechnology Laboratory, ICAR-Central Inland Fisheries Research Institute, Barrackpore, Kolkata, West Bengal, 700120, India
| | - Basanta Kumar Das
- Biotechnology Laboratory, ICAR-Central Inland Fisheries Research Institute, Barrackpore, Kolkata, West Bengal, 700120, India.
| |
Collapse
|
4
|
Blaszczyk M, Gront D, Kmiecik S, Kurcinski M, Kolinski M, Ciemny MP, Ziolkowska K, Panek M, Kolinski A. Protein Structure Prediction Using Coarse-Grained Models. SPRINGER SERIES ON BIO- AND NEUROSYSTEMS 2019. [DOI: 10.1007/978-3-319-95843-9_2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/15/2022]
|
5
|
Assessing the accuracy of template-based structure prediction metaservers by comparison with structural genomics structures. ACTA ACUST UNITED AC 2012; 13:213-25. [PMID: 23086054 DOI: 10.1007/s10969-012-9146-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/26/2012] [Accepted: 09/26/2012] [Indexed: 12/19/2022]
Abstract
The explosion of the size of the universe of known protein sequences has stimulated two complementary approaches to structural mapping of these sequences: theoretical structure prediction and experimental determination by structural genomics (SG). In this work, we assess the accuracy of structure prediction by two automated template-based structure prediction metaservers (genesilico.pl and bioinfo.pl) by measuring the structural similarity of the predicted models to corresponding experimental models determined a posteriori. Of 199 targets chosen from SG programs, the metaservers predicted the structures of about a fourth of them "correctly." (In this case, "correct" was defined as placing more than 70 % of the alpha carbon atoms in the model within 2 Å of the experimentally determined positions.) Almost all of the targets that could be modeled to this accuracy were those with an available template in the Protein Data Bank (PDB) with more than 25 % sequence identity. The majority of those SG targets with lower sequence identity to structures in the PDB were not predicted by the metaservers with this accuracy. We also compared metaserver results to CASP8 results, finding that the models obtained by participants in the CASP competition were significantly better than those produced by the metaservers.
Collapse
|
6
|
Gniewek P, Kolinski A, Gront D. Optimization of profile-to-profile alignment parameters for one-dimensional threading. J Comput Biol 2012; 19:879-86. [PMID: 22731622 DOI: 10.1089/cmb.2011.0307] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022] Open
Abstract
The development of automatic approaches for the comparison of protein sequences has become increasingly important. Methods that compare profiles allow for the use of information about whole protein families, resulting in more sensitive and accurate detection of distantly related sequences. In this contribution, we describe a thorough optimization and tests of a profile-to-profile alignment method. A number of different scoring schemes has been implemented and compared on the basis of their ability to identify a template protein from the same SCOP family as a query. In addition to sequence profiles, secondary structure profiles were used to increase the rate of successful detection. Our results show that a properly tuned one-dimensional threading method can recognize a correct template from the same SCOP family nearly as well as structural alignment. Our benchmark set, which might be useful in other similar studies, as well as the fold-recognition software we developed may be downloaded (www.bioshell.pl/profile-alignments).
Collapse
Affiliation(s)
- Pawel Gniewek
- Faculty of Chemistry, Warsaw University, Warsaw, Poland
| | | | | |
Collapse
|
7
|
Gront D, Blaszczyk M, Wojciechowski P, Kolinski A. BioShell Threader: protein homology detection based on sequence profiles and secondary structure profiles. Nucleic Acids Res 2012; 40:W257-62. [PMID: 22693216 PMCID: PMC3394251 DOI: 10.1093/nar/gks555] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
The BioShell package has recently been extended with a web server for protein homology detection based on profile-to-profile alignment (known as 1D threading). Its aim is to assign structural templates to each domain of the query. The server uses sequence profiles that describe observed sequence variability and secondary structure profiles providing expected probability for a certain secondary structure type at a given position in a protein. Three independent predictors are used to increase the rate of successful predictions. Careful evaluation shows that there is nearly 80% chance that the query sequence belongs to the same SCOP family as the top scoring template. The Bioshell Threader server is freely available at: http://www.bioshell.pl/threader/.
Collapse
Affiliation(s)
- Dominik Gront
- University of Warsaw, Faculty of Chemistry, Pasteura 1, 02-093 Warsaw, Poland.
| | | | | | | |
Collapse
|
8
|
Varzhabetyan LR, Glazachev DV, Nazaryan KB. Molecular dynamics simulation study of tubulin dimer interaction with cytostatics. Mol Biol 2012. [DOI: 10.1134/s0026893312020185] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
|
9
|
Gront D, Kmiecik S, Blaszczyk M, Ekonomiuk D, Koliński A. Optimization of protein models. WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE 2012. [DOI: 10.1002/wcms.1090] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/25/2022]
Affiliation(s)
- Dominik Gront
- Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
| | - Sebastian Kmiecik
- Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
| | - Maciej Blaszczyk
- Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
| | - Dariusz Ekonomiuk
- Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
| | - Andrzej Koliński
- Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
| |
Collapse
|
10
|
Kurcinski M, Kolinski A. Theoretical study of molecular mechanism of binding TRAP220 coactivator to Retinoid X Receptor alpha, activated by 9-cis retinoic acid. J Steroid Biochem Mol Biol 2010; 121:124-9. [PMID: 20398753 PMCID: PMC2906686 DOI: 10.1016/j.jsbmb.2010.03.086] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/09/2009] [Accepted: 03/26/2010] [Indexed: 01/22/2023]
Abstract
Study on molecular mechanism of conformational reorientation of RXR-alpha ligand binding domain is presented. We employed CABS--a reduced model of protein dynamics to model folding pathways of binding 9-cis retinoic acid to apo-RXR molecule and TRAP220 peptide fragment to the holo form. Based on obtained results we also propose a sequential model of RXR activation by 9-cis retinoic acid and TRAP220 coactivator. Methodology presented here may be used for investigation of binding pathways of other NR/hormone/cofactor sets.
Collapse
Affiliation(s)
- Mateusz Kurcinski
- Department of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland.
| | | |
Collapse
|
11
|
Pierri CL, Parisi G, Porcelli V. Computational approaches for protein function prediction: a combined strategy from multiple sequence alignment to molecular docking-based virtual screening. BIOCHIMICA ET BIOPHYSICA ACTA-PROTEINS AND PROTEOMICS 2010; 1804:1695-712. [PMID: 20433957 DOI: 10.1016/j.bbapap.2010.04.008] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/19/2010] [Revised: 03/04/2010] [Accepted: 04/14/2010] [Indexed: 12/12/2022]
Abstract
The functional characterization of proteins represents a daily challenge for biochemical, medical and computational sciences. Although finally proved on the bench, the function of a protein can be successfully predicted by computational approaches that drive the further experimental assays. Current methods for comparative modeling allow the construction of accurate 3D models for proteins of unknown structure, provided that a crystal structure of a homologous protein is available. Binding regions can be proposed by using binding site predictors, data inferred from homologous crystal structures, and data provided from a careful interpretation of the multiple sequence alignment of the investigated protein and its homologs. Once the location of a binding site has been proposed, chemical ligands that have a high likelihood of binding can be identified by using ligand docking and structure-based virtual screening of chemical libraries. Most docking algorithms allow building a list sorted by energy of the lowest energy docking configuration for each ligand of the library. In this review the state-of-the-art of computational approaches in 3D protein comparative modeling and in the study of protein-ligand interactions is provided. Furthermore a possible combined/concerted multistep strategy for protein function prediction, based on multiple sequence alignment, comparative modeling, binding region prediction, and structure-based virtual screening of chemical libraries, is described by using suitable examples. As practical examples, Abl-kinase molecular modeling studies, HPV-E6 protein multiple sequence alignment analysis, and some other model docking-based characterization reports are briefly described to highlight the importance of computational approaches in protein function prediction.
Collapse
Affiliation(s)
- Ciro Leonardo Pierri
- Department of Pharmaco-Biology, Laboratory of Biochemistry and Molecular Biology, University of Bari, Va E. Orabona, 4 - 70125 Bari, Italy.
| | | | | |
Collapse
|
12
|
Jamroz M, Kolinski A. Modeling of loops in proteins: a multi-method approach. BMC STRUCTURAL BIOLOGY 2010; 10:5. [PMID: 20149252 PMCID: PMC2837870 DOI: 10.1186/1472-6807-10-5] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/08/2009] [Accepted: 02/11/2010] [Indexed: 11/23/2022]
Abstract
Background Template-target sequence alignment and loop modeling are key components of protein comparative modeling. Short loops can be predicted with high accuracy using structural fragments from other, not necessairly homologous proteins, or by various minimization methods. For longer loops multiscale approaches employing coarse-grained de novo modeling techniques should be more effective. Results For a representative set of protein structures of various structural classes test predictions of loop regions have been performed using MODELLER, ROSETTA, and a CABS coarse-grained de novo modeling tool. Loops of various length, from 4 to 25 residues, were modeled assuming an ideal target-template alignment of the remaining portions of the protein. It has been shown that classical modeling with MODELLER is usually better for short loops, while coarse-grained de novo modeling is more effective for longer loops. Even very long missing fragments in protein structures could be effectively modeled. Resolution of such models is usually on the level 2-6 Å, which could be sufficient for guiding protein engineering. Further improvement of modeling accuracy could be achieved by the combination of different methods. In particular, we used 10 top ranked models from sets of 500 models generated by MODELLER as multiple templates for CABS modeling. On average, the resulting molecular models were better than the models from individual methods. Conclusions Accuracy of protein modeling, as demonstrated for the problem of loop modeling, could be improved by the combinations of different modeling techniques.
Collapse
Affiliation(s)
- Michal Jamroz
- Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
| | | |
Collapse
|
13
|
Zhu J, Fan H, Periole X, Honig B, Mark AE. Refining homology models by combining replica-exchange molecular dynamics and statistical potentials. Proteins 2009; 72:1171-88. [PMID: 18338384 DOI: 10.1002/prot.22005] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/25/2022]
Abstract
A protocol is presented for the global refinement of homology models of proteins. It combines the advantages of temperature-based replica-exchange molecular dynamics (REMD) for conformational sampling and the use of statistical potentials for model selection. The protocol was tested using 21 models. Of these 14 were models of 10 small proteins for which high-resolution crystal structures were available, the remainder were targets of the recent CASPR exercise. It was found that REMD in combination with currently available force fields could sample near-native conformational states starting from high-quality homology models. Conformations in which the backbone RMSD of secondary structure elements (SSE-RMSD) was lower than the starting value by 0.5-1.0 A were found for 15 out of the 21 cases (average 0.82 A). Furthermore, when a simple scoring function consisting of two statistical potentials was used to rank the structures, one or more structures with SSE-RMSD of at least 0.2 A lower than the starting value was found among the five best ranked structures in 11 out of the 21 cases. The average improvement in SSE-RMSD for the best models was 0.42 A. However, none of the scoring functions tested identified the structures with the lowest SSE-RMSD as the best models although all identified the native conformation as the one with lowest energy. This suggests that while the proposed protocol proved effective for the refinement of high-quality models of small proteins scoring functions remain one of the major limiting factors in structure refinement. This and other aspects by which the methodology could be further improved are discussed.
Collapse
Affiliation(s)
- Jiang Zhu
- Howard Hughes Medical Institute and Columbia University, Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biophysics, Columbia University, New York, USA
| | | | | | | | | |
Collapse
|