1
Bhattacharya S, Roche R, Shuvo MH, Moussad B, Bhattacharya D. Contact-Assisted Threading in Low-Homology Protein Modeling. Methods Mol Biol 2023; 2627:41-59. [PMID: 36959441 DOI: 10.1007/978-1-0716-2974-1_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/25/2023]
Abstract
The ability to successfully predict the three-dimensional structure of a protein from its amino acid sequence has made considerable progress in the recent past. The progress is propelled by the improved accuracy of deep learning-based inter-residue contact map predictors coupled with the rising growth of protein sequence databases. Contact map encodes interatomic interaction information that can be exploited for highly accurate prediction of protein structures via contact map threading even for the query proteins that are not amenable to direct homology modeling. As such, contact-assisted threading has garnered considerable research effort. In this chapter, we provide an overview of existing contact-assisted threading methods while highlighting the recent advances and discussing some of the current limitations and future prospects in the application of contact-assisted threading for improving the accuracy of low-homology protein modeling.
Affiliation(s)
- Sutanu Bhattacharya
- Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, USA
- Md Hossain Shuvo
- Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
- Bernard Moussad
- Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
2
Structure Prediction, Evaluation, and Validation of GPR18 Lipid Receptor Using Free Programs. Int J Mol Sci 2022; 23:ijms23147917. [PMID: 35887268 PMCID: PMC9319093 DOI: 10.3390/ijms23147917] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2022] [Revised: 07/04/2022] [Accepted: 07/08/2022] [Indexed: 11/30/2022] Open
Abstract
The GPR18 receptor, often referred to as the N-arachidonylglycine receptor, although assigned (along with GPR55 and GPR119) to the new class A GPCR subfamily-lipid receptors, officially still has the status of a class A GPCR orphan. While its signaling pathways and biological significance have not yet been fully elucidated, increasing evidence points to the therapeutic potential of GPR18 in relation to immune, neurodegenerative, and cancer processes to name a few. Therefore, it is necessary to understand the interactions of potential ligands with the receptor and the influence of particular structural elements on their activity. Thus, given the lack of an experimentally solved structure, the goal of the present study was to obtain a homology model of the GPR18 receptor in the inactive state, meeting all requirements in terms of protein structure quality and recognition of active ligands. To increase the reliability and precision of the predictions, different contemporary protein structure prediction methods and software were used and compared herein. To test the usability of the resulting models, we optimized and compared the selected structures followed by the assessment of the ability to recognize known, active ligands. The stability of the predicted poses was then evaluated by means of molecular dynamics simulations. On the other hand, most of the best-ranking contemporary CADD software/platforms for its full usability require rather expensive licenses. To overcome this down-to-earth obstacle, the overarching goal of these studies was to test whether it is possible to perform the thorough CADD experiments with high scientific confidence while using only license-free/academic software and online platforms. The obtained results indicate that a wide range of freely available software and/or academic licenses allow us to carry out meaningful molecular modelling/docking studies.
3
Staritzbichler R, Yaklich E, Sarti E, Ristic N, Hildebrand PW, Forrest LR. AlignMe: an update of the web server for alignment of membrane protein sequences. Nucleic Acids Res 2022; 50:W29-W35. [PMID: 35609986 PMCID: PMC9252776 DOI: 10.1093/nar/gkac391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2022] [Revised: 04/19/2022] [Accepted: 05/10/2022] [Indexed: 11/14/2022] Open
Abstract
The AlignMe web server is dedicated to accurately aligning sequences of membrane proteins, a particularly challenging task due to the strong evolutionary divergence and the low compositional complexity of hydrophobic membrane-spanning proteins. AlignMe can create pairwise alignments of either two primary amino acid sequences or two hydropathy profiles. The web server for AlignMe has been continuously available for >10 years, supporting 1000s of users per year. Recent improvements include anchoring, multiple submissions, and structure visualization. Anchoring is the ability to constrain a position in an alignment, which allows expert information about related residues in proteins to be incorporated into an alignment without manual modification. The original web interface to the server limited the user to one alignment per submission, hindering larger scale studies. Now, batches of alignments can be initiated with a single submission. Finally, to provide structural context for the relationship between proteins, sequence similarity can now be mapped onto one or more structures (or structural models) of the proteins being aligned, by links to MutationExplorer, a web-based visualization tool. Together with a refreshed user interface, these features further enhance an important resource in the membrane protein community. The AlignMe web server is freely available at https://www.bioinfo.mpg.de/AlignMe/.
Affiliation(s)
- René Staritzbichler
- University of Leipzig, Institute of Medical Physics and Biophysics, Härtelstr. 16-18, 04107 Leipzig, Germany
- Emily Yaklich
- Computational Structural Biology Section, National Institutes of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD 20892, USA
- Edoardo Sarti
- Algorithms, Biology, Structure Unit Inria Sophia Antipolis - Méditerranée, 06902 Valbonne, France
- Nikola Ristic
- University of Leipzig, Institute of Medical Physics and Biophysics, Härtelstr. 16-18, 04107 Leipzig, Germany
- Peter W Hildebrand
- University of Leipzig, Institute of Medical Physics and Biophysics, Härtelstr. 16-18, 04107 Leipzig, Germany
- Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Institute of Medical Physics and Biophysics, 10117 Berlin, Germany
- Lucy R Forrest
- Computational Structural Biology Section, National Institutes of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD 20892, USA
4
Bhattacharya S, Roche R, Shuvo MH, Bhattacharya D. Recent Advances in Protein Homology Detection Propelled by Inter-Residue Interaction Map Threading. Front Mol Biosci 2021; 8:643752. [PMID: 34046429 PMCID: PMC8148041 DOI: 10.3389/fmolb.2021.643752] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/18/2020] [Accepted: 04/21/2021] [Indexed: 11/13/2022] Open
Abstract
Sequence-based protein homology detection has emerged as one of the most sensitive and accurate approaches to protein structure prediction. Despite the success, homology detection remains very challenging for weakly homologous proteins with divergent evolutionary profile. Very recently, deep neural network architectures have shown promising progress in mining the coevolutionary signal encoded in multiple sequence alignments, leading to reasonably accurate estimation of inter-residue interaction maps, which serve as a rich source of additional information for improved homology detection. Here, we summarize the latest developments in protein homology detection driven by inter-residue interaction map threading. We highlight the emerging trends in distant-homology protein threading through the alignment of predicted interaction maps at various granularities ranging from binary contact maps to finer-grained distance and orientation maps as well as their combination. We also discuss some of the current limitations and possible future avenues to further enhance the sensitivity of protein homology detection.
Affiliation(s)
- Sutanu Bhattacharya
- Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, United States
- Rahmatullah Roche
- Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, United States
- Md Hossain Shuvo
- Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, United States
- Debswapna Bhattacharya
- Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, United States
- Department of Biological Sciences, Auburn University, Auburn, AL, United States
5
Staritzbichler R, Sarti E, Yaklich E, Aleksandrova A, Stamm M, Khafizov K, Forrest LR. Refining pairwise sequence alignments of membrane proteins by the incorporation of anchors. PLoS One 2021; 16:e0239881. [PMID: 33930031 PMCID: PMC8087094 DOI: 10.1371/journal.pone.0239881] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/11/2020] [Accepted: 04/15/2021] [Indexed: 01/08/2023] Open
Abstract
The alignment of primary sequences is a fundamental step in the analysis of protein structure, function, and evolution, and in the generation of homology-based models. Integral membrane proteins pose a significant challenge for such sequence alignment approaches, because their evolutionary relationships can be very remote, and because a high content of hydrophobic amino acids reduces their complexity. Frequently, biochemical or biophysical data is available that informs the optimum alignment, for example, indicating specific positions that share common functional or structural roles. Currently, if those positions are not correctly matched by a standard pairwise sequence alignment procedure, the incorporation of such information into the alignment is typically addressed in an ad hoc manner, with manual adjustments. However, such modifications are problematic because they reduce the robustness and reproducibility of the aligned regions either side of the newly matched positions. Previous studies have introduced restraints as a means to impose the matching of positions during sequence alignments, originally in the context of genome assembly. Here we introduce position restraints, or "anchors" as a feature in our alignment tool AlignMe, providing an aid to pairwise global sequence alignment of alpha-helical membrane proteins. Applying this approach to realistic scenarios involving distantly-related and low complexity sequences, we illustrate how the addition of anchors can be used to modify alignments, while still maintaining the reproducibility and rigor of the rest of the alignment. Anchored alignments can be generated using the online version of AlignMe available at www.bioinfo.mpg.de/AlignMe/.
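The effect of an anchor, as described in the abstract above, can be illustrated with a small sketch. This is not AlignMe's implementation: it assumes a plain Needleman-Wunsch aligner with a linear gap penalty and enforces each anchor by splitting both sequences at the anchored positions and aligning the segments independently. The function names, scoring values, and 0-based anchor convention are all illustrative assumptions.

```python
from typing import List, Tuple


def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -2) -> Tuple[str, str]:
    """Global alignment with a linear gap penalty (toy scoring, not AlignMe's)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def anchored_align(a: str, b: str,
                   anchors: List[Tuple[int, int]]) -> Tuple[str, str]:
    """Force a[i] to pair with b[j] for every (i, j) anchor (0-based indices).

    Hypothetical helper: the flanking segments are aligned independently, so
    the regions on either side of each anchor are still produced by the same
    reproducible procedure rather than by manual adjustment.
    """
    out_a, out_b = [], []
    prev_i = prev_j = 0
    for i, j in sorted(anchors):
        if i < prev_i or j < prev_j:
            raise ValueError("anchors must be increasing in both sequences")
        seg_a, seg_b = needleman_wunsch(a[prev_i:i], b[prev_j:j])
        out_a += [seg_a, a[i]]
        out_b += [seg_b, b[j]]
        prev_i, prev_j = i + 1, j + 1
    seg_a, seg_b = needleman_wunsch(a[prev_i:], b[prev_j:])
    out_a.append(seg_a)
    out_b.append(seg_b)
    return "".join(out_a), "".join(out_b)
```

For example, `anchored_align("AAAK", "KAAA", [(3, 0)])` pairs the two lysines even though an unconstrained alignment would prefer to match the alanines; the rest of each sequence is then gapped around that fixed column.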
Affiliation(s)
- René Staritzbichler
- ProteinFormatics Group, Institute of Biophysics and Medical Physics, University of Leipzig, Leipzig, Germany
- Edoardo Sarti
- Computational Structural Biology Section, National Institutes of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, United States of America
- Laboratoire de Biologie Computationnelle et Quantitative, Institut de Biologie Paris Seine, Sorbonne Université, Paris, France
- Emily Yaklich
- Computational Structural Biology Section, National Institutes of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, United States of America
- Antoniya Aleksandrova
- Computational Structural Biology Section, National Institutes of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, United States of America
- Marcus Stamm
- Max Planck Institute of Biophysics, Frankfurt am Main, Germany
- Kamil Khafizov
- Moscow Institute of Physics and Technology, National Research University, Moscow, Russia
- Lucy R. Forrest
- Computational Structural Biology Section, National Institutes of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, United States of America
6
Lima I, Cino EA. Sequence similarity in 3D for comparison of protein families. J Mol Graph Model 2021; 106:107906. [PMID: 33848948 DOI: 10.1016/j.jmgm.2021.107906] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/16/2020] [Revised: 03/18/2021] [Accepted: 03/18/2021] [Indexed: 11/26/2022]
Abstract
Homologous proteins are often compared by pairwise sequence alignment, and structure superposition if the atomic coordinates are available. Unification of sequence and structure data is an important task in structural biology. Here, we present the Sequence Similarity 3D (SS3D) method of integrating sequence and structure information. SS3D is a distance and substitution matrix-based method for straightforward visualization of regions of similarity and difference between homologous proteins. This work details the SS3D approach, and demonstrates its utility through case studies comparing members of several protein families. The examples show that SS3D can effectively highlight biologically important regions of similarity and dissimilarity. We anticipate that the method will be useful for numerous structural biology applications, including, but not limited to, studies of binding specificity, structure-function relationships, and evolutionary pathways. SS3D is available with a manual and tutorial at https://github.com/0x462e41/SS3D/.
Affiliation(s)
- Igor Lima
- Department of Biochemistry and Immunology, Federal University of Minas Gerais, Belo Horizonte, 31270-901, Brazil
- Elio A Cino
- Department of Biochemistry and Immunology, Federal University of Minas Gerais, Belo Horizonte, 31270-901, Brazil
7
Abstract
Genome sequencing projects have resulted in a rapid increase in the number of known protein sequences. In contrast, only about one-hundredth of these sequences have been characterized at atomic resolution using experimental structure determination methods. Computational protein structure modeling techniques have the potential to bridge this sequence-structure gap. In the following chapter, we present an example that illustrates the use of MODELLER to construct a comparative model for a protein with unknown structure. Automation of a similar protocol has resulted in models of useful accuracy for domains in more than half of all known protein sequences.
8
González-Durruthy M, Concu R, Vendrame LFO, Zanella I, Ruso JM, Cordeiro MNDS. Targeting Beta-Blocker Drug-Drug Interactions with Fibrinogen Blood Plasma Protein: A Computational and Experimental Study. Molecules 2020; 25:molecules25225425. [PMID: 33228181 PMCID: PMC7699576 DOI: 10.3390/molecules25225425] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/16/2020] [Revised: 11/16/2020] [Accepted: 11/17/2020] [Indexed: 12/05/2022] Open
Abstract
In this work, one of the most prevalent polypharmacology drug–drug interaction events that occurs between two widely used beta-blocker drugs—i.e., acebutolol and propranolol—with the most abundant blood plasma fibrinogen protein was evaluated. Towards that end, molecular docking and Density Functional Theory (DFT) calculations were used as complementary tools. A fibrinogen crystallographic validation for the three best ranked binding-sites shows 100% of conformationally favored residues with total absence of restricted flexibility. From those three sites, results on both the binding-site druggability and ligand transport analysis-based free energy trajectories pointed out the most preferred biophysical environment site for drug–drug interactions. Furthermore, the total affinity for the stabilization of the drug–drug complexes was mostly influenced by steric energy contributions, based mainly on multiple hydrophobic contacts with critical residues (THR22: P and SER50: Q) in such best-ranked site. Additionally, the DFT calculations revealed that the beta-blocker drug–drug complexes have a spontaneous thermodynamic stabilization following the same affinity order obtained in the docking simulations, without covalent-bond formation between both interacting beta-blockers in the best-ranked site. Lastly, experimental ultrasound density and velocity measurements were performed and allowed us to validate and corroborate the computational obtained results.
Affiliation(s)
- Michael González-Durruthy
- LAQV-REQUIMTE, Department of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
- Soft Matter and Molecular Biophysics Group, Department of Applied Physics, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
- Correspondence: (M.G.-D.); (M.N.D.S.C.); Tel.: +351-220402502 (M.N.D.S.C.)
- Riccardo Concu
- LAQV-REQUIMTE, Department of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
- Laura F. Osmari Vendrame
- Post-Graduate Program in Nanoscience, Franciscana University (UFN), Santa Maria 97010-032, RS, Brazil
- Ivana Zanella
- Post-Graduate Program in Nanoscience, Franciscana University (UFN), Santa Maria 97010-032, RS, Brazil
- Juan M. Ruso
- Soft Matter and Molecular Biophysics Group, Department of Applied Physics, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
- M. Natália D. S. Cordeiro
- LAQV-REQUIMTE, Department of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
9
Strokach A, Becerra D, Corbi-Verge C, Perez-Riba A, Kim PM. Fast and Flexible Protein Design Using Deep Graph Neural Networks. Cell Syst 2020; 11:402-411.e4. [DOI: 10.1016/j.cels.2020.08.016] [Citation(s) in RCA: 56] [Impact Index Per Article: 14.0] [Received: 04/02/2020] [Revised: 06/27/2020] [Accepted: 08/26/2020] [Indexed: 11/15/2022]
10
Runthala A, Chowdhury S. Refined template selection and combination algorithm significantly improves template-based modeling accuracy. J Bioinform Comput Biol 2020; 17:1950006. [PMID: 31057073 DOI: 10.1142/s0219720019500069] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 12/16/2022]
Abstract
In contrast to ab-initio protein modeling methodologies, comparative modeling is considered as the most popular and reliable algorithm to model protein structure. However, the selection of the best set of templates is still a major challenge. An effective template-ranking algorithm is developed to efficiently select only the reliable hits for predicting the protein structures. The algorithm employs the pairwise as well as multiple sequence alignments of template hits to rank and select the best possible set of templates. It captures several key sequences and structural information of template hits and converts into scores to effectively rank them. This selected set of templates is used to model a target. Modeling accuracy of the algorithm is tested and evaluated on TBM-HA domain containing CASP8, CASP9 and CASP10 targets. On an average, this template ranking and selection algorithm improves GDT-TS, GDT-HA and TM_Score by 3.531, 4.814 and 0.022, respectively. Further, it has been shown that the inclusion of structurally similar templates with ample conformational diversity is crucial for the modeling algorithm to maximally as well as reliably span the target sequence and construct its near-native model. The optimal model sampling also holds the key to predict the best possible target structure.
Affiliation(s)
- Ashish Runthala
- Department of Biological Sciences, Birla Institute of Technology and Science, Pilani-333031, India
- Shibasish Chowdhury
- Department of Biological Sciences, Birla Institute of Technology and Science, Pilani-333031, India
11
Bhattacharya S, Bhattacharya D. Does inclusion of residue-residue contact information boost protein threading? Proteins 2019; 87:596-606. [PMID: 30882932 DOI: 10.1002/prot.25684] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 12/22/2018] [Revised: 02/20/2019] [Accepted: 03/13/2019] [Indexed: 12/26/2022]
Abstract
Template-based modeling is considered as one of the most successful approaches for protein structure prediction. However, reliably and accurately selecting optimal template proteins from a library of known protein structures having similar folds as the target protein and making correct alignments between the target sequence and the template structures, a template-based modeling technique known as threading, remains challenging, particularly for non- or distantly-homologous protein targets. With the recent advancement in protein residue-residue contact map prediction powered by sequence co-evolution and machine learning, here we systematically analyze the effect of inclusion of residue-residue contact information in improving the accuracy and reliability of protein threading. We develop a new threading algorithm by incorporating various sequential and structural features, and subsequently integrate residue-residue contact information as an additional scoring term for threading template selection. We show that the inclusion of contact information attains statistically significantly better threading performance compared to a baseline threading algorithm that does not utilize contact information when everything else remains the same. Experimental results demonstrate that our contact based threading approach outperforms popular threading method MUSTER, contact-assisted ab initio folding method CONFOLD2, and recent state-of-the-art contact-assisted protein threading methods EigenTHREADER and map_align on several benchmarks. Our study illustrates that the inclusion of contact maps is a promising avenue in protein threading to ultimately help to improve the accuracy of protein structure prediction.
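To make the idea of a contact-based scoring term concrete, here is a minimal sketch of one way such a term could be computed. It is not the scoring function of the method described in the abstract above: the overlap definition, the function names, and the weight `w_contact` are all assumptions chosen for illustration. Given a query-template alignment, it counts how many predicted query contacts land on residue pairs that are also in contact in the template.

```python
from typing import Iterable, List, Tuple

Pair = Tuple[int, int]


def contact_overlap(query_contacts: List[Pair],
                    template_contacts: Iterable[Pair],
                    alignment: Iterable[Pair]) -> float:
    """Fraction of predicted query contacts reproduced by the template.

    alignment holds (query_index, template_index) pairs for aligned residues.
    A query contact counts as a hit only if both of its residues are aligned
    and the two template residues they map to are themselves in contact.
    """
    if not query_contacts:
        return 0.0
    q2t = dict(alignment)
    # Store template contacts as unordered pairs so (i, j) and (j, i) match.
    tmpl = {frozenset(p) for p in template_contacts}
    hits = 0
    for i, j in query_contacts:
        if i in q2t and j in q2t and frozenset((q2t[i], q2t[j])) in tmpl:
            hits += 1
    return hits / len(query_contacts)


def threading_score(base_score: float, overlap: float,
                    w_contact: float = 1.0) -> float:
    """Add the contact term to a baseline threading score (hypothetical weight)."""
    return base_score + w_contact * overlap
```

Ranking templates by `threading_score` instead of `base_score` alone mirrors the comparison in the abstract, where the baseline and contact-assisted variants differ only in the additional term.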
Affiliation(s)
- Sutanu Bhattacharya
- Department of Computer Science and Software Engineering, Auburn University, Auburn, Alabama
- Debswapna Bhattacharya
- Department of Computer Science and Software Engineering, Auburn University, Auburn, Alabama
12
Nguyen SP, Li Z, Xu D, Shang Y. New Deep Learning Methods for Protein Loop Modeling. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2019; 16:596-606. [PMID: 29990046 PMCID: PMC6580050 DOI: 10.1109/tcbb.2017.2784434] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 05/31/2023]
Abstract
Computational protein structure prediction is a long-standing challenge in bioinformatics. In the process of predicting protein 3D structures, it is common that parts of an experimental structure are missing or parts of a predicted structure need to be remodeled. The process of predicting local protein structures of particular regions is called loop modeling. In this paper, five new loop modeling methods based on machine learning techniques, called NearLooper, ConLooper, ResLooper, HyLooper1, and HyLooper2 are proposed. NearLooper is based on the nearest neighbor technique. ConLooper applies deep convolutional neural networks to predict Cα atoms distance matrix as an orientation-independent representation of protein structure. ResLooper uses residual neural networks instead of deep convolutional neural networks. HyLooper1 combines the results of NearLooper and ConLooper while HyLooper2 combines NearLooper and ResLooper. Three commonly used benchmarks for loop modeling are used to compare the performance between these methods and existing state-of-the-art methods. The experiment results show promising performance in which our best method improves existing state-of-the-art methods by 28 and 54 percent of average RMSD on two datasets while being comparable on the other one.
13
Nikolaev D, Shtyrov AA, Panov MS, Jamal A, Chakchir OB, Kochemirovsky VA, Olivucci M, Ryazantsev MN. A Comparative Study of Modern Homology Modeling Algorithms for Rhodopsin Structure Prediction. ACS Omega 2018; 3:7555-7566. [PMID: 30087916 PMCID: PMC6068592 DOI: 10.1021/acsomega.8b00721] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Received: 04/13/2018] [Accepted: 06/21/2018] [Indexed: 06/08/2023]
Abstract
Rhodopsins are seven α-helical membrane proteins that are of great importance in chemistry, biology, and modern biotechnology. Any in silico study on rhodopsin properties and functioning requires a high-quality three-dimensional structure. Due to particular difficulties with obtaining membrane protein structures from the experiment, in silico prediction of the three-dimensional rhodopsin structure based only on its primary sequence is an especially important task. For the last few years, significant progress was made in the field of protein structure prediction, especially for methods based on comparative modeling. However, the majority of this progress was made for soluble proteins and further investigations are needed to achieve similar progress for membrane proteins. In this paper, we evaluate the performance of modern protein structure prediction methodologies (implemented in the Medeller, I-TASSER, and Rosetta packages) for their ability to predict rhodopsin structures. Three widely used methodologies were considered: two general methodologies that are commonly applied to soluble proteins and a methodology that uses constraints that are specific for membrane proteins. The test pool consisted of 36 target-template pairs with different sequence similarities that was constructed on the basis of 24 experimental rhodopsin structures taken from the RCSB database. As a result, we showed that all three considered methodologies allow obtaining rhodopsin structures with the quality that is close to the crystallographic one (root mean square deviation (RMSD) of the predicted structure from the corresponding X-ray structure up to 1.5 Å) if the target-template sequence identity is higher than 40%. Moreover, all considered methodologies provided structures of average quality (RMSD < 4.0 Å) if the target-template sequence identity is higher than 20%. Such structures can be subsequently used for further investigation of molecular mechanisms of protein functioning and for the development of modern protein-based biotechnologies.
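The RMSD figures quoted in the abstract above can be related to coordinates with a short sketch. It assumes the two structures are already superimposed and that atoms are listed in matching order; no optimal superposition is performed here. The function names and the quality labels are illustrative assumptions, while the 1.5 Å and 4.0 Å cut-offs are the ones stated in the abstract.

```python
import math
from typing import List, Tuple

Coord = Tuple[float, float, float]


def rmsd(model: List[Coord], reference: List[Coord]) -> float:
    """Root mean square deviation between equally ordered atom lists.

    Assumes both coordinate sets are already in the same frame; a real
    comparison would first superimpose the model onto the reference.
    """
    if not model or len(model) != len(reference):
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum((mx - rx) ** 2 + (my - ry) ** 2 + (mz - rz) ** 2
                  for (mx, my, mz), (rx, ry, rz) in zip(model, reference))
    return math.sqrt(squared / len(model))


def quality_label(value: float) -> str:
    """Bucket an RMSD in Å using the abstract's thresholds (labels are ours)."""
    if value <= 1.5:
        return "near-crystallographic"
    if value < 4.0:
        return "average"
    return "low"
```

A model whose atoms are all displaced by exactly 1 Å from the reference has an RMSD of 1.0 Å and would fall in the first bucket.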
Affiliation(s)
- Dmitrii M. Nikolaev
- Nanotechnology Research and Education Centre RAS, Saint-Petersburg Academic University, 8/3 Khlopina Street, St. Petersburg 194021, Russia
- Andrey A. Shtyrov
- Nanotechnology Research and Education Centre RAS, Saint-Petersburg Academic University, 8/3 Khlopina Street, St. Petersburg 194021, Russia
- Maxim S. Panov
- Institute of Chemistry, Saint Petersburg State University, 7/9 Universitetskaya emb., St. Petersburg 199034, Russia
- Adeel Jamal
- Department of Chemical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139, United States
- Oleg B. Chakchir
- Nanotechnology Research and Education Centre RAS, Saint-Petersburg Academic University, 8/3 Khlopina Street, St. Petersburg 194021, Russia
- Vladimir A. Kochemirovsky
- Institute of Chemistry, Saint Petersburg State University, 7/9 Universitetskaya emb., St. Petersburg 199034, Russia
- Massimo Olivucci
- Department of Biotechnology, Chemistry and Pharmacy, Università di Siena, via A. Moro 2, Siena I-53100, Italy
- Mikhail N. Ryazantsev
- Institute of Chemistry, Saint Petersburg State University, 7/9 Universitetskaya emb., St. Petersburg 199034, Russia
- Institute of Macromolecular Compounds of the Russian Academy of Sciences, 31 Bolshoy pr., St. Petersburg 199004, Russia
14
Morales-Cordovilla JA, Sanchez V, Ratajczak M. Protein alignment based on higher order conditional random fields for template-based modeling. PLoS One 2018; 13:e0197912. [PMID: 29856860 PMCID: PMC5983487 DOI: 10.1371/journal.pone.0197912] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 09/20/2017] [Accepted: 05/10/2018] [Indexed: 11/19/2022] Open
Abstract
The query-template alignment of proteins is one of the most critical steps of template-based modeling methods used to predict the 3D structure of a query protein. This alignment can be interpreted as a temporal classification or structured prediction task and first order Conditional Random Fields have been proposed for protein alignment and proven to be rather successful. Some other popular structured prediction problems, such as speech or image classification, have gained from the use of higher order Conditional Random Fields due to the well known higher order correlations that exist between their labels and features. In this paper, we propose and describe the use of higher order Conditional Random Fields for query-template protein alignment. The experiments carried out on different public datasets validate our proposal, especially on distantly-related protein pairs which are the most difficult to align.
Affiliation(s)
- Victoria Sanchez
- Dept. of Teoría de la Señal Telemática y Comunicaciones, Universidad de Granada, Granada, Spain
- Martin Ratajczak
- Graz University of Technology, Signal Processing and Speech Communication Laboratory, Graz, Austria
15
Khunweeraphong N, Stockner T, Kuchler K. The structure of the human ABC transporter ABCG2 reveals a novel mechanism for drug extrusion. Sci Rep 2017; 7:13767. [PMID: 29061978 PMCID: PMC5653816 DOI: 10.1038/s41598-017-11794-w] [Citation(s) in RCA: 60] [Impact Index Per Article: 8.6] [Received: 07/03/2017] [Accepted: 08/31/2017] [Indexed: 12/13/2022] Open
Abstract
The human ABC transporter ABCG2 (Breast Cancer Resistance Protein, BCRP) is implicated in anticancer resistance, in detoxification across barriers and linked to gout. Here, we generate a novel atomic model of ABCG2 using the crystal structure of ABCG5/G8. Extensive mutagenesis verifies the structure, disclosing hitherto unrecognized essential residues and domains in the homodimeric ABCG2 transporter. The elbow helix, the first intracellular loop (ICL1) and the nucleotide-binding domain (NBD) constitute pivotal elements of the architecture building the transmission interface that borders a central cavity which acts as a drug trap. The transmission interface is stabilized by salt-bridge interactions between the elbow helix and ICL1, as well as within ICL1, which is essential to control the conformational switch of ABCG2 to the outward-open drug-releasing conformation. Importantly, we propose that ICL1 operates like a molecular spring that holds the NBD dimer close to the membrane, thereby enabling efficient coupling of ATP hydrolysis during the catalytic cycle. These novel mechanistic data open new opportunities to therapeutically target ABCG2 in the context of related diseases.
Affiliation(s)
- Narakorn Khunweeraphong
- Center for Medical Biochemistry, Max F. Perutz Laboratories, Medical University of Vienna, Campus Vienna Biocenter, Dr. Bohr-Gasse 9/2, A-1030, Vienna, Austria
- Thomas Stockner
- Center for Physiology and Pharmacology, Institute of Pharmacology, Medical University Vienna, Währingerstrasse 13A, A-1090, Vienna, Austria
- Karl Kuchler
- Center for Medical Biochemistry, Max F. Perutz Laboratories, Medical University of Vienna, Campus Vienna Biocenter, Dr. Bohr-Gasse 9/2, A-1030, Vienna, Austria
|
16
|
In silico probing and biological evaluation of SETDB1/ESET-targeted novel compounds that reduce tri-methylated histone H3K9 (H3K9me3) level. J Comput Aided Mol Des 2017; 31:877-889. [PMID: 28879500 DOI: 10.1007/s10822-017-0052-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 04/28/2017] [Accepted: 08/22/2017] [Indexed: 10/18/2022]
Abstract
ERG-associated protein with the SET domain (ESET/SET domain bifurcated 1/SETDB1/KMT1E) is a histone lysine methyltransferase (HKMT) that preferentially tri-methylates lysine 9 of histone H3 (H3K9me3). SETDB1/ESET leads to heterochromatin condensation and epigenetic gene silencing. These functional changes are reported to correlate with Huntington's disease (HD) progression and mood-related disorders, which makes SETDB1/ESET a viable drug target. In this context, the present investigation was performed to identify novel peptide-competitive small molecule inhibitors of SETDB1/ESET by a combined in silico-in vitro approach. A ligand-based pharmacophore model was built and employed for the virtual screening of the ChemDiv and Asinex databases. Also, a human SETDB1/ESET homology model was constructed to supplement the data further. Biological evaluation of the selected 21 candidates singled out 5 compounds exhibiting a notable reduction of the H3K9me3 level via inhibition of SETDB1/ESET activity in a SETDB1/ESET-inducible cell line and HD striatal cells. Later on, we identified two compounds as final hits that appear to have neuronal effects without cytotoxicity, based on the results of the MTT assay. These compounds hold the calibre to become future lead compounds and can provide structural insights for more SETDB1/ESET-focused drug discovery research. Moreover, these SETDB1/ESET inhibitors may be applicable in preclinical studies to ameliorate neurodegenerative disorders via epigenetic regulation.
|
17
|
Abstract
Genome sequencing projects have resulted in a rapid increase in the number of known protein sequences. In contrast, only about one-hundredth of these sequences have been characterized at atomic resolution using experimental structure determination methods. Computational protein structure modeling techniques have the potential to bridge this sequence-structure gap. In the following chapter, we present an example that illustrates the use of MODELLER to construct a comparative model for a protein with unknown structure. Automation of a similar protocol has resulted in models of useful accuracy for domains in more than half of all known protein sequences.
Affiliation(s)
- Benjamin Webb
- Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, and California Institute for Quantitative Biosciences (QB3), University of California San Francisco, San Francisco, CA, 94143, USA
- Andrej Sali
- Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, and California Institute for Quantitative Biosciences (QB3), University of California San Francisco, San Francisco, CA, 94143, USA
|
18
|
Abstract
Comparative protein structure modeling predicts the three-dimensional structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and how to use the ModBase database of such models, and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described. © 2016 by John Wiley & Sons, Inc.
Affiliation(s)
- Benjamin Webb
- University of California at San Francisco, San Francisco, California
- Andrej Sali
- University of California at San Francisco, San Francisco, California
|
19
|
Skolnick J, Zhou H. Why Is There a Glass Ceiling for Threading Based Protein Structure Prediction Methods? J Phys Chem B 2016; 121:3546-3554. [PMID: 27748116 DOI: 10.1021/acs.jpcb.6b09517] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 12/29/2022]
Abstract
Despite their different implementations, comparison of the best threading approaches to the prediction of evolutionarily distant protein structures reveals that they tend to succeed or fail on the same protein targets. This is true despite the fact that the structural template library has good templates for all cases. Thus, a key question is why certain protein structures are threadable while others are not. Comparison with threading results on a set of artificial sequences selected for stability further argues that the failure of threading is due to the nature of the protein structures themselves. Using a new contact-map-based alignment algorithm, we demonstrate that certain folds are highly degenerate in that they can have very similar coarse-grained fractions of native contacts aligned and yet differ significantly from the native structure. For threadable proteins, this is not the case. Thus, contemporary threading approaches appear to have reached a plateau, and new approaches to structure prediction are required.
Affiliation(s)
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive Northwest, Atlanta, Georgia 30318, United States
- Hongyi Zhou
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive Northwest, Atlanta, Georgia 30318, United States
|
20
|
Webb B, Sali A. Comparative Protein Structure Modeling Using MODELLER. CURRENT PROTOCOLS IN BIOINFORMATICS 2016; 54:5.6.1-5.6.37. [PMID: 27322406 PMCID: PMC5031415 DOI: 10.1002/cpbi.3] [Citation(s) in RCA: 1920] [Impact Index Per Article: 240.0] [Indexed: 12/22/2022]
Abstract
Comparative protein structure modeling predicts the three-dimensional structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and how to use the ModBase database of such models, and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described. © 2016 by John Wiley & Sons, Inc.
Affiliation(s)
- Benjamin Webb
- University of California at San Francisco, San Francisco, California
- Andrej Sali
- University of California at San Francisco, San Francisco, California
|
21
|
Gaillard T, Stote RH, Dejaegere A. PSSweb: protein structural statistics web server. Nucleic Acids Res 2016; 44:W401-5. [PMID: 27174930 PMCID: PMC4987900 DOI: 10.1093/nar/gkw332] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 01/30/2016] [Accepted: 04/17/2016] [Indexed: 11/29/2022] Open
Abstract
With the increasing number of protein structures available, there is a need for tools capable of automating the comparison of ensembles of structures, a common requirement in structural biology and bioinformatics. PSSweb is a web server for protein structural statistics. It takes as input an ensemble of PDB files of protein structures, performs a multiple sequence alignment and computes structural statistics for each position of the alignment. Different optional functionalities are proposed: structure superposition, Cartesian coordinate statistics, dihedral angle calculation and statistics, and a cluster analysis based on dihedral angles. An interactive report is generated, containing a summary of the results, tables, figures and 3D visualization of superposed structures. The server is available at http://pssweb.org.
Affiliation(s)
- Thomas Gaillard
- Laboratoire de Biochimie, École Polytechnique, CNRS, Université Paris-Saclay, 91128 Palaiseau cedex, France
- Roland H Stote
- Department of Integrative Structural Biology, Institut de Génétique et de Biologie Moléculaire et Cellulaire, Institut National de la Santé et de la Recherche Médicale U964, Centre National de la Recherche Scientifique UMR 7104, Université de Strasbourg, 67404 Illkirch, France
- Annick Dejaegere
- Department of Integrative Structural Biology, Institut de Génétique et de Biologie Moléculaire et Cellulaire, Institut National de la Santé et de la Recherche Médicale U964, Centre National de la Recherche Scientifique UMR 7104, Université de Strasbourg, 67404 Illkirch, France
|
22
|
Zhao W, Ho L, Wang J, Bi W, Yemul S, Ward L, Freire D, Mazzola P, Brathwaite J, Mezei M, Sanchez R, Elder GA, Pasinetti GM. In Silico Modeling of Novel Drug Ligands for Treatment of Concussion Associated Tauopathy. J Cell Biochem 2016; 117:2241-8. [PMID: 26910498 DOI: 10.1002/jcb.25521] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2015] [Accepted: 02/19/2016] [Indexed: 11/07/2022]
Abstract
The objective of this study was to develop an in silico screening model for characterization of potential novel ligands from commercial drug libraries able to functionally activate certain olfactory receptors (ORs), which are members of the class A rhodopsin-like family of G protein-coupled receptors (GPCRs), in the brain of murine models of concussion. We previously found that concussions may significantly influence expression of certain ORs, for example, OR4M1 in subjects with a history of concussion/traumatic brain injury (TBI). In this study, we built a 3-D OR4M1 model and used it for in silico screening of potential novel ligands from commercial drug libraries. We report that in vitro activation of OR4M1 with the commercially available ZINC library compound 10915775 led to a significant attenuation of abnormal tau phosphorylation in embryonic cortico-hippocampal neuronal cultures derived from NSE-OR4M1 transgenic mice, possibly through modulation of the JNK signaling pathway. The attenuation of abnormal tau phosphorylation was rather selective, since ZINC10915775 significantly decreased tau phosphorylation on tau Ser202/Thr205 (AT8 epitope) and tau Thr212/Ser214 (AT100 epitope), but not on tau Ser396/404 (PHF-1 epitope). Moreover, no response to ZINC10915775 was found in control hippocampal neuronal cultures derived from wild-type littermates. Our in silico model provides a novel means to pharmacologically modulate select ubiquitously expressed ORs in the brain through high-affinity ligand activation to prevent and eventually to treat concussion-induced downregulation of ORs and the subsequent cascade of tau pathology. J. Cell. Biochem. 117: 2241-2248, 2016. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Wei Zhao
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York; Geriatric Research Education Clinical Center at James J. Peters VA Medical Center, Bronx, New York
- Lap Ho
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York
- Jun Wang
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York; Geriatric Research Education Clinical Center at James J. Peters VA Medical Center, Bronx, New York
- Weina Bi
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York
- Shrishailam Yemul
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York
- Libby Ward
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York
- Daniel Freire
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York
- Paolo Mazzola
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York; School of Medicine and Surgery, University of Milano-Bicocca, Monza, Italy
- Justin Brathwaite
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York
- Mihaly Mezei
- Department of Structural and Chemical Biology, Icahn School of Medicine at Mount Sinai, New York; Experimental Therapeutics Institute, Icahn School of Medicine at Mount Sinai, New York
- Roberto Sanchez
- Department of Structural and Chemical Biology, Icahn School of Medicine at Mount Sinai, New York; Experimental Therapeutics Institute, Icahn School of Medicine at Mount Sinai, New York
- Gregory A Elder
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York; Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York
- Giulio Maria Pasinetti
- Department of Neurology, Icahn School of Medicine at Mount Sinai, New York; Geriatric Research Education Clinical Center at James J. Peters VA Medical Center, Bronx, New York; Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York
|
23
|
Park EH, Yeo SH, Kim MD. Cloning of the LEU2 gene from the amylolytic yeast Saccharomycopsis fibuligera. Food Sci Biotechnol 2015. [DOI: 10.1007/s10068-015-0286-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/29/2022] Open
|
24
|
Isogai Y, Nakayama K. Alteration of substrate selection of antibiotic acylase from β-lactam to echinocandin. Protein Eng Des Sel 2015; 29:49-56. [PMID: 26590167 DOI: 10.1093/protein/gzv059] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/24/2015] [Accepted: 10/09/2015] [Indexed: 11/13/2022] Open
Abstract
The antibiotic acylases belonging to the N-terminal nucleophile hydrolase superfamily are key enzymes for the industrial production of antibiotic drugs. Cephalosporin acylase (CA) and penicillin G acylase (PGA) are two of the most intensively studied enzymes that catalyze the deacylation of β-lactam antibiotics. On the other hand, aculeacin A acylase (AAC) is known to be an alternative acylase class catalyzing the deacylation of echinocandin or cyclic lipopeptide antibiotic compounds, but its structural and enzymatic properties remain to be explored. In the present study, 3D homology models of AAC were constructed, and docking simulation with substrate ligands was performed for AAC, as well as for CA and PGA. The docking models of AAC with aculeacin A suggest that AAC has a deep, narrow binding pocket for the long-chain fatty acyl group of the echinocandin molecule. To confirm this, CA mutants were designed to form a binding pocket for the long acyl chain. Experimentally synthesized mutant enzymes exhibited lower enzymatic activity for cephalosporin but higher activity for aculeacin A, in comparison with the wild-type enzyme. The present results clarify the difference in mechanisms of substrate selection between the β-lactam and echinocandin acylases and demonstrate the usefulness of computational approaches for engineering the enzymatic properties of antibiotic acylases.
Affiliation(s)
- Yasuhiro Isogai
- Department of Biotechnology, Faculty of Engineering, Toyama Prefectural University, 5180 Kurokawa, Imizu, Toyama 939-0398, Japan
- Kazuki Nakayama
- Department of Biotechnology, Faculty of Engineering, Toyama Prefectural University, 5180 Kurokawa, Imizu, Toyama 939-0398, Japan. Present address: Fujiyakuhin Co., Ltd, Itakura 682, Toyama, Toyama 939-2721, Japan
|
25
|
Palovcak E, Delemotte L, Klein ML, Carnevale V. Comparative sequence analysis suggests a conserved gating mechanism for TRP channels. J Gen Physiol 2015; 146:37-50. [PMID: 26078053 PMCID: PMC4485022 DOI: 10.1085/jgp.201411329] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.6] [Received: 11/12/2014] [Accepted: 05/11/2015] [Indexed: 12/12/2022] Open
Abstract
The transient receptor potential (TRP) channel superfamily plays a central role in transducing diverse sensory stimuli in eukaryotes. Although dissimilar in sequence and domain organization, all known TRP channels act as polymodal cellular sensors and form tetrameric assemblies similar to those of their distant relatives, the voltage-gated potassium (Kv) channels. Here, we investigated the related questions of whether the allosteric mechanism underlying polymodal gating is common to all TRP channels, and how this mechanism differs from that underpinning Kv channel voltage sensitivity. To provide insight into these questions, we performed comparative sequence analysis on large, comprehensive ensembles of TRP and Kv channel sequences, contextualizing the patterns of conservation and correlation observed in the TRP channel sequences in light of the well-studied Kv channels. We report sequence features that are specific to TRP channels and, based on insight from recent TRPV1 structures, we suggest a model of TRP channel gating that differs substantially from the one mediating voltage sensitivity in Kv channels. The common mechanism underlying polymodal gating involves the displacement of a defect in the H-bond network of S6 that changes the orientation of the pore-lining residues at the hydrophobic gate.
Affiliation(s)
- Eugene Palovcak
- Institute for Computational Molecular Science, Temple University, Philadelphia, PA 19122
- Lucie Delemotte
- Institute for Computational Molecular Science, Temple University, Philadelphia, PA 19122
- Michael L Klein
- Institute for Computational Molecular Science, Temple University, Philadelphia, PA 19122
- Vincenzo Carnevale
- Institute for Computational Molecular Science, Temple University, Philadelphia, PA 19122
|
26
|
Tong J, Pei J, Otwinowski Z, Grishin NV. Refinement by shifting secondary structure elements improves sequence alignments. Proteins 2015; 83:411-27. [PMID: 25546158 DOI: 10.1002/prot.24746] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/29/2014] [Revised: 11/25/2014] [Accepted: 12/10/2014] [Indexed: 01/09/2023]
Abstract
Constructing a model of a query protein based on its alignment to a homolog with experimentally determined spatial structure (the template) is still the most reliable approach to structure prediction. Alignment errors are the main bottleneck for homology modeling when the query is distantly related to the template. Alignment methods often misalign secondary structural elements by a few residues. Therefore, better alignment solutions can be found within a limited set of local shifts of secondary structures. We present a refinement method to improve pairwise sequence alignments by evaluating alignment variants generated by local shifts of template-defined secondary structures. Our method SFESA is based on a novel scoring function that combines the profile-based sequence score and the structure score derived from residue contacts in a template. Such a combined score frequently selects a better alignment variant among a set of candidate alignments generated by local shifts and leads to overall increase in alignment accuracy. Evaluation of several benchmarks shows that our refinement method significantly improves alignments made by automatic methods such as PROMALS, HHpred and CNFpred. The web server is available at http://prodata.swmed.edu/sfesa.
Affiliation(s)
- Jing Tong
- Department of Biophysics, University of Texas Southwestern Medical Center at Dallas, Dallas, Texas, 75390; Department of Biochemistry, University of Texas Southwestern Medical Center at Dallas, Dallas, Texas, 75390
|
27
|
Three-dimensional protein structure prediction: Methods and computational strategies. Comput Biol Chem 2014; 53PB:251-276. [DOI: 10.1016/j.compbiolchem.2014.10.001] [Citation(s) in RCA: 121] [Impact Index Per Article: 12.1] [Received: 06/12/2014] [Revised: 10/03/2014] [Accepted: 10/07/2014] [Indexed: 01/01/2023]
|
28
|
Park H, Lee GR, Heo L, Seok C. Protein loop modeling using a new hybrid energy function and its application to modeling in inaccurate structural environments. PLoS One 2014; 9:e113811. [PMID: 25419655 PMCID: PMC4242723 DOI: 10.1371/journal.pone.0113811] [Citation(s) in RCA: 63] [Impact Index Per Article: 6.3] [Received: 08/09/2014] [Accepted: 10/30/2014] [Indexed: 11/19/2022] Open
Abstract
Protein loop modeling is a tool for predicting protein local structures of particular interest, providing opportunities for applications involving protein structure prediction and de novo protein design. Until recently, the majority of loop modeling methods have been developed and tested by reconstructing loops in frameworks of experimentally resolved structures. In many practical applications, however, the protein loops to be modeled are located in inaccurate structural environments. These include loops in model structures, low-resolution experimental structures, or experimental structures of different functional forms. Accordingly, discrepancies in the accuracy of the structural environment assumed in development of the method and that in practical applications present additional challenges to modern loop modeling methods. This study demonstrates a new strategy for employing a hybrid energy function combining physics-based and knowledge-based components to help tackle this challenge. The hybrid energy function is designed to combine the strengths of each energy component, simultaneously maintaining accurate loop structure prediction in a high-resolution framework structure and tolerating minor environmental errors in low-resolution structures. A loop modeling method based on global optimization of this new energy function is tested on loop targets situated in different levels of environmental errors, ranging from experimental structures to structures perturbed in backbone as well as side chains and template-based model structures. The new method performs comparably to force field-based approaches in loop reconstruction in crystal structures and better in loop prediction in inaccurate framework structures. This result suggests that higher-accuracy predictions would be possible for a broader range of applications. The web server for this method is available at http://galaxy.seoklab.org/loop with the PS2 option for the scoring function.
Affiliation(s)
- Hahnbeom Park
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Gyu Rie Lee
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Lim Heo
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Chaok Seok
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
|
29
|
Hayashi T, Chiba S, Kaneta Y, Furuta T, Sakurai M. ATP-induced conformational changes of nucleotide-binding domains in an ABC transporter. Importance of the water-mediated entropic force. J Phys Chem B 2014; 118:12612-20. [PMID: 25302667 DOI: 10.1021/jp507930e] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 01/08/2023]
Abstract
ATP binding cassette (ABC) proteins belong to a superfamily of active transporters. Recent experimental and computational studies have shown that binding of ATP to the nucleotide binding domains (NBDs) of ABC proteins drives the dimerization of NBDs, which, in turn, causes large conformational changes within the transmembrane domains (TMDs). To elucidate the active substrate transport mechanism of ABC proteins, it is first necessary to understand how the NBD dimerization is driven by ATP binding. In this study, we selected MalKs (NBDs of a maltose transporter) as a representative NBD and calculated the free-energy change upon dimerization using molecular mechanics calculations combined with a statistical thermodynamic theory of liquids, as well as a method to calculate the translational, rotational, and vibrational entropy change. This combined method is applied to a large number of snapshot structures obtained from molecular dynamics simulations containing explicit water molecules. The results suggest that the NBD dimerization proceeds with a large gain of water entropy when ATP molecules bind to the NBDs. The energetic gain arising from direct NBD-NBD interactions is canceled by the dehydration penalty and the configurational-entropy loss. ATP hydrolysis induces a loss of the shape complementarity between the NBDs, which leads to the dissociation of the dimer, due to a decrease in the water-entropy gain and an increase in the configurational-entropy loss. This interpretation of the NBD dimerization mechanism in concert with ATP, especially focused on the water-mediated entropy force, is potentially applicable to a wide variety of the ABC transporters.
Affiliation(s)
- Tomohiko Hayashi
- Center for Biological Resources and Informatics, Tokyo Institute of Technology, 4259-B-62, Nagatsuta-cho, Midori-ku, Yokohama 226-8501, Japan
|
30
|
Abstract
Functional characterization of a protein sequence is one of the most frequent problems in biology. This task is usually facilitated by accurate three-dimensional (3-D) structure of the studied protein. In the absence of an experimentally determined structure, comparative or homology modeling can sometimes provide a useful 3-D model for a protein that is related to at least one known protein structure. Comparative modeling predicts the 3-D structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described.
Affiliation(s)
- Benjamin Webb
- University of California at San Francisco, San Francisco, California
|
31
|
Mamun K, Sharma A. Importance of Computational Intelligent in Proteomics. JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS 2014. [DOI: 10.20965/jaciii.2014.p0469] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/09/2022]
Abstract
Computational intelligence (CI) techniques have become an apparent need in many bioinformatics applications. In this article, we make the interested reader aware of the necessity of CI, providing a basic taxonomy of proteomics and discussing the use, variety, and potential of CI techniques in a number of both common and upcoming proteomics applications.
|
32
|
Gniewek P, Kolinski A, Kloczkowski A, Gront D. BioShell-Threading: versatile Monte Carlo package for protein 3D threading. BMC Bioinformatics 2014; 15:22. [PMID: 24444459 PMCID: PMC3937128 DOI: 10.1186/1471-2105-15-22] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 11/25/2012] [Accepted: 11/18/2013] [Indexed: 11/26/2022] Open
Abstract
Background: The comparative modeling approach to protein structure prediction inherently relies on a template structure. Before building a model, such a template protein has to be found and aligned with the query sequence. Any error made at this stage may dramatically affect the quality of the result. There is a need, therefore, to develop accurate and sensitive alignment protocols. Results: BioShell threading software is a versatile tool for aligning protein structures, protein sequences or sequence profiles, and query sequences to template structures. The software is also capable of sub-optimal alignment generation. It can be executed as an application from the UNIX command line, or as a set of Java classes called from a script or a Java application. The implemented Monte Carlo search engine greatly facilitates the development and benchmarking of new alignment scoring schemes even when the functions exhibit non-deterministic polynomial-time complexity. Conclusions: Numerical experiments indicate that the new threading application offers template detection abilities and provides much better alignments than other methods. The package along with documentation and examples is available at: http://bioshell.pl/threading3d.
Affiliation(s)
- Dominik Gront
- Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
|
33
|
Abstract
Genome sequencing projects have resulted in a rapid increase in the number of known protein sequences. In contrast, only about one-hundredth of these sequences have been characterized at atomic resolution using experimental structure determination methods. Computational protein structure modeling techniques have the potential to bridge this sequence-structure gap. In this chapter, we present an example that illustrates the use of MODELLER to construct a comparative model for a protein with unknown structure. Automation of a similar protocol has resulted in models of useful accuracy for domains in more than half of all known protein sequences.
Affiliation(s)
- Benjamin Webb
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, USA
|
34
|
Webb B, Eswar N, Fan H, Khuri N, Pieper U, Dong G, Sali A. Comparative Modeling of Drug Target Proteins. REFERENCE MODULE IN CHEMISTRY, MOLECULAR SCIENCES AND CHEMICAL ENGINEERING 2014. [PMCID: PMC7157477 DOI: 10.1016/b978-0-12-409547-2.11133-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
Abstract
In this perspective, we begin by describing the comparative protein structure modeling technique and the accuracy of the corresponding models. We then discuss the significant role that comparative prediction plays in drug discovery. We focus on virtual ligand screening against comparative models and illustrate the state-of-the-art by a number of specific examples.
|
35
|
Pieper U, Webb BM, Dong GQ, Schneidman-Duhovny D, Fan H, Kim SJ, Khuri N, Spill YG, Weinkam P, Hammel M, Tainer JA, Nilges M, Sali A. ModBase, a database of annotated comparative protein structure models and associated resources. Nucleic Acids Res 2013; 42:D336-46. [PMID: 24271400 PMCID: PMC3965011 DOI: 10.1093/nar/gkt1144] [Citation(s) in RCA: 219] [Impact Index Per Article: 19.9] [Indexed: 12/29/2022]
Abstract
ModBase (http://salilab.org/modbase) is a database of annotated comparative protein structure models. The models are calculated by ModPipe, an automated modeling pipeline that relies primarily on Modeller for fold assignment, sequence-structure alignment, model building and model assessment (http://salilab.org/modeller/). ModBase currently contains almost 30 million reliable models for domains in 4.7 million unique protein sequences. ModBase allows users to compute or update comparative models on demand, through an interface to the ModWeb modeling server (http://salilab.org/modweb). ModBase models are also available through the Protein Model Portal (http://www.proteinmodelportal.org/). Recently developed associated resources include the AllosMod server for modeling ligand-induced protein dynamics (http://salilab.org/allosmod), the AllosMod-FoXS server for predicting a structural ensemble that fits an SAXS profile (http://salilab.org/allosmod-foxs), the FoXSDock server for protein–protein docking filtered by an SAXS profile (http://salilab.org/foxsdock), the SAXS Merge server for automatic merging of SAXS profiles (http://salilab.org/saxsmerge) and the Pose & Rank server for scoring protein–ligand complexes (http://salilab.org/poseandrank). In this update, we also highlight two applications of ModBase: a PSI:Biology initiative to maximize the structural coverage of the human alpha-helical transmembrane proteome and a determination of structural determinants of human immunodeficiency virus-1 protease specificity.
Affiliation(s)
- Ursula Pieper
- Department of Bioengineering and Therapeutic Sciences, California Institute for Quantitative Biosciences, Byers Hall at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA, Department of Pharmaceutical Chemistry, California Institute for Quantitative Biosciences, Byers Hall at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA, Graduate Group in Biophysics, University of California at San Francisco, CA 94158, USA, Structural Bioinformatics Unit, Structural Biology and Chemistry department, Institut Pasteur, 25 rue du Docteur Roux, 75015 Paris, France, Université Paris Diderot-Paris 7, école doctorale iViv, Paris Rive Gauche, 5 rue Thomas Mann, 75013 Paris, France, Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA, Department of Molecular Biology, Skaggs Institute of Chemical Biology, The Scripps Research Institute, La Jolla, CA 92037, USA, Life Sciences Division, Department of Molecular Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
|
36
|
Gaillard T, Schwarz BBL, Chebaro Y, Stote RH, Dejaegere A. Protein structural statistics with PSS. J Chem Inf Model 2013; 53:2471-82. [PMID: 23957210 DOI: 10.1021/ci400233j] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 11/30/2022]
Abstract
Characterizing the variability within an ensemble of protein structures is a common requirement in structural biology and bioinformatics. With the increasing number of protein structures becoming available, there is a need for new tools capable of automating the structural comparison of large ensembles of structures. We present Protein Structural Statistics (PSS), a command-line program written in Perl for Unix-like environments, dedicated to the calculation of structural statistics for a set of proteins. PSS can perform multiple sequence alignments, structure superpositions, calculate Cartesian and dihedral coordinate statistics, and execute cluster analyses. An HTML report that contains a convenient summary of results with figures, tables, and hyperlinks can also be produced. PSS is a new tool providing an automated way to compare multiple structures. It integrates various types of structural analyses through a user-friendly and flexible interface, facilitating access to powerful but more specialized programs. PSS is easy to modify and extend and is distributed under a free and open source license. The relevance of PSS is illustrated by examples of application to pertinent biological problems.
Affiliation(s)
- Thomas Gaillard
- Laboratoire de Biochimie, UMR 7654 CNRS, Ecole Polytechnique, 91128 Palaiseau Cedex, France
|
37
|
Manning T, Sleator RD, Walsh P. Naturally selecting solutions: the use of genetic algorithms in bioinformatics. Bioengineered 2013; 4:266-78. [PMID: 23222169 PMCID: PMC3813526 DOI: 10.4161/bioe.23041] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 10/16/2012] [Revised: 11/26/2012] [Accepted: 11/28/2012] [Indexed: 11/19/2022]
Abstract
For decades, computer scientists have looked to nature for biologically inspired solutions to computational problems, ranging from robotic control to scheduling optimization. Paradoxically, as we move deeper into the post-genomics era, the reverse is occurring, as biologists and bioinformaticians look to computational techniques to solve a variety of biological problems. Among the most common biologically inspired techniques are genetic algorithms (GAs), which take the Darwinian concept of natural selection as the driving force behind systems for solving real-world problems, including those in the bioinformatics domain. Herein, we provide an overview of genetic algorithms and survey some of the most recent applications of this approach to bioinformatics-based problems.
Affiliation(s)
- Timmy Manning
- Department of Computer Science; Cork Institute of Technology; Cork, Ireland
- Roy D Sleator
- Department of Biological Sciences; Cork Institute of Technology; Cork, Ireland
- Paul Walsh
- Department of Computer Science; Cork Institute of Technology; Cork, Ireland
|
38
|
Bharadwaj VS, Dean AM, Maupin CM. Insights into the Glycyl Radical Enzyme Active Site of Benzylsuccinate Synthase: A Computational Study. J Am Chem Soc 2013; 135:12279-88. [DOI: 10.1021/ja404842r] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Indexed: 01/09/2023]
Affiliation(s)
- Vivek S. Bharadwaj
- Chemical and Biological Engineering Department, Colorado School of Mines, 1500 Illinois Street, Golden, Colorado 80401, United States
- Anthony M. Dean
- Chemical and Biological Engineering Department, Colorado School of Mines, 1500 Illinois Street, Golden, Colorado 80401, United States
- C. Mark Maupin
- Chemical and Biological Engineering Department, Colorado School of Mines, 1500 Illinois Street, Golden, Colorado 80401, United States
|
39
|
Tan KP, Nguyen TB, Patel S, Varadarajan R, Madhusudhan MS. Depth: a web server to compute depth, cavity sizes, detect potential small-molecule ligand-binding cavities and predict the pKa of ionizable residues in proteins. Nucleic Acids Res 2013; 41:W314-21. [PMID: 23766289 PMCID: PMC3692129 DOI: 10.1093/nar/gkt503] [Citation(s) in RCA: 125] [Impact Index Per Article: 11.4] [Indexed: 01/30/2023]
Abstract
Residue depth accurately measures burial and parameterizes local protein environment. Depth is the distance of any atom/residue to the closest bulk water. We consider the non-bulk waters to occupy cavities, whose volumes are determined using a Voronoi procedure. Our estimation of cavity sizes is statistically superior to estimates made by CASTp and VOIDOO, and on par with McVol over a data set of 40 cavities. Our calculated cavity volumes correlated best with the experimentally determined destabilization of 34 mutants from five proteins. Some of the cavities identified are capable of binding small molecule ligands. In this study, we have enhanced our depth-based predictions of binding sites by including evolutionary information. We have demonstrated that on a database (LigASite) of ∼200 proteins, we perform on par with ConCavity and better than MetaPocket 2.0. Our predictions, while less sensitive, are more specific and precise. Finally, we use depth (and other features) to predict pKas of GLU, ASP, LYS and HIS residues. Our results produce an average error of just <1 pH unit over 60 predictions. Our simple empirical method is statistically on par with two and superior to three other methods while inferior to only one. The DEPTH server (http://mspc.bii.a-star.edu.sg/depth/) is an ideal tool for rapid yet accurate structural analyses of protein structures.
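The definition used in this abstract (depth as the distance from an atom to the closest bulk water) can be illustrated with a minimal sketch. It assumes the bulk waters have already been identified and that coordinates are given as plain (x, y, z) tuples in ångströms; the function name `atom_depth` is hypothetical and is not part of the DEPTH server.

```python
import math

def atom_depth(atom, bulk_waters):
    """Return the distance from one atom to its nearest bulk water.

    atom        -- (x, y, z) coordinates of the atom (assumed input format)
    bulk_waters -- iterable of (x, y, z) coordinates of bulk water molecules
    """
    waters = list(bulk_waters)
    if not waters:
        raise ValueError("at least one bulk water is required")
    # Depth is the minimum Euclidean distance to any bulk water.
    return min(math.dist(atom, water) for water in waters)

# Example: the nearest water to the origin lies 5 units away.
depth = atom_depth((0.0, 0.0, 0.0), [(3.0, 4.0, 0.0), (10.0, 0.0, 0.0)])
```

A residue-level depth could then be taken as an average over the residue's atoms, although how the server aggregates atoms is not stated in the abstract.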
Affiliation(s)
- Kuan Pern Tan
- Bioinformatics Institute, 30 Biopolis Street, #07-01, Matrix, Singapore 138671
|
40
|
Latek D, Pasznik P, Carlomagno T, Filipek S. Towards improved quality of GPCR models by usage of multiple templates and profile-profile comparison. PLoS One 2013; 8:e56742. [PMID: 23468878 PMCID: PMC3585245 DOI: 10.1371/journal.pone.0056742] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Received: 10/30/2012] [Accepted: 01/14/2013] [Indexed: 11/19/2022]
Abstract
G-protein coupled receptors (GPCRs) are targets of nearly one third of the drugs on the current pharmaceutical market. Despite their importance in many cellular processes, crystal structures are available for fewer than 20 unique GPCRs of the Rhodopsin-like class. Fortunately, even though involved in different signaling cascades, this large group of membrane proteins has preserved a uniform structure comprising seven transmembrane helices that allows quite reliable comparative modeling. Nevertheless, low sequence similarity between the GPCR family members is still a serious obstacle not only in template selection but also in providing theoretical models of acceptable quality. An additional level of difficulty is the prediction of kinks and bulges in transmembrane helices. Usage of multiple templates and generation of alignments based on sequence profiles may increase the rate of success in difficult cases of comparative modeling in which the sequence similarity between GPCRs is exceptionally low. Here, we present GPCRM, a novel method for fast and accurate generation of GPCR models using averaging of multiple template structures and profile-profile comparison. In particular, GPCRM is the first GPCR structure predictor incorporating two distinct loop modeling techniques, Modeller and Rosetta, together with the filtering of models based on the Z-coordinate. We tested our approach on all unique GPCR structures determined to date and report its performance in comparison with other computational methods targeting the Rhodopsin-like class. We also provide a database of precomputed GPCR models of the human receptors from that class. Availability: GPCRM server and database: http://gpcrm.biomodellab.eu.
Affiliation(s)
- Dorota Latek
- International Institute of Molecular and Cell Biology, Warsaw, Poland
- * E-mail: (DL); (SF)
- Pawel Pasznik
- International Institute of Molecular and Cell Biology, Warsaw, Poland
- Teresa Carlomagno
- EMBL, Structural and Computational Biology Unit, Heidelberg, Germany
- Slawomir Filipek
- Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- * E-mail: (DL); (SF)
|
41
|
Abstract
Motivation: Alignment errors are still the main bottleneck for current template-based protein modeling (TM) methods, including protein threading and homology modeling, especially when the sequence identity between two proteins under consideration is low (<30%). Results: We present a novel protein threading method, CNFpred, which achieves much more accurate sequence–template alignment by employing a probabilistic graphical model called a Conditional Neural Field (CNF), which aligns one protein sequence to its remote template using a non-linear scoring function. This scoring function accounts for correlation among a variety of protein sequence and structure features, makes use of information in the neighborhood of two residues to be aligned, and is thus much more sensitive than the widely used linear or profile-based scoring function. To train this CNF threading model, we employ a novel quality-sensitive method, instead of the standard maximum-likelihood method, to maximize directly the expected quality of the training set. Experimental results show that CNFpred generates significantly better alignments than the best profile-based and threading methods on several public (but small) benchmarks as well as our own large dataset. CNFpred outperforms others regardless of the lengths or classes of proteins, and works particularly well for proteins with sparse sequence profiles due to the effective utilization of structure information. Our methodology can also be adapted to protein sequence alignment. Contact: j3xu@ttic.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jianzhu Ma
- Toyota Technological Institute at Chicago, IL 60637, USA
|
42
|
Vyas VK, Ukawala RD, Ghate M, Chintha C. Homology modeling a fast tool for drug discovery: current perspectives. Indian J Pharm Sci 2012. [PMID: 23204616 PMCID: PMC3507339 DOI: 10.4103/0250-474x.102537] [Citation(s) in RCA: 139] [Impact Index Per Article: 11.6] [Indexed: 11/22/2022]
Abstract
A major goal of structural biology involves the formation of protein-ligand complexes, in which the protein molecules act energetically in the course of binding. Therefore, an understanding of protein-ligand interactions is very important for structure-based drug design. Lack of knowledge of 3D structures has hindered efforts to understand the binding specificities of ligands with proteins. With the increase in modeling software and the growing number of known protein structures, homology modeling is rapidly becoming the method of choice for obtaining 3D coordinates of proteins. Homology modeling is a representation of the similarity of environmental residues at topologically corresponding positions in the reference proteins. In the absence of experimental data, model building on the basis of a known 3D structure of a homologous protein is at present the only reliable method to obtain structural information. Knowledge of the 3D structures of proteins provides invaluable insights into the molecular basis of their functions. Recent advances in homology modeling, particularly in detecting and aligning sequences with template structures and distant homologues, in modeling loops and side chains, and in detecting errors in a model, have contributed to consistent prediction of protein structure, which was not possible even several years ago. This review focuses on the features and role of homology modeling in predicting protein structure and describes current developments in this field, with successful applications at the different stages of drug design and discovery.
Affiliation(s)
- V K Vyas
- Department of Pharmaceutical Chemistry, Institute of Pharmacy, Nirma University, Ahmedabad-382 481, India
|
43
|
Vishnepolsky B, Managadze G, Grigolava M, Pirtskhalava M. Evaluation performance of substitution matrices, based on contacts between residue terminal groups. J Biomol Struct Dyn 2012; 30:180-90. [PMID: 22702729 DOI: 10.1080/07391102.2012.677769] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/28/2022]
Abstract
Sequence alignment is a standard method for the estimation of the evolutionary, structural, and functional relationships among amino acid sequences. The quality of alignments depends on the similarity matrix used. Statistical contact potentials (CPs) contain information on contact propensities among residues in native protein structures. Substitution matrices (SMs) based on CPs are applicable for the comparison of distantly related sequences. Here, contact between amino acids was estimated on the basis of the evaluation of the distances between side-chain terminal groups (SCTGs), which are defined as the group of the side-chain heavy atoms with fixed distances between them. In this paper, two new types of CPs and similarity matrices have been constructed: one is based on a fixed cutoff distance obtained from geometric characteristics of the SCTGs (TGC1), while the other is a distance-dependent potential (TGC2). These matrices are compared with other popular SMs. The performance of the matrices was evaluated by comparing sequence with structural alignments. The obtained results show that TGC2 has the best performance among contact-based matrices, but on the whole, contact-based matrices have slightly lower performance than other SMs except fold-level similarity.
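The fixed-cutoff notion of contact mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: each residue is represented by a single hypothetical point (for instance a terminal-group centroid), the cutoff value is arbitrary, and the function name `contacts_within_cutoff` is invented here; it does not reproduce the authors' SCTG geometry or the TGC1 potential.

```python
import math

def contacts_within_cutoff(points, cutoff):
    """List index pairs (i, j), i < j, whose points lie within `cutoff`.

    points -- list of (x, y, z) tuples, one representative point per residue
              (an assumed simplification of a side-chain terminal group)
    cutoff -- contact distance threshold, same units as the coordinates
    """
    contacts = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            # A pair counts as a contact when its distance does not exceed the cutoff.
            if math.dist(points[i], points[j]) <= cutoff:
                contacts.append((i, j))
    return contacts

# Example with three points on a line and an arbitrary cutoff of 5 units.
pairs = contacts_within_cutoff([(0, 0, 0), (3, 0, 0), (10, 0, 0)], 5.0)
```

Counting such pairs by residue type over many native structures is the usual starting point for a statistical contact potential; a distance-dependent variant would bin the distances instead of applying a single threshold.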
Affiliation(s)
- Boris Vishnepolsky
- Life Science Research Centre, Laboratory of Bioinformatics, 14 Gotua St, Tbilisi, 0160, Georgia.
|
44
|
Braberg H, Webb BM, Tjioe E, Pieper U, Sali A, Madhusudhan MS. SALIGN: a web server for alignment of multiple protein sequences and structures. Bioinformatics 2012; 28:2072-3. [PMID: 22618536 DOI: 10.1093/bioinformatics/bts302] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.8] [Indexed: 11/14/2022]
Abstract
Summary: Accurate alignment of protein sequences and/or structures is crucial for many biological analyses, including functional annotation of proteins, classifying protein sequences into families, and comparative protein structure modeling. Described here is a web interface to SALIGN, the versatile protein multiple sequence/structure alignment module of MODELLER. The web server automatically determines the best alignment procedure based on the inputs, while allowing the user to override default parameter values. Multiple alignments are guided by a dendrogram computed from a matrix of all pairwise alignment scores. When aligning sequences to structures, SALIGN uses structural environment information to place gaps optimally. If two multiple sequence alignments of related proteins are input to the server, a profile-profile alignment is performed. All features of the server have been previously optimized for accuracy, especially in the contexts of comparative modeling and identification of interacting protein partners. Availability: The SALIGN web server is freely accessible to the academic community at http://salilab.org/salign. SALIGN is a module of the MODELLER software, also freely available to academic users (http://salilab.org/modeller). Contact: sali@salilab.org; madhusudhan@bii.a-star.edu.sg.
Affiliation(s)
- Hannes Braberg
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, CA 94158, USA
|
45
|
Computer-based annotation of putative AraC/XylS-family transcription factors of known structure but unknown function. J Biomed Biotechnol 2012; 2012:103132. [PMID: 22505803 PMCID: PMC3312330 DOI: 10.1155/2012/103132] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 09/28/2011] [Revised: 12/09/2011] [Accepted: 12/13/2011] [Indexed: 12/12/2022]
Abstract
Currently, about 20 crystal structures per day are released and deposited in the Protein Data Bank. A significant fraction of these structures is produced by research groups associated with the structural genomics consortium. The biological function of many of these proteins is generally unknown or not validated by experiment. Therefore, a growing need for functional prediction of protein structures has emerged. Here we present an integrated bioinformatics method that combines sequence-based relationships and three-dimensional (3D) structural similarity of transcriptional regulators with computer prediction of their cognate DNA binding sequences. We applied this method to the AraC/XylS family of transcription factors, which is a large family of transcriptional regulators found in many bacteria controlling the expression of genes involved in diverse biological functions. Three putative new members of this family with known 3D structure but unknown function were identified for which a probable functional classification is provided. Our bioinformatics analyses suggest that they could be involved in plant cell wall degradation (Lin2118 protein from Listeria innocua, PDB code 3oou), symbiotic nitrogen fixation (protein from Chromobacterium violaceum, PDB code 3oio), and either metabolism of plant-derived biomass or nitrogen fixation (protein from Rhodopseudomonas palustris, PDB code 3mn2).
|
46
|
Phillips G, Grochowski LL, Bonnett S, Xu H, Bailly M, Haas-Blaby C, El Yacoubi B, Iwata-Reuyl D, White RH, de Crécy-Lagard V. Functional promiscuity of the COG0720 family. ACS Chem Biol 2012; 7:197-209. [PMID: 21999246 PMCID: PMC3262898 DOI: 10.1021/cb200329f] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Indexed: 11/29/2022]
Abstract
The biosynthesis of GTP derived metabolites such as tetrahydrofolate (THF), biopterin (BH(4)), and the modified tRNA nucleosides queuosine (Q) and archaeosine (G(+)) relies on several enzymes of the Tunnel-fold superfamily. A subset of these proteins includes the 6-pyruvoyltetrahydropterin (PTPS-II), PTPS-III, and PTPS-I homologues, all members of the COG0720 family that have been previously shown to transform 7,8-dihydroneopterin triphosphate (H(2)NTP) into different products. PTPS-II catalyzes the formation of 6-pyruvoyltetrahydropterin in the BH(4) pathway, PTPS-III catalyzes the formation of 6-hydroxylmethyl-7,8-dihydropterin in the THF pathway, and PTPS-I catalyzes the formation of 6-carboxy-5,6,7,8-tetrahydropterin in the Q pathway. Genes of these three enzyme families are often misannotated as they are difficult to differentiate by sequence similarity alone. Using a combination of physical clustering, signature motif, phylogenetic codistribution analyses, in vivo complementation studies, and in vitro enzymatic assays, a complete reannotation of the COG0720 family was performed in prokaryotes. Notably, this work identified and experimentally validated dual function PTPS-I/III enzymes involved in both THF and Q biosynthesis. Both in vivo and in vitro analyses showed that the PTPS-I family could tolerate a translation of the active site cysteine and was inherently promiscuous, catalyzing different reactions on the same substrate or the same reaction on different substrates. Finally, the analysis and experimental validation of several archaeal COG0720 members confirmed the role of PTPS-I in archaeosine biosynthesis and resulted in the identification of PTPS-III enzymes with variant signature sequences in Sulfolobus species. This study reveals an expanded versatility of the COG0720 family members and illustrates that for certain protein families extensive comparative genomic analysis beyond homology is required to correctly predict function.
Affiliation(s)
- Gabriela Phillips
- Department of Microbiology and Cell Science, University of Florida, Gainesville, FL 32611
- Laura L. Grochowski
- Department of Biochemistry, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061
- Shilah Bonnett
- Department of Chemistry, Portland State University, Portland, OR 97207
- Huimin Xu
- Department of Biochemistry, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061
- Marc Bailly
- Department of Microbiology and Cell Science, University of Florida, Gainesville, FL 32611
- Crysten Haas-Blaby
- Department of Microbiology and Cell Science, University of Florida, Gainesville, FL 32611
- Basma El Yacoubi
- Department of Microbiology and Cell Science, University of Florida, Gainesville, FL 32611
- Dirk Iwata-Reuyl
- Department of Chemistry, Portland State University, Portland, OR 97207
- Robert H. White
- Department of Biochemistry, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061
- Valérie de Crécy-Lagard
- Department of Microbiology and Cell Science, University of Florida, Gainesville, FL 32611
|
47
|
Mullins JGL. Structural modelling pipelines in next generation sequencing projects. Advances in Protein Chemistry and Structural Biology 2012; 89:117-67. [PMID: 23046884 DOI: 10.1016/b978-0-12-394287-6.00005-7] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Indexed: 02/06/2023]
Abstract
Our capacity to reliably predict protein structure from sequence is steadily improving due to the increased numbers and better targeting of protein structures being experimentally determined by structural genomics projects, along with the development of better modeling methodologies. Template-based (homology) modeling and de novo modeling methods are being combined to fill in remaining gaps in template coverage, and powerful automated structural modeling pipelines are being applied to large data sets of protein sequences. The improved quality of 3D models of proteins has led to their routine use in assessing the functional impact of nonsynonymous single nucleotide polymorphisms (nsSNPs) in specific protein systems, with the development of approaches that may be applied in a predictive fashion to nsSNPs emerging from next-generation sequencing projects. The challenges encountered in deriving functionally meaningful deductions from structural modeling can be quite different for proteins of different protein functional classes. The specific challenges to the assessment of the structural and functional impact of nsSNPs in globular proteins such as binding and regulatory proteins, structural proteins, and enzymes are discussed, as well as membrane transport proteins and ion channels. The mapping of reliable predictions of the structural and functional impact of SNPs, generated from automated modeling pipelines, on to protein-protein interaction networks will facilitate new approaches to understanding complex polygenic disorders and predisposition to disease.
Affiliation(s)
- Jonathan G L Mullins
- Genome and Structural Bioinformatics, Institute of Life Science, College of Medicine, Swansea University, Singleton Park, Swansea, Wales, UK.
|
48
|
Ye X, Wang G, Altschul SF. An assessment of substitution scores for protein profile-profile comparison. Bioinformatics 2011; 27:3356-63. [PMID: 21998158 PMCID: PMC3232366 DOI: 10.1093/bioinformatics/btr565] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 06/03/2011] [Revised: 09/22/2011] [Accepted: 10/06/2011] [Indexed: 11/14/2022]
Abstract
Motivation: Pairwise protein sequence alignments are generally evaluated using scores defined as the sum of substitution scores for aligning amino acids to one another, and gap scores for aligning runs of amino acids in one sequence to null characters inserted into the other. Protein profiles may be abstracted from multiple alignments of protein sequences, and substitution and gap scores have been generalized to the alignment of such profiles either to single sequences or to other profiles. Although there is widespread agreement on the general form substitution scores should take for profile-sequence alignment, little consensus has been reached on how best to construct profile-profile substitution scores, and a large number of these scoring systems have been proposed. Here, we assess a variety of such substitution scores. For this evaluation, given a gold standard set of multiple alignments, we calculate the probability that a profile column yields a higher substitution score when aligned to a related than to an unrelated column. We also generalize this measure to sets of two or three adjacent columns. This simple approach has the advantages that it does not depend primarily upon the gold-standard alignment columns with the weakest empirical support, and that it does not need to fit gap and offset costs for use with each substitution score studied. Results: A simple symmetrization of mean profile-sequence scores usually performed the best. These were followed closely by several specific scoring systems constructed using a variety of rationales. Contact: altschul@ncbi.nlm.nih.gov Supplementary information: Supplementary data are available at Bioinformatics online.
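The evaluation measure described in this abstract, the probability that a related column pair scores higher than an unrelated one, can be estimated empirically from two lists of scores. The sketch below is one plausible reading, not the authors' procedure: it compares every related score with every unrelated score, and counting ties as one half is an assumption made here. The function name `prob_related_scores_higher` is hypothetical.

```python
from itertools import product

def prob_related_scores_higher(related_scores, unrelated_scores):
    """Estimate P(score of a related pair > score of an unrelated pair).

    Every related score is compared with every unrelated score.
    Ties contribute 0.5 (an assumption of this sketch).
    """
    if not related_scores or not unrelated_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for related, unrelated in product(related_scores, unrelated_scores):
        if related > unrelated:
            wins += 1.0
        elif related == unrelated:
            wins += 0.5
    return wins / (len(related_scores) * len(unrelated_scores))

# Example: three wins and one tie out of four comparisons.
p = prob_related_scores_higher([2, 3], [1, 2])
```

A value near 1 would indicate that the substitution score separates related from unrelated columns well, while 0.5 corresponds to no discrimination. Because the measure works on individual column scores, it needs no gap parameters, which matches the advantage noted in the abstract.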
Affiliation(s)
- Xugang Ye
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|
49
|
Kuziemko A, Honig B, Petrey D. Using structure to explore the sequence alignment space of remote homologs. PLoS Comput Biol 2011; 7:e1002175. [PMID: 21998567 PMCID: PMC3188491 DOI: 10.1371/journal.pcbi.1002175] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 02/01/2011] [Accepted: 07/14/2011] [Indexed: 11/18/2022]
Abstract
Protein structure modeling by homology requires an accurate sequence alignment between the query protein and its structural template. However, sequence alignment methods based on dynamic programming (DP) are typically unable to generate accurate alignments for remote sequence homologs, thus limiting the applicability of modeling methods. A central problem is that the alignment that is “optimal” in terms of the DP score does not necessarily correspond to the alignment that produces the most accurate structural model. That is, the correct alignment based on structural superposition will generally have a lower score than the optimal alignment obtained from sequence. Variations of the DP algorithm have been developed that generate alternative alignments that are “suboptimal” in terms of the DP score, but these still encounter difficulties in detecting the correct structural alignment. We present here a new alternative sequence alignment method that relies heavily on the structure of the template. By initially aligning the query sequence to individual fragments in secondary structure elements and combining high-scoring fragments that pass basic tests for “modelability”, we can generate accurate alignments within a small ensemble. Our results suggest that the set of sequences that can currently be modeled by homology can be greatly extended.
It has been suggested that, for nearly every protein sequence, there is already a protein with a similar structure in current protein structure databases. However, with poor or undetectable sequence relationships, it is expected that accurate alignments and models cannot be generated. Here we show that this is not the case, and that whenever a structural relationship exists, there are usually local sequence relationships that can be used to generate an accurate alignment, no matter what the global sequence identity. However, this requires an alternative to the traditional dynamic programming algorithm and the consideration of a small ensemble of alignments. We present an algorithm, S4, and demonstrate that it is capable of generating accurate alignments in nearly all cases where a structural relationship exists between two proteins. Our results thus constitute an important advance in the full exploitation of the information in structural databases. That is, the expectation of an accurate alignment suggests that a meaningful model can be generated for nearly every sequence for which a suitable template exists.
Affiliation(s)
- Andrew Kuziemko
- Howard Hughes Medical Institute, Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York, United States of America
- Center for Computational Biology and Bioinformatics, Columbia University, New York, New York, United States of America
- Barry Honig
- Howard Hughes Medical Institute, Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York, United States of America
- Center for Computational Biology and Bioinformatics, Columbia University, New York, New York, United States of America
- Donald Petrey
- Howard Hughes Medical Institute, Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York, United States of America
- Center for Computational Biology and Bioinformatics, Columbia University, New York, New York, United States of America
|
50
|
Wang C, Yan RX, Wang XF, Si JN, Zhang Z. Comparison of linear gap penalties and profile-based variable gap penalties in profile–profile alignments. Comput Biol Chem 2011; 35:308-18. [DOI: 10.1016/j.compbiolchem.2011.07.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 02/05/2011] [Revised: 05/06/2011] [Accepted: 07/11/2011] [Indexed: 10/18/2022]
|