1
Vishwakarma P, Vattekatte AM, Shinada N, Diharce J, Martins C, Cadet F, Gardebien F, Etchebest C, Nadaradjane AA, de Brevern AG. VHH Structural Modelling Approaches: A Critical Review. Int J Mol Sci 2022; 23:3721. [PMID: 35409081 PMCID: PMC8998791 DOI: 10.3390/ijms23073721] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/25/2022] [Revised: 03/23/2022] [Accepted: 03/23/2022] [Indexed: 12/20/2022] [Open Access]
Abstract
VHH, i.e., VH domains of camelid single-chain antibodies, are very promising therapeutic agents due to their significant physicochemical advantages compared to classical mammalian antibodies. The number of experimentally solved VHH structures has significantly improved recently, which is of great help, because it offers the ability to directly work on 3D structures to humanise or improve them. Unfortunately, most VHHs do not have 3D structures. Thus, it is essential to find alternative ways to get structural information. The methods of structure prediction from the primary amino acid sequence appear essential to bypass this limitation. This review presents the most extensive overview of structure prediction methods applied for the 3D modelling of a given VHH sequence (a total of 21). Besides the historical overview, it aims at showing how model software programs have been shaping the structural predictions of VHHs. A brief explanation of each methodology is supplied, and pertinent examples of their usage are provided. Finally, we present a structure prediction case study of a recently solved VHH structure. According to some recent studies and the present analysis, AlphaFold 2 and NanoNet appear to be the best tools to predict a structural model of VHH from its sequence.
Affiliation(s)
- Poonam Vishwakarma
- INSERM UMR_S 1134, BIGR, DSIMB Team, Université de Paris and Université de la Réunion, F-75015 Paris, France; (P.V.); (A.M.V.); (J.D.); (C.M.); (C.E.); (A.A.N.)
- INSERM UMR_S 1134, BIGR, DSIMB Team, Université de Paris and Université de la Réunion, F-97715 Saint Denis Messag, France; (F.C.); (F.G.)
- Akhila Melarkode Vattekatte
- INSERM UMR_S 1134, BIGR, DSIMB Team, Université de Paris and Université de la Réunion, F-75015 Paris, France; (P.V.); (A.M.V.); (J.D.); (C.M.); (C.E.); (A.A.N.)
- INSERM UMR_S 1134, BIGR, DSIMB Team, Université de Paris and Université de la Réunion, F-97715 Saint Denis Messag, France; (F.C.); (F.G.)
- Julien Diharce
- INSERM UMR_S 1134, BIGR, DSIMB Team, Université de Paris and Université de la Réunion, F-75015 Paris, France; (P.V.); (A.M.V.); (J.D.); (C.M.); (C.E.); (A.A.N.)
- Carla Martins
- INSERM UMR_S 1134, BIGR, DSIMB Team, Université de Paris and Université de la Réunion, F-75015 Paris, France; (P.V.); (A.M.V.); (J.D.); (C.M.); (C.E.); (A.A.N.)
- INSERM UMR_S 1134, BIGR, DSIMB Team, Université de Paris and Université de la Réunion, F-97715 Saint Denis Messag, France; (F.C.); (F.G.)
- Frédéric Cadet
- INSERM UMR_S 1134, BIGR, DSIMB Team, Université de Paris and Université de la Réunion, F-97715 Saint Denis Messag, France; (F.C.); (F.G.)
- PEACCEL, Artificial Intelligence Department, Square Albin Cachot, F-75013 Paris, France
- Fabrice Gardebien
- INSERM UMR_S 1134, BIGR, DSIMB Team, Université de Paris and Université de la Réunion, F-97715 Saint Denis Messag, France; (F.C.); (F.G.)
- Catherine Etchebest
- INSERM UMR_S 1134, BIGR, DSIMB Team, Université de Paris and Université de la Réunion, F-75015 Paris, France; (P.V.); (A.M.V.); (J.D.); (C.M.); (C.E.); (A.A.N.)
- Aravindan Arun Nadaradjane
- INSERM UMR_S 1134, BIGR, DSIMB Team, Université de Paris and Université de la Réunion, F-75015 Paris, France; (P.V.); (A.M.V.); (J.D.); (C.M.); (C.E.); (A.A.N.)
- INSERM UMR_S 1134, BIGR, DSIMB Team, Université de Paris and Université de la Réunion, F-97715 Saint Denis Messag, France; (F.C.); (F.G.)
- Alexandre G. de Brevern
- INSERM UMR_S 1134, BIGR, DSIMB Team, Université de Paris and Université de la Réunion, F-75015 Paris, France; (P.V.); (A.M.V.); (J.D.); (C.M.); (C.E.); (A.A.N.)
- INSERM UMR_S 1134, BIGR, DSIMB Team, Université de Paris and Université de la Réunion, F-97715 Saint Denis Messag, France; (F.C.); (F.G.)
2
Hirte M, Meese N, Mertz M, Fuchs M, Brück TB. Insights Into the Bifunctional Aphidicolan-16-ß-ol Synthase Through Rapid Biomolecular Modeling Approaches. Front Chem 2018; 6:101. [PMID: 29692986 PMCID: PMC5902962 DOI: 10.3389/fchem.2018.00101] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 01/10/2018] [Accepted: 03/20/2018] [Indexed: 01/23/2023] [Open Access]
Abstract
Diterpene synthases catalyze complex, multi-step C-C coupling reactions thereby converting the universal, aliphatic precursor geranylgeranyl diphosphate into diverse olefinic macrocycles that form the basis for the structural diversity of the diterpene natural product family. Since catalytically relevant crystal structures of diterpene synthases are scarce, homology based biomolecular modeling techniques offer an alternative route to study the enzyme's reaction mechanism. However, precise identification of catalytically relevant amino acids is challenging since these models require careful preparation and refinement techniques prior to substrate docking studies. Targeted amino acid substitutions in this protein class can initiate premature quenching of the carbocation centered reaction cascade. The structural characterization of those alternative cyclization products allows for elucidation of the cyclization reaction cascade and provides a new source for complex macrocyclic synthons. In this study, new insights into structure and function of the fungal, bifunctional Aphidicolan-16-ß-ol synthase were achieved using a simplified biomolecular modeling strategy. The applied refinement methodologies could rapidly generate a reliable protein-ligand complex, which provides for an accurate in silico identification of catalytically relevant amino acids. Guided by our modeling data, ACS mutations led to the identification of the catalytically relevant ACS amino acid network I626, T657, Y658, A786, F789, and Y923. Moreover, the ACS amino acid substitutions Y658L and D661A resulted in a premature termination of the cyclization reaction cascade en-route from syn-copalyl diphosphate to Aphidicolan-16-ß-ol. Both ACS mutants generated the diterpene macrocycle syn-copalol and a minor, non-hydroxylated labdane related diterpene, respectively.
Our biomolecular modeling and mutational studies suggest that the ACS substrate cyclization occurs in a spatially restricted location of the enzyme's active site and that the geranylgeranyl diphosphate derived pyrophosphate moiety remains in the ACS active site thereby directing the cyclization process. Our cumulative data confirm that amino acids constituting the G-loop of diterpene synthases are involved in the open to the closed, catalytically active enzyme conformation. This study demonstrates that a simple and rapid biomolecular modeling procedure can predict catalytically relevant amino acids. The approach reduces computational and experimental screening efforts for diterpene synthase structure-function analyses.
Affiliation(s)
- Max Hirte
- Werner Siemens Chair of Synthetic Biotechnology, Department of Chemistry, Technical University of Munich, Munich, Germany
- Nicolas Meese
- Werner Siemens Chair of Synthetic Biotechnology, Department of Chemistry, Technical University of Munich, Munich, Germany
- Michael Mertz
- Werner Siemens Chair of Synthetic Biotechnology, Department of Chemistry, Technical University of Munich, Munich, Germany
- Monika Fuchs
- Werner Siemens Chair of Synthetic Biotechnology, Department of Chemistry, Technical University of Munich, Munich, Germany
- Thomas B Brück
- Werner Siemens Chair of Synthetic Biotechnology, Department of Chemistry, Technical University of Munich, Munich, Germany
3
Abstract
A key challenge of modern biology is to uncover the functional role of the protein entities that compose cellular proteomes. To this end, the availability of reliable three-dimensional atomic models of proteins is often crucial. This protocol presents a community-wide web-based method using RaptorX (http://raptorx.uchicago.edu/) for protein secondary structure prediction, template-based tertiary structure modeling, alignment quality assessment and sophisticated probabilistic alignment sampling. RaptorX distinguishes itself from other servers by the quality of the alignment between a target sequence and one or multiple distantly related template proteins (especially those with sparse sequence profiles) and by a novel nonlinear scoring function and a probabilistic-consistency algorithm. Consequently, RaptorX delivers high-quality structural models for many targets with only remote templates. At present, it takes RaptorX ~35 min to finish processing a sequence of 200 amino acids. Since its official release in August 2011, RaptorX has processed ~6,000 sequences submitted by ~1,600 users from around the world.
4
Moreno-Hernández S, Levitt M. Comparative modeling and protein-like features of hydrophobic-polar models on a two-dimensional lattice. Proteins 2012; 80:1683-93. [PMID: 22411636 DOI: 10.1002/prot.24067] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 11/15/2011] [Revised: 02/26/2012] [Accepted: 03/03/2012] [Indexed: 11/07/2022]
Abstract
Lattice models of proteins have been extensively used to study protein thermodynamics, folding dynamics, and evolution. Our study considers two different hydrophobic-polar (HP) models on the 2D square lattice: the purely HP model and a model where a compactness-favoring term is added. We exhaustively enumerate all the possible structures in our models and perform the study of their corresponding folds, HP arrangements in space and shapes. The two models considered differ greatly in their numbers of structures, folds, arrangements, and shapes. Despite their differences, both lattice models have distinctive protein-like features: (1) Shapes are compact in both models, especially when a compactness-favoring energy term is added. (2) The residue composition is independent of the chain length and is very close to 50% hydrophobic in both models, as we observe in real proteins. (3) Comparative modeling works well in both models, particularly in the more compact one. The fact that our models show protein-like features suggests that lattice models incorporate the fundamental physical principles of proteins. Our study supports the use of lattice models to study questions about proteins that require exactness and extensive calculations, such as protein design and evolution, which are often too complex and computationally demanding to be addressed with more detailed models.
Affiliation(s)
- Sergio Moreno-Hernández
- Department of Structural Biology, Stanford University School of Medicine, Stanford, CA 94305, USA
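The exhaustive enumeration mentioned in the abstract above can be illustrated with a minimal sketch for the purely HP model only. It assumes the common convention that each non-bonded hydrophobic-hydrophobic contact contributes -1 to the energy; the abstract does not define the energy function or the compactness-favoring term, so neither is taken from the paper. The function names (`self_avoiding_walks`, `hp_energy`, `lowest_energy_conformations`) are hypothetical, and the first bond is fixed along +x only to remove rotational duplicates.

```python
from typing import Iterator, List, Tuple

Coord = Tuple[int, int]
MOVES: List[Coord] = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def self_avoiding_walks(n: int) -> Iterator[List[Coord]]:
    """Yield every self-avoiding chain of n residues on the 2D square lattice.

    The chain starts at the origin and its first bond points along +x,
    which removes the four rotational copies of each conformation.
    """
    if n < 2:
        yield [(0, 0)][:n]
        return
    path: List[Coord] = [(0, 0), (1, 0)]
    occupied = set(path)

    def extend() -> Iterator[List[Coord]]:
        if len(path) == n:
            yield list(path)
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt in occupied:
                continue  # lattice site already used by the chain
            path.append(nxt)
            occupied.add(nxt)
            yield from extend()
            path.pop()
            occupied.remove(nxt)

    yield from extend()


def hp_energy(seq: str, path: List[Coord]) -> int:
    """Assumed energy: -1 per H-H lattice contact between non-bonded residues."""
    index_at = {coord: i for i, coord in enumerate(path)}
    energy = 0
    for i, (x, y) in enumerate(path):
        if seq[i] != "H":
            continue
        # Look only right and up so that each contact is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and abs(i - j) > 1 and seq[j] == "H":
                energy -= 1
    return energy


def lowest_energy_conformations(seq: str) -> Tuple[int, List[List[Coord]]]:
    """Enumerate all conformations of seq and keep those with minimal energy."""
    best_energy = 0
    best_paths: List[List[Coord]] = []
    for path in self_avoiding_walks(len(seq)):
        energy = hp_energy(seq, path)
        if not best_paths or energy < best_energy:
            best_energy, best_paths = energy, [path]
        elif energy == best_energy:
            best_paths.append(path)
    return best_energy, best_paths
```

For example, `lowest_energy_conformations("HPPH")` enumerates the nine conformations of a four-residue chain and keeps those in which the two terminal H residues touch. Enumeration grows exponentially with chain length, so a sketch like this is only practical for short chains.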
5
Hong Y, Chintapalli SV, Ko KD, Bhardwaj G, Zhang Z, van Rossum D, Patterson RL. Predicting protein folds with fold-specific PSSM libraries. PLoS One 2011; 6:e20557. [PMID: 21698189 PMCID: PMC3116844 DOI: 10.1371/journal.pone.0020557] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 01/15/2011] [Accepted: 05/05/2011] [Indexed: 11/23/2022] [Open Access]
Abstract
Accurately assigning folds for divergent protein sequences is a major obstacle to structural studies. Herein, we outline an effective method for fold recognition using sets of PSSMs, each of which is constructed for different protein folds. Our analyses demonstrate that FSL (Fold-specific Position Specific Scoring Matrix Libraries) can predict/relate structures given only their amino acid sequences of highly divergent proteins. This ability to detect distant relationships is dependent on low-identity sequence alignments obtained from FSL. Results from our experiments demonstrate that FSL perform well in recognizing folds from the "twilight-zone" SABmark dataset. Further, this method is capable of accurate fold prediction in newly determined structures. We suggest that by building complete PSSM libraries for all unique folds within the Protein Data Bank (PDB), FSL can be used to rapidly and reliably annotate a large subset of protein folds at proteomic level. The related programs and fold-specific PSSMs for our FSL are publicly available at: http://ccp.psu.edu/download/FSLv1.0/.
Affiliation(s)
- Yoojin Hong
- Department of Computer Science and Engineering, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Center for Computational Proteomics, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Sree Vamsee Chintapalli
- Center for Computational Proteomics, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Department of Biology, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Kyung Dae Ko
- Center for Computational Proteomics, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Department of Biology, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Gaurav Bhardwaj
- Center for Computational Proteomics, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Department of Biology, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Zhenhai Zhang
- Center for Computational Proteomics, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Damian van Rossum
- Center for Computational Proteomics, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Department of Biology, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Randen L. Patterson
- Center for Computational Proteomics, The Pennsylvania State University, University Park, Pennsylvania, United States of America
- Department of Physiology and Membrane Biology, University of California Davis Medical School, Sacramento, California, United States of America
- Department of Biochemistry, University of California Davis Medical School, Sacramento, California, United States of America
- The Genome Center, University of California Davis Medical School, Sacramento, California, United States of America
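The idea of assigning a fold by scoring a sequence against per-fold PSSM sets, as described in the abstract above, can be sketched in a deliberately simplified form. This is not the FSL implementation: it assumes ungapped sliding-window scoring, a fixed default score for residues absent from a column, and a toy in-memory library format. All names (`best_window_score`, `assign_fold`, the `library` layout) are hypothetical.

```python
from typing import Dict, List, Tuple

# One PSSM is a list of columns; each column maps a residue to its score.
Pssm = List[Dict[str, float]]


def best_window_score(seq: str, pssm: Pssm, default: float = -1.0) -> float:
    """Best ungapped score of the PSSM over every window of the sequence.

    Residues missing from a column receive `default` (an assumed penalty).
    """
    width = len(pssm)
    if width == 0 or len(seq) < width:
        return float("-inf")
    best = float("-inf")
    for start in range(len(seq) - width + 1):
        score = sum(
            column.get(seq[start + k], default) for k, column in enumerate(pssm)
        )
        best = max(best, score)
    return best


def assign_fold(
    seq: str, library: Dict[str, List[Pssm]]
) -> Tuple[str, Dict[str, float]]:
    """Score seq against every fold's PSSM set and return the top fold.

    Each fold is represented by its best-scoring PSSM (a simplifying choice).
    """
    scores = {
        fold: max(best_window_score(seq, pssm) for pssm in pssms)
        for fold, pssms in library.items()
    }
    top_fold = max(scores, key=lambda fold: scores[fold])
    return top_fold, scores
```

With a toy library such as `{"foldA": [[{"A": 2.0}, {"C": 2.0}]], "foldB": [[{"G": 2.0}, {"G": 2.0}]]}`, the sequence `"TTACTT"` is assigned to `foldA` because the window `AC` matches both columns of its PSSM. A realistic pipeline would use gapped profile alignment and statistical significance rather than raw window sums.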
6
Peng J, Xu J. A multiple-template approach to protein threading. Proteins 2011; 79:1930-9. [PMID: 21465564 DOI: 10.1002/prot.23016] [Citation(s) in RCA: 73] [Impact Index Per Article: 5.6] [Received: 10/19/2010] [Revised: 01/05/2011] [Accepted: 01/28/2011] [Indexed: 12/29/2022]
Abstract
Most threading methods predict the structure of a protein using only a single template. Due to the increasing number of solved structures, a protein without solved structure is very likely to have more than one similar template structure. Therefore, a natural question to ask is if we can improve modeling accuracy using multiple templates. This article describes a new multiple-template threading method to answer this question. At the heart of this multiple-template threading method is a novel probabilistic-consistency algorithm that can accurately align a single protein sequence simultaneously to multiple templates. Experimental results indicate that our multiple-template method can improve pairwise sequence-template alignment accuracy and generate models with better quality than single-template models even if they are built from the best single templates (P-value < 10^-6) while many popular multiple sequence/structure alignment tools fail to do so. The underlying reason is that our probabilistic-consistency algorithm can generate accurate multiple sequence/template alignments. In other words, without an accurate multiple sequence/template alignment, the modeling accuracy cannot be improved by simply using multiple templates to increase alignment coverage. Blindly tested on the CASP9 targets with more than one good template structure, our method outperforms all other CASP9 servers except two (Zhang-Server and QUARK of the same group). Our probabilistic-consistency algorithm can possibly be extended to align multiple protein/RNA sequences and structures.
Affiliation(s)
- Jian Peng
- Toyota Technological Institute at Chicago, 6045 S Kenwood, Chicago, Illinois 60637, USA
7
Singh R, Park D, Xu J, Hosur R, Berger B. Struct2Net: a web service to predict protein-protein interactions using a structure-based approach. Nucleic Acids Res 2010; 38:W508-15. [PMID: 20513650 PMCID: PMC2896152 DOI: 10.1093/nar/gkq481] [Citation(s) in RCA: 75] [Impact Index Per Article: 5.4] [Received: 02/28/2010] [Revised: 05/02/2010] [Accepted: 05/13/2010] [Indexed: 01/22/2023] [Open Access]
Abstract
Struct2Net is a web server for predicting interactions between arbitrary protein pairs using a structure-based approach. Prediction of protein-protein interactions (PPIs) is a central area of interest and successful prediction would provide leads for experiments and drug design; however, the experimental coverage of the PPI interactome remains inadequate. We believe that Struct2Net is the first community-wide resource to provide structure-based PPI predictions that go beyond homology modeling. Also, most web-resources for predicting PPIs currently rely on functional genomic data (e.g. GO annotation, gene expression, cellular localization, etc.). Our structure-based approach is independent of such methods and only requires the sequence information of the proteins being queried. The web service allows multiple querying options, aimed at maximizing flexibility. For the most commonly studied organisms (fly, human and yeast), predictions have been pre-computed and can be retrieved almost instantaneously. For proteins from other species, users have the option of getting a quick-but-approximate result (using orthology over pre-computed results) or having a full-blown computation performed. The web service is freely available at http://struct2net.csail.mit.edu.
Affiliation(s)
- Rohit Singh
- Computer Science and Artificial Intelligence Laboratory, Computational and Systems Biology Program, Massachusetts Institute of Technology, Cambridge, MA, Toyota Technological Institute at Chicago, Chicago, IL and Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA, USA
- Daniel Park
- Computer Science and Artificial Intelligence Laboratory, Computational and Systems Biology Program, Massachusetts Institute of Technology, Cambridge, MA, Toyota Technological Institute at Chicago, Chicago, IL and Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA, USA
- Jinbo Xu
- Computer Science and Artificial Intelligence Laboratory, Computational and Systems Biology Program, Massachusetts Institute of Technology, Cambridge, MA, Toyota Technological Institute at Chicago, Chicago, IL and Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA, USA
- Raghavendra Hosur
- Computer Science and Artificial Intelligence Laboratory, Computational and Systems Biology Program, Massachusetts Institute of Technology, Cambridge, MA, Toyota Technological Institute at Chicago, Chicago, IL and Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA, USA
- Bonnie Berger
- Computer Science and Artificial Intelligence Laboratory, Computational and Systems Biology Program, Massachusetts Institute of Technology, Cambridge, MA, Toyota Technological Institute at Chicago, Chicago, IL and Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA, USA
8
Peng J, Xu J. Boosting Protein Threading Accuracy. RESEARCH IN COMPUTATIONAL MOLECULAR BIOLOGY : ... ANNUAL INTERNATIONAL CONFERENCE, RECOMB ... : PROCEEDINGS. RECOMB (CONFERENCE : 2005- ) 2009; 5541:31-45. [PMID: 22506254 DOI: 10.1007/978-3-642-02008-7_3] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.6] [Indexed: 12/12/2022]
Abstract
Protein threading is one of the most successful protein structure prediction methods. Most protein threading methods use a scoring function linearly combining sequence and structure features to measure the quality of a sequence-template alignment so that a dynamic programming algorithm can be used to optimize the scoring function. However, a linear scoring function cannot fully exploit interdependency among features and thus limits alignment accuracy. This paper presents a nonlinear scoring function for protein threading, which not only can model interactions among different protein features, but also can be efficiently optimized using a dynamic programming algorithm. We achieve this by modeling the threading problem using a probabilistic graphical model, Conditional Random Fields (CRF), and training the model using the gradient tree boosting algorithm. The resultant model is a nonlinear scoring function consisting of a collection of regression trees. Each regression tree models a type of nonlinear relationship among sequence and structure features. Experimental results indicate that this new threading model can effectively leverage weak biological signals and improve both alignment accuracy and fold recognition rate greatly.
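The baseline that this abstract contrasts against, a linear scoring function optimized by dynamic programming, can be sketched as follows. This is a generic global alignment with a constant gap penalty, not the CRF and regression-tree model of the paper. The names `linear_score` and `global_alignment_score`, the feature functions, and the gap value are assumptions made for illustration.

```python
from typing import Callable, List, Sequence

# A feature maps (sequence position i, template position j) to a number.
Feature = Callable[[int, int], float]


def linear_score(weights: Sequence[float], features: Sequence[Feature]) -> Feature:
    """Build a match score that is a weighted sum of position-pair features."""
    def match(i: int, j: int) -> float:
        return sum(w * f(i, j) for w, f in zip(weights, features))
    return match


def global_alignment_score(
    n: int, m: int, match: Feature, gap: float = -1.0
) -> float:
    """Optimal global alignment score of n sequence and m template positions.

    dp[i][j] is the best score for aligning the first i sequence positions
    with the first j template positions.
    """
    dp: List[List[float]] = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + match(i - 1, j - 1),  # align i with j
                dp[i - 1][j] + gap,                      # gap in template
                dp[i][j - 1] + gap,                      # gap in sequence
            )
    return dp[n][m]
```

Because the recurrence only needs a score for each position pair, any function of `(i, j)` can replace the weighted sum, which is why a nonlinear pairwise score can still be optimized with the same dynamic programming scheme.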
9
Li SC, Bu D, Xu J, Li M. Fragment-HMM: a new approach to protein structure prediction. Protein Sci 2008; 17:1925-34. [PMID: 18723665 DOI: 10.1110/ps.036442.108] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.4] [Indexed: 10/21/2022]
Abstract
We designed a simple position-specific hidden Markov model to predict protein structure. Our new framework naturally repeats itself to converge to a final target, conglomerating fragment assembly, clustering, target selection, refinement, and consensus, all in one process. Our initial implementation of this theory converges to within 6 Å of the native structures for 100% of decoys on all six standard benchmark proteins used in ROSETTA (discussed by Simons and colleagues in a recent paper), which achieved only 14%-94% for the same data. The qualities of the best decoys and the final decoys our theory converges to are also notably better.
Affiliation(s)
- Shuai Cheng Li
- David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Ontario N2L3G1, Canada
10
A historical perspective of template-based protein structure prediction. METHODS IN MOLECULAR BIOLOGY (CLIFTON, N.J.) 2008; 413:3-42. [PMID: 18075160 DOI: 10.1007/978-1-59745-574-9_1] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 12/12/2022]
Abstract
This chapter presents a broad and a historical overview of the problem of protein structure prediction. Different structure prediction methods, including homology modeling, fold recognition (FR)/protein threading, ab initio/de novo approaches, and hybrid techniques involving multiple types of approaches, are introduced in a historical context. The progress of the field as a whole, especially in the threading/FR area, as reflected by the CASP/CAFASP contests, is reviewed. At the end of the chapter, we discuss the challenging issues ahead in the field of protein structure prediction.
11
Floudas C, Fung H, McAllister S, Mönnigmann M, Rajgaria R. Advances in protein structure prediction and de novo protein design: A review. Chem Eng Sci 2006. [DOI: 10.1016/j.ces.2005.04.009] [Citation(s) in RCA: 175] [Impact Index Per Article: 9.7] [Indexed: 01/01/2023]