1
Morehead A, Liu J, Cheng J. Protein structure accuracy estimation using geometry-complete perceptron networks. Protein Sci 2024; 33:e4932. [PMID: 38380738 PMCID: PMC10880424 DOI: 10.1002/pro.4932] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2023] [Revised: 01/05/2024] [Accepted: 02/01/2024] [Indexed: 02/22/2024]
Abstract
Estimating the accuracy of protein structural models is a critical task in protein bioinformatics. The need for robust methods in the estimation of protein model accuracy (EMA) is prevalent in the field of protein structure prediction, where computationally-predicted structures need to be screened rapidly for the reliability of the positions predicted for each of their amino acid residues and their overall quality. Current methods proposed for EMA are either coupled tightly to existing protein structure prediction methods or evaluate protein structures without sufficiently leveraging the rich, geometric information available in such structures to guide accuracy estimation. In this work, we propose a geometric message passing neural network referred to as the geometry-complete perceptron network for protein structure EMA (GCPNet-EMA), where we demonstrate through rigorous computational benchmarks that GCPNet-EMA's accuracy estimations are 47% faster and more than 10% (6%) more correlated with ground-truth measures of per-residue (per-target) structural accuracy compared to baseline state-of-the-art methods for tertiary (multimer) structure EMA including AlphaFold 2. The source code and data for GCPNet-EMA are available on GitHub, and a public web server implementation is freely available.
Affiliation(s)
- Alex Morehead
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Jian Liu
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
- Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri, USA
2
Kabir KL, Ma B, Nussinov R, Shehu A. Fewer Dimensions, More Structures for Improved Discrete Models of Dynamics of Free versus Antigen-Bound Antibody. Biomolecules 2022; 12:biom12071011. [PMID: 35883567 PMCID: PMC9313177 DOI: 10.3390/biom12071011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2022] [Revised: 07/12/2022] [Accepted: 07/19/2022] [Indexed: 12/10/2022]
Abstract
Over the past decade, Markov State Models (MSM) have emerged as powerful methodologies to build discrete models of dynamics over structures obtained from Molecular Dynamics trajectories. The identification of macrostates for the MSM is a central decision that impacts the quality of the MSM but depends on both the selected representation of a structure and the clustering algorithm utilized over the featurized structures. Motivated by a large molecular system in its free and bound state, this paper investigates two directions of research, further reducing the representation dimensionality in a non-parametric, data-driven manner and including more structures in the computation. Rigorous evaluation of the quality of obtained MSMs via various statistical tests in a comparative setting firmly shows that fewer dimensions and more structures result in a better MSM. Many interesting findings emerge from the best MSM, advancing our understanding of the relationship between antibody dynamics and antibody–antigen recognition.
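As a rough illustration of the kind of discrete model this abstract refers to, the sketch below estimates an MSM transition matrix from a trajectory that has already been assigned to discrete states. It is a generic, minimal sketch and not the authors' pipeline: the function name `estimate_transition_matrix`, the toy trajectory, and the lag value are illustrative assumptions, and the featurization and clustering steps the paper studies are assumed to have happened upstream.

```python
from collections import Counter

def estimate_transition_matrix(states, n_states, lag=1):
    """Row-normalized transition counts at the given lag (lag must be >= 1)."""
    # Count every observed (state at t, state at t + lag) pair.
    counts = Counter(zip(states[:-lag], states[lag:]))
    matrix = []
    for i in range(n_states):
        row_total = sum(counts[(i, j)] for j in range(n_states))
        if row_total == 0:
            # State i was never observed as a transition source: keep a zero row.
            matrix.append([0.0] * n_states)
        else:
            matrix.append([counts[(i, j)] / row_total for j in range(n_states)])
    return matrix

# Hypothetical trajectory of cluster labels (not data from the paper).
trajectory = [0, 0, 1, 1, 1, 0, 2, 2, 0, 0]
T = estimate_transition_matrix(trajectory, n_states=3)
```

Each visited row of `T` sums to one, so it can be read as the probability of moving from one state to another after `lag` steps. Adding more structures changes the counts, and changing the representation changes the state assignment itself.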
Affiliation(s)
- Kazi Lutful Kabir
- Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
- Correspondence: ; Tel.: +1-571-201-5070
- Buyong Ma
- Engineering Research Center of Cell & Therapeutic Antibody School of Pharmacy, Shanghai Jiaotong University, Shanghai 200240, China
- Ruth Nussinov
- Computational Structural Biology Section, Cancer Innovation Laboratory, Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, MD 21702, USA
- Amarda Shehu
- Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
3
Zhang GJ, Xie TY, Zhou XG, Wang LJ, Hu J. Protein Structure Prediction Using Population-Based Algorithm Guided by Information Entropy. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:697-707. [PMID: 31180869 DOI: 10.1109/tcbb.2019.2921958] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/09/2023]
Abstract
Ab initio protein structure prediction is one of the most challenging problems in computational biology. Multistage algorithms are widely used in ab initio protein structure prediction, and their computational cost, which differs from protein to protein, is an important consideration. In this study, a population-based algorithm guided by information entropy (PAIE), which includes exploration and exploitation stages, is proposed for protein structure prediction. In PAIE, an entropy-based stage switch strategy is designed to switch from the exploration stage to the exploitation stage. Torsion angle statistical information is also deduced from the first stage and employed to enhance the exploitation in the second stage. Results indicate an improvement in the performance of protein structure prediction on a benchmark of 30 proteins and 17 other free modeling targets in CASP.
4
Zaman AB, Kamranfar P, Domeniconi C, Shehu A. Reducing Ensembles of Protein Tertiary Structures Generated De Novo via Clustering. Molecules 2020; 25:E2228. [PMID: 32397410 PMCID: PMC7248879 DOI: 10.3390/molecules25092228] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/14/2020] [Revised: 04/21/2020] [Accepted: 04/28/2020] [Indexed: 11/16/2022]
Abstract
Controlling the quality of tertiary structures computed for a protein molecule remains a central challenge in de novo protein structure prediction. The rule of thumb is to generate as many structures as can be afforded, effectively acknowledging that having more structures increases the likelihood that some will reside near the sought biologically active structure. A major drawback of this approach is that computing a large number of structures imposes time and space costs. In this paper, we propose a novel clustering-based approach which we demonstrate to significantly reduce an ensemble of generated structures without sacrificing quality. Evaluations are reported on both benchmark and CASP target proteins. Structure ensembles subjected to the proposed approach and the source code of the proposed approach are publicly available at the links provided in Section 1.
Affiliation(s)
- Ahmed Bin Zaman
- Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
- Parastoo Kamranfar
- Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
- Carlotta Domeniconi
- Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
- Amarda Shehu
- Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
- Center for Advancing Human-Machine Partnerships, George Mason University, Fairfax, VA 22030, USA
- Department of Bioengineering, George Mason University, Fairfax, VA 22030, USA
- School of Systems Biology, George Mason University, Fairfax, VA 22030, USA
5
Kandathil SM, Garza-Fabre M, Handl J, Lovell SC. Reliable Generation of Native-Like Decoys Limits Predictive Ability in Fragment-Based Protein Structure Prediction. Biomolecules 2019; 9:biom9100612. [PMID: 31618996 PMCID: PMC6843117 DOI: 10.3390/biom9100612] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2019] [Revised: 10/08/2019] [Accepted: 10/10/2019] [Indexed: 11/16/2022]
Abstract
Our previous work with fragment-assembly methods has demonstrated specific deficiencies in conformational sampling behaviour that, when addressed through improved sampling algorithms, can lead to more reliable prediction of tertiary protein structure when good fragments are available, and when score values can be relied upon to guide the search to the native basin. In this paper, we present preliminary investigations into two important questions arising from more difficult prediction problems. First, we investigated the extent to which native-like conformational states are generated during multiple runs of our search protocols. We determined that, in cases of difficult prediction, native-like decoys are rarely or never generated. Second, we developed a scheme for decoy retention that balances the objectives of retaining low-scoring structures and retaining conformationally diverse structures sampled during the course of the search. Our method succeeds at retaining more diverse sets of structures, and, for a few targets, more native-like solutions are retained as compared to our original, energy-based retention scheme. However, in general, we found that the rate at which native-like structural states are generated has a much stronger effect on eventual distributions of predictive accuracy in the decoy sets, as compared to the specific decoy retention strategy used. We found that our protocols show differences in their ability to access native-like states for some targets, and this may explain some of the differences in predictive performance seen between these methods. There appears to be an interaction between fragment sets and move operators, which influences the accessibility of native-like structures for given targets. Our results point to clear directions for further improvements in fragment-based methods, which are likely to enable higher accuracy predictions.
Affiliation(s)
- Shaun M Kandathil
- Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester M13 9PL, UK.
- Mario Garza-Fabre
- Cinvestav Unidad Tamaulipas, Km 5.5 Carretera Cd. Victoria-Soto La Marina, 87130 Cd. Victoria, Tamaulipas, Mexico.
- Julia Handl
- Decision and Cognitive Sciences Research Centre, Alliance Manchester Business School, The University of Manchester, Manchester M13 9PL, UK.
- The Alan Turing Institute, British Library, 96 Euston Road, London NW1 2DB, UK.
- Simon C Lovell
- Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester M13 9PL, UK.
6
Li ZW, Sun K, Hao XH, Hu J, Ma LF, Zhou XG, Zhang GJ. Loop Enhanced Conformational Resampling Method for Protein Structure Prediction. IEEE Trans Nanobioscience 2019; 18:567-577. [PMID: 31180866 DOI: 10.1109/tnb.2019.2922101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Abstract
Protein structure prediction has been a long-standing problem for the past decades. In particular, the loop region structure remains an obstacle in forming an accurate protein tertiary structure because of its flexibility. In this study, a differential evolution algorithm guided by Rama torsion angle and secondary structure features, named RSDE, is proposed to predict three-dimensional structure by exploiting the loop region structure. In RSDE, the structure of the loop region is improved by two operators: a loop-based cross operator, which interchanges the configuration of a randomly selected loop region between individuals, and a loop-based mutate operator, which incorporates torsion angle features into conformational sampling. A stochastic ranking selective strategy is designed to select conformations with low energy and near-native structure. Moreover, a conformational resampling method, which uses previously learned knowledge to guide subsequent sampling, is proposed to improve the sampling efficiency. Experiments on a total of 28 test proteins reveal that the proposed RSDE is effective and can obtain native-like models.
7
Maximova T, Plaku E, Shehu A. Structure-Guided Protein Transition Modeling with a Probabilistic Roadmap Algorithm. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018; 15:1783-1796. [PMID: 27411226 DOI: 10.1109/tcbb.2016.2586044] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 06/06/2023]
Abstract
Proteins are macromolecules in perpetual motion, switching between structural states to modulate their function. A detailed characterization of the precise yet complex relationship between protein structure, dynamics, and function requires elucidating transitions between functionally relevant states. Doing so challenges both wet and dry laboratories, as protein dynamics involves disparate temporal scales. In this paper, we present a novel, sampling-based algorithm to compute transition paths. The algorithm exploits two main ideas. First, it leverages known structures to initialize its search and define a reduced conformation space for rapid sampling. This is key to addressing the insufficient sampling issue suffered by sampling-based algorithms. Second, the algorithm embeds samples in a nearest-neighbor graph where transition paths can be efficiently computed via queries. The algorithm adapts the probabilistic roadmap framework that is popular in robot motion planning. In addition to efficiently computing lowest-cost paths between any given structures, the algorithm allows investigating hypotheses regarding the order of experimentally known structures in a transition event. This novel contribution is likely to open up new avenues of research. Detailed analysis is presented on multiple-basin proteins of relevance to human disease. Multiscaling and the AMBER ff14SB force field are used to obtain energetically credible paths at atomistic detail.
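The second idea in this abstract, embedding samples in a nearest-neighbor graph and answering path queries on it, can be sketched generically as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the samples are toy 2D points standing in for conformations in a reduced space, Euclidean distance stands in for an energy-based edge cost, and the names `build_knn_graph` and `lowest_cost_path` are hypothetical.

```python
import heapq
import math

def build_knn_graph(samples, k):
    """Connect each sample to its k nearest neighbors (undirected edges)."""
    graph = {i: {} for i in range(len(samples))}
    for i, p in enumerate(samples):
        nearest = sorted(
            (math.dist(p, q), j) for j, q in enumerate(samples) if j != i
        )[:k]
        for d, j in nearest:
            # Assumption: Euclidean distance is a placeholder for a real edge cost.
            graph[i][j] = d
            graph[j][i] = d
    return graph

def lowest_cost_path(graph, start, goal):
    """Dijkstra query; returns (cost, path), or (inf, []) if goal is unreachable."""
    queue = [(0.0, start, [start])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for nxt, weight in graph[node].items():
            if nxt not in settled:
                heapq.heappush(queue, (cost + weight, nxt, path + [nxt]))
    return math.inf, []

# Toy samples; in practice these would be sampled conformations.
samples = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (3.0, 1.0)]
graph = build_knn_graph(samples, k=2)
cost, path = lowest_cost_path(graph, start=0, goal=4)
```

Because the graph is built once, many start and goal pairs can be queried against it cheaply, which is the property that makes roadmap-style methods attractive for comparing hypotheses about the order of intermediate structures.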
8
Abella JR, Moll M, Kavraki LE. Maintaining and Enhancing Diversity of Sampled Protein Conformations in Robotics-Inspired Methods. J Comput Biol 2018; 25:3-20. [PMID: 29035572 PMCID: PMC5756939 DOI: 10.1089/cmb.2017.0164] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 12/17/2022]
Abstract
The ability to efficiently sample structurally diverse protein conformations allows one to gain a high-level view of a protein's energy landscape. Algorithms from robot motion planning have been used for conformational sampling, and several of these algorithms promote diversity by keeping track of "coverage" in conformational space based on the local sampling density. However, large proteins present special challenges. In particular, larger systems require running many concurrent instances of these algorithms, but these algorithms can quickly become memory intensive because they typically keep previously sampled conformations in memory to maintain coverage estimates. In addition, robotics-inspired algorithms depend on defining useful perturbation strategies for exploring the conformational space, which is a difficult task for large proteins because such systems are typically more constrained and exhibit complex motions. In this article, we introduce two methodologies for maintaining and enhancing diversity in robotics-inspired conformational sampling. The first method addresses algorithms based on coverage estimates and leverages the use of a low-dimensional projection to define a global coverage grid that maintains coverage across concurrent runs of sampling. The second method is an automatic definition of a perturbation strategy through readily available flexibility information derived from B-factors, secondary structure, and rigidity analysis. Our results show a significant increase in the diversity of the conformations sampled for proteins consisting of up to 500 residues when applied to a specific robotics-inspired algorithm for conformational sampling. The methodologies presented in this article may be vital components for the scalability of robotics-inspired approaches.
Affiliation(s)
- Jayvee R Abella
- Department of Computer Science, Rice University, Houston, Texas
- Mark Moll
- Department of Computer Science, Rice University, Houston, Texas
- Lydia E Kavraki
- Department of Computer Science, Rice University, Houston, Texas
9
Maximova T, Zhang Z, Carr DB, Plaku E, Shehu A. Sample-Based Models of Protein Energy Landscapes and Slow Structural Rearrangements. J Comput Biol 2018; 25:33-50. [DOI: 10.1089/cmb.2017.0158] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 01/13/2023]
Affiliation(s)
- Tatiana Maximova
- Department of Computer Science, George Mason University, Fairfax, Virginia
- Zijing Zhang
- Department of Statistics, George Mason University, Fairfax, Virginia
- Daniel B. Carr
- Department of Statistics, George Mason University, Fairfax, Virginia
- Erion Plaku
- Department of Electrical Engineering and Computer Science, The Catholic University of America, Washington, D.C.
- Amarda Shehu
- Department of Computer Science, George Mason University, Fairfax, Virginia
- Department of Bioengineering, George Mason University, Fairfax, Virginia
- School of Systems Biology, George Mason University, Manassas, Virginia
10
Abstract
Background: Understanding protein structure and dynamics is essential for understanding their function. This is a challenging task due to the high complexity of the conformational landscapes of proteins and their rugged energy levels. In particular, it is important to detect highly populated regions which could correspond to intermediate structures or local minima. Results: We present a hierarchical clustering and algebraic topology based method that detects regions of interest in protein conformational space. The method is based on several techniques. We use coarse grained protein conformational search, efficient robust dimensionality reduction and topological analysis via persistent homology as the main tools. We use two dimensionality reduction methods as well, robust Principal Component Analysis (PCA) and Isomap, to generate a reduced representation of the data while preserving most of the variance in the data. Conclusions: Our hierarchical clustering method was able to produce compact, well separated clusters for all the tested examples.
Affiliation(s)
- Nurit Haspel
- Department of Computer Science, University of Massachusetts Boston, 100 Morrissey Blvd., Boston, 02125, MA, USA
- Dong Luo
- Department of Computer Science, University of Massachusetts Boston, 100 Morrissey Blvd., Boston, 02125, MA, USA
- Eduardo González
- Department of Mathematics, University of Massachusetts Boston, 100 Morrissey Blvd., Boston, 02125, MA, USA
11
Zhang GJ, Zhou XG, Yu XF, Hao XH, Yu L. Enhancing Protein Conformational Space Sampling Using Distance Profile-Guided Differential Evolution. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2017; 14:1288-1301. [PMID: 28113726 DOI: 10.1109/tcbb.2016.2566617] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Indexed: 06/06/2023]
Abstract
De novo protein structure prediction aims to search for low-energy conformations as it follows the thermodynamics hypothesis that places native conformations at the global minimum of the protein energy surface. However, the native conformation is not necessarily located in the lowest-energy regions owing to the inaccuracies of the energy model. This study presents a differential evolution algorithm using distance profile-based selection strategy to sample conformations with reasonable structure effectively. In the proposed algorithm, besides energy, the residue-residue distance is considered another measure of the conformation. The average distance errors of decoys between the distance of each residue pair and the corresponding distance in the distance profiles are first calculated when the trial conformation yields a larger energy value than that of the target. Then, the distance acceptance probability of the trial conformation is designed based on distance profiles if the trial conformation obtains a lower average distance error compared with that of the target conformation. The trial conformation is accepted to the next generation in accordance with its distance acceptance probability. By using the dual constraints of energy and distance in guiding sampling, the algorithm can sample conformations with lower energies and more reasonable structures. Experimental results of 28 benchmark proteins show that the proposed algorithm can effectively predict near-native protein structures.
12
Hao XH, Zhang GJ, Zhou XG. Conformational Space Sampling Method Using Multi-Subpopulation Differential Evolution for De novo Protein Structure Prediction. IEEE Trans Nanobioscience 2017; 16:618-633. [DOI: 10.1109/tnb.2017.2749243] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 11/08/2022]
13
Novinskaya A, Devaurs D, Moll M, Kavraki LE. Defining Low-Dimensional Projections to Guide Protein Conformational Sampling. J Comput Biol 2016; 24:79-89. [PMID: 27892695 DOI: 10.1089/cmb.2016.0144] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 02/02/2023]
Abstract
Exploring the conformational space of proteins is critical to characterize their functions. Numerous methods have been proposed to sample a protein's conformational space, including techniques developed in the field of robotics and known as sampling-based motion-planning algorithms (or sampling-based planners). However, these algorithms suffer from the curse of dimensionality when applied to large proteins. Many sampling-based planners attempt to mitigate this issue by keeping track of sampling density to guide conformational sampling toward unexplored regions of the conformational space. This is often done using low-dimensional projections as an indirect way to reduce the dimensionality of the exploration problem. However, how to choose an appropriate projection and how much it influences the planner's performance are still poorly understood issues. In this article, we introduce two methodologies defining low-dimensional projections that can be used by sampling-based planners for protein conformational sampling. The first method leverages information about a protein's flexibility to construct projections that can efficiently guide conformational sampling, when expert knowledge is available. The second method builds similar projections automatically, without expert intervention. We evaluate the projections produced by both methodologies on two conformational search problems involving three middle-size proteins. Our experiments demonstrate that (i) defining projections based on expert knowledge can benefit conformational sampling and (ii) automatically constructing such projections is a reasonable alternative.
Affiliation(s)
- Didier Devaurs
- Department of Computer Science, Rice University, Houston, Texas
- Mark Moll
- Department of Computer Science, Rice University, Houston, Texas
- Lydia E Kavraki
- Department of Computer Science, Rice University, Houston, Texas
14
Hao XH, Zhang GJ, Zhou XG, Yu XF. A Novel Method Using Abstract Convex Underestimation in Ab-Initio Protein Structure Prediction for Guiding Search in Conformational Feature Space. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2016; 13:887-900. [PMID: 26552093 DOI: 10.1109/tcbb.2015.2497226] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 06/05/2023]
Abstract
To address the searching problem of protein conformational space in ab-initio protein structure prediction, a novel method using abstract convex underestimation (ACUE) based on the framework of evolutionary algorithm was proposed. Computing such conformations, essential to associate structural and functional information with gene sequences, is challenging due to the high dimensionality and rugged energy surface of the protein conformational space. As a consequence, the dimension of protein conformational space should be reduced to a proper level. In this paper, the high-dimensional original conformational space was converted into a feature space whose dimension is considerably reduced by a feature extraction technique, and the underestimate space was constructed according to abstract convex theory. Thus, the entropy effect caused by searching in the high-dimensional conformational space could be avoided through such conversion. The tight lower bound estimate information was obtained to guide the searching direction, and the invalid searching area in which the global optimal solution is not located could be eliminated in advance. Moreover, instead of expensively calculating the energy of conformations in the original conformational space, the estimate value is employed to judge whether a conformation is worth exploring, which reduces the evaluation time and thereby makes the computational cost lower and the searching process more efficient. Additionally, fragment assembly and the Monte Carlo method are combined to generate a series of metastable conformations by sampling in the conformational space. The proposed method provides a novel technique to solve the searching problem of protein conformational space. Twenty small-to-medium structurally diverse proteins were tested, and the proposed ACUE method was compared with ItFix, HEA, Rosetta and the developed method LEDE without underestimate information. Test results show that the ACUE method can more rapidly and more efficiently obtain the near-native protein structure.
15
Li H, Leung KS, Wong MH, Ballester PJ. USR-VS: a web server for large-scale prospective virtual screening using ultrafast shape recognition techniques. Nucleic Acids Res 2016; 44:W436-41. [PMID: 27106057 PMCID: PMC4987897 DOI: 10.1093/nar/gkw320] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 02/05/2016] [Accepted: 04/06/2016] [Indexed: 12/12/2022]
Abstract
Ligand-based Virtual Screening (VS) methods aim at identifying molecules with a similar activity profile across phenotypic and macromolecular targets to that of a query molecule used as search template. VS using 3D similarity methods has the advantage of biasing this search toward active molecules with innovative chemical scaffolds, which are highly sought after in drug design to provide novel leads with improved properties over the query molecule (e.g. patentable, of lower toxicity or increased potency). Ultrafast Shape Recognition (USR) has demonstrated excellent performance in the discovery of molecules with previously unknown phenotypic or target activity, with retrospective studies suggesting that its pharmacophoric extension (USRCAT) should obtain even better hit rates once it is used prospectively. Here we present USR-VS (http://usr.marseille.inserm.fr/), the first web server using these two validated ligand-based 3D methods for large-scale prospective VS. In about 2 s, 93.9 million 3D conformers, expanded from 23.1 million purchasable molecules, are screened and the 100 most similar molecules among them in terms of 3D shape and pharmacophoric properties are shown. USR-VS functionality also provides interactive visualization of the similarity of the query molecule against the hit molecules as well as vendor information to purchase selected hits in order to be experimentally tested.
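The abstract does not spell out how USR encodes shape. As it is commonly described in the USR literature, each conformer is summarized by 12 numbers: three moments of the atomic distance distribution to each of four reference points (the centroid, the atom closest to it, the atom farthest from it, and the atom farthest from that one), and two conformers are compared with an inverse Manhattan distance. The sketch below follows that description under stated assumptions: the exact choice of moments varies between implementations (mean, standard deviation, and signed cube root of the third central moment are used here), the function names and toy coordinates are hypothetical, and nothing here is taken from the USR-VS server code.

```python
import math

def _moments(values):
    """Mean, standard deviation, and signed cube root of the third central moment."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    return [mean, math.sqrt(var), math.copysign(abs(third) ** (1 / 3), third)]

def usr_descriptor(coords):
    """12-number shape descriptor from atomic coordinates (list of 3-tuples)."""
    n = len(coords)
    ctd = tuple(sum(c[axis] for c in coords) / n for axis in range(3))
    d_ctd = [math.dist(c, ctd) for c in coords]
    cst = coords[d_ctd.index(min(d_ctd))]   # atom closest to the centroid
    fct = coords[d_ctd.index(max(d_ctd))]   # atom farthest from the centroid
    d_fct = [math.dist(c, fct) for c in coords]
    ftf = coords[d_fct.index(max(d_fct))]   # atom farthest from fct
    descriptor = []
    for ref in (ctd, cst, fct, ftf):
        descriptor += _moments([math.dist(c, ref) for c in coords])
    return descriptor

def usr_similarity(d1, d2):
    """Inverse of one plus the mean absolute difference; 1.0 for identical descriptors."""
    return 1.0 / (1.0 + sum(abs(a - b) for a, b in zip(d1, d2)) / len(d1))

# Toy coordinates, not real molecules.
query = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 0.5), (1.0, 1.0, 1.0)]
shifted = [(x + 5.0, y - 2.0, z + 3.0) for x, y, z in query]
stretched = [(2.0 * x, y, z) for x, y, z in query]
sim_shifted = usr_similarity(usr_descriptor(query), usr_descriptor(shifted))
sim_stretched = usr_similarity(usr_descriptor(query), usr_descriptor(stretched))
```

Because the descriptor depends only on interatomic distances, translating a conformer leaves the score essentially unchanged, while changing its shape lowers it. Comparing two fixed-length vectors is also what makes screening very large conformer libraries fast.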
Affiliation(s)
- Hongjian Li
- Institute of Future Cities, Chinese University of Hong Kong, Hong Kong
- Kwong-S Leung
- Department of Computer Science and Engineering, Chinese University of Hong Kong, Sha Tin, New Territories, Hong Kong
- Man-H Wong
- Department of Computer Science and Engineering, Chinese University of Hong Kong, Sha Tin, New Territories, Hong Kong
- Pedro J Ballester
- Cancer Research Center of Marseille, INSERM U1068, 13009-Marseille, France
16
Maximova T, Moffatt R, Ma B, Nussinov R, Shehu A. Principles and Overview of Sampling Methods for Modeling Macromolecular Structure and Dynamics. PLoS Comput Biol 2016; 12:e1004619. [PMID: 27124275 PMCID: PMC4849799 DOI: 10.1371/journal.pcbi.1004619] [Citation(s) in RCA: 132] [Impact Index Per Article: 16.5] [Indexed: 01/08/2023]
Abstract
Investigation of macromolecular structure and dynamics is fundamental to understanding how macromolecules carry out their functions in the cell. Significant advances have been made toward this end in silico, with a growing number of computational methods proposed yearly to study and simulate various aspects of macromolecular structure and dynamics. This review aims to provide an overview of recent advances, focusing primarily on methods proposed for exploring the structure space of macromolecules in isolation and in assemblies for the purpose of characterizing equilibrium structure and dynamics. In addition to surveying recent applications that showcase current capabilities of computational methods, this review highlights state-of-the-art algorithmic techniques proposed to overcome challenges posed in silico by the disparate spatial and time scales accessed by dynamic macromolecules. This review is not meant to be exhaustive, as such an endeavor is impossible, but rather aims to balance breadth and depth of strategies for modeling macromolecular structure and dynamics for a broad audience of novices and experts.
Affiliation(s)
- Tatiana Maximova
- Department of Computer Science, George Mason University, Fairfax, Virginia, United States of America
- Ryan Moffatt
- Department of Computer Science, George Mason University, Fairfax, Virginia, United States of America
- Buyong Ma
- Basic Science Program, Leidos Biomedical Research, Inc. Cancer and Inflammation Program, National Cancer Institute, Frederick, Maryland, United States of America
- Ruth Nussinov
- Basic Science Program, Leidos Biomedical Research, Inc. Cancer and Inflammation Program, National Cancer Institute, Frederick, Maryland, United States of America
- Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Amarda Shehu
- Department of Computer Science, George Mason University, Fairfax, Virginia, United States of America
- Department of Bioengineering, George Mason University, Fairfax, Virginia, United States of America
- School of Systems Biology, George Mason University, Manassas, Virginia, United States of America
17
|
Garza-Fabre M, Kandathil SM, Handl J, Knowles J, Lovell SC. Generating, Maintaining, and Exploiting Diversity in a Memetic Algorithm for Protein Structure Prediction. EVOLUTIONARY COMPUTATION 2016; 24:577-607. [PMID: 26908350 DOI: 10.1162/evco_a_00176] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 06/05/2023]
Abstract
Computational approaches to de novo protein tertiary structure prediction, including those based on the preeminent "fragment-assembly" technique, have failed to scale up fully to larger proteins (on the order of 100 residues and above). A number of limiting factors are thought to contribute to the scaling problem over and above the simple combinatorial explosion, but the key ones relate to the lack of exploration of properly diverse protein folds, and to an acute form of "deception" in the energy function, whereby low-energy conformations do not reliably equate with native structures. In this article, solutions to both of these problems are investigated through a multistage memetic algorithm incorporating the successful Rosetta method as a local search routine. We found that specialised genetic operators significantly add to structural diversity and that this translates well to reaching low energies. The use of a generalised stochastic ranking procedure for selection enables the memetic algorithm to handle and traverse deep energy wells that can be considered deceptive, which further adds to the ability of the algorithm to obtain a much-improved diversity of folds. The results should translate to a tangible improvement in the performance of protein structure prediction algorithms in blind experiments such as CASP, and potentially to a further step towards the more challenging problem of predicting the three-dimensional shape of large proteins.
Affiliation(s)
- Mario Garza-Fabre
- Decision and Cognitive Sciences Research Centre, University of Manchester, Manchester, M15 6PB, UK
- Shaun M Kandathil
- Faculty of Life Sciences, University of Manchester, Manchester, M13 9PT, UK
- Julia Handl
- Decision and Cognitive Sciences Research Centre, University of Manchester, Manchester, M15 6PB, UK
- Joshua Knowles
- School of Computer Science, University of Birmingham, Birmingham, B15 2TT, UK
- Simon C Lovell
- Faculty of Life Sciences, University of Manchester, Manchester, M13 9PT, UK
|
18
|
Kandathil SM, Handl J, Lovell SC. Toward a detailed understanding of search trajectories in fragment assembly approaches to protein structure prediction. Proteins 2016; 84:411-26. [PMID: 26799916 PMCID: PMC4982100 DOI: 10.1002/prot.24987] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 07/07/2015] [Revised: 12/03/2015] [Accepted: 12/31/2015] [Indexed: 11/30/2022]
Abstract
Energy functions, fragment libraries, and search methods constitute three key components of fragment‐assembly methods for protein structure prediction, which are all crucial for their ability to generate high‐accuracy predictions. All of these components are tightly coupled; efficient searching becomes more important as the quality of fragment libraries decreases. Given these relationships, there is currently a poor understanding of the strengths and weaknesses of the sampling approaches currently used in fragment‐assembly techniques. Here, we determine how the performance of search techniques can be assessed in a meaningful manner, given the above problems. We describe a set of techniques that aim to reduce the impact of the energy function, and assess exploration in view of the search space defined by a given fragment library. We illustrate our approach using Rosetta and EdaFold, and show how certain features of these methods encourage or limit conformational exploration. We demonstrate that individual trajectories of Rosetta are susceptible to local minima in the energy landscape, and that this can be linked to non‐uniform sampling across the protein chain. We show that EdaFold's novel approach can help balance broad exploration with locating good low‐energy conformations. This occurs through two mechanisms which cannot be readily differentiated using standard performance measures: exclusion of false minima, followed by an increasingly focused search in low‐energy regions of conformational space. Measures such as ours can be helpful in characterizing new fragment‐based methods in terms of the quality of conformational exploration realized. Proteins 2016; 84:411–426. © 2016 The Authors Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.
Affiliation(s)
- Shaun M Kandathil
- Faculty of Life Sciences, the University of Manchester, Manchester, M13 9PL, United Kingdom
- Julia Handl
- Alliance Manchester Business School, Faculty of Humanities, the University of Manchester, Manchester, M13 9PL, United Kingdom
- Simon C Lovell
- Faculty of Life Sciences, the University of Manchester, Manchester, M13 9PL, United Kingdom
|
19
|
Shehu A. A Review of Evolutionary Algorithms for Computing Functional Conformations of Protein Molecules. METHODS IN PHARMACOLOGY AND TOXICOLOGY 2015. [DOI: 10.1007/7653_2015_47] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Indexed: 12/29/2022]
|
20
|
Molloy K, Shehu A. Elucidating the ensemble of functionally-relevant transitions in protein systems with a robotics-inspired method. BMC STRUCTURAL BIOLOGY 2013; 13 Suppl 1:S8. [PMID: 24565158 PMCID: PMC3952944 DOI: 10.1186/1472-6807-13-s1-s8] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Indexed: 12/17/2022]
Abstract
Background Many proteins tune their biological function by transitioning between different functional states, effectively acting as dynamic molecular machines. Detailed structural characterization of transition trajectories is central to understanding the relationship between protein dynamics and function. Computational approaches that build on the Molecular Dynamics framework are in principle able to model transition trajectories in great detail but also at considerable computational cost. Methods that delay consideration of dynamics and focus instead on elucidating energetically-credible conformational paths connecting two functionally-relevant structures provide a complementary approach. Effective sampling-based path planning methods originating in robotics have been recently proposed to produce conformational paths. These methods largely model short peptides or address large proteins by simplifying conformational space. Methods We propose a robotics-inspired method that connects two given structures of a protein by sampling conformational paths. The method focuses on small- to medium-size proteins, efficiently modeling structural deformations through the use of the molecular fragment replacement technique. In particular, the method grows a tree in conformational space rooted at the start structure, steering the tree to a goal region defined around the goal structure. We investigate various bias schemes over a progress coordinate for balance between coverage of conformational space and progress towards the goal. A geometric projection layer promotes path diversity. A reactive temperature scheme allows sampling of rare paths that cross energy barriers. Results and conclusions Experiments are conducted on small- to medium-size proteins of length up to 214 amino acids and with multiple known functionally-relevant states, some of which are more than 13 Å apart from each other. Analysis reveals that the method effectively obtains conformational paths connecting structural states that are significantly different. A detailed analysis on the depth and breadth of the tree suggests that a soft global bias over the progress coordinate enhances sampling and results in higher path diversity. The explicit geometric projection layer that biases the exploration away from over-sampled regions further increases coverage, often improving proximity to the goal by forcing the exploration to find new paths. The reactive temperature scheme is shown effective in increasing path diversity, particularly in difficult structural transitions with known high-energy barriers.
|
21
|
Al-Bluwi I, Vaisset M, Siméon T, Cortés J. Modeling protein conformational transitions by a combination of coarse-grained normal mode analysis and robotics-inspired methods. BMC STRUCTURAL BIOLOGY 2013; 13 Suppl 1:S2. [PMID: 24564964 PMCID: PMC3953241 DOI: 10.1186/1472-6807-13-s1-s2] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Indexed: 01/18/2023]
Abstract
BACKGROUND Obtaining atomic-scale information about large-amplitude conformational transitions in proteins is a challenging problem for both experimental and computational methods. Such information is, however, important for understanding the mechanisms of interaction of many proteins. METHODS This paper presents a computationally efficient approach, combining methods originating from robotics and computational biophysics, to model protein conformational transitions. The ability of normal mode analysis to predict directions of collective, large-amplitude motions is applied to bias the conformational exploration performed by a motion planning algorithm. To reduce the dimension of the problem, normal modes are computed for a coarse-grained elastic network model built on short fragments of three residues. Nevertheless, the validity of intermediate conformations is checked using the all-atom model, which is accurately reconstructed from the coarse-grained one using closed-form inverse kinematics. RESULTS Tests on a set of ten proteins demonstrate the ability of the method to model conformational transitions of proteins within a few hours of computing time on a single processor. These results also show that the computing time scales linearly with the protein size, independently of the protein topology. Further experiments on adenylate kinase show that main features of the transition between the open and closed conformations of this protein are well captured in the computed path. CONCLUSIONS The proposed method enables the simulation of large-amplitude conformational transitions in proteins using very few computational resources. The resulting paths are a first approximation that can directly provide important information on the molecular mechanisms involved in the conformational transition. This approximation can be subsequently refined and analyzed using state-of-the-art energy models and molecular modeling methods.
Affiliation(s)
- Ibrahim Al-Bluwi
- CNRS, LAAS, 7 avenue du colonel Roche, F-31400 Toulouse, France
- Univ de Toulouse, LAAS, F-31400 Toulouse, France
- Marc Vaisset
- CNRS, LAAS, 7 avenue du colonel Roche, F-31400 Toulouse, France
- Univ de Toulouse, LAAS, F-31400 Toulouse, France
- Thierry Siméon
- CNRS, LAAS, 7 avenue du colonel Roche, F-31400 Toulouse, France
- Univ de Toulouse, LAAS, F-31400 Toulouse, France
- Juan Cortés
- CNRS, LAAS, 7 avenue du colonel Roche, F-31400 Toulouse, France
- Univ de Toulouse, LAAS, F-31400 Toulouse, France
|
22
|
Saleh S, Olson B, Shehu A. A population-based evolutionary search approach to the multiple minima problem in de novo protein structure prediction. BMC STRUCTURAL BIOLOGY 2013; 13 Suppl 1:S4. [PMID: 24565020 PMCID: PMC3953177 DOI: 10.1186/1472-6807-13-s1-s4] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Indexed: 11/10/2022]
Abstract
Background Elucidating the native structure of a protein molecule from its sequence of amino acids, a problem known as de novo structure prediction, is a long standing challenge in computational structural biology. Difficulties in silico arise due to the high dimensionality of the protein conformational space and the ruggedness of the associated energy surface. The issue of multiple minima is a particularly troublesome hallmark of energy surfaces probed with current energy functions. In contrast to the true energy surface, these surfaces are weakly-funneled and rich in comparably deep minima populated by non-native structures. For this reason, many algorithms seek to be inclusive and obtain a broad view of the low-energy regions through an ensemble of low-energy (decoy) conformations. Conformational diversity in this ensemble is key to increasing the likelihood that the native structure has been captured. Methods We propose an evolutionary search approach to address the multiple-minima problem in decoy sampling for de novo structure prediction. Two population-based evolutionary search algorithms are presented that follow the basic approach of treating conformations as individuals in an evolving population. Coarse graining and molecular fragment replacement are used to efficiently obtain protein-like child conformations from parents. Potential energy is used both to bias parent selection and determine which subset of parents and children will be retained in the evolving population. The effect on the decoy ensemble of sampling minima directly is measured by additionally mapping a conformation to its nearest local minimum before considering it for retainment. The resulting memetic algorithm thus evolves not just a population of conformations but a population of local minima. Results and conclusions Results show that both algorithms are effective in terms of sampling conformations in proximity of the known native structure. 
The additional minimization is shown to be key to enhancing sampling capability and obtaining a diverse ensemble of decoy conformations, circumventing premature convergence to sub-optimal regions in the conformational space, and approaching the native structure with proximity that is comparable to state-of-the-art decoy sampling methods. The results are shown to be robust and valid when using two representative state-of-the-art coarse-grained energy functions.
|
23
|
Olson BS, Shehu A. Rapid sampling of local minima in protein energy surface and effective reduction through a multi-objective filter. Proteome Sci 2013; 11:S12. [PMID: 24564970 PMCID: PMC3908317 DOI: 10.1186/1477-5956-11-s1-s12] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/10/2022]
Abstract
BACKGROUND Many problems in protein modeling require obtaining a discrete representation of the protein conformational space as an ensemble of conformations. In ab-initio structure prediction, in particular, where the goal is to predict the native structure of a protein chain given its amino-acid sequence, the ensemble needs to satisfy energetic constraints. Given the thermodynamic hypothesis, an effective ensemble contains low-energy conformations which are similar to the native structure. The high-dimensionality of the conformational space and the ruggedness of the underlying energy surface currently make it very difficult to obtain such an ensemble. Recent studies have proposed that Basin Hopping is a promising probabilistic search framework to obtain a discrete representation of the protein energy surface in terms of local minima. Basin Hopping performs a series of structural perturbations followed by energy minimizations with the goal of hopping between nearby energy minima. This approach has been shown to be effective in obtaining conformations near the native structure for small systems. Recent work by us has extended this framework to larger systems through employment of the molecular fragment replacement technique, resulting in rapid sampling of large ensembles. METHODS This paper investigates the algorithmic components in Basin Hopping to both understand and control their effect on the sampling of near-native minima. Realizing that such an ensemble is reduced before further refinement in full ab-initio protocols, we take an additional step and analyze the quality of the ensemble retained by ensemble reduction techniques. We propose a novel multi-objective technique based on the Pareto front to filter the ensemble of sampled local minima. 
RESULTS AND CONCLUSIONS We show that controlling the magnitude of the perturbation allows directly controlling the distance between consecutively-sampled local minima and, in turn, steering the exploration towards conformations near the native structure. For the minimization step, we show that the addition of Metropolis Monte Carlo-based minimization is no more effective than a simple greedy search. Finally, we show that the size of the ensemble of sampled local minima can be effectively and efficiently reduced by a multi-objective filter to obtain a simpler representation of the probed energy surface.
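The multi-objective filter mentioned in this abstract can be illustrated with a generic Pareto-front computation. This is only a minimal sketch, not the authors' implementation: the function names `dominates` and `pareto_front` are hypothetical, and each sampled minimum is assumed to be scored by a tuple of objectives that are all minimized (for example, separate energy terms).

```python
def dominates(a, b):
    """Return True if score tuple `a` Pareto-dominates `b` (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(scores):
    """Keep only the score tuples that no other tuple dominates.

    Quadratic in the number of minima; adequate for a sketch, not tuned for
    the large ensembles the abstract describes.
    """
    return [p for p in scores if not any(dominates(q, p) for q in scores)]


# Hypothetical two-objective scores for five sampled local minima.
ensemble = [(1.0, 5.0), (2.0, 2.0), (3.0, 3.0), (5.0, 1.0), (4.0, 4.0)]
reduced = pareto_front(ensemble)
```

Here `reduced` retains the three non-dominated entries and drops `(3.0, 3.0)` and `(4.0, 4.0)`, which are both worse than `(2.0, 2.0)` in every objective.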
Affiliation(s)
- Brian S Olson
- Department of Computer Science, George Mason University, 4400 University Dr., Fairfax, VA, 22030, USA
- Amarda Shehu
- Department of Computer Science, George Mason University, 4400 University Dr., Fairfax, VA, 22030, USA
- Department of Bioengineering, George Mason University, 4400 University Dr., Fairfax, VA, 22030, USA
- School of Systems Biology, George Mason University, 10900 University Blvd., Manassas, VA, 20110, USA
|
24
|
Molloy K, Saleh S, Shehu A. Probabilistic search and energy guidance for biased decoy sampling in ab initio protein structure prediction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2013; 10:1162-1175. [PMID: 24384705 DOI: 10.1109/tcbb.2013.29] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Indexed: 06/03/2023]
Abstract
Adequate sampling of the conformational space is a central challenge in ab initio protein structure prediction. In the absence of a template structure, a conformational search procedure guided by an energy function explores the conformational space, gathering an ensemble of low-energy decoy conformations. If the sampling is inadequate, the native structure may be missed altogether. Even if reproduced, a subsequent stage that selects a subset of decoys for further structural detail and energetic refinement may discard near-native decoys if they are high energy or insufficiently represented in the ensemble. Sampling should produce a decoy ensemble that facilitates the subsequent selection of near-native decoys. In this paper, we investigate a robotics-inspired framework that allows directly measuring the role of energy in guiding sampling. Testing demonstrates that a soft energy bias steers sampling toward a diverse decoy ensemble less prone to exploiting energetic artifacts and thus more likely to facilitate retainment of near-native conformations by selection techniques. We employ two different energy functions, the associative memory Hamiltonian with water and Rosetta. Results show that enhanced sampling provides a rigorous testing of energy functions and exposes different deficiencies in them, thus promising to guide development of more accurate representations and energy functions.
|
25
|
Basin Hopping as a General and Versatile Optimization Framework for the Characterization of Biological Macromolecules. ACTA ACUST UNITED AC 2012. [DOI: 10.1155/2012/674832] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Indexed: 12/19/2022]
Abstract
Since its introduction, the basin hopping (BH) framework has proven useful for hard nonlinear optimization problems with multiple variables and modalities. Applications span a wide range, from packing problems in geometry to characterization of molecular states in statistical physics. BH is seeing a reemergence in computational structural biology due to its ability to obtain a coarse-grained representation of the protein energy surface in terms of local minima. In this paper, we show that the BH framework is general and versatile, allowing it to address problems related to the characterization of protein structure, assembly, and motion due to its fundamental ability to sample minima in a high-dimensional variable space. We show how specific implementations of the main components in BH yield algorithmic realizations that attain state-of-the-art results in the context of ab initio protein structure prediction and rigid protein-protein docking. We also show that BH can map intermediate minima related to motions connecting diverse stable functionally relevant states in a protein molecule, thus serving as a first step towards the characterization of transition trajectories connecting these states.
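As a rough illustration of the perturb-then-minimize loop that basin hopping refers to, the sketch below runs it on a toy two-variable function. It is a generic sketch under stated assumptions, not the implementation from this paper: the names `greedy_minimize`, `basin_hopping`, and `toy_energy` are hypothetical, the greedy local search and the Metropolis-style acceptance between minima are illustrative choices, and real applications would perturb protein conformations (for example by fragment replacement) rather than a coordinate list.

```python
import math
import random


def greedy_minimize(f, x, rng, step=0.05, iters=200):
    """Greedy local search: accept a small random move only if it lowers f."""
    fx = f(x)
    for _ in range(iters):
        candidate = [xi + rng.uniform(-step, step) for xi in x]
        fc = f(candidate)
        if fc < fx:
            x, fx = candidate, fc
    return x, fx


def basin_hopping(f, x0, hops=50, jump=1.0, temperature=1.0, seed=0):
    """Alternate a large perturbation with a local minimization, recording each minimum."""
    rng = random.Random(seed)
    x, fx = greedy_minimize(f, list(x0), rng)
    minima = [(fx, x)]
    for _ in range(hops):
        perturbed = [xi + rng.uniform(-jump, jump) for xi in x]
        y, fy = greedy_minimize(f, perturbed, rng)
        minima.append((fy, y))
        # Assumed acceptance rule: always move downhill, sometimes move uphill.
        if fy < fx or rng.random() < math.exp(-(fy - fx) / temperature):
            x, fx = y, fy
    best = min(minima, key=lambda m: m[0])
    return best, minima


def toy_energy(x):
    """Hypothetical rugged surface: a quadratic bowl with cosine ripples."""
    return sum(xi * xi + 2.0 * (1.0 - math.cos(3.0 * xi)) for xi in x)


best, minima = basin_hopping(toy_energy, [2.5, -2.0])
```

The returned `minima` list is the discrete, minima-only picture of the surface that the abstract describes, and `best` is its lowest-energy entry.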
|
26
|
Olson B, Molloy K, Hendi SF, Shehu A. Guiding probabilistic search of the protein conformational space with structural profiles. J Bioinform Comput Biol 2012; 10:1242005. [PMID: 22809381 DOI: 10.1142/s021972001242005x] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022]
Abstract
The roughness of the protein energy surface poses a significant challenge to search algorithms that seek to obtain a structural characterization of the native state. Recent research seeks to bias search toward near-native conformations through one-dimensional structural profiles of the protein native state. Here we investigate the effectiveness of such profiles in a structure prediction setting for proteins of various sizes and folds. We pursue two directions. We first investigate the contribution of structural profiles in comparison to or in conjunction with physics-based energy functions in providing an effective energy bias. We conduct this investigation in the context of Metropolis Monte Carlo with fragment-based assembly. Second, we explore the effectiveness of structural profiles in providing projection coordinates through which to organize the conformational space. We do so in the context of a robotics-inspired search framework proposed in our lab that employs projections of the conformational space to guide search. Our findings indicate that structural profiles are most effective in obtaining physically realistic near-native conformations when employed in conjunction with physics-based energy functions. Our findings also show that these profiles are very effective when employed instead as projection coordinates to guide probabilistic search toward undersampled regions of the conformational space.
Affiliation(s)
- Brian Olson
- Department of Computer Science, George Mason University, 4400 University Drive, Fairfax, VA 22030, USA
|
27
|
Paës G, Cortés J, Siméon T, O'Donohue MJ, Tran V. Thumb-loops up for catalysis: a structure/function investigation of a functional loop movement in a GH11 xylanase. Comput Struct Biotechnol J 2012; 1:e201207001. [PMID: 24688637 PMCID: PMC3962102 DOI: 10.5936/csbj.201207001] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 04/05/2012] [Revised: 05/23/2012] [Accepted: 05/27/2012] [Indexed: 12/17/2022]
Abstract
Dynamics is a key feature of enzyme catalysis. Unfortunately, current experimental and computational techniques do not yet provide a comprehensive understanding and description of functional macromolecular motions. In this work, we have extended a novel computational technique, which combines molecular modeling methods and robotics algorithms, to investigate functional motions of protein loops. This new approach has been applied to study the functional importance of the so-called thumb-loop in the glycoside hydrolase family 11 xylanase from Thermobacillus xylanilyticus (Tx-xyl). The results obtained provide new insight into the role of the loop in the glycosylation/deglycosylation catalytic cycle, and underline the key importance of the nature of the residue located at the tip of the thumb-loop. The effect of mutations predicted in silico has been validated by in vitro site-directed mutagenesis experiments. Overall, we propose a comprehensive model of Tx-xyl catalysis in terms of substrate and product dynamics by identifying the action of the thumb-loop motion during catalysis.
Affiliation(s)
- Gabriel Paës
- CNRS, FRE3478 UFIP, Faculté des Sciences et des Techniques, 2 rue de la Houssinière, F-44322 Nantes, France ; University of Nantes, FRE3478 UFIP, Faculté des Sciences et des Techniques, 2 rue de la Houssinière, F-44322 Nantes, France ; INRA, UMR614 FARE, 2 esplanade Roland Garros, F-51686 Reims, France ; University of Reims Champagne-Ardenne, UMR614 FARE, 2 esplanade Roland Garros, F-51686 Reims, France
- Juan Cortés
- CNRS, LAAS, 7 avenue du colonel Roche, F-31400 Toulouse, France ; University of Toulouse, LAAS, F-31400 Toulouse, France
- Thierry Siméon
- CNRS, LAAS, 7 avenue du colonel Roche, F-31400 Toulouse, France ; University of Toulouse, LAAS, F-31400 Toulouse, France
- Michael J O'Donohue
- INRA, UMR614 FARE, 2 esplanade Roland Garros, F-51686 Reims, France ; University of Reims Champagne-Ardenne, UMR614 FARE, 2 esplanade Roland Garros, F-51686 Reims, France ; INRA, UMR792 LISBP, 137 avenue de Rangueil, F-31077 Toulouse, France ; INSA, UMR792 LISBP, 137 avenue de Rangueil, F-31077 Toulouse, France
- Vinh Tran
- CNRS, FRE3478 UFIP, Faculté des Sciences et des Techniques, 2 rue de la Houssinière, F-44322 Nantes, France ; University of Nantes, FRE3478 UFIP, Faculté des Sciences et des Techniques, 2 rue de la Houssinière, F-44322 Nantes, France
|
28
|
de Angulo VR, Cortés J, Porta JM. Rigid-CLL: avoiding constant-distance computations in cell linked-lists algorithms. J Comput Chem 2012; 33:294-300. [PMID: 22072568 DOI: 10.1002/jcc.21974] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 05/25/2011] [Accepted: 09/28/2011] [Indexed: 11/12/2022]
Abstract
Many of the existing molecular simulation tools require the efficient identification of the set of nonbonded interacting atoms. This is necessary, for instance, to compute the energy values or the steric contacts between atoms. Cell linked-lists can be used to determine the pairs of atoms closer than a given cutoff distance in asymptotically optimal time. Despite this asymptotic optimality, many spurious distances are still computed with this method. Therefore, several improvements have been proposed, most of them aiming to refine the volume of influence for each atom. Here, we suggest a different improvement strategy based on not filling cells with those atoms that are always at a constant distance from a given atom. This technique is particularly effective when large groups of the particles in the simulation behave as rigid bodies, as is the case in simplified models considering only a few of the degrees of freedom of the molecule. In these cases, the proposed technique can reduce the number of distance computations by more than one order of magnitude, as compared with the standard cell linked-list technique. The benefits of this technique are obtained without incurring additional computation costs, because it carries out the same operations as the standard cell linked-list algorithm, although in a different order. Since the focus of the technique is the order of the operations, it might be combined with existing improvements based on bounding the volume of influence for each atom.
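For readers unfamiliar with the baseline this abstract builds on, the following is a minimal sketch of the standard cell linked-list neighbor search. It is not Rigid-CLL itself, which additionally avoids distance computations between atoms held at constant distance. The function name `pairs_within_cutoff` and the use of a dictionary of cells with edge length equal to the cutoff are illustrative assumptions.

```python
import math
from collections import defaultdict
from itertools import product


def pairs_within_cutoff(coords, cutoff):
    """Return sorted index pairs (i, j), i < j, whose distance is below `cutoff`.

    Atoms are binned into cubic cells of edge `cutoff`, so any interacting
    pair lies in the same cell or in two adjacent cells.
    """
    cells = defaultdict(list)
    for index, (x, y, z) in enumerate(coords):
        key = (math.floor(x / cutoff), math.floor(y / cutoff), math.floor(z / cutoff))
        cells[key].append(index)

    cutoff_sq = cutoff * cutoff
    pairs = []
    for (cx, cy, cz), members in cells.items():
        # Visit the cell itself and its 26 neighbors.
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            neighbors = cells.get((cx + dx, cy + dy, cz + dz))
            if not neighbors:
                continue
            for i in members:
                for j in neighbors:
                    # i < j ensures each unordered pair is tested exactly once.
                    if i < j and math.dist(coords[i], coords[j]) ** 2 < cutoff_sq:
                        pairs.append((i, j))
    return sorted(pairs)


# Hypothetical coordinates (arbitrary units) for five atoms.
atoms = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (3.0, 3.0, 3.0), (3.2, 3.0, 3.0), (-0.4, 0.0, 0.0)]
close_pairs = pairs_within_cutoff(atoms, 1.0)
```

Even in this baseline, atoms that fall in adjacent cells but lie farther apart than the cutoff still cost a distance evaluation, which is the kind of spurious computation the abstract refers to.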
Affiliation(s)
- V Ruiz de Angulo
- Institut de Robòtica i Informàtica Industrial, UPC-CSIC, Llorens Artigas 4-6, 08028 Barcelona, Spain.
|
29
|
Olson B, Molloy K, Shehu A. In search of the protein native state with a probabilistic sampling approach. J Bioinform Comput Biol 2011; 9:383-98. [DOI: 10.1142/s0219720011005574] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Received: 01/31/2011] [Revised: 04/07/2011] [Accepted: 04/11/2011] [Indexed: 11/18/2022]
Abstract
The three-dimensional structure of a protein is a key determinant of its biological function. Given the cost and time required to acquire this structure through experimental means, computational models are necessary to complement wet-lab efforts. Many computational techniques exist for navigating the high-dimensional protein conformational search space, which is explored for low-energy conformations that comprise a protein's native states. This work proposes two strategies to enhance the sampling of conformations near the native state. An enhanced fragment library with greater structural diversity is used to expand the search space in the context of fragment-based assembly. To manage the increased complexity of the search space, only a representative subset of the sampled conformations is retained to further guide the search towards the native state. Our results make the case that these two strategies greatly enhance the sampling of the conformational space near the native state. A detailed comparative analysis shows that our approach performs as well as state-of-the-art ab initio structure prediction protocols.
Affiliation(s)
- Brian Olson
- Department of Computer Science, George Mason University, 4400 University Drive, Fairfax, VA 22030, USA
- Kevin Molloy
- Department of Computer Science, George Mason University, 4400 University Drive, Fairfax, VA 22030, USA
- Amarda Shehu
- Department of Computer Science, George Mason University, 4400 University Drive, Fairfax, VA 22030, USA
- Department of Bioinformatics and Computational Biology, George Mason University, 4400 University Drive, Fairfax, VA 22030, USA
|