Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Cuff AL, Sillitoe I, Lewis T, Redfern OC, Garratt R, Thornton J, Orengo CA. The CATH classification revisited--architectures reviewed and new ways to characterize structural divergence in superfamilies. Nucleic Acids Res 2008;37:D310-4. [PMID: 18996897 PMCID: PMC2686597 DOI: 10.1093/nar/gkn877] [Citation(s) in RCA: 157] [Impact Index Per Article: 9.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Cited by Other Article(s)

Koga N, Tatsumi-Koga R. Inventing Novel Protein Folds. J Mol Biol 2024:168791. [PMID: 39260686 DOI: 10.1016/j.jmb.2024.168791] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/02/2024] [Revised: 09/04/2024] [Accepted: 09/05/2024] [Indexed: 09/13/2024]

Jacques F, Bolivar P, Pietras K, Hammarlund EU. Roadmap to the study of gene and protein phylogeny and evolution-A practical guide. PLoS One 2023;18:e0279597. [PMID: 36827278 PMCID: PMC9955684 DOI: 10.1371/journal.pone.0279597] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/09/2022] [Accepted: 12/12/2022] [Indexed: 02/25/2023] Open

Zhao W, Zhong B, Zheng L, Tan P, Wang Y, Leng H, de Souza N, Liu Z, Hong L, Xiao X. Proteome-wide 3D structure prediction provides insights into the ancestral metabolism of ancient archaea and bacteria. Nat Commun 2022;13:7861. [PMID: 36543797 PMCID: PMC9772386 DOI: 10.1038/s41467-022-35523-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/21/2022] [Accepted: 12/07/2022] [Indexed: 12/24/2022] Open

Affiliation(s)

Weishu Zhao State Key Laboratory of Microbial Metabolism, International Center for Deep Life Investigation (IC-DLI), School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 200240, Shanghai, China
Bozitao Zhong State Key Laboratory of Microbial Metabolism, International Center for Deep Life Investigation (IC-DLI), School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 200240, Shanghai, China Institute of Natural Sciences, Shanghai National Center for Applied Mathematics (SJTU Center) and MOE-LSC, Shanghai Jiao Tong University, 200240, Shanghai, China
Lirong Zheng Institute of Natural Sciences, Shanghai National Center for Applied Mathematics (SJTU Center) and MOE-LSC, Shanghai Jiao Tong University, 200240, Shanghai, China
Pan Tan Institute of Natural Sciences, Shanghai National Center for Applied Mathematics (SJTU Center) and MOE-LSC, Shanghai Jiao Tong University, 200240, Shanghai, China
Yinzhao Wang State Key Laboratory of Microbial Metabolism, International Center for Deep Life Investigation (IC-DLI), School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 200240, Shanghai, China
Hao Leng State Key Laboratory of Microbial Metabolism, International Center for Deep Life Investigation (IC-DLI), School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 200240, Shanghai, China
Nicolas de Souza Australian Nuclear Science and Technology Organisation (ANSTO), Locked Bag 2001, Kirrawee DC, Sydney, NSW, 2232, Australia
Zhuo Liu Institute of Natural Sciences, Shanghai National Center for Applied Mathematics (SJTU Center) and MOE-LSC, Shanghai Jiao Tong University, 200240, Shanghai, China Shanghai Artificial Intelligence Laboratory, 200232, Shanghai, China School of Physics and Astronomy, Zhangjiang Institute for Advanced Study, Shanghai Jiao Tong University, 200240, Shanghai, China
Liang Hong Institute of Natural Sciences, Shanghai National Center for Applied Mathematics (SJTU Center) and MOE-LSC, Shanghai Jiao Tong University, 200240, Shanghai, China. Shanghai Artificial Intelligence Laboratory, 200232, Shanghai, China. School of Physics and Astronomy, Zhangjiang Institute for Advanced Study, Shanghai Jiao Tong University, 200240, Shanghai, China.
Xiang Xiao State Key Laboratory of Microbial Metabolism, International Center for Deep Life Investigation (IC-DLI), School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 200240, Shanghai, China. Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Zhuhai, Guangdong, China.

Paul SK, Saddam M, Rahaman KA, Choi JG, Lee SS, Hasan M. Molecular modeling, molecular dynamics simulation, and essential dynamics analysis of grancalcin: An upregulated biomarker in experimental autoimmune encephalomyelitis mice. Heliyon 2022;8:e11232. [PMID: 36340004 PMCID: PMC9626934 DOI: 10.1016/j.heliyon.2022.e11232] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/17/2022] [Revised: 05/30/2022] [Accepted: 10/20/2022] [Indexed: 11/06/2022] Open

Rosa HVD, Leonardo DA, Brognara G, Brandão-Neto J, D'Muniz Pereira H, Araújo APU, Garratt RC. Molecular Recognition at Septin Interfaces: The Switches Hold the Key. J Mol Biol 2020;432:5784-5801. [DOI: 10.1016/j.jmb.2020.09.001] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/11/2020] [Revised: 08/25/2020] [Accepted: 09/01/2020] [Indexed: 01/22/2023]

Decomposing Structural Response Due to Sequence Changes in Protein Domains with Machine Learning. J Mol Biol 2020;432:4435-4446. [PMID: 32485208 DOI: 10.1016/j.jmb.2020.05.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2020] [Revised: 05/06/2020] [Accepted: 05/27/2020] [Indexed: 10/24/2022]

Dimarogona M, Topakas E, Christakopoulos P, Chrysina ED. The crystal structure of a Fusarium oxysporum feruloyl esterase that belongs to the tannase family. FEBS Lett 2020;594:1738-1749. [PMID: 32297315 DOI: 10.1002/1873-3468.13776] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/29/2019] [Revised: 03/13/2020] [Accepted: 03/17/2020] [Indexed: 12/31/2022]

Kumar AP, Verma CS, Lukman S. Structural dynamics and allostery of Rab proteins: strategies for drug discovery and design. Brief Bioinform 2020;22:270-287. [PMID: 31950981 DOI: 10.1093/bib/bbz161] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2019] [Revised: 08/29/2019] [Accepted: 11/15/2019] [Indexed: 01/09/2023] Open

Abstract

Rab proteins represent the largest family of the Ras superfamily guanosine triphosphatase (GTPase). Aberrant human Rab proteins are associated with multiple diseases, including cancers and neurological disorders. Rab subfamily members display subtle conformational variations that render specificity in their physiological functions and can be targeted for subfamily-specific drug design. However, drug discovery efforts have not focused much on targeting Rab allosteric non-nucleotide binding sites which are subjected to less evolutionary pressures to be conserved, hence are likely to offer subfamily specificity and may be less prone to undesirable off-target interactions and side effects. To discover druggable allosteric binding sites, Rab structural dynamics need to be first incorporated using multiple experimentally and computationally obtained structures. The high-dimensional structural data may necessitate feature extraction methods to identify manageable representative structures for subsequent analyses. We have detailed state-of-the-art computational methods to (i) identify binding sites using data on sequence, shape, energy, etc., (ii) determine the allosteric nature of these binding sites based on structural ensembles, residue networks and correlated motions and (iii) identify small molecule binders through structure- and ligand-based virtual screening. To benefit future studies for targeting Rab allosteric sites, we herein detail a refined workflow comprising multiple available computational methods, which have been successfully used alone or in combinations. This workflow is also applicable for drug discovery efforts targeting other medically important proteins. Depending on the structural dynamics of proteins of interest, researchers can select suitable strategies for allosteric drug discovery and design, from the resources of computational methods and tools enlisted in the workflow.

Schaeffer RD, Kinch L, Medvedev KE, Pei J, Cheng H, Grishin N. ECOD: identification of distant homology among multidomain and transmembrane domain proteins. BMC Mol Cell Biol 2019;20:18. [PMID: 31226926 PMCID: PMC6588880 DOI: 10.1186/s12860-019-0204-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/23/2019] [Accepted: 06/02/2019] [Indexed: 12/03/2022] Open

Song K, Zhang J, Lu S. Progress in Allosteric Database. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2019;1163:65-87. [PMID: 31707700 DOI: 10.1007/978-981-13-8719-7_4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/29/2022]

Shanthirabalan S, Chomilier J, Carpentier M. Structural effects of point mutations in proteins. Proteins 2018;86:853-867. [DOI: 10.1002/prot.25499] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/28/2017] [Revised: 03/19/2018] [Accepted: 03/20/2018] [Indexed: 12/21/2022]

Classification and Exploration of 3D Protein Domain Interactions Using Kbdock. Methods Mol Biol 2017. [PMID: 27115629 DOI: 10.1007/978-1-4939-3572-7_5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register]

NSiteMatch: Prediction of Binding Sites of Nucleotides by Identifying the Structure Similarity of Local Surface Patches. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2017;2017:5471607. [PMID: 28811833 PMCID: PMC5547728 DOI: 10.1155/2017/5471607] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/04/2017] [Accepted: 06/14/2017] [Indexed: 12/01/2022]

Yoneda JS, Miles AJ, Araujo APU, Wallace BA. Differential dehydration effects on globular proteins and intrinsically disordered proteins during film formation. Protein Sci 2017;26:718-726. [PMID: 28097742 PMCID: PMC5368061 DOI: 10.1002/pro.3118] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2016] [Revised: 01/06/2017] [Accepted: 01/09/2017] [Indexed: 12/22/2022]

Mavridis L, Janes RW. PDB2CD: a web-based application for the generation of circular dichroism spectra from protein atomic coordinates. Bioinformatics 2016;33:56-63. [PMID: 27651482 PMCID: PMC5408769 DOI: 10.1093/bioinformatics/btw554] [Citation(s) in RCA: 51] [Impact Index Per Article: 6.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/12/2016] [Revised: 08/13/2016] [Accepted: 08/21/2016] [Indexed: 11/12/2022] Open

Nguyen MN, Sim AYL, Wan Y, Madhusudhan MS, Verma C. Topology independent comparison of RNA 3D structures using the CLICK algorithm. Nucleic Acids Res 2016;45:e5. [PMID: 27634929 PMCID: PMC5741206 DOI: 10.1093/nar/gkw819] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2015] [Revised: 09/01/2016] [Accepted: 09/02/2016] [Indexed: 01/15/2023] Open

Making sense of genomes of parasitic worms: Tackling bioinformatic challenges. Biotechnol Adv 2016;34:663-686. [DOI: 10.1016/j.biotechadv.2016.03.001] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2015] [Revised: 02/25/2016] [Accepted: 03/01/2016] [Indexed: 01/25/2023]

Paz I, Kligun E, Bengad B, Mandel-Gutfreund Y. BindUP: a web server for non-homology-based prediction of DNA and RNA binding proteins. Nucleic Acids Res 2016;44:W568-74. [PMID: 27198220 PMCID: PMC4987955 DOI: 10.1093/nar/gkw454] [Citation(s) in RCA: 49] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2016] [Accepted: 05/11/2016] [Indexed: 12/12/2022] Open

Ramírez-Sarmiento CA, Baez M, Zamora RA, Balasubramaniam D, Babul J, Komives EA, Guixé V. The folding unit of phosphofructokinase-2 as defined by the biophysical properties of a monomeric mutant. Biophys J 2016;108:2350-61. [PMID: 25954892 DOI: 10.1016/j.bpj.2015.04.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/20/2015] [Revised: 04/01/2015] [Accepted: 04/02/2015] [Indexed: 10/23/2022] Open

Abstract

Escherichia coli phosphofructokinase-2 (Pfk-2) is an obligate homodimer that follows a highly cooperative three-state folding mechanism N2 ↔ 2I ↔ 2U. The strong coupling between dissociation and unfolding is a consequence of the structural features of its interface: a bimolecular domain formed by intertwining of the small domain of each subunit into a flattened β-barrel. Although isolated monomers of E. coli Pfk-2 have been observed by modification of the environment (changes in temperature, addition of chaotropic agents), no isolated subunits in native conditions have been obtained. Based on in silico estimations of the change in free energy and the local energetic frustration upon binding, we engineered a single-point mutant to destabilize the interface of Pfk-2. This mutant, L93A, is an inactive monomer at protein concentrations below 30 μM, as determined by analytical ultracentrifugation, dynamic light scattering, size exclusion chromatography, small-angle x-ray scattering, and enzyme kinetics. Active dimer formation can be induced by increasing the protein concentration and by addition of its substrate fructose-6-phosphate. Chemical and thermal unfolding of the L93A monomer followed by circular dichroism and dynamic light scattering suggest that it unfolds noncooperatively and that the isolated subunit is partially unstructured and marginally stable. The detailed structural features of the L93A monomer and the F6P-induced dimer were ascertained by high-resolution hydrogen/deuterium exchange mass spectrometry. Our results show that the isolated subunit has overall higher solvent accessibility than the native dimer, with the exception of residues 240-309. These residues correspond to most of the β-meander module and show the same extent of deuterium uptake as the native dimer. 
Our results support the idea that the hydrophobic core of the isolated monomer of Pfk-2 is solvent-penetrated in native conditions and that the β-meander module is not affected by monomerizing mutations.

Chen ASY, Westwood NJ, Brear P, Rogers GW, Mavridis L, Mitchell JBO. A Random Forest Model for Predicting Allosteric and Functional Sites on Proteins. Mol Inform 2016;35:125-35. [DOI: 10.1002/minf.201500108] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/28/2015] [Accepted: 12/28/2015] [Indexed: 01/17/2023]

Daste F, Galli T, Tareste D. Structure and function of longin SNAREs. J Cell Sci 2015;128:4263-72. [PMID: 26567219 DOI: 10.1242/jcs.178574] [Citation(s) in RCA: 73] [Impact Index Per Article: 8.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/17/2023] Open

Fox NK, Brenner SE, Chandonia JM. The value of protein structure classification information-Surveying the scientific literature. Proteins 2015;83:2025-38. [PMID: 26313554 PMCID: PMC4609302 DOI: 10.1002/prot.24915] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/08/2015] [Revised: 08/06/2015] [Accepted: 08/18/2015] [Indexed: 11/08/2022]

Kelley LA, Sternberg MJE. Partial protein domains: evolutionary insights and bioinformatics challenges. Genome Biol 2015;16:100. [PMID: 25986583 PMCID: PMC4436111 DOI: 10.1186/s13059-015-0663-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Banach M, Prudhomme N, Carpentier M, Duprat E, Papandreou N, Kalinowska B, Chomilier J, Roterman I. Contribution to the prediction of the fold code: application to immunoglobulin and flavodoxin cases. PLoS One 2015;10:e0125098. [PMID: 25915049 PMCID: PMC4411048 DOI: 10.1371/journal.pone.0125098] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/16/2014] [Accepted: 03/20/2015] [Indexed: 12/19/2022] Open

Abstract

Background

Folding nucleus of globular proteins formation starts by the mutual interaction of a group of hydrophobic amino acids whose close contacts allow subsequent formation and stability of the 3D structure. These early steps can be predicted by simulation of the folding process through a Monte Carlo (MC) coarse grain model in a discrete space. We previously defined MIRs (Most Interacting Residues), as the set of residues presenting a large number of non-covalent neighbour interactions during such simulation. MIRs are good candidates to define the minimal number of residues giving rise to a given fold instead of another one, although their proportion is rather high, typically [15-20]% of the sequences. Having in mind experiments with two sequences of very high levels of sequence identity (up to 90%) but different folds, we combined the MIR method, which takes sequence as single input, with the “fuzzy oil drop” (FOD) model that requires a 3D structure, in order to estimate the residues coding for the fold. FOD assumes that a globular protein follows an idealised 3D Gaussian distribution of hydrophobicity density, with the maximum in the centre and minima at the surface of the “drop”. If the actual local density of hydrophobicity around a given amino acid is as high as the ideal one, then this amino acid is assigned to the core of the globular protein, and it is assumed to follow the FOD model. Therefore one obtains a distribution of the amino acids of a protein according to their agreement or rejection with the FOD model.
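
The idealised hydrophobicity profile described above can be illustrated with a minimal sketch. It assumes the "drop" is centred at the origin, uses a separable 3D Gaussian with one width per axis, and normalises the values so they sum to 1. The function name, the coordinates and the sigma values are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import math

def idealised_hydrophobicity(points, sigmas):
    """Evaluate a 3D Gaussian centred at the origin at each point.

    points: iterable of (x, y, z) positions relative to the drop centre.
    sigmas: (sx, sy, sz) widths of the Gaussian along each axis (assumed values).
    Returns values normalised to sum to 1 (the normalisation is an assumption).
    """
    sx, sy, sz = sigmas
    raw = [
        math.exp(
            -(x * x) / (2 * sx * sx)
            - (y * y) / (2 * sy * sy)
            - (z * z) / (2 * sz * sz)
        )
        for x, y, z in points
    ]
    total = sum(raw)
    return [value / total for value in raw]

# Hypothetical residue positions (in angstroms) relative to the drop centre.
positions = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 9.0, 0.0)]
profile = idealised_hydrophobicity(positions, sigmas=(3.0, 3.0, 3.0))
```

In this sketch the value is highest at the centre and decays towards the surface, which is the shape the FOD description relies on. Comparing such an idealised profile with an observed per-residue density would be the next step, and is not shown here.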

Results

We compared and combined MIR and FOD methods to define the minimal nucleus, or keystone, of two populated folds: immunoglobulin-like (Ig) and flavodoxins (Flav). The combination of these two approaches defines some positions both predicted as a MIR and assigned as accordant with the FOD model. It is shown here that for these two folds, the intersection of the predicted sets of residues significantly differs from random selection. It reduces the number of selected residues by each individual method and allows a reasonable agreement with experimentally determined key residues coding for the particular fold. In addition, the intersection of the two methods significantly increases the specificity of the prediction, providing a robust set of residues that constitute the folding nucleus.

A structure-based classification and analysis of protein domain family binding sites and their interactions. BIOLOGY 2015;4:327-43. [PMID: 25860777 PMCID: PMC4498303 DOI: 10.3390/biology4020327] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/16/2015] [Revised: 03/24/2015] [Accepted: 03/31/2015] [Indexed: 11/29/2022]

Computational tools for epitope vaccine design and evaluation. Curr Opin Virol 2015;11:103-12. [PMID: 25837467 DOI: 10.1016/j.coviro.2015.03.013] [Citation(s) in RCA: 47] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/26/2014] [Revised: 03/13/2015] [Accepted: 03/16/2015] [Indexed: 12/15/2022]

Currin A, Swainston N, Day PJ, Kell DB. Synthetic biology for the directed evolution of protein biocatalysts: navigating sequence space intelligently. Chem Soc Rev 2015;44:1172-239. [PMID: 25503938 PMCID: PMC4349129 DOI: 10.1039/c4cs00351a] [Citation(s) in RCA: 256] [Impact Index Per Article: 28.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2014] [Indexed: 12/21/2022]

Abstract

The amino acid sequence of a protein affects both its structure and its function. Thus, the ability to modify the sequence, and hence the structure and activity, of individual proteins in a systematic way, opens up many opportunities, both scientifically and (as we focus on here) for exploitation in biocatalysis. Modern methods of synthetic biology, whereby increasingly large sequences of DNA can be synthesised de novo, allow an unprecedented ability to engineer proteins with novel functions. However, the number of possible proteins is far too large to test individually, so we need means for navigating the 'search space' of possible protein sequences efficiently and reliably in order to find desirable activities and other properties. Enzymologists distinguish binding (Kd) and catalytic (kcat) steps. In a similar way, judicious strategies have blended design (for binding, specificity and active site modelling) with the more empirical methods of classical directed evolution (DE) for improving kcat (where natural evolution rarely seeks the highest values), especially with regard to residues distant from the active site and where the functional linkages underpinning enzyme dynamics are both unknown and hard to predict. Epistasis (where the 'best' amino acid at one site depends on that or those at others) is a notable feature of directed evolution. The aim of this review is to highlight some of the approaches that are being developed to allow us to use directed evolution to improve enzyme properties, often dramatically. We note that directed evolution differs in a number of ways from natural evolution, including in particular the available mechanisms and the likely selection pressures. Thus, we stress the opportunities afforded by techniques that enable one to map sequence to (structure and) activity in silico, as an effective means of modelling and exploring protein landscapes. 
Because known landscapes may be assessed and reasoned about as a whole, simultaneously, this offers opportunities for protein improvement not readily available to natural evolution on rapid timescales. Intelligent landscape navigation, informed by sequence-activity relationships and coupled to the emerging methods of synthetic biology, offers scope for the development of novel biocatalysts that are both highly active and robust.

Rappoport N, Stern A, Linial N, Linial M. Entropy-driven partitioning of the hierarchical protein space. Bioinformatics 2015;30:i624-30. [PMID: 25161256 PMCID: PMC4147929 DOI: 10.1093/bioinformatics/btu478] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/03/2023] Open

Abstract

Motivation: Modern protein sequencing techniques have led to the determination of >50 million protein sequences. ProtoNet is a clustering system that provides a continuous hierarchical agglomerative clustering tree for all proteins. While ProtoNet performs unsupervised classification of all included proteins, finding an optimal level of granularity for the purpose of focusing on protein functional groups remains elusive. Here, we ask whether knowledge-based annotations on protein families can support the automatic unsupervised methods for identifying high-quality protein families. We present a method that yields within the ProtoNet hierarchy an optimal partition of clusters, relative to manual annotation schemes. The method’s principle is to minimize the entropy-derived distance between annotation-based partitions and all available hierarchical partitions. We describe the best front (BF) partition of 2 478 328 proteins from UniRef50. Of 4 929 553 ProtoNet tree clusters, BF based on Pfam annotations contains 26 891 clusters. The high quality of the partition is validated by the close correspondence with the set of clusters that best describe thousands of keywords of Pfam. The BF is shown to be superior to naïve cut in the ProtoNet tree that yields a similar number of clusters. Finally, we used parameters intrinsic to the clustering process to enrich a priori the BF’s clusters. We present the entropy-based method’s benefit in overcoming the unavoidable limitations of nested clusters in ProtoNet. We suggest that this automatic information-based cluster selection can be useful for other large-scale annotation schemes, as well as for systematically testing and comparing putative families derived from alternative clustering methods.

Availability and implementation: A catalog of BF clusters for thousands of Pfam keywords is provided at http://protonet.cs.huji.ac.il/bestFront/

Contact: michall@cc.huji.ac.il
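
The abstract above does not spell out its entropy-derived distance, so the following is only a sketch of the general idea using a stand-in measure, the variation of information between two partitions (the sum of the two conditional entropies). The function name, the toy annotation labels and the candidate partitions are all hypothetical and are not taken from ProtoNet or Pfam.

```python
import math
from collections import Counter

def variation_of_information(labels_a, labels_b):
    """Variation of information, H(A|B) + H(B|A), in nats.

    labels_a and labels_b assign a cluster label to the same items, in the
    same order. The result is 0 when the two partitions are identical up to
    relabelling and grows as they disagree.
    """
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("label lists must be non-empty and of equal length")
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    vi = 0.0
    for (a, b), n_ab in joint.items():
        p_ab = n_ab / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        vi -= p_ab * (math.log(p_ab / p_a) + math.log(p_ab / p_b))
    return vi

# Toy annotation-based partition and three candidate cuts of a hierarchy.
annotation = ["kinase", "kinase", "protease", "protease"]
candidates = {
    "coarse": [0, 0, 0, 0],
    "matching": [1, 1, 2, 2],
    "fine": [1, 2, 3, 4],
}
# Pick the candidate partition closest to the annotation.
best = min(
    candidates,
    key=lambda name: variation_of_information(annotation, candidates[name]),
)
```

Here the cut that reproduces the annotation groups has distance 0 and is selected, while the overly coarse and overly fine cuts are penalised equally. A real hierarchy has far more candidate partitions than can be enumerated this way, so this only shows the scoring step.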

Gabanyi MJ, Berman HM. Protein structure annotation resources. Methods Mol Biol 2015;1261:3-20. [PMID: 25502191 PMCID: PMC5586544 DOI: 10.1007/978-1-4939-2230-7_1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/21/2023]

Nguyen MN, Verma C. Rclick: a web server for comparison of RNA 3D structures. Bioinformatics 2014;31:966-8. [DOI: 10.1093/bioinformatics/btu752] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Hepburn L, Prajsnar TK, Klapholz C, Moreno P, Loynes CA, Ogryzko NV, Brown K, Schiebler M, Hegyi K, Antrobus R, Hammond KL, Connolly J, Ochoa B, Bryant C, Otto M, Surewaard B, Seneviratne SL, Grogono DM, Cachat J, Ny T, Kaser A, Török ME, Peacock SJ, Holden M, Blundell T, Wang L, Ligoxygakis P, Minichiello L, Woods CG, Foster SJ, Renshaw SA, Floto RA. Innate immunity. A Spaetzle-like role for nerve growth factor β in vertebrate immunity to Staphylococcus aureus. Science 2014;346:641-646. [PMID: 25359976 PMCID: PMC4255479 DOI: 10.1126/science.1258705] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/11/2022]

Affiliation(s)

Lucy Hepburn Cambridge Institute for Medical Research, University of Cambridge, UK Department of Medicine, University of Cambridge, UK
Tomasz K. Prajsnar Krebs Institute, University of Sheffield, Western Bank, Sheffield, S10 2TN, UK Department of Molecular Biology and Biotechnology, University of Sheffield, Western Bank, Sheffield, S10 2TN, UK Bateson Centre, University of Sheffield, Western Bank, Sheffield, S10 2TN, UK
Catherine Klapholz Cambridge Institute for Medical Research, University of Cambridge, UK Department of Medicine, University of Cambridge, UK
Pablo Moreno Cambridge Institute for Medical Research, University of Cambridge, UK
Catherine A. Loynes Bateson Centre, University of Sheffield, Western Bank, Sheffield, S10 2TN, UK Department of Infection and Immunity, University of Sheffield, Western Bank, Sheffield, S10 2TN, UK
Nikolay V. Ogryzko: Bateson Centre, University of Sheffield, Western Bank, Sheffield, S10 2TN, UK
Karen Brown: Cambridge Institute for Medical Research, University of Cambridge, UK; Department of Medicine, University of Cambridge, UK; Cambridge Centre for Lung Infection, Papworth Hospital, Cambridge, UK
Mark Schiebler: Cambridge Institute for Medical Research, University of Cambridge, UK; Department of Medicine, University of Cambridge, UK
Krisztina Hegyi: Cambridge Institute for Medical Research, University of Cambridge, UK; Department of Medicine, University of Cambridge, UK
Robin Antrobus: Cambridge Institute for Medical Research, University of Cambridge, UK
Katherine L. Hammond: Bateson Centre, University of Sheffield, Western Bank, Sheffield, S10 2TN, UK; Department of Infection and Immunity, University of Sheffield, Western Bank, Sheffield, S10 2TN, UK
John Connolly: Krebs Institute, University of Sheffield, Western Bank, Sheffield, S10 2TN, UK; Department of Molecular Biology and Biotechnology, University of Sheffield, Western Bank, Sheffield, S10 2TN, UK
Bernardo Ochoa: Department of Biochemistry, University of Cambridge, UK
Clare Bryant: Department of Veterinary Medicine, University of Cambridge, UK
Michael Otto: Laboratory of Human Bacterial Pathogenesis, NIAID, NIH, Bethesda, USA
Bas Surewaard: Dept of Medical Microbiology, University Medical Centre, Utrecht, Netherlands
Suranjith L. Seneviratne: Department of Clinical Immunology, Royal Free Hospital, London, UK
Dorothy M. Grogono: Department of Medicine, University of Cambridge, UK; Cambridge Centre for Lung Infection, Papworth Hospital, Cambridge, UK
Julien Cachat: Dept. of Pathology and Immunology, Geneva University, Switzerland
Tor Ny: Dept. of Medical Biochemistry and Biophysics, Umea University, Sweden
Arthur Kaser: Department of Medicine, University of Cambridge, UK
M. Estée Török: Department of Medicine, University of Cambridge, UK
Sharon J. Peacock: Department of Medicine, University of Cambridge, UK; Wellcome Trust Sanger Institute, Hinxton, UK
Matthew Holden: Wellcome Trust Sanger Institute, Hinxton, UK
Tom Blundell: Department of Biochemistry, University of Cambridge, UK
Lihui Wang: Biochemistry Department, Oxford University, UK
Petros Ligoxygakis: Biochemistry Department, Oxford University, UK
Liliana Minichiello: Pharmacology Department, Oxford University, UK
C. Geoff Woods: Cambridge Institute for Medical Research, University of Cambridge, UK; Department of Medical Genetics, University of Cambridge, UK
Simon J. Foster: Krebs Institute, University of Sheffield, Western Bank, Sheffield, S10 2TN, UK; Department of Molecular Biology and Biotechnology, University of Sheffield, Western Bank, Sheffield, S10 2TN, UK
Stephen A. Renshaw: Krebs Institute, University of Sheffield, Western Bank, Sheffield, S10 2TN, UK; Bateson Centre, University of Sheffield, Western Bank, Sheffield, S10 2TN, UK; Department of Infection and Immunity, University of Sheffield, Western Bank, Sheffield, S10 2TN, UK
R. Andres Floto: Cambridge Institute for Medical Research, University of Cambridge, UK; Department of Medicine, University of Cambridge, UK; Cambridge Centre for Lung Infection, Papworth Hospital, Cambridge, UK

Berman HM, Kleywegt GJ, Nakamura H, Markley JL. The Protein Data Bank archive as an open data resource. J Comput Aided Mol Des 2014;28:1009-14. [PMID: 25062767 PMCID: PMC4196035 DOI: 10.1007/s10822-014-9770-y] [Citation(s) in RCA: 73] [Impact Index Per Article: 7.3] [Received: 04/30/2014] [Accepted: 06/23/2014] [Indexed: 02/08/2023]

Ma J, Ma Z, Kang B, Lu K. A method of protein model classification and retrieval using bag-of-visual-features. Comput Math Methods Med 2014;2014:269394. [PMID: 25258644 PMCID: PMC4165735 DOI: 10.1155/2014/269394] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 05/20/2014] [Accepted: 07/30/2014] [Indexed: 11/18/2022]

Loong BK, Knotts TA. Communication: Using multiple tethers to stabilize proteins on surfaces. J Chem Phys 2014;141:051104. [DOI: 10.1063/1.4891971] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 12/19/2022]

Musayeva K, Henderson T, Mitchell JB, Mavridis L. PFClust: an optimised implementation of a parameter-free clustering algorithm. Source Code Biol Med 2014;9:5. [PMID: 24490618 PMCID: PMC3940029 DOI: 10.1186/1751-0473-9-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 09/18/2013] [Accepted: 01/28/2014] [Indexed: 11/25/2022]

De Franceschi N, Wild K, Schlacht A, Dacks JB, Sinning I, Filippini F. Longin and GAF domains: structural evolution and adaptation to the subcellular trafficking machinery. Traffic 2013;15:104-21. [PMID: 24107188 DOI: 10.1111/tra.12124] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 04/16/2013] [Revised: 09/18/2013] [Accepted: 09/23/2013] [Indexed: 11/28/2022]

ProtoNet: charting the expanding universe of protein sequences. Nat Biotechnol 2013;31:290-2. [PMID: 23563419 DOI: 10.1038/nbt.2553] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Indexed: 11/09/2022]

Vance SJ, McDonald RE, Cooper A, Smith BO, Kennedy MW. The structure of latherin, a surfactant allergen protein from horse sweat and saliva. J R Soc Interface 2013;10:20130453. [PMID: 23782536 PMCID: PMC4043175 DOI: 10.1098/rsif.2013.0453] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Received: 05/17/2013] [Accepted: 05/29/2013] [Indexed: 12/30/2022]

Abstract

Latherin is a highly surface-active allergen protein found in the sweat and saliva of horses and other equids. Its surfactant activity is intrinsic to the protein in its native form, and is manifest without associated lipids or glycosylation. Latherin probably functions as a wetting agent in evaporative cooling in horses, but it may also assist in mastication of fibrous food as well as inhibition of microbial biofilms. It is a member of the PLUNC family of proteins abundant in the oral cavity and saliva of mammals, one of which has also been shown to be a surfactant and capable of disrupting microbial biofilms. How these proteins work as surfactants while remaining soluble and cell membrane-compatible is not known. Nor have their structures previously been reported. We have used protein nuclear magnetic resonance spectroscopy to determine the conformation and dynamics of latherin in aqueous solution. The protein is a monomer in solution with a slightly curved cylindrical structure exhibiting a 'super-roll' motif comprising a four-stranded anti-parallel β-sheet and two opposing α-helices which twist along the long axis of the cylinder. One end of the molecule has prominent, flexible loops that contain a number of apolar amino acid side chains. This, together with previous biophysical observations, leads us to a plausible mechanism for surfactant activity in which the molecule is first localized to the non-polar interface via these loops, and then unfolds and flattens to expose its hydrophobic interior to the air or non-polar surface. Intrinsically surface-active proteins are relatively rare in nature, and this is the first structure of such a protein from mammals to be reported. Both its conformation and proposed method of action are different from other, non-mammalian surfactant proteins investigated so far.

Mavridis L, Nath N, Mitchell JBO. PFClust: a novel parameter free clustering algorithm. BMC Bioinformatics 2013;14:213. [PMID: 23819480 PMCID: PMC3747858 DOI: 10.1186/1471-2105-14-213] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 01/29/2013] [Accepted: 07/01/2013] [Indexed: 12/02/2022]

Abstract

Background

We present the algorithm PFClust (Parameter Free Clustering), which is able automatically to cluster data and identify a suitable number of clusters to group them into without requiring any parameters to be specified by the user. The algorithm partitions a dataset into a number of clusters that share some common attributes, such as their minimum expectation value and variance of intra-cluster similarity. A set of n objects can be clustered into any number of clusters from one to n, and there are many different hierarchical and partitional, agglomerative and divisive, clustering methodologies available that can be used to do this. Nonetheless, automatically determining the number of clusters present in a dataset constitutes a significant challenge for clustering algorithms. Identifying a putative optimum number of clusters to group the objects into involves computing and evaluating a range of clusterings with different numbers of clusters. However, there is no agreed or unique definition of optimum in this context. Thus, we test PFClust on datasets for which an external gold standard of ‘correct’ cluster definitions exists, noting that this division into clusters may be suboptimal according to other reasonable criteria. PFClust is heuristic in the sense that it cannot be described in terms of optimising any single simply-expressed metric over the space of possible clusterings.

Results

We validate PFClust firstly with reference to a number of synthetic datasets consisting of 2D vectors, showing that its clustering performance is at least equal to that of six other leading methodologies – even though five of the other methods are told in advance how many clusters to use. We also demonstrate the ability of PFClust to classify the three dimensional structures of protein domains, using a set of folds taken from the structural bioinformatics database CATH.

Conclusions

We show that PFClust is able to cluster the test datasets a little better, on average, than any of the other algorithms, and furthermore is able to do this without the need to specify any external parameters. Results on the synthetic datasets demonstrate that PFClust generates meaningful clusters, while our algorithm also shows excellent agreement with the correct assignments for a dataset extracted from the CATH part-manually curated classification of protein domain structures.

Hugo W, Sung WK, Ng SK. Discovering interacting domains and motifs in protein-protein interactions. Methods Mol Biol 2013. [PMID: 23192537 DOI: 10.1007/978-1-62703-107-3_2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 02/13/2023]

Dai Q, Li Y, Liu X, Yao Y, Cao Y, He P. Comparison study on statistical features of predicted secondary structures for protein structural class prediction: From content to position. BMC Bioinformatics 2013;14:152. [PMID: 23641706 PMCID: PMC3652764 DOI: 10.1186/1471-2105-14-152] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 11/26/2012] [Accepted: 04/03/2013] [Indexed: 11/10/2022]

Choi Y, Griswold KE, Bailey-Kellogg C. Structure-based redesign of proteins for minimal T-cell epitope content. J Comput Chem 2013;34:879-91. [PMID: 23299435 PMCID: PMC3763725 DOI: 10.1002/jcc.23213] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 09/08/2012] [Revised: 11/16/2012] [Accepted: 11/28/2012] [Indexed: 12/31/2022]

Rappoport N, Linial M. Functional inference by ProtoNet family tree: the uncharacterized proteome of Daphnia pulex. BMC Bioinformatics 2013;14 Suppl 3:S11. [PMID: 23514195 PMCID: PMC3584848 DOI: 10.1186/1471-2105-14-s3-s11] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/20/2022]

Sadowski MI. Prediction of protein domain boundaries from inverse covariances. Proteins 2013;81:253-60. [PMID: 22987736 PMCID: PMC3563215 DOI: 10.1002/prot.24181] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 05/23/2012] [Revised: 08/10/2012] [Accepted: 09/04/2012] [Indexed: 01/04/2023]

Li J, Wu J, Chen K. PFP-RFSM: Protein fold prediction by using random forests and sequence motifs. J Biomed Sci Eng 2013. [DOI: 10.4236/jbise.2013.612145] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 11/20/2022]

Psomopoulos FE, Mitkas PA. Multi-level clustering of phylogenetic profiles. Int J Artif Intell T 2012. [DOI: 10.1142/s0218213012400234] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/18/2022]

Ritchie DW, Ghoorah AW, Mavridis L, Venkatraman V. Fast protein structure alignment using Gaussian overlap scoring of backbone peptide fragment similarity. Bioinformatics 2012;28:3274-81. [DOI: 10.1093/bioinformatics/bts618] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Indexed: 11/12/2022]

Usifo E, Leigh SEA, Whittall RA, Lench N, Taylor A, Yeats C, Orengo CA, Martin ACR, Celli J, Humphries SE. Low-Density Lipoprotein Receptor Gene Familial Hypercholesterolemia Variant Database: Update and Pathological Assessment. Ann Hum Genet 2012;76:387-401. [DOI: 10.1111/j.1469-1809.2012.00724.x] [Citation(s) in RCA: 159] [Impact Index Per Article: 13.3] [Indexed: 02/04/2023]

Lewis JI, Moss DJ, Knotts TA. Multiple molecule effects on the cooperativity of protein folding transitions in simulations. J Chem Phys 2012;136:245101. [DOI: 10.1063/1.4729604] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 01/25/2023]

Skolnick J, Zhou H, Brylinski M. Further evidence for the likely completeness of the library of solved single domain protein structures. J Phys Chem B 2012;116:6654-64. [PMID: 22272723 DOI: 10.1021/jp211052j] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Indexed: 11/29/2022]