Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Todd AE, Marsden RL, Thornton JM, Orengo CA. Progress of Structural Genomics Initiatives: An Analysis of Solved Target Structures. J Mol Biol 2005;348:1235-60. [PMID: 15854658 DOI: 10.1016/j.jmb.2005.03.037] [Citation(s) in RCA: 103] [Impact Index Per Article: 5.4] [Received: 12/23/2004] [Revised: 02/28/2005] [Accepted: 03/15/2005] [Indexed: 11/27/2022]

Cited by Other Article(s)

Sanchez-Pulido L, Ponting CP. Extending the Horizon of Homology Detection with Coevolution-based Structure Prediction. J Mol Biol 2021;433:167106. [PMID: 34139218 PMCID: PMC8527833 DOI: 10.1016/j.jmb.2021.167106] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/28/2021] [Revised: 06/09/2021] [Accepted: 06/09/2021] [Indexed: 12/12/2022]

Abstract

Traditional sequence analysis algorithms fail to identify distant homologies when they lie beyond a detection horizon. In this review, we discuss how co-evolution-based contact and distance prediction methods are pushing back this homology detection horizon, thereby yielding new functional insights and experimentally testable hypotheses. Based on correlated substitutions, these methods divine three-dimensional constraints among amino acids in protein sequences that were previously devoid of all annotated domains and repeats. The new algorithms discern hidden structure in an otherwise featureless sequence landscape. Their revelatory impact promises to be as profound as the use, by archaeologists, of ground-penetrating radar to discern long-hidden, subterranean structures. As examples of this, we describe how triplicated structures reflecting longin domains in MON1A-like proteins, or UVR-like repeats in DISC1, emerge from their predicted contact and distance maps. These methods also help to resolve structures that do not conform to a "beads-on-a-string" model of protein domains. In one such example, we describe CFAP298 whose ubiquitin-like domain was previously challenging to perceive owing to a large sequence insertion within it. More generally, the new algorithms permit an easier appreciation of domain families and folds whose evolution involved structural insertion or rearrangement. As we exemplify with α1-antitrypsin, coevolution-based predicted contacts may also yield insights into protein dynamics and conformational change. This new combination of structure prediction (using innovative co-evolution based methods) and homology inference (using more traditional sequence analysis approaches) shows great promise for bringing into view a sea of evolutionary relationships that had hitherto lain far beyond the horizon of homology detection.

Bordin N, Sillitoe I, Lees JG, Orengo C. Tracing Evolution Through Protein Structures: Nature Captured in a Few Thousand Folds. Front Mol Biosci 2021;8:668184. [PMID: 34041266 PMCID: PMC8141709 DOI: 10.3389/fmolb.2021.668184] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 02/15/2021] [Accepted: 04/27/2021] [Indexed: 11/13/2022]

Wilson IA, Stanfield RL. 50 Years of structural immunology. J Biol Chem 2021;296:100745. [PMID: 33957119 PMCID: PMC8163984 DOI: 10.1016/j.jbc.2021.100745] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 02/09/2021] [Revised: 03/24/2021] [Accepted: 04/30/2021] [Indexed: 12/12/2022]

The Classification of Protein Domains. Methods Mol Biol 2018;1525:137-164. [PMID: 27896721 DOI: 10.1007/978-1-4939-6622-6_7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 04/05/2023]

Ferreira de Freitas R, Schapira M. A systematic analysis of atomic protein-ligand interactions in the PDB. MedChemComm 2017;8:1970-1981. [PMID: 29308120 PMCID: PMC5708362 DOI: 10.1039/c7md00381a] [Citation(s) in RCA: 232] [Impact Index Per Article: 33.1] [Received: 07/26/2017] [Accepted: 09/15/2017] [Indexed: 12/20/2022]

Murthy T, Wang Y, Reynolds C, Boggon TJ. Automated Protein Crystallization Trials Using the Thermo Scientific Matrix Hydra II eDrop. ACTA ACUST UNITED AC 2016. [DOI: 10.1016/j.jala.2007.04.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 10/23/2022]

Hu J, Han K, Li Y, Yang JY, Shen HB, Yu DJ. TargetCrys: protein crystallization prediction by fusing multi-view features with two-layered SVM. Amino Acids 2016;48:2533-2547. [DOI: 10.1007/s00726-016-2274-4] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Received: 07/30/2015] [Accepted: 06/07/2016] [Indexed: 12/12/2022]

An assessment of the amount of untapped fold level novelty in under-sampled areas of the tree of life. Sci Rep 2015;5:14717. [PMID: 26434770 PMCID: PMC4592975 DOI: 10.1038/srep14717] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 05/27/2015] [Accepted: 09/07/2015] [Indexed: 11/14/2022]

Ofer D, Linial M. ProFET: Feature engineering captures high-level protein functions. Bioinformatics 2015;31:3429-36. [DOI: 10.1093/bioinformatics/btv345] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.1] [Received: 03/04/2015] [Accepted: 05/29/2015] [Indexed: 11/13/2022]

Molloy K, Van MJ, Barbara D, Shehu A. Exploring representations of protein structure for automated remote homology detection and mapping of protein structure space. BMC Bioinformatics 2014;15 Suppl 8:S4. [PMID: 25080993 PMCID: PMC4120149 DOI: 10.1186/1471-2105-15-s8-s4] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 11/10/2022]

Abstract

BACKGROUND

Due to rapid sequencing of genomes, there are now millions of deposited protein sequences with no known function. Fast sequence-based comparisons allow detecting close homologs for a protein of interest to transfer functional information from the homologs to the given protein. Sequence-based comparison cannot detect remote homologs, in which evolution has adjusted the sequence while largely preserving structure. Structure-based comparisons can detect remote homologs, but most methods for doing so are too expensive to apply at a large scale over structural databases of proteins. Recently, fragment-based structural representations have been proposed that allow fast detection of remote homologs with reasonable accuracy. These representations have also been used to obtain linearly-reducible maps of protein structure space. It has been shown, and is additionally supported by the analysis in this paper, that such maps preserve functional co-localization of the protein structure space.

METHODS

Inspired by a recent application of the Latent Dirichlet Allocation (LDA) model for conducting structural comparisons of proteins, we propose higher-order LDA-obtained topic-based representations of protein structures to provide an alternative route for remote homology detection and organization of the protein structure space in a few dimensions. Various techniques based on natural language processing are proposed and employed to aid the analysis of topics in the protein structure domain.

RESULTS

We show that a topic-based representation is just as effective as a fragment-based one at automated detection of remote homologs and organization of protein structure space. We conduct a detailed analysis of the information content in the topic-based representation, showing that topics have semantic meaning. The fragment-based and topic-based representations are also shown to allow prediction of superfamily membership.

CONCLUSIONS

This work opens exciting avenues in designing novel representations to extract information about protein structures, as well as organizing and mining protein structure space with mature text mining tools.

Functional Genomics. ARCHAEA-AN INTERNATIONAL MICROBIOLOGICAL JOURNAL 2014. [DOI: 10.1128/9781555815516.ch20] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 03/27/2023]

Dynamical Aspects of Biomacromolecular Multi-resolution Modelling Using the UltraScan Solution Modeler (US-SOMO) Suite. ACTA ACUST UNITED AC 2013. [DOI: 10.1007/978-94-017-8550-1_13] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1]

Brookes E, Pérez J, Cardinali B, Profumo A, Vachette P, Rocco M. Fibrinogen species as resolved by HPLC-SAXS data processing within the UltraScan Solution Modeler (US-SOMO) enhanced SAS module. J Appl Crystallogr 2013;46:1823-1833. [PMID: 24282333 PMCID: PMC3831300 DOI: 10.1107/s0021889813027751] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.5] [Received: 06/24/2013] [Accepted: 10/09/2013] [Indexed: 12/04/2022]

Abstract

Fibrinogen is a large heterogeneous aggregation/degradation-prone protein playing a central role in blood coagulation and associated pathologies, whose structure is not completely resolved. When a high-molecular-weight fraction was analyzed by size-exclusion high-performance liquid chromatography/small-angle X-ray scattering (HPLC-SAXS), several composite peaks were apparent and because of the stickiness of fibrinogen the analysis was complicated by severe capillary fouling. Novel SAS analysis tools developed as a part of the UltraScan Solution Modeler (US-SOMO; http://somo.uthscsa.edu/), an open-source suite of utilities with advanced graphical user interfaces whose initial goal was the hydrodynamic modeling of biomacromolecules, were implemented and applied to this problem. They include the correction of baseline drift due to the accumulation of material on the SAXS capillary walls, and the Gaussian decomposition of non-baseline-resolved HPLC-SAXS elution peaks. It was thus possible to resolve at least two species co-eluting under the fibrinogen main monomer peak, probably resulting from in-column degradation, and two others under an oligomers peak. The overall and cross-sectional radii of gyration, molecular mass and mass/length ratio of all species were determined using the manual or semi-automated procedures available within the US-SOMO SAS module. Differences between monomeric species and linear and sideways oligomers were thus identified and rationalized. This new US-SOMO version additionally contains several computational and graphical tools, implementing functionalities such as the mapping of residues contributing to particular regions of P(r), and an advanced module for the comparison of primary I(q) versus q data with model curves computed from atomic level structures or bead models. It should be of great help in multi-resolution studies involving hydrodynamics, solution scattering and crystallographic/NMR data.

Lounnas V, Ritschel T, Kelder J, McGuire R, Bywater RP, Foloppe N. Current progress in Structure-Based Rational Drug Design marks a new mindset in drug discovery. Comput Struct Biotechnol J 2013;5:e201302011. [PMID: 24688704 PMCID: PMC3962124 DOI: 10.5936/csbj.201302011] [Citation(s) in RCA: 117] [Impact Index Per Article: 10.6] [Received: 10/31/2012] [Revised: 01/26/2013] [Accepted: 02/08/2013] [Indexed: 12/20/2022]

Tiwari MK, Singh R, Singh RK, Kim IW, Lee JK. Computational approaches for rational design of proteins with novel functionalities. Comput Struct Biotechnol J 2012;2:e201209002. [PMID: 24688643 PMCID: PMC3962203 DOI: 10.5936/csbj.201209002] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.6] [Received: 05/31/2012] [Revised: 08/17/2012] [Accepted: 08/23/2012] [Indexed: 11/22/2022]

Jamroz M, Kolinski A, Kihara D. Structural features that predict real-value fluctuations of globular proteins. Proteins 2012;80:1425-35. [PMID: 22328193 DOI: 10.1002/prot.24040] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 12/02/2011] [Revised: 01/03/2012] [Accepted: 01/11/2012] [Indexed: 12/20/2022]

Sael L, Chitale M, Kihara D. Structure- and sequence-based function prediction for non-homologous proteins. ACTA ACUST UNITED AC 2012;13:111-23. [PMID: 22270458 DOI: 10.1007/s10969-012-9126-6] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 09/12/2011] [Accepted: 01/10/2012] [Indexed: 01/14/2023]

Mullins JGL. Structural modelling pipelines in next generation sequencing projects. Adv Protein Chem Struct Biol 2012;89:117-67. [PMID: 23046884 DOI: 10.1016/b978-0-12-394287-6.00005-7] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Indexed: 02/06/2023]

La D, Kihara D. A novel method for protein-protein interaction site prediction using phylogenetic substitution models. Proteins 2011;80:126-41. [PMID: 21989996 DOI: 10.1002/prot.23169] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Received: 01/08/2011] [Revised: 07/07/2011] [Accepted: 08/17/2011] [Indexed: 11/10/2022]

Xie L, Xie L, Bourne PE. Structure-based systems biology for analyzing off-target binding. Curr Opin Struct Biol 2011;21:189-99. [PMID: 21292475 PMCID: PMC3070778 DOI: 10.1016/j.sbi.2011.01.004] [Citation(s) in RCA: 110] [Impact Index Per Article: 8.5] [Received: 11/10/2010] [Revised: 01/11/2011] [Accepted: 01/13/2011] [Indexed: 12/24/2022]

Lee D, de Beer TAP, Laskowski RA, Thornton JM, Orengo CA. 1,000 structures and more from the MCSG. BMC Struct Biol 2011;11:2. [PMID: 21219649 PMCID: PMC3024214 DOI: 10.1186/1472-6807-11-2] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Received: 09/16/2010] [Accepted: 01/10/2011] [Indexed: 11/10/2022]

Sael L, Kihara D. Improved protein surface comparison and application to low-resolution protein structure data. BMC Bioinformatics 2010;11 Suppl 11:S2. [PMID: 21172052 PMCID: PMC3024873 DOI: 10.1186/1471-2105-11-s11-s2] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Indexed: 11/21/2022]

Abstract

Background

Recent advancements in experimental techniques for determining protein tertiary structures raise significant challenges for protein bioinformatics. With the number of known structures of unknown function expanding at a rapid pace, an urgent task is to provide reliable clues to their biological function on a large scale. Conventional approaches for structure comparison are not suitable for a real-time database search due to their slow speed. Moreover, a new challenge has arisen from recent techniques such as electron microscopy (EM), which provide low-resolution structure data. Previously, we have introduced a method for protein surface shape representation using the 3D Zernike descriptors (3DZDs). The 3DZD enables fast structure database searches, taking advantage of its rotation invariance and compact representation. The search results for protein surfaces represented with the 3DZD have shown good agreement with the existing structure classifications, but some discrepancies were also observed.

Results

Three surface representations are examined: a new representation of backbone atoms, the originally devised all-atom surface representation, and the combination of the all-atom surface with the backbone representation. All representations are encoded with the 3DZD. Also, we have investigated the applicability of the 3DZD for searching protein EM density maps of varying resolutions. The surface representations are evaluated on structure retrieval using two existing classifications, SCOP and the CE-based classification.

Conclusions

Overall, the 3DZDs representing backbone atoms show better retrieval performance than the original all-atom surface representation. The performance improved further when the two representations were combined. Moreover, we observed that the 3DZD is also powerful in comparing low-resolution structures obtained by electron microscopy.

Brylinski M, Skolnick J. FINDSITE-metal: integrating evolutionary information and machine learning for structure-based metal-binding site prediction at the proteome level. Proteins 2010;79:735-51. [PMID: 21287609 DOI: 10.1002/prot.22913] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.9] [Received: 08/04/2010] [Revised: 09/27/2010] [Accepted: 10/07/2010] [Indexed: 12/13/2022]

Cuff AL, Sillitoe I, Lewis T, Clegg AB, Rentzsch R, Furnham N, Pellegrini-Calace M, Jones D, Thornton J, Orengo CA. Extending CATH: increasing coverage of the protein structure universe and linking structure with function. Nucleic Acids Res 2010;39:D420-6. [PMID: 21097779 PMCID: PMC3013636 DOI: 10.1093/nar/gkq1001] [Citation(s) in RCA: 118] [Impact Index Per Article: 8.4] [Indexed: 12/02/2022]

Doppelt-Azeroual O, Delfaud F, Moriaud F, de Brevern AG. Fast and automated functional classification with MED-SuMo: an application on purine-binding proteins. Protein Sci 2010;19:847-67. [PMID: 20162627 DOI: 10.1002/pro.364] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 11/10/2022]

Yang YD, Spratt P, Chen H, Park C, Kihara D. Sub-AQUA: real-value quality assessment of protein structure models. Protein Eng Des Sel 2010;23:617-32. [PMID: 20525730 DOI: 10.1093/protein/gzq030] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Indexed: 12/22/2022]

Abstract

Computational protein tertiary structure prediction has made significant progress over the past years. However, most of the existing structure prediction methods are not equipped with functionality to predict the accuracy of constructed models. Knowing the accuracy of a structure model is crucial for its practical use, since the accuracy determines potential applications of the model. Here we have developed quality assessment methods, which predict real values of the global and local quality of protein structure models. The global quality of a model is defined as the root mean square deviation (RMSD) and the LGA score to its native structure. The local quality is defined as the distance between the corresponding Cα positions of a model and its native structure when they are superimposed. Three regression methods are employed to combine different types of quality assessment measures of models, including alignment-level scores, residue-position level scores, atomic-detailed structure level scores and composite scores. The regression models were tested on a large benchmark data set of template-based protein structure models of various qualities. In predicting RMSD and the LGA score, a combination of two terms, length-normalized SPAD, a score that assesses alignment stability by considering suboptimal alignments, and Verify3D normalized by the square of the model length, shows a significant performance, achieving 97.1 and 83.6% accuracy in identifying models with an RMSD of <2 and 6 Å, respectively. For predicting the local quality of models, we find that a two-step approach, in which the global RMSD predicted in the first step is further combined with the other terms, can dramatically increase the accuracy. Finally, the developed regression equations are applied to assess the quality of structure models of the whole E. coli proteome.

Schmidt am Busch M, Sedano A, Simonson T. Computational protein design: validation and possible relevance as a tool for homology searching and fold recognition. PLoS One 2010;5:e10410. [PMID: 20463972 PMCID: PMC2864755 DOI: 10.1371/journal.pone.0010410] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 10/15/2009] [Accepted: 03/31/2010] [Indexed: 11/19/2022]

Abstract

BACKGROUND

Protein fold recognition usually relies on a statistical model of each fold; each model is constructed from an ensemble of natural sequences belonging to that fold. A complementary strategy may be to employ sequence ensembles produced by computational protein design. Designed sequences can be more diverse than natural sequences, possibly avoiding some limitations of experimental databases.

METHODOLOGY/PRINCIPAL FINDINGS

We explore this strategy for four SCOP families: Small Kunitz-type inhibitors (SKIs), Interleukin-8 chemokines, PDZ domains, and large Caspase catalytic subunits, represented by 43 structures. An automated procedure is used to redesign the 43 proteins. We use the experimental backbones as fixed templates in the folded state and a molecular mechanics model to compute the interaction energies between sidechain and backbone groups. Calculations are done with the Proteins@Home volunteer computing platform. A heuristic algorithm is used to scan the sequence and conformational space, yielding 200,000-300,000 sequences per backbone template. The results confirm and generalize our earlier study of SH2 and SH3 domains. The designed sequences resemble moderately distant natural homologues of the initial templates; e.g., the SUPERFAMILY profile Hidden Markov Model library recognizes 85% of the low-energy sequences as native-like. Conversely, Position Specific Scoring Matrices derived from the sequences can be used to detect natural homologues within the SwissProt database: 60% of known PDZ domains are detected and around 90% of known SKIs and chemokines. Energy components and inter-residue correlations are analyzed and ways to improve the method are discussed.

CONCLUSIONS/SIGNIFICANCE

For some families, designed sequences can be a useful complement to experimental ones for homologue searching. However, improved tools are needed to extract more information from the designed profiles before the method can be of general use.

Cuff A, Redfern OC, Greene L, Sillitoe I, Lewis T, Dibley M, Reid A, Pearl F, Dallman T, Todd A, Garratt R, Thornton J, Orengo C. The CATH hierarchy revisited-structural divergence in domain superfamilies and the continuity of fold space. Structure 2010;17:1051-62. [PMID: 19679085 PMCID: PMC2741583 DOI: 10.1016/j.str.2009.06.015] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Received: 07/21/2008] [Revised: 06/24/2009] [Accepted: 06/25/2009] [Indexed: 11/29/2022]

'Unknown' proteins and 'orphan' enzymes: the missing half of the engineering parts list--and how to find it. Biochem J 2009;425:1-11. [PMID: 20001958 DOI: 10.1042/bj20091328] [Citation(s) in RCA: 135] [Impact Index Per Article: 9.0] [Indexed: 11/17/2022]

Chandra N. Computational systems approach for drug target discovery. Expert Opin Drug Discov 2009;4:1221-36. [DOI: 10.1517/17460440903380422] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 11/05/2022]

Grabowski M, Chruszcz M, Zimmerman MD, Kirillova O, Minor W. Benefits of structural genomics for drug discovery research. Infect Disord Drug Targets 2009;9:459-74. [PMID: 19594422 PMCID: PMC2866842 DOI: 10.2174/187152609789105704] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 04/21/2009] [Accepted: 06/15/2009] [Indexed: 11/22/2022]

am Busch MS, Mignon D, Simonson T. Computational protein design as a tool for fold recognition. Proteins 2009;77:139-58. [PMID: 19408297 DOI: 10.1002/prot.22426] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.7] [Indexed: 12/13/2022]

The key role of genomics in modern vaccine and drug design for emerging infectious diseases. PLoS Genet 2009;5:e1000612. [PMID: 19855822 PMCID: PMC2752168 DOI: 10.1371/journal.pgen.1000612] [Citation(s) in RCA: 86] [Impact Index Per Article: 5.7] [Indexed: 12/03/2022]

Rinaudo CD, Telford JL, Rappuoli R, Seib KL. Vaccinology in the genome era. J Clin Invest 2009;119:2515-25. [PMID: 19729849 DOI: 10.1172/jci38330] [Citation(s) in RCA: 115] [Impact Index Per Article: 7.7] [Indexed: 02/03/2023]

Dessailly BH, Nair R, Jaroszewski L, Fajardo JE, Kouranov A, Lee D, Fiser A, Godzik A, Rost B, Orengo C. PSI-2: structural genomics to cover protein domain family space. Structure 2009;17:869-81. [PMID: 19523904 DOI: 10.1016/j.str.2009.03.015] [Citation(s) in RCA: 106] [Impact Index Per Article: 7.1] [Received: 11/26/2008] [Revised: 03/18/2009] [Accepted: 03/22/2009] [Indexed: 11/25/2022]

Ausiello G, Gherardini PF, Gatti E, Incani O, Helmer-Citterich M. Structural motifs recurring in different folds recognize the same ligand fragments. BMC Bioinformatics 2009;10:182. [PMID: 19527512 PMCID: PMC2704211 DOI: 10.1186/1471-2105-10-182] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 01/30/2009] [Accepted: 06/15/2009] [Indexed: 12/11/2022]

Potential for protein surface shape analysis using spherical harmonics and 3D Zernike descriptors. Cell Biochem Biophys 2009;54:23-32. [PMID: 19521674 DOI: 10.1007/s12013-009-9051-x] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.1] [Received: 03/27/2009] [Accepted: 05/22/2009] [Indexed: 10/20/2022]

Peterson ME, Chen F, Saven JG, Roos DS, Babbitt PC, Sali A. Evolutionary constraints on structural similarity in orthologs and paralogs. Protein Sci 2009;18:1306-15. [PMID: 19472362 PMCID: PMC2774440 DOI: 10.1002/pro.143] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.0] [Received: 12/09/2008] [Revised: 03/29/2009] [Accepted: 03/30/2009] [Indexed: 11/10/2022]

Tartaglia GG, Pechmann S, Dobson CM, Vendruscolo M. A relationship between mRNA expression levels and protein solubility in E. coli. J Mol Biol 2009;388:381-9. [PMID: 19281824 DOI: 10.1016/j.jmb.2009.03.002] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.9] [Received: 07/25/2008] [Revised: 02/26/2009] [Accepted: 03/03/2009] [Indexed: 10/21/2022]

Nicola G, Smith CA, Abagyan R. New method for the assessment of all drug-like pockets across a structural genome. J Comput Biol 2008;15:231-40. [PMID: 18333758 DOI: 10.1089/cmb.2007.0178] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 11/12/2022]

Redfern OC, Dessailly B, Orengo CA. Exploring the structure and function paradigm. Curr Opin Struct Biol 2008;18:394-402. [PMID: 18554899 DOI: 10.1016/j.sbi.2008.05.007] [Citation(s) in RCA: 84] [Impact Index Per Article: 5.3] [Received: 03/18/2008] [Revised: 04/16/2008] [Accepted: 05/07/2008] [Indexed: 11/29/2022]

Target selection for structural genomics: an overview. Methods Mol Biol 2008;426:3-25. [PMID: 18542854 DOI: 10.1007/978-1-60327-058-8_1] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.7] [Indexed: 02/11/2023]

Hunjan J, Tovchigrechko A, Gao Y, Vakser IA. The size of the intermolecular energy funnel in protein-protein interactions. Proteins 2008;72:344-52. [PMID: 18214966 DOI: 10.1002/prot.21930] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Indexed: 12/14/2022]

Powers R, Mercier KA, Copeland JC. The application of FAST-NMR for the identification of novel drug discovery targets. Drug Discov Today 2008;13:172-9. [PMID: 18275915 DOI: 10.1016/j.drudis.2007.11.001] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Received: 08/02/2007] [Revised: 10/30/2007] [Accepted: 11/01/2007] [Indexed: 10/22/2022]

An approach to quality management in structural biology: Biophysical selection of proteins for successful crystallization. J Struct Biol 2008;162:451-9. [DOI: 10.1016/j.jsb.2008.03.007] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Received: 11/01/2007] [Revised: 03/05/2008] [Accepted: 03/06/2008] [Indexed: 11/23/2022]

Ward RM, Erdin S, Tran TA, Kristensen DM, Lisewski AM, Lichtarge O. De-orphaning the structural proteome through reciprocal comparison of evolutionarily important structural features. PLoS One 2008;3:e2136. [PMID: 18461181 PMCID: PMC2362850 DOI: 10.1371/journal.pone.0002136] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Received: 03/13/2008] [Accepted: 03/25/2008] [Indexed: 12/01/2022]

Xie L, Bourne PE. Detecting evolutionary relationships across existing fold space, using sequence order-independent profile-profile alignments. Proc Natl Acad Sci U S A 2008;105:5441-6. [PMID: 18385384 PMCID: PMC2291117 DOI: 10.1073/pnas.0704422105] [Citation(s) in RCA: 209] [Impact Index Per Article: 13.1] [Received: 05/11/2007] [Indexed: 11/18/2022]

Overton IM, Padovani G, Girolami MA, Barton GJ. ParCrys: a Parzen window density estimation approach to protein crystallization propensity prediction. Bioinformatics 2008;24:901-7. [DOI: 10.1093/bioinformatics/btn055] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.4] [Indexed: 11/13/2022]

Piedra D, Lois S, de la Cruz X. Preservation of protein clefts in comparative models. BMC Struct Biol 2008;8:2. [PMID: 18199319 PMCID: PMC2249585 DOI: 10.1186/1472-6807-8-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 05/29/2007] [Accepted: 01/16/2008] [Indexed: 11/29/2022]

Abstract

BACKGROUND

Comparative, or homology, modelling of protein structures is the most widely used prediction method when the target protein has homologues of known structure. Given that the quality of a model may vary greatly, several studies have been devoted to identifying the factors that influence modelling results. These studies usually consider the protein as a whole, and only a few provide a separate discussion of the behaviour of biologically relevant features of the protein. Given the value of the latter for many applications, here we extended previous work by analysing the preservation of native protein clefts in homology models. We chose to examine clefts because of their role in protein function and structure: they are usually the locus of protein-protein interactions, host the active sites of enzymes, or, in the case of protein domains, can be the locus of the domain-domain interactions that give rise to the structure of the whole protein.

RESULTS

We studied how the largest cleft of a protein varies in comparative models. To this end, we analysed a set of 53,507 homology models that cover the whole sequence identity range, with a special emphasis on medium and low similarities. More precisely, we examined how cleft quality, measured using six complementary parameters related to both global shape and local atomic environment, depends on the sequence identity between target and template proteins. In addition to this general analysis, we also explored the impact of a number of factors on cleft quality, and found that the relationship between quality and sequence identity varies depending on the rank of the cleft amongst the set of protein clefts (when ordered according to size) and on the number of aligned residues.

CONCLUSION

We have examined cleft quality in homology models at a range of sequence identity levels. Our results provide a detailed view of how quality is affected by distinct parameters and thus may help users of comparative modelling to determine the final quality and applicability of their cleft models. In addition, the large variability in model quality that we observed within each sequence identity bin, with good models present even at low sequence identities (between 20% and 30%), indicates that properly developed identification methods could be used to recover good cleft models in this sequence identity range.

Kristensen DM, Ward RM, Lisewski AM, Erdin S, Chen BY, Fofanov VY, Kimmel M, Kavraki LE, Lichtarge O. Prediction of enzyme function based on 3D templates of evolutionarily important amino acids. BMC Bioinformatics 2008;9:17. [PMID: 18190718 PMCID: PMC2219985 DOI: 10.1186/1471-2105-9-17] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.9] [Received: 06/11/2007] [Accepted: 01/11/2008] [Indexed: 11/10/2022]

Abstract

BACKGROUND

Structural genomics projects such as the Protein Structure Initiative (PSI) yield many new structures, but often these have no known molecular functions. One approach to recover this information is to use 3D templates: structure-function motifs that consist of a few functionally critical amino acids and may suggest functional similarity when geometrically matched to other structures. Since experimentally determined functional sites are not common enough to define 3D templates on a large scale, this work tests a computational strategy to select relevant residues for 3D templates.

RESULTS

Based on evolutionary information and heuristics, an Evolutionary Trace Annotation (ETA) pipeline built templates for 98 enzymes, half taken from the PSI, and sought matches in a non-redundant structure database. On average each template matched 2.7 distinct proteins, of which 2.0 share the same first three Enzyme Commission digits as the template's enzyme of origin. In many cases (61%) a single most likely function could be predicted as the annotation with the most matches, and in these cases such a plurality vote identified the correct function with 87% accuracy. ETA was also found to be complementary to sequence homology-based annotations. When matches are required both to geometrically match the 3D template and to be sequence homologs found by BLAST or PSI-BLAST, the annotation accuracy is greater than that of either method alone, especially in the region of lower sequence identity where homology-based annotations are least reliable.

CONCLUSION

These data suggest that knowledge of evolutionarily important residues improves functional annotation among distant enzyme homologs. Since, unlike other 3D template approaches, the ETA method bypasses the need for experimental knowledge of the catalytic mechanism, it should prove a useful, large-scale, and general adjunct to combine with other methods to decipher protein function in the structural proteome.
