Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Total Articles

382
(from Reference Citation Analysis)

Article PDFs (99)

Cited by > 0 (348)

Searched Name

Adam Godzik

Results Analysis

Number	Citation Analysis
351	Fetrow JS, Godzik A, Skolnick J. Functional analysis of the Escherichia coli genome using the sequence-to-structure-to-function paradigm: identification of proteins exhibiting the glutaredoxin/thioredoxin disulfide oxidoreductase activity. J Mol Biol 1998;282:703-11. [PMID: 9743619 DOI: 10.1006/jmbi.1998.2061] [Citation(s) in RCA: 80] [Impact Index Per Article: 3.1] [Indexed: 11/22/2022] Abstract: The application of an automated method for the screening of protein activity based on the sequence-to-structure-to-function paradigm is presented for the complete Escherichia coli genome. First, the structure of the protein is identified from its sequence using a threading algorithm, which aligns the sequences to the best matching structure in a structural database and extends sequence analysis well beyond the limits of local sequence identity. Then, the active site is identified in the resulting sequence-to-structure alignment using a "fuzzy functional form" (FFF), a three-dimensional descriptor of the active site of a protein. Here, this sequence-to-structure-to-function concept is applied to analysis of the complete E. coli genome, i.e. all E. coli open reading frames (ORFs) are screened for the thiol-disulfide oxidoreductase activity of the glutaredoxin/thioredoxin protein family. We show that the method can identify the active sites in ten sequences that are known to or proposed to exhibit this activity. Furthermore, oxidoreductase activity is predicted in two other sequences that have not been identified previously. This method distinguishes protein pairs with similar active sites from protein pairs that are just topological cousins, i.e. those having similar global folds, but not necessarily similar active sites. Thus, this method provides a novel approach for extraction of active site and functional information based on three-dimensional structures, rather than simple sequence analysis. Prediction of protein activity is fully automated and easily extendible to new functions. Finally, it is demonstrated here that the method can be applied to complete genome database analysis. MeSH Headings: Algorithms; Automation; Binding Sites; Databases, Factual; Escherichia coli/chemistry; Escherichia coli/enzymology; Escherichia coli/genetics; Genome, Bacterial; Glutaredoxins; Open Reading Frames/genetics; Oxidoreductases; Protein Conformation; Protein Disulfide Reductase (Glutathione)/chemistry; Protein Disulfide Reductase (Glutathione)/genetics; Protein Disulfide Reductase (Glutathione)/metabolism; Protein Folding; Proteins/chemistry; Proteins/genetics; Proteins/metabolism; Sequence Alignment; Software; Structure-Activity Relationship; Thioredoxins/chemistry; Thioredoxins/genetics; Thioredoxins/metabolism.
352	Rychlewski L, Zhang B, Godzik A. Fold and function predictions for Mycoplasma genitalium proteins. Folding & Design 1998;3:229-38. [PMID: 9710568 DOI: 10.1016/s1359-0278(98)00034-0] [Citation(s) in RCA: 79] [Impact Index Per Article: 3.0] [Indexed: 02/08/2023] Abstract: BACKGROUND: Uncharacterized proteins from newly sequenced genomes provide perfect targets for fold and function prediction. RESULTS: For 38% of the entire genome of Mycoplasma genitalium, sequence similarity to a protein with a known structure can be recognized using a new sequence alignment algorithm. When comparing genomes of M. genitalium and Escherichia coli, > 80% of M. genitalium proteins have a significant sequence similarity to a protein in E. coli and there are > 40 examples that have not been recognized before. For all cases of proteins with significant profile similarities, there are strong analogies in their functions, if the functions of both proteins are known. The results presented here and other recent results strongly support the argument that such proteins are actually homologous. Assuming this homology allows one to make tentative functional assignments for > 50 previously uncharacterized proteins, including such intriguing cases as the putative beta-lactam antibiotic resistance protein in M. genitalium. CONCLUSIONS: Using a new profile-to-profile alignment algorithm, the three-dimensional fold can be predicted for almost 40% of proteins from a genome of the small bacterium M. genitalium, and tentative function can be assigned to almost 80% of the entire genome. Some predictions lead to new insights about known functions or point to hitherto unexpected features of M. genitalium. MeSH Headings: Bacterial Proteins/chemistry; Databases, Factual; Escherichia coli/chemistry; Genome, Bacterial; Models, Molecular; Mycoplasma/chemistry; Open Reading Frames/genetics; Protein Folding; Sequence Homology, Amino Acid; Structure-Activity Relationship. Grants: GM48835 NIGMS NIH HHS.
353	Fetrow JS, Godzik A. Function driven protein evolution. A possible proto-protein for the RNA-binding proteins. Pacific Symposium on Biocomputing 1998:485-96. [PMID: 9697206] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/08/2023] Abstract: We introduce a hypothesis that present day proteins evolved from "proto-proteins," small 15-20 residue peptides with some elements of secondary structure and primitive function. Increasingly stable and functional proteins arose by adding structural elements to produce the small domains or protein modules that we would recognize today. From this point of view, the surprising similarities between small structural fragments of large proteins, that are usually taken as examples of convergent, function-driven evolution, are interpreted in exactly the opposite way: as traces of common evolutionary origin. As an example, a hypothetical evolutionary tree for two families of RNA binding proteins, the OB fold, a family of all beta proteins, and RBD fold, an alpha/beta protein family is presented. We argue that both protein families could have evolved from the same RNA-binding proto-protein, which had a form of beta-loop-beta RNA binding motif. MeSH Headings: Amino Acid Sequence; Bacteria; Binding Sites; Computer Graphics; Computer Simulation; DNA-Binding Proteins/chemistry; DNA-Binding Proteins/genetics; DNA-Binding Proteins/metabolism; Evolution, Molecular; Humans; Models, Genetic; Models, Molecular; Molecular Sequence Data; Protein Folding; Protein Structure, Secondary; RNA-Binding Proteins/chemistry; RNA-Binding Proteins/genetics; RNA-Binding Proteins/metabolism; Sequence Alignment; Sequence Homology, Amino Acid.
354	Jaroszewski L, Rychlewski L, Zhang B, Godzik A. Fold prediction by a hierarchy of sequence, threading, and modeling methods. Protein Sci 1998;7:1431-40. [PMID: 9655348 PMCID: PMC2144032 DOI: 10.1002/pro.5560070620] [Citation(s) in RCA: 78] [Impact Index Per Article: 3.0] [Indexed: 11/07/2022] Abstract: Several fold recognition algorithms are compared to each other in terms of prediction accuracy and significance. It is shown that on standard benchmarks, hybrid methods, which combine scoring based on sequence-sequence and sequence-structure matching, surpass both sequence and threading methods in the number of accurate predictions. However, the sequence similarity contributes most to the prediction accuracy. This strongly argues that most examples of apparently nonhomologous proteins with similar folds are actually related by evolution. While disappointing from the perspective of the fundamental understanding of protein folding, this adds a new significance to fold recognition methods as a possible first step in function prediction. Despite hybrid methods being more accurate at fold prediction than either the sequence or threading methods, each of the methods is correct in some cases where others have failed. This partly reflects a different perspective on sequence/structure relationship embedded in various methods. To combine predictions from different methods, estimates of significance of predictions are made for all methods. With the help of such estimates, it is possible to develop a "jury" method, which has accuracy higher than any of the single methods. Finally, building full three-dimensional models for all top predictions helps to eliminate possible false positives where alignments, which are optimal in the one-dimensional sequences, lead to unsolvable sterical conflicts for the full three-dimensional models. MeSH Headings: Algorithms; Chemical Phenomena; Chemistry, Physical; Databases, Factual; Models, Molecular; Protein Conformation; Protein Folding; Protein Structure, Secondary; Proteins/chemistry; Sensitivity and Specificity; Sequence Alignment. Grants: GM48835 NIGMS NIH HHS.
355	Zhang B, Jaroszewski L, Rychlewski L, Godzik A. Similarities and differences between nonhomologous proteins with similar folds: evaluation of threading strategies. Folding & Design 1998;2:307-17. [PMID: 9377714 DOI: 10.1016/s1359-0278(97)00042-4] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.7] [Indexed: 02/05/2023] Abstract: BACKGROUND: There are many pairs and groups of proteins with similar folds and interaction patterns, but whose sequence similarity is below the threshold of easily recognizable sequence homology. The existence of multiple sequence solutions for a given fold has inspired fold prediction methods in which structural information from one protein is used to estimate the energy of another, putatively similar, structure. RESULTS: A set of 68 pairs of proteins with similar folds and sequence identity in the 8-30% range is identified from the literature. For each pair, the energy of one protein, calculated using knowledge-based statistical potentials, is compared to the estimated energy, calculated with the same potentials but using the structural information (burial status and interaction pattern) of another protein with the same fold. Different energy estimates, corresponding to approximations used in various fold recognition algorithms, are calculated and compared to each other, as well as to the correct energy. It is shown that the local energy terms, based on burial and secondary structure preferences, can be reliably estimated with an accuracy close to 70%. At the same time, the two-body nonlocal energy loses over 60% of its value due to the repacking of the structure. Further approximations, such as the 'frozen approximation', can bring it to an essentially random value. CONCLUSIONS: Local energy terms could be used safely to improve fold recognition algorithms. To utilize pair interaction information, specially designed pair potentials and/or a self-consistent description of pair interactions is necessary. MeSH Headings: Algorithms; Amino Acid Sequence; Databases, Factual; Models, Molecular; Models, Theoretical; Molecular Sequence Data; Protein Structure, Secondary; Sequence Alignment; Software; Thermodynamics. Grants: GM48835 NIGMS NIH HHS.
356	Kolinski A, Skolnick J, Godzik A. An algorithm for prediction of structural elements in small proteins. Pacific Symposium on Biocomputing 1997:446-60. [PMID: 9390250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023] Abstract: A method for predicting the location of surface loops/turns and assigning the intervening secondary structure of the transglobular linkers in small, single domain globular proteins has been developed. Application to a set of 10 proteins of known structure indicates a high level of accuracy. The secondary structure assignment in the center of transglobular connections is correct in more than 85% of the cases. A similar error rate is found for loops. Since more global information about the fold is provided, it is complementary to standard secondary structure prediction approaches. Consequently, it may be useful in early stages of tertiary structure prediction when establishment of the structural class and possible folding topologies is of interest. MeSH Headings: Algorithms; Amino Acid Sequence; Bacterial Proteins/chemistry; Computational Biology; Computer Simulation; Models, Molecular; Molecular Sequence Data; Monte Carlo Method; Protein Folding; Protein Structure, Secondary; Protein Structure, Tertiary; Proteins/chemistry; Reproducibility of Results; Sequence Alignment. Grants: GM-37408 NIGMS NIH HHS.
357	Rychlewski L, Godzik A. Secondary structure prediction using segment similarity. Protein Engineering 1997;10:1143-53. [PMID: 9488139 DOI: 10.1093/protein/10.10.1143] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.5] [Indexed: 02/06/2023] Abstract: We present a secondary structure prediction method based on finding similarities between sequence segments from the target sequence and segments contained in the database of proteins with known structures. The similarity definition is optimized using a genetic algorithm and is based on a 21 x 40 similarity matrix, comparing a target sequence with the sequence and burial status of the proteins from the database. The three-state secondary structure prediction accuracy reaches 72.4% on a non-homologous (maximum sequence identity <25%) data set derived from PDB and is reproduced on two independent testing sets, including the set of CASP2 prediction targets and a group of newly solved PDB structures. The prediction method was developed with simplicity and open architecture in mind, allowing for an easy extension to other types of predictions and to the analysis of the contributions to the local structure formation. For instance, the design of the prediction procedure allows us to trace back segments of the database that contributed to the prediction. It can be shown that those segments came from various structural classes and that even complete exclusion of related folds from the database does not result in a significant decrease in prediction accuracy. MeSH Headings: Databases, Factual; Forecasting; Models, Chemical; Protein Structure, Secondary; Proteins/chemistry; Reproducibility of Results; Sequence Alignment; Sequence Homology, Amino Acid. Grants: GM48835 NIGMS NIH HHS.
358	Godzik A. Counting and classifying possible protein folds. Trends Biotechnol 1997. [DOI: 10.1016/s0167-7799(97)01030-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.1] [Indexed: 11/28/2022]
359	Hu WP, Godzik A, Skolnick J. Sequence-structure specificity--how does an inverse folding approach work? Protein Engineering 1997;10:317-31. [PMID: 9194156 DOI: 10.1093/protein/10.4.317] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.2] [Indexed: 02/04/2023] Abstract: The inverse folding approach is a powerful tool in protein structure prediction when the native state of a sequence adopts one of the known protein folds. This is because some proteins show strong sequence-structure specificity in inverse folding experiments that allow gaps and insertions in the sequence-structure alignment. In those cases when structures similar to their native folds are included in the structure database, the z-scores (which measure the sequence-structure specificity) of these folds are well separated from those of other alternative structures. In this paper, we seek to understand the origin of this sequence-structure specificity and to identify how the specificity arises on passing from a short peptide chain to the entire protein sequence. To accomplish this objective, a simplified version of inverse folding, gapless inverse folding, is performed using sequence fragments of different sizes from 53 proteins. The results indicate that usually a significant portion of the entire protein sequence is necessary to show sequence-structure specificity, but there are regions in the sequence that begin to show this specificity at relatively short fragment size (15-20 residues). An island picture, in which the regions in the sequence that recognize their own native structure grow from some seed fragments, is observed as the fragment size increases. Usually, more similar structures to the native states are found in the top-scoring structural fragments in these high-specificity regions. MeSH Headings: Amino Acid Sequence; Models, Chemical; Peptide Mapping/methods; Protein Conformation; Protein Folding; Protein Structure, Secondary; Structure-Activity Relationship. Grants: GM-48835 NIGMS NIH HHS.
360	Skolnick J, Jaroszewski L, Kolinski A, Godzik A. Derivation and testing of pair potentials for protein folding. When is the quasichemical approximation correct? Protein Sci 1997;6:676-88. [PMID: 9070450 PMCID: PMC2143667 DOI: 10.1002/pro.5560060317] [Citation(s) in RCA: 152] [Impact Index Per Article: 5.6] [Indexed: 02/03/2023] Abstract: Many existing derivations of knowledge-based statistical pair potentials invoke the quasichemical approximation to estimate the expected side-chain contact frequency if there were no amino acid pair-specific interactions. At first glance, the quasichemical approximation that treats the residues in a protein as being disconnected and expresses the side-chain contact probability as being proportional to the product of the mole fractions of the pair of residues would appear to be rather severe. To investigate the validity of this approximation, we introduce two new reference states in which no specific pair interactions between amino acids are allowed, but in which the connectivity of the protein chain is retained. The first estimates the expected number of side-chain contacts by treating the protein as a Gaussian random coil polymer. The second, more realistic reference state includes the effects of chain connectivity, secondary structure, and chain compactness by estimating the expected side-chain contact probability by placing the sequence of interest in each member of a library of structures of comparable compactness to the native conformation. The side-chain contact maps are not allowed to readjust to the sequence of interest, i.e., the side chains cannot repack. This situation would hold rigorously if all amino acids were the same size. Both reference states effectively permit the factorization of the side-chain contact probability into sequence-dependent and structure-dependent terms. Then, because the sequence distribution of amino acids in proteins is random, the quasichemical approximation to each of these reference states is shown to be excellent. Thus, the range of validity of the quasichemical approximation is determined by the magnitude of the side-chain repacking term, which is, at present, unknown. Finally, the performance of these two sets of pair interaction potentials as well as side-chain contact fraction-based interaction scales is assessed by inverse folding tests both without and with allowing for gaps. MeSH Headings: Models, Chemical; Protein Folding. Grants: GM-48835 NIGMS NIH HHS.
361	Kolinski A, Skolnick J, Godzik A, Hu WP. A method for the prediction of surface "U"-turns and transglobular connections in small proteins. Proteins 1997;27:290-308. [PMID: 9061792] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2023] Abstract: A simple method for predicting the location of surface loops/turns that change the overall direction of the chain, that is, "U" turns, and assigning the dominant secondary structure of the intervening transglobular blocks in small, single-domain globular proteins has been developed. Since the emphasis of the method is on the prediction of the major topological elements that comprise the global structure of the protein rather than on a detailed local secondary structure description, this approach is complementary to standard secondary structure prediction schemes. Consequently, it may be useful in the early stages of tertiary structure prediction when establishment of the structural class and possible folding topologies is of interest. Application to a set of small proteins of known structure indicates a high level of accuracy. The prediction of the approximate location of the surface turns/loops that are responsible for the change in overall chain direction is correct in more than 95% of the cases. The accuracy for the dominant secondary structure assignment for the linear blocks between such surface turns/loops is in the range of 82%. MeSH Headings: Algorithms; Amino Acid Sequence; Animals; Humans; Molecular Sequence Data; Protein Folding; Protein Structure, Secondary; Proteins/chemistry. Grants: GM-48835 NIGMS NIH HHS.
362	Kolinski A, Skolnick J, Godzik A, Hu WP. A method for the prediction of surface “U”-turns and transglobular connections in small proteins. Proteins 1997. [DOI: 10.1002/(sici)1097-0134(199702)27:2<290::aid-prot14>3.0.co;2-h] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.0] [Indexed: 11/10/2022]
363	Pawłowski K, Jaroszewski L, Bierzyński A, Godzik A. Multiple model approach--dealing with alignment ambiguities in protein modeling. Pacific Symposium on Biocomputing 1997:328-339. [PMID: 9390303] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2023] Abstract: Sequence alignments for distantly homologous proteins are often ambiguous, which creates a weak link in structure prediction by homology. We address this problem by using several plausible alignments in a modeling procedure, obtaining many models of the target. All are subsequently evaluated by a threading algorithm. It is shown that this approach can identify best alignments and produce reasonable models, whose quality is now limited only by the extent of the structural similarity between the known and predicted protein. Using a similar approach, a structure prediction for the oxidized dimer of S100A1 protein, for which the structure is not known, is presented. MeSH Headings: Amino Acid Sequence; Calbindins; Computer Simulation; Dimerization; Models, Molecular; Molecular Sequence Data; Parvalbumins/chemistry; Protein Conformation; Proteins/chemistry; S100 Calcium Binding Protein G/chemistry; S100 Proteins/chemistry; Sequence Alignment; Sequence Homology, Amino Acid. Grants: GM-48835 NIGMS NIH HHS.
364	Godzik A. The structural alignment between two proteins: is there a unique answer? Protein Sci 1996;5:1325-38. [PMID: 8819165 PMCID: PMC2143456 DOI: 10.1002/pro.5560050711] [Citation(s) in RCA: 184] [Impact Index Per Article: 6.6] [Indexed: 02/02/2023] Abstract: Structurally similar but sequentially unrelated proteins have been discovered and rediscovered by many researchers, using a variety of structure comparison tools. For several pairs of such proteins, existing structural alignments obtained from the literature, as well as alignments prepared using several different similarity criteria, are compared with each other. It is shown that, in general, they differ from each other, with differences increasing with diminishing sequence similarity. Differences are particularly strong between alignments optimizing global similarity measures, such as RMS deviation between C alpha atoms, and alignments focusing on more local features, such as packing or interaction pattern similarity. Simply speaking, by putting emphasis on different aspects of structure, different structural alignments show the unquestionable similarity in a different way. With differences between various alignments extending to a point where they can differ at all positions, analysis of structural similarities leads to contradictory results reported by groups using different alignment techniques. The problem of uniqueness and stability of structural alignments is further studied with the help of visualization of the suboptimal alignments. It is shown that alignments are often degenerate and whole families of alignments can be generated with almost the same score as the "optimal alignment." However, for some similarity criteria, especially those based on side-chain positions, rather than C alpha positions, alignments in some areas of the protein are unique. This opens the question of how and if the structural alignments can be used as "standards of truth" for protein comparison. MeSH Headings: Amino Acid Sequence; Bacterial Proteins; Chemotaxis; Copper/metabolism; Flavodoxin/chemistry; Flavodoxin/metabolism; Membrane Proteins/chemistry; Membrane Proteins/metabolism; Methyl-Accepting Chemotaxis Proteins; Molecular Sequence Data; Myoglobin/chemistry; Myoglobin/metabolism; Phycocyanin/chemistry; Phycocyanin/metabolism; Proteins/chemistry; Proteins/metabolism; Sequence Alignment.
365	Pawłowski K, Bierzyński A, Godzik A. Structural diversity in a family of homologous proteins. J Mol Biol 1996;258:349-66. [PMID: 8627631 DOI: 10.1006/jmbi.1996.0255] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.0] [Indexed: 01/31/2023] Abstract: An interesting example of a structurally diverse group of sequentially homologous proteins is analyzed at the level of molecular interactions. In this family, the EF-hand calcium-binding proteins, there are examples of at least three distinct mutual positions of the N and C-terminal domains, despite significant sequence homology between all members of this family. Why does a particular protein choose one arrangement over another? To answer this question, detailed models of all proteins in their native structures as well as all alternative sequence/structure combinations are built by comparative modeling. By studying and comparing interactions stabilizing native structures and destabilizing alternative conformations, it is possible to gain insight into how such conformational diversity is achieved. It is shown that some mechanisms used to achieve it are: correlated mutations on the surface of two units and the presence of additional domains/chain fragments stabilizing desired topologies. The implications of these findings, both for structure predictions for other members of this family as well as the general problem of quaternary structure formation, are discussed. MeSH Headings: Amino Acid Sequence; Calcium-Binding Proteins/chemistry; Calmodulin/chemistry; Eye Proteins; Hippocalcin; Lipoproteins; Models, Molecular; Molecular Sequence Data; Myosins/chemistry; Nerve Tissue Proteins; Protein Conformation; Protein Folding; Protein Multimerization; Recoverin. Grants: GM-48835 NIGMS NIH HHS.
366	Godzik A. Knowledge-based potentials for protein folding: what can we learn from known protein structures? Structure 1996;4:363-6. [PMID: 8740358 DOI: 10.1016/s0969-2126(96)00041-x] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.3] [Indexed: 02/01/2023] Abstract: Empirical potentials capture the essence of regularities seen in protein structures and can be used in simulations and predictions of protein structure or function. Derivations of such potentials require comparisons to be made between experimentally derived protein structures and theoretically constructed reference states. MeSH Headings: Computer Simulation; Models, Molecular; Protein Conformation; Protein Folding; Proteins/chemistry. Grants: GM48835 NIGMS NIH HHS.
367	Godzik A, Koliński A, Skolnick J. Are proteins ideal mixtures of amino acids? Analysis of energy parameter sets. Protein Sci 1995;4:2107-17. [PMID: 8535247 PMCID: PMC2142984 DOI: 10.1002/pro.5560041016] [Citation(s) in RCA: 119] [Impact Index Per Article: 4.1] [Indexed: 01/31/2023] Abstract: Various existing derivations of the effective potentials of mean force for the two-body interactions between amino acid side chains in proteins are reviewed and compared to each other. The differences between different parameter sets can be traced to the reference state used to define the zero of energy. Depending on the reference state, the transfer free energy or other pseudo-one-body contributions can be present to various extents in two-body parameter sets. It is, however, possible to compare various derivations directly by concentrating on the "excess" energy, a term that describes the difference between a real protein and an ideal solution of amino acids. Furthermore, the number of protein structures available for analysis allows one to check the consistency of the derivation and the errors by comparing parameters derived from various subsets of the whole database. It is shown that pair interaction preferences are very consistent throughout the database. Independently derived parameter sets have correlation coefficients on the order of 0.8, with the mean difference between equivalent entries of 0.1 kT. Also, the low-quality (low resolution, little or no refinement) structures show similar regularities. There are, however, large differences between interaction parameters derived on the basis of crystallographic structures and structures obtained by the NMR refinement. The origin of the latter difference is not yet understood. MeSH Headings: Amino Acid Sequence; Amino Acids; Crystallography, X-Ray; Databases, Factual; Magnetic Resonance Spectroscopy; Mathematics; Models, Theoretical; Protein Conformation; Protein Folding; Proteins/chemistry; Thermodynamics. Grants: GM48835 NIGMS NIH HHS.
368	Godzik A. In search of the ideal protein sequence. Protein Engineering 1995;8:409-16. [PMID: 8532661 DOI: 10.1093/protein/8.5.409] [Citation(s) in RCA: 22] [Impact Index Per Article: 0.8] [Indexed: 01/31/2023] Abstract: The inverse of a folding problem is to find the ideal sequence that folds into a particular protein structure. This problem has been addressed using the topology fingerprint-based threading algorithm, capable of calculating a score (energy) of an arbitrary sequence-structure pair. At first, the search is conducted by unconstrained minimization of the energy in sequence space. It is shown that using energy as the only design criterion leads to spurious solutions with incorrect amino acid composition. The problem lies in the general features of the protein energy surface as a function of both structure and sequence. The proposed solution is to design the sequence by maximizing the difference between its energy in the desired structure and in other known protein structures. Depending on the size of the database of structures 'to avoid', sequences bearing significant similarity to the native sequence of the target protein are obtained using this procedure. MeSH Headings: Algorithms; Amino Acid Sequence; Databases, Factual; Molecular Sequence Data; Plastocyanin/chemistry; Protein Engineering; Protein Folding; Proteins/chemistry; Sequence Alignment; Thermodynamics. Grants: GM-4883 NIGMS NIH HHS.
369	Godzik A, Skolnick J. Flexible algorithm for direct multiple alignment of protein structures and sequences. Computer Applications in the Biosciences: CABIOS 1994;10:587-96. [PMID: 7704657 DOI: 10.1093/bioinformatics/10.6.587] [Citation(s) in RCA: 24] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/26/2023] Abstract The recently described equivalence between the alignment of two proteins and a conformation of a lattice chain on a two-dimensional square lattice is extended to multiple alignments. The search for the optimal multiple alignment between several proteins, which is equivalent to finding the energy minimum in the conformational space of a multi-dimensional lattice chain, is studied by the Monte Carlo approach. This method, while not deterministic, and for two-dimensional problems slower than dynamic programming, can accept arbitrary scoring functions, including non-local ones, and its speed decreases slowly with increasing number of dimensions. For the local scoring functions, the MC algorithm can also reproduce known exact solutions for the direct multiple alignments. As illustrated by examples, both for structure- and sequence-based alignments, direct multi-dimensional alignments are able to capture weak similarities between divergent families much better than ones built from pairwise alignments by a hierarchical approach. MESH Headings: Algorithms; Amino Acid Sequence; Molecular Structure; Monte Carlo Method; Protein Conformation; Proteins/chemistry; Sequence Alignment; Software. Grants: P01-38794 PHS HHS.
370	Godzik A, Skolnick J, Kolinski A. Regularities in interaction patterns of globular proteins. Protein Engineering 1993;6:801-10. [PMID: 8309927 DOI: 10.1093/protein/6.8.801] [Citation(s) in RCA: 58] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/29/2023] Abstract The description of protein structure in the language of side chain contact maps is shown to offer many advantages over more traditional approaches. Because it focuses on side chain interactions, it aids in the discovery, study and classification of similarities between interactions defining particular protein folds and offers new insights into the rules of protein structure. For example, there is a small number of characteristic patterns of interactions between protein supersecondary structural fragments, which can be seen in various non-related proteins. Furthermore, the overlap of the side chain contact maps of two proteins provides a new measure of protein structure similarity. As shown in several examples, alignments based on contact map overlaps are a powerful alternative to other structure-based alignments. MESH Headings: Computer Simulation; Hemoglobins/chemistry; Monte Carlo Method; Myoglobin/chemistry; Plastocyanin/chemistry; Protein Structure, Secondary; Protein Structure, Tertiary; Sequence Alignment/methods. Grants: 2 PO1 GM38794 NIGMS NIH HHS.
371	Godzik A, Kolinski A, Skolnick J. Lattice representations of globular proteins: How good are they? J Comput Chem 1993. [DOI: 10.1002/jcc.540141009] [Citation(s) in RCA: 76] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
372	Godzik A, Kolinski A, Skolnick J. De novo and inverse folding predictions of protein structure and dynamics. J Comput Aided Mol Des 1993;7:397-438. [PMID: 8229093 DOI: 10.1007/bf02337559] [Citation(s) in RCA: 76] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/29/2023] Abstract In the last two years, the use of simplified models has facilitated major progress in the globular protein folding problem, viz., the prediction of the three-dimensional (3D) structure of a globular protein from its amino acid sequence. A number of groups have addressed the inverse folding problem where one examines the compatibility of a given sequence with a given (and already determined) structure. A comparison of extant inverse protein-folding algorithms is presented, and methodologies for identifying sequences likely to adopt identical folding topologies, even when they lack sequence homology, are described. Extension to produce structural templates or fingerprints from idealized structures is discussed, and for eight-membered beta-barrel proteins, it is shown that idealized fingerprints constructed from simple topology diagrams can correctly identify sequences having the appropriate topology. Furthermore, this inverse folding algorithm is generalized to predict elements of supersecondary structure including beta-hairpins, helical hairpins and alpha/beta/alpha fragments. Then, we describe a very high coordination number lattice model that can predict the 3D structure of a number of globular proteins de novo; i.e. using just the amino acid sequence. Applications to sequences designed by DeGrado and co-workers [Biophys. J., 61 (1992) A265] predict folding intermediates, native states and relative stabilities in accord with experiment. 
The methodology has also been applied to the four-helix bundle designed by Richardson and co-workers [Science, 249 (1990) 884] and a redesigned monomeric version of a naturally occurring four-helix dimer, rop. Based on comparison to the rop dimer, the simulations predict conformations with rms values of 3-4 Å from native. Furthermore, the de novo algorithms can assess the stability of the folds predicted from the inverse algorithm, while the inverse folding algorithms can assess the quality of the de novo models. Thus, the synergism of the de novo and inverse folding algorithm approaches provides a set of complementary tools that will facilitate further progress on the protein-folding problem. MESH Headings: Algorithms; Amino Acid Sequence; Computer Simulation; Models, Molecular; Molecular Sequence Data; Protein Folding; Protein Structure, Tertiary; Sequence Alignment.
373	Skolnick J, Kolinski A, Brooks CL, Godzik A, Rey A. A method for predicting protein structure from sequence. Curr Biol 1993;3:414-23. [PMID: 15335708 DOI: 10.1016/0960-9822(93)90348-r] [Citation(s) in RCA: 47] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/19/1993] [Revised: 06/08/1993] [Accepted: 06/08/1993] [Indexed: 10/26/2022] Abstract BACKGROUND The ability to predict the native conformation of a globular protein from its amino-acid sequence is an important unsolved problem of molecular biology. We have previously reported a method in which reduced representations of proteins are folded on a lattice by Monte Carlo simulation, using statistically-derived potentials. When applied to sequences designed to fold into four-helix bundles, this method generated predicted conformations closely resembling the real ones. RESULTS We now report a hierarchical approach to protein-structure prediction, in which two cycles of the above-mentioned lattice method (the second on a finer lattice) are followed by a full-atom molecular dynamics simulation. The end product of the simulations is thus a full-atom representation of the predicted structure. The application of this procedure to the 60-residue B domain of staphylococcal protein A predicts a three-helix bundle with a backbone root mean square (rms) deviation of 2.25-3 Å from the experimentally determined structure. Further application to a designed, 120-residue monomeric protein, mROP, based on the dimeric ROP protein of Escherichia coli, predicts a left-turning, four-helix bundle native state. Although the ultimate assessment of the quality of this prediction awaits the experimental determination of the mROP structure, a comparison of this structure with the set of equivalent residues in the ROP dimer crystal structure indicates that they have an rms deviation of approximately 3.6-4.2 Å.
CONCLUSION Thus, for a set of helical proteins that have simple native topologies, the native folds of the proteins can be predicted with reasonable accuracy from their sequences alone. Our approach suggests a direction for future work addressing the protein-folding problem.
374	Kolinski A, Godzik A, Skolnick J. A general method for the prediction of the three dimensional structure and folding pathway of globular proteins: Application to designed helical proteins. J Chem Phys 1993. [DOI: 10.1063/1.464706] [Citation(s) in RCA: 163] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
375	Skolnick J, Kolinski A, Godzik A. From independent modules to molten globules: observations on the nature of protein folding intermediates. Proc Natl Acad Sci U S A 1993;90:2099-100. [PMID: 8460114 PMCID: PMC46030 DOI: 10.1073/pnas.90.6.2099] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/30/2023] Open MESH Headings: Binding Sites; Isomerases/chemistry; Protein Disulfide-Isomerases; Protein Folding; Protein Structure, Secondary; Proteins/chemistry.
376	Godzik A, Skolnick J. Sequence-structure matching in globular proteins: application to supersecondary and tertiary structure determination. Proc Natl Acad Sci U S A 1992;89:12098-102. [PMID: 1465445 PMCID: PMC50705 DOI: 10.1073/pnas.89.24.12098] [Citation(s) in RCA: 110] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2022] Open Abstract A methodology designed to address the inverse globular protein-folding problem (the identification of which sequences are compatible with a given three-dimensional structure) is described. By using a library of protein fingerprints, defined by the side chain interaction pattern, it is possible to match each structure to its own sequence in an exhaustive database search. It is shown that this is a permissive requirement for the validation of the methodology. To pass the more rigorous test of identifying proteins that are not close sequence homologs, but that have similar structure, the method has been extended to include insertions and deletions in the sequence, which is compared to the fingerprint. This allows for the identification of sequences having little or no sequence homology to the fingerprint. Examples include plastocyanin/azurin/pseudoazurin, the globin family, different families of proteases and cytochromes, including cytochromes c' and b-562, actinidin/papain, and lysozyme/alpha-lactalbumin. Turning to supersecondary structure prediction, we find that alpha/beta/alpha fragments possess sufficient specificity to identify their own and related sequences. By threading a beta-hairpin through a sequence, it is possible to predict the location of such hairpins and turns with remarkable fidelity. Thus, the method greatly extends existing techniques for the prediction of both global structural homology and local supersecondary structure.
MESH Headings: Amino Acid Sequence; Peptide Fragments/chemistry; Protein Structure, Secondary; Protein Structure, Tertiary; Proteins/chemistry; Sequence Alignment; Solvents; Structure-Activity Relationship; Thermodynamics. Grants: R01 GM037408 NIGMS NIH HHS; GM-37408 NIGMS NIH HHS.
377	Godzik A, Kolinski A, Skolnick J. Topology fingerprint approach to the inverse protein folding problem. J Mol Biol 1992;227:227-38. [PMID: 1522587 DOI: 10.1016/0022-2836(92)90693-e] [Citation(s) in RCA: 277] [Impact Index Per Article: 8.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2022] Abstract We describe the most general solution to date of the problem of matching globular protein sequences to the appropriate three-dimensional structures. The screening template, against which sequences are tested, is provided by a protein "structural fingerprint" library based on the contact map and the buried/exposed pattern of residues. Then, a lattice Monte Carlo algorithm validates or dismisses the stability of the proposed fold. Examples of known structural similarities between proteins having weakly or unrelated sequences such as the globins and phycocyanins, the eight-member alpha/beta fold of triose phosphate isomerase and even a close structural equivalence between azurin and immunoglobulins are found. MESH Headings: Algorithms; Azurin/chemistry; Bacterial Outer Membrane Proteins/chemistry; Bacterial Proteins/chemistry; Databases, Factual; Globins/chemistry; Immunoglobulin lambda-Chains/genetics; Models, Molecular; Phycocyanin/chemistry; Plant Proteins/chemistry; Plastocyanin/chemistry; Protein Conformation; Sequence Alignment; Structure-Activity Relationship; Thermodynamics. Grants: GM-37408 NIGMS NIH HHS.
378	Godzik A, Skolnick J, Kolinski A. Simulations of the folding pathway of triose phosphate isomerase-type alpha/beta barrel proteins. Proc Natl Acad Sci U S A 1992;89:2629-33. [PMID: 1557367 PMCID: PMC48715 DOI: 10.1073/pnas.89.7.2629] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2022] Open Abstract Simulations of the folding pathways of two large alpha/beta proteins, the alpha subunit of tryptophan synthase and triose phosphate isomerase, are reported using the knight's walk lattice model of globular proteins and Monte Carlo dynamics. Starting from randomly generated unfolded states and with no assumptions regarding the nature of the folding intermediates, for the tryptophan synthase subunit these simulations predict, in agreement with experiment, the existence and location of a stable equilibrium intermediate comprised of six beta strands on the amino terminus of the molecule. For the case of triose phosphate isomerase, the simulations predict that both amino- and carboxyl-terminal intermediates should be observed. In a significant modification of previous lattice models, this model includes a full heavy atom side chain description and is capable of representing native conformations at the level of 2.5- to 3-Å rms deviation for the C alpha positions, as compared to the crystal structure. With a well-balanced compromise between accuracy of the protein description and the computer requirements necessary to perform simulations spanning biologically significant amounts of time, the lattice model described here brings the possibility of studying important biological processes to present-day computers.
MESH Headings: Animals; Chickens; Computer Simulation; Models, Molecular; Monte Carlo Method; Protein Conformation; Salmonella typhimurium/enzymology; Solubility; Triose-Phosphate Isomerase/chemistry; Triose-Phosphate Isomerase/ultrastructure; Tryptophan Synthase/chemistry; Tryptophan Synthase/ultrastructure. Grants: R01 GM037408 NIGMS NIH HHS; GM-37408 NIGMS NIH HHS.
379	Godzik A. An estimation of energy parameters for the soliton movement in hydrogen-bonded chains. Chem Phys Lett 1990. [DOI: 10.1016/0009-2614(90)85229-6] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
380	Godzik A, Sander C. Conservation of residue interactions in a family of Ca-binding proteins. Protein Engineering 1989;2:589-96. [PMID: 2813336 DOI: 10.1093/protein/2.8.589] [Citation(s) in RCA: 33] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/02/2023] Abstract In the TNC family of Ca-binding proteins (calmodulin, parvalbumin, intestinal calcium binding protein and troponin C) approximately 70 well-conserved amino acid sequences and six crystal structures are known. We find a clear correlation between residue contacts in the structures and residue conservation in the sequences: residues with strong sidechain-sidechain contacts in the three-dimensional structure tend to be the more conserved in the sequence. This is one way to quantify the intuitive notion of the importance of sidechain interactions for maintaining protein three-dimensional structure in evolution and may usefully be taken into account in planning point mutations in protein engineering. MESH Headings: Amino Acid Sequence; Animals; Calcium-Binding Proteins/genetics; Computer Graphics; Electronic Data Processing; Humans; Molecular Sequence Data; Mutation; Protein Conformation; Rats; X-Ray Diffraction.
381	Dadlez M, Bierzyński A, Godzik A, Sobocińska M, Kupryszewski G. Conformational role of His-12 in C-peptide of ribonuclease A. Biophys Chem 1988;31:175-81. [PMID: 3233287 DOI: 10.1016/0301-4622(88)80023-1] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/04/2023] Abstract Possible interactions of the His-12 ring with other side chain and backbone groups of C-peptide lactone (CPL) are discussed. The works published so far are critically reviewed and compared with the latest results obtained by the authors. The main new conclusion is that in the helical conformation of CPL, the Phe-8 and His-12 rings are clustered together. Studies of Phe-8→Ala analogs of CPL and calculations of ring current effects satisfactorily explain the observed environmental shifts of Phe-8 and His-12 protons in NMR spectra of CPL. Interaction between both rings is favorable for alpha-helix formation, but cannot explain an increase in helix stability related to protonation of His-12. This effect arises from favorable interactions of the charged His+-12 ring with the helix backbone. MESH Headings: Histidine; Macromolecular Substances; Models, Molecular; Oligopeptides; Protein Conformation; Ribonuclease, Pancreatic.
382	Godzik A, Wesolowski T. On the interactions of charged side chains with the alpha-helix backbone. Biophys Chem 1988;31:29-34. [PMID: 3233289 DOI: 10.1016/0301-4622(88)80005-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/04/2023] Abstract The effects of the position of charged amino acid side chains on the stability of the alpha-helix are investigated. Calculations for the model polyAla 13 residue alpha-helix, with modifications based on experimental work, are performed at three levels of approximation. The observed stabilization of the alpha-helix could be explained by interactions between its macrodipole and charged amino acid side chains. Limitations of the model are discussed. MESH Headings: Amino Acids; Calorimetry; Models, Molecular; Protein Conformation; Proteins.