1
Kouza M, Banerji A, Kolinski A, Buhimschi IA, Kloczkowski A. Oligomerization of FVFLM peptides and their ability to inhibit beta amyloid peptides aggregation: consideration as a possible model. Phys Chem Chem Phys 2017; 19:2990-2999. [PMID: 28079198 PMCID: PMC5305032 DOI: 10.1039/c6cp07145g]
Abstract
Preeclampsia, a pregnancy-specific disorder, shares typical pathophysiological features with protein misfolding disorders including Alzheimer's disease. Characteristic for preeclampsia is the involvement of multiple proteins of which fragments of SERPINA1 and β-amyloid co-aggregate in urine and placenta of preeclamptic women. To explore the biophysical basis of this interaction, we investigated the multidimensional efficacy of the FVFLM sequence in SERPINA1, as a model inhibitory agent of β-amyloid aggregation. After studying the oligomerization of FVFLM peptides using all-atom molecular dynamics simulations with the GROMOS43a1 force field and explicit water, we report that FVFLM can aggregate and its aggregation is spontaneous with a remarkably faster rate than that recorded for KLVFF (aggregation "hot-spot" from β-amyloid). The fast kinetics of FVFLM aggregation was found to be driven primarily by core-like aromatic interactions originating from the anti-parallel orientation of complementarily uncharged strands. The conspicuously stable aggregation mechanism observed for FVFLM peptides is found not to conform to the popular 'dock-lock' scheme. We also found high propensity of FVFLM for KLVFF binding. When present, FVFLM disrupts the β-amyloid aggregation pathway and we propose that FVFLM-like peptides might be used to prevent the assembly of full-length Aβ or other pro-amyloidogenic peptides into amyloid fibrils.
Affiliation(s)
- M Kouza
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland, and Nationwide Children's Hospital, Battelle Center for Mathematical Medicine, Columbus, OH 43215, USA
- A Banerji
- Nationwide Children's Hospital, Battelle Center for Mathematical Medicine, Columbus, OH 43215, USA
- A Kolinski
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- I A Buhimschi
- Center for Perinatal Research, The Research Institute at Nationwide Children's Hospital, Columbus, OH 43215, USA and Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH 43215, USA
- A Kloczkowski
- Nationwide Children's Hospital, Battelle Center for Mathematical Medicine, Columbus, OH 43215, USA and Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH 43215, USA
3
Abstract
Protein folding, the problem of how an amino acid sequence folds into a unique three-dimensional shape, has been a long-standing problem in biology. The success of genome-wide sequencing efforts has increased the interest in understanding the protein folding enigma, because realizing the value of the genomic sequences rests on the accuracy with which the encoded gene products are understood. Although a complete understanding of the kinetics and thermodynamics of protein folding has remained elusive, there has been considerable progress in techniques to predict protein structure from amino acid sequences. The prediction techniques fall into three general classes: comparative modeling, threading and ab initio folding. The current state of research in each of these three areas is reviewed here in detail. Efforts to apply each method to proteome-wide analysis are reviewed, and some of the key technical hurdles that remain are presented. Protein folding technologies, while not yet providing a full understanding of the protein folding process, have clearly progressed to the point of being useful in enabling structure-based annotation of genomic sequences.
Affiliation(s)
- J S Fetrow
- GeneFormatics, Incorporated, 5830 Oberlin Drive, Suite 200, San Diego, CA 92121, USA.
4
Skolnick J, Kolinski A, Kihara D, Betancourt M, Rotkiewicz P, Boniecki M. Ab initio protein structure prediction via a combination of threading, lattice folding, clustering, and structure refinement. Proteins 2002; Suppl 5:149-56. [PMID: 11835492 DOI: 10.1002/prot.1172]
Abstract
A combination of sequence comparison, threading, lattice, and off-lattice Monte Carlo (MC) simulations and clustering of MC trajectories was used to predict the structure of all (but one) targets of the CASP4 experiment on protein structure prediction. Although this method is automated and is operationally the same regardless of the level of uniqueness of the query proteins, here we focus on the more difficult targets at the border of the fold recognition and new fold categories. For a few targets (T0110 is probably the best example), the ab initio method produced more accurate models than models obtained by the fold recognition techniques. For the most difficult targets from the new fold categories, substantial fragments of structures have been correctly predicted. Possible improvements of the method are briefly discussed.
Affiliation(s)
- J Skolnick
- Donald Danforth Plant Science Center, Saint Louis, Missouri 63141, USA.
5
Bujnicki JM, Rotkiewicz P, Kolinski A, Rychlewski L. Three-dimensional modeling of the I-TevI homing endonuclease catalytic domain, a GIY-YIG superfamily member, using NMR restraints and Monte Carlo dynamics. Protein Eng 2001; 14:717-21. [PMID: 11739889 DOI: 10.1093/protein/14.10.717]
Abstract
Using a recent version of the SICHO algorithm for in silico protein folding, we made a blind prediction of the tertiary structure of the N-terminal, independently folded, catalytic domain (CD) of the I-TevI homing endonuclease, a representative of the GIY-YIG superfamily of homing endonucleases. The secondary structure of the I-TevI CD has been determined using NMR spectroscopy, but computational sequence analysis failed to detect any protein of known tertiary structure related to the GIY-YIG nucleases (Kowalski et al., Nucleic Acids Res., 1999, 27, 2115-2125). To provide further insight into the structure-function relationships of all GIY-YIG superfamily members, including the recently described subfamily of type II restriction enzymes (Bujnicki et al., Trends Biochem. Sci., 2000, 26, 9-11), we incorporated the experimentally determined and predicted secondary and tertiary restraints in a reduced (side chain only) protein model, which was minimized by Monte Carlo dynamics and simulated annealing. The subsequently elaborated full atomic model of the I-TevI CD allows the available experimental data to be put into a structural context and suggests that the GIY-YIG domain may dimerize in order to bring together the conserved residues of the active site.
Affiliation(s)
- J M Bujnicki
- Bioinformatics Laboratory, International Institute of Molecular and Cell Biology, ul. ks. Trojdena 4, 02-109 Warsaw, Poland.
7
Kihara D, Lu H, Kolinski A, Skolnick J. TOUCHSTONE: an ab initio protein structure prediction method that uses threading-based tertiary restraints. Proc Natl Acad Sci U S A 2001; 98:10125-30. [PMID: 11504922 PMCID: PMC56926 DOI: 10.1073/pnas.181328398] [Received: 05/09/2001] [Accepted: 06/28/2001]
Abstract
The successful prediction of protein structure from amino acid sequence requires two features: an efficient conformational search algorithm and an energy function with a global minimum in the native state. As a step toward addressing both issues, a threading-based method of secondary and tertiary restraint prediction has been developed and applied to ab initio folding. Such restraints are derived by extracting consensus contacts and local secondary structure from at least weakly scoring structures that, in some cases, can lack any global similarity to the sequence of interest. Furthermore, to generate representative protein structures, a reduced lattice-based protein model is used with replica exchange Monte Carlo to explore conformational space. We report results on the application of this methodology, termed TOUCHSTONE, to 65 proteins whose lengths range from 39 to 146 residues. For 47 (40) proteins, a cluster centroid whose rms deviation from native is below 6.5 (5) Å is found in one of the five lowest energy centroids. The number of correctly predicted proteins increases to 50 when atomic detail is added and a knowledge-based atomic potential is combined with clustered and nonclustered structures for candidate selection. The combination of the ratio of the relative number of contacts to the protein length and the number of clusters generated by the folding algorithm is a reliable indicator of the likelihood of successful fold prediction, thereby opening the way for genome-scale ab initio folding.
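Note: the abstract above names replica exchange Monte Carlo as the search strategy but does not give details. The following is a minimal sketch of the standard exchange criterion between two replicas, not the TOUCHSTONE implementation; the function name and the example values are assumptions made for illustration.

```python
import math

def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Textbook Metropolis probability of exchanging two replicas.

    beta_i, beta_j: inverse temperatures of the two replicas (assumed units of 1/energy).
    energy_i, energy_j: current energies of the configurations held by each replica.
    Returns min(1, exp((beta_i - beta_j) * (energy_i - energy_j))).
    """
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    # A non-negative exponent means the swap is always accepted.
    return 1.0 if delta >= 0.0 else math.exp(delta)

# Hypothetical example: the colder replica (beta = 1.0) holds the higher-energy
# configuration, so moving that configuration to the hotter replica is always accepted.
p_favourable = swap_probability(1.0, 0.5, -10.0, -12.0)
# The reverse arrangement is accepted only with probability exp(-1).
p_unfavourable = swap_probability(1.0, 0.5, -12.0, -10.0)
```

In a full simulation this test would be applied periodically to neighbouring temperature levels, with ordinary Monte Carlo moves in between.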
Affiliation(s)
- D Kihara
- Laboratory of Computational Genomics, Donald Danforth Plant Science Center, 893 North Warson Road, St. Louis, MO 63141, USA
8
Rotkiewicz P, Sicinska W, Kolinski A, DeLuca HF. Model of three-dimensional structure of vitamin D receptor and its binding mechanism with 1alpha,25-dihydroxyvitamin D(3). Proteins 2001; 44:188-99. [PMID: 11455592 DOI: 10.1002/prot.1084]
Abstract
Comparative modeling of the vitamin D receptor three-dimensional structure and computational docking of 1α,25-dihydroxyvitamin D3 into the putative binding pocket of the two deletion mutant receptors: (207-423) and (120-422, Δ[164-207]) are reported and evaluated in the context of extensive mutagenic analysis and crystal structure of holo hVDR deletion protein published recently. The obtained molecular model agrees well with the experimentally determined structure. Six different conformers of 1α,25-dihydroxyvitamin D3 were used to study flexible docking to the receptor. On the basis of values of conformational energy of various complexes and their consistency with functional activity, it appears that 1α,25-dihydroxyvitamin D3 binds the receptor in its 6-s-trans form. The two lowest energy complexes obtained from docking the hormone into the deletion protein (207-423) differ in conformation of ring A and orientation of the ligand molecule in the VDR pocket. 1α,25-Dihydroxyvitamin D3 possessing the A-ring conformation with axially oriented 1α-hydroxy group binds receptor with its 25-hydroxy substituent oriented toward the center of the receptor cavity, whereas ligand possessing equatorial conformation of 1α-hydroxy enters the pocket with A ring directed inward. The latter conformation and orientation of the ligand is consistent with the crystal structure of hVDR deletion mutant (118-425, Δ[165-215]). The lattice model of rVDR (120-422, Δ[164-207]) shows excellent agreement with the crystal structure of the hVDR mutant. The complex obtained from docking the hormone into the receptor has lower energy than complexes for which homology modeling was used. Thus, a simple model of vitamin D receptor with the first two helices deleted can be potentially useful for designing a general structure of ligand, whereas the advanced lattice model is suitable for examining binding sites in the pocket.
Affiliation(s)
- P Rotkiewicz
- Department of Chemistry, University of Warsaw, Warsaw, Poland
9
Kolinski A, Betancourt MR, Kihara D, Rotkiewicz P, Skolnick J. Generalized comparative modeling (GENECOMP): a combination of sequence comparison, threading, and lattice modeling for protein structure prediction and refinement. Proteins 2001; 44:133-49. [PMID: 11391776 DOI: 10.1002/prot.1080]
Abstract
An improved generalized comparative modeling method, GENECOMP, for the refinement of threading models is developed and validated on the Fischer database of 68 probe-template pairs, a standard benchmark used to evaluate threading approaches. The basic idea is to perform ab initio folding using a lattice protein model, SICHO, near the template provided by the new threading algorithm PROSPECTOR. PROSPECTOR also provides predicted contacts and secondary structure for the template-aligned regions, and possibly for the unaligned regions by garnering additional information from other top-scoring threaded structures. Since the lowest-energy structure generated by the simulations is not necessarily the best structure, we employed two structure-selection protocols: distance geometry and clustering. In general, clustering is found to generate somewhat better quality structures in 38 of 68 cases. When applied to the Fischer database, the protocol does no harm and in a significant number of cases improves upon the initial threading model, sometimes dramatically. The procedure is readily automated and can be implemented on a genomic scale.
Affiliation(s)
- A Kolinski
- Laboratory of Computational Genomics, Donald Danforth Plant Science Center, St. Louis, Missouri 63141, USA
10
Vieth M, Kolinski A, Brooks CL, Skolnick J. Prediction of the quaternary structure of coiled coils: GCN4 leucine zipper and its mutants. Pac Symp Biocomput 2001:653-62. [PMID: 9390265]
Abstract
A methodology for predicting coiled coil quaternary structure and for the dissection of the interactions responsible for the global fold is described. Application is made to the equilibrium between different oligomeric species of the wild type GCN4 leucine zipper and seven of its mutants that were studied by Harbury et al. Over the entire experimental concentration range, agreement with experiment is found in five cases, while in two other cases, agreement is found over a portion of the concentration range. These simulations suggest that the degree of chain association is determined by the balance between specific side chain packing preferences and the entropy reduction associated with side chain burial in higher order multimers.
Affiliation(s)
- M Vieth
- Department of Molecular Biology and Chemistry, Scripps Research Institute, La Jolla, California 92037, USA
12
Sikorski A, Kolinski A, Skolnick J. Monte Carlo simulation of designed helical proteins. Acta Pol Pharm 2000; 57 Suppl:119-21. [PMID: 11293239]
Affiliation(s)
- A Sikorski
- Department of Chemistry, University of Warsaw, 1 Pasteura Str., 02-093 Warsaw, Poland
13
Abstract
A procedure for the reconstruction of all-atom protein structures from side-chain center-based low-resolution models is introduced and applied to a set of test proteins with high-resolution X-ray structures. The accuracy of the rebuilt all-atom models is measured by root mean square deviations to the corresponding X-ray structures and percentages of correct χ1 and χ2 side-chain dihedrals. The benefit of including Cα positions in the low-resolution model is examined, and the effect of lattice-based models on the reconstruction accuracy is discussed. Programs and scripts implementing the reconstruction procedure are made available through the NIH research resource for Multiscale Modeling Tools in Structural Biology (http://mmtsb.scripps.edu).
Affiliation(s)
- M Feig
- Department of Molecular Biology, The Scripps Research Institute, La Jolla, California 92037, USA
14
Abstract
Structural genomics projects aim to solve the experimental structures of all possible protein folds. Such projects entail a conceptual shift from traditional structural biology in which structural information is obtained on known proteins to one in which the structure of a protein is determined first and the function assigned only later. Whereas the goal of converting protein structure into function can be accomplished by traditional sequence motif-based approaches, recent studies have shown that assignment of a protein's biochemical function can also be achieved by scanning its structure for a match to the geometry and chemical identity of a known active site. Importantly, this approach can use low-resolution structures provided by contemporary structure prediction methods. When applied to genomes, structural information (either experimental or predicted) is likely to play an important role in high-throughput function assignment.
Affiliation(s)
- J Skolnick
- Laboratory of Computational Genomics, The Danforth Plant Science Center, 893 N. Warson Rd., St. Louis, MO 63141, USA.
15
Sikorski A, Kolinski A, Skolnick J. Computer simulations of the properties of the alpha2, alpha2C, and alpha2D de novo designed helical proteins. Proteins 2000; 38:17-28. [PMID: 10651035]
Abstract
Reduced lattice models of the three de novo designed helical proteins alpha2, alpha2C, and alpha2D were studied. Low temperature stable folds were obtained for all three proteins. In all cases, the lowest energy folds were four-helix bundles. The folding pathway is qualitatively the same for all proteins studied. The energies of various topologies are similar, especially for the alpha2 polypeptide. The simulated crossover from molten globule to native-like behavior is very similar to that seen in experimental studies. Simulations on a reduced protein model reproduce most of the experimental properties of the alpha2, alpha2C, and alpha2D proteins. Stable four-helix bundle structures were obtained, with increasing native-like behavior on going from alpha2 to alpha2D that mimics experiment.
Affiliation(s)
- A Sikorski
- Department of Chemistry, University of Warsaw, Poland.
16
Skolnick J, Kolinski A, Ortiz A. Derivation of protein-specific pair potentials based on weak sequence fragment similarity. Proteins 2000; 38:3-16. [PMID: 10651034]
Abstract
A method is presented for the derivation of knowledge-based pair potentials that corrects for the various compositions of different proteins. The resulting statistical pair potential is more specific than that derived from previous approaches as assessed by gapless threading results. Additionally, a methodology is presented that interpolates between statistical potentials when no homologous examples to the protein of interest are in the structural database used to derive the potential, to a Go-like potential (in which native interactions are favorable and all nonnative interactions are not) when homologous proteins are present. For cases in which no protein exceeds 30% sequence identity, pairs of weakly homologous interacting fragments are employed to enhance the specificity of the potential. In gapless threading, the mean z score increases from -10.4 for the best statistical pair potential to -12.8 when the local sequence similarity, fragment-based pair potentials are used. Examination of the ab initio structure prediction of four representative globular proteins consistently reveals a qualitative improvement in the yield of structures in the 4 to 6 Å RMSD from native range when the fragment-based pair potential is used relative to that when the quasichemical pair potential is employed. This suggests that such protein-specific potentials provide a significant advantage relative to generic quasichemical potentials.
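Note: the z scores quoted above measure how far the native energy lies below the energies of threaded decoys. As a hedged illustration only, the sketch below uses one common convention (native energy compared with the mean and population standard deviation of the decoy energies); the abstract does not state the exact definition used by the authors, and the function name and numbers are hypothetical.

```python
import statistics

def threading_z_score(native_energy, decoy_energies):
    """Z score of the native energy relative to a set of decoy energies.

    Assumed convention: (E_native - mean(E_decoys)) / std(E_decoys), with the
    population standard deviation. More negative means better discrimination.
    """
    mean = statistics.mean(decoy_energies)
    std = statistics.pstdev(decoy_energies)
    return (native_energy - mean) / std

# Hypothetical energies in arbitrary units, not data from the paper.
z = threading_z_score(-10.0, [-2.0, 0.0, 2.0])
```

Under this convention, a more specific potential shows up as a more negative z score for the same set of decoys.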
Affiliation(s)
- J Skolnick
- Laboratory of Computational Genomics, Danforth Plant Science Center, St. Louis, Missouri 63108, USA.
17
Abstract
Small peptides that might have some features of globular proteins can provide important insights into the protein folding problem. Two simulation methods, Monte Carlo Dynamics (MCD), based on the Metropolis sampling scheme, and Entropy Sampling Monte Carlo (ESMC), were applied in a study of a high-resolution lattice model of the C-terminal fragment of the B1 domain of protein G. The results provide a detailed description of folding dynamics and thermodynamics and agree with recent experimental findings (Nature 390:196-197). In particular, it was found that the folding is cooperative and has features of an all-or-none transition. Hairpin assembly is usually initiated by turn formation; however, hydrophobic collapse, followed by the system rearrangement, was also observed. The denatured state exhibits a substantial amount of fluctuating helical conformations, despite the strong beta-type secondary structure propensities encoded in the sequence.
Affiliation(s)
- A Kolinski
- Department of Chemistry, University of Warsaw, 02-093 Warsaw, Poland.
18
Abstract
A new method for the homology-based modeling of protein three-dimensional structures is proposed and evaluated. The alignment of a query sequence to a structural template produced by threading algorithms usually produces low-resolution molecular models. The proposed method attempts to improve these models. In the first stage, a high-coordination lattice approximation of the query protein fold is built by suitable tracking of the incomplete alignment of the structural template and connection of the alignment gaps. These initial lattice folds are very similar to the structures resulting from standard molecular modeling protocols. Then, a Monte Carlo simulated annealing procedure is used to refine the initial structure. The process is controlled by the model's internal force field and a set of loosely defined restraints that keep the lattice chain in the vicinity of the template conformation. The internal force field consists of several knowledge-based statistical potentials that are enhanced by a proper analysis of multiple sequence alignments. The template restraints are implemented such that the model chain can slide along the template structure or even ignore a substantial fraction of the initial alignment. The resulting lattice models are, in most cases, closer (sometimes much closer) to the target structure than the initial threading-based models. All atom models could easily be built from the lattice chains. The method is illustrated on 12 examples of target/template pairs whose initial threading alignments are of varying quality. Possible applications of the proposed method for use in protein function annotation are briefly discussed.
Affiliation(s)
- A Kolinski
- Laboratory of Computational Genomics and Bioinformatics, Danforth Plant Science Center, CET, St. Louis, Missouri 63108, USA
19
Abstract
We present our predictions in the ab initio structure prediction category of CASP3. Eleven targets were folded, using a method based on a Monte Carlo search driven by secondary and tertiary restraints derived from multiple sequence alignments. Our results can be qualitatively summarized as follows: The global fold can be considered "correct" for targets 65 and 74, "almost correct" for targets 64, 75, and 77, "half-correct" for target 79, and "wrong" for targets 52, 56, 59, and 63. Target 72 has not yet been solved experimentally. On average, for small helical and alpha/beta proteins (on the order of 110 residues or smaller), the method predicted low resolution structures with a reasonably good prediction of the global topology. Most encouraging is that in some situations, such as with target 75 and, particularly, target 77, the method can predict a substantial portion of a rare or even a novel fold. However, the current method still fails on some beta proteins, proteins over the 110-residue threshold, and sequences in which only a poor multiple sequence alignment can be built. On the other hand, for small proteins, the method gives results of quality at least similar to that of threading, with the advantage of not being restricted to known folds in the protein database. Overall, these results indicate that some progress has been made on the ab initio protein folding problem. Detailed information about our results can be obtained by connecting to http://www.bioinformatics.danforthcenter.org/CASP3.
Affiliation(s)
- A R Ortiz
- Department of Molecular Biology, Scripps Research Institute, La Jolla, California, USA
20
Abstract
Entropy Sampling Monte Carlo (ESMC) simulations were carried out to study the thermodynamics of the folding transition in the GCN4 leucine zipper (GCN4-lz) in the context of a reduced model. Using the calculated partition functions for the monomer and dimer, and taking into account the equilibrium between the monomer and dimer, the average helix content of the GCN4-lz was computed over a range of temperatures and chain concentrations. The predicted helix contents for the native and denatured states of GCN4-lz agree with the experimental values. Similar to experimental results, our helix content versus temperature curves show a small linear decline in helix content with an increase in temperature in the native region. This is followed by a sharp transition to the denatured state. van't Hoff analysis of the helix content versus temperature curves indicates that the folding transition can be described using a two-state model. This indicates that knowledge-based potentials can be used to describe the properties of the folded and unfolded states of proteins.
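Note: the two-state description mentioned above can be illustrated with a simplified van't Hoff curve. The sketch below assumes a unimolecular two-state transition with a temperature-independent enthalpy, and it ignores both the monomer-dimer equilibrium and the linear native baseline described in the abstract. All names and parameter values are hypothetical and are not taken from the paper.

```python
import math

R = 8.314462618  # gas constant in J/(mol K)

def fraction_folded(temperature, delta_h, t_m):
    """Folded fraction for an assumed two-state transition.

    temperature, t_m: absolute temperatures in K; t_m is the midpoint where K = 1.
    delta_h: van't Hoff folding enthalpy in J/mol (negative for an exothermic fold).
    Uses ln K = -(delta_h / R) * (1/T - 1/T_m) with K = [folded]/[unfolded].
    """
    ln_k = -(delta_h / R) * (1.0 / temperature - 1.0 / t_m)
    return 1.0 / (1.0 + math.exp(-ln_k))

def helix_content(temperature, delta_h, t_m, helix_native, helix_denatured):
    """Population-weighted helix content of the two states."""
    f = fraction_folded(temperature, delta_h, t_m)
    return f * helix_native + (1.0 - f) * helix_denatured

# Hypothetical parameters: midpoint at 330 K, folding enthalpy of -200 kJ/mol,
# 90% helix in the native state and 10% in the denatured state.
curve = [helix_content(t, -200000.0, 330.0, 0.9, 0.1) for t in (300.0, 330.0, 360.0)]
```

With these assumed values the helix content stays close to the native value well below the midpoint and drops sharply around it, which is the qualitative shape a two-state fit looks for.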
Affiliation(s)
- D Mohanty
- Department of Molecular Biology, The Scripps Research Institute, La Jolla, California 92037, USA
21
Mohanty D, Dominy BN, Kolinski A, Brooks CL, Skolnick J. Correlation between knowledge-based and detailed atomic potentials: application to the unfolding of the GCN4 leucine zipper. Proteins 1999; 35:447-52. [PMID: 10382672]
Abstract
The relationship between the unfolding pseudo free energies of reduced and detailed atomic models of the GCN4 leucine zipper is examined. Starting from the native crystal structure, a large number of conformations ranging from folded to unfolded were generated by all-atom molecular dynamics unfolding simulations in an aqueous environment at elevated temperatures. For the detailed atomic model, the pseudo free energies are obtained by combining the CHARMM all-atom potential with a solvation component from the generalized Born, surface accessibility, GB/SA, model. Reduced model energies were evaluated using a knowledge-based potential. Both energies are highly correlated. In addition, both show a good correlation with the root mean square deviation, RMSD, of the backbone from native. These results suggest that knowledge-based potentials are capable of describing at least some of the properties of the folded as well as the unfolded states of proteins, even though they are derived from a database of native protein structures. Since only conformations generated from an unfolding simulation are used, we cannot assess whether these potentials can discriminate the native conformation from the manifold of alternative, low-energy misfolded states. Nevertheless, these results also have significant implications for the development of a methodology for multiscale modeling of proteins that combines reduced and detailed atomic models.
Affiliation(s)
- D Mohanty
- Department of Molecular Biology, The Scripps Research Institute, La Jolla, California 92037, USA
22
Abstract
One of the most important unsolved problems of computational biology is prediction of the three-dimensional structure of a protein from its amino acid sequence. In practice, the solution to the protein folding problem demands that two interrelated problems be simultaneously addressed. Potentials that recognize the native state from the myriad of misfolded conformations are required, and the multiple minima conformational search problem must be solved. A means of partly surmounting both problems is to use reduced protein models and knowledge-based potentials. Such models have been employed to elucidate a number of general features of protein folding, including the nature of the energy landscape, the factors responsible for the uniqueness of the native state and the origin of the two-state thermodynamic behavior of globular proteins. Reduced models have also been used to predict protein tertiary and quaternary structure. When combined with a limited amount of experimental information about secondary and tertiary structure, molecules of substantial complexity can be assembled. If predicted secondary structure and tertiary restraints are employed, low resolution models of single domain proteins can be successfully predicted. Thus, simplified protein models have played an important role in furthering the understanding of the physical properties of proteins.
Affiliation(s)
- J Skolnick
- Department of Molecular Biology, The Scripps Research Institute, La Jolla, CA 92037, USA.
23
Kolinski A, Skolnick J. Assembly of protein structure from sparse experimental data: an efficient Monte Carlo model. Proteins 1998; 32:475-94. [PMID: 9726417]
Abstract
A new, efficient method for the assembly of protein tertiary structure from known, loosely encoded secondary structure restraints and sparse information about exact side chain contacts is proposed and evaluated. The method is based on a new, very simple method for the reduced modeling of protein structure and dynamics, where the protein is described as a lattice chain connecting side chain centers of mass rather than Calphas. The model has implicit built-in multibody correlations that simulate short- and long-range packing preferences, hydrogen bonding cooperativity and a mean force potential describing hydrophobic interactions. Due to the simplicity of the protein representation and definition of the model force field, the Monte Carlo algorithm is at least an order of magnitude faster than previously published Monte Carlo algorithms for structure assembly. In contrast to existing algorithms, the new method requires a smaller number of tertiary restraints for successful fold assembly; on average, one for every seven residues as compared to one for every four residues. For example, for smaller proteins such as the B domain of protein G, the resulting structures have a coordinate root mean square deviation (cRMSD), which is about 3 A from the experimental structure; for myoglobin, structures whose backbone cRMSD is 4.3 A are produced, and for a 247-residue TIM barrel, the cRMSD of the resulting folds is about 6 A. As would be expected, increasing the number of tertiary restraints improves the accuracy of the assembled structures. The reliability and robustness of the new method should enable its routine application in model building protocols based on various (very sparse) experimentally derived structural restraints.
Affiliation(s)
- A Kolinski
- Department of Molecular Biology, The Scripps Research Institute, La Jolla, California 92037, USA
24
Abstract
In the context of reduced protein models, Monte Carlo simulations of three de novo designed helical proteins (four-member helical bundle) were performed. At low temperatures, for all proteins under consideration, protein-like folds having different topologies were obtained from random starting conformations. These simulations are consistent with experimental evidence indicating that these de novo designed proteins have the features of a molten globule state. The results of Monte Carlo simulations suggest that these molecules adopt four-helix bundle topologies. They also give insight into the possible mechanism of folding and association, which occurs in these simulations by on-site assembly of the helices. The low-temperature conformations of all three sequences have the features of a molten globule state.
Affiliation(s)
- A Sikorski
- Department of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland.
25
Ortiz AR, Kolinski A, Skolnick J. Fold assembly of small proteins using Monte Carlo simulations driven by restraints derived from multiple sequence alignments. J Mol Biol 1998; 277:419-48. [PMID: 9514747 DOI: 10.1006/jmbi.1997.1595] [Citation(s) in RCA: 73] [Impact Index Per Article: 2.8] [Indexed: 11/22/2022]
Abstract
The feasibility of predicting the global fold of small proteins by incorporating predicted secondary and tertiary restraints into ab initio folding simulations has been demonstrated on a test set comprised of 20 non-homologous proteins, of which one was a blind prediction of target 42 in the recent CASP2 contest. These proteins contain from 37 to 100 residues and represent all secondary structural classes and a representative variety of global topologies. Secondary structure restraints are provided by the PHD secondary structure prediction algorithm that incorporates multiple sequence information. Predicted tertiary restraints are derived from multiple sequence alignments via a two-step process. First, seed side-chain contacts are identified from correlated mutation analysis, and then a threading-based algorithm is used to expand the number of these seed contacts. A lattice-based reduced protein model and a folding algorithm designed to incorporate these predicted restraints is described. Depending upon fold complexity, it is possible to assemble native-like topologies whose coordinate root-mean-square deviation from native is between 3.0 A and 6.5 A. The requisite level of accuracy in side-chain contact map prediction can be roughly 25% on average, provided that about 60% of the contact predictions are correct within +/-1 residue and 95% of the predictions are correct within +/-4 residues. Precision in tertiary contact prediction is more critical than absolute accuracy. Furthermore, only a subset of the tertiary contacts, on the order of 25% of the total, is sufficient for successful topology assembly. Overall, this study suggests that the use of restraints derived from multiple sequence alignments combined with a fold assembly algorithm holds considerable promise for the prediction of the global topology of small proteins.
Affiliation(s)
- A R Ortiz
- TPC-5, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, USA
26
Abstract
Using a recently developed protein folding algorithm, a prediction of the tertiary structure of the KIX domain of the CREB binding protein is described. The method incorporates predicted secondary and tertiary restraints derived from multiple sequence alignments in a reduced protein model whose conformational space is explored by Monte Carlo dynamics. Secondary structure restraints are provided by the PHD secondary structure prediction algorithm that was modified for the presence of predicted U-turns, i.e., regions where the chain reverses global direction. Tertiary restraints are obtained via a two-step process: First, seed side-chain contacts are identified from a correlated mutation analysis, and then, a threading-based algorithm expands the number of these seed contacts. Blind predictions indicate that the KIX domain is a putative three-helix bundle, although the chirality of the bundle could not be uniquely determined. The expected root-mean-square deviation for the correct chirality of the KIX domain is between 5.0 and 6.2 A. This is to be compared with the estimate of 12.9 A that would be expected by a random prediction, using the model of F. Cohen and M. Sternberg (J. Mol. Biol. 138:321-333, 1980).
Affiliation(s)
- A R Ortiz
- Department of Molecular Biology, The Scripps Research Institute, La Jolla, California 92037, USA
27
Ortiz AR, Kolinski A, Skolnick J. Nativelike topology assembly of small proteins using predicted restraints in Monte Carlo folding simulations. Proc Natl Acad Sci U S A 1998; 95:1020-5. [PMID: 9448278 PMCID: PMC18658 DOI: 10.1073/pnas.95.3.1020] [Citation(s) in RCA: 47] [Impact Index Per Article: 1.8] [Indexed: 02/05/2023] Open
Abstract
By incorporating predicted secondary and tertiary restraints derived from multiple sequence alignments into ab initio folding simulations, it has been possible to assemble native-like tertiary structures for a test set of 19 nonhomologous proteins ranging from 29 to 100 residues in length and representing all secondary structural classes. Secondary structural restraints are provided by the PHD secondary structure prediction algorithm that incorporates multiple sequence information. Multiple sequence alignments also provide predicted tertiary restraints via a two-step process: First, seed side chain contacts are selected from a correlated mutation analysis, and then an inverse folding algorithm expands these seed contacts. The predicted secondary and tertiary restraints are incorporated into a lattice-based, reduced protein model for structure assembly and refinement. The resulting native-like topologies exhibit a coordinate root-mean-square deviation from native for the whole chain between 3.1 and 6.7 A, with values ranging from 2.6 to 4.1 A over approximately 80% of the structure. Overall, this study suggests that the use of restraints derived from multiple sequence alignments combined with a fold assembly algorithm is a promising approach to the prediction of the global topology of small proteins.
Affiliation(s)
- A R Ortiz
- Department of Molecular Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
28
Ortiz AR, Kolinski A, Skolnick J. Combined multiple sequence reduced protein model approach to predict the tertiary structure of small proteins. Pac Symp Biocomput 1998:377-388. [PMID: 9697197] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2023]
Abstract
By incorporating predicted secondary and tertiary restraints into ab initio folding simulations, low resolution tertiary structures of a test set of 20 nonhomologous proteins have been predicted. These proteins, which represent all secondary structural classes, contain from 37 to 100 residues. Secondary structural restraints are provided by the PHD secondary structure prediction algorithm that incorporates multiple sequence information. Predicted tertiary restraints are obtained from multiple sequence alignments via a two-step process: First, "seed" side chain contacts are identified from a correlated mutation analysis, and then, the seed contacts are "expanded" by an inverse folding algorithm. These predicted restraints are then incorporated into a lattice based, reduced protein model. Depending upon fold complexity, the resulting nativelike topologies exhibit a coordinate root-mean-square deviation, cRMSD, from native between 3.1 and 6.7 A. Overall, this study suggests that the use of restraints derived from multiple sequence alignments combined with a fold assembly algorithm is a promising approach to the prediction of the global topology of small proteins.
Affiliation(s)
- A R Ortiz
- Department of Molecular Biology, Scripps Research Institute, La Jolla, CA 92037, USA
29
Kolinski A, Skolnick J, Godzik A. An algorithm for prediction of structural elements in small proteins. Pac Symp Biocomput 1997:446-60. [PMID: 9390250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023]
Abstract
A method for predicting the location of surface loops/turns and assigning the intervening secondary structure of the transglobular linkers in small, single domain globular proteins has been developed. Application to a set of 10 proteins of known structure indicates a high level of accuracy. The secondary structure assignment in the center of transglobular connections is correct in more than 85% of the cases. A similar error rate is found for loops. Since more global information about the fold is provided, it is complementary to standard secondary structure prediction approaches. Consequently, it may be useful in early stages of tertiary structure prediction when establishment of the structural class and possible folding topologies is of interest.
Affiliation(s)
- A Kolinski
- Scripps Research Institute, Department of Molecular Biology, La Jolla, California 92037, USA.
30
Abstract
A new and more accurate method has been developed for predicting the backbone U-turn positions (where the chain reverses global direction) and the dominant secondary structure elements between U-turns in globular proteins. The current approach uses sequence-specific secondary structure propensities and multiple sequence information. The latter plays an important role in the enhanced success of this approach. Application to two sets (total 108) of small to medium-sized, single-domain proteins indicates that approximately 94% of the U-turn locations are correctly predicted within three residues, as are 88% of dominant secondary structure elements. These results are significantly better than our previous method (Kolinski et al., Proteins 27:290-308, 1997). The current study strongly suggests that the U-turn locations are primarily determined by local interactions. Furthermore, both global length constraints and local interactions contribute significantly to the determination of the secondary structure types between U-turns. Accurate U-turn predictions are crucial for accurate secondary structure predictions in the current method. Protein structure modeling, tertiary structure predictions, and possibly, fold recognition should benefit from the predicted structural data provided by this new method.
Affiliation(s)
- W P Hu
- Scripps Research Institute, Department of Molecular Biology, La Jolla, CA 92037, USA
31
Skolnick J, Jaroszewski L, Kolinski A, Godzik A. Derivation and testing of pair potentials for protein folding. When is the quasichemical approximation correct? Protein Sci 1997; 6:676-88. [PMID: 9070450 PMCID: PMC2143667 DOI: 10.1002/pro.5560060317] [Citation(s) in RCA: 152] [Impact Index Per Article: 5.6] [Indexed: 02/03/2023]
Abstract
Many existing derivations of knowledge-based statistical pair potentials invoke the quasichemical approximation to estimate the expected side-chain contact frequency if there were no amino acid pair-specific interactions. At first glance, the quasichemical approximation that treats the residues in a protein as being disconnected and expresses the side-chain contact probability as being proportional to the product of the mole fractions of the pair of residues would appear to be rather severe. To investigate the validity of this approximation, we introduce two new reference states in which no specific pair interactions between amino acids are allowed, but in which the connectivity of the protein chain is retained. The first estimates the expected number of side-chain contacts by treating the protein as a Gaussian random coil polymer. The second, more realistic reference state includes the effects of chain connectivity, secondary structure, and chain compactness by estimating the expected side-chain contact probability by placing the sequence of interest in each member of a library of structures of comparable compactness to the native conformation. The side-chain contact maps are not allowed to readjust to the sequence of interest, i.e., the side chains cannot repack. This situation would hold rigorously if all amino acids were the same size. Both reference states effectively permit the factorization of the side-chain contact probability into sequence-dependent and structure-dependent terms. Then, because the sequence distribution of amino acids in proteins is random, the quasichemical approximation to each of these reference states is shown to be excellent. Thus, the range of validity of the quasichemical approximation is determined by the magnitude of the side-chain repacking term, which is, at present, unknown.
Finally, the performance of these two sets of pair interaction potentials as well as side-chain contact fraction-based interaction scales is assessed by inverse folding tests both without and with allowing for gaps.
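The quasichemical reference state described in this abstract can be illustrated with a minimal sketch. It assumes the expected number of contacts between residue types is the product of their mole fractions times the total number of contacts, and that the pair potential is the negative logarithm of observed over expected counts, in units of kT. The function name, the two-letter alphabet, and the toy counts are hypothetical and are not taken from the paper.

```python
import math

def quasichemical_potential(composition, observed):
    """Return {(a, b): -ln(N_obs / N_exp)} in units of kT.

    composition: residue type -> number of residues of that type
    observed:    unordered pair (a, b) -> observed number of contacts
    """
    n_res = sum(composition.values())
    mole_fraction = {a: c / n_res for a, c in composition.items()}
    total_contacts = sum(observed.values())
    potential = {}
    for (a, b), n_obs in observed.items():
        if n_obs == 0:
            continue  # the logarithm is undefined for unobserved pairs
        # An unordered pair of different types can form in two ways.
        multiplicity = 1.0 if a == b else 2.0
        n_exp = multiplicity * mole_fraction[a] * mole_fraction[b] * total_contacts
        potential[(a, b)] = -math.log(n_obs / n_exp)
    return potential

# Toy data (hypothetical): equal amounts of two residue types.
toy_composition = {"L": 50, "K": 50}
toy_observed = {("L", "L"): 40, ("K", "L"): 40, ("K", "K"): 20}
toy_potential = quasichemical_potential(toy_composition, toy_observed)
```

In this toy case the L-L pair is observed more often than expected and receives a negative (favorable) energy, while the other two pairs are under-represented and receive positive energies.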
Affiliation(s)
- J Skolnick
- Department of Molecular Biology, Scripps Research Institute, La Jolla, California 92037, USA.
32
Kolinski A, Skolnick J, Godzik A, Hu WP. A method for the prediction of surface "U"-turns and transglobular connections in small proteins. Proteins 1997; 27:290-308. [PMID: 9061792] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2023]
Abstract
A simple method for predicting the location of surface loops/turns that change the overall direction of the chain, that is, "U" turns, and assigning the dominant secondary structure of the intervening transglobular blocks in small, single-domain globular proteins has been developed. Since the emphasis of the method is on the prediction of the major topological elements that comprise the global structure of the protein rather than on a detailed local secondary structure description, this approach is complementary to standard secondary structure prediction schemes. Consequently, it may be useful in the early stages of tertiary structure prediction when establishment of the structural class and possible folding topologies is of interest. Application to a set of small proteins of known structure indicates a high level of accuracy. The prediction of the approximate location of the surface turns/loops that are responsible for the change in overall chain direction is correct in more than 95% of the cases. The accuracy for the dominant secondary structure assignment for the linear blocks between such surface turns/loops is in the range of 82%.
Affiliation(s)
- A Kolinski
- Scripps Research Institute, Department of Molecular Biology, La Jolla, California, USA.
33
Abstract
The MONSSTER (MOdeling of New Structures from Secondary and TErtiary Restraints) method for folding of proteins using a small number of long-distance restraints (which can be up to seven times less than the total number of residues) and some knowledge of the secondary structure of regular fragments is described. The method employs a high-coordination lattice representation of the protein chain that incorporates a variety of potentials designed to produce protein-like behaviour. These include statistical preferences for secondary structure, side-chain burial interactions, and a hydrogen-bond potential. Using this algorithm, several globular proteins (1ctf, 2gbl, 2trx, 3fxn, 1mba, 1pcy and 6pti) have been folded to moderate-resolution, native-like compact states. For example, the 68 residue 1ctf molecule having ten loosely defined, long-range restraints was reproducibly obtained with a C alpha-backbone root-mean-square deviation (RMSD) from native of about 4 A. Flavodoxin with 35 restraints has been folded to structures whose average RMSD is 4.28 A. Furthermore, using just 20 restraints, myoglobin, which is a 146 residue helical protein, has been folded to structures whose average RMSD from native is 5.65 A. Plastocyanin with 25 long-range restraints adopts conformations whose average RMSD is 5.44 A. Possible applications of the proposed approach to the refinement of structures from NMR data, homology model-building and the determination of tertiary structure when the secondary structure and a small number of restraints are predicted are briefly discussed.
Affiliation(s)
- J Skolnick
- Scripps Research Institute, Department of Molecular Biology, La Jolla, California 92037, USA
34
Ortiz AR, Hu WP, Kolinski A, Skolnick J. Method for low resolution prediction of small protein tertiary structure. Pac Symp Biocomput 1997:316-327. [PMID: 9390302] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2023]
Abstract
A new method for the de novo prediction of protein structures at low resolution has been developed. Starting from a multiple sequence alignment, protein secondary structure is predicted, and only those topological elements with high reliability are selected. Then, the multiple sequence alignment and the secondary structure prediction are combined to predict side chain contacts. Such contact map prediction is carried out in two stages. First, an analysis of correlated mutations is carried out to identify pairs of topological elements of secondary structure which are in contact. Then, inverse folding is used to select compatible fragments in contact, thereby enriching the number and identity of predicted side chain contacts. The final outcome of the procedure is a set of noisy secondary and tertiary restraints. These are used as a restrained potential in a Monte Carlo simulation of simplified protein models driven by statistical potentials. Low energy structures are then searched for by using simulated annealing techniques. Implementation of the restraints is carried out so as to take account of their low resolution. Using this procedure, it has been possible to predict de novo the structure of three very different protein topologies: an alpha/beta protein, the bovine pancreatic trypsin inhibitor (6pti), an alpha-helical protein, calbindin (3icb), and an all-beta protein, the SH3 domain of spectrin (1shg). In all cases, low resolution folds have been obtained with a root mean square deviation (RMSD) of 4.5-5.5 A with respect to the native structure. Some misfolded topologies appear in the simulations, but it is possible to select the native one on energetic grounds. Thus, it is demonstrated that the methodology is general for all protein motifs. Work is in progress in order to test the methodology on a larger set of protein structures.
Affiliation(s)
- A R Ortiz
- Department of Molecular Biology, Scripps Research Institute, La Jolla, CA 92037, USA
35
Abstract
There is considerable experimental evidence that the cooperativity of protein folding resides in the transition from the molten globule to the native state. The objective of this study is to examine whether simplified models can reproduce this cooperativity and if so, to identify its origin. In particular, the thermodynamics of the conformational transition of a previously designed sequence (A. Kolinski, W. Galazka, and J. Skolnick, J. Chem. Phys. 103: 10286-10297, 1995), which adopts a very stable Greek-key beta-barrel fold has been investigated using the entropy sampling Monte Carlo (ESMC) technique of Hao and Scheraga (M.-H. Hao and H.A. Scheraga, J. Phys. Chem. 98: 9882-9883, 1994). Here, in addition to the original potential, which includes one body and pair interactions between side chains, the force field has been supplemented by two types of multi-body potentials describing side chain interactions. These potentials facilitate the protein-like pattern of side chain packing and consequently increase the cooperativity of the folding process. Those models that include an explicit cooperative side chain packing term exhibit a well-defined all-or-none transition from a denatured, random coil state to a high-density, well-defined, nativelike low-energy state. By contrast, models lacking such a term exhibit a conformational transition that is essentially continuous. Finally, an examination of the conformations at the free-energy barrier between the native and denatured states reveals that they contain a substantial amount of native-state secondary structure, about 50% of the native contacts, and have an average root mean square radius of gyration that is about 15% larger than native.
Affiliation(s)
- A Kolinski
- Department of Molecular Biology, Scripps Research Institute, La Jolla, California 92037, USA
36
Abstract
In solution, the B domain of protein A from Staphylococcus aureus (B domain) possesses a three-helix bundle structure. This simple motif has been previously reproduced by Kolinski and Skolnick (Proteins 18: 353-366, 1994) using a reduced representation lattice model of proteins with a statistical interaction scheme. In this paper, an improved version of the potential has been used, and the robustness of this result has been tested by folding from the random state a set of three-helix bundle proteins that are highly homologous to the B domain of protein A. Furthermore, an attempt to redesign the B domain native structure to its topological mirror image fold has been made by multiple mutations of the hydrophobic core and the turn region between helices I and II. A sieve method for scanning a large set of mutations to search for this desired property has been proposed. It has been shown that mutations of native B domain hydrophobic core do not introduce significant changes in the protein motif. Mutations in the turn region were also very conservative; nevertheless, a few mutants acquired the desired topological mirror image motif. A set of all atom models of the most probable mutant was reconstructed from the reduced models and refined using a molecular dynamics algorithm in the presence of water. The packing of all atom structures obtained corroborates the lattice model results. We conclude that the change in the handedness of the turn induced by the mutations, augmented by the repacking of hydrophobic core and the additional burial of the second helix N-cap side chain, are responsible for the predicted preferential adoption of the mirror image structure.
Affiliation(s)
- K A Olszewski
- Department of Molecular Biology, Scripps Research Institute, La Jolla, California 92037, USA
37
Abstract
A method that employs a transfer matrix treatment combined with Monte Carlo sampling has been used to calculate the configurational free energies of folded and unfolded states of lattice models of proteins. The method is successfully applied to study the monomer-dimer equilibria in various coiled coils. For the short coiled coils, GCN4 leucine zipper, and its fragments, Fos and Jun, very good agreement is found with experiment. Experimentally, some subdomains of the GCN4 leucine zipper form stable dimeric structures, suggesting the regions of differential stability in the parent structure. Our calculations suggest that the stabilities of the subdomains are in general different from the values expected simply from the stability of the corresponding fragment in the wild type molecule. Furthermore, parts of the fragments structurally rearrange in some regions with respect to their corresponding wild type positions. Our results suggest that, for an Asn in the dimerization interface, at least a pair of hydrophobic interacting helical turns on each side is required to stabilize the coiled coil. Finally, the specificity of heterodimer formation in the Fos-Jun system comes from the relative instability of Fos homodimers, resulting from unfavorable intra- and interhelical interactions in the interfacial coiled coil region.
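To make the link between a dimerization free energy and a monomer-dimer equilibrium concrete, the following sketch assumes a simple two-state scheme 2M ⇌ D with association constant K = exp(-ΔG/RT) relative to a 1 M standard state. It does not reproduce the transfer matrix or Monte Carlo calculation of the paper; the function name and the free energy and concentration values are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def dimer_fraction(delta_g_kcal, total_conc_molar, temperature_k=298.15):
    """Fraction of chains found in dimers for the equilibrium 2M <-> D.

    delta_g_kcal:     free energy of dimerization (kcal/mol, 1 M standard state)
    total_conc_molar: total chain concentration, [M] + 2[D], in mol/L
    """
    k_assoc = math.exp(-delta_g_kcal / (R_KCAL * temperature_k))
    # Mass balance: 2*K*[M]^2 + [M] - C_total = 0, solved for the positive root.
    monomer = (-1.0 + math.sqrt(1.0 + 8.0 * k_assoc * total_conc_molar)) / (4.0 * k_assoc)
    return 1.0 - monomer / total_conc_molar

# Hypothetical example: the same free energy at two chain concentrations.
fraction_dilute = dimer_fraction(-8.0, 1e-6)
fraction_concentrated = dimer_fraction(-8.0, 1e-5)
```

Under these assumptions the dimer population grows with total chain concentration at fixed free energy, which is why predictions of this kind are compared with experiment over a concentration range.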
Affiliation(s)
- M Vieth
- Department of Chemistry, Scripps Research Institute, La Jolla, California 92037, USA
38
Abstract
Amino acid sequences of native proteins are generally not palindromic. Nevertheless, the protein molecule obtained as a result of reading the sequence backwards, i.e. a retro-protein, obviously has the same amino acid composition and the same hydrophobicity profile as the native sequence. The important questions which arise in the context of retro-proteins are: does a retro-protein fold to a well defined native-like structure as natural proteins do and, if the answer is positive, does a retro-protein fold to a structure similar to the native conformation of the original protein? In this work, the fold of retro-protein A, originated from the retro-sequence of the B domain of Staphylococcal protein A, was studied. As a result of lattice model simulations, it is conjectured that the retro-protein A also forms a three-helix bundle structure in solution. It is also predicted that the topology of the retro-protein A three-helix bundle is that of the native protein A, rather than that corresponding to the mirror image of native protein A. Secondary structure elements in the retro-protein do not exactly match their counterparts in the original protein structure; however, the amino acid side chain contact pattern of the hydrophobic core is partly conserved.
Affiliation(s)
- K A Olszewski
- Department of Molecular Biology, Scripps Research Institute, La Jolla, CA 92037, USA
39
Abstract
Using a simplified protein model, the equilibrium between different oligomeric species of the wild-type GCN4 leucine zipper and seven of its mutants has been predicted. Over the entire experimental concentration range, agreement with experiment is found in five cases, while in two cases agreement is found over a portion of the concentration range. These studies demonstrate a methodology for predicting coiled coil quaternary structure and allow for the dissection of the interactions responsible for the global fold. In agreement with the conclusion of Harbury et al., the results of the simulations indicate that the pattern of hydrophobic and hydrophilic residues alone is insufficient to define a protein's three-dimensional structure. In addition, these simulations indicate that the degree of chain association is determined by the balance between specific side-chain packing preferences and the entropy reduction associated with side-chain burial in higher-order multimers.
Affiliation(s)
- M Vieth
- Department of Molecular Biology, Scripps Research Institute, La Jolla, CA 92037, USA
40
Abstract
An artificial neural network system is used for pattern recognition in protein side-chain-side-chain contact maps. A back-propagation network was trained on a set of patterns which are popular in side-chain contact maps of protein structures. Several neural network architectures and different training parameters were tested to decide on the best combination for the neural network. The resulting network can distinguish between original (from protein structures) and randomized patterns with an accuracy of 84.5% and a Matthews' coefficient of 0.72 for the testing set. Applications of this system for protein structure evaluation and refinement are also proposed. Examples include structures obtained after the application of molecular dynamics to crystal structures, structures obtained from X-ray crystallography at various stages of refinement, structures obtained from a de novo folding algorithm and deliberately misfolded structures.
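The two figures of merit quoted in this abstract, accuracy and the Matthews coefficient, can both be computed from a binary confusion matrix. The sketch below uses the standard definitions; the confusion-matrix counts are hypothetical and are not the data behind the reported 84.5% and 0.72.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Fraction of patterns classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def matthews_coefficient(tp, tn, fp, fn):
    """Matthews correlation coefficient, ranging from -1 to +1."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # conventional value when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator

# Hypothetical counts: "positive" = original pattern, "negative" = randomized pattern.
toy_accuracy = accuracy(tp=45, tn=40, fp=10, fn=5)
toy_mcc = matthews_coefficient(tp=45, tn=40, fp=10, fn=5)
```

Unlike accuracy, the Matthews coefficient stays near zero for a classifier that always predicts the majority class, which makes it the more informative of the two when the classes are unbalanced.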
Affiliation(s)
- M Milik
- R. W. Johnson Pharmaceutical Research Institute, San Diego, CA 92121, USA
41
Abstract
A hierarchical approach is described for the prediction of the three-dimensional structure and folding pathway of the GCN4 leucine zipper. Dimer assembly is simulated by Monte Carlo dynamics. The resulting lowest energy structures undergo cooperative rearrangement of their hydrophobic core leading to side-chain fixation. The coarse-grained structures are further refined using a molecular dynamics annealing protocol. This produces full atom models with a backbone root-mean-square deviation from the crystal structure of 0.81 A. Thus, we demonstrate the predictive ability of our approach to yield high resolution structures of small coiled coils from their sequence.
Affiliation(s)
- M Vieth
- Department of Molecular Biology, Scripps Research Institute, La Jolla, CA 92037
42
Abstract
The hierarchy of lattice Monte Carlo models described in the accompanying paper (Kolinski, A., Skolnick, J. Monte Carlo simulations of protein folding. I. Lattice model and interaction scheme. Proteins 18:338-352, 1994) is applied to the simulation of protein folding and the prediction of 3-dimensional structure. Using sequence information alone, three proteins have been successfully folded: the B domain of staphylococcal protein A, a 120 residue, monomeric version of ROP dimer, and crambin. Starting from a random expanded conformation, the model proteins fold along relatively well-defined folding pathways. These involve a collection of early intermediates, which are followed by the final (and rate-determining) transition from compact intermediates closely resembling the molten globule state to the native-like state. The predicted structures are rather unique, with native-like packing of the side chains. The accuracy of the predicted native conformations is better than those obtained in previous folding simulations. The best (but by no means atypical) folds of protein A have a coordinate rms of 2.25 A from the native C alpha trace, and the best coordinate rms from crambin is 3.18 A. For ROP monomer, the lowest coordinate rms from equivalent C alpha s of ROP dimer is 3.65 A. Thus, for two simple helical proteins and a small alpha/beta protein, the ability to predict protein structure from sequence has been demonstrated.
Affiliation(s)
- A Kolinski
- Department of Molecular Biology, Scripps Research Institute, La Jolla, California 92037
43
Abstract
A new hierarchical method for the simulation of the protein folding process and the de novo prediction of protein three-dimensional structure is proposed. The reduced representation of the protein alpha-carbon backbone employs lattice discretizations of increasing geometrical resolution and a single ball representation of side chain rotamers. In particular, coarser and finer lattice backbone descriptions are used. The coarser (finer) lattice represents C alpha traces of native proteins with an accuracy of 1.0 (0.7) A rms. Folding is simulated by means of very fast Monte Carlo lattice dynamics. The potential of mean force, predominantly of statistical origin, contains several novel terms that facilitate the cooperative assembly of secondary structure elements and the cooperative packing of the side chains. Particular contributions to the interaction scheme are discussed in detail. In the accompanying paper (Kolinski, A., Skolnick, J. Monte Carlo simulation of protein folding. II. Application to protein A, ROP, and crambin. Proteins 18:353-366, 1994), the method is applied to three small globular proteins.
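The abstract does not state which acceptance rule drives the Monte Carlo lattice dynamics. As an illustration only, the sketch below assumes the standard Metropolis criterion with temperature in reduced units (Boltzmann constant set to 1); `metropolis_accept` and its parameters are hypothetical names, not the authors' implementation.

```python
import math
import random


def metropolis_accept(delta_e, temperature, rng=random):
    """Decide whether to accept a trial move that changes the energy by delta_e.

    Assumption: standard Metropolis rule in reduced units (k_B = 1).
    Downhill or neutral moves are always accepted; uphill moves are
    accepted with probability exp(-delta_e / temperature).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if delta_e <= 0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)


if __name__ == "__main__":
    rng = random.Random(0)
    # Fraction of uphill moves (delta_e = 1) accepted at temperature 1.
    trials = 10_000
    accepted = sum(metropolis_accept(1.0, 1.0, rng) for _ in range(trials))
    print(accepted / trials)
```

A lattice simulation would call such a rule after each proposed local move of the chain, using the change in the potential of mean force as `delta_e`.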
Affiliation(s)
- A Kolinski
- Department of Molecular Biology, Scripps Research Institute, La Jolla, California 92037
44
Abstract
The description of protein structure in the language of side chain contact maps is shown to offer many advantages over more traditional approaches. Because it focuses on side chain interactions, it aids in the discovery, study and classification of similarities between interactions defining particular protein folds and offers new insights into the rules of protein structure. For example, there is a small number of characteristic patterns of interactions between protein supersecondary structural fragments, which can be seen in various non-related proteins. Furthermore, the overlap of the side chain contact maps of two proteins provides a new measure of protein structure similarity. As shown in several examples, alignments based on contact map overlaps are a powerful alternative to other structure-based alignments.
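The idea of comparing two structures through the overlap of their contact maps can be sketched as follows. This is a simplified illustration: it assumes one representative point per residue, a hypothetical 6.5 Å distance cutoff, a minimum sequence separation of three residues, and that the two proteins are already aligned residue by residue. The alignment search described in the abstract is not reproduced, and the normalisation is one possible choice.

```python
from itertools import combinations
from math import dist


def contact_map(coords, cutoff=6.5, min_separation=3):
    """Return the set of residue index pairs (i, j) that are in contact.

    Assumptions: coords holds one (x, y, z) point per residue (for example a
    side-chain centre), and cutoff / min_separation are illustrative values.
    """
    contacts = set()
    for (i, a), (j, b) in combinations(enumerate(coords), 2):
        if j - i >= min_separation and dist(a, b) <= cutoff:
            contacts.add((i, j))
    return contacts


def contact_overlap(map_a, map_b):
    """Fraction of shared contacts, normalised by the larger map (0.0 if both are empty)."""
    larger = max(len(map_a), len(map_b))
    return len(map_a & map_b) / larger if larger else 0.0


if __name__ == "__main__":
    # A six-residue hairpin-like trace and a fully extended trace.
    hairpin = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
               (7.6, 3.8, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
    extended = [(3.8 * i, 0.0, 0.0) for i in range(6)]
    print(sorted(contact_map(hairpin)))
    print(contact_overlap(contact_map(hairpin), contact_map(extended)))
```

Representing each map as a set of index pairs keeps the overlap a simple set intersection, which is convenient for small examples but would be replaced by array operations for large proteins.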
Affiliation(s)
- A Godzik
- Department of Molecular Biology, Scripps Research Institute, La Jolla, CA 92037
45
Abstract
In the last two years, the use of simplified models has facilitated major progress in the globular protein folding problem, viz., the prediction of the three-dimensional (3D) structure of a globular protein from its amino acid sequence. A number of groups have addressed the inverse folding problem where one examines the compatibility of a given sequence with a given (and already determined) structure. A comparison of extant inverse protein-folding algorithms is presented, and methodologies for identifying sequences likely to adopt identical folding topologies, even when they lack sequence homology, are described. Extension to produce structural templates or fingerprints from idealized structures is discussed, and for eight-membered beta-barrel proteins, it is shown that idealized fingerprints constructed from simple topology diagrams can correctly identify sequences having the appropriate topology. Furthermore, this inverse folding algorithm is generalized to predict elements of supersecondary structure including beta-hairpins, helical hairpins and alpha/beta/alpha fragments. Then, we describe a very high coordination number lattice model that can predict the 3D structure of a number of globular proteins de novo; i.e. using just the amino acid sequence. Applications to sequences designed by DeGrado and co-workers [Biophys. J., 61 (1992) A265] predict folding intermediates, native states and relative stabilities in accord with experiment. The methodology has also been applied to the four-helix bundle designed by Richardson and co-workers [Science, 249 (1990) 884] and a redesigned monomeric version of a naturally occurring four-helix dimer, rop. Based on comparison to the rop dimer, the simulations predict conformations with rms values of 3-4 A from native. Furthermore, the de novo algorithms can assess the stability of the folds predicted from the inverse algorithm, while the inverse folding algorithms can assess the quality of the de novo models. 
Thus, the synergism of the de novo and inverse folding algorithm approaches provides a set of complementary tools that will facilitate further progress on the protein-folding problem.
Affiliation(s)
- A Godzik
- Department of Molecular Biology, Scripps Research Institute, La Jolla, CA 92037
46
Skolnick J, Kolinski A, Brooks CL, Godzik A, Rey A. A method for predicting protein structure from sequence. Curr Biol 1993; 3:414-23. [PMID: 15335708 DOI: 10.1016/0960-9822(93)90348-r] [Citation(s) in RCA: 47] [Impact Index Per Article: 1.5] [Received: 04/19/1993] [Revised: 06/08/1993] [Accepted: 06/08/1993] [Indexed: 10/26/2022]
Abstract
BACKGROUND: The ability to predict the native conformation of a globular protein from its amino-acid sequence is an important unsolved problem of molecular biology. We have previously reported a method in which reduced representations of proteins are folded on a lattice by Monte Carlo simulation, using statistically-derived potentials. When applied to sequences designed to fold into four-helix bundles, this method generated predicted conformations closely resembling the real ones. RESULTS: We now report a hierarchical approach to protein-structure prediction, in which two cycles of the above-mentioned lattice method (the second on a finer lattice) are followed by a full-atom molecular dynamics simulation. The end product of the simulations is thus a full-atom representation of the predicted structure. The application of this procedure to the 60 residue, B domain of staphylococcal protein A predicts a three-helix bundle with a backbone root mean square (rms) deviation of 2.25-3 A from the experimentally determined structure. Further application to a designed, 120 residue monomeric protein, mROP, based on the dimeric ROP protein of Escherichia coli, predicts a left turning, four-helix bundle native state. Although the ultimate assessment of the quality of this prediction awaits the experimental determination of the mROP structure, a comparison of this structure with the set of equivalent residues in the ROP dimer crystal structure indicates that they have a rms deviation of approximately 3.6-4.2 A. CONCLUSION: Thus, for a set of helical proteins that have simple native topologies, the native folds of the proteins can be predicted with reasonable accuracy from their sequences alone. Our approach suggests a direction for future work addressing the protein-folding problem.
Affiliation(s)
- J Skolnick
- Department of Molecular Biology, The Scripps Research Institute, 10666 N. Torrey Pines Road, La Jolla, CA 92037, USA
47
Skolnick J, Kolinski A, Godzik A. From independent modules to molten globules: observations on the nature of protein folding intermediates. Proc Natl Acad Sci U S A 1993; 90:2099-100. [PMID: 8460114 PMCID: PMC46030 DOI: 10.1073/pnas.90.6.2099] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.3] [Indexed: 01/30/2023] Open
Affiliation(s)
- J Skolnick
- Department of Molecular Biology, Scripps Research Institute, La Jolla, CA 92037
48
Abstract
We describe the most general solution to date of the problem of matching globular protein sequences to the appropriate three-dimensional structures. The screening template, against which sequences are tested, is provided by a protein "structural fingerprint" library based on the contact map and the buried/exposed pattern of residues. Then, a lattice Monte Carlo algorithm validates or dismisses the stability of the proposed fold. Examples of known structural similarities between proteins having weakly or unrelated sequences such as the globins and phycocyanins, the eight-member alpha/beta fold of triose phosphate isomerase and even a close structural equivalence between azurin and immunoglobulins are found.
Affiliation(s)
- A Godzik
- Department of Molecular Biology, Scripps Research Institute, La Jolla, CA 92037
49
Godzik A, Skolnick J, Kolinski A. Simulations of the folding pathway of triose phosphate isomerase-type alpha/beta barrel proteins. Proc Natl Acad Sci U S A 1992; 89:2629-33. [PMID: 1557367 PMCID: PMC48715 DOI: 10.1073/pnas.89.7.2629] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.1] [Indexed: 12/27/2022] Open
Abstract
Simulations of the folding pathways of two large alpha/beta proteins, the alpha subunit of tryptophan synthase and triose phosphate isomerase, are reported using the knight's walk lattice model of globular proteins and Monte Carlo dynamics. Starting from randomly generated unfolded states and with no assumptions regarding the nature of the folding intermediates, for the tryptophan synthase subunit these simulations predict, in agreement with experiment, the existence and location of a stable equilibrium intermediate comprised of six beta strands on the amino terminus of the molecule. For the case of triose phosphate isomerase, the simulations predict that both amino- and carboxyl-terminal intermediates should be observed. In a significant modification of previous lattice models, this model includes a full heavy atom side chain description and is capable of representing native conformations at the level of 2.5- to 3-A rms deviation for the C alpha positions, as compared to the crystal structure. With a well-balanced compromise between accuracy of the protein description and the computer requirements necessary to perform simulations spanning biologically significant amounts of time, the lattice model described here brings the possibility of studying important biological processes to present-day computers.
Affiliation(s)
- A Godzik
- Department of Molecular Biology, Scripps Research Institute, La Jolla, CA 92037
50
Abstract
A long-standing problem of molecular biology is the prediction of globular protein tertiary structure from the primary sequence. In the context of a new, 24-nearest-neighbor lattice model of proteins that includes both alpha and beta-carbon atoms, the requirements for folding to a unique four-member beta-barrel, four-helix bundles and a model alpha/beta-bundle have been explored. A number of distinct situations are examined, but the common requirements for the formation of a unique native conformation are tertiary interactions plus the presence of relatively small (but not irrelevant) intrinsic turn preferences that select out the native conformer from a manifold of compact states. When side-chains are explicitly included, there are many conformations having the same or a slightly greater number of side-chain contacts as in the native conformation, and it is the local intrinsic turn preferences that produce the conformational selectivity on collapse. The local preference for helix or beta-sheet secondary structure may be at odds with the secondary structure ultimately found in the native conformation. The requisite intrinsic turn populations are about 0.3% for beta-proteins, 2% for mixed alpha/beta-proteins and 6% for helix bundles. In addition, an idealized model of an allosteric conformational transition has been examined. Folding occurs predominantly by a sequential on-site assembly mechanism with folding initiating either at a turn or from an isolated helix or beta-strand (where appropriate). For helical and beta-protein models, similar folding pathways were obtained in diamond lattice simulations, using an entirely different set of local Monte Carlo moves. This argues strongly that the results are universal; that is, they are independent of lattice, protein model or the particular realization of Monte Carlo dynamics. 
Overall, these simulations demonstrate that the folding of all known protein motifs can be achieved in the context of a single class of lattice models that includes realistic backbone structures and idealized side-chains.
Affiliation(s)
- J Skolnick
- Department of Molecular Biology, Scripps Clinic and Research Foundation, La Jolla, CA 92037