1
Hassan M, Coutsias EA. Kinematic Reconstruction of Cyclic Peptides and Protein Backbones from Partial Data. J Chem Inf Model 2021; 61:4975-5000. [PMID: 34570494 PMCID: PMC10129052 DOI: 10.1021/acs.jcim.1c00453] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
We present an algorithm, QBKR (Quaternary Backbone Kinematic Reconstruction), a fast analytical method for an all-atom backbone reconstruction of proteins and linear or cyclic peptide chains from Cα coordinate traces. Unlike previous analytical methods for deriving all-atom representations from coarse-grained models that rely on canonical geometry with planar peptides in the trans conformation, our de novo kinematic model incorporates noncanonical, cis-trans, geometry naturally. Perturbations to this geometry can be effected with ease in our formulation, for example, to account for a continuous change from cis to trans geometry. A simple optimization of a spring-based objective function is employed for Cα-Cα distance variations that extend beyond the cis-trans limit. The kinematic construction produces a linked chain of peptide units, Cα-C-N-Cα, hinged at the Cα atoms spanning all possible planar and nonplanar peptide conformations. We have combined our method with a ring closure algorithm for the case of ring peptides and missing loops in a protein structure. Here, the reconstruction proceeding from both the N and C termini of the protein backbone (or in both directions from a starting position for rings) requires freedom in the position of one Cα atom (a capstone) to achieve a successful loop or ring closure. A salient feature of our reconstruction method is the ability to enrich conformational ensembles to produce alternative feasible conformations in which H-bond forming C-O or N-H pairs in the backbone can reverse orientations, thus addressing a well-known shortcoming in Cα-based RMSD structure comparison, wherein very close structures may lead to significantly different overall H-bond behavior. We apply the fixed Cα-based design to the reverse reconstruction from noisy Cryo-EM data, a posteriori to the optimization. Our method can be applied to speed up the process of an all-atom description from voluminous experimental data or subpar electron density maps.
Affiliation(s)
- Mosavverul Hassan
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York 11794, United States
- Evangelos A Coutsias
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, New York 11794, United States; Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794-5252, United States
2
Aszódi A, Taylor WR. Folding polypeptide α-carbon backbones by distance geometry methods. Biopolymers 2004. [DOI: 10.1002/bip.360340406] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.5] [Indexed: 11/07/2022]
3
Jyothi S, Joshi RR. Protein structure determination by non-parametric regression and knowledge-based constraints. Computers & Chemistry 2001; 25:283-99. [PMID: 11339411 DOI: 10.1016/s0097-8485(00)00104-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 10/18/2022]
Abstract
We have devised a non-parametric regression-based approach for the estimation of small- and medium-range inter-residue three-dimensional (3D) distances in a protein using only the primary sequence as input. A multivariate analysis of variance technique is used to identify the attributes of the primary sequence that are most effective in determining the tertiary structure. Certain compactness and hydrophobic core building heuristics are used along with the estimated distances in a distance geometry program to predict the 3D structure (tertiary fold). Our method is found to predict correctly the native topologies of small proteins having up to 150 residues. The sensitivity of the structures to long-range distance constraints is studied by incorporating a small number of NMR distance restraints. In terms of modularity, precision, accuracy and computational efficiency, our method is found to be better than current computational methods such as X-PLOR and DRAGON on the sample that was reported in the literature for the comparison of these two methods.
Affiliation(s)
- S Jyothi
- Department of Mathematics, Indian Institute of Technology Bombay, Powai, Mumbai, India
4
Saitoh S, Nakai T, Nishikawa K. A geometrical constraint approach for reproducing the native backbone conformation of a protein. Proteins 1993; 15:191-204. [PMID: 8441754 DOI: 10.1002/prot.340150209] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.1] [Indexed: 01/30/2023]
Abstract
It is known that the backbone conformation of a protein can be reproduced with precision once a correct contact map (two-dimensional representation showing residue pairs in contact) is given as geometrical constraints. There is, however, no way to infer the correct contact map for a protein of unknown structure. We started with one-dimensional constraints using the quantity N14 (the number of neighboring residues within a radius of 14 Å). Since the plot of N14 along a chain shows a good correlation with the corresponding amino acid sequence, the N14 profile obtained from the X-ray structure is predictable from the sequence. Construction of backbone conformations under a given N14 profile was carried out in the following two steps: (1) a contact map from the N14 profile was produced by taking the product of N14 values of every two residues; (2) backbone conformations were generated by applying the distance geometry technique to distance constraints given by the contact map. If present, disulfide bonds in a protein, as well as the secondary structure, were treated as additional constraints, and both cases with or without the additional information were examined. The method was tested for 11 proteins of known structure, and the results indicated that the reproduced conformation was fairly good, using an X-ray structure for comparison, for small proteins less than 80 residues long. The basic assumption and effectiveness of the present method were compared with those of previous studies employing the geometrical constraint approach. (ABSTRACT TRUNCATED AT 250 WORDS)
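The N14 quantity and the pairwise product described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `n14_profile` and `product_map`, the use of one representative point per residue, and the choice to exclude the residue itself from its own count are all assumptions made here for illustration. The abstract does not say how the products are converted into contacts, so the sketch stops at the raw products.

```python
import math

def n14_profile(coords, radius=14.0):
    """For each residue, count the other residues within `radius` angstroms.

    Assumption: one representative point per residue (e.g. its C-alpha),
    and the residue itself is not counted.
    """
    counts = []
    for i, a in enumerate(coords):
        n = sum(
            1
            for j, b in enumerate(coords)
            if j != i and math.dist(a, b) <= radius
        )
        counts.append(n)
    return counts

def product_map(profile):
    """Pairwise products of N14 values, the raw input to a contact map."""
    return [[ni * nj for nj in profile] for ni in profile]

# Toy trace (hypothetical data): four points on a line, 10 angstroms apart.
toy = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (20.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
profile = n14_profile(toy)
scores = product_map(profile)
```

On the toy trace, the two interior points each have two neighbors within 14 Å and the end points have one, so interior pairs receive the largest products.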
Affiliation(s)
- S Saitoh
- Protein Engineering Research Institute, Osaka, Japan
5
Rashin AA. Aspects of protein energetics and dynamics. Progress in Biophysics and Molecular Biology 1993; 60:73-200. [PMID: 8362069 DOI: 10.1016/0079-6107(93)90017-e] [Citation(s) in RCA: 61] [Impact Index Per Article: 2.0] [Indexed: 01/30/2023]
Affiliation(s)
- A A Rashin
- Biosym Technologies Inc, Parsippany, NJ 07054
6
Wako H, Kubota Y. Distance-constraint approach to higher-order structures of globular proteins with empirically determined distances between amino acid residues. Journal of Protein Chemistry 1991; 10:233-43. [PMID: 1930636 DOI: 10.1007/bf01024787] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 12/29/2022]
Abstract
An analysis of higher-order structures of globular proteins by means of a distance-constraint approach is presented. Conformations are generated for each of 21 test proteins of small and medium sizes by optimizing an objective function $f = \sum_i \sum_j w_{ij} (d_{ij} - \langle d_{ij} \rangle)^2$, where $d_{ij}$ is a distance between residues i and j in a calculated conformation, $\langle d_{ij} \rangle$ is an assigned distance to the (ij) pair of residues which is determined based on the statistics of known three-dimensional structures of 14 proteins in the earlier study, and $w_{ij}$ is a weighting factor. $\langle d_{ij} \rangle$ involves information about hydrophobicity and hydrophilicity of each amino acid residue and about connectivity of a polypeptide chain. In these calculations, only the amino acid sequence is used as input data specific to a calculated protein. With respect to higher-order structures regenerated in the optimized conformations, the following properties are analyzed: (a) N14 of a residue, defined as the number of residues surrounding the residue located within a sphere of radius 14 Å; (b) root-mean-square differences of the global and local conformations from the corresponding X-ray conformations; (c) distance profiles in the short and medium ranges; and (d) distance maps. The effects of supplementary information about locations of secondary structures and disulfide bonds are also examined to discuss the potential ability of this methodology to predict the three-dimensional structures of globular proteins.
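As a rough illustration of the weighted squared-deviation objective named in this abstract, the following Python sketch evaluates it for a given set of coordinates. The function name `distance_objective`, the restriction of the double sum to pairs with i < j, and the toy inputs are assumptions introduced here; the abstract does not give the assigned distances, the weights, or the optimizer.

```python
import math

def distance_objective(coords, target, weight):
    """Sum of weight[i][j] * (d_ij - target[i][j])**2 over pairs i < j.

    coords: one 3D point per residue (assumed representation).
    target: assigned distances for each residue pair.
    weight: weighting factors for each residue pair.
    """
    total = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            d_ij = math.dist(coords[i], coords[j])
            total += weight[i][j] * (d_ij - target[i][j]) ** 2
    return total

# Toy example (hypothetical data): a 3-4-5 triangle, every target set to 4.
pts = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
targets = [[4.0] * 3 for _ in range(3)]
weights = [[1.0] * 3 for _ in range(3)]
f_value = distance_objective(pts, targets, weights)
```

With pair distances of 3, 4, and 5 and a common target of 4, the deviations are -1, 0, and 1, so the objective evaluates to 2. A conformation search would minimize this value over the coordinates.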
Affiliation(s)
- H Wako
- School of Social Sciences, Waseda University, Tokyo, Japan
7
Kikuchi T, Némethy G, Scheraga HA. Prediction of the location of structural domains in globular proteins. Journal of Protein Chemistry 1988; 7:427-71. [PMID: 3255372 DOI: 10.1007/bf01024890] [Citation(s) in RCA: 52] [Impact Index Per Article: 1.4] [Indexed: 01/04/2023]
Abstract
The location of structural domains in proteins is predicted from the amino acid sequence, based on the analysis of a computed contact map for the protein, the average distance map (ADM). Interactions between residues i and j in a protein are subdivided into several ranges, according to the separation |i-j| in the amino acid sequence. Within each range, average spatial distances between every pair of amino acid residues are computed from a data base of known protein structures. Infrequently occurring pairs are omitted as being statistically insignificant. The average distances are used to construct a predicted ADM. The ADM is analyzed for the occurrence of regions with high densities of contacts (compact regions). Locations of rapid changes of density between various parts of the map are determined by the use of scanning plots of contact densities. These locations serve to pinpoint the distribution of compact regions. This distribution, in turn, is used to predict boundaries of domains in the protein. The technique provides an objective method for the location of domains both on a contact map derived from a known three-dimensional protein structure, the real distance map (RDM), and on an ADM. While most other published methods for the identification of domains locate them in the known three-dimensional structure of a protein, the technique presented here also permits the prediction of domains in proteins of unknown spatial structure, as the construction of the ADM for a given protein requires knowledge of only its amino acid sequence.
Affiliation(s)
- T Kikuchi
- Baker Laboratory of Chemistry, Cornell University, Ithaca, New York 14853-1301
8
Vásquez M, Scheraga HA. Calculation of protein conformation by the build-up procedure. Application to bovine pancreatic trypsin inhibitor using limited simulated nuclear magnetic resonance data. J Biomol Struct Dyn 1988; 5:705-55. [PMID: 2482758 DOI: 10.1080/07391102.1988.10506425] [Citation(s) in RCA: 59] [Impact Index Per Article: 1.6] [Indexed: 01/01/2023]
Abstract
Low-energy conformations of a set of tetrapeptides derived from the small protein bovine pancreatic trypsin inhibitor (BPTI) were generated by a build-up procedure from the low-energy conformations of single amino acid residues. At each stage, various-size fragments were built up from all combinations of smaller ones, the total energies were then minimized, and the low-energy conformations were retained for the next stage. The energies of the tetrapeptides were re-ordered by including the effects of hydration. No information other than the amino acid sequence was used to obtain the low-energy conformations of the hydrated tetrapeptides. The latter were then supplemented with a limited set of simulated NMR distance information, derived from the X-ray structure of BPTI, to provide a basis for building the rest of the whole protein molecule by the same procedure. A total of 189 upper bounds, plus 12 pairs of upper and lower bounds pertaining to the location of the three disulfide bonds in this molecule, were used. Four sets of conformations of the entire molecule were generated by utilizing different combinations of smaller fragments. It was possible to obtain low-energy conformations with small rms deviations, 1.1 to 1.4 Å for the α-carbons, from the structure derived by X-ray diffraction. The average deviations of the backbone dihedral angles were also low, viz. 23° to 26°.
Affiliation(s)
- M Vásquez
- Baker Laboratory of Chemistry, Cornell University, Ithaca, New York 14853-1301
9
Srinivasan S, Shibata M, Roychoudhury M, Rein R. Multistep modeling of protein structure: application towards refinement of tyr-tRNA synthetase. International Journal of Quantum Chemistry. Quantum Biology Symposium: Proceedings of the International Symposium on Quantum Biology and Quantum Pharmacology 1987; 14:281-8. [PMID: 11542105 DOI: 10.1002/qua.560320825] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 04/16/2023]
Abstract
The scope of multistep modeling (MSM) is expanded by adding a least-squares minimization step to the procedure, so that the reconstructed backbone is consistent with a set of C-alpha coordinates. The analytical solution for the Phi and Psi angles that fit the C-alpha X-ray coordinates is used for tyr-tRNA synthetase. For regions where this method fails, Phi and Psi angles are obtained by minimizing the difference in C-alpha distances between the computed model and the crystal structure in a least-squares sense. We present a stepwise application of this part of MSM to the determination of the complete backbone geometry of the 321 N-terminal residues of tyrosine tRNA synthetase, to a root-mean-square deviation of 0.47 Å from the crystallographic C-alpha coordinates.
Affiliation(s)
- S Srinivasan
- Department of Biophysics, Roswell Park Memorial Institute, Buffalo, New York 14263, USA
10
Goel NS, Thompson RL. Organization of biological systems: some principles and models. International Review of Cytology 1986; 103:1-88. [PMID: 3528019 DOI: 10.1016/s0074-7696(08)60833-5] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.3] [Indexed: 01/06/2023]
11
Cariani P, Goel NS. On the computation of the tertiary structure of globular proteins--IV. Use of secondary structure information. Bull Math Biol 1985; 47:367-407. [PMID: 4041667 DOI: 10.1007/bf02459922] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/08/2023]
12
Novotný J, Bruccoleri R, Karplus M. An analysis of incorrectly folded protein models. Implications for structure predictions. J Mol Biol 1984; 177:787-818. [PMID: 6434748 DOI: 10.1016/0022-2836(84)90049-4] [Citation(s) in RCA: 202] [Impact Index Per Article: 5.1] [Indexed: 01/20/2023]
Abstract
Proteins with homologous amino acid sequences have similar folds and it has been assumed that an unknown three-dimensional structure can be obtained from a known homologous structure by substituting new side-chains into the polypeptide chain backbone, followed by relatively small adjustment of the model. To examine this approach of structure prediction and, more generally, to isolate the characteristics of native proteins, we constructed two incorrectly folded protein models. Sea-worm hemerythrin and the variable domain of mouse immunoglobulin κ-chain, two proteins with no sequence homology, were chosen for study; the former is composed of a bundle of four α-helices and the latter consists of two 4-stranded β-sheets. Using an automatic computer procedure, hemerythrin side-chains were substituted into the immunoglobulin domain and vice versa. The structures were energy-minimized with the program CHARMM and the resulting structures compared with the correctly folded forms. It was found that the incorrect side-chains can be incorporated readily into both types of structures (α-helices, β-sheets) with only small structural adjustments. After constrained energy-minimization, which led to an average atomic co-ordinate shift of no more than 0.7 to 0.9 Å, the incorrectly folded models arrived at potential energy values comparable to those of the correct structures. Detailed analysis of the energy results shows that the incorrect structures have less stabilizing electrostatic, van der Waals' and hydrogen-bonding interactions. The difference is particularly pronounced when the electrostatic and van der Waals' energy terms are calculated by modified equations that include an approximate representation of solvent effects. The incorrectly folded structures also have a significantly larger solvent-accessible surface and a greater fraction of non-polar side-chain atoms exposed to solvent.
Examination of their interior shows that the packing of side-chains at the secondary structure interfaces, although corresponding to sterically allowed conformations, deviates from the characteristics found in normal proteins. The analysis of incorrectly folded structures has made it clear that the absence of bad non-bonded contacts, though necessary, is not sufficient to demonstrate the validity of model-built structures and that modeling of homologous structures has to be accompanied by a thorough quantitative evaluation of the results. Further, certain features that characterize native proteins are made evident by their absence in misfolded models.
13
Havel T, Wüthrich K. A distance geometry program for determining the structures of small proteins and other macromolecules from nuclear magnetic resonance measurements of intramolecular ¹H-¹H proximities in solution. Bull Math Biol 1984. [DOI: 10.1007/bf02459510] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.2] [Indexed: 10/24/2022]
14
Abe H, Braun W, Noguti T, Gō N. Rapid calculation of first and second derivatives of conformational energy with respect to dihedral angles for proteins: general recurrent equations. Computers & Chemistry 1984. [DOI: 10.1016/0097-8485(84)85015-9] [Citation(s) in RCA: 92] [Impact Index Per Article: 2.3] [Indexed: 11/28/2022]