1
Skolnick J, Gao M. The role of local versus nonlocal physicochemical restraints in determining protein native structure. Curr Opin Struct Biol 2020; 68:1-8. [PMID: 33129066 DOI: 10.1016/j.sbi.2020.10.008] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 09/17/2020] [Revised: 10/03/2020] [Accepted: 10/05/2020] [Indexed: 12/15/2022]
Abstract
The tertiary structure of a native protein is dictated by the interplay of local secondary structure propensities, hydrogen bonding, and tertiary interactions. It is argued that the space of known protein topologies covers all single domain folds and results from the compactness of the native structure and excluded volume. Protein compactness combined with the chirality of the protein's side chains also yields native-like Ramachandran plots. It is the many-body, tertiary interactions among residues that collectively select for the global structure that a particular protein sequence adopts. This explains why the recent advances in deep-learning approaches that predict protein side-chain contacts, the distance matrix between residues, and sequence alignments are successful. They succeed because they implicitly learned the many-body interactions among protein residues.
Affiliation(s)
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive, NW, Atlanta, GA 30332, United States.
- Mu Gao
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, 950 Atlantic Drive, NW, Atlanta, GA 30332, United States.
2
On the possible origin of protein homochirality, structure, and biochemical function. Proc Natl Acad Sci U S A 2019; 116:26571-26579. [PMID: 31822617 DOI: 10.1073/pnas.1908241116] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Indexed: 01/12/2023]
Abstract
Living systems have chiral molecules, e.g., native proteins that almost entirely contain L-amino acids. How protein homochirality emerged from a background of equal numbers of L and D amino acids is among many questions about life's origin. The origin of homochirality and its implications are explored in computer simulations examining the stability and structural and functional properties of an artificial library of compact proteins containing 1:1 (termed demi-chiral), 3:1, and 1:3 ratios of D:L and purely L or D amino acids generated without functional selection. Demi-chiral proteins have shorter secondary structures and fewer internal hydrogen bonds and are less stable than homochiral proteins. Selection for hydrogen bonding yields a preponderance of L or D amino acids. Demi-chiral proteins have native global folds, including similarity to early ribosomal proteins, similar small molecule ligand binding pocket geometries, and many constellations of L-chiral amino acids with a 1.0-Å RMSD to native enzyme active sites. For a representative subset containing 550 active site geometries matching 457 (2) 4-digit (3-digit) enzyme classification (E.C.) numbers, native active site amino acids were generated at random for 472 of 550 cases. This increases to 548 of 550 cases when similar residues are allowed. The most frequently generated sequences correspond to ancient enzymatic functions, e.g., glycolysis, replication, and nucleotide biosynthesis. Surprisingly, even without selection, demi-chiral proteins possess the requisite marginal biochemical function and structure of modern proteins, but were thermodynamically less stable. If demi-chiral proteins were present, they could engage in early metabolism, which created the feedback loop for transcription and cell formation.
3
Espinoza EM, Clark JA, Derr JB, Bao D, Georgieva B, Quina FH, Vullev VI. How Do Amides Affect the Electronic Properties of Pyrene? ACS Omega 2018; 3:12857-12867. [PMID: 31458010 PMCID: PMC6644773 DOI: 10.1021/acsomega.8b01581] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 07/09/2018] [Accepted: 09/24/2018] [Indexed: 05/12/2023]
Abstract
The electronic properties of amide linkers, which are intricate components of biomolecules, offer a wealth of unexplored possibilities. Herein, we demonstrate how the different modes of attaching an amide to a pyrene chromophore affect the electrochemical and optical properties of the chromophore. Thus, although they cause minimal spectral shifts, amide substituents can improve either the electron-accepting or electron-donating capabilities of pyrene. Specifically, inversion of the amide orientation shifts the reduction potentials by 200 mV. These trends indicate that, although amides affect to a similar extent the energies of the ground and singlet excited states of pyrene, the effects on the doublet states of its radical ions are distinctly different. This behavior reflects the unusually strong orientation dependence of the resonance effects of amide substituents, which should extend to amide substituents on other types of chromophores in general. These results represent an example where the Hammett sigma constants fail to predict substituent effects on electrochemical properties. On the other hand, Swain-Lupton parameters are found to be in good agreement with the observed trends. Examination of the frontier orbitals of the pyrene derivatives and their components reveals the underlying reason for the observed amide effects on the electronic properties of this polycyclic aromatic hydrocarbon and points to key molecular-design strategies for electronic and energy-conversion systems.
Affiliation(s)
- Eli M. Espinoza
- Department of Chemistry, Department of Bioengineering, Department of Biochemistry, and Materials Science and Engineering Program, University of California, Riverside, California 92521, United States
- Instituto de Química, Universidade de São Paulo, Avenida Lineu Prestes 748, Cidade Universitária, São Paulo 05508-000, Brazil
- John A. Clark
- Department of Chemistry, Department of Bioengineering, Department of Biochemistry, and Materials Science and Engineering Program, University of California, Riverside, California 92521, United States
- James B. Derr
- Department of Chemistry, Department of Bioengineering, Department of Biochemistry, and Materials Science and Engineering Program, University of California, Riverside, California 92521, United States
- Duoduo Bao
- Department of Chemistry, Department of Bioengineering, Department of Biochemistry, and Materials Science and Engineering Program, University of California, Riverside, California 92521, United States
- Boriana Georgieva
- Department of Chemistry, Department of Bioengineering, Department of Biochemistry, and Materials Science and Engineering Program, University of California, Riverside, California 92521, United States
- Frank H. Quina
- Instituto de Química, Universidade de São Paulo, Avenida Lineu Prestes 748, Cidade Universitária, São Paulo 05508-000, Brazil
- E-mail: (F.H.Q.)
- Valentine I. Vullev
- Department of Chemistry, Department of Bioengineering, Department of Biochemistry, and Materials Science and Engineering Program, University of California, Riverside, California 92521, United States
- E-mail: (V.I.V.)
4
Shin WH, Christoffer CW, Kihara D. In silico structure-based approaches to discover protein-protein interaction-targeting drugs. Methods 2017; 131:22-32. [PMID: 28802714 PMCID: PMC5683929 DOI: 10.1016/j.ymeth.2017.08.006] [Citation(s) in RCA: 49] [Impact Index Per Article: 6.1] [Received: 06/12/2017] [Revised: 08/08/2017] [Accepted: 08/08/2017] [Indexed: 02/07/2023]
Abstract
A core concept behind modern drug discovery is finding a small molecule that modulates a function of a target protein. This concept has been successfully applied since the mid-1970s. However, the efficiency of drug discovery is decreasing because the druggable target space in the human proteome is limited. Recently, protein-protein interaction (PPI) has been identified as an emerging target space for drug discovery. PPI plays a pivotal role in biological pathways including diseases. Current human interactome research suggests that the number of PPIs is between 130,000 and 650,000, and only a small number of them have been targeted as drug targets. For traditional drug targets, in silico structure-based methods have been successful in many cases. However, their performance suffers on PPI interfaces because PPI interfaces are different in five major aspects: From a geometric standpoint, they have relatively large interface regions, flat geometry, and the interface surface shape tends to fluctuate upon binding. Also, their interactions are dominated by hydrophobic atoms, which is different from traditional binding-pocket-targeted drugs. Finally, PPI targets usually lack natural molecules that bind to the target PPI interface. Here, we first summarize characteristics of PPI interfaces and their known binders. Then, we will review existing in silico structure-based approaches for discovering small molecules that bind to PPI interfaces.
Affiliation(s)
- Woong-Hee Shin
- Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
- Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA; Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA.
5
Brender JR, Shultis D, Khattak NA, Zhang Y. An Evolution-Based Approach to De Novo Protein Design. Methods Mol Biol 2017; 1529:243-264. [PMID: 27914055 PMCID: PMC5667548 DOI: 10.1007/978-1-4939-6637-0_12] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 01/19/2023]
Abstract
EvoDesign is a computational algorithm that allows the rapid creation of new protein sequences that are compatible with specific protein structures. As such, it can be used to optimize protein stability, to resculpt the protein surface to eliminate undesired protein-protein interactions, and to optimize protein-protein binding. A major distinguishing feature of EvoDesign in comparison to other protein design programs is the use of evolutionary information in the design process to guide the sequence search toward native-like sequences known to adopt structurally similar folds as the target. The observed frequencies of amino acids in specific positions in the structure in the form of structural profiles collected from proteins with similar folds and complexes with similar interfaces can implicitly capture many subtle effects that are essential for correct folding and protein-binding interactions. As a result of the inclusion of evolutionary information, the sequences designed by EvoDesign have native-like folding and binding properties not seen by other physics-based design methods. In this chapter, we describe how EvoDesign can be used to redesign proteins with a focus on the computational and experimental procedures that can be used to validate the designs.
6
Abstract
Native proteins perform an amazing variety of biochemical functions, including enzymatic catalysis, and can engage in protein-protein and protein-DNA interactions that are essential for life. A key question is how special are these functional properties of proteins. Are they extremely rare, or are they an intrinsic feature? Comparison to the properties of compact conformations of artificially generated compact protein structures selected for thermodynamic stability but not any type of function, the artificial (ART) protein library, demonstrates that a remarkable number of the properties of native-like proteins are recapitulated. These include the complete set of small molecule ligand-binding pockets and most protein-protein interfaces. ART structures are predicted to be capable of weakly binding metabolites and cover a significant fraction of metabolic pathways, with the most enriched pathways including ancient ones such as glycolysis. Native-like active sites are also found in ART proteins. A small fraction of ART proteins are predicted to have strong protein-protein and protein-DNA interactions. Overall, it appears that biochemical function is an intrinsic feature of proteins which nature has significantly optimized during evolution. These studies raise questions as to the relative roles of specificity and promiscuity in the biochemical function and control of cells that need investigation.
Affiliation(s)
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, Atlanta, GA, USA
- Mu Gao
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, Atlanta, GA, USA
- Hongyi Zhou
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, Atlanta, GA, USA
7
Skolnick J, Gao M, Roy A, Srinivasan B, Zhou H. Implications of the small number of distinct ligand binding pockets in proteins for drug discovery, evolution and biochemical function. Bioorg Med Chem Lett 2015; 25:1163-70. [PMID: 25690787 DOI: 10.1016/j.bmcl.2015.01.059] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 11/14/2014] [Revised: 01/23/2015] [Accepted: 01/24/2015] [Indexed: 01/05/2023]
Abstract
Coincidence of the properties of ligand binding pockets in native proteins with those in proteins generated by computer simulations without selection for function shows that pockets are a generic protein feature and the number of distinct pockets is small. Similar pockets occur in unrelated protein structures, an observation successfully employed in pocket-based virtual ligand screening. The small number of pockets suggests that off-target interactions among diverse proteins are inherent; kinases, proteases and phosphatases show this prototypical behavior. The ability to repurpose FDA approved drugs is general, and minor side effects cannot be avoided. Finally, the implications to drug discovery are explored.
Affiliation(s)
- Jeffrey Skolnick
- Center for the Study of Systems Biology, Georgia Institute of Technology, 250 14th St NW, Atlanta, GA 30318, USA.
- Mu Gao
- Center for the Study of Systems Biology, Georgia Institute of Technology, 250 14th St NW, Atlanta, GA 30318, USA
- Ambrish Roy
- Center for the Study of Systems Biology, Georgia Institute of Technology, 250 14th St NW, Atlanta, GA 30318, USA
- Bharath Srinivasan
- Center for the Study of Systems Biology, Georgia Institute of Technology, 250 14th St NW, Atlanta, GA 30318, USA
- Hongyi Zhou
- Center for the Study of Systems Biology, Georgia Institute of Technology, 250 14th St NW, Atlanta, GA 30318, USA
8
Skolnick J, Gao M, Zhou H. On the role of physics and evolution in dictating protein structure and function. Isr J Chem 2014; 54:1176-1188. [PMID: 25484448 PMCID: PMC4255337 DOI: 10.1002/ijch.201400013] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 12/17/2022]
Abstract
How many of the structural and functional properties of proteins are inherent? Computer simulations provide a powerful tool to address this question. A series of studies on QS, quasi-spherical, compact polypeptides which lack any secondary structure; ART, artificial, proteins comprised of compact homopolypeptides with protein-like secondary structure; and PDB, native, single domain proteins shows that essentially all native global folds, pockets and protein-protein interfaces are in the ART library. This suggests that many protein properties are inherent and that evolution is involved in fine-tuning. The completeness of the space of ligand binding pockets and protein-protein interfaces suggests that promiscuous interactions are intrinsic to proteins and that the capacity to perform the biochemistry of life at low level does not require evolution. If so, this has profound consequences for the origin of life.
Affiliation(s)
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street NW, Atlanta, GA 30318, USA
- Mu Gao
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street NW, Atlanta, GA 30318, USA
- Hongyi Zhou
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street NW, Atlanta, GA 30318, USA
9
Agbo JK, Gnanasekaran R, Leitner DM. Communication Maps: Exploring Energy Transport through Proteins and Water. Isr J Chem 2014. [DOI: 10.1002/ijch.201300139] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Indexed: 12/15/2022]
10
Guilloux A, Caudron B, Jestin JL. A method to predict edge strands in beta-sheets from protein sequences. Comput Struct Biotechnol J 2013; 7:e201305001. [PMID: 24688737 PMCID: PMC3962219 DOI: 10.5936/csbj.201305001] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 02/22/2013] [Revised: 05/27/2013] [Accepted: 05/30/2013] [Indexed: 12/15/2022]
Abstract
There is a need for rules allowing three-dimensional structure information to be derived from protein sequences. In this work, consideration of an elementary protein folding step allows protein sub-sequences which optimize folding to be derived for any given protein sequence. Classical mechanics applied to this system and the energy conservation law during the elementary folding step yields an equation whose solutions are taken over the field of rational numbers. This formalism is applied to beta-sheets containing two edge strands and at least two central strands. The number of protein sub-sequences optimized for folding per amino acid in beta-strands is shown in particular to predict edge strands from protein sequences. Topological information on beta-strands and loops connecting them is derived for protein sequences with a prediction accuracy of 75%. The statistical significance of the finding is given. Applications in protein structure prediction are envisioned such as for the quality assessment of protein structure models.
Affiliation(s)
- Antonin Guilloux
- Analyse algébrique, Institut de Mathématiques de Jussieu, Université Pierre et Marie Curie, Paris VI, France
- Bernard Caudron
- Centre d'Informatique pour la Biologie, Institut Pasteur, Paris, France
11
Interplay of physics and evolution in the likely origin of protein biochemical function. Proc Natl Acad Sci U S A 2013; 110:9344-9. [PMID: 23690621 DOI: 10.1073/pnas.1300011110] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Indexed: 01/11/2023]
Abstract
The intrinsic ability of protein structures to exhibit the geometric and sequence properties required for ligand binding without evolutionary selection is shown by the coincidence of the properties of pockets in native, single domain proteins with those in computationally generated, compact homopolypeptide, artificial (ART) structures. The library of native pockets is covered by a remarkably small number of representative pockets (∼400), with virtually every native pocket having a statistically significant match in the ART library, suggesting that the library is complete. When sequences are selected for ART structures based on fold stability, pocket sequence conservation is coincident to native. The fact that structurally and sequentially similar pockets occur across fold classes combined with the small number of representative pockets in native proteins implies that promiscuous interactions are inherent to proteins. Based on comparison of PDB (real, single domain protein structures found in the Protein Data Bank) and ART structures and pockets, the widespread assumption that the co-occurrence of global structure, pocket similarity, and amino acid conservation demands an evolutionary relationship between proteins is shown to significantly underestimate the random background probability. Indeed, many features of biochemical function arise from the physical properties of proteins that evolution likely fine-tunes to achieve specificity. Finally, our study suggests that a repertoire of thermodynamically (marginally) stable proteins could engage in many of the biochemical reactions needed for living systems without selection for function, a conclusion with significant implications for the origin of life.
12
Fawcett TM, Irausquin SJ, Simin M, Valafar H. An artificial neural network approach to improving the correlation between protein energetics and the backbone structure. Proteomics 2012. [PMID: 23184572 DOI: 10.1002/pmic.201200330] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/12/2022]
Abstract
Computational approaches to modeling protein structures have made significant advances over the past decade. However, the current limitation in modeling protein structures is to produce protein structures consistently below the limit of 6 Å compared to their native structure. Therefore, improvement of protein structures consistently below the 6 Å limit using simulation of biophysical forces is of significant interest. Current protein force fields such as those implemented in CHARMM, AMBER, and NAMD have been deemed complete, yet their use in ab initio approaches to protein structure determination has been unsuccessful. Here, we introduce a new approach in evaluation of protein structures based on analysis of energy profiles produced by the SCOPE software package. The latest version of SCOPE produces a hydrogen bond profile that is substantially more informative than a single hydrogen bond energy value. We demonstrate how analysis of SCOPE's energy profile by an artificial neural network shows a significant improvement compared to the traditional force-based approaches to evaluation of structures. The artificial neural network based analysis of SCOPE's energy profile showed identification of structures to within the range of 1.5-3.0 Å of the native structure. These results have been obtained by testing structures in the same Homology, Topology, Architecture, or Class of the CATH family.
Affiliation(s)
- Timothy M Fawcett
- Computer Science & Engineering, University of South Carolina, Columbia, SC 29208, USA
13
The distribution of ligand-binding pockets around protein-protein interfaces suggests a general mechanism for pocket formation. Proc Natl Acad Sci U S A 2012; 109:3784-9. [PMID: 22355140 DOI: 10.1073/pnas.1117768109] [Citation(s) in RCA: 75] [Impact Index Per Article: 5.8] [Indexed: 11/18/2022]
Abstract
Protein-protein and protein-ligand interactions are ubiquitous in a biological cell. Here, we report a comprehensive study of the distribution of protein-ligand interaction sites, namely ligand-binding pockets, around protein-protein interfaces where protein-protein interactions occur. We inspected a representative set of 1,611 representative protein-protein complexes and identified pockets with a potential for binding small molecule ligands. The majority of these pockets are within a 6 Å distance from protein interfaces. Accordingly, in about half of ligand-bound protein-protein complexes, amino acids from both sides of a protein interface are involved in direct contacts with at least one ligand. Statistically, ligands are closer to a protein-protein interface than a random surface patch of the same solvent accessible surface area. Similar results are obtained in an analysis of the ligand distribution around domain-domain interfaces of 1,416 nonredundant, two-domain protein structures. Furthermore, comparable sized pockets as observed in experimental structures are present in artificially generated protein complexes, suggesting that the prominent appearance of pockets around protein interfaces is mainly a structural consequence of protein packing and thus, is an intrinsic geometric feature of protein structure. Nature may take advantage of such a structural feature by selecting and further optimizing for biological function. We propose that packing nearby protein-protein or domain-domain interfaces is a major route to the formation of ligand-binding pockets.
14
Skolnick J, Zhou H, Brylinski M. Further evidence for the likely completeness of the library of solved single domain protein structures. J Phys Chem B 2012; 116:6654-64. [PMID: 22272723 DOI: 10.1021/jp211052j] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Indexed: 11/29/2022]
Abstract
Recent studies questioned whether the Protein Data Bank (PDB) contains all compact, single domain protein structures. Here, we show that all quasi-spherical, QS, random protein structures devoid of secondary structure are in the PDB and are excellent templates for all native PDB proteins up to 250 residues. Because QS templates have a similar global contour as native, TASSER can refine 98% (90%) of those whose TM-score is 0.4 (0.35) to structures greater than or equal to the 0.5 TM-score threshold (0.74 (0.64) mean TM-score) for CATH/SCOP assignment. On the basis of this and the fact that, at a TM-score of 0.4, 83% (90%) of all (internal) core secondary structure elements are recovered, a 0.40 TM-score is an appropriate fold similarity assignment threshold. Despite the claims of Taylor, Trovato, and Zhou that many of their structures lack a PDB counterpart, using fr-TM-align, at a 0.45 (0.5) TM-score threshold, essentially all (most) are found in the PDB. Thus, the conclusion that the PDB is likely complete is further supported.
Affiliation(s)
- Jeffrey Skolnick
- Center for the Study of Systems Biology, Georgia Institute of Technology, 250 14th Street NW, Atlanta, Georgia 30318, USA.
15
Brylinski M, Gao M, Skolnick J. Why not consider a spherical protein? Implications of backbone hydrogen bonding for protein structure and function. Phys Chem Chem Phys 2011; 13:17044-55. [PMID: 21655593 DOI: 10.1039/c1cp21140d] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 12/14/2022]
Abstract
The intrinsic ability of protein structures to exhibit the geometric features required for molecular function in the absence of evolution is examined in the context of three systems: the reference set of real, single domain protein structures, a library of computationally generated, compact homopolypeptides, artificial structures with protein-like secondary structural elements, and quasi-spherical random proteins packed at the same density as proteins but lacking backbone secondary structure and hydrogen bonding. Without any evolutionary selection, the library of artificial structures has similar backbone hydrogen bonding, global shape, surface to volume ratio and statistically significant structural matches to real protein global structures. Moreover, these artificial structures have native like ligand binding cavities, and a tiny subset has interfacial geometries consistent with native-like protein-protein interactions and DNA binding. In contrast, the quasi-spherical random proteins, being devoid of secondary structure, have a lower surface to volume ratio and lack ligand binding pockets and intermolecular interaction interfaces. Surprisingly, these quasi-spherical random proteins exhibit protein like distributions of virtual bond angles and almost all have a statistically significant structural match to real protein structures. This implies that it is local chain stiffness, even without backbone hydrogen bonding, and compactness that give rise to the likely completeness of the library of solved single domain protein structures. These studies also suggest that the packing of secondary structural elements generates the requisite geometry for intermolecular binding. Thus, backbone hydrogen bonding plays an important role not only in protein structure but also in protein function.
Such ability to bind biological molecules is an inherent feature of protein structure; if combined with appropriate protein sequences, it could provide the non-zero background probability for low-level function that evolution requires for selection to occur.
Affiliation(s)
- Michal Brylinski
- Center for the Study of Systems Biology, Georgia Institute of Technology, 250 14th St NW, Atlanta, GA 30076, USA