1
|
Salazar-Ciudad I, Cano-Fernández H. Evo-devo beyond development: Generalizing evo-devo to all levels of the phenotypic evolution. Bioessays 2023; 45:e2200205. [PMID: 36739577 DOI: 10.1002/bies.202200205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2022] [Revised: 12/25/2022] [Accepted: 01/12/2023] [Indexed: 02/06/2023]
Abstract
A foundational idea of evo-devo is that morphological variation is not isotropic, that is, it does not occur in all directions. Instead, some directions of morphological variation are more likely than others from DNA-level variation and these largely depend on development. We argue that this evo-devo perspective should apply not only to morphology but to evolution at all phenotypic levels. At other phenotypic levels there is no development, but there are processes that can be seen, in analogy to development, as constructing the phenotype (e.g., protein folding, learning for behavior, etc.). We argue that to explain the direction of evolution two types of arguments need to be combined: generative arguments about which phenotypic variation arises in each generation and selective arguments about which of it passes to the next generation. We explain how a full consideration of the two types of arguments improves the explanatory power of evolutionary theory. Also see the video abstract here: https://youtu.be/Egbvma_uaKc.
Affiliation(s)
- Isaac Salazar-Ciudad
- Centre de Recerca Matemàtica, Cerdanyola del Vallès, Spain.,Genomics, Bioinformatics and Evolution, Departament de Genètica i Microbiologia, Universitat Autònoma de Barcelona, Barcelona, Spain
| | - Hugo Cano-Fernández
- Genomics, Bioinformatics and Evolution, Departament de Genètica i Microbiologia, Universitat Autònoma de Barcelona, Barcelona, Spain
| |
|
2
|
Finkelstein AV, Bogatyreva NS, Ivankov DN, Garbuzynskiy SO. Protein folding problem: enigma, paradox, solution. Biophys Rev 2022; 14:1255-1272. [PMID: 36659994 PMCID: PMC9842845 DOI: 10.1007/s12551-022-01000-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/22/2022] [Accepted: 09/19/2022] [Indexed: 01/22/2023] Open
Abstract
The ability of protein chains to spontaneously form their three-dimensional structures is a long-standing mystery in molecular biology. The most conceptual aspect of this mystery is how the protein chain can find its native, "working" spatial structure (which, for not too big protein chains, corresponds to the global free energy minimum) in a biologically reasonable time, without exhaustive enumeration of all possible conformations, which would take billions of years. This is the so-called "Levinthal's paradox." In this review, we discuss the key ideas and discoveries leading to the current understanding of protein folding kinetics, including folding landscapes and funnels, free energy barriers at the folding/unfolding pathways, and the solution of Levinthal's paradox. A special role here is played by the "all-or-none" phase transition occurring at protein folding and unfolding and by the point of thermodynamic (and kinetic) equilibrium between the "native" and the "unfolded" phases of the protein chain (where the theory obtains the simplest form). The modern theory provides an understanding of key features of protein folding and, in good agreement with experiments, it (i) outlines the chain length-dependent range of protein folding times, (ii) predicts the observed maximal size of "foldable" proteins and domains. Besides, it predicts the maximal size of proteins and domains that fold under solely thermodynamic (rather than kinetic) control. Complementarily, a theoretical analysis of the number of possible protein folding patterns, performed at the level of formation and assembly of secondary structures, correctly outlines the upper limit of protein folding times.
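To make the scale of Levinthal's paradox concrete, the following Python sketch estimates how long an exhaustive conformational search would take. The chain length (100 residues), the number of conformations per residue (3), and the time spent per conformation (10^-13 s) are illustrative assumptions, not values taken from the review.

```python
# Back-of-envelope estimate of an exhaustive conformational search.
# All three default parameters are illustrative assumptions,
# not values from the review.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def exhaustive_search_years(n_residues=100, states_per_residue=3,
                            seconds_per_state=1e-13):
    """Years needed to visit every chain conformation exactly once."""
    n_conformations = states_per_residue ** n_residues
    return n_conformations * seconds_per_state / SECONDS_PER_YEAR


years = exhaustive_search_years()
print(f"{years:.1e} years")
```

Under these assumptions the estimate comes out on the order of 10^27 years, far beyond the "billions of years" bound quoted in the abstract, which is why a random search cannot account for the observed folding times.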
Affiliation(s)
- Alexei V. Finkelstein
- Institute of Protein Research of the Russian Academy of Sciences, 142290 Pushchino, Moscow Region, Russia
- Biotechnology Department of the Lomonosov Moscow State University, 4 Institutskaya Str, 142290 Pushchino, Moscow Region, Russia
- Biology Department of the Lomonosov Moscow State University, 1-12 Leninskie Gory, 119991 Moscow, Russia
| | - Natalya S. Bogatyreva
- Institute of Protein Research of the Russian Academy of Sciences, 142290 Pushchino, Moscow Region, Russia
| | - Dmitry N. Ivankov
- Center of Life Sciences, Skolkovo Institute of Science and Technology, 121205 Moscow, Russia
| | - Sergiy O. Garbuzynskiy
- Institute of Protein Research of the Russian Academy of Sciences, 142290 Pushchino, Moscow Region, Russia
| |
|
3
|
Robson B. De novo protein folding on computers. Benefits and challenges. Comput Biol Med 2022; 143:105292. [PMID: 35158120 DOI: 10.1016/j.compbiomed.2022.105292] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/19/2021] [Revised: 01/19/2022] [Accepted: 01/20/2022] [Indexed: 01/05/2023]
Abstract
There has been recent success in prediction of the three-dimensional folded native structures of proteins, most famously by the AlphaFold Algorithm running on Google's/Alphabet's DeepMind computer. However, this largely involves machine learning of protein structures and is not a de novo protein structure prediction method for predicting three-dimensional structures from amino acid residue sequences. A de novo approach would be based almost entirely on general principles of energy and entropy that govern protein folding energetics, and importantly do so without the use of the amino acid sequences and structural features of other proteins. Most consider that problem as still unsolved even though it has occupied leading scientists for decades. Many consider that it remains one of the major outstanding issues in modern science. There is crucial continuing help from experimental findings on protein unfolding and refolding in the laboratory, but only to a limited extent because many researchers consider that the speed at which real proteins fold, often from milliseconds to minutes, is itself still not fully understood. This is unfortunate, because a practical solution to the problem would probably have a major effect on personalized medicine, the pharmaceutical industry, biotechnology, and nanotechnology, including for example "smaller" tasks such as better modeling of flexible "unfolded" regions of the SARS-CoV-2 spike glycoprotein when interacting with its cell receptor, antibodies, and therapeutic agents. Some important ideas from earlier studies are given before moving on to lessons from periodic and aperiodic crystals, and a possible role for quantum phenomena. The conclusion is that better computation of entropy should be the priority, though that is presented guardedly.
Affiliation(s)
- Barry Robson
- Ingine Inc., Cleveland, Ohio and The Dirac Foundation, Oxfordshire, UK.
| |
|
4
|
Hassan S, Sudhakar V, Nancy Mary MB, Babu R, Doble M, Dadar M, Hanna LE. Computational approach identifies protein off-targets for Isoniazid-NAD adduct: hypothesizing a possible drug resistance mechanism in Mycobacterium tuberculosis. J Biomol Struct Dyn 2019; 38:1697-1710. [PMID: 31094664 DOI: 10.1080/07391102.2019.1615987] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 01/01/2023]
Abstract
Isoniazid (INH) is an important antitubercular molecule identified as a drug of choice in tuberculosis treatment. INH itself is an inactive prodrug; it acquires an active conformation by forming an adduct with NAD. The adduct targets the inhA protein, a reductase responsible for fatty acid chain elongation in the cell wall of Mycobacterium tuberculosis. Resistance to INH arises mainly from mutations in inhA, katG, and genic and non-genic regions associated with efflux genes. Despite being widespread, the mechanism of resistance remains unknown in ∼15% of INH-resistant strains. Studies report that an intracellular increase in NADH concentration prevents inhA inhibition, leading to INH resistance. In the pursuit of finding possible resistance mechanisms, we set out to find NAD binding proteins to explore similarities in structure and NAD binding property of these proteins with that of inhA. We identified 172 NAD binding proteins, of which 53 were identified to have sequence or structural similarity to inhA. By performing docking analysis on selected proteins, we identified INH-adduct to have good binding affinity despite very minimal structural similarity to inhA. This analysis was further supported by principal component analysis, which identified 65 proteins with NAD binding conformation similar to that of inhA. These findings prompt us to hypothesize that upon exposure to INH, the bacteria try to reduce inhA susceptibility by inducing expression of these NAD binding proteins through an increase in NADH concentration. This in turn favours off-target binding and leads to decreased binding and potency of INH, thus contributing indirectly to INH resistance. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Sameer Hassan
- Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden.,Department of HIV, National Institute for Research in Tuberculosis, Chennai, Tamil Nadu, India
| | - Vaishnavi Sudhakar
- Department of HIV, National Institute for Research in Tuberculosis, Chennai, Tamil Nadu, India
| | - M Benita Nancy Mary
- Department of HIV, National Institute for Research in Tuberculosis, Chennai, Tamil Nadu, India
| | - Rajeshwari Babu
- Department of HIV, National Institute for Research in Tuberculosis, Chennai, Tamil Nadu, India
| | - Mukesh Doble
- Department of Biotechnology, Indian Institute of Technology, Chennai, Tamil Nadu, India
| | - Maryam Dadar
- Education and Extension Organization, Razi Vaccine and Serum Research Institute, Agricultural Research, Karaj, Iran
| | - Luke Elizabeth Hanna
- Department of HIV, National Institute for Research in Tuberculosis, Chennai, Tamil Nadu, India
| |
|
5
|
Insight into the functional and structural transition of garlic phytocystatin induced by urea and guanidine hydrochloride: A comparative biophysical study. Int J Biol Macromol 2018; 106:20-29. [DOI: 10.1016/j.ijbiomac.2017.07.172] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 04/25/2017] [Revised: 07/20/2017] [Accepted: 07/29/2017] [Indexed: 01/29/2023]
|
6
|
Shukla VK, Singh JS, Vispute N, Ahmad B, Kumar A, Hosur RV. Unfolding of CPR3 Gets Initiated at the Active Site and Proceeds via Two Intermediates. Biophys J 2017; 112:605-619. [PMID: 28256221 DOI: 10.1016/j.bpj.2016.12.020] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 10/13/2016] [Revised: 12/01/2016] [Accepted: 12/13/2016] [Indexed: 12/29/2022] Open
Abstract
Cyclophilin catalyzes the ubiquitous process "peptidyl-prolyl cis-trans isomerization," which plays a key role in protein folding, regulation, and function. Here, we present a detailed characterization of the unfolding of yeast mitochondrial cyclophilin (CPR3) induced by urea. It is seen that CPR3 unfolding is reversible and proceeds via two intermediates, I1 and I2. The I1 state has native-like secondary structure and shows strong anilino-8-naphthalenesulphonate binding due to increased exposure of the solvent-accessible cluster of non-polar groups. Thus, it has some features of a molten globule. The I2 state is more unfolded, but it retains some residual secondary structure, and shows weak anilino-8-naphthalenesulphonate binding. Chemical shift perturbation analysis by 1H-15N heteronuclear single quantum coherence spectra reveals disruption of the tertiary contacts among the regions close to the active site in the first step of unfolding, i.e., the N-I1 transition. Both of the intermediates, I1 and I2, showed a propensity to self-associate under stirring conditions, but their kinetic profiles are different; the native protein did not show any such tendency under the same conditions. All these observations could have significant implications for the function of the protein.
Affiliation(s)
- Vaibhav Kumar Shukla
- UM-DAE-Centre for Excellence in Basic Sciences, University of Mumbai, Kalina Campus, Mumbai, India
| | - Jai Shankar Singh
- Department of Biosciences and Bioengineering, Indian Institute of Technology, Mumbai, India
| | - Neha Vispute
- UM-DAE-Centre for Excellence in Basic Sciences, University of Mumbai, Kalina Campus, Mumbai, India
| | - Basir Ahmad
- UM-DAE-Centre for Excellence in Basic Sciences, University of Mumbai, Kalina Campus, Mumbai, India
| | - Ashutosh Kumar
- Department of Biosciences and Bioengineering, Indian Institute of Technology, Mumbai, India.
| | - Ramakrishna V Hosur
- UM-DAE-Centre for Excellence in Basic Sciences, University of Mumbai, Kalina Campus, Mumbai, India; Department of Chemical Sciences, Tata Institute of Fundamental Research, Mumbai, India.
| |
|
7
|
Finkelstein AV, Badretdin AJ, Galzitskaya OV, Ivankov DN, Bogatyreva NS, Garbuzynskiy SO. There and back again: Two views on the protein folding puzzle. Phys Life Rev 2017; 21:56-71. [PMID: 28190683 DOI: 10.1016/j.plrev.2017.01.025] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 09/05/2016] [Revised: 01/05/2017] [Accepted: 01/19/2017] [Indexed: 02/08/2023]
Abstract
The ability of protein chains to spontaneously form their spatial structures is a long-standing puzzle in molecular biology. Experimentally measured folding times of single-domain globular proteins range from microseconds to hours: the difference (10-11 orders of magnitude) is the same as that between the life span of a mosquito and the age of the universe. This review describes physical theories of rates of overcoming the free-energy barrier separating the natively folded (N) and unfolded (U) states of protein chains in both directions: "U-to-N" and "N-to-U". In the theory of protein folding rates a special role is played by the point of thermodynamic (and kinetic) equilibrium between the native and unfolded state of the chain; here, the theory obtains the simplest form. Paradoxically, a theoretical estimate of the folding time is easier to get from consideration of protein unfolding (the "N-to-U" transition) rather than folding, because it is easier to outline a good unfolding pathway of any structure than a good folding pathway that leads to the stable fold, which is yet unknown to the folding protein chain. And since the rates of direct and reverse reactions are equal at the equilibrium point (as follows from the physical "detailed balance" principle), the estimated folding time can be derived from the estimated unfolding time. Theoretical analysis of the "N-to-U" transition outlines the range of protein folding rates in a good agreement with experiment. Theoretical analysis of folding (the "U-to-N" transition), performed at the level of formation and assembly of protein secondary structures, outlines the upper limit of protein folding times (i.e., of the time of search for the most stable fold). 
Both theories come to essentially the same results; this is not a surprise, because they describe overcoming one and the same free-energy barrier, although the way to the top of this barrier from the side of the unfolded state is very different from the way from the side of the native state; and both theories agree with experiment. In addition, they predict the maximal size of protein domains that fold under solely thermodynamic (rather than kinetic) control and explain the observed maximal size of the "foldable" protein domains.
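The detailed-balance argument in this abstract can be illustrated with a minimal two-state sketch in Python. It assumes the standard relation k_fold / k_unfold = exp(-ΔG/RT), with ΔG defined as G(native) minus G(unfolded). The function name, the unfolding rate, and the stability value are hypothetical and chosen only for illustration; they do not come from the review.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol*K)


def folding_rate_from_unfolding(k_unfold, delta_g_fold, temperature=298.15):
    """Two-state detailed balance: k_fold = k_unfold * exp(-dG_fold / RT).

    delta_g_fold is G(native) - G(unfolded) in kJ/mol, so a stable
    native state has a negative value.
    """
    return k_unfold * math.exp(-delta_g_fold / (R_KJ * temperature))


k_u = 0.5  # s^-1, hypothetical unfolding rate

# At the equilibrium point dG = 0, so folding and unfolding rates coincide.
k_f_midpoint = folding_rate_from_unfolding(k_u, 0.0)

# Away from the midpoint a stable native state (dG < 0) folds faster
# than it unfolds.
k_f_stable = folding_rate_from_unfolding(k_u, -20.0)

print(k_f_midpoint, k_f_stable)
```

At ΔG = 0 the two rates are equal, which is the property that lets a folding time be read off from an estimated unfolding time at the equilibrium point.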
Affiliation(s)
- Alexei V Finkelstein
- Institute of Protein Research, Russian Academy of Sciences, Pushchino, Moscow Region 142290, Russian Federation.
| | - Azat J Badretdin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | - Oxana V Galzitskaya
- Institute of Protein Research, Russian Academy of Sciences, Pushchino, Moscow Region 142290, Russian Federation
| | - Dmitry N Ivankov
- Institute of Protein Research, Russian Academy of Sciences, Pushchino, Moscow Region 142290, Russian Federation; Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
| | - Natalya S Bogatyreva
- Institute of Protein Research, Russian Academy of Sciences, Pushchino, Moscow Region 142290, Russian Federation; Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
| | - Sergiy O Garbuzynskiy
- Institute of Protein Research, Russian Academy of Sciences, Pushchino, Moscow Region 142290, Russian Federation
| |
|
8
|
Structural basis of urea-induced unfolding: Unraveling the folding pathway of hemochromatosis factor E. Int J Biol Macromol 2016; 91:1051-61. [DOI: 10.1016/j.ijbiomac.2016.06.055] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 03/10/2016] [Revised: 06/16/2016] [Accepted: 06/17/2016] [Indexed: 12/16/2022]
|
9
|
Minami S, Sawada K, Chikenji G. How a spatial arrangement of secondary structure elements is dispersed in the universe of protein folds. PLoS One 2014; 9:e107959. [PMID: 25243952 PMCID: PMC4171485 DOI: 10.1371/journal.pone.0107959] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 08/12/2014] [Accepted: 08/18/2014] [Indexed: 11/18/2022] Open
Abstract
It is known that topologically different proteins of the same class sometimes share the same spatial arrangement of secondary structure elements (SSEs). However, the frequency with which topologically different structures share the same spatial arrangement of SSEs is unclear. It is important to estimate this frequency because it provides both a deeper understanding of the geometry of protein folds and a valuable suggestion for predicting protein structures with novel folds. Here we clarified the frequency with which protein folds share the same SSE packing arrangement with other folds, the types of spatial arrangement of SSEs that are frequently observed across different folds, and the diversity of protein folds that share the same spatial arrangement of SSEs with a given fold, using a protein structure alignment program MICAN, which we have been developing. By performing comprehensive structural comparison of SCOP fold representatives, we found that approximately 80% of protein folds share the same spatial arrangement of SSEs with other folds. We also observed that many protein pairs that share the same spatial arrangement of SSEs belong to different classes, often with an opposing N- to C-terminal direction of the polypeptide chain. The most frequently observed spatial arrangement of SSEs was the 2-layer α/β packing arrangement and it was dispersed among as many as 27% of SCOP fold representatives. These results suggest that the same spatial arrangements of SSEs are adopted by a wide variety of different folds and that the spatial arrangement of SSEs is highly robust against the N- to C-terminal direction of the polypeptide chain.
Affiliation(s)
- Shintaro Minami
- Department of Complex Systems Science, Nagoya University, Nagoya, Aichi, Japan
| | - Kengo Sawada
- Department of Applied Physics, Nagoya University, Nagoya, Aichi, Japan
| | - George Chikenji
- Department of Computational Science and Engineering, Nagoya University, Nagoya, Aichi, Japan
| |
|
10
|
Andreeva A, Howorth D, Chothia C, Kulesha E, Murzin AG. SCOP2 prototype: a new approach to protein structure mining. Nucleic Acids Res 2013; 42:D310-4. [PMID: 24293656 PMCID: PMC3964979 DOI: 10.1093/nar/gkt1242] [Citation(s) in RCA: 198] [Impact Index Per Article: 18.0] [Indexed: 01/06/2023] Open
Abstract
We present a prototype of a new structural classification of proteins, SCOP2 (http://scop2.mrc-lmb.cam.ac.uk/), that we have developed recently. SCOP2 is a successor to the Structural Classification of Proteins (SCOP, http://scop.mrc-lmb.cam.ac.uk/scop/) database. Similarly to SCOP, the main focus of SCOP2 is to organize structurally characterized proteins according to their structural and evolutionary relationships. SCOP2 was designed to provide a more advanced framework for protein structure annotation and classification. It defines a new approach to the classification of proteins that is essentially different from SCOP, but retains its best features. The SCOP2 classification is described in terms of a directed acyclic graph in which nodes form a complex network of many-to-many relationships and are represented by a region of protein structure and sequence. The new classification project is expected to ensure new advances in the field and open new areas of research.
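The directed-acyclic-graph idea described in this abstract, where a node may have several parents rather than one as in a strict tree, can be sketched in a few lines of Python. The node names below are hypothetical placeholders and do not reflect the actual SCOP2 schema or identifiers; the sketch only shows what a many-to-many, cycle-free classification looks like as a data structure.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Map each node to the set of its parent nodes. A node with more than
# one parent is what distinguishes a DAG from a strict tree.
# All names are hypothetical and are NOT real SCOP2 identifiers.
parents = {
    "fold_A": set(),
    "fold_B": set(),
    "superfamily_X": {"fold_A"},
    "superfamily_Y": {"fold_A", "fold_B"},
    "domain_1": {"superfamily_X", "superfamily_Y"},
    "domain_2": {"superfamily_Y"},
}


def children_of(node):
    """Return the nodes that list `node` as one of their parents."""
    return sorted(name for name, ps in parents.items() if node in ps)


# static_order() raises graphlib.CycleError if the graph has a cycle,
# so obtaining an order also confirms that the graph is acyclic.
order = list(TopologicalSorter(parents).static_order())

print(order)
print(children_of("superfamily_Y"))
```

Storing parent sets rather than a single parent pointer is the design choice that allows one structural region to be classified under several categories at once.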
Affiliation(s)
- Antonina Andreeva
- MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge, CB2 0QH, UK and European Bioinformatics Institute, Hinxton, Cambridge, CB10 1SD, UK
| | | | | | | | | |
|
11
|
Abstract
Rhodopsins are photochemically reactive membrane proteins that covalently bind retinal chromophores. Type I rhodopsins are found in both prokaryotes and eukaryotic microbes, whereas type II rhodopsins function as photoactivated G-protein coupled receptors (GPCRs) in animal vision. Both rhodopsin families share the seven transmembrane α-helix GPCR fold and a Schiff base linkage from a conserved lysine to retinal in helix G. Nevertheless, rhodopsins are widely cited as a striking example of evolutionary convergence, largely because the two families lack detectable sequence similarity and differ in many structural and mechanistic details. Convergence entails that the shared rhodopsin fold is so especially suited to photosensitive function that proteins from separate origins were selected for this architecture twice. Here we show, however, that the rhodopsin fold is not required for photosensitive activity. We engineered functional bacteriorhodopsin variants with novel folds, including radical noncircular permutations of the α-helices, circular permutations of an eight-helix construct, and retinal linkages relocated to other helices. These results contradict a key prediction of convergence and thereby provide an experimental attack on one of the most intractable problems in molecular evolution: how to establish structural homology for proteins devoid of discernible sequence similarity.
|
12
|
Arumugam G, Nair AG, Hariharaputran S, Ramanathan S. Rebelling for a reason: protein structural "outliers". PLoS One 2013; 8:e74416. [PMID: 24073209 PMCID: PMC3779223 DOI: 10.1371/journal.pone.0074416] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 03/27/2013] [Accepted: 07/31/2013] [Indexed: 11/29/2022] Open
Abstract
Analysis of structural variation in domain superfamilies can reveal constraints in protein evolution which aids protein structure prediction and classification. Structure-based sequence alignment of distantly related proteins, organized in PASS2 database, provides clues about structurally conserved regions among different functional families. Some superfamily members show large structural differences which are functionally relevant. This paper analyses the impact of structural divergence on function for multi-member superfamilies, selected from the PASS2 superfamily alignment database. Functional annotations within superfamilies, with structural outliers or 'rebels', are discussed in the context of structural variations. Overall, these data reinforce the idea that functional similarities cannot be extrapolated from mere structural conservation. The implication for fold-function prediction is that the functional annotations can only be inherited with very careful consideration, especially at low sequence identities.
Affiliation(s)
- Gandhimathi Arumugam
- National Centre for Biological Sciences, Tata Institute of Fundamental Research, Gandhi Krishi Vigyana Kendra Campus, Bangalore, India
| | - Anu G. Nair
- National Centre for Biological Sciences, Tata Institute of Fundamental Research, Gandhi Krishi Vigyana Kendra Campus, Bangalore, India
| | - Sridhar Hariharaputran
- National Centre for Biological Sciences, Tata Institute of Fundamental Research, Gandhi Krishi Vigyana Kendra Campus, Bangalore, India
| | - Sowdhamini Ramanathan
- National Centre for Biological Sciences, Tata Institute of Fundamental Research, Gandhi Krishi Vigyana Kendra Campus, Bangalore, India
| |
|
13
|
Application of improved three-dimensional kernel approach to prediction of protein structural class. BIOMED RESEARCH INTERNATIONAL 2013; 2013:625403. [PMID: 23878814 PMCID: PMC3708390 DOI: 10.1155/2013/625403] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2013] [Revised: 05/04/2013] [Accepted: 05/10/2013] [Indexed: 11/25/2022]
Abstract
Kernel methods, such as kernel PCA, kernel PLS, and support vector machines, are widely known machine learning techniques in biology, medicine, chemistry, and material science. Based on nonlinear mapping and Coulomb function, two 3D kernel approaches were improved and applied to predictions of the four protein tertiary structural classes of domains (all-α, all-β, α/β, and α + β) and five membrane protein types with satisfactory results. In a benchmark test, the performances of improved 3D kernel approach were compared with those of neural networks, support vector machines, and ensemble algorithm. Demonstration through leave-one-out cross-validation on working datasets constructed by investigators indicated that new kernel approaches outperformed other predictors. It has not escaped our notice that 3D kernel approaches may hold a high potential for improving the quality in predicting the other protein features as well. Or at the very least, it will play a complementary role to many of the existing algorithms in this regard.
|
14
|
Abstract
The metabolic pathway called the arachidonic acid cascade produces a wide range of eicosanoids, such as prostaglandins, thromboxanes and leukotrienes with potent biological activities. Recombinant DNA techniques have made it possible to determine the nucleotide sequences of cDNAs and/or genomic structures for the enzymes involved in the pathway. Sequence comparison analyses of the accumulated sequence data have brought great insights into the structure, function and molecular evolution of the enzymes. This paper reviews the sequence comparison analyses of the enzymes involved in the arachidonic acid cascade.
|
15
|
Chakraborty J, Dutta TK. From lipid transport to oxygenation of aromatic compounds: evolution within the Bet v1-like superfamily. J Biomol Struct Dyn 2011; 29:67-78. [PMID: 21696226 DOI: 10.1080/07391102.2011.10507375] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 01/09/2023]
Abstract
In the absence of significant sequence similarity, remote homology between proteins can be confused with analogy; in such cases, shared ancestry can be inferred in light of certain unique and common features. In the present study, to understand the evolutionary origin of the catalytic domain of the large subunit of ring-hydroxylating oxygenases (RHOs), belonging to the Bet v1-like superfamily, structure-based phylogenies have been derived from structural alignment of representative proteins of the superfamily. A careful inspection of the structural relatedness of RHOs with the rest of the families showed closest similarity between the RHO catalytic domain and PA1206-like protein. In addition, the phylogenetic relationship of the Rieske domain of the large subunit of RHOs with functionally and structurally similar proteins has also been elucidated so as to postulate the most probable events leading to the genesis of the large subunit of RHOs.
Affiliation(s)
- Joydeep Chakraborty
- Department of Microbiology, Bose Institute, P-1/12 C.I.T. Scheme VII M, Kolkata 700054, India
| | | |
|
16
|
Sheridan RP, Dixon JS, Venkataraghavan R. Generating plausible protein folds by secondary structure similarity. ACTA ACUST UNITED AC 2009. [DOI: 10.1111/j.1399-3011.1985.tb02156.x] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 11/30/2022]
|
17
|
Giuseppe PO, Neves FO, Nascimento ALTO, Guimarães BG. The leptospiral antigen Lp49 is a two-domain protein with putative protein binding function. J Struct Biol 2008; 163:53-60. [PMID: 18508281 DOI: 10.1016/j.jsb.2008.04.003] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 01/23/2008] [Revised: 04/07/2008] [Accepted: 04/08/2008] [Indexed: 11/18/2022]
Abstract
Pathogenic Leptospira is the etiological agent of leptospirosis, a life-threatening disease that affects populations worldwide. Currently available vaccines have limited effectiveness and therapeutic interventions are complicated by the difficulty in making an early diagnosis of leptospirosis. The genome of Leptospira interrogans was recently sequenced and comparative genomic analysis contributed to the identification of surface antigens, potential candidates for development of new vaccines and serodiagnosis. Lp49 is a membrane-associated protein recognized by antibodies present in sera from early and convalescent phases of leptospirosis patients. Its crystal structure was determined by single-wavelength anomalous diffraction using selenomethionine-labelled crystals and refined at 2.0 Å resolution. Lp49 is composed of two domains and belongs to the all-beta-proteins class. The N-terminal domain folds in an immunoglobulin-like beta-sandwich structure, whereas the C-terminal domain presents a seven-bladed beta-propeller fold. Structural analysis of Lp49 indicates putative protein-protein binding sites, suggesting a role in Leptospira-host interaction. This is the first crystal structure of a leptospiral antigen described to date.
Affiliation(s)
- Priscila Oliveira Giuseppe
- Centro de Biologia Molecular Estrutural, Laboratório Nacional de Luz Síncrotron, Rua Giuseppe Máximo Scolfaro 10000, PO Box 6192, Campinas 13083-970, SP, Brazil
|
18
|
Maggiora GM, Mao B, Chou KC, Narasimhan SL. Theoretical and empirical approaches to protein-structure prediction and analysis. Methods of Biochemical Analysis 2006; 35:1-86. [PMID: 2002769 DOI: 10.1002/9780470110560.ch1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 12/29/2022]
|
19
|
Dryla A, Hoffmann B, Gelbmann D, Giefing C, Hanner M, Meinke A, Anderson AS, Koppensteiner W, Konrat R, von Gabain A, Nagy E. High-affinity binding of the staphylococcal HarA protein to haptoglobin and hemoglobin involves a domain with an antiparallel eight-stranded beta-barrel fold. J Bacteriol 2006; 189:254-64. [PMID: 17041047 PMCID: PMC1797202 DOI: 10.1128/jb.01366-06] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.1] [Indexed: 02/01/2023] Open
Abstract
Iron scavenging from the host is essential for the growth of pathogenic bacteria. In this study, we further characterized two staphylococcal cell wall proteins previously shown to bind hemoproteins. HarA and IsdB harbor homologous ligand binding domains, the so-called NEAT domain (for "near transporter") present in several surface proteins of gram-positive pathogens. Surface plasmon resonance measurements using glutathione S-transferase (GST)-tagged HarAD1, one of the ligand binding domains of HarA, and GST-tagged full-length IsdB proteins confirmed high-affinity binding to hemoglobin and haptoglobin-hemoglobin complexes with equilibrium dissociation constants (K_D) of 5 to 50 nM. Haptoglobin binding could be detected only with HarA and was in the low micromolar range. In order to determine the fold of this evolutionarily conserved ligand binding domain, the untagged HarAD1 protein was subjected to nuclear magnetic resonance spectroscopy, which revealed an eight-stranded, purely antiparallel beta-barrel with the strand order β1-β2-β3-β6-β5-β4-β7-β8, forming two Greek key motifs. Based on structural-homology searches, the topology of the HarAD1 domain resembles that of the immunoglobulin (Ig) fold family, whose members are involved in protein-protein interactions, but with distinct structural features. Therefore, we consider that the HarAD1/NEAT domain fold is a novel variant of the Ig fold that has not yet been observed in other proteins.
|
20
|
Chou KC, Shen HB. Predicting Eukaryotic Protein Subcellular Location by Fusing Optimized Evidence-Theoretic K-Nearest Neighbor Classifiers. J Proteome Res 2006; 5:1888-97. [PMID: 16889410 DOI: 10.1021/pr060167c] [Citation(s) in RCA: 181] [Impact Index Per Article: 10.1] [Indexed: 10/24/2022]
Abstract
Facing the explosion of newly generated protein sequences in the post-genomic era, we are challenged to develop an automated method for quickly and reliably annotating their subcellular locations. Knowledge of the subcellular locations of proteins can provide useful hints for revealing their functions and understanding how they interact with each other in cellular networks. Unfortunately, it is both expensive and time-consuming to determine the localization of an uncharacterized protein in a living cell purely by experiment. To tackle the challenge, a novel hybridization classifier was developed by fusing many basic individual classifiers through a voting system. The "engine" of these basic classifiers was the OET-KNN (Optimized Evidence-Theoretic K-Nearest Neighbor) rule. As a demonstration, predictions were performed with the fusion classifier for proteins among the following 16 localizations: (1) cell wall, (2) centriole, (3) chloroplast, (4) cyanelle, (5) cytoplasm, (6) cytoskeleton, (7) endoplasmic reticulum, (8) extracell, (9) Golgi apparatus, (10) lysosome, (11) mitochondria, (12) nucleus, (13) peroxisome, (14) plasma membrane, (15) plastid, and (16) vacuole. To remove redundancy and homology bias, none of the proteins investigated here had ≥25% sequence identity to any other in the same subcellular location. The overall success rates obtained via the jackknife cross-validation test and the independent dataset test were 81.6% and 83.7%, respectively, which were 46% to 63% higher than those achieved by the other existing methods on the same benchmark datasets. It is also shown that the high success rates obtained by the fusion classifier are by no means a trivial use of the GO annotations, as might be misinterpreted: a huge number of proteins have accession numbers and corresponding GO numbers but still have unknown subcellular locations, and the percentage of proteins with GO annotations indicating their subcellular components is even lower than the percentage of proteins with known subcellular location annotation in the Swiss-Prot database. It is anticipated that the fusion classifier may also become a useful high-throughput tool for characterizing other attributes of proteins from their sequences, such as enzyme class, membrane protein type, and nuclear receptor subfamily, among many others. A web server called "Euk-OET-PLoc" has been set up at http://202.120.37.186/bioinf/euk-oet for the public to predict subcellular locations of eukaryotic proteins with the fusion OET-KNN classifier.
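The idea of fusing several basic classifiers through a voting system can be illustrated with a minimal sketch. It is not the authors' method: it substitutes a plain k-nearest-neighbour vote for the evidence-theoretic OET-KNN rule, and the function names, the choice of k values, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Plain k-nearest-neighbour vote (an assumption: NOT the OET-KNN rule)."""
    # train is a list of (feature_vector, label) pairs.
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def fusion_predict(train, query, ks=(1, 3, 5)):
    """Fuse several basic classifiers (one per k) by majority vote."""
    ballots = Counter(knn_predict(train, query, k) for k in ks)
    return ballots.most_common(1)[0][0]


# Hypothetical toy data: 2-D feature vectors with made-up location labels.
TRAIN = [
    ((0.0, 0.0), "nucleus"), ((0.1, 0.2), "nucleus"), ((0.2, 0.1), "nucleus"),
    ((1.0, 1.0), "cytoplasm"), ((0.9, 1.1), "cytoplasm"), ((1.1, 0.9), "cytoplasm"),
]

print(fusion_predict(TRAIN, (0.05, 0.05)))
print(fusion_predict(TRAIN, (1.0, 0.95)))
```

Each value of k acts as one basic classifier, and the final label is whichever label most of them agree on.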
Affiliation(s)
- Kuo-Chen Chou
- Gordon Life Science Institute, 13784 Torrey Del Mar Drive, San Diego, California 92130, USA.
|
21
|
Panchenko AR, Wolf YI, Panchenko LA, Madej T. Evolutionary plasticity of protein families: coupling between sequence and structure variation. Proteins 2006; 61:535-44. [PMID: 16184609 PMCID: PMC1941674 DOI: 10.1002/prot.20644] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.2] [Indexed: 11/07/2022]
Abstract
In this work we examine how protein structural changes are coupled with sequence variation in the course of evolution of a family of homologs. The sequence-structure correlation analysis performed on 81 homologous protein families shows that the majority of them exhibit statistically significant linear correlation between the measures of sequence and structural similarity. We observed, however, that there are cases where structural variability cannot be mainly explained by sequence variation, such as protein families with a number of disulfide bonds. To understand whether structures from different families and/or folds evolve in the same manner, we compared the degrees of structural change per unit of sequence change ("the evolutionary plasticity of structure") between those families with a significant linear correlation. Using rigorous statistical procedures we find that, with a few exceptions, evolutionary plasticity does not show a statistically significant difference between protein families. Similar sequence-structure analysis performed for protein loop regions shows that evolutionary plasticity of loop regions is greater than for the protein core.
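As a rough illustration of the sequence-structure correlation analysis described above, the sketch below computes a Pearson correlation and a least-squares slope for one family. It assumes that sequence change is measured as a dissimilarity fraction and structural change as a deviation score; the abstract does not specify the measures, so the data and the function name are hypothetical. The slope plays the role of "structural change per unit of sequence change".

```python
import math


def pearson_and_slope(xs, ys):
    """Return (Pearson r, least-squares slope of ys on xs)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx  # structural change per unit of sequence change
    return r, slope


# Hypothetical pairwise comparisons within one family.
seq_dissimilarity = [0.1, 0.2, 0.3, 0.4]
struct_deviation = [0.5, 1.0, 1.5, 2.0]

r, plasticity = pearson_and_slope(seq_dissimilarity, struct_deviation)
print(r, plasticity)
```

Comparing slopes between families would still require the statistical testing the authors mention, which this sketch does not attempt.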
Affiliation(s)
- Anna R Panchenko
- Computational Biology Branch, National Center for Biotechnology Information, National Institutes of Health, Bethesda, Maryland 20894, USA.
|
22
|
Kister AE, Fokas AS, Papatheodorou TS, Gelfand IM. Strict rules determine arrangements of strands in sandwich proteins. Proc Natl Acad Sci U S A 2006; 103:4107-10. [PMID: 16537492 PMCID: PMC1449654 DOI: 10.1073/pnas.0510747103] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/18/2022] Open
Abstract
From a computer analysis of the spatial organization of the secondary structures of beta-sandwich proteins, we find certain sets of consecutive strands that are connected by hydrogen bonds, which we call "strandons." The analysis of the arrangements of strandons in 491 protein structures that come from 69 different superfamilies reveals strict regularities in the arrangements of strandons and the formation of what we call "canonical supermotifs." Six such supermotifs account for approximately 90% of all observed structures. Simple geometric rules are described that dictate the formation of these supermotifs.
Affiliation(s)
- A. E. Kister
- Department of Health Informatics, School of Health Related Professions, University of Medicine and Dentistry of New Jersey, Newark, NJ 07107
- To whom correspondence may be addressed.
- A. S. Fokas
- Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge CB3 0WA, United Kingdom
- T. S. Papatheodorou
- High Performance Computing Laboratory, Department of Computer Engineering and Informatics, University of Patras, Patras 26500, Greece
- I. M. Gelfand
- Department of Mathematics, Rutgers, The State University of New Jersey, Piscataway, NJ 08855
- To whom correspondence may be addressed.
|
23
|
Zhang Y, Hubner IA, Arakaki AK, Shakhnovich E, Skolnick J. On the origin and highly likely completeness of single-domain protein structures. Proc Natl Acad Sci U S A 2006; 103:2605-10. [PMID: 16478803 PMCID: PMC1413790 DOI: 10.1073/pnas.0509379103] [Citation(s) in RCA: 140] [Impact Index Per Article: 7.8] [Indexed: 11/18/2022] Open
Abstract
The size and origin of the protein fold universe is of fundamental and practical importance. Analyzing randomly generated, compact sticky homopolypeptide conformations constructed in generic simplified and all-atom protein models, we find that all have similar folds in the library of solved structures, the Protein Data Bank, and, conversely, that all compact, single-domain protein structures in the Protein Data Bank have structural analogues in the compact model set. Thus, both sets are highly likely complete, with the protein fold universe arising from compact conformations of hydrogen-bonded, secondary structures. Because side chains are represented by their Cβ atoms, these results also suggest that the observed protein folds are insensitive to the details of side-chain packing. Sequence specificity enters both in fine-tuning the structure and thermodynamically stabilizing a given fold with respect to the set of alternatives. Scanning the models against a three-dimensional active-site library frequently yields close geometric matches. Thus, the presence of active-site-like geometries also seems to be a consequence of the packing of compact, secondary structural elements. These results have significant implications for the evolution of protein structure and function.
Affiliation(s)
- Yang Zhang
- Center of Excellence in Bioinformatics, University at Buffalo, State University of New York, 901 Washington Street, Buffalo, NY 14203
- Isaac A. Hubner
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138
- Adrian K. Arakaki
- Center of Excellence in Bioinformatics, University at Buffalo, State University of New York, 901 Washington Street, Buffalo, NY 14203
- Eugene Shakhnovich
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138
- Jeffrey Skolnick
- Center of Excellence in Bioinformatics, University at Buffalo, State University of New York, 901 Washington Street, Buffalo, NY 14203
- To whom correspondence should be sent at the present address: Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street NW, Atlanta, GA 30318.
|
24
|
Zeldovich KB, Berezovsky IN, Shakhnovich EI. Physical origins of protein superfamilies. J Mol Biol 2006; 357:1335-43. [PMID: 16483605 DOI: 10.1016/j.jmb.2006.01.081] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.4] [Received: 12/01/2005] [Accepted: 01/23/2006] [Indexed: 10/25/2022]
Abstract
In this work, we discovered a fundamental connection between selection for protein stability and emergence of preferred structures of proteins. Using a standard exact three-dimensional lattice model we evolve sequences starting from random ones and determine the exact native structure after each mutation. Acceptance of mutations is biased to select for stable proteins. We found that certain structures, "wonderfolds", are independently discovered numerous times as native states of stable proteins in many unrelated runs of selection. The strong dependence of lattice fold usage on the structural determinant of designability quantitatively reproduces uneven fold usage in natural proteins. Diversity of sequences that fold into wonderfold structures gives rise to superfamilies, i.e. sets of dissimilar sequences that fold into the same or very similar structures. The present work establishes a model of pre-biotic structure selection, which identifies dominant structural patterns emerging upon optimization of proteins for survival in a hot environment. Convergently discovered pre-biotic initial superfamilies with wonderfold structures could have served as a seed for subsequent biological evolution involving gene duplications and divergence.
Affiliation(s)
- Konstantin B Zeldovich
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138, USA
|
25
|
Theobald DL, Wuttke DS. Divergent evolution within protein superfolds inferred from profile-based phylogenetics. J Mol Biol 2005; 354:722-37. [PMID: 16266719 PMCID: PMC1769326 DOI: 10.1016/j.jmb.2005.08.071] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.8] [Received: 06/06/2005] [Revised: 08/29/2005] [Accepted: 08/30/2005] [Indexed: 11/19/2022]
Abstract
Many dissimilar protein sequences fold into similar structures. A central and persistent challenge facing protein structural analysis is the discrimination between homology and convergence for structurally similar domains that lack significant sequence similarity. Classic examples are the OB-fold and SH3 domains, both small, modular beta-barrel protein superfolds. The similarities among these domains have variously been attributed to common descent or to convergent evolution. Using a sequence profile-based phylogenetic technique, we analyzed all structurally characterized OB-fold, SH3, and PDZ domains with less than 40% mutual sequence identity. An all-against-all, profile-versus-profile analysis of these domains revealed many previously undetectable significant interrelationships. The matrices of scores were used to infer phylogenies based on our derivation of the relationships between sequence similarity E-values and evolutionary distances. The resulting clades of domains correlate remarkably well with biological function, as opposed to structural similarity, indicating that the functionally distinct sub-families within these superfolds are homologous. This method extends phylogenetics into the challenging "twilight zone" of sequence similarity, providing the first objective resolution of deep evolutionary relationships among distant protein families.
Affiliation(s)
- Douglas L. Theobald
- Department of Chemistry and Biochemistry, UCB 215, University of Colorado, Boulder, CO 80309-0215, USA
- Deborah S. Wuttke
- Department of Chemistry and Biochemistry, UCB 215, University of Colorado, Boulder, CO 80309-0215, USA
|
26
|
Analysis of protein homology by assessing the (dis)similarity in protein loop regions. Proteins 2005; 57:539-47. [PMID: 15382231 DOI: 10.1002/prot.20237] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Indexed: 11/07/2022]
Abstract
Two proteins are considered to have a similar fold if sufficiently many of their secondary structure elements are positioned similarly in space and are connected in the same order. Such a common structural scaffold may arise due to either divergent or convergent evolution. The intervening unaligned regions ("loops") between the superimposable helices and strands can exhibit a wide range of similarity and may offer clues to the structural evolution of folds. One might argue that more closely related proteins differ less in their nonconserved loop regions than distantly related proteins and, at the same time, the degree of variability in the loop regions in structurally similar but unrelated proteins is higher than in homologs. Here we introduce a new measure for structural (dis)similarity in loop regions that is based on the concept of the Hausdorff metric. This measure is used to gauge protein relatedness and is tested on a benchmark of homologous and analogous protein structures. It has been shown that the new measure can distinguish homologous from analogous proteins with the same or higher accuracy than the conventional measures that are based on comparing proteins in structurally aligned regions. We argue that this result can be attributed to the higher sensitivity of the Hausdorff (dis)similarity measure in detecting particularly evident dissimilarities in structures and draw some conclusions about evolutionary relatedness of proteins in the most populated protein folds.
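The Hausdorff metric that underlies the loop (dis)similarity measure can be sketched as follows. This is only the textbook Hausdorff distance between two point sets, not the authors' full measure, and the loop coordinates are hypothetical.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point of `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical loop coordinates (e.g. C-alpha positions after superposition).
LOOP_A = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
LOOP_B = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 3.0, 0.0)]

print(hausdorff(LOOP_A, LOOP_B))  # 3.0: set by the single outlying point
```

Because the value is determined by the worst-matched point, one strongly deviating loop residue dominates the score, which is consistent with the sensitivity to evident dissimilarities that the abstract highlights.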
|
27
|
Feng KY, Cai YD, Chou KC. Boosting classifier for predicting protein domain structural class. Biochem Biophys Res Commun 2005; 334:213-7. [PMID: 15993842 DOI: 10.1016/j.bbrc.2005.06.075] [Citation(s) in RCA: 111] [Impact Index Per Article: 5.8] [Received: 06/09/2005] [Accepted: 06/14/2005] [Indexed: 11/26/2022]
Abstract
A novel classifier, the so-called "LogitBoost" classifier, was introduced to predict the structural class of a protein domain according to its amino acid sequence. LogitBoost uses a log-likelihood loss function to reduce the sensitivity to noise and outliers, and performs classification by combining many weak classifiers to build a strong and robust classifier. Jackknife cross-validation tests demonstrated that LogitBoost outperformed other classifiers, including the "support vector machine," a very powerful classifier widely used in the biological literature. It is anticipated that LogitBoost can also become a useful vehicle for classifying other attributes of proteins according to their sequences, such as subcellular localization and enzyme family class, among many others.
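The claim that a log-likelihood loss is less sensitive to outliers than the exponential loss of classical boosting can be checked numerically. The sketch below assumes one common formulation, with labels in {-1, +1} and margin m = y·F(x); the abstract does not give the exact form used, and the function names are illustrative.

```python
import math


def exponential_loss(margin):
    """Exponential loss used by AdaBoost-style boosting: exp(-m)."""
    return math.exp(-margin)


def log_likelihood_loss(margin):
    """Binomial log-likelihood loss in one common form: log(1 + exp(-2m))."""
    return math.log(1.0 + math.exp(-2.0 * margin))


# A badly misclassified point (large negative margin), e.g. a noisy label.
outlier_margin = -5.0
print(exponential_loss(outlier_margin))     # grows exponentially with |m|
print(log_likelihood_loss(outlier_margin))  # grows only about linearly
```

For large negative margins the log-likelihood loss grows roughly linearly while the exponential loss grows exponentially, so a single mislabelled example pulls the fit far less.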
Affiliation(s)
- Kai-Yan Feng
- Imaging Science and Biomedical Engineering, Medical School, The University of Manchester, Manchester, M13 9PT, UK
|
28
|
Ratner V, Amir D, Kahana E, Haas E. Fast Collapse but Slow Formation of Secondary Structure Elements in the Refolding Transition of E. coli Adenylate Kinase. J Mol Biol 2005; 352:683-99. [PMID: 16098987 DOI: 10.1016/j.jmb.2005.06.074] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.3] [Received: 03/27/2005] [Revised: 06/23/2005] [Accepted: 06/30/2005] [Indexed: 11/20/2022]
Abstract
The various models proposed for protein folding transition differ in their order of appearance of the basic steps during this process. In this study, steady state and time-resolved dynamic non-radiative excitation energy transfer (FRET and trFRET) combined with site specific labeling experiments were applied in order to characterize the initial transient ensemble of Escherichia coli adenylate kinase (AK) molecules upon shifting conditions from those favoring denaturation to refolding and from folding to denaturing. Three sets of labeled AK mutants were prepared, which were designed to probe the equilibrium and transient distributions of intramolecular segmental end-to-end distances. A 176 residue section (residues 28-203), which spans most of the 214 residue molecule, and two short secondary structure chain segments including an alpha-helix (residues 169-188) and a predominantly beta-strand region (residues 188-203), were labeled. Upon fast change of conditions from denaturing to folding, the end-to-end distance of the 176 residue chain section showed an immediate collapse to a mean value of 26 Å. Under the same conditions, the two short secondary structure elements did not respond to this shift within the first ten milliseconds, and retained the characteristics of a fully unfolded state. Within the first 10 ms after changes of the solvent from folding to denaturing, only minor changes were observed at the local environments of residues 203 and 169. The response of these same local environments to the shift of conditions from denaturing to folding occurred within the dead time of the mixing device. Thus, the response of the CORE domain of AK to fast transfer from folding to unfolding conditions is slow at all three conformational levels that were probed, and for at least a few milliseconds the ensemble of folded molecules is maintained under unfolding conditions. A different order of the changes was observed upon initiation of refolding. The AK molecules undergo fast collapse to an ensemble of compact structures where the local environment of surface probes seems to be native-like but the two labeled secondary structure elements remain unfolded.
Affiliation(s)
- V Ratner
- Faculty of Life Sciences, Bar Ilan University, 52900 Ramat Gan, Israel
|
29
|
Chelliah V, Blundell TL. Quantifying Structural and Functional Restraints on Amino Acid Substitutions in Evolution of Proteins. Biochemistry (Moscow) 2005; 70:835-40. [PMID: 16212538 DOI: 10.1007/s10541-005-0192-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/25/2022]
Abstract
One of Oleg Ptitsyn's most important papers (Shakhnovich, E., Abkevich, V., and Ptitsyn, O. (1996) Nature, 379, 96-98) describes how knowledge of structure and function can be used to understand better the nature of amino acid substitutions in families and superfamilies of proteins. The selective advantages of retaining structure and function during evolution can be expressed as restraints on the amino acid substitutions that are accepted.
Affiliation(s)
- V Chelliah
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, England
|
30
|
Maguid S, Fernandez-Alberti S, Ferrelli L, Echave J. Exploring the common dynamics of homologous proteins. Application to the globin family. Biophys J 2005; 89:3-13. [PMID: 15749782 PMCID: PMC1366528 DOI: 10.1529/biophysj.104.053041] [Citation(s) in RCA: 63] [Impact Index Per Article: 3.3] [Indexed: 11/18/2022] Open
Abstract
We present a procedure to explore the global dynamics shared between members of the same protein family. The method allows the comparison of patterns of vibrational motion obtained by Gaussian network model analysis. After the identification of collective coordinates that were conserved during evolution, we quantify the common dynamics within a family. Representative vectors that describe these dynamics are defined using a singular value decomposition approach. As a test case, the globin heme-binding family is considered. The two lowest normal modes are shown to be conserved within this family. Our results encourage the development of models for protein evolution that take into account the conservation of dynamical features.
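The starting point of a Gaussian network model analysis is the Kirchhoff (connectivity) matrix built from residue contacts. The sketch below shows only that first step, under stated assumptions: a 7 Å cutoff (a typical but here assumed value), one node per residue, and a hypothetical four-residue chain. The normal modes would be the eigenvectors of this matrix; they are not computed here, and the family-level comparison by singular value decomposition is outside the scope of the sketch.

```python
import math


def kirchhoff_matrix(coords, cutoff=7.0):
    """Build the GNM Kirchhoff matrix from node coordinates.

    Off-diagonal entries are -1 for node pairs within `cutoff`, else 0;
    each diagonal entry is the number of contacts of that node.
    """
    n = len(coords)
    gamma = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                gamma[i][j] = gamma[j][i] = -1
                gamma[i][i] += 1
                gamma[j][j] += 1
    return gamma


# Hypothetical extended chain: four nodes spaced 3.8 Å apart along x.
CHAIN = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]

for row in kirchhoff_matrix(CHAIN):
    print(row)
```

With this spacing only sequence neighbours are in contact, so the matrix is tridiagonal and every row sums to zero.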
Affiliation(s)
- Sandra Maguid
- Universidad Nacional de Quilmes, B1876BXD Bernal, Argentina
|
31
|
Demirel MC, Cherny D. Clustering and diversity of fluctuations for proteins. Nanomedicine: Nanotechnology, Biology, and Medicine 2005; 1:41-6. [PMID: 17292056 DOI: 10.1016/j.nano.2004.11.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2004] [Accepted: 11/25/2004] [Indexed: 05/13/2023]
Abstract
BACKGROUND: Protein topology plays a key role in various types of interactions. Topological constraints of a protein are defined by a contact map. We studied the fluctuations of proteins using a new approach based on the contact map. METHODS: An annealing algorithm is used to generate a 3-dimensional protein structure from the contact map. First, we study the properties of structural elements based on fluctuations by adding individual structures (domains or subdomains). Thereafter, we focus on the building blocks of proteins in terms of fluctuations. RESULTS: To verify our hypothesis, we analyzed the pattern of fluctuations for chymotrypsin inhibitor-2 (CI2) by unstructuring (melting) of subregions. The data show different patterns of fluctuations for the unstructured CI2 relative to those calculated for the intact protein. CONCLUSION: Our approach introduces a new concept for classifying building blocks of proteins based on thermal fluctuations.
Affiliation(s)
- Melik C Demirel
- College of Engineering, Pennsylvania State University, University Park, Pennsylvania 16802, USA.
|
32
|
Matsuda K, Nishioka T, Kinoshita K, Kawabata T, Go N. Finding evolutionary relations beyond superfamilies: fold-based superfamilies. Protein Sci 2004; 12:2239-51. [PMID: 14500881 PMCID: PMC2366925 DOI: 10.1110/ps.0383603] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 10/27/2022]
Abstract
Superfamily classifications are based variably on similarity of sequences, global folds, local structures, or functions. We have examined the possibility of defining superfamilies purely from the viewpoint of the global fold/function relationship. For this purpose, we first classified protein domains according to the beta-sheet topology. We then introduced the concept of kinship relations among the classified beta-sheet topology by assuming that the major elementary event leading to creation of a new beta-sheet topology is either an addition or deletion of one beta-strand at the edge of an existing beta-sheet during the molecular evolution. Based on this kinship relation, a network of protein domains was constructed so that the distance between a pair of domains represents the number of evolutionary events that lead one from the other domain. We then mapped on it all known domains with a specific core chemical function (here taken, as an example, that involving ATP or its analogs). Careful analyses revealed that the domains are found distributed on the network as >20 mutually disjointed clusters. The proteins in each cluster are defined to form a fold-based superfamily. The results indicate that >20 ATP-binding protein superfamilies have been invented independently in the process of molecular evolution, and the conservative evolutionary diffusion of global folds and functions is the origin of the relationship between them.
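The kinship network described above, in which two beta-sheet topologies are linked when one arises from the other by adding or deleting a single edge strand, can be sketched as a graph search. The encoding is a deliberate simplification: a topology is a tuple of strand numbers in spatial order, strand direction is ignored, and the topologies listed are hypothetical rather than taken from the paper.

```python
from collections import deque


def are_kin(a, b):
    """True if one topology is the other plus one strand at a sheet edge."""
    short, long_ = sorted((a, b), key=len)
    if len(long_) != len(short) + 1:
        return False
    return long_[1:] == short or long_[:-1] == short


def event_distance(topologies, start, goal):
    """Fewest edge-strand additions/deletions linking two topologies (BFS)."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, steps = queue.popleft()
        if node == goal:
            return steps
        for other in topologies:
            if other not in seen and are_kin(node, other):
                seen.add(other)
                queue.append((other, steps + 1))
    return None  # the two topologies lie in disjoint clusters


# Hypothetical set of observed sheet topologies.
TOPOLOGIES = [(1, 2), (1, 2, 3), (4, 1, 2, 3), (2, 3), (9, 8, 7)]

print(event_distance(TOPOLOGIES, (1, 2), (4, 1, 2, 3)))
print(event_distance(TOPOLOGIES, (1, 2), (9, 8, 7)))
```

A return value of `None` corresponds to two domains falling in mutually disjoint clusters of the network.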
Affiliation(s)
- Keiko Matsuda
- Graduate School of Information Science, Nara Institute of Science and Technology, Ikoma, 630-0101, Japan
|
33
|
Liu X, Fan K, Wang W. The number of protein folds and their distribution over families in nature. Proteins 2004; 54:491-9. [PMID: 14747997 DOI: 10.1002/prot.10514] [Citation(s) in RCA: 58] [Impact Index Per Article: 2.9] [Indexed: 11/12/2022]
Abstract
Currently, of the 10^6 known protein sequences, only about 10^4 structures have been solved. Based on homologies and similarities, proteins are grouped into different families in which each has a structural prototype, namely, the fold, and some share the same folds. However, the total number of folds and families, and furthermore, the distribution of folds over families in nature, are still an enigma. Here, we report a study on the distribution of folds over families and the total number of folds in nature, using a maximum probability principle and the moment method of estimation. A quadratic relation between the numbers of families and folds is found for the number of families in an interval from 6000 to 30,000. For example, about 2700 folds for 23,100 families are obtained, among them about 33 superfolds, including more than 100 families each, and the largest superfold comprises about 800 families. Our results suggest that although the majority of folds have only a single family per fold, a considerably larger number of folds include many more families each than in the database, and the distribution of folds over families in nature differs markedly from the sampled distribution. The long tail of fold distribution is first estimated in this article. The results fit the data for different versions of the structural classification of proteins (SCOP) excellently, and the goodness-of-fit tests strongly support the results. In addition, the method of directly "enlarging" the sample to the population may be useful in inferring distributions of species in different fields.
Affiliation(s)
- Xinsheng Liu
- National Lab of Solid State Microstructure, Department of Physics and Institute of Biophysics, Nanjing University, Nanjing, China
|
34
|
Denton MJ, Dearden PK, Sowerby SJ. Physical law not natural selection as the major determinant of biological complexity in the subcellular realm: new support for the pre-Darwinian conception of evolution by natural law. Biosystems 2003; 71:297-303. [PMID: 14563569 DOI: 10.1016/s0303-2647(03)00100-x] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.1] [Indexed: 11/22/2022]
Abstract
Before Darwin, many biologists considered organic forms to be immutable natural forms or types which, like inorganic forms such as atoms or crystals, are part of a changeless world order and determined by physical law. Adaptations were viewed as secondary modifications of these 'crystal like' abstract afunctional 'givens of physics.' We argue here that much of the emerging picture of biological order in the subcellular realm closely resembles the pre-Darwinian conception of nature. We point out that in the subcellular realm, between nano and micrometers, physical law necessarily plays a far more significant role in organizing matter than in the familiar 'Darwinian world' between millimeters and meters (where matter can be arranged into almost any contingent artifactual arrangement we choose, as witness Lego toys, watches or jumbo jets). Consequently, when deploying matter into complex structures in the subcellular realm, the cell must necessarily make extensive use of natural forms (such as the protein and RNA folds, microtubular forms and tensegrity structures) which, like atoms or crystals, self-organize under the direction of physical law into what are essentially 'pre-Darwinian' afunctional abstract molecular architectures in which adaptations are trivial secondary modifications of what are evidently primary givens of physics.
Affiliation(s)
- Michael J Denton
- Biochemistry Department, University of Otago, PO Box 56, Dunedin, New Zealand.
|
35
|
Abstract
Protein-related information tends to be accumulated rather than reduced to a synthetic view. Itemising properties of protein sequences is informative, as is a list of ingredients for cooking, but without a recipe, that is, quantification and chronology, understanding is incomplete. If the goal of accumulating information is to discover or reveal the function and related biochemical mechanisms, information has to be weighed and ordered. As a guideline, the weight of a piece of information should reflect how often it consistently occurs in various contexts. We propose a common sense approach to quantify and put data and information into perspective. Complete bacterial proteomes are individually mapped with the Pfam-A database of domains and protein family signatures in an attempt to assess the modularity of proteins at the level of a single proteome and the implications of a modular description of proteins for a functional interpretation. Poorly annotated proteins in the most documented bacteria (E. coli and B. subtilis) were considered in an attempt to formulate hypotheses on the basis of domain/module content.
|
36
|
Suzuki R, Nagata K, Yumoto F, Kawakami M, Nemoto N, Furutani M, Adachi K, Maruyama T, Tanokura M. Three-dimensional solution structure of an archaeal FKBP with a dual function of peptidyl prolyl cis-trans isomerase and chaperone-like activities. J Mol Biol 2003; 328:1149-60. [PMID: 12729748 DOI: 10.1016/s0022-2836(03)00379-6] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.4] [Indexed: 11/26/2022]
Abstract
Here we report the solution structure of an archaeal FK506-binding protein (FKBP) from a thermophilic archaeum, Methanococcus thermolithotrophicus (MtFKBP17), which has peptidyl prolyl cis-trans isomerase (PPIase) and chaperone-like activities, to reveal the structural basis for the dual function. In addition to a typical PPIase domain, a newly identified domain is formed in the flap loop by a 48-residue insert that is required for the chaperone-like activity. The new domain, called IF domain (the Insert in the Flap), is a novel-folding motif and exposes a hydrophobic surface, which we consider to play an important role in the chaperone-like activity.
Affiliation(s)
- Rintaro Suzuki
- Department of Applied Biological Chemistry, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-ku, Tokyo 113-8657, Japan
|
37
|
Koike R, Kinoshita K, Kidera A. Ring and zipper formation is the key to understanding the structural variety in all-beta proteins. FEBS Lett 2003; 533:9-13. [PMID: 12505150 DOI: 10.1016/s0014-5793(02)03729-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 11/30/2022]
Abstract
A novel structural classification of beta proteins is presented from the viewpoint of the ring-shaped structure and the zipper-like contact pattern, based on the fact that 92% and 60% of beta proteins have the ring topology and the zippered contact pattern, respectively. We discuss the implication of the unexpectedly high preference for the ring and zippered structures in connection with the folding process of beta proteins.
Affiliation(s)
- Ryotaro Koike
- Department of Chemistry, Graduate School of Science, Kyoto University, Kitashirakawa-Oiwake-cho, Sakyo-ku, Kyoto, 606-8502, Japan.
|
38
|
Denton MJ, Marshall CJ, Legge M. The protein folds as platonic forms: new support for the pre-Darwinian conception of evolution by natural law. J Theor Biol 2002; 219:325-42. [PMID: 12419661 DOI: 10.1006/jtbi.2002.3128] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.4] [Indexed: 11/22/2022]
Abstract
Before the Darwinian revolution many biologists considered organic forms to be determined by natural law like atoms or crystals and therefore necessary, intrinsic and immutable features of the world order, which will occur throughout the cosmos wherever there is life. The search for the natural determinants of organic form-the celebrated "Laws of Form"-was seen as one of the major tasks of biology. After Darwin, this Platonic conception of form was abandoned and natural selection, not natural law, was increasingly seen to be the main, if not the exclusive, determinant of organic form. However, in the case of one class of very important organic forms-the basic protein folds-advances in protein chemistry since the early 1970s have revealed that they represent a finite set of natural forms, determined by a number of generative constructional rules, like those which govern the formation of atoms or crystals, in which functional adaptations are clearly secondary modifications of primary "givens of physics." The folds are evidently determined by natural law, not natural selection, and are "lawful forms" in the Platonic and pre-Darwinian sense of the word, which are bound to occur everywhere in the universe where the same 20 amino acids are used for their construction. We argue that this is a major discovery which has many important implications regarding the origin of proteins, the origin of life and the fundamental nature of organic form. We speculate that it is unlikely that the folds will prove to be the only case in nature where a set of complex organic forms is determined by natural law, and suggest that natural law may have played a far greater role in the origin and evolution of life than is currently assumed.
Affiliation(s)
- Michael J Denton
- Department of Biochemistry, University of Otago, PO Box 56, Dunedin, New Zealand.
|
39
|
Steward RE, Thornton JM. Prediction of strand pairing in antiparallel and parallel beta-sheets using information theory. Proteins 2002; 48:178-91. [PMID: 12112687 DOI: 10.1002/prot.10152] [Citation(s) in RCA: 62] [Impact Index Per Article: 2.8] [Indexed: 12/22/2022]
Abstract
An information theory approach was developed to predict the alignment of interacting antiparallel and parallel beta-strands. Information scores were derived for the preference of a residue on a beta-strand to be opposite a sequence of residues on an adjacent beta-strand. These scores were used to predict the interstrand register of interacting beta-strands from 10 alternative offset positions either side of the experimentally observed beta-sheet register. The amino acid sequence of an internal beta-strand can be correctly aligned with two beta-strands in a fixed position either side of the strand in 45% of antiparallel and 48% of parallel arrangements. For comparison, when another beta-strand from a nonhomologous protein substitutes the internal beta-strand, the same register is predicted for only 24 and 36% of antiparallel and parallel arrangements. As expected, alignment of a single fixed strand with just a second beta-strand sequence was more difficult, and gave a correct register in 31 and 37% of antiparallel and parallel beta-pairs, respectively. These scores are 10% higher than for two randomly selected beta-strand sequences. In general, prediction accuracy was not improved by information tables that distinguished hydrogen-bonding patterns or beta-strand order. These results will contribute to predicting the arrangement of beta-strands in beta-pleated sheets and protein topology.
Affiliation(s)
- Robert E Steward
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom.
|
40
|
|
41
|
Korepanova A, Douglas C, Leyngold I, Logan TM. N-terminal extension changes the folding mechanism of the FK506-binding protein. Protein Sci 2001; 10:1905-10. [PMID: 11514681 PMCID: PMC2253207 DOI: 10.1110/ps.14801] [Citation(s) in RCA: 21] [Impact Index Per Article: 0.9] [Indexed: 10/16/2022]
Abstract
Many of the protein fusion systems used to enhance the yield of recombinant proteins result in the addition of a small number of amino acid residues onto the desired protein. Here, we investigate the effect of short (three amino acid) N-terminal extensions on the equilibrium denaturation and kinetic folding and unfolding reactions of the FK506-binding protein (FKBP) and compare the results obtained with data collected on an FKBP variant lacking this extension. Isothermal equilibrium denaturation experiments demonstrated that the N-terminal extension had a slight destabilizing effect. NMR investigations showed that the N-terminal extension slightly perturbed the protein structure near the site of the extension, with lesser effects being propagated into the single alpha-helix of FKBP. These structural perturbations probably account for the differential stability. In contrast to the relatively minor equilibrium effects, the N-terminal extension generated a kinetic-folding intermediate that is not observed in the shorter construct. Kinetic experiments performed on a construct with a different amino acid sequence in the extension showed that the length and the sequence of the extension both contribute to the observed equilibrium and kinetic effects. These results point to an important role for the N terminus in the folding of FKBP and suggest that a biological consequence of N-terminal methionine removal observed in many eukaryotic and prokaryotic proteins is to increase the folding efficiency of the polypeptide chain.
Affiliation(s)
- A Korepanova
- Graduate Program in Molecular Biophysics, Florida State University, Tallahassee, Florida 32306, USA
|
42
|
Orengo CA, Sillitoe I, Reeves G, Pearl FM. Review: what can structural classifications reveal about protein evolution? J Struct Biol 2001; 134:145-65. [PMID: 11551176 DOI: 10.1006/jsbi.2001.4398] [Citation(s) in RCA: 42] [Impact Index Per Article: 1.8] [Indexed: 11/22/2022]
Abstract
In this article we present a review of the methods used for comparing and classifying protein structures. We discuss the hierarchies and populations of fold groups and evolutionary families in some of the major classifications and we consider some of the problems confronting any general analyses of structural evolution in protein families. We also review some more recent analyses that have expanded these classifications by identifying sequence relatives in the genomes and thereby reveal interesting trends in fold usage and recurrence.
Affiliation(s)
- C A Orengo
- Department of Biochemistry and Molecular Biology, University College, Gower Street, London, WC1E 6BT, United Kingdom
|
43
|
Abstract
Typically, protein spatial structures are more conserved in evolution than amino acid sequences. However, the recent explosion of sequence and structure information accompanied by the development of powerful computational methods led to the accumulation of examples of homologous proteins with globally distinct structures. Significant sequence conservation, local structural resemblance, and functional similarity strongly indicate evolutionary relationships between these proteins despite pronounced structural differences at the fold level. Several mechanisms such as insertions/deletions/substitutions, circular permutations, and rearrangements in beta-sheet topologies account for the majority of detected structural irregularities. The existence of evolutionarily related proteins that possess different folds brings new challenges to the homology modeling techniques and the structure classification strategies and offers new opportunities for protein design in experimental studies.
Affiliation(s)
- N V Grishin
- Howard Hughes Medical Institute, Department of Biochemistry, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, Texas 75390-9050, USA
|
44
|
Abstract
A model for topological coding of proteins is proposed. The model is based on the capacity of hydrogen bonds (property of connectivity) to fix conformations of protein molecules. The protein chain is modeled by an n-arc graph with the following elements: vertices (alpha-carbon atoms), structural edges (peptide bonds) and connectivity edges (virtual edges connecting non-adjacent atoms). It was shown that 64 conformations of the 4-arc graph can be described in the binary system by matrices of six variables which form a supermatrix containing four blocks. On the basis of correspondences between the pairs of variables in matrices and four letters of the genetic code, matrices and supermatrix are converted, respectively, into the triplets and the table of the genetic code. An algorithm admitting computer programming is proposed for coding the n-arc graph and protein chain. Connectivity operators (polar amino acids) are assigned to blocks of triplets coding for cyclic conformations (G, A in the second position), while anti-connectivity operators (non-polar amino acids) correspond to blocks of triplets coding for open conformations (C, U in the second position). Amino acids coded by triplets differing by the first base have different structures. The third base for C, U and G, A is degenerate. Properties of the real genetic code are in full agreement with the model. The model provides an insight into the topological nature of the genetic code and can be used for development of algorithms for the prediction of the protein structure.
Affiliation(s)
- V A Karasev
- St. Petersburg State Electrotechnical University, Prof. Popov str. 5, 197376 St. Petersburg, Russia.
|
45
|
Abstract
The Greek key motifs are the topological signature of many beta-barrels and a majority of beta-sandwich structures. An updated survey of these structures integrates many early observations and newly emerging patterns and provides a better understanding of the unique role of Greek keys in protein structures. A stereotypical Greek key beta-barrel accommodates five or six strands and can have 12 possible topologies. All but one of the six-stranded topologies have been observed, and only one of the five-stranded topologies has been seen in actual structures. Of the representative beta-barrel structures analyzed here, half have left-handed Greek keys. This result challenges the empirical claim of the handedness regularity of Greek keys in beta-barrels. One of the five-stranded topologies that has not been observed in beta-barrels comprises two overlapping Greek keys. The two three-dimensional forms of this topology constitute a structural unit that is present in a vast majority of known beta-sandwich structures. Using this unit as the root, we have built a new taxonomy tree for the beta-sandwich folds and deduced a set of rules that appear to constrain how other beta-strands adjoin the unit to form a larger double-layered structure. These rules, though derived from a larger data set, are essentially the same as those drawn from earlier studies, suggesting that they may reflect the true topological constraints in the design of beta-sandwich structures. Finally, a novel variant of the Greek key motif (defined here as the twisted Greek key) has emerged which introduces loop crossings into the folded structures. Proteins 2000;40:409-419.
Affiliation(s)
- C Zhang
- Department of Chemistry and E.O. Lawrence Berkeley National Laboratory University of California, Berkeley, California
|
46
|
Abstract
Here, we present a systematic analysis of the open-faced beta-sheet topologies in a set of non-redundant protein domain structures; in particular, we focus on the topological diversity of four-stranded beta-sheet motifs. Of the 96 topologies that are possible for a four-stranded beta-sheet, 42 were identified in known protein structures. Of these, four account for 50% of the structures that we have studied. Two sets of the topologies that were not observed may represent the section of the topological space that is not readily accessible to proteins on either thermodynamic or kinetic grounds. The first set contains topologies with alternating parallel and antiparallel beta-ladders. Their rare occurrence reflects the expectation that it is energetically unfavorable to match different hydrogen bonding patterns. The polypeptide chains in the second set of topologies go through convoluted paths and are expected to experience great kinetic frustrations during the folding processes. A knowledge of the potential causes for the topological preference of small beta-sheets also helps us to understand the topological properties of larger beta-sheet structures which frequently contain four-stranded motifs. The notion that protein topologies can only be taken from a confined and discrete space has important implications for structural genomics.
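The figure of 96 possible topologies quoted above can be reproduced by a simple count. The sketch below is not taken from the paper: it assumes that a topology is fixed by the spatial order of the four strands together with the relative orientation (parallel or antiparallel) of each adjacent pair, and that two descriptions related by turning the sheet over are the same topology. The function name `count_sheet_topologies` is hypothetical.

```python
from itertools import permutations, product


def count_sheet_topologies(n_strands: int) -> int:
    """Count open-sheet topologies under the assumptions stated above."""
    seen = set()
    strands = range(1, n_strands + 1)
    # 'order' lists the sequence number of the strand at each spatial position.
    for order in permutations(strands):
        # One flag per adjacent pair: True = parallel, False = antiparallel.
        for pairing in product((True, False), repeat=n_strands - 1):
            # Turning the sheet over reverses both the order and the pairings.
            mirrored = (order[::-1], pairing[::-1])
            # Store one canonical representative of each mirror-related pair.
            seen.add(min((order, pairing), mirrored))
    return len(seen)


print(count_sheet_topologies(4))  # 4! * 2**3 / 2 = 96
```

Under these assumptions the count for n strands is n! · 2^(n-2), which gives 96 for n = 4, in agreement with the number stated in the abstract.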
Affiliation(s)
- C Zhang
- Department of Chemistry, E.O. Lawrence Berkeley National Laboratory, University of California, Berkeley, CA, 94720, USA
|
47
|
Adenot M, Sarrauste de Menthière C, Chavanieu A, Calas B, Grassy G. Peptides quantitative structure-function relationships: an automated mutation strategy to design peptides and pseudopeptides from substitution matrices. J Mol Graph Model 1999; 17:292-309. [PMID: 10840689 DOI: 10.1016/s1093-3263(99)00037-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.3] [Indexed: 11/23/2022]
Abstract
The process by which analogs in peptide chemistry are currently designed does not include any quantitative basis for amino acid substitutions from pharmacological leads. Here, we show that substitution matrices such as PAM 250 can provide quantitative constraints compatible with biological activity. This article describes their use in a strategy of rational amino acid substitution in peptides and proteins: we have computed a chemically derived matrix equivalent to the well-known PAM 250 matrix, which reflects the natural mutability rates of amino acids in protein evolution but can be extended to all the noncoded amino acids. Some of these noncoded amino acids are widely used to mimic secondary structure, to constrain backbone conformation, or to evade protease degradation. An automated sequence mutation (ASM) strategy has been defined to generate mutations within constraints. Application of such a substitution matrix to quantitative structure-function relationship studies will be of use in the design of proteins and peptides destined to become pharmaceutical drugs. In particular, issues such as which functionally conserved substitutions are able to satisfy conformational restrictions, oral bioavailability, or formulation demands can be quantitatively addressed.
Affiliation(s)
- M Adenot
- Centre de Biochimie Structurale, CNRS UMR 9955, INSERM U 414, Faculté de Pharmacie 15, Montpellier, France
|
48
|
Affiliation(s)
- R L Baldwin
- Department of Biochemistry, Stanford University School of Medicine, California 94305-5307, USA
|
49
|
Lomize AL, Pogozheva ID, Mosberg HI. Structural organization of G-protein-coupled receptors. J Comput Aided Mol Des 1999; 13:325-53. [PMID: 10425600 DOI: 10.1023/a:1008050821744] [Citation(s) in RCA: 42] [Impact Index Per Article: 1.7] [Indexed: 11/12/2022]
Abstract
Atomic-resolution structures of the transmembrane 7-alpha-helical domains of 26 G-protein-coupled receptors (GPCRs) (including opsins, cationic amine, melatonin, purine, chemokine, opioid, and glycoprotein hormone receptors and two related proteins, retinochrome and Duffy erythrocyte antigen) were calculated by distance geometry using interhelical hydrogen bonds formed by various proteins from the family and collectively applied as distance constraints, as described previously [Pogozheva et al., Biophys. J., 70 (1997) 1963]. The main structural features of the calculated GPCR models are described and illustrated by examples. Some of the features reflect physical interactions that are responsible for the structural stability of the transmembrane alpha-bundle: the formation of extensive networks of interhelical H-bonds and sulfur-aromatic clusters that are spatially organized as 'polarity gradients'; the close packing of side-chains throughout the transmembrane domain; and the formation of interhelical disulfide bonds in some receptors and a plausible Zn2+ binding center in retinochrome. Other features of the models are related to biological function and evolution of GPCRs: the formation of a common 'minicore' of 43 evolutionarily conserved residues; a multitude of correlated replacements throughout the transmembrane domain; an Na(+)-binding site in some receptors, and excellent complementarity of receptor binding pockets to many structurally dissimilar, conformationally constrained ligands, such as retinal, cyclic opioid peptides, and cationic amine ligands. The calculated models are in good agreement with numerous experimental data.
Affiliation(s)
- A L Lomize
- College of Pharmacy, University of Michigan, Ann Arbor 48109-1065, USA
|
50
|
Abstract
A number of fundamental questions in structural biology concern the diversity of protein architectures (or folds). Here, we address two of them, the size of the universe of folds, and the distribution of sequence families among them, using an analysis based on a new and rigorous statistical sampling method. In particular we show that the number of known non-transmembrane protein folds is approximately one half of the total that exist, and that certain superfolds should exist, which accommodate dozens of non-homologous sequence families.
Affiliation(s)
- C Zhang
- Department of Biomedical Engineering, Boston University College of Engineering, Boston, MA, 02215, USA
|