1
Bukhnikashvili L. Overlaps Between CDS Regions of Protein-Coding Genes in the Human Genome: A Case Study on the NR1D1-THRA Gene Pair. J Mol Evol 2023; 91:963-975. [PMID: 38006429 DOI: 10.1007/s00239-023-10147-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2023] [Accepted: 11/12/2023] [Indexed: 11/27/2023]
Abstract
For several decades, it has been known that a substantial number of genes within human DNA exhibit overlap; however, the biological and evolutionary significance of these overlaps remains poorly understood. This study focused on investigating specific instances of overlap where the overlapping DNA region encompasses the coding DNA sequences (CDSs) of protein-coding genes. The results revealed that proteins encoded by overlapping CDSs exhibit greater disorder than those from nonoverlapping CDSs. Additionally, these DNA regions were identified as GC-rich. This could be partially attributed to the absence of stop codons from two distinct reading frames rather than one. Furthermore, these regions were found to harbour fewer single-nucleotide polymorphism (SNP) sites, possibly due to constraints arising from the overlapping state where mutations could affect two genes simultaneously. While elucidating these properties, the NR1D1-THRA gene pair emerged as an exceptional case with highly structured proteins and a distinctly conserved sequence across eutherian mammals. Both NR1D1 and THRA are nuclear receptors lacking a ligand-binding domain at their C-terminus, which is the region where the two genes overlap. The NR1D1 gene is involved in the regulation of circadian rhythm, while the THRA gene encodes a thyroid hormone receptor, and both play crucial roles in various physiological processes. This study suggests that, in addition to their well-established functions, the specifically overlapping CDS regions of these genes may encode protein segments with additional, yet undiscovered, biological roles.
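The proposed link between GC-richness and the absence of stop codons can be illustrated with a minimal Python sketch. The three stop codons of the standard genetic code (TAA, TAG, TGA) are AT-rich, so a stretch that must stay open in more than one reading frame tends to avoid AT-rich triplets. The function names below are hypothetical and not taken from the study, and only the three forward-strand frames are checked, which is a simplifying assumption.

```python
# Illustrative sketch only; function names are hypothetical, not from the study.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def has_stop(seq: str, frame: int) -> bool:
    """Return True if the forward frame (0, 1, or 2) contains a stop codon."""
    seq = seq.upper()
    return any(seq[i:i + 3] in STOP_CODONS for i in range(frame, len(seq) - 2, 3))


def open_frames(seq: str) -> list[int]:
    """Return the forward frames that contain no stop codon."""
    return [frame for frame in range(3) if not has_stop(seq, frame)]


if __name__ == "__main__":
    region = "ATGGCCGCGGGC"  # made-up GC-rich example sequence
    print(gc_content(region), open_frames(region))
    # Mean GC content of the stop codons themselves, for comparison
    print(sum(gc_content(codon) for codon in STOP_CODONS) / len(STOP_CODONS))
```

A fuller analysis would also have to consider frames on the reverse strand, since overlapping genes can lie on opposite strands.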
2
Shang X, Yu P, Yin Y, Zhang Y, Lu Y, Mao Q, Li Y. Effect of selenium-rich Bacillus subtilis against mercury-induced intestinal damage repair and oxidative stress in common carp. Comp Biochem Physiol C Toxicol Pharmacol 2021; 239:108851. [PMID: 32777471 DOI: 10.1016/j.cbpc.2020.108851] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 04/28/2020] [Revised: 07/04/2020] [Accepted: 07/22/2020] [Indexed: 12/13/2022]
Abstract
Mercury (Hg) poisoning in humans and fish represents a significant global problem. Hg is one of the most dangerous threats to the aquatic ecosystem due to its high toxicity. Mercury has a high oxidative stress-inducing potential, and its compounds can exert toxic effects by interacting with many important enzymes involved in the regulation of antioxidants. Selenium (Se) supplementation can restore the activity of mercury-inhibited enzymes. The probiotic Bacillus subtilis is widely used in aquaculture, and it has a certain adsorption effect on heavy metals. The interactions between Hg and Se have been rigorously investigated, particularly due to the observed protective effects of Se against Hg toxicity. The objective of this study was to evaluate whether Se-rich B. subtilis ameliorated Hg-induced toxicity in C. carpio var. specularis. Fish were exposed to waterborne Hg (0.03 mg/L) and fed a diet supplemented with 10^5 cfu/g Se-rich B. subtilis for 30 days. Fish were then sampled, and antioxidant activity and intestinal damage repair were assessed. Our results indicated that Se-rich B. subtilis protected the intestine from Hg-induced morphological changes. Hg treatment significantly decreased the activity levels of SOD, CAT and GSH-PX while increasing the levels of MDA, GST, and GSH. Hg treatment also upregulated the mRNA expression of Nrf2, CAT, GSH-PX and HO-1, and reduced expression of keap1. Se-rich B. subtilis had a significant protective effect against Hg-induced oxidative stress.
Affiliation(s)
- Xinchi Shang
- College of Animal Science and Technology, Jilin Agriculture University, Changchun 130118, China
- Peng Yu
- College of Electronic and Information Engineering, Changchun University of Science and Technology, Changchun, Jilin 130022, China
- Yuwei Yin
- College of Animal Science and Technology, Jilin Agriculture University, Changchun 130118, China
- Yue Zhang
- College of Animal Science and Technology, Jilin Agriculture University, Changchun 130118, China
- Yuting Lu
- College of Animal Science and Technology, Jilin Agriculture University, Changchun 130118, China
- Qiaohong Mao
- College of Animal Medicine, Jilin University, Changchun, China
- Yuehong Li
- College of Animal Science and Technology, Jilin Agriculture University, Changchun 130118, China
3
Evolution of Protein Structure and Stability in Global Warming. Int J Mol Sci 2020; 21:ijms21249662. [PMID: 33352933 PMCID: PMC7767258 DOI: 10.3390/ijms21249662] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 11/30/2020] [Revised: 12/15/2020] [Accepted: 12/16/2020] [Indexed: 12/12/2022] Open
Abstract
This review focuses on the molecular signatures of protein structures in relation to evolution and survival in global warming. It is based on the premise that the power of evolutionary selection may lead to thermotolerant organisms that will repopulate the planet and continue life in general, but perhaps with different kinds of flora and fauna. Our focus is on molecular mechanisms, whereby known examples of thermoresistance and their physicochemical characteristics were noted. A comparison of interactions of diverse residues in proteins from thermophilic and mesophilic organisms, as well as reverse genetic studies, revealed a set of imprecise molecular signatures that pointed to major roles of hydrophobicity, solvent accessibility, disulfide bonds, hydrogen bonds, ionic and π-electron interactions, and an overall condensed packing of the higher-order structure, especially in the hydrophobic regions. Regardless of mutations, specialized protein chaperones may play a cardinal role. In evolutionary terms, thermoresistance to global warming will likely occur in stepwise mutational changes, conforming to the molecular signatures, such that each "intermediate" fits a temporary niche through punctuated equilibrium, while maintaining protein functionality. Finally, the population response of different species to global warming may vary substantially, and, as such, some may evolve while others will undergo catastrophic mass extinction.
4
The Effect of Longwave Ultraviolet Light Radiation on Dendrolimus tabulaeformis Antioxidant and Detoxifying Enzymes. Insects 2019; 11:insects11010001. [PMID: 31861292 PMCID: PMC7022865 DOI: 10.3390/insects11010001] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 11/10/2019] [Revised: 12/14/2019] [Accepted: 12/16/2019] [Indexed: 11/16/2022]
Abstract
Longwave ultraviolet (UVA) light, in the range of 315-400 nm, has been widely used as a light source in the light trapping of insect pests. Previous studies have demonstrated the oxidative stress and lethal effect of UV radiation on insects. In this study, we evaluated the influence of UVA radiation on the antioxidant and detoxifying enzymes of Dendrolimus tabulaeformis. We tested the contents of malondialdehyde (MDA), hydroxyl radical (·OH), hydrogen peroxide (H2O2), reduced glutathione (GSH), and oxidized glutathione (GSSG) following different exposure time periods of UVA light irradiation on D. tabulaeformis adults. In addition, we investigated how the activities of antioxidant and detoxifying enzymes responded to UVA radiation by determining the activities of superoxide dismutase (SOD), catalase (CAT), peroxidase (POD), polyphenol oxidase (PPO), glutathione S-transferase (GST), glutathione reductase (GR), acetylcholinesterase (AChE), carboxylesterase (CarE), alkaline phosphatase (ALP), and acid phosphatase (ACP). Adults were exposed to UVA light for different time periods (0, 5, 15, 30, 60, and 120 min). We found that exposure to UVA light for 5 min resulted in rapid variation in the activities of the antioxidant and detoxification enzyme systems. However, the antioxidant capacity of females was incongruous with that of males following UVA irradiation. Our results confirmed that UVA light irradiation increased the level of oxidative stress and disturbed physiological detoxification in D. tabulaeformis adults. Based on these results, we anticipated that further research into the mechanism by which UVA irradiation affects the antioxidant and detoxifying enzymes of D. tabulaeformis would gain importance, enabling the development and use of new, less toxic and environmentally friendly pesticides.
5
Çelik E, Ollis AA, Lasanajak Y, Fisher AC, Gür G, Smith DF, DeLisa MP. Glycoarrays with engineered phages displaying structurally diverse oligosaccharides enable high-throughput detection of glycan-protein interactions. Biotechnol J 2014; 10:199-209. [PMID: 25263089 DOI: 10.1002/biot.201400354] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 06/06/2014] [Revised: 08/21/2014] [Accepted: 09/24/2014] [Indexed: 02/02/2023]
Abstract
Glycan microarrays have become a powerful platform to investigate the interactions of carbohydrates with a variety of biomolecules. However, the number and diversity of glycans available for use in such arrays represent a key bottleneck in glycan array fabrication. To address this challenge, we describe a novel glycan array platform based on surface patterning of engineered glycophages that display unique carbohydrate epitopes. Specifically, we show that glycophages are compatible with surface immobilization procedures and that phage-displayed oligosaccharides retain the ability to be recognized by different glycan-binding proteins (e.g. antibodies and lectins) after immobilization. A key advantage of glycophage arrays is that large quantities of glycophages can be produced biosynthetically from recombinant bacteria and isolated directly from bacterial supernatants without laborious purification steps. Taken together, the glycophage array technology described here should help to expand the diversity of glycan libraries and provide a complement to the existing toolkit for high-throughput analysis of glycan-protein interactions.
Affiliation(s)
- Eda Çelik
- School of Chemical and Biomolecular Engineering, Cornell University, Ithaca, NY, USA; Department of Chemical Engineering, Hacettepe University, Beytepe, Ankara, Turkey; Bioengineering Division, Institute of Science, Hacettepe University, Beytepe, Ankara, Turkey
6
Are proposed early genetic codes capable of encoding viable proteins? J Mol Evol 2014; 78:263-74. [PMID: 24826911 DOI: 10.1007/s00239-014-9622-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 10/09/2013] [Accepted: 04/28/2014] [Indexed: 01/10/2023]
Abstract
Proteins are elaborate biopolymers balancing between contradicting intrinsic propensities to fold, aggregate, or remain disordered. Assessing their primary structural preferences observable without evolutionary optimization has been reinforced by the recent identification of de novo proteins that have emerged from previously non-coding sequences. In this paper we investigate structural preferences of hypothetical proteins translated from random DNA segments using the standard genetic code and three of its proposed evolutionary predecessor models encoding 10, 6, and 4 amino acids, respectively. Our main assumption is that the disorder, aggregation, and transmembrane helix predictions used are able to reflect the differences in the trends of the protein sets investigated. We found that the 10-residue code encodes proteins that resemble modern proteins in their predicted structural properties. All of the investigated early genetic codes give rise to proteins with enhanced disorder and diminished aggregation propensities. Our results suggest that an ancestral genetic code similar to the proposed 10-residue one is capable of encoding functionally diverse proteins but these might have existed under conditions different from today's common physiological ones. The existence of a protein functional repertoire for the investigated earlier stages that is quite distinct from today's can be deduced from the presented results.
7
Abstract
Although both the most popular form of synthetic biology (SB) and chemical synthetic biology (CSB) share the biotechnologically useful aim of making new forms of life, SB does so by genetic manipulation of extant microorganisms, while CSB utilises classic chemical procedures to obtain biological structures that do not exist in nature. The main query concerning CSB is the philosophical question: why did nature do this, and not that? The idea then is to synthesise alternative structures in order to understand why nature operated in such a particular way. We briefly present here various examples of CSB, including nucleic acids synthesised with pyranose instead of ribose, and proteins with a reduced alphabet of amino acids; we also report the developing research on the "never born proteins" (NBP) and "never born RNA" (NBRNA), up to the minimal cell project, where the issue is the preparation of semi-synthetic cells that can perform the basic functions of biological cells.
Affiliation(s)
- Pier Luigi Luisi
- Department of Materials, Swiss Federal Institute of Technology Zurich (ETHZ), University of Roma Tre, Italy
8
Yanagawa H. Exploration of the Origin and Evolution of Globular Proteins by mRNA Display. Biochemistry 2013; 52:3841-51. [DOI: 10.1021/bi301704x] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 01/16/2023]
Affiliation(s)
- Hiroshi Yanagawa
- Department of Biosciences and Informatics, Faculty of Sciences and Technology, Keio University, 3-14-1, Hiyoshi, Kohoku-ku, Yokohama 223-8522, Japan
9
Approaches to chemical synthetic biology. FEBS Lett 2012; 586:2138-45. [DOI: 10.1016/j.febslet.2012.01.014] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 01/04/2012] [Accepted: 01/10/2012] [Indexed: 11/24/2022]
10
Celik E, Fisher AC, Guarino C, Mansell TJ, DeLisa MP. A filamentous phage display system for N-linked glycoproteins. Protein Sci 2011; 19:2006-13. [PMID: 20669235 DOI: 10.1002/pro.472] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Indexed: 12/21/2022]
Abstract
We have developed a filamentous phage display system for the detection of asparagine-linked glycoproteins in Escherichia coli that carry a plasmid encoding the protein glycosylation locus (pgl) from Campylobacter jejuni. In our assay, fusion of target glycoproteins to the minor phage coat protein g3p results in the display of glycans on phage. The glyco-epitope displayed on phage is the product of biosynthetic enzymes encoded by the C. jejuni pgl pathway and minimally requires three essential factors: a pathway for oligosaccharide biosynthesis, a functional oligosaccharyltransferase, and an acceptor protein with a D/E-X1-N-X2-S/T motif. Glycosylated phages could be recovered by lectin chromatography with enrichment factors as high as 2 × 10^5 per round of panning and these enriched phages retained their infectivity after panning. Using this assay, we show that desired glyco-phenotypes can be reliably selected by panning phage-displayed glycoprotein libraries on lectins that are specific for the glycan. For instance, we used our phage selection to identify permissible residues in the -2 position of the bacterial consensus acceptor site sequence. Taken together, our results demonstrate that a genotype-phenotype link can be established between the phage-associated glyco-epitope and the phagemid-encoded genes for any of the three essential components of the glycosylation process. Thus, we anticipate that our phage display system can be used to isolate interesting variants in any step of the glycosylation process, thereby making it an invaluable tool for genetic analysis of protein glycosylation and for glycoengineering in E. coli cells.
Affiliation(s)
- Eda Celik
- School of Chemical and Biomolecular Engineering, Cornell University, Ithaca, New York 14853, USA
11
Tanaka J, Doi N, Takashima H, Yanagawa H. Comparative characterization of random-sequence proteins consisting of 5, 12, and 20 kinds of amino acids. Protein Sci 2010; 19:786-95. [PMID: 20162614 DOI: 10.1002/pro.358] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Indexed: 12/22/2022]
Abstract
Screening of functional proteins from a random-sequence library has been used to evolve novel proteins in the field of evolutionary protein engineering. However, random-sequence proteins consisting of the 20 natural amino acids tend to aggregate, and the occurrence rate of functional proteins in a random-sequence library is low. From the viewpoint of the origin of life, it has been proposed that primordial proteins consisted of a limited set of amino acids that could have been abundantly formed early during chemical evolution. We have previously found that members of a random-sequence protein library constructed with five primitive amino acids show high solubility (Doi et al., Protein Eng Des Sel 2005;18:279-284). Although such a library is expected to be appropriate for finding functional proteins, the functionality may be limited, because they have no positively charged amino acid. Here, we constructed three libraries of 120-amino acid, random-sequence proteins using alphabets of 5, 12, and 20 amino acids by preselection using mRNA display (to eliminate sequences containing stop codons and frameshifts) and characterized and compared the structural properties of random-sequence proteins arbitrarily chosen from these libraries. We found that random-sequence proteins constructed with the 12-member alphabet (including five primitive amino acids and positively charged amino acids) have higher solubility than those constructed with the 20-member alphabet, though other biophysical properties are very similar in the two libraries. Thus, a library of moderate complexity constructed from 12 amino acids may be a more appropriate resource for functional screening than one constructed from 20 amino acids.
Affiliation(s)
- Junko Tanaka
- Department of Biosciences and Informatics, Keio University, Yokohama 223-8522, Japan
12
Overlapping genes produce proteins with unusual sequence properties and offer insight into de novo protein creation. J Virol 2009; 83:10719-36. [PMID: 19640978 DOI: 10.1128/jvi.00595-09] [Citation(s) in RCA: 141] [Impact Index Per Article: 9.4] [Indexed: 11/20/2022] Open
Abstract
It is widely assumed that new proteins are created by duplication, fusion, or fission of existing coding sequences. Another mechanism of protein birth is provided by overlapping genes. They are created de novo by mutations within a coding sequence that lead to the expression of a novel protein in another reading frame, a process called "overprinting." To investigate this mechanism, we have analyzed the sequences of the protein products of manually curated overlapping genes from 43 genera of unspliced RNA viruses infecting eukaryotes. Overlapping proteins have a sequence composition globally biased toward disorder-promoting amino acids and are predicted to contain significantly more structural disorder than nonoverlapping proteins. By analyzing the phylogenetic distribution of overlapping proteins, we were able to confirm that 17 of these had been created de novo and to study them individually. Most proteins created de novo are orphans (i.e., restricted to one species or genus). Almost all are accessory proteins that play a role in viral pathogenicity or spread, rather than proteins central to viral replication or structure. Most proteins created de novo are predicted to be fully disordered and have a highly unusual sequence composition. This suggests that some viral overlapping reading frames encoding hypothetical proteins with highly biased composition, often discarded as noncoding, might in fact encode proteins. Some proteins created de novo are predicted to be ordered, however, and whenever a three-dimensional structure of such a protein has been solved, it corresponds to a fold previously unobserved, suggesting that the study of these proteins could enhance our knowledge of protein space.
13
Patel SC, Bradley LH, Jinadasa SP, Hecht MH. Cofactor binding and enzymatic activity in an unevolved superfamily of de novo designed 4-helix bundle proteins. Protein Sci 2009; 18:1388-400. [PMID: 19544578 PMCID: PMC2775209 DOI: 10.1002/pro.147] [Citation(s) in RCA: 65] [Impact Index Per Article: 4.3] [Received: 03/01/2009] [Revised: 04/12/2009] [Accepted: 04/13/2009] [Indexed: 11/09/2022]
Abstract
To probe the potential for enzymatic activity in unevolved amino acid sequence space, we created a combinatorial library of de novo 4-helix bundle proteins. This collection of novel proteins can be considered an "artificial superfamily" of helical bundles. The superfamily of 102-residue proteins was designed using binary patterning of polar and nonpolar residues, and expressed in Escherichia coli from a library of synthetic genes. Sequences from the library were screened for a range of biological functions including heme binding and peroxidase, esterase, and lipase activities. Proteins exhibiting these functions were purified and characterized biochemically. The majority of de novo proteins from this superfamily bound the heme cofactor, and a sizable fraction of the proteins showed activity significantly above background for at least one of the tested enzymatic activities. Moreover, several of the designed 4-helix bundle proteins showed activity in all of the assays, thereby demonstrating the functional promiscuity of unevolved proteins. These studies reveal that de novo proteins, which have neither been designed for function nor subjected to evolutionary pressure (either in vivo or in vitro), can provide rudimentary activities and serve as a "feedstock" for evolution.
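Binary patterning fixes only whether each position is polar or nonpolar and leaves the exact residue open. A minimal Python sketch of the idea follows. The two residue sets are those encoded by the degenerate codons VAN (polar) and NTN (nonpolar) in the standard genetic code; the pattern string, the function name, and the uniform sampling are illustrative assumptions and are not taken from the abstract.

```python
import random

# Residues encoded by degenerate codons in the standard genetic code.
POLAR = "DEHKNQ"    # VAN: Asp, Glu, His, Lys, Asn, Gln
NONPOLAR = "FILMV"  # NTN: Phe, Ile, Leu, Met, Val


def sample_binary_pattern(pattern: str, rng: random.Random) -> str:
    """Draw one sequence from a pattern of 'P' (polar) and 'N' (nonpolar)."""
    choices = {"P": POLAR, "N": NONPOLAR}
    return "".join(rng.choice(choices[symbol]) for symbol in pattern)


if __name__ == "__main__":
    # Hypothetical amphipathic-helix-like pattern, chosen only for illustration
    helix_pattern = "PNPPNNPPNPPNNP"
    rng = random.Random(0)  # fixed seed so repeated runs give the same draws
    for _ in range(3):
        print(sample_binary_pattern(helix_pattern, rng))
```

Every sequence drawn this way shares the same polar and nonpolar layout while differing in residue identity, which is the sense in which such a library forms a family of related but distinct proteins.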
Affiliation(s)
- Shona C Patel
- Department of Chemical Engineering, Princeton University, Princeton, New Jersey 08544
- Luke H Bradley
- Department of Chemistry, Princeton University, Princeton, New Jersey 08544
- Sayuri P Jinadasa
- Department of Chemistry, Princeton University, Princeton, New Jersey 08544
- Michael H Hecht
- Department of Chemistry, Princeton University, Princeton, New Jersey 08544
14
Dutta S, Koide A, Koide S. High-throughput analysis of the protein sequence-stability landscape using a quantitative yeast surface two-hybrid system and fragment reconstitution. J Mol Biol 2008; 382:721-33. [PMID: 18674545 DOI: 10.1016/j.jmb.2008.07.036] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 05/23/2008] [Revised: 07/07/2008] [Accepted: 07/12/2008] [Indexed: 12/11/2022]
Abstract
Stability evaluation of many mutants can lead to a better understanding of the sequence determinants of a structural motif and of factors governing protein stability and protein evolution. The traditional biophysical analysis of protein stability is low throughput, limiting our ability to widely explore sequence space in a quantitative manner. In this study, we have developed a high-throughput library screening method for quantifying stability changes, which is based on protein fragment reconstitution and yeast surface display. Our method exploits the thermodynamic linkage between protein stability and fragment reconstitution and the ability of the yeast surface display technique to quantitatively evaluate protein-protein interactions. The method was applied to a fibronectin type III (FN3) domain. Characterization of fragment reconstitution was facilitated by the co-expression of two FN3 fragments, thus establishing a yeast surface two-hybrid method. Importantly, our method does not rely on competition between clones and thus eliminates a common limitation of high-throughput selection methods in which the most stable variants are recovered predominantly. Thus, it allows for the isolation of sequences that exhibit a desired level of stability. We identified more than 100 unique sequences for a beta-bulge motif, which was significantly more informative than natural sequences of the FN3 family in revealing the sequence determinants for the beta-bulge. Our method provides a powerful means for the rapid assessment of the stability of many variants, for the systematic assessment of the contribution of different factors to protein stability, and for enhancement of the protein stability.
Affiliation(s)
- Sanjib Dutta
- Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, IL 60637, USA
15
Go A, Kim S, Baum J, Hecht MH. Structure and dynamics of de novo proteins from a designed superfamily of 4-helix bundles. Protein Sci 2008; 17:821-32. [PMID: 18436954 DOI: 10.1110/ps.073377908] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.5] [Indexed: 10/22/2022]
Abstract
Libraries of de novo proteins provide an opportunity to explore the structural and functional potential of biological molecules that have not been biased by billions of years of evolutionary selection. Given the enormity of sequence space, a rational approach to library design is likely to yield a higher fraction of folded and functional proteins than a stochastic sampling of random sequences. We previously investigated the potential of library design by binary patterning of hydrophobic and hydrophilic amino acids. The structure of the most stable protein from a binary patterned library of de novo 4-helix bundles was solved previously and shown to be consistent with the design. One structure, however, cannot fully assess the potential of the design strategy, nor can it account for differences in the stabilities of individual proteins. To more fully probe the quality of the library, we now report the NMR structure of a second protein, S-836. Protein S-836 proved to be a 4-helix bundle, consistent with design. The similarity between the two solved structures reinforces previous evidence that binary patterning can encode stable, 4-helix bundles. Despite their global similarities, the two proteins have cores that are packed at different degrees of tightness. The relationship between packing and dynamics was probed using the Modelfree approach, which showed that regions containing a high frequency of chemical exchange coincide with less well-packed side chains. These studies show (1) that binary patterning can drive folding into a particular topology without the explicit design of residue-by-residue packing, and (2) that within a superfamily of binary patterned proteins, the structures and dynamics of individual proteins are modulated by the identity and packing of residues in the hydrophobic core.
Affiliation(s)
- Abigail Go
- Department of Chemistry, Princeton University, Princeton, New Jersey 08544, USA
16
Cheng Z, Miskolzie M, Campbell RE. In vivo screening identifies a highly folded beta-hairpin peptide with a structured extension. Chembiochem 2007; 8:880-3. [PMID: 17457813 DOI: 10.1002/cbic.200600565] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
Affiliation(s)
- Zihao Cheng
- Department of Chemistry, University of Alberta, Edmonton, AB, Canada
17
Besenmatter W, Kast P, Hilvert D. Relative tolerance of mesostable and thermostable protein homologs to extensive mutation. Proteins 2006; 66:500-6. [PMID: 17096428 DOI: 10.1002/prot.21227] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.9] [Indexed: 11/10/2022]
Abstract
Evolvability, designability, and plasticity of a protein are properties that are important to protein engineers, but difficult to quantify. Here, we directly compare homologous AroQ chorismate mutases from the thermophile Methanococcus jannaschii and the mesophile Escherichia coli with respect to their capacity to accommodate extensive mutation. The N-terminal helix comprising about 40% of these proteins was randomized at the genetic level using a binary pattern of hydrophobic and hydrophilic residues based on the respective wild-type sequences. Catalytically active library members were identified by a survival-selection assay in a chorismate mutase-deficient E. coli strain. Functional variants were found approximately 10 times more frequently with the thermostable protein compared to its mesostable counterpart. Moreover, detailed sequence analysis revealed that functional M. jannaschii enzyme variants contained a smaller number of conserved residues and tolerated greater variability at individual sequence positions. Our results thus highlight the greater robustness of the thermostable protein with respect to amino acid substitution, while identifying specific sites important for constructing active enzymes. Overall, they support the notion that redesign projects will benefit from using a thermostable starting structure, even at very high mutational loads.
18
Cao HB, Wang CZ, Dobbs D, Ihm Y, Ho KM. Codability criterion for picking proteinlike structures from random three-dimensional configurations. Phys Rev E Stat Nonlin Soft Matter Phys 2006; 74:031921. [PMID: 17025681 DOI: 10.1103/physreve.74.031921] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/06/2004] [Revised: 07/24/2006] [Indexed: 05/12/2023]
Abstract
We show that the dominant eigenvectors of real protein structural contact matrices are highly correlated with their amino acid sequences. These results suggest that an ab initio sequence-independent profile exists for every protein structure and that this profile is highly effective in differentiating the ordering of amino acids in natural protein sequences from random sequences. This profile provides a structural code and is a key for understanding the unique behavior of protein structures. Using a lattice model, we show that there are special codable structures highly separated from random structures in the dominant eigenvector space of their structural contact matrices. As an example, we show our results provide a good explanation for the "designable principle" of protein structures.
Collapse
Affiliation(s)
- Hai-Bo Cao
- Department of Physics and Astronomy, Iowa State University, Ames, Iowa 50011, USA
|
19
|
Floudas C, Fung H, McAllister S, Mönnigmann M, Rajgaria R. Advances in protein structure prediction and de novo protein design: A review. Chem Eng Sci 2006. [DOI: 10.1016/j.ces.2005.04.009] [Citation(s) in RCA: 175] [Impact Index Per Article: 9.7] [Indexed: 01/01/2023]
|
20
|
Doi N, Kakukawa K, Oishi Y, Yanagawa H. High solubility of random-sequence proteins consisting of five kinds of primitive amino acids. Protein Eng Des Sel 2005; 18:279-84. [PMID: 15928003 DOI: 10.1093/protein/gzi034] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.1] [Indexed: 11/12/2022]
Abstract
Searching for functional proteins among random-sequence libraries is a major challenge of protein engineering; the difficulties include the poor solubility of many random-sequence proteins. A library in which most of the polypeptides are soluble and stable would therefore be of great benefit. Although modern proteins consist of 20 amino acids, it has been suggested that early proteins evolved from a reduced alphabet. Here, we have constructed a library of random-sequence proteins consisting of only five amino acids, Ala, Gly, Val, Asp and Glu, which are believed to have been the most abundant in the prebiotic environment. Expression and characterization of arbitrarily chosen proteins in the library indicated that five-alphabet random-sequence proteins have higher solubility than do 20-alphabet random-sequence proteins with a similar level of hydrophobicity. The results support the reduced-alphabet hypothesis of the primordial genetic code and should also be helpful in constructing optimized protein libraries for evolutionary protein engineering.
Affiliation(s)
- Nobuhide Doi
- Department of Biosciences and Informatics, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama 223-8522, Japan
|
21
|
Abstract
Why do proteins adopt the conformations that they do, and what determines their stabilities? While we have come to some understanding of the forces that underlie protein architecture, a precise, predictive, physicochemical explanation is still elusive. Two obstacles to addressing these questions are the unfathomable vastness of protein sequence space, and the difficulty in making direct physical measurements on large numbers of protein variants. Here, we review combinatorial methods that have been applied to problems in protein biophysics over the last 15 years. The effects of hydrophobic core composition, the most important determinant of structure and stability, are still poorly understood. Particular attention is given to core composition as addressed by library methods. Increasingly useful screens and selections, in combination with modern high-throughput approaches borrowed from genomics and proteomics efforts, are making the empirical, statistical correlation between sequence and structure a tractable problem for the coming years.
Affiliation(s)
- Thomas J Magliery
- Department of Molecular Biophysics & Biochemistry, Yale University, New Haven, CT, USA
|