1
Kak S. Self-similarity and the maximum entropy principle in the genetic code. Theory Biosci 2023; 142:205-210. [PMID: 37402087 DOI: 10.1007/s12064-023-00396-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/05/2022] [Accepted: 06/16/2023] [Indexed: 07/05/2023]
Abstract
This paper addresses the relationship between information and the structure of the genetic code. The code has two puzzling anomalies: first, when viewed as 64 sub-cubes of a 4 × 4 × 4 cube, the codons for serine (S) are not contiguous; second, there are amino acids whose codons have zero redundancy, which runs counter to the objective of error correction. To make sense of this, the paper shows that the genetic code must be viewed not only in terms of stereochemical, co-evolution, and error-correction considerations, but also in terms of two additional factors of significance to natural systems: the information-theoretic dimensionality of the code data and the principle of maximum entropy. One implication of the non-integer dimensionality associated with data dimensions is self-similarity across scales, and it is shown that the genetic code satisfies this property. It is further shown that the maximum entropy principle operates through the scrambling of the elements in the sense of maximum algorithmic information complexity, generated by an appropriate exponentiation mapping. The new considerations and the use of the maximum entropy transformation create new constraints that are likely the reasons for the non-uniform codon groups and the codons with no redundancy.
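The two anomalies named in this abstract can be checked directly against the standard codon table. The sketch below is only an illustration, not the paper's method: it treats two codons as neighbours when they differ at exactly one position (Hamming distance 1), which is an assumed and deliberately generous notion of contiguity. Any pair of face-adjacent sub-cubes in a 4 × 4 × 4 arrangement also differs in exactly one coordinate, so a codon set that is split under this test is split under any ordering of the bases along the cube axes. The names `CODON_TABLE`, `components`, `no_redundancy`, and `split_groups` are hypothetical.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group codons by the amino acid (or stop signal "*") they encode.
codons_by_aa = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    codons_by_aa[aa].append(codon)


def hamming(a, b):
    """Number of positions at which two codons differ."""
    return sum(x != y for x, y in zip(a, b))


def components(codons):
    """Split a codon set into groups connected by single-position changes."""
    remaining = set(codons)
    groups = []
    while remaining:
        frontier = [remaining.pop()]
        group = set(frontier)
        while frontier:
            current = frontier.pop()
            linked = {c for c in remaining if hamming(c, current) == 1}
            remaining -= linked
            group |= linked
            frontier.extend(linked)
        groups.append(group)
    return groups


# Amino acids encoded by a single codon (zero redundancy).
no_redundancy = sorted(
    aa for aa, cs in codons_by_aa.items() if aa != "*" and len(cs) == 1
)
# Codon sets that fall apart into more than one connected group.
split_groups = sorted(aa for aa, cs in codons_by_aa.items() if len(components(cs)) > 1)

print(no_redundancy)  # ['M', 'W']
print(split_groups)   # ['S']
```

Under this assumed adjacency, methionine and tryptophan are the only amino acids with a single codon, and serine is the only one whose codons (the TCN block and AGT/AGC) cannot be linked by single-position changes.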
Affiliation(s)
- Subhash Kak
- Chapman University, Orange, CA, 92866, USA.
- Oklahoma State University, Stillwater, OK, 74078, USA.
2
Kalinina AA, Kolesnikov AV, Kozyr AV, Kulikova NL, Zamkova MA, Kazansky DB, Khromykh LM. Preparative Production and Purification of Recombinant Human Cyclophilin A. BIOCHEMISTRY. BIOKHIMIIA 2022; 87:259-268. [PMID: 35526853 DOI: 10.1134/s0006297922030063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2022] [Revised: 02/15/2022] [Accepted: 02/15/2022] [Indexed: 06/14/2023]
Abstract
In this work, we developed a method for the preparative production of recombinant human cyclophilin A (rhCypA) in Escherichia coli. The full-length cDNA encoding human CypA (CYPA) was amplified by RT-PCR from the total RNA of Jurkat human T cell lymphoma cells. The nucleotide sequence of CYPA was optimized to provide highly efficient translation in E. coli. Recombinant CYPA DNA was cloned into the pET22b(+) vector, and the resulting expression plasmid was used to transform E. coli strain BL21(DE3)Gold. The recombinant producer strain of E. coli produced soluble rhCypA in the bacterial cytoplasm. rhCypA accounted for up to 50% of the total cell protein, allowing the production of 1 g of rhCypA per liter of culture. We also developed a method for rhCypA purification consisting of single-step tandem anion-exchange chromatography on DEAE- and Q-Sepharose columns. The protein purity was 95% according to electrophoresis (SDS-PAGE), and its contamination with endotoxin did not exceed 0.05 ng per 1 mg of protein, which met the requirements of the European Pharmacopoeia for injectable preparations. The recombinant protein exhibited the functional features of native CypA, i.e., isomerase activity and chemokine activity, the latter assessed by stimulation of the migration of mouse bone marrow hematopoietic stem cells in vivo. The generated producer strain of E. coli is a super-producer and could be used for large-scale experimental studies of rhCypA and in its preclinical and clinical trials as a drug.
Affiliation(s)
- Anastasiia A Kalinina
- N. N. Blokhin National Medical Research Center of Oncology, the Ministry of Health of the Russian Federation, Moscow, 115478, Russia
- Alexander V Kolesnikov
- State Research Center of Applied Microbiology and Biotechnology, Obolensk, Moscow Region, 142279, Russia
- Arina V Kozyr
- State Research Center of Applied Microbiology and Biotechnology, Obolensk, Moscow Region, 142279, Russia
- Natalia L Kulikova
- Institute of Immunological Engineering, Lyubuchany, Moscow Region, 142380, Russia
- Maria A Zamkova
- N. N. Blokhin National Medical Research Center of Oncology, the Ministry of Health of the Russian Federation, Moscow, 115478, Russia
- Dmitry B Kazansky
- N. N. Blokhin National Medical Research Center of Oncology, the Ministry of Health of the Russian Federation, Moscow, 115478, Russia
- Ludmila M Khromykh
- N. N. Blokhin National Medical Research Center of Oncology, the Ministry of Health of the Russian Federation, Moscow, 115478, Russia
3
Affiliation(s)
- Eugene V. Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
- Artem S. Novozhilov
- Department of Mathematics, North Dakota State University, Fargo, North Dakota 58108, USA
4
Guimarães RC. The Self-Referential Genetic Code is Biologic and Includes the Error Minimization Property. ORIGINS LIFE EVOL B 2015; 45:69-75. [PMID: 25773583 DOI: 10.1007/s11084-015-9417-6] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 09/16/2014] [Accepted: 11/24/2014] [Indexed: 11/26/2022]
Abstract
The distribution of the triplet to amino acid correspondences in the genetic code matrix contains blocks of similarity. There are (a) groups of similar triplets coding for the same amino acid, which is called code degeneracy, and (b) clusters of similar amino acids corresponding to similar triplets. Processes that led to this regionalization have been investigated through a variety of perspectives but no consensus has been reached and no model has been convincing enough to drive experimental tests. Most traditional has been the hypothesis that the code was derived from the standard evolutionary processes of testing variations in the correspondences through the fitness measure of reaching distributions in the matrix space in an optimal manner so that the effects of mutations on protein phenotypes would be minimized, that is, with reduction of the intensity or of the deviant quality of the functional alterations associated with variations. In contrast, the self-referential model for the formation of the code is based on an original regionalization of characters through the concerted superposition of the two components of the encodings: the four modules of dimers of tRNAs are occupied sequentially by sets of amino acids that are also sequentially devoted to fulfilling specific functions in the protein sites and motifs to which they preferentially belong. Therewith, part (b) of the error-minimizing property follows. Part (a) of the property, the code degeneracy, is derived from the synthetase character of developing specificities directed initially to the principal dinucleotides of the triplets, resulting in tetracodonic degeneracy. This was later partly modified during evolution according to the developments of codon usage and the introduction of new amino acids.
Affiliation(s)
- Romeu Cardoso Guimarães
- Lab. Biodiversidade e Evolução Molecular, Dept. Biologia Geral, Inst. Ciências Biológicas, Univ. Federal de Minas Gerais, 31270.901, Belo Horizonte, MG, Brazil
5
Harcombe WR, Delaney NF, Leiby N, Klitgord N, Marx CJ. The ability of flux balance analysis to predict evolution of central metabolism scales with the initial distance to the optimum. PLoS Comput Biol 2013; 9:e1003091. [PMID: 23818838 PMCID: PMC3688462 DOI: 10.1371/journal.pcbi.1003091] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.7] [Received: 08/21/2012] [Accepted: 04/26/2013] [Indexed: 11/21/2022]
Abstract
The most powerful genome-scale framework to model metabolism, flux balance analysis (FBA), is an evolutionary optimality model. It hypothesizes selection upon a proposed optimality criterion in order to predict the set of internal fluxes that would maximize fitness. Here we present a direct test of the optimality assumption underlying FBA by comparing the central metabolic fluxes predicted by multiple criteria to changes measurable by a 13C-labeling method for experimentally-evolved strains. We considered datasets for three Escherichia coli evolution experiments that varied in their length, consistency of environment, and initial optimality. For ten populations that were evolved for 50,000 generations in glucose minimal medium, we observed modest changes in relative fluxes that led to small, but significant decreases in optimality and increased the distance to the predicted optimal flux distribution. In contrast, seven populations evolved on the poor substrate lactate for 900 generations collectively became more optimal and had flux distributions that moved toward predictions. For three pairs of central metabolic knockouts evolved on glucose for 600–800 generations, there was a balance between cases where optimality and flux patterns moved toward or away from FBA predictions. Despite this variation in predictability of changes in central metabolism, two generalities emerged. First, improved growth largely derived from evolved increases in the rate of substrate use. Second, FBA predictions bore out well for the two experiments initiated with ancestors with relatively sub-optimal yield, whereas those begun already quite optimal tended to move somewhat away from predictions. These findings suggest that the tradeoff between rate and yield is surprisingly modest. The observed positive correlation between rate and yield when adaptation initiated further from the optimum resulted in the ability of FBA to use stoichiometric constraints to predict the evolution of metabolism despite selection for rate.
The most common method of modeling genome-scale metabolism, flux balance analysis, involves using known stoichiometry to define feasible metabolic states and then choosing between these states by proposing that evolution has selected a metabolic flux that optimizes fitness. But does evolution optimize metabolism, and if so, what component of metabolism equates to fitness? We directly tested the underlying assumption of stoichiometric optimality by comparing predicted flux distributions with changes in fluxes that occurred following experimental evolution. Across three experiments ranging in length from a few hundred to fifty thousand generations, we found that substrate uptake, an input to the model, always increased, but supposed optimality criteria such as yield only increased sometimes. Despite this, there was a clear trend. Highly optimal ancestors evolved slightly lower yield in the course of increasing the overall rate, whereas more sub-optimal strains were able to increase both. These results suggest that flux balance analysis is capable of predicting either the initial metabolic behavior of strains or how they will evolve, but not both.
Affiliation(s)
- William R. Harcombe
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Nigel F. Delaney
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Nicholas Leiby
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Systems Biology Program, Harvard University, Cambridge, Massachusetts, United States of America
- Niels Klitgord
- Bioinformatics Graduate Program, Boston University, Boston, Massachusetts, United States of America
- Christopher J. Marx
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Faculty of Arts and Sciences Center for Systems Biology, Harvard University, Cambridge, Massachusetts, United States of America
6
Povolotskaya IS, Kondrashov FA, Ledda A, Vlasov PK. Stop codons in bacteria are not selectively equivalent. Biol Direct 2012; 7:30. [PMID: 22974057 PMCID: PMC3549826 DOI: 10.1186/1745-6150-7-30] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 05/13/2012] [Accepted: 08/22/2012] [Indexed: 11/10/2022]
Abstract
BACKGROUND: The evolution and genomic frequencies of stop codons have not been rigorously studied, with the exception of the coding of non-canonical amino acids. Here we study the rate of evolution and frequency distribution of stop codons in bacterial genomes. RESULTS: We show that in bacteria stop codons evolve more slowly than synonymous sites, suggesting the action of weak negative selection. However, the frequency of stop codons relative to genomic nucleotide content indicated that this selection regime is not straightforward. The frequency of TAA and TGA stop codons is GC-content dependent, with TAA decreasing and TGA increasing with GC-content, while TAG frequency is independent of GC-content. Applying a formal, analytical model to these data, we found that the relationship between stop codon frequencies and nucleotide content cannot be explained by mutational biases or selection on nucleotide content. However, with weak nucleotide content-dependent selection on TAG, -0.5 < N_e s < 1.5, the model fits all of the data and recapitulates the relationship between TAG and nucleotide content. For biologically plausible rates of mutation we show that, in bacteria, the TAG stop codon is universally associated with lower fitness, with TAA being optimal for G-content < 16%, while for G-content > 16% TGA has a higher fitness than TAG. CONCLUSIONS: Our data indicate that the TAG codon is universally suboptimal in the bacterial lineage, such that TAA is likely to be the preferred stop codon for low GC content while TGA is the preferred stop codon for high GC content. The optimization of stop codon usage may therefore be useful in genome engineering or gene expression optimization applications.
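The raw quantities this abstract relates, stop codon frequencies and nucleotide content, are straightforward to tabulate. The following is a minimal sketch assuming the input is a list of in-frame coding sequences that each end with their stop codon. It does not reproduce the paper's analytical model, and `toy_genome` is made-up illustrative data rather than a real genome; the function names are hypothetical.

```python
from collections import Counter

STOP_CODONS = ("TAA", "TAG", "TGA")


def gc_content(seq):
    """Fraction of G and C nucleotides in a DNA sequence."""
    seq = seq.upper()
    return sum(seq.count(base) for base in "GC") / len(seq)


def stop_codon_usage(cds_list):
    """Count which stop codon terminates each coding sequence."""
    counts = Counter({stop: 0 for stop in STOP_CODONS})
    for cds in cds_list:
        last_codon = cds.upper()[-3:]
        if last_codon in STOP_CODONS:  # ignore sequences without a stop
            counts[last_codon] += 1
    return counts


# Hypothetical toy input: four short coding sequences.
toy_genome = ["ATGAAATAA", "ATGCCCTGA", "ATGGGGTAG", "ATGTTTTAA"]
usage = stop_codon_usage(toy_genome)
overall_gc = gc_content("".join(toy_genome))

print(dict(usage))           # {'TAA': 2, 'TAG': 1, 'TGA': 1}
print(round(overall_gc, 3))  # 0.333
```

Repeating this tabulation per genome and plotting each stop codon's share against GC content is one way to look for the kind of trend the abstract describes.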
Affiliation(s)
- Inna S Povolotskaya
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG) and UPF, 88 Dr. Aiguader, Barcelona 08003, Spain
7
The genetic code and its optimization for kinetic energy conservation in polypeptide chains. Biosystems 2012; 109:141-4. [DOI: 10.1016/j.biosystems.2012.03.001] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 12/28/2011] [Revised: 01/27/2012] [Accepted: 03/06/2012] [Indexed: 10/28/2022]
8
Ma W. The scenario on the origin of translation in the RNA world: in principle of replication parsimony. Biol Direct 2010; 5:65. [PMID: 21110883 PMCID: PMC3002371 DOI: 10.1186/1745-6150-5-65] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Received: 11/11/2010] [Accepted: 11/27/2010] [Indexed: 01/06/2023]
Abstract
Background: It is now believed that in the origin of life, proteins should have been "invented" in an RNA world. However, due to the complexity of a possible RNA-based proto-translation system, this evolving process seems quite complicated and the associated scenario remains very blurry. Considering that RNA can bind amino acids with specificity, it has been reasonably supposed that initial peptides might have been synthesized on "RNA templates" containing multiple amino acid binding sites. This "Direct RNA Template (DRT)" mechanism is attractive because it should be the simplest mechanism for RNA to synthesize peptides, and is thus very likely to have been adopted initially in the RNA world. How this mechanism could then develop into a proto-translation system is an interesting problem. Presentation of the hypothesis: Here an explanation for this problem is offered based on the principle of "replication parsimony": genetic information tends to be utilized in a parsimonious way under selection pressure, due to its replication cost (e.g., in the RNA world, nucleotides and ribozymes for RNA replication). Because a DRT would be quite long even for a short peptide, its replication cost would be great. Thus the diversity and the length of functional peptides synthesized by the DRT mechanism would be seriously limited. Adaptors (proto-tRNAs) would arise to allow a DRT's complementary strand (called "C-DRT" here) to direct the synthesis of the same peptide synthesized by the DRT itself. Because the C-DRT is a necessary part in the DRT's replication, fewer turns of the DRT's replication would be needed to synthesize definite copies of the functional peptide, thus saving the replication cost. Acting through adaptors, C-DRTs could transform into much shorter templates (called "proto-mRNAs" here) and take over the role of DRTs, thus significantly saving the replication cost.
A proto-rRNA corresponding to the small subunit rRNA would then emerge to aid the binding of proto-tRNAs and proto-mRNAs, allowing the reduction of base pairs between them (ultimately resulting in the triplet anticodon/codon pair), thus further saving the replication cost. In this context, the replication cost saved would allow the appearance of more and longer functional peptides and, finally, proteins. The hypothesis could be called "DRT-RP" ("RP" for "replication parsimony"). Testing the hypothesis: The scenario described here is open for experimental work at some key scenes, including the compact DRT mechanism, the development of adaptors from aa-aptamers, the synthesis of peptides by proto-tRNAs and proto-mRNAs without the participation of proto-rRNAs, etc. Interestingly, a recent computer simulation study has demonstrated the plausibility of one of the evolving processes driven by replication parsimony in the scenario. Implication of the hypothesis: An RNA-based proto-translation system could arise gradually from the DRT mechanism according to the principle of "replication parsimony", that is, to save the replication cost of RNA templates for functional peptides. A surprising side deduction along the logic of the hypothesis is that complex, biosynthetic amino acids might have entered the genetic code earlier than simple, prebiotic amino acids, which is contrary to common sense. Overall, the present discussion clarifies the blurry scenario concerning the origin of translation with a major clue, which shows vividly how life could "manage" to exploit potential chemical resources in nature, eventually in an efficient way over evolution. Reviewers: This article was reviewed by Eugene V. Koonin, Juergen Brosius, and Arcady Mushegian.
Affiliation(s)
- Wentao Ma
- College of Life Sciences, Wuhan University, Wuhan 430072, PR China.
9
Stability of the genetic code and optimal parameters of amino acids. J Theor Biol 2010; 269:57-63. [PMID: 20955716 DOI: 10.1016/j.jtbi.2010.10.015] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 07/08/2010] [Revised: 09/20/2010] [Accepted: 10/12/2010] [Indexed: 11/24/2022]
Abstract
The standard genetic code is known to be much more efficient in minimizing adverse effects of misreading errors and one-point mutations in comparison with a random code having the same structure, i.e. the same number of codons coding for each particular amino acid. We study the inverse problem, how the code structure affects the optimal physico-chemical parameters of amino acids ensuring the highest stability of the genetic code. It is shown that the choice of two or more amino acids with given properties determines unambiguously all the others. In this sense the code structure determines strictly the optimal parameters of amino acids or the corresponding scales may be derived directly from the genetic code. In the code with the structure of the standard genetic code the resulting values for hydrophobicity obtained in the scheme "leave one out" and in the scheme with fixed maximum and minimum parameters correlate significantly with the natural scale. The comparison of the optimal and natural parameters allows assessing relative impact of physico-chemical and error-minimization factors during evolution of the genetic code. As the resulting optimal scale depends on the choice of amino acids with given parameters, the technique can also be applied to testing various scenarios of the code evolution with increasing number of codified amino acids. Our results indicate the co-evolution of the genetic code and physico-chemical properties of recruited amino acids.
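The comparison in the first sentence of this abstract, the standard code against random codes with the same number of codons per amino acid, can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure. The cost is the mean squared hydrophobicity change over all single-nucleotide substitutions between sense codons, with the Kyte-Doolittle scale chosen for convenience. Random codes are produced by shuffling amino acid labels across the 61 sense codons with the stop codons fixed, which is one literal reading of "same number of codons coding for each particular amino acid". The names `mutation_cost` and `random_code_same_counts` are hypothetical.

```python
import random
from collections import Counter
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Kyte-Doolittle hydropathy values (an assumed choice of property scale).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mutation_cost(code, scale=KD):
    """Mean squared property change over single-nucleotide substitutions."""
    total, count = 0.0, 0
    for codon, aa in code.items():
        if aa == "*":
            continue
        for pos, base in product(range(3), BASES):
            if base == codon[pos]:
                continue
            mutant_aa = code[codon[:pos] + base + codon[pos + 1:]]
            if mutant_aa == "*":  # skip substitutions that create a stop
                continue
            total += (scale[aa] - scale[mutant_aa]) ** 2
            count += 1
    return total / count


def random_code_same_counts(code, rng):
    """Shuffle amino acids over sense codons, keeping each codon count."""
    sense = [c for c, aa in code.items() if aa != "*"]
    labels = [code[c] for c in sense]
    rng.shuffle(labels)
    shuffled = dict(code)
    shuffled.update(zip(sense, labels))
    return shuffled


rng = random.Random(0)  # fixed seed so the run is repeatable
standard_cost = mutation_cost(STANDARD)
random_costs = [
    mutation_cost(random_code_same_counts(STANDARD, rng)) for _ in range(200)
]
fraction_better = sum(c < standard_cost for c in random_costs) / len(random_costs)
print(f"standard cost: {standard_cost:.2f}")
print(f"fraction of random codes with lower cost: {fraction_better:.3f}")
```

A small `fraction_better` would indicate that few random codes of the same structure beat the standard code on this particular measure; other property scales or shuffling schemes may give different numbers.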
10
Castro-Chavez F. The rules of variation: amino acid exchange according to the rotating circular genetic code. J Theor Biol 2010; 264:711-21. [PMID: 20371250 PMCID: PMC3130497 DOI: 10.1016/j.jtbi.2010.03.046] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Received: 12/01/2009] [Revised: 03/06/2010] [Accepted: 03/30/2010] [Indexed: 12/11/2022]
Abstract
General guidelines for the molecular basis of functional variation are presented while focused on the rotating circular genetic code and allowable exchanges that make it resistant to genetic diseases under normal conditions. The rules of variation, bioinformatics aids for preventative medicine, are: (1) same position in the four quadrants for hydrophobic codons, (2) same or contiguous position in two quadrants for synonymous or related codons, and (3) same quadrant for equivalent codons. To preserve protein function, amino acid exchange according to the first rule takes into account the positional homology of essential hydrophobic amino acids with every codon with a central uracil in the four quadrants, the second rule includes codons for identical, acidic, or their amidic amino acids present in two quadrants, and the third rule, the smaller, aromatic, stop codons, and basic amino acids, each in proximity within a 90 degree angle. I also define codifying genes and palindromati, CTCGTGCCGAATTCGGCACGAG.
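The abstract does not spell out its definition of "palindromati". Assuming the usual molecular-biology sense of a palindrome, a sequence equal to its own reverse complement, the quoted example can be checked with a few lines. The function names below are hypothetical.

```python
# Map each DNA base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_reverse_palindrome(seq):
    """True if the sequence reads the same on both strands, 5' to 3'."""
    return seq.upper() == reverse_complement(seq)


example = "CTCGTGCCGAATTCGGCACGAG"  # sequence quoted in the abstract
print(is_reverse_palindrome(example))   # True
print(is_reverse_palindrome("ATGCCC"))  # False
```

Under that assumed definition, the 22-nucleotide sequence quoted in the abstract is a palindrome.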