1
|
Synthesis, Characterization, In vivo, Molecular Docking, ADMET and HOMO-LUMO study of Juvenile Hormone Analogues having sulfonamide feature as an Insect Growth Regulators. J Mol Struct 2021. [DOI: 10.1016/j.molstruc.2021.129945] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022]
|
2
|
Shelah S, Strüngmann L. Infinite combinatorics in mathematical biology. Biosystems 2021; 204:104392. [PMID: 33731280 DOI: 10.1016/j.biosystems.2021.104392] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/17/2020] [Revised: 02/11/2021] [Accepted: 02/18/2021] [Indexed: 12/12/2022]
Abstract
Is it possible to apply infinite combinatorics and (infinite) set theory in theoretical biology? We do not know the answer yet but in this article we try to present some techniques from infinite combinatorics and set theory that have been used over the last decades in order to prove existence results and independence theorems in algebra and that might have the flexibility and generality to be also used in theoretical biology. In particular, we will introduce the theory of forcing and an algebraic construction technique based on trees and forests using infinite binary sequences. We will also present an overview of the theory of circular codes. Such codes had been found in the genetic information and are assumed to play an important role in error detecting and error correcting mechanisms during the process of translation. Finally, examples and constructions of infinite mixed circular codes using binary sequences hopefully show some similarity between these theories - a starting point for future applications.
Affiliation(s)
- Saharon Shelah
- Einstein Institute of Mathematics, The Hebrew University of Jerusalem, 9190401, Jerusalem, Israel; Department of Mathematics, Rutgers University, Piscataway, NJ, 08854-8019, USA.
- Lutz Strüngmann
- Institute of Mathematical Biology, Faculty of Computer Sciences, Mannheim University of Applied Sciences, 68163, Mannheim, Germany.
|
3
|
Abstract
The origin of the modern genetic code and the mechanisms that have contributed to its present form raise many questions. The main goal of this work is to test two hypotheses concerning the development of the genetic code for their compatibility and complementarity and to see whether they could benefit from each other. On the one hand, Gonzalez, Giannerini and Rosa developed a theory based on four-base codons, which they called tesserae. This theory can explain the degeneracy of the modern vertebrate mitochondrial code. On the other hand, in the 1990s, so-called circular codes were discovered in nature, which seem to ensure the maintenance of a correct reading frame during the translation process. It turns out that the two concepts not only do not contradict each other but, on the contrary, complement and enrich each other.
|
4
|
Fimmel E, Strüngmann L. Linear codes and the mitochondrial genetic code. Biosystems 2019; 184:103990. [PMID: 31326431 DOI: 10.1016/j.biosystems.2019.103990] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/09/2019] [Revised: 07/09/2019] [Accepted: 07/10/2019] [Indexed: 11/29/2022]
Abstract
The origin of the genetic code can certainly be regarded as one of the most challenging problems in the theory of molecular evolution. Thus the known variants of the genetic code and a possible common ancestry of them have been studied extensively in the literature. Gonzalez et al. (2012) developed the theory of a primeval mitochondrial genetic code composed of four-base codons. These were called tesserae, and it was shown that the tessera code has some remarkable error detection capabilities. In our paper we show that, using classical coding theory, we can construct the tessera code as a linear coding of the standard genetic code, and that at the same time it can be deduced from the code of all dinucleotides by Plotkin's construction. This shows that the tessera model of the mitochondrial code does not just have a biological explanation but also has a clear mathematical structure. This underlines the role that the tessera model might have played in evolution.
Affiliation(s)
- Elena Fimmel
- Institute of Mathematical Biology, Faculty for Computer Sciences, and Competence Center for Algorithmic and Mathematical Methods in Biology, Biotechnology and Medicine, Mannheim University of Applied Sciences, 68163 Mannheim, Germany.
- Lutz Strüngmann
- Institute of Mathematical Biology, Faculty for Computer Sciences, and Competence Center for Algorithmic and Mathematical Methods in Biology, Biotechnology and Medicine, Mannheim University of Applied Sciences, 68163 Mannheim, Germany.
|
5
|
Fimmel E, Michel CJ, Pirot F, Sereni JS, Strüngmann L. Mixed circular codes. Math Biosci 2019; 317:108231. [PMID: 31325443 DOI: 10.1016/j.mbs.2019.108231] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 09/07/2018] [Revised: 07/16/2019] [Accepted: 07/17/2019] [Indexed: 12/11/2022]
Abstract
By an extensive statistical analysis in genes of bacteria, archaea, eukaryotes, plasmids and viruses, a maximal C3-self-complementary trinucleotide circular code has been found to have the highest average occurrence in the reading frame of the ribosome during translation. Circular codes may play an important role in maintaining the correct reading frame. On the other hand, as several evolutionary theories propose primeval codes based on dinucleotides, trinucleotides and tetranucleotides, mixed circular codes were investigated. By using a graph-theoretical approach of circular codes recently developed, we study mixed circular codes, which are the union of a dinucleotide circular code, a trinucleotide circular code and a tetranucleotide circular code. Maximal mixed circular codes of (di,tri)-nucleotides, (tri,tetra)-nucleotides and (di,tri,tetra)-nucleotides are constructed, respectively. In particular, we show that any maximal dinucleotide circular code of size 6 can be embedded into a maximal mixed (di,tri)-nucleotide circular code such that its trinucleotide component is a maximal C3-comma-free code. The growth function of self-complementary mixed circular codes of dinucleotides and trinucleotides is given. Self-complementary mixed circular codes could have been involved in primitive genetic processes.
Affiliation(s)
- Elena Fimmel
- Institute of Mathematical Biology, Faculty for Computer Sciences, Mannheim University of Applied Sciences, Mannheim 68163, Germany.
- Christian J Michel
- Theoretical Bioinformatics, ICube, C.N.R.S., University of Strasbourg, 300 Boulevard Sébastien Brant, Illkirch 67400, France.
- François Pirot
- Theoretical Bioinformatics, ICube, C.N.R.S., University of Strasbourg, 300 Boulevard Sébastien Brant, Illkirch 67400, France; LORIA (Orpailleur) and Dept. of Mathematics, University of Lorraine and Radboud University, Vandœuvre-lès-Nancy, France and Nijmegen, Netherlands.
- Jean-Sébastien Sereni
- Theoretical Bioinformatics, ICube, C.N.R.S., University of Strasbourg, 300 Boulevard Sébastien Brant, Illkirch 67400, France.
- Lutz Strüngmann
- Institute of Mathematical Biology, Faculty for Computer Sciences, Mannheim University of Applied Sciences, Mannheim 68163, Germany.
|
6
|
Nemzer LR. A binary representation of the genetic code. Biosystems 2017; 155:10-19. [PMID: 28300609 DOI: 10.1016/j.biosystems.2017.03.001] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 10/16/2016] [Revised: 03/03/2017] [Accepted: 03/06/2017] [Indexed: 12/23/2022]
Abstract
This article introduces a novel binary representation of the canonical genetic code based on both the structural similarities of the nucleotides, as well as the physicochemical properties of the encoded amino acids. Each of the four mRNA bases is assigned a unique 2-bit identifier, so that the 64 triplet codons are each indexed by a 6-bit label. The ordering of the bits reflects the hierarchical organization manifested by the DNA replication/repair and tRNA translation systems. In this system, transition and transversion mutations are naturally expressed as binary operations, and the severities of the different point mutations can be analyzed. Using a principal component analysis, it is shown that the physicochemical properties of amino acids related to protein folding also correlate with certain bit positions of their respective labels. Thus, the likelihood for a point mutation to be conservative, and less likely to cause a change in protein functionality, can be estimated.
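The 6-bit labelling described in this abstract can be illustrated with a minimal sketch. The bit assignment below (A=00, G=01, C=10, U=11) is a hypothetical choice made for illustration and is not taken from the article; the function names are likewise assumptions. With this choice, a transition flips the low bit of one base and one kind of transversion flips the high bit, so both become XOR operations on the label.

```python
# Hypothetical 2-bit identifiers (an assumption, NOT the article's assignment).
# High bit: purine (0) vs pyrimidine (1); low bit distinguishes bases within a class.
BASE_BITS = {"A": 0b00, "G": 0b01, "C": 0b10, "U": 0b11}
BITS_BASE = {bits: base for base, bits in BASE_BITS.items()}


def codon_label(codon):
    """Pack a three-base mRNA codon into a 6-bit integer label (0..63)."""
    label = 0
    for base in codon:
        label = (label << 2) | BASE_BITS[base]
    return label


def label_codon(label):
    """Unpack a 6-bit label back into its three-base codon."""
    return "".join(BITS_BASE[(label >> shift) & 0b11] for shift in (4, 2, 0))


def mutate(codon, position, kind):
    """Apply a point mutation at position 0..2 as an XOR on the label.

    'transition' flips the low bit (A<->G, C<->U);
    'transversion' flips the high bit (A<->C, G<->U) under this assignment.
    """
    masks = {"transition": 0b01, "transversion": 0b10}
    if kind not in masks:
        raise ValueError("kind must be 'transition' or 'transversion'")
    shift = 2 * (2 - position)
    return label_codon(codon_label(codon) ^ (masks[kind] << shift))
```

For example, `mutate("AUG", 0, "transition")` turns the first base A into G. Any other one-to-one assignment of 2-bit identifiers would work equally well for indexing; only the interpretation of the individual bits changes.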
Affiliation(s)
- Louis R Nemzer
- Department of Chemistry and Physics, Halmos College of Natural Sciences and Oceanography, Nova Southeastern University, Davie, FL, USA.
|
7
|
Grosjean H, Westhof E. An integrated, structure- and energy-based view of the genetic code. Nucleic Acids Res 2016; 44:8020-40. [PMID: 27448410 PMCID: PMC5041475 DOI: 10.1093/nar/gkw608] [Citation(s) in RCA: 185] [Impact Index Per Article: 23.1] [Received: 05/08/2016] [Revised: 06/11/2016] [Accepted: 06/17/2016] [Indexed: 12/25/2022] Open
Abstract
The principles of mRNA decoding are conserved among all extant life forms. We present an integrative view of all the interaction networks between mRNA, tRNA and rRNA: the intrinsic stability of codon-anticodon duplex, the conformation of the anticodon hairpin, the presence of modified nucleotides, the occurrence of non-Watson-Crick pairs in the codon-anticodon helix and the interactions with bases of rRNA at the A-site decoding site. We derive a more information-rich, alternative representation of the genetic code, that is circular with an unsymmetrical distribution of codons leading to a clear segregation between GC-rich 4-codon boxes and AU-rich 2:2-codon and 3:1-codon boxes. All tRNA sequence variations can be visualized, within an internal structural and energy framework, for each organism, and each anticodon of the sense codons. The multiplicity and complexity of nucleotide modifications at positions 34 and 37 of the anticodon loop segregate meaningfully, and correlate well with the necessity to stabilize AU-rich codon-anticodon pairs and to avoid miscoding in split codon boxes. The evolution and expansion of the genetic code is viewed as being originally based on GC content with progressive introduction of A/U together with tRNA modifications. The representation we present should help the engineering of the genetic code to include non-natural amino acids.
Affiliation(s)
- Henri Grosjean
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ Paris-Sud, Université Paris-Saclay, 91198 Gif-sur-Yvette, France
- Eric Westhof
- Architecture et Réactivité de l'ARN, Université de Strasbourg, Institut de biologie moléculaire et cellulaire du CNRS, 15 rue René Descartes, 67084 Strasbourg, France
|
8
|
Sharma P, Thakur S, Awasthi P. In silico and bio assay of juvenile hormone analogs as an insect growth regulator against Galleria mellonella (wax moth) – Part I. J Biomol Struct Dyn 2016; 34:1061-78. [DOI: 10.1080/07391102.2015.1056549] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 10/21/2022]
Affiliation(s)
- Priyanka Sharma
- Department of Chemistry, National Institute of Technology, Hamirpur, HP 177005, India
- Sunil Thakur
- Institute of Environmental Science and Biotechnology, Hamirpur, HP 177001, India
- Pamita Awasthi
- Department of Chemistry, National Institute of Technology, Hamirpur, HP 177005, India
|
9
|
Fimmel E, Giannerini S, Gonzalez DL, Strüngmann L. Dinucleotide circular codes and bijective transformations. J Theor Biol 2015; 386:159-65. [PMID: 26423358 DOI: 10.1016/j.jtbi.2015.08.034] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 01/20/2015] [Revised: 07/30/2015] [Accepted: 08/29/2015] [Indexed: 11/20/2022]
Abstract
The presence of circular codes in mRNA coding sequences is postulated to be involved in informational mechanisms aimed at detecting and maintaining the normal reading frame during protein synthesis. Most of the recent research is focused on trinucleotide circular codes. However, dinucleotide circular codes are also important, since dinucleotides are ubiquitous in genomes and associated with important biological functions. In this work we adopt the group-theoretic approach used for trinucleotide codes in Fimmel et al. (2015) to study dinucleotide circular codes and highlight their symmetry properties. Moreover, we characterize such codes in terms of n-circularity and provide a graph representation that allows them to be visualized geometrically. The results establish a theoretical framework for the study of the biological implications of dinucleotide circular codes in genomic sequences.
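The graph representation mentioned in this abstract can be sketched for dinucleotide codes as follows. The sketch assumes the criterion from the graph-theoretical literature on circular codes that a code is circular exactly when its associated directed graph has no cycle; for dinucleotides, each word b1b2 contributes one edge from b1 to b2. The function names are illustrative assumptions and are not taken from the article.

```python
def dinucleotide_graph(code):
    """Build the directed graph of a dinucleotide code: one edge b1 -> b2 per word b1b2."""
    graph = {}
    for first, second in code:
        graph.setdefault(first, set()).add(second)
        graph.setdefault(second, set())
    return graph


def is_acyclic(graph):
    """Depth-first search with three colours; a grey-to-grey edge closes a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {vertex: WHITE for vertex in graph}

    def visit(vertex):
        colour[vertex] = GREY
        for neighbour in graph[vertex]:
            if colour[neighbour] == GREY:
                return False
            if colour[neighbour] == WHITE and not visit(neighbour):
                return False
        colour[vertex] = BLACK
        return True

    return all(visit(v) for v in graph if colour[v] == WHITE)


def is_circular(code):
    """Assumed criterion: the code is circular iff its graph is acyclic."""
    return is_acyclic(dinucleotide_graph(code))
```

Under this criterion the six-word code {AC, AG, AT, CG, CT, GT} is circular, because every edge points forward in the order A, C, G, T, whereas {AC, CA} is not, because the two words form the cycle A -> C -> A.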
Affiliation(s)
- Elena Fimmel
- Institute for Mathematical Biology, Faculty of Computer Sciences, Mannheim University of Applied Sciences, 68163 Mannheim, Germany.
- Simone Giannerini
- Department of Statistical Sciences, University of Bologna, 40126, Bologna, Italy.
- Diego Luis Gonzalez
- CNR-IMM, Sezione di Bologna, Via Gobetti 101, I-40129, Bologna, Italia; Department of Statistical Sciences, University of Bologna, 40126, Bologna, Italy.
- Lutz Strüngmann
- Institute for Mathematical Biology, Faculty of Computer Sciences, Mannheim University of Applied Sciences, 68163 Mannheim, Germany.
|
10
|
Sharma P, Thakur S, Awasthi P. Synthesis, Characterization, Biological Evaluation and Docking Study of Heterocyclic-Based Synthetic Sulfonamides as Potential Pesticide Against G. mellonella. Appl Biochem Biotechnol 2015; 176:125-39. [DOI: 10.1007/s12010-015-1562-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 04/29/2014] [Accepted: 03/12/2015] [Indexed: 11/28/2022]
|
11
|
Sciarrino A, Sorba P. Crystal basis model: Codon-Anticodon interaction and genetic code evolution. p-Adic Numbers, Ultrametric Analysis and Applications 2014. [DOI: 10.1134/s2070046614040013] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/23/2022]
|
12
|
Hoyal Cuthill J. The size of the character state space affects the occurrence and detection of homoplasy: modelling the probability of incompatibility for unordered phylogenetic characters. J Theor Biol 2014; 366:24-32. [PMID: 25451518 DOI: 10.1016/j.jtbi.2014.10.033] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 05/12/2014] [Revised: 10/14/2014] [Accepted: 10/24/2014] [Indexed: 11/26/2022]
Abstract
This study models the probability of incompatibility versus compatibility for binary or unordered multistate phylogenetic characters, by treating the allocation of taxa to character states as a classical occupancy problem in probability. It is shown that, under this model, the number of character states has a non-linear effect on the probability of character incompatibility, which is also affected by the number of taxa. Effects on homoplasy from the number of character states are further explored using evolutionary computer simulations. The results indicate that the character state space affects both the known levels of homoplasy (recorded during simulated evolution) and those inferred from parsimony analysis of the resulting character data, with particular relevance for morphological phylogenetic analyses which generally use the parsimony method. When the evolvable state space is large (more potential states per character) there is a reduction in the known occurrence of homoplasy (as reported previously). However, this is not always reflected in the levels of homoplasy detected in a parsimony analysis, because higher numbers of states per character can lead to an increase in the probability of character incompatibility (as well as the maximum homoplasy measurable with some indices). As a result, inferred trends in homoplasy can differ markedly from the underlying trend (that recorded during evolutionary simulation). In such cases, inferred homoplasy can be entirely misleading with regard to tree quality (with higher levels of homoplasy inferred for better quality trees). When rates of evolution are low, commonly used indices such as the number of extra steps (H) and the consistency index (CI) provide relatively good measures of homoplasy. However, at higher rates, estimates may be improved by using the retention index (RI), and particularly by accounting for homoplasy measured among randomised character data using the homoplasy excess ratio (HER).
Affiliation(s)
- Jennifer Hoyal Cuthill
- Department of Earth Sciences, University of Cambridge, Downing Street, Cambridge CB2 3EQ, United Kingdom.
|
13
|
Rosandić M, Paar V. Codon sextets with leading role of serine create "ideal" symmetry classification scheme of the genetic code. Gene 2014; 543:45-52. [PMID: 24709107 DOI: 10.1016/j.gene.2014.04.009] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 03/06/2014] [Accepted: 04/03/2014] [Indexed: 11/17/2022]
Abstract
The standard classification scheme of the genetic code is organized for alphabetic ordering of nucleotides. Here we introduce the new, "ideal" classification scheme in compact form, for the first time generated by codon sextets encoding Ser, Arg and Leu amino acids. The new scheme creates the known purine/pyrimidine, codon-anticodon, and amino/keto type symmetries and a novel A+U rich/C+G rich symmetry. This scheme is built from "leading" and "nonleading" groups of 32 codons each. In the ensuing 4 × 16 scheme, based on trinucleotide quadruplets, Ser has a central role as initial generator. Six codons encoding Ser and six encoding Arg extend continuously along a linear array in the "leading" group, and together with four of six Leu codons uniquely define construction of the "leading" group. The remaining two Leu codons enable construction of the "nonleading" group. The "ideal" genetic code suggests the evolution of genetic code with serine as an initiator.
Affiliation(s)
- Marija Rosandić
- Croatian Academy of Sciences and Arts, Zrinski trg 11, 10000 Zagreb, Croatia
- Vladimir Paar
- Croatian Academy of Sciences and Arts, Zrinski trg 11, 10000 Zagreb, Croatia; Faculty of Science, University of Zagreb, Bijenička 32, 10000 Zagreb, Croatia.
|
14
|
Salinas DG, Gallardo MO, Osorio MI. Probable relationship between partitions of the set of codons and the origin of the genetic code. Biosystems 2014; 117:77-81. [PMID: 24495914 DOI: 10.1016/j.biosystems.2014.01.007] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/17/2013] [Revised: 12/26/2013] [Accepted: 01/24/2014] [Indexed: 11/16/2022]
Abstract
Here we study the distribution of randomly generated partitions of the set of amino acid-coding codons. Some results apply previous work on the Stirling numbers of the second kind and triplet codes to two cases: triplet codes with four stop codons, as in the mammalian mitochondrial genetic code, and hypothetical doublet codes. Extending previous results, we find that the most probable number of blocks of synonymous codons in a genetic code is similar to the number of amino acids when there are four stop codons, as it could also be for a primeval doublet code. We also study the integer partitions associated with patterns of synonymous codons and show, for the canonical code, that the standard deviation inside an integer partition is one of the most probable. We think that, in some early epoch, the genetic code might have reached a maximum of disorder or entropy, independent of the assignment between codons and amino acids, a state similar to the "code freeze" proposed by Francis Crick. In later stages, deterministic rules may have reassigned codons to amino acids, forming the natural codes, such as the canonical code, while keeping the numerical features that describe the set partitions and the integer partitions of the set of amino acid-coding codons, like "fossil numbers".
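The counting behind "the most probable number of blocks" can be sketched with the Stirling numbers of the second kind, S(n, k), which count the partitions of an n-element set into k non-empty blocks. The sketch assumes that every set partition is equally likely, which is an assumption made here and may differ from the sampling model used in the article; the function names are illustrative.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind via S(n, k) = k*S(n-1, k) + S(n-1, k-1)."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


def most_probable_blocks(n):
    """Block count k that maximises S(n, k), assuming uniformly random set partitions."""
    return max(range(1, n + 1), key=lambda k: stirling2(n, k))
```

For a triplet code with four stop codons the relevant set has 64 - 4 = 60 coding codons, so `most_probable_blocks(60)` gives the block count to compare with the number of amino acids; for a doublet code the set has at most 16 elements.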
Affiliation(s)
- Dino G Salinas
- Centro de Investigación Biomédica, Facultad de Medicina, Universidad Diego Portales, Avda. Ejército 141, Santiago, Chile.
- Mauricio O Gallardo
- Centro de Investigación Biomédica, Facultad de Medicina, Universidad Diego Portales, Avda. Ejército 141, Santiago, Chile.
- Manuel I Osorio
- Centro de Investigación Biomédica, Facultad de Medicina, Universidad Diego Portales, Avda. Ejército 141, Santiago, Chile.
|
15
|
Frias D, Monteiro-Cunha JP, Mota-Miranda AC, Fonseca VS, de Oliveira T, Galvao-Castro B, Alcantara LCJ. Human Retrovirus Codon Usage from tRNA Point of View: Therapeutic Insights. Bioinform Biol Insights 2013; 7:335-45. [PMID: 24151425 PMCID: PMC3798314 DOI: 10.4137/bbi.s12093] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 01/02/2023] Open
Abstract
The purpose of this study was to investigate the balance between transfer ribonucleic acid (tRNA) supply and demand in retrovirus-infected cells, seeking the best targets for antiretroviral therapy based on the hypothetical tRNA Inhibition Therapy (TRIT). Codon usage and tRNA gene data were retrieved from public databases. Based on logistic principles, a therapeutic score (T-score) was calculated for all sense codons, in each retrovirus-host system. Codons that are critical for viral protein translation, but not as critical for the host, have the highest T-score values. Theoretically, inactivating the cognate tRNA species should imply a severe reduction of the elongation rate during viral mRNA translation. We developed a method to predict tRNA species critical for retroviral protein synthesis. Four of the best TRIT targets in HIV-1 and HIV-2 encode Large Hydrophobic Residues (LHR), which have a central role in protein folding. One of them, codon CUA, is also a TRIT target in both HTLV-1 and HTLV-2. Therefore, a drug designed for inactivating or reducing the cytoplasmatic concentration of tRNA species with anticodon TAG could attenuate significantly both HIV and HTLV protein synthesis rates. Inversely, replacing codons ending in UA by synonymous codons should increase the expression, which is relevant for DNA vaccine design.
Affiliation(s)
- Diego Frias
- Bahia State University, Salvador, Bahia, Brazil
|
16
|
Pal A, Mukhopadhyay S, Bothra AK. Statistical analysis of pentose phosphate pathway genes from eubacteria and eukarya reveals translational selection as a major force in shaping codon usage pattern. Bioinformation 2013; 9:349-56. [PMID: 23750079 PMCID: PMC3669787 DOI: 10.6026/97320630009349] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/26/2013] [Accepted: 03/27/2013] [Indexed: 11/23/2022] Open
Abstract
Comparative analysis of metabolic pathways among widely diverse species provides an excellent opportunity to extract information about the functional relation of organisms and pentose phosphate pathway exemplifies one such pathway. A comparative codon usage analysis of the pentose phosphate pathway genes of a diverse group of organisms representing different niches and the related factors affecting codon usage with special reference to the major forces influencing codon usage patterns was carried out. It was observed that organism specific codon usage bias percolates into vital metabolic pathway genes irrespective of their near universality. A clear distinction in the codon usage pattern of gram positive and gram negative bacteria, which is a major classification criterion for bacteria, in terms of pentose phosphate pathway was an important observation of this study. The codon utilization scheme in all the organisms indicates the presence of translation selection as a major force in shaping codon usage. Another key observation was the segregation of the H. sapiens genes as a separate cluster by correspondence analysis, which is primarily attributed to the different codon usage pattern in this genus along with its longer gene lengths. We have also analyzed the amino acid distribution comparison of transketolase protein primary structures among all the organisms and found that there is a certain degree of predictability in the composition profile except in A. fumigatus and H. sapiens, where few exceptions are prominent. In A. fumigatus, a human pathogen responsible for invasive aspergillosis, a significantly different codon usage pattern, which finally translated into its amino acid composition model portraying a unique profile in a key pentose phosphate pathway enzyme transketolase was observed.
Affiliation(s)
- Ayon Pal
- Department of Botany, Raiganj College (University College) P.O.- Raiganj, Dist.- Uttar Dinajpur, PIN-733134, West Bengal, India
- Subhasis Mukhopadhyay
- Bioinformatics Centre, Department of Biophysics, Molecular Biology and Bioinformatics University of Calcutta, 92 APC Road, Kolkata-700009, West Bengal, India
- Asim Kumar Bothra
- Cheminformatics Bioinformatics Lab, Department of Chemistry, Raiganj College (University College) P.O.- Raiganj, Dist.- Uttar Dinajpur, PIN-733134, West Bengal, India
|
17
|
Pohl M, Theissen G, Schuster S. GC content dependency of open reading frame prediction via stop codon frequencies. Gene 2012; 511:441-6. [PMID: 23000023 DOI: 10.1016/j.gene.2012.09.031] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 12/22/2011] [Revised: 04/27/2012] [Accepted: 09/05/2012] [Indexed: 11/18/2022]
Abstract
A frequently used approach for detecting potential coding regions is to search for stop codons. In the standard genetic code 3 out of 64 trinucleotides are stop codons. Hence, in random or non-coding DNA one can expect every 21st trinucleotide to have the same sequence as a stop codon. In contrast, the open reading frames (ORFs) of most protein-coding genes are considerably longer. Thus, the stop codon frequency in coding sequences deviates from the background frequency of the corresponding trinucleotides. This has been utilized for gene prediction, in particular, in detecting protein-coding ORFs. Traditional methods based on stop codon frequency are based on the assumption that the GC content is about 50%. However, many genomes show significant deviations from that value. With the presented method we can describe the effects of GC content on the selection of appropriate length thresholds of potentially coding ORFs. Conversely, for a given length threshold, we can calculate the probability of observing it in a random sequence. Thus, we can derive the maximum GC content for which ORF length is practicable as a feature for gene prediction methods and the resulting false positive rates. A rough estimate for an upper limit is a GC content of 80%. This estimate can be made more precise by including further parameters and by taking into account start codons as well. We demonstrate the feasibility of this method by applying it to the genomes of the bacteria Rickettsia prowazekii, Escherichia coli and Caulobacter crescentus, exemplifying the effect of GC content variations according to our predictions. We have adapted the method for predicting coding ORFs by stop codon frequency to the case of GC contents different from 50%. Usually, several methods for gene finding need to be combined. Thus, our results concern a specific part within a package of methods. Interestingly, for genomes with low GC content such as that of R. prowazekii, the presented method provides remarkably good results even when applied alone.
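The dependence of the stop codon background frequency on GC content can be sketched under a simple null model. The sketch assumes independent nucleotides with P(G) = P(C) = gc/2 and P(A) = P(T) = (1 - gc)/2; this model and the function names are assumptions made for illustration and are not necessarily the method of the article. The three stop codons of the standard code are TAA, TAG and TGA.

```python
def stop_codon_probability(gc):
    """Probability that a random trinucleotide is TAA, TAG or TGA at GC content gc."""
    if not 0.0 <= gc <= 1.0:
        raise ValueError("gc must lie in [0, 1]")
    p_at = (1.0 - gc) / 2.0  # P(A) = P(T) under the assumed null model
    p_gc = gc / 2.0          # P(G) = P(C)
    # TAA contributes p_at**3; TAG and TGA each contribute p_at**2 * p_gc.
    return p_at ** 3 + 2.0 * p_at ** 2 * p_gc


def prob_orf_at_least(n_codons, gc):
    """Probability that n_codons consecutive random codons contain no stop codon."""
    return (1.0 - stop_codon_probability(gc)) ** n_codons
```

At a GC content of 50% the function returns 3/64, which matches the abstract's figure of roughly every 21st trinucleotide. As the GC content rises the stop codon probability falls, so long stop-free stretches become more likely by chance and a fixed ORF length threshold yields more false positives.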
Affiliation(s)
- Martin Pohl
- Department of Bioinformatics, Friedrich Schiller University Jena, Ernst-Abbe-Platz 2, 07745 Jena, Germany.
|
18
|
Salinas DG, Gallardo MO, Osorio MI. The most probable number of blocks for the partitions of the set of codons could have determined the number of standard amino acids. Biosystems 2012; 109:133-6. [DOI: 10.1016/j.biosystems.2012.02.007] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 04/27/2011] [Revised: 09/16/2011] [Accepted: 02/28/2012] [Indexed: 10/28/2022]
|
19
|
Sciarrino A, Sorba P. A minimum principle in codon–anticodon interaction. Biosystems 2012; 107:113-9. [DOI: 10.1016/j.biosystems.2011.10.005] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 07/22/2011] [Revised: 10/11/2011] [Accepted: 10/18/2011] [Indexed: 10/16/2022]
|
20
|
Nikolajewa S, Friedel M, Beyer A, Wilhelm T. The new classification scheme of the genetic code, its early evolution, and tRNA usage. J Bioinform Comput Biol 2011; 4:609-20. [PMID: 16819806 DOI: 10.1142/s0219720006001825] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 09/19/2005] [Revised: 12/09/2005] [Accepted: 12/23/2005] [Indexed: 11/18/2022]
Abstract
We present a new classification scheme of the genetic code. In contrast to the standard form it clearly shows five codon symmetries: codon-anticodon, codon-reverse codon, and sense-antisense symmetry, as well as symmetries with respect to purine-pyrimidine (A versus G, U versus C) and keto-aminobase (G versus U, A versus C) exchanges. We study the number of tRNA genes of 16 archaea, 81 bacteria and 7 eukaryotes to analyze whether these symmetries are reflected in the corresponding tRNA usage patterns. Two features are especially striking: reverse stop codons do not have their own tRNAs (with just one exception in human), and A** anticodons are significantly suppressed. Our classification scheme of the genetic code and the identified tRNA usage patterns support recent speculations about the early evolution of the genetic code. In particular, pre-tRNAs might have had the ability to bind to the corresponding codons in two directions.
Affiliation(s)
- Swetlana Nikolajewa
- Theoretical Systems Biology, Institute of Molecular Biotechnology Beutenbergstr, 11, Jena, D-07745, Germany
|
21
|
Zhang Z, Yu J. On the organizational dynamics of the genetic code. Genomics Proteomics & Bioinformatics 2011; 9:21-9. [PMID: 21641559 PMCID: PMC5054158 DOI: 10.1016/s1672-0229(11)60004-1] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Received: 08/30/2010] [Accepted: 10/26/2010] [Indexed: 11/23/2022]
Abstract
The organization of the canonical genetic code needs to be thoroughly illuminated. Here we reorder the four nucleotides—adenine, thymine, guanine and cytosine—according to their emergence in evolution, and apply the organizational rules to devising an algebraic representation for the canonical genetic code. Under a framework of the devised code, we quantify codon and amino acid usages from a large collection of 917 prokaryotic genome sequences, and associate the usages with its intrinsic structure and classification schemes as well as amino acid physicochemical properties. Our results show that the algebraic representation of the code is structurally equivalent to a content-centric organization of the code and that codon and amino acid usages under different classification schemes were correlated closely with GC content, implying a set of rules governing composition dynamics across a wide variety of prokaryotic genome sequences. These results also indicate that codons and amino acids are not randomly allocated in the code, where the six-fold degenerate codons and their amino acids have important balancing roles for error minimization. Therefore, the content-centric code is of great usefulness in deciphering its hitherto unknown regularities as well as the dynamics of nucleotide, codon, and amino acid compositions.
Affiliation(s)
- Zhang Zhang
- Plant Stress Genomics Research Center, Division of Chemical and Life Sciences and Engineering, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
|
22
|
Stability of the genetic code and optimal parameters of amino acids. J Theor Biol 2010; 269:57-63. [PMID: 20955716 DOI: 10.1016/j.jtbi.2010.10.015] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 07/08/2010] [Revised: 09/20/2010] [Accepted: 10/12/2010] [Indexed: 11/24/2022]
Abstract
The standard genetic code is known to be much more efficient in minimizing adverse effects of misreading errors and one-point mutations in comparison with a random code having the same structure, i.e. the same number of codons coding for each particular amino acid. We study the inverse problem, how the code structure affects the optimal physico-chemical parameters of amino acids ensuring the highest stability of the genetic code. It is shown that the choice of two or more amino acids with given properties determines unambiguously all the others. In this sense the code structure determines strictly the optimal parameters of amino acids or the corresponding scales may be derived directly from the genetic code. In the code with the structure of the standard genetic code the resulting values for hydrophobicity obtained in the scheme "leave one out" and in the scheme with fixed maximum and minimum parameters correlate significantly with the natural scale. The comparison of the optimal and natural parameters allows assessing relative impact of physico-chemical and error-minimization factors during evolution of the genetic code. As the resulting optimal scale depends on the choice of amino acids with given parameters, the technique can also be applied to testing various scenarios of the code evolution with increasing number of codified amino acids. Our results indicate the co-evolution of the genetic code and physico-chemical properties of recruited amino acids.
|
23
|
Görnerup O, Jacobi MN. A model-independent approach to infer hierarchical codon substitution dynamics. BMC Bioinformatics 2010; 11:201. [PMID: 20412602 PMCID: PMC2868013 DOI: 10.1186/1471-2105-11-201] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 10/20/2009] [Accepted: 04/23/2010] [Indexed: 12/03/2022] Open
Abstract
Background: Codon substitution constitutes a fundamental process in molecular biology that has been studied extensively. However, prior studies rely on various assumptions, e.g. regarding the relevance of specific biochemical properties, or on conservation criteria for defining substitution groups. Ideally, one would instead like to analyze the substitution process in terms of raw dynamics, independently of underlying system specifics. In this paper we propose a method for doing this by identifying groups of codons and amino acids such that these groups imply closed dynamics. The approach relies on recently developed spectral and agglomerative techniques for identifying hierarchical organization in dynamical systems. Results: We have applied the techniques on an empirically derived Markov model of the codon substitution process that is provided in the literature. Without system specific knowledge of the substitution process, the techniques manage to "blindly" identify multiple levels of dynamics; from amino acid substitutions (via the standard genetic code) to higher order dynamics on the level of amino acid groups. We hypothesize that the acquired groups reflect earlier versions of the genetic code. Conclusions: The results demonstrate the applicability of the techniques. Due to their generality, we believe that they can be used to coarse grain and identify hierarchical organization in a broad range of other biological systems and processes, such as protein interaction networks, genetic regulatory networks and food webs.
Affiliation(s)
- Olof Görnerup
- Complex Systems Group, Department of Energy and Environment, Chalmers University of Technology, 412 96 Göteborg, Sweden.
|
24
|
Chechetkin V, Lobzin V. Local stability and evolution of the genetic code. J Theor Biol 2009; 261:643-53. [DOI: 10.1016/j.jtbi.2009.08.031] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 06/22/2009] [Revised: 08/31/2009] [Accepted: 08/31/2009] [Indexed: 11/25/2022]
|
25
|
Berleant D, White M, Pierce E, Tudoreanu E, Boeszoermenyi A, Shtridelman Y, Macosko JC. The Genetic Code—More Than Just a Table. Cell Biochem Biophys 2009; 55:107-16. [DOI: 10.1007/s12013-009-9060-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 03/27/2009] [Accepted: 07/02/2009] [Indexed: 10/20/2022]
|
26
|
Jiménez-Montaño MA. The fourfold way of the genetic code. Biosystems 2009; 98:105-14. [PMID: 19643160 DOI: 10.1016/j.biosystems.2009.07.006] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 02/24/2009] [Revised: 07/14/2009] [Accepted: 07/16/2009] [Indexed: 11/29/2022]
Abstract
We describe a compact representation of the genetic code that factorizes the table in quartets. It represents a "least grammar" for the genetic language. It is justified by the Klein-4 group structure of RNA bases and codon doublets. The matrix of the outer product between the column-vector of bases and the corresponding row-vector V^T = (C G U A), considered as signal vectors, has a block structure consisting of the four cosets of the K×K group of base transformations acting on doublet AA. This matrix, translated into weak/strong (W/S) and purine/pyrimidine (R/Y) nucleotide classes, leads to a code table with mixed and unmixed families in separate regions. A basic difference between them is the non-commuting (R/Y) doublets: AC/CA, GU/UG. We describe the degeneracy in the canonical code and the systematic changes in deviant codes in terms of the divisors of 24, employing modulo multiplication groups. We illustrate binary sub-codes characterizing mutations in the quartets. We introduce a decision-tree to predict the mode of tRNA recognition corresponding to each codon, and compare our result with related findings by Jestin and Soulé [Jestin, J.-L., Soulé, C., 2007. Symmetries by base substitutions in the genetic code predict 2' or 3' aminoacylation of tRNAs. J. Theor. Biol. 247, 391-394], and the rearrangements of the table by Delarue [Delarue, M., 2007. An asymmetric underlying rule in the assignment of codons: possible clue to a quick early evolution of the genetic code via successive binary choices. RNA 13, 161-169] and Rodin and Rodin [Rodin, S.N., Rodin, A.S., 2008. On the origin of the genetic code: signatures of its primordial complementarity in tRNAs and aminoacyl-tRNA synthetases. Heredity 100, 341-355], respectively.
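The Klein-4 structure of base transformations mentioned in this abstract can be checked directly. The sketch below is a minimal illustration, not the paper's construction: it assumes the four transformations are the identity, the transition (A↔G, C↔U), Watson-Crick complementation (A↔U, C↔G) and the remaining transversion (A↔C, G↔U). All names are hypothetical.

```python
from itertools import product

BASES = "CGUA"

def swap_map(*pairs):
    """Build a base -> base map that swaps the two bases of each pair."""
    mapping = {}
    for x, y in pairs:
        mapping[x], mapping[y] = y, x
    return mapping

# Assumed labelling of the four base transformations.
GROUP = {
    "identity": {b: b for b in BASES},
    # Keeps the purine/pyrimidine (R/Y) class, flips weak/strong (W/S).
    "transition": swap_map(("A", "G"), ("C", "U")),
    # Keeps W/S, flips R/Y.
    "complement": swap_map(("A", "U"), ("C", "G")),
    # Flips both R/Y and W/S.
    "transversion_ac_gu": swap_map(("A", "C"), ("G", "U")),
}

def compose(f, g):
    """Return the map 'apply g first, then f'."""
    return {b: f[g[b]] for b in BASES}

def name_of(mapping):
    """Look up a map among the group elements (raises StopIteration if absent)."""
    return next(name for name, g in GROUP.items() if g == mapping)

# Cayley table: closure holds because name_of succeeds for every product.
cayley = {
    (a, b): name_of(compose(GROUP[a], GROUP[b]))
    for a, b in product(GROUP, repeat=2)
}

if __name__ == "__main__":
    for (a, b), c in cayley.items():
        print(f"{a} * {b} = {c}")
```

Every element is its own inverse and the table is symmetric, which are the defining properties of the Klein four-group.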
Affiliation(s)
- Miguel Angel Jiménez-Montaño
- Division of Mathematics, Science, and Technology, Parker Building, Nova Southeastern University, Fort Lauderdale, FL 33314-7796, USA.
|
27
|
Baranov PV, Venin M, Provan G. Codon size reduction as the origin of the triplet genetic code. PLoS One 2009; 4:e5708. [PMID: 19479032 PMCID: PMC2682656 DOI: 10.1371/journal.pone.0005708] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.8] [Received: 02/10/2009] [Accepted: 04/22/2009] [Indexed: 11/26/2022] Open
Abstract
The genetic code appears to be optimized in its robustness to missense errors and frameshift errors. In addition, the genetic code is near-optimal in terms of its ability to carry information in addition to the sequences of encoded proteins. As evolution has no foresight, optimality of the modern genetic code suggests that it evolved from less optimal code variants. The length of codons in the genetic code is also optimal, as three is the minimal nucleotide combination that can encode the twenty standard amino acids. The apparent impossibility of transitions between codon sizes in a discontinuous manner during evolution has resulted in an unbending view that the genetic code was always triplet. Yet, recent experimental evidence on quadruplet decoding, as well as the discovery of organisms with ambiguous and dual decoding, suggest that the possibility of the evolution of triplet decoding from living systems with non-triplet decoding merits reconsideration and further exploration. To explore this possibility we designed a mathematical model of the evolution of primitive digital coding systems which can decode nucleotide sequences into protein sequences. These coding systems can evolve their nucleotide sequences via genetic events of Darwinian evolution, such as point-mutations. The replication rates of such coding systems depend on the accuracy of the generated protein sequences. Computer simulations based on our model show that decoding systems with codons of length greater than three spontaneously evolve into predominantly triplet decoding systems. Our findings suggest a plausible scenario for the evolution of the triplet genetic code in a continuous manner. This scenario suggests an explanation of how protein synthesis could be accomplished by means of long RNA-RNA interactions prior to the emergence of the complex decoding machinery, such as the ribosome, that is required for stabilization and discrimination of otherwise weak triplet codon-anticodon interactions.
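The remark that three is the minimal codon length able to encode twenty amino acids follows from a simple count: with four nucleotides there are 4^1 = 4 singlets and 4^2 = 16 doublets, both fewer than 20, while 4^3 = 64 triplets suffice. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the count ignores redundancy and error tolerance, which the abstract treats separately.

```python
def min_codon_length(alphabet_size: int, n_meanings: int) -> int:
    """Smallest word length L such that alphabet_size**L >= n_meanings."""
    if alphabet_size < 2 or n_meanings < 1:
        raise ValueError("need alphabet_size >= 2 and n_meanings >= 1")
    length = 1
    while alphabet_size ** length < n_meanings:
        length += 1
    return length

if __name__ == "__main__":
    # Four nucleotides, twenty standard amino acids.
    print(min_codon_length(4, 20))
    # The same count with one extra meaning reserved for a stop signal.
    print(min_codon_length(4, 21))
```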
Affiliation(s)
- Pavel V Baranov
- Biochemistry Department, University College Cork, Cork, Ireland.
|
28
|
Abstract
The codon table for the canonical genetic code can be rearranged in such a way that the code is divided into four quarters and two halves according to the variability of their GC and purine contents, respectively. For prokaryotic genomes, when the genomic GC content increases, their amino acid contents tend to be restricted to the GC-rich quarter and the purine-content insensitive half, where all codons are fourfold degenerate and relatively mutation-tolerant. Conversely, when the genomic GC content decreases, most of the codons retract to the AU-rich quarter and the purine-content sensitive half; most of the codons not only remain encoding physicochemically diversified amino acids but also vary when transversion (between purine and pyrimidine) happens. Amino acids with sixfold-degenerate codons are distributed into all four quarters and across the two halves; their fourfold-degenerate codons are all partitioned into the purine-insensitive half in favor of robustness against mutations. The features manifested in the rearranged codon table explain most of the intrinsic relationship between protein coding sequences (the informational content) and amino acid compositions (the functional content). The renovated codon table is useful in predicting abundant amino acids and positioning the amino acids with related or distinct physicochemical properties.
Affiliation(s)
- Jun Yu
- Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 101300, China.
|
29
|
Root-Bernstein R. Simultaneous origin of homochirality, the genetic code and its directionality. Bioessays 2007; 29:689-98. [PMID: 17563089 DOI: 10.1002/bies.20602] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.1] [Indexed: 11/07/2022]
Abstract
The origin of homochirality in molecules characterizing living systems has remained a mystery since Pasteur's recognition of the problem some 150 years ago.(2-5) Most theories also assume that homochirality emerged in one class of molecules (e.g. ribose) from which it was enriched in other molecules (e.g. amino acids) as well.(2-5) I propose a novel, experimentally testable hypothesis describing a process by which selective chirality in amino acids and ribonucleotides emerged simultaneously and hand-in-hand with the origin and directionality of the genetic code within a system of interactions involving amino acids, peptides, nucleotide bases, their sugars and polynucleotides.
Affiliation(s)
- Robert Root-Bernstein
- Department of Physiology, 2174 Biomedical and Physical Sciences Building, Michigan State University, East Lansing, Michigan, USA.
|
30
|
Liu YZ, Wang TM. Related matrices of DNA primary sequences based on triplets of nucleic acid bases. Chem Phys Lett 2006. [DOI: 10.1016/j.cplett.2005.10.007] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Indexed: 11/30/2022]
|
31
|
Nikolajewa S, Beyer A, Friedel M, Hollunder J, Wilhelm T. Common patterns in type II restriction enzyme binding sites. Nucleic Acids Res 2005; 33:2726-33. [PMID: 15888729 PMCID: PMC1097771 DOI: 10.1093/nar/gki575] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Indexed: 01/02/2023] Open
Abstract
Restriction enzymes are among the best studied examples of DNA binding proteins. In order to find general patterns in DNA recognition sites, which may reflect important properties of protein–DNA interaction, we analyse the binding sites of all known type II restriction endonucleases. We find a significantly enhanced GC content and discuss three explanations for this phenomenon. Moreover, we study patterns of nucleotide order in recognition sites. Our analysis reveals a striking accumulation of adjacent purines (R) or pyrimidines (Y). We discuss three possible reasons: RR/YY dinucleotides are characterized by (i) stronger H-bond donor and acceptor clusters, (ii) specific geometrical properties and (iii) a low stacking energy. These features make RR/YY steps particularly accessible for specific protein–DNA interactions. Finally, we show that the recognition sites of type II restriction enzymes are underrepresented in host genomes and in phage genomes.
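The two site-level quantities discussed in this abstract, GC content and the share of adjacent same-class (RR or YY) dinucleotide steps, are easy to compute for a single recognition sequence. The sketch below is illustrative only: the function names are hypothetical, it assumes unambiguous DNA letters (no degenerate codes such as N, R or Y), and it uses the EcoRI site GAATTC purely as an example input, not as data from the paper.

```python
PURINES = frozenset("AG")

def gc_content(site: str) -> float:
    """Fraction of G or C bases in the site."""
    return sum(base in "GC" for base in site) / len(site)

def same_class_step_fraction(site: str) -> float:
    """Fraction of adjacent base pairs that are RR or YY steps."""
    steps = list(zip(site, site[1:]))
    same = sum((a in PURINES) == (b in PURINES) for a, b in steps)
    return same / len(steps)

if __name__ == "__main__":
    site = "GAATTC"  # EcoRI recognition sequence, used as an example input
    print(f"GC content: {gc_content(site):.3f}")
    print(f"RR/YY step fraction: {same_class_step_fraction(site):.3f}")
```

A statistical claim such as the reported enrichment would require running this over the full set of recognition sites and comparing against a background model, which the sketch does not attempt.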
Affiliation(s)
- Thomas Wilhelm
- To whom correspondence should be addressed. Tel: +49 3641 65 6208; Fax: +49 3641 65 6191;
|