1. Michel CJ, Sereni JS. Genome Galaxy Identified by the Circular Code Theory. Bull Math Biol 2024; 87:5. [PMID: 39589676 DOI: 10.1007/s11538-024-01366-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2024] [Accepted: 09/24/2024] [Indexed: 11/27/2024]
Abstract
The genome galaxy identified in bacteria is studied by expressing the reading frame retrieval (RFR) function according to the YZ-content (GC-, AG- and GT-content) of bacterial codons. We have developed a simple probabilistic model for ambiguous sequences in order to show that the RFR function is a measure of the gene reading frame retrieval. Indeed, the RFR function increases with the ratio of ambiguous sequences and the ratio of ambiguous sequences decreases when the codon usage dispersion increases. The classical GC-content is the best parameter for characterizing the upper arm, which is related to bacterial genes with a low GC-content, and the lower arm, which is related to bacterial genes with a high GC-content. The galaxy center has a GC-content around 0.5. Then, these results are confirmed by expressing the GC-content of bacterial codons as a function of the codon usage dispersion. Finally, the bacterial genome galaxy is better described with the GC3-content in the 3rd codon site compared to the GC1-content and GC2-content in the 1st and 2nd codon sites, respectively. Whereas the codon usage is used extensively by biologists, its dispersion, which is an important parameter to reveal this genome galaxy, is surprisingly little known and unused. Therefore, we have developed a mathematical theory of codon usage dispersion by deriving several formulæ. It shows three important parameters in codon usage: the minimum and maximum codon probabilities and the number of codons with high frequency, i.e. with a probability of at least 1/64. By applying this theory to the evolution of the genetic code, we see that bacteria have optimised the number of codons with high frequency to maximise the codon dispersion, thus maximising the capacity to retrieve the reading frame in genes. The derived formulæ of dispersion can be easily extended to any weighted code over a finite alphabet.
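The codon usage parameters named in this abstract (GC-content per codon site, minimum and maximum codon probabilities, and the number of codons with probability at least 1/64) can be illustrated with a minimal Python sketch. The function names `codon_usage` and `usage_summary` are hypothetical, the input is assumed to be a single in-frame coding sequence, and the minimum is taken over all 64 codons (so an unused codon gives 0). The paper's dispersion formulæ are not reproduced here because the abstract does not state them.

```python
from collections import Counter
from itertools import product

# All 64 trinucleotides over the DNA alphabet.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]

def codon_usage(seq):
    """Return codon probabilities for an in-frame sequence (assumption: frame 0)."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # Ignore codons containing ambiguous letters such as N.
    counts = Counter(c for c in codons if set(c) <= set("ACGT"))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no unambiguous codon found")
    return {c: counts[c] / total for c in CODONS}

def usage_summary(usage):
    """Summarise GC-content per codon site and the three usage parameters."""
    gc_site = [0.0, 0.0, 0.0]
    for codon, p in usage.items():
        for site, base in enumerate(codon):
            if base in "GC":
                gc_site[site] += p
    probs = list(usage.values())
    return {
        "GC1": gc_site[0],
        "GC2": gc_site[1],
        "GC3": gc_site[2],
        "GC": sum(gc_site) / 3,
        "p_min": min(probs),
        "p_max": max(probs),
        # Codons "with high frequency": probability at least 1/64.
        "n_high": sum(1 for p in probs if p >= 1 / 64),
    }

if __name__ == "__main__":
    print(usage_summary(codon_usage("ATGGCCGCCTAA")))
```

For the toy sequence `ATGGCCGCCTAA` (codons ATG, GCC, GCC, TAA) the sketch gives GC1 = GC2 = 0.5 and GC3 = 0.75, with three codons at or above the 1/64 threshold.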
Affiliation(s)
- Christian J Michel: Theoretical Bioinformatics, ICube, C.N.R.S., University of Strasbourg, 300 Boulevard Sébastien Brant, 67400 Illkirch, France
- Jean-Sébastien Sereni: Theoretical Bioinformatics, ICube, C.N.R.S., University of Strasbourg, 300 Boulevard Sébastien Brant, 67400 Illkirch, France
2. Michel CJ. Circular code in introns. Biosystems 2024; 239:105215. [PMID: 38641199 DOI: 10.1016/j.biosystems.2024.105215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Revised: 04/14/2024] [Accepted: 04/15/2024] [Indexed: 04/21/2024]
Abstract
A massive statistical analysis based on the autocorrelation function of the circular code X observed in genes is performed on the (eukaryotic) introns. Surprisingly, a circular code periodicity 0 modulo 3 is identified in 5 groups of introns: birds, ascomycetes, basidiomycetes, green algae and land plants. This circular code periodicity, which is a property of retrieving the reading frame in (protein coding) genes, may suggest that these introns have a coding property. In a well-known way, a periodicity 1 modulo 2 is observed in 6 groups of introns: amphibians, fishes, mammals, other animals, reptiles and apicomplexans. A mixed periodicity modulo 2 and 3 is found in the introns of insects. Astonishingly, a subperiodicity 3 modulo 6 is a common statistical property in these 3 classes of introns. When the particular trinucleotides N1N2N1 of the circular code X are not considered, the circular code periodicity 0 modulo 3, hidden by the periodicity 1 modulo 2, is now retrieved in 5 groups of introns: amphibians, fishes, other animals, reptiles and insects. Thus, 10 groups of introns, taxonomically different, out of 12 have a coding property related to the reading frame retrieval. The trinucleotides N1N2N1 are analysed in the 216 maximal C3 self-complementary trinucleotide circular codes. A hexanucleotide code (words of 6 letters) is proposed to explain the periodicity 3 modulo 6. It could be a trace of more general circular codes at the origin of the circular code X.
Affiliation(s)
- Christian J Michel: Theoretical Bioinformatics, ICube, C.N.R.S., University of Strasbourg, 300 Boulevard Sébastien Brant, 67400 Illkirch, France
3. Fimmel E, Michel CJ, Strüngmann L. Circular mixed sets. Biosystems 2023; 229:104906. [PMID: 37196893 DOI: 10.1016/j.biosystems.2023.104906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Accepted: 04/29/2023] [Indexed: 05/19/2023]
Abstract
In this article, we introduce the new mathematical concept of circular mixed sets of words over an arbitrary finite alphabet. These circular mixed sets may not be codes in the classical sense and hence allow a higher amount of information to be encoded. After describing their basic properties, we generalize a recent graph theoretical approach for circularity and apply it to distinguish codes from sets (i.e. non-codes). Moreover, several methods are given to construct circular mixed sets. Finally, this approach allows us to propose a new evolution model of the present genetic code that could have evolved from a dinucleotide world to a trinucleotide world via circular mixed sets of dinucleotides and trinucleotides.
Affiliation(s)
- Elena Fimmel: Institute of Mathematical Biology, Faculty for Computer Sciences, Mannheim University of Applied Sciences, 68163 Mannheim, Germany
- Christian J Michel: Theoretical Bioinformatics, ICube, University of Strasbourg, C.N.R.S., 300 Boulevard Sébastien Brant, 67400 Illkirch, France
- Lutz Strüngmann: Institute of Mathematical Biology, Faculty for Computer Sciences, Mannheim University of Applied Sciences, 68163 Mannheim, Germany
4. Michel CJ, Sereni JS. Reading Frame Retrieval of Genes: A New Parameter of Codon Usage Based on the Circular Code Theory. Bull Math Biol 2023; 85:24. [PMID: 36826719 PMCID: PMC9950712 DOI: 10.1007/s11538-023-01129-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/25/2022] [Accepted: 01/26/2023] [Indexed: 02/25/2023]
Abstract
Based on the circular code theory, we define a new function f that quantifies the property of reading frame retrieval (RFR) of genes from their codon usage. This RFR function f is computed on a massive scale in genes of genomes of bacteria, eukaryotes and archaea. By expressing f as a function of the mean number [Formula: see text] of codons per gene, a "universal" property is identified, whatever the kingdom: the reading frame retrieval is enhanced in large genes. By investigating this property according to the theory developed, a Spearman's rank correlation with a strong negative coefficient is observed between the codon usage dispersion d (from the uniform codon distribution [Formula: see text]) and the RFR function f, whatever the kingdom (p-values [Formula: see text] in bacteria, [Formula: see text] in eukaryotes and [Formula: see text] in archaea). Thus, the reading frame retrieval is enhanced with the codon usage dispersion. Furthermore, this approach identifies a "genome centre" from which emerge two distinct "genome arms": an upper arm and a lower arm, respectively, above and below the linear regression. The RFR function by itself or combined with classical methods (alignment, phylogeny) could also be a new approach to classify the genomes in the future.
Affiliation(s)
- Christian J. Michel: Theoretical Bioinformatics, ICube, C.N.R.S., University of Strasbourg, 300 Boulevard Sébastien Brant, 67400 Illkirch, France
- Jean-Sébastien Sereni: Theoretical Bioinformatics, ICube, C.N.R.S., University of Strasbourg, 300 Boulevard Sébastien Brant, 67400 Illkirch, France
5. Giannerini S, Gonzalez DL, Goracci G, Danielli A. A role for circular code properties in translation. Sci Rep 2021; 11:9218. [PMID: 33911089 PMCID: PMC8080828 DOI: 10.1038/s41598-021-87534-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/30/2020] [Accepted: 03/23/2021] [Indexed: 11/19/2022] [Open Access]
Abstract
Circular codes represent a form of coding allowing detection/correction of frame-shift errors. Building on recent theoretical advances on circular codes, we provide evidence that protein coding sequences exhibit in-frame circular code marks, that are absent in introns and are intimately linked to the keto-amino transformation of codon bases. These properties strongly correlate with translation speed, codon influence and protein synthesis levels. Strikingly, circular code marks are absent at the beginning of coding sequences, but stably occur 40 codons after the initiator codon, hinting at the translation elongation process. Finally, we use the lens of circular codes to show that codon influence on translation correlates with the strong-weak dichotomy of the first two bases of the codon. The results can lead to defining new universal tools for sequence indicators and sequence optimization for bioinformatics and biotechnological applications, and can shed light on the molecular mechanisms behind the decoding process.
Affiliation(s)
- Simone Giannerini: Department of Statistical Sciences, University of Bologna, Bologna, 40126, Italy
- Diego Luis Gonzalez: Department of Statistical Sciences, University of Bologna, Bologna, 40126, Italy; Institute for Microelectronics and Microsystems - Bologna Unit, CNR, Bologna, 40129, Italy
- Greta Goracci: Department of Statistical Sciences, University of Bologna, Bologna, 40126, Italy
- Alberto Danielli: Department of Pharmacy and Biotechnology, University of Bologna, Bologna, 40126, Italy
6. Michel CJ. Genes on the circular code alphabet. Biosystems 2021; 206:104431. [PMID: 33894288 DOI: 10.1016/j.biosystems.2021.104431] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/09/2021] [Revised: 04/15/2021] [Accepted: 04/15/2021] [Indexed: 02/07/2023]
Abstract
The X motifs, motifs from the circular code X, are enriched in the (protein coding) genes of bacteria, archaea, eukaryotes, plasmids and viruses, moreover, in the minimal gene set belonging to the three domains of life, as well as in tRNA and rRNA sequences. They make it possible to retrieve, maintain and synchronize the reading frame in genes, and contribute to the regulation of gene expression. These results lead here to a theoretical study of genes based on the circular code alphabet. A new occurrence relation of the circular code X under the hypothesis of an equiprobable (balanced) strand pairing is given. Surprisingly, a statistical analysis of a large set of bacterial genes retrieves this relation on the circular code alphabet, but not on the DNA alphabet. Furthermore, the circular code X has the strongest balanced circular code pairing among 216 maximal C3 self-complementary trinucleotide circular codes, a new property of this circular code X. As an application of this theory, different tRNAs studied on the circular code alphabet reveal an unexpected stem structure. Thus, the circular code X would have constructed a coding stem in tRNAs as an outline of the future gene structure and the future DNA double helix.
Affiliation(s)
- Christian J Michel: Theoretical Bioinformatics, ICube, CNRS, University of Strasbourg, 300 Boulevard Sébastien Brant, 67400 Illkirch, France
7. Thompson JD, Ripp R, Mayer C, Poch O, Michel CJ. Potential role of the X circular code in the regulation of gene expression. Biosystems 2021; 203:104368. [PMID: 33567309 DOI: 10.1016/j.biosystems.2021.104368] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/09/2020] [Revised: 01/18/2021] [Accepted: 01/20/2021] [Indexed: 02/06/2023]
Abstract
The X circular code is a set of 20 trinucleotides (codons) that has been identified in the protein-coding genes of most organisms (bacteria, archaea, eukaryotes, plasmids, viruses). It has been shown previously that the X circular code has the important mathematical property of being an error-correcting code. Thus, motifs of the X circular code, i.e. a series of codons belonging to X and called X motifs, allow identification and maintenance of the reading frame in genes. X motifs are significantly enriched in protein-coding genes, but have also been identified in many transfer RNA (tRNA) genes and in important functional regions of the ribosomal RNA (rRNA), notably in the peptidyl transferase center and the decoding center. Here, we investigate the potential role of X motifs as functional elements of protein-coding genes. First, we identify the codons of the X circular code which are frequent or rare in each domain of life (archaea, bacteria, eukaryota) and show that, for the amino acids with the highest codon bias, the preferred codon is often an X codon. We also observe a correlation between the 20 X codons and the optimal codons/dicodons that have been shown to influence translation efficiency. Then, we examined recently published experimental results concerning gene expression levels in diverse organisms. The approach used is the analysis of X motifs according to their density ds(X), i.e. the number of X motifs per kilobase in a gene sequence s. Surprisingly, this simple parameter identifies several unexpected relations between the X circular code and gene expression. For example, the X motifs are significantly enriched in the minimal gene set belonging to the three domains of life, and in codon-optimized genes. Furthermore, the density of X motifs generally correlates with experimental measures of translation efficiency and mRNA stability. 
Taken together, these results lead us to propose that the X motifs may represent a genetic signal contributing to the maintenance of the correct reading frame and the optimization and regulation of gene expression.
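The density ds(X) described in this abstract, the number of X motifs per kilobase of a sequence s, can be sketched in Python as follows. The 20 trinucleotides listed are the X circular code as commonly given in the circular code literature; they are not spelled out on this page. The motif criterion used here (a maximal run of at least `min_codons` consecutive X codons, scanned in a single frame) and the names `x_motifs` and `x_motif_density` are assumptions made for illustration, not the selection criteria of the paper.

```python
# The 20 trinucleotides of the X circular code (recalled from the literature,
# not listed in the abstract above).
X_CODE = frozenset(
    "AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC "
    "GAG GAT GCC GGC GGT GTA GTC GTT TAC TTC".split()
)

def x_motifs(seq, frame=0, min_codons=4):
    """Return maximal runs of consecutive X codons read in one frame.

    Assumption: a run counts as a motif when it has at least `min_codons`
    codons; the threshold is illustrative only.
    """
    seq = seq.upper()
    motifs, run = [], []
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in X_CODE:
            run.append(codon)
        else:
            if len(run) >= min_codons:
                motifs.append("".join(run))
            run = []
    if len(run) >= min_codons:
        motifs.append("".join(run))
    return motifs

def x_motif_density(seq, frame=0, min_codons=4):
    """Number of X motifs per kilobase of sequence."""
    if not seq:
        return 0.0
    return 1000.0 * len(x_motifs(seq, frame, min_codons)) / len(seq)

if __name__ == "__main__":
    gene = "GAAGACGAGGATAAAGCCGGC"
    print(x_motifs(gene))          # one run of four X codons
    print(x_motif_density(gene))   # motifs per 1000 nucleotides
```

In the toy sequence the first four codons (GAA, GAC, GAG, GAT) form one motif, AAA interrupts the run, and the last two X codons are too few to count under the assumed threshold.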
Affiliation(s)
- Julie D Thompson: Department of Computer Science, ICube, CNRS, University of Strasbourg, Strasbourg, France
- Raymond Ripp: Department of Computer Science, ICube, CNRS, University of Strasbourg, Strasbourg, France
- Claudine Mayer: Department of Computer Science, ICube, CNRS, University of Strasbourg, Strasbourg, France; Unité de Microbiologie Structurale, Institut Pasteur, CNRS, 75724, Paris Cedex 15, France; Université Paris Diderot, Sorbonne Paris Cité, 75724, Paris Cedex 15, France
- Olivier Poch: Department of Computer Science, ICube, CNRS, University of Strasbourg, Strasbourg, France
- Christian J Michel: Department of Computer Science, ICube, CNRS, University of Strasbourg, Strasbourg, France
8. Arias L, Martínez F, González D, Flores-Ríos R, Katz A, Tello M, Moreira S, Orellana O. Modification of Transfer RNA Levels Affects Cyclin Aggregation and the Correct Duplication of Yeast Cells. Front Microbiol 2021; 11:607693. [PMID: 33519754 PMCID: PMC7843576 DOI: 10.3389/fmicb.2020.607693] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2020] [Accepted: 12/21/2020] [Indexed: 11/13/2022] [Open Access]
Abstract
Codon usage bias (the preferential use of certain synonymous codons (optimal) over others) is found at the organism level (intergenomic), within specific genomes (intragenomic) and even in certain genes. Whether it is the result of genetic drift due to GC/AT content and/or natural selection is a topic of intense debate. Preferential codons are mostly found in genes encoding highly-expressed proteins, while lowly-expressed proteins usually contain a high proportion of rare (lowly-represented) codons. While optimal codons are decoded by highly expressed tRNAs, rare codons are usually decoded by lowly-represented tRNAs. Whether rare codons play a role in controlling the expression of lowly- or temporarily-expressed proteins is an open question. In this work we approached this question using two strategies, either by replacing rare glycine codons with optimal counterparts in the gene that encodes the cell cycle protein Cdc13, or by overexpressing the tRNA Gly that decodes rare codons from the fission yeast, Schizosaccharomyces pombe. While the replacement of synonymous codons severely affected cell growth, increasing tRNA levels affected the aggregation status of Cdc13 and cell division. These results lead us to think that rare codons in lowly-expressed cyclin proteins are crucial for cell division, and that the overexpression of tRNA that decodes rare codons affects the expression of proteins containing these rare codons. These codons may be the result of the natural selection of codons in genes that encode lowly-expressed proteins.
Affiliation(s)
- Loreto Arias: Programa de Biología Celular y Molecular, Instituto de Ciencias Biomédicas, Facultad de Medicina, Universidad de Chile, Santiago, Chile
- Fabián Martínez: Programa de Biología Celular y Molecular, Instituto de Ciencias Biomédicas, Facultad de Medicina, Universidad de Chile, Santiago, Chile
- Daniela González: Programa de Biología Celular y Molecular, Instituto de Ciencias Biomédicas, Facultad de Medicina, Universidad de Chile, Santiago, Chile
- Rodrigo Flores-Ríos: Programa de Biología Celular y Molecular, Instituto de Ciencias Biomédicas, Facultad de Medicina, Universidad de Chile, Santiago, Chile
- Assaf Katz: Programa de Biología Celular y Molecular, Instituto de Ciencias Biomédicas, Facultad de Medicina, Universidad de Chile, Santiago, Chile
- Mario Tello: Departamento de Biología, Facultad de Química y Biología, Universidad de Santiago de Chile, Santiago, Chile
- Sandra Moreira: Programa de Biología Celular y Molecular, Instituto de Ciencias Biomédicas, Facultad de Medicina, Universidad de Chile, Santiago, Chile
- Omar Orellana: Programa de Biología Celular y Molecular, Instituto de Ciencias Biomédicas, Facultad de Medicina, Universidad de Chile, Santiago, Chile
9. Nesterov-Mueller A, Popov R, Seligmann H. Combinatorial Fusion Rules to Describe Codon Assignment in the Standard Genetic Code. Life (Basel) 2020; 11:life11010004. [PMID: 33374866 PMCID: PMC7824455 DOI: 10.3390/life11010004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 12/01/2020] [Revised: 12/15/2020] [Accepted: 12/21/2020] [Indexed: 11/16/2022] [Open Access]
Abstract
We propose combinatorial fusion rules that describe the codon assignment in the standard genetic code simply and uniformly for all canonical amino acids. These rules become obvious if the origin of the standard genetic code is considered as a result of a fusion of four protocodes: Two dominant AU and GC protocodes and two recessive AU and GC protocodes. The biochemical meaning of the fusion rules consists of retaining the complementarity between cognate codons of the small hydrophobic amino acids and large charged or polar amino acids within the protocodes. The proto tRNAs were assembled in form of two kissing hairpins with 9-base and 10-base loops in the case of dominant protocodes and two 9-base loops in the case of recessive protocodes. The fusion rules reveal the connection between the stop codons, the non-canonical amino acids, pyrrolysine and selenocysteine, and deviations in the translation of mitochondria. Using fusion rules, we predicted the existence of additional amino acids that are essential for the development of the standard genetic code. The validity of the proposed partition of the genetic code into dominant and recessive protocodes is considered referring to state-of-the-art hypotheses. The formation of two aminoacyl-tRNA synthetase classes is compatible with four-protocode partition.
Affiliation(s)
- Alexander Nesterov-Mueller: Institute of Microstructure Technology, Karlsruhe Institute of Technology (KIT), 76344 Eggenstein-Leopoldshafen, Germany
- Roman Popov: Institute of Microstructure Technology, Karlsruhe Institute of Technology (KIT), 76344 Eggenstein-Leopoldshafen, Germany
- Hervé Seligmann: Institute of Microstructure Technology, Karlsruhe Institute of Technology (KIT), 76344 Eggenstein-Leopoldshafen, Germany; The National Natural History Collections, The Hebrew University of Jerusalem, Jerusalem 91904, Israel; Laboratory AGEIS EA 7407, Team Tools for e-Gnosis Medical & Labcom CNRS/UGA/OrangeLabs Telecoms4Health, Faculty of Medicine, Université Grenoble Alpes, F-38700 La Tronche, France
10. Demongeot J, Moreira A, Seligmann H. Negative CG dinucleotide bias: An explanation based on feedback loops between Arginine codon assignments and theoretical minimal RNA rings. Bioessays 2020; 43:e2000071. [PMID: 33319381 DOI: 10.1002/bies.202000071] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/02/2020] [Revised: 11/23/2020] [Accepted: 11/26/2020] [Indexed: 01/05/2023]
Abstract
Theoretical minimal RNA rings are candidate primordial genes evolved for non-redundant coding of the genetic code's 22 coding signals (one codon per biogenic amino acid, a start and a stop codon) over the shortest possible length: 29520 22-nucleotide-long RNA rings solve this min-max constraint. Numerous RNA ring properties are reminiscent of natural genes. Here we present analyses showing that all RNA rings lack dinucleotide CG (a mutable, chemically instable dinucleotide coding for Arginine), bearing a resemblance to known CG-depleted genomes. CG in "incomplete" RNA rings (not coding for all coding signals, with only 3-12 nucleotides) gradually decreases towards CG absence in complete, 22-nucleotide-long RNA rings. Presumably, feedback loops during RNA ring growth during evolution (when amino acid assignment fixed the genetic code) assigned Arg to codons lacking CG (AGR) to avoid CG. Hence, as a chemical property of base pairs, CG mutability restructured the genetic code, thereby establishing itself as genetically encoded biological information.
Affiliation(s)
- Jacques Demongeot: Laboratory AGEIS EA 7407, Team Tools for e-Gnosis Medical & Labcom CNRS/UGA/OrangeLabs Telecom4Health, Faculty of Medicine, Université Grenoble Alpes, La Tronche, France
- Andrés Moreira: Departamento de Informática, Universidad Técnica Federico Santa María, Santiago, Chile
- Hervé Seligmann: Laboratory AGEIS EA 7407, Team Tools for e-Gnosis Medical & Labcom CNRS/UGA/OrangeLabs Telecom4Health, Faculty of Medicine, Université Grenoble Alpes, La Tronche, France; The National Natural History Collections, The Hebrew University of Jerusalem, Jerusalem, Israel; Institute of Microstructure Technology, Karlsruhe Institute of Technology (KIT), Eggenstein-Leopoldshafen, Germany
11. The maximality of circular codes in genes statistically verified. Biosystems 2020; 197:104201. [DOI: 10.1016/j.biosystems.2020.104201] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 05/10/2020] [Revised: 06/22/2020] [Accepted: 06/22/2020] [Indexed: 11/18/2022]
12. Fimmel E, Michel CJ, Pirot F, Sereni JS, Starman M, Strüngmann L. The Relation Between k-Circularity and Circularity of Codes. Bull Math Biol 2020; 82:105. [PMID: 32754878 PMCID: PMC7402406 DOI: 10.1007/s11538-020-00770-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 02/04/2020] [Accepted: 06/24/2020] [Indexed: 12/15/2022]
Abstract
A code X is k-circular if any concatenation of at most k words from X, when read on a circle, admits exactly one partition into words from X. It is circular if it is k-circular for every integer k. While it is not a priori clear from the definition, there exists, for every pair (n, ℓ), an integer k such that every k-circular ℓ-letter code over an alphabet of cardinality n is circular, and we determine the least such integer k for all values of n and ℓ. The k-circular codes may represent an important evolutionary step between the circular codes, such as the comma-free codes, and the genetic code.
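The definition of k-circularity given in this abstract can be checked by brute force for small codes. The Python sketch below is an illustration under stated assumptions, not the method of the paper: the function name `is_k_circular` is hypothetical, and the code is assumed to consist of words of one common length ℓ. Under that assumption, a circular word built from m words has a second partition exactly when some rotation by 1 to ℓ - 1 letters also splits into words of the code.

```python
from itertools import product

def is_k_circular(code, k):
    """Brute-force test of k-circularity for a code whose words share one length.

    For every concatenation of m <= k words, the word is read on a circle and
    rotated by 1 .. ell-1 letters; if any rotation also splits into words of
    the code, a second partition exists and the code is not k-circular.
    """
    words = set(code)
    if not words:
        return True
    lengths = {len(w) for w in words}
    if len(lengths) != 1:
        raise ValueError("this sketch only handles words of a single length")
    ell = lengths.pop()
    for m in range(1, k + 1):
        for combo in product(sorted(words), repeat=m):
            circle = "".join(combo)
            for shift in range(1, ell):
                rotated = circle[shift:] + circle[:shift]
                blocks = [rotated[i:i + ell] for i in range(0, len(rotated), ell)]
                if all(block in words for block in blocks):
                    return False
    return True

if __name__ == "__main__":
    toy = {"AC", "GT", "CG", "TA"}
    print(is_k_circular(toy, 1))  # True: no word is a rotation of a word
    print(is_k_circular(toy, 2))  # False: ACGT read from C gives CG, TA
```

The toy dinucleotide code is 1-circular but not 2-circular, because the circular word ACGT splits both as AC, GT and as CG, TA. The running time grows as |X|^k, so the sketch is only practical for small k.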
Affiliation(s)
- Elena Fimmel: Institute of Mathematical Biology, Faculty for Computer Sciences, Mannheim University of Applied Sciences, 68163 Mannheim, Germany
- Christian J. Michel: Theoretical Bioinformatics, ICube, C.N.R.S., University of Strasbourg, 300 Boulevard Sébastien Brant, 67400 Illkirch, France
- François Pirot: Theoretical Bioinformatics, ICube, C.N.R.S., University of Strasbourg, 300 Boulevard Sébastien Brant, 67400 Illkirch, France; LORIA (Orpailleur), C.N.R.S., University of Lorraine, INRIA, Campus scientifique, 54506 Vandœuvre-lès-Nancy Cedex, France
- Jean-Sébastien Sereni: Theoretical Bioinformatics, ICube, C.N.R.S., University of Strasbourg, 300 Boulevard Sébastien Brant, 67400 Illkirch, France
- Martin Starman: Institute of Mathematical Biology, Faculty for Computer Sciences, Mannheim University of Applied Sciences, 68163 Mannheim, Germany; Theoretical Bioinformatics, ICube, C.N.R.S., University of Strasbourg, 300 Boulevard Sébastien Brant, 67400 Illkirch, France
- Lutz Strüngmann: Institute of Mathematical Biology, Faculty for Computer Sciences, Mannheim University of Applied Sciences, 68163 Mannheim, Germany
13. Dila G, Michel CJ, Thompson JD. Optimality of circular codes versus the genetic code after frameshift errors. Biosystems 2020; 195:104134. [DOI: 10.1016/j.biosystems.2020.104134] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/07/2020] [Revised: 03/23/2020] [Accepted: 03/25/2020] [Indexed: 12/24/2022]
14. Demongeot J, Seligmann H. Accretion history of large ribosomal subunits deduced from theoretical minimal RNA rings is congruent with histories derived from phylogenetic and structural methods. Gene 2020; 738:144436. [PMID: 32027954 DOI: 10.1016/j.gene.2020.144436] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 11/07/2019] [Revised: 01/24/2020] [Accepted: 02/01/2020] [Indexed: 12/17/2022]
Abstract
Accretions of tRNAs presumably formed the large complex ribosomal RNA structures. Similarities of tRNA secondary structures with rRNA secondary structures increase with the integration order of their cognate amino acid in the genetic code, indicating tRNA evolution towards rRNA-like structures. Here analyses rank secondary structure subelements of three large ribosomal RNAs (Prokaryota: Archaea: Thermus thermophilus; Bacteria: Escherichia coli; Eukaryota: Saccharomyces cerevisiae) in relation to their similarities with secondary structures formed by presumed proto-tRNAs, represented by 25 theoretical minimal RNA rings. These ranks are compared to those derived from two independent methods (ranks provide a relative evolutionary age to the rRNA substructure), (a) cladistic phylogenetic analyses and (b) 3D-crystallography where core subelements are presumed ancient and peripheral ones recent. Comparisons of rRNA secondary structure subelements with RNA ring secondary structures show congruence between ranks deduced by this method and both (a) and (b) (more with (a) than (b)), especially for RNA rings with predicted ancient cognate amino acid. Reconstruction of accretion histories of large rRNAs will gain from adequately integrating information from independent methods. Theoretical minimal RNA rings, sequences deterministically designed in silico according to specific coding constraints, might produce adequate scales for prebiotic and early life molecular evolution.
Affiliation(s)
- Jacques Demongeot: Université Grenoble Alpes, Faculty of Medicine, Laboratory AGEIS EA 7407, Team Tools for e-Gnosis Medical & Labcom CNRS/UGA/OrangeLabs Telecoms4Health, F-38700 La Tronche, France
- Hervé Seligmann: Université Grenoble Alpes, Faculty of Medicine, Laboratory AGEIS EA 7407, Team Tools for e-Gnosis Medical & Labcom CNRS/UGA/OrangeLabs Telecoms4Health, F-38700 La Tronche, France; The National Natural History Collections, The Hebrew University of Jerusalem, 91404 Jerusalem, Israel