1
Salman A, Biziaev N, Shuvalova E, Alkalaeva E. mRNA context and translation factors determine decoding in alternative nuclear genetic codes. Bioessays 2024; 46:e2400058. [PMID: 38724251 DOI: 10.1002/bies.202400058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2024] [Revised: 04/19/2024] [Accepted: 04/23/2024] [Indexed: 06/27/2024]
Abstract
The genetic code is a set of instructions that determine how the information in our genetic material is translated into amino acids. In general, it is universal for all organisms, from viruses and bacteria to humans. However, in the last few decades, exceptions to this rule have been identified both in pro- and eukaryotes. In this review, we discuss the 16 described alternative eukaryotic nuclear genetic codes and observe theories of their appearance in evolution. We consider possible molecular mechanisms that allow codon reassignment. Most reassignments in nuclear genetic codes are observed for stop codons. Moreover, in several organisms, stop codons can simultaneously encode amino acids and serve as termination signals. In this case, the meaning of the codon is determined by the additional factors besides the triplets. A comprehensive review of various non-standard coding events in the nuclear genomes provides a new insight into the translation mechanism in eukaryotes.
Affiliation(s)
- Ali Salman
  - Engelhardt Institute of Molecular Biology, the Russian Academy of Sciences, Moscow, Russia
- Nikita Biziaev
  - Engelhardt Institute of Molecular Biology, the Russian Academy of Sciences, Moscow, Russia
- Ekaterina Shuvalova
  - Engelhardt Institute of Molecular Biology, the Russian Academy of Sciences, Moscow, Russia
- Elena Alkalaeva
  - Engelhardt Institute of Molecular Biology, the Russian Academy of Sciences, Moscow, Russia
2
Ardern Z. Alternative Reading Frames are an Underappreciated Source of Protein Sequence Novelty. J Mol Evol 2023; 91:570-580. [PMID: 37326679 DOI: 10.1007/s00239-023-10122-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2022] [Accepted: 05/31/2023] [Indexed: 06/17/2023]
Abstract
Protein-coding DNA sequences can be translated into completely different amino acid sequences if the nucleotide triplets used are shifted by a non-triplet amount on the same DNA strand or by translating codons from the opposite strand. Such "alternative reading frames" of protein-coding genes are a major contributor to the evolution of novel protein products. Recent studies demonstrating this include examples across the three domains of cellular life and in viruses. These sequences increase the number of trials potentially available for the evolutionary invention of new genes and also have unusual properties which may facilitate gene origin. There is evidence that the structure of the standard genetic code contributes to the features and gene-likeness of some alternative frame sequences. These findings have important implications across diverse areas of molecular biology, including for genome annotation, structural biology, and evolutionary genomics.
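As a rough illustration of the "alternative reading frames" described in the abstract above, the following Python sketch translates one DNA string in all six frames (three shifts on each strand) under the standard genetic code. It is not taken from the cited paper: the function names `translate` and `six_frames` and the toy sequence are hypothetical, and the input is assumed to contain only uppercase A, C, G and T.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate consecutive complete codons, ignoring a trailing partial codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def six_frames(dna: str) -> dict:
    """Translate a sequence in the three forward and three reverse frames."""
    rc = reverse_complement(dna)
    frames = {}
    for shift in range(3):
        frames[f"+{shift + 1}"] = translate(dna[shift:])
        frames[f"-{shift + 1}"] = translate(rc[shift:])
    return frames


if __name__ == "__main__":
    # Toy sequence, chosen only for illustration.
    for name, protein in six_frames("ATGAAATAG").items():
        print(name, protein)
```

Each frame yields a different amino acid string from the same nucleotides, which is the property the abstract refers to when it says shifted or opposite-strand translation produces completely different sequences.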
3
Omachi Y, Saito N, Furusawa C. Rare-event sampling analysis uncovers the fitness landscape of the genetic code. PLoS Comput Biol 2023; 19:e1011034. [PMID: 37068098 PMCID: PMC10138212 DOI: 10.1371/journal.pcbi.1011034] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/16/2022] [Revised: 04/27/2023] [Accepted: 03/16/2023] [Indexed: 04/18/2023] Open
Abstract
The genetic code refers to a rule that maps 64 codons to 20 amino acids. Nearly all organisms, with few exceptions, share the same genetic code, the standard genetic code (SGC). While it remains unclear why this universal code has arisen and been maintained during evolution, it may have been preserved under selection pressure. Theoretical studies comparing the SGC and numerically created hypothetical random genetic codes have suggested that the SGC has been subject to strong selection pressure for being robust against translation errors. However, these prior studies have searched for random genetic codes in only a small subspace of the possible code space due to limitations in computation time. Thus, how the genetic code has evolved, and the characteristics of the genetic code fitness landscape, remain unclear. By applying multicanonical Monte Carlo, an efficient rare-event sampling method, we efficiently sampled random codes from a much broader random ensemble of genetic codes than in previous studies, estimating that only one out of every 10^20 random codes is more robust than the SGC. This estimate is significantly smaller than the previous estimate, one in a million. We also characterized the fitness landscape of the genetic code that has four major fitness peaks, one of which includes the SGC. Furthermore, genetic algorithm analysis revealed that evolution under such a multi-peaked fitness landscape could be strongly biased toward a narrow peak, in an evolutionary path-dependent manner.
Affiliation(s)
- Yuji Omachi
  - Graduate School of Sciences, The University of Tokyo, Hongo, Tokyo, Japan
- Nen Saito
  - Graduate School of Integrated Sciences for Life, Hiroshima University, Higashi-Hiroshima City, Hiroshima, Japan
  - Exploratory Research Center on Life and Living Systems, National Institutes of Natural Sciences, Okazaki, Aichi, Japan
  - Universal Biology Institute, The University of Tokyo, Hongo, Tokyo, Japan
- Chikara Furusawa
  - Graduate School of Sciences, The University of Tokyo, Hongo, Tokyo, Japan
  - Universal Biology Institute, The University of Tokyo, Hongo, Tokyo, Japan
  - Center for Biosystems Dynamics Research, RIKEN, Suita, Osaka, Japan
4
Graf F, Zehentner B, Fellner L, Scherer S, Neuhaus K. Three Novel Antisense Overlapping Genes in E. coli O157:H7 EDL933. Microbiol Spectr 2023; 11:e0235122. [PMID: 36533921 PMCID: PMC9927249 DOI: 10.1128/spectrum.02351-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2022] [Accepted: 12/03/2022] [Indexed: 12/23/2022] Open
Abstract
The abundance of long overlapping genes in prokaryotic genomes is likely to be significantly underestimated. To date, only a few examples of such genes are fully established. Using RNA sequencing and ribosome profiling, we found expression of novel overlapping open reading frames in Escherichia coli O157:H7 EDL933 (EHEC). Indeed, the overlapping candidate genes are equipped with typical structural elements required for transcription and translation, i.e., promoters, transcription start sites, as well as terminators, all of which were experimentally verified. Translationally arrested mutants, unable to produce the overlapping encoded protein, were found to have a growth disadvantage when grown competitively against the wild type. Thus, the phenotypes found imply biological functionality of the genes at the level of proteins produced. The addition of 3 more examples of prokaryotic overlapping genes to the currently limited, yet constantly growing pool of such genes emphasizes the underestimated coding capacity of bacterial genomes.
IMPORTANCE: The abundance of long overlapping genes in prokaryotic genomes is likely to be significantly underestimated, since such genes are not allowed in genome annotations. However, ribosome profiling catches mRNA in the moment of being template for protein production. Using this technique and subsequent experiments, we verified 3 novel overlapping genes encoded in antisense of known genes. This adds more examples of prokaryotic overlapping genes to the currently limited, yet constantly growing pool of such genes.
Affiliation(s)
- Franziska Graf
  - Core Facility Microbiome, ZIEL – Institute for Food & Health, Technische Universität München, Freising, Germany
  - Chair for Microbial Ecology, TUM School of Life Sciences, Technische Universität München, Freising, Germany
- Barbara Zehentner
  - Chair for Microbial Ecology, TUM School of Life Sciences, Technische Universität München, Freising, Germany
- Lea Fellner
  - Chair for Microbial Ecology, TUM School of Life Sciences, Technische Universität München, Freising, Germany
- Siegfried Scherer
  - Core Facility Microbiome, ZIEL – Institute for Food & Health, Technische Universität München, Freising, Germany
  - Chair for Microbial Ecology, TUM School of Life Sciences, Technische Universität München, Freising, Germany
- Klaus Neuhaus
  - Core Facility Microbiome, ZIEL – Institute for Food & Health, Technische Universität München, Freising, Germany
  - Chair for Microbial Ecology, TUM School of Life Sciences, Technische Universität München, Freising, Germany
5
Romero Romero ML, Landerer C, Poehls J, Toth‐Petroczy A. Phenotypic mutations contribute to protein diversity and shape protein evolution. Protein Sci 2022; 31:e4397. [PMID: 36040266 PMCID: PMC9375231 DOI: 10.1002/pro.4397] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/14/2022] [Revised: 06/14/2022] [Accepted: 07/04/2022] [Indexed: 11/16/2022]
Abstract
Errors in DNA replication generate genetic mutations, while errors in transcription and translation lead to phenotypic mutations. Phenotypic mutations are orders of magnitude more frequent than genetic ones, yet they are less understood. Here, we review the types of phenotypic mutations, their quantifications, and their role in protein evolution and disease. The diversity generated by phenotypic mutation can facilitate adaptive evolution. Indeed, phenotypic mutations, such as ribosomal frameshift and stop codon readthrough, sometimes serve to regulate protein expression and function. Phenotypic mutations have often been linked to fitness decrease and diseases. Thus, understanding the protein heterogeneity and phenotypic diversity caused by phenotypic mutations will advance our understanding of protein evolution and have implications on human health and diseases.
Affiliation(s)
- Maria Luisa Romero Romero
  - Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
  - Center for Systems Biology Dresden, Dresden, Germany
- Cedric Landerer
  - Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
  - Center for Systems Biology Dresden, Dresden, Germany
- Jonas Poehls
  - Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
  - Center for Systems Biology Dresden, Dresden, Germany
- Agnes Toth‐Petroczy
  - Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
  - Center for Systems Biology Dresden, Dresden, Germany
  - Cluster of Excellence Physics of Life, TU Dresden, Dresden, Germany
6
Wang X, Dong Q, Chen G, Zhang J, Liu Y, Cai Y. Frameshift and wild-type proteins are often highly similar because the genetic code and genomes were optimized for frameshift tolerance. BMC Genomics 2022; 23:416. [PMID: 35655139 PMCID: PMC9164415 DOI: 10.1186/s12864-022-08435-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2021] [Accepted: 03/02/2022] [Indexed: 11/10/2022] Open
Abstract
Frameshift mutations have been considered of significant importance for the molecular evolution of proteins and their coding genes, while frameshift protein sequences encoded in the alternative reading frames of coding genes have been considered to be meaningless. However, functional frameshifts have been found widely existing. It was puzzling how a frameshift protein kept its structure and functionality while substantial changes occurred in its primary amino-acid sequence. This study shows that the similarities among frameshifts and wild types are higher than random similarities and are determined at different levels. Frameshift substitutions are more conservative than random substitutions in the standard genetic code (SGC). The frameshift substitutions score of SGC ranks in the top 2.0-3.5% of alternative genetic codes, showing that SGC is nearly optimal for frameshift tolerance. In many genes and certain genomes, frameshift-resistant codons and codon pairs appear more frequently than expected, suggesting that frameshift tolerance is achieved through not only the optimality of the genetic code but, more importantly, the further optimization of a specific gene or genome through the usages of codons/codon pairs, which sheds light on the role of frameshift mutations in molecular and genomic evolution.
7
Abstract
Modern genome-scale methods that identify new genes, such as proteogenomics and ribosome profiling, have revealed, to the surprise of many, that overlap in genes, open reading frames and even coding sequences is widespread and functionally integrated into prokaryotic, eukaryotic and viral genomes. In parallel, the constraints that overlapping regions place on genome sequences and their evolution can be harnessed in bioengineering to build more robust synthetic strains and constructs. With a focus on overlapping protein-coding and RNA-coding genes, this Review examines their discovery, topology and biogenesis in the context of their genome biology. We highlight exciting new uses for sequence overlap to control translation, compress synthetic genetic constructs, and protect against mutation.
8
Wichmann S, Scherer S, Ardern Z. Biological factors in the synthetic construction of overlapping genes. BMC Genomics 2021; 22:888. [PMID: 34895142 PMCID: PMC8665328 DOI: 10.1186/s12864-021-08181-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2020] [Accepted: 11/17/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Overlapping genes (OLGs) with long protein-coding overlapping sequences are disallowed by standard genome annotation programs, outside of viruses. Recently however they have been discovered in Archaea, diverse Bacteria, and Mammals. The biological factors underlying life's ability to create overlapping genes require more study, and may have important applications in understanding evolution and in biotechnology. A previous study claimed that protein domains from viruses were much better suited to forming overlaps than those from other cellular organisms - in this study we assessed this claim, in order to discover what might underlie taxonomic differences in the creation of gene overlaps.
RESULTS: After overlapping arbitrary Pfam domain pairs and evaluating them with Hidden Markov Models we find OLG construction to be much less constrained than expected. For instance, close to 10% of the constructed sequences cannot be distinguished from typical sequences in their protein family. Most are also indistinguishable from natural protein sequences regarding identity and secondary structure. Surprisingly, contrary to a previous study, virus domains were much less suitable for designing OLGs than bacterial or eukaryotic domains were. In general, the amount of amino acid change required to force a domain to overlap is approximately equal to the variation observed within a typical domain family. The resulting high similarity between natural sequences and those altered so as to overlap is mostly due to the combination of high redundancy in the genetic code and the evolutionary exchangeability of many amino acids.
CONCLUSIONS: Synthetic overlapping genes which closely resemble natural gene sequences, as measured by HMM profiles, are remarkably easy to construct, and most arbitrary domain pairs can be altered so as to overlap while retaining high similarity to the original sequences. Future work however will need to assess important factors not considered such as intragenic interactions which affect protein folding. While the analysis here is not sufficient to guarantee functional folding proteins, further analysis of constructed OLGs will improve our understanding of the origin of these remarkable genetic elements across life and opens up exciting possibilities for synthetic biology.
Affiliation(s)
- Stefan Wichmann
  - Chair of Microbial Ecology, Department of Molecular Life Sciences, Technical University of Munich, Freising, Germany
- Siegfried Scherer
  - Chair of Microbial Ecology, Department of Molecular Life Sciences, Technical University of Munich, Freising, Germany
- Zachary Ardern
  - Chair of Microbial Ecology, Department of Molecular Life Sciences, Technical University of Munich, Freising, Germany
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
9
Abstract
Selection for resource conservation can shape the coding sequences of organisms living in nutrient-limited environments. Recently, it was proposed that selection for resource conservation, specifically for nitrogen and carbon content, has also shaped the structure of the standard genetic code, such that the missense mutations the code allows tend to cause small increases in the number of nitrogen and carbon atoms in amino acids. Moreover, it was proposed that this optimization is not confounded by known optimizations of the standard genetic code, such as for polar requirement or hydropathy. We challenge these claims. We show the proposed optimization for nitrogen conservation is highly sensitive to choice of null model and the proposed optimization for carbon conservation is confounded by the known conservative nature of the standard genetic code with respect to the molecular volume of amino acids. There is therefore little evidence the standard genetic code is optimized for resource conservation. We discuss our findings in the context of null models of the standard genetic code.
Affiliation(s)
- Hana Rozhoňová
  - Institute of Integrative Biology, ETH Zürich, Zürich, Switzerland
  - Swiss Institute of Bioinformatics, Quartier UNIL-Sorge, Lausanne, Switzerland
- Joshua L Payne
  - Institute of Integrative Biology, ETH Zürich, Zürich, Switzerland
  - Swiss Institute of Bioinformatics, Quartier UNIL-Sorge, Lausanne, Switzerland
10
Štambuk N, Konjevoda P, Pavan J. Antisense Peptide Technology for Diagnostic Tests and Bioengineering Research. Int J Mol Sci 2021; 22:9106. [PMID: 34502016 PMCID: PMC8431130 DOI: 10.3390/ijms22179106] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/22/2021] [Revised: 08/10/2021] [Accepted: 08/13/2021] [Indexed: 01/01/2023] Open
Abstract
Antisense peptide technology (APT) is based on a useful heuristic algorithm for rational peptide design. It was deduced from empirical observations that peptides consisting of complementary (sense and antisense) amino acids interact with higher probability and affinity than the randomly selected ones. This phenomenon is closely related to the structure of the standard genetic code table, and at the same time, is unrelated to the direction of its codon sequence translation. The concept of complementary peptide interaction is discussed, and its possible applications to diagnostic tests and bioengineering research are summarized. Problems and difficulties that may arise using APT are discussed, and possible solutions are proposed. The methodology was tested on the example of SARS-CoV-2. It is shown that the CABS-dock server accurately predicts the binding of antisense peptides to the SARS-CoV-2 receptor binding domain without requiring predefinition of the binding site. It is concluded that the benefits of APT outweigh the costs of random peptide screening and could lead to considerable savings in time and resources, especially if combined with other computational and immunochemical methods.
Affiliation(s)
- Nikola Štambuk
  - Center for Nuclear Magnetic Resonance, Ruđer Bošković Institute, Bijenička cesta 54, HR-10000 Zagreb, Croatia
- Paško Konjevoda
  - Laboratory for Epigenomics, Division of Molecular Medicine, Ruđer Bošković Institute, Bijenička cesta 54, HR-10000 Zagreb, Croatia
- Josip Pavan
  - Department of Ophthalmology, University Hospital Dubrava, Avenija Gojka Šuška 6, HR-10000 Zagreb, Croatia
11
Paredes O, Morales JA, Mendizabal AP, Romo-Vázquez R. Metacode: One code to rule them all. Biosystems 2021; 208:104486. [PMID: 34274462 DOI: 10.1016/j.biosystems.2021.104486] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 05/18/2021] [Revised: 07/07/2021] [Accepted: 07/09/2021] [Indexed: 12/13/2022]
Abstract
The code of codes or metacode is a microcosm where biological layers, as well as their codes, interact together allowing the continuity of information flow in organisms by increasing biological entities' complexity. Through this novel organic code, biological systems scale towards niches with higher informatic freedom building structures that increase the entropy in the universe. Code biology has developed a novel informational framework where biological entities strive themselves through the information flow carried out through organic codes consisting of two molecular or functional landscapes intertwined through arbitrary linkages via an adaptor whose nature is autonomous from molecular determinism. Here we will integrate genomic and epigenomic codes according to the evidence released in ENCODE (phase 3), psychENCODE and GTEx project, outlining the principles of the metacode, to address the continuous nature of biological systems and their inter-layered information flow. This novel complex metacode maps from very constrained sets of elements (i.e., regulation sites modulating gene expression) to new ones with greater freedom of decoding (i.e., a continuous cell phenotypic space). This leads to a new domain in code biology where biological systems are informatic attractors that navigate an energy metaspace through a complexity-noise balance, stalling in emergent niches where organic codes take meaning.
Affiliation(s)
- Omar Paredes
  - Computer Sciences Department, CUCEI, Universidad de Guadalajara, Mexico
- Adriana P Mendizabal
  - Molecular Biology Laboratory, Farmacobiology Department, CUCEI, Universidad de Guadalajara, Mexico
12
Thorvaldsen S, Hössjer O. Using statistical methods to model the fine-tuning of molecular machines and systems. J Theor Biol 2020; 501:110352. [PMID: 32505827 DOI: 10.1016/j.jtbi.2020.110352] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/06/2019] [Revised: 05/26/2020] [Accepted: 05/27/2020] [Indexed: 10/24/2022]
Abstract
Fine-tuning has received much attention in physics, and it states that the fundamental constants of physics are finely tuned to precise values for a rich chemistry and life permittance. It has not yet been applied in a broad manner to molecular biology. However, in this paper we argue that biological systems present fine-tuning at different levels, e.g. functional proteins, complex biochemical machines in living cells, and cellular networks. This paper describes molecular fine-tuning, how it can be used in biology, and how it challenges conventional Darwinian thinking. We also discuss the statistical methods underpinning fine-tuning and present a framework for such analysis.
Affiliation(s)
- Ola Hössjer
  - Stockholm University, Dep. of Mathematics, Division of Mathematical Statistics, Sweden
13
Ardern Z, Neuhaus K, Scherer S. Are Antisense Proteins in Prokaryotes Functional? Front Mol Biosci 2020; 7:187. [PMID: 32923454 PMCID: PMC7457138 DOI: 10.3389/fmolb.2020.00187] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 02/20/2020] [Accepted: 07/16/2020] [Indexed: 12/16/2022] Open
Abstract
Many prokaryotic RNAs are transcribed from loci outside of annotated protein coding genes. Across bacterial species hundreds of short open reading frames antisense to annotated genes show evidence of both transcription and translation, for instance in ribosome profiling data. Determining the functional fraction of these protein products awaits further research, including insights from studies of molecular interactions and detailed evolutionary analysis. There are multiple lines of evidence, however, that many of these newly discovered proteins are of use to the organism. Condition-specific phenotypes have been characterized for a few. These proteins should be added to genome annotations, and the methods for predicting them standardized. Evolutionary analysis of these typically young sequences also may provide important insights into gene evolution. This research should be prioritized for its exciting potential to uncover large numbers of novel proteins with extremely diverse potential practical uses, including applications in synthetic biology and responding to pathogens.
Affiliation(s)
- Zachary Ardern
  - Chair for Microbial Ecology, Technical University of Munich, Munich, Germany
14
Chan KF, Koukouravas S, Yeo JY, Koh DWS, Gan SKE. Probability of change in life: Amino acid changes in single nucleotide substitutions. Biosystems 2020; 193-194:104135. [PMID: 32259562 DOI: 10.1016/j.biosystems.2020.104135] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 02/27/2020] [Revised: 03/24/2020] [Accepted: 03/27/2020] [Indexed: 12/31/2022]
Abstract
Mutations underpin the processes in life, be they beneficial or detrimental. While mutations are assumed to be random in the absence of selection pressures, the genetic code has underlying computable probabilities in amino acid phenotypic changes. With a wide range of implications including drug resistance, understanding amino acid changes is important. In this study, we calculated the probabilities of substitution mutations in the genetic code leading to the 20 amino acids and stop codons. Our calculations reveal an enigmatic in-built self-preserving organization of the genetic code that averts disruptive changes at the physicochemical properties level. These changes include changes to start, aromatic, negatively charged amino acids and stop codons. Our findings thus reveal a statistical mechanism governing the relationship between amino acids and the universal genetic code.
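To make the kind of calculation mentioned in the abstract above more concrete, here is a minimal Python sketch that enumerates every single-nucleotide substitution of the 61 sense codons in the standard genetic code and classifies the outcome. It is not the cited authors' method: the function name `classify_substitutions` is hypothetical, and the sketch assumes that every substitution is equally likely, which ignores real mutational biases such as transition/transversion differences.

```python
from collections import Counter
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def classify_substitutions() -> Counter:
    """Count single-nucleotide substitutions of sense codons by outcome."""
    counts = Counter()
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # only substitutions starting from sense codons are counted
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue  # not a substitution
                mutant = codon[:pos] + base + codon[pos + 1:]
                new_aa = CODON_TABLE[mutant]
                if new_aa == aa:
                    counts["synonymous"] += 1
                elif new_aa == "*":
                    counts["nonsense"] += 1
                else:
                    counts["missense"] += 1
    return counts


if __name__ == "__main__":
    counts = classify_substitutions()
    total = sum(counts.values())
    for outcome, n in sorted(counts.items()):
        print(f"{outcome}: {n}/{total} = {n / total:.3f}")
```

Under the uniform assumption, the 61 sense codons give 549 possible substitutions: 134 synonymous, 392 missense and 23 nonsense. Weighting each substitution by an empirical mutation rate instead of counting it once would be the natural refinement.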
Affiliation(s)
- Kwok-Fong Chan
  - Antibody & Product Development Lab, BII, A*STAR, 138671, Singapore
- Joshua Yi Yeo
  - Antibody & Product Development Lab, BII, A*STAR, 138671, Singapore
- Samuel Ken-En Gan
  - Antibody & Product Development Lab, BII, A*STAR, 138671, Singapore; P53 Laboratory, A*STAR, Singapore; Experimental Drug Development Centre, A*STAR, Singapore