1
Ikehara K. Why Were [GADV]-amino Acids and GNC Codons Selected and How Was GNC Primeval Genetic Code Established? Genes (Basel) 2023; 14:genes14020375. [PMID: 36833302 PMCID: PMC9957433 DOI: 10.3390/genes14020375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2022] [Revised: 01/25/2023] [Accepted: 01/28/2023] [Indexed: 02/05/2023] Open
Abstract
The correspondence between codons and amino acids is determined by the genetic code. The genetic code therefore holds a key to the life system composed of genes and proteins. According to the GNC-SNS primitive genetic code hypothesis, which I have proposed, the genetic code originated from the GNC code. In this article, I first discuss, from the standpoint of primeval protein synthesis, why the four [GADV]-amino acids were selected and used in the first GNC code. Next, I explain, from the standpoint of the most primitive anticodon stem-loop tRNAs (AntiC-SL tRNAs), how the four GNCs were selected as the first codons. In the last section, I explain my idea of how the correspondence between the four [GADV]-amino acids and the four GNC codons was established. In short, the origin and evolution of the genetic code are discussed comprehensively from the aspects of [GADV]-proteins, [GADV]-amino acids, GNC codons, and AntiC-SL tRNAs, which are all related to one another in the origin of the genetic code, integrating the GNC code frozen-accident theory, the coevolution theory, and the adaptive theory of the origin of the genetic code.
Affiliation(s)
- Kenji Ikehara
- G&L Kyosei Institute, The Keihanna Academy of Science and Culture (KASC), Keihanna Interaction Plaza, Lab. Wing 3F, 1-7 Hikaridai, Seika-cho, Souraku, Kyoto 619-0237, Japan
- International Institute for Advanced Studies, Kizugawadai 9-3, Kizugawa, Kyoto 619-0225, Japan
2
Ikehara K. How Did Life Emerge in Chemically Complex Messy Environments? Life (Basel) 2022; 12:life12091319. [PMID: 36143356 PMCID: PMC9503616 DOI: 10.3390/life12091319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2022] [Revised: 08/18/2022] [Accepted: 08/25/2022] [Indexed: 11/18/2022] Open
Abstract
One of the problems that make it difficult to solve the mystery of the origin of life is determining how life emerged in chemically complex messy environments on primitive Earth. In this article, the “chemically complex messy environments” that are focused on are a mixed state of various organic compounds produced via prebiotic means and accumulated on primitive earth. The five factors described below are thought to have contributed to opening the way for the emergence of life: (1) A characteristic inherent in [GADV]-amino acids, which are easily produced via prebiotic means. [GADV] stands for four amino acids, Gly [G], Ala [A], Asp [D] and Val [V], which are indicated by a one-letter symbol. (2) The protein 0th-order structure or a [GADV]-amino acid composition generating water-soluble globular protein with some flexibility, which can be produced even by the random joining of [GADV]-amino acids. (3) The formation of versatile [GADV]-microspheres, which can grow, divide and proliferate even without a genetic system, was the emergence of proto-life. (4) The [GADV]-microspheres with a higher proliferation ability than others were able to be selected. Proto-Darwin evolution made it possible to proceed forward to the creation of a core life system composed of the (GNC)n gene, anticodon stem-loop tRNA or AntiC-SL tRNA (GNC genetic code), and [GADV]-protein. (5) Eventually, the first genuine life with a core life system emerged. Thus, the formation processes of [GADV]-protein and the (GNC)n gene in chemically complex messy environments were the steps to the emergence of genuine life.
Affiliation(s)
- Kenji Ikehara
- G&L Kyosei Institute, The Keihanna Academy of Science and Culture (KASC), Keihanna Interaction Plaza, Lab. Wing 3F, 1-7 Hikaridai, Seika-cho, Souraku, Kyoto 619-0237, Japan; Tel.: +81-774-73-4478
- International Institute for Advanced Studies, Kizugawadai 9-3, Kizugawa, Kyoto 619-0225, Japan
3
Ikehara K. Evolutionary Steps in the Emergence of Life Deduced from the Bottom-Up Approach and GADV Hypothesis (Top-Down Approach). Life (Basel) 2016; 6:life6010006. [PMID: 26821048 PMCID: PMC4810237 DOI: 10.3390/life6010006] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 11/01/2015] [Revised: 12/30/2015] [Accepted: 01/18/2016] [Indexed: 02/05/2023] Open
Abstract
It is no doubt quite difficult to solve the riddle of the origin of life. So, firstly, I would like to point out the kinds of obstacles there are in solving this riddle and how we should tackle these difficult problems, reviewing the studies that have been conducted so far. After that, I will propose that the consecutive evolutionary steps in a timeline can be rationally deduced by using a common event as a juncture, which is obtained by two counter-directional approaches: one is the bottom-up approach through which many researchers have studied the origin of life, and the other is the top-down approach, through which I established the [GADV]-protein world hypothesis or GADV hypothesis on the origin of life starting from a study on the formation of entirely new genes in extant microorganisms. Last, I will describe the probable evolutionary process from the formation of Earth to the emergence of life, which was deduced by using a common event (the establishment of the first genetic code encoding [GADV]-amino acids) as a juncture for the results obtained from the two approaches.
Affiliation(s)
- Kenji Ikehara
- G & L Kyosei Institute, Keihannna Labo-401, Hikaridai 1-7, Seika-cho, Sorakugun, Kyoto 619-0237, Japan.
- International Institute for Advanced Studies of Japan, Kizugawadai 9-3, Kizugawa, Kyoto 619-0225, Japan.
4
Reichenberger ER, Rosen G, Hershberg U, Hershberg R. Prokaryotic nucleotide composition is shaped by both phylogeny and the environment. Genome Biol Evol 2015; 7:1380-9. [PMID: 25861819 PMCID: PMC4453058 DOI: 10.1093/gbe/evv063] [Citation(s) in RCA: 57] [Impact Index Per Article: 6.3] [Accepted: 04/06/2015] [Indexed: 02/07/2023] Open
Abstract
The causes of the great variation in nucleotide composition of prokaryotic genomes have long been disputed. Here, we use extensive metagenomic and whole-genome data to demonstrate that both phylogeny and the environment shape prokaryotic nucleotide content. We show that across environments, various phyla are characterized by different mean guanine and cytosine (GC) values as well as by the extent of variation around that mean value. At the same time, we show that GC content varies greatly as a function of environment, in a manner that cannot be entirely explained by disparities in phylogenetic composition. We find environmentally driven differences in nucleotide content not only between highly diverged environments (e.g., soil vs. aquatic vs. human gut) but also within a single type of environment. More specifically, we demonstrate that some human guts are associated with a microbiome that is consistently more GC-rich across phyla, whereas others are associated with a more AT-rich microbiome. These differences appear to be driven both by variations in phylogenetic composition and by environmental differences, which are independent of these phylogenetic composition differences. Combined, our results demonstrate that both phylogeny and the environment significantly affect nucleotide composition and that the environmental differences affecting nucleotide composition are far subtler than previously appreciated.
Affiliation(s)
- Erin R Reichenberger
- Department of Biomedical Engineering, Science & Health Systems, Drexel University
- Gail Rosen
- Department of Computer and Electrical Engineering, Drexel University
- Uri Hershberg
- Department of Biomedical Engineering, Science & Health Systems, Drexel University; Department of Microbiology and Immunology, Drexel University College of Medicine
- Ruth Hershberg
- Rachel and Menachem Mendelovitch Evolutionary Processes of Mutation and Natural Selection Research Laboratory, Department of Genetics and Developmental Biology, The Ruth and Bruce Rappaport Faculty of Medicine, Technion-Israel Institute of Technology, Haifa, Israel
5
Naamati G, Fromer M, Linial M. Expansion of tandem repeats in sea anemone Nematostella vectensis proteome: A source for gene novelty? BMC Genomics 2009; 10:593. [PMID: 20003297 PMCID: PMC2805694 DOI: 10.1186/1471-2164-10-593] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 08/15/2009] [Accepted: 12/10/2009] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The complete proteome of the starlet sea anemone, Nematostella vectensis, provides insights into gene invention dating back to the Cnidarian-Bilaterian ancestor. With the addition of the complete proteomes of Hydra magnipapillata and Monosiga brevicollis, the investigation of proteins having unique features in early metazoan life has become practical. We focused on the properties and the evolutionary trends of tandem repeat (TR) sequences in Cnidaria proteomes. RESULTS: We found that 11-16% of N. vectensis proteins contain tandem repeats. Most TRs cover 150 amino acid segments that are comprised of basic units of 5-20 amino acids. In total, the N. vectensis proteome has about 3300 unique TR-units, but only a small fraction of them are shared with H. magnipapillata, M. brevicollis, or mammalian proteomes. The overall abundance of these TRs stands out relative to that of 14 proteomes representing the diversity among eukaryotes and within the metazoan world. TR-units are characterized by a unique composition of amino acids, with cysteine and histidine being over-represented. Structurally, most TR-segments are associated with coiled and disordered regions. Interestingly, 80% of the TR-segments can be read in more than one open reading frame. For over 100 of them, translation of the alternative frames would result in long proteins. Most domain families that are characterized as repeats in eukaryotes are found in the TR-proteomes from Nematostella and Hydra. CONCLUSIONS: While most TR-proteins have originated from prediction tools and are still awaiting experimental validations, supportive evidence exists for hundreds of TR-units in Nematostella. The existence of TR-proteins in early metazoan life may have served as a robust mode for novel genes with previously overlooked structural and functional characteristics.
6
Pseudo-replication of [GADV]-proteins and origin of life. Int J Mol Sci 2009; 10:1525-1537. [PMID: 19468323 PMCID: PMC2680631 DOI: 10.3390/ijms10041525] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.3] [Received: 01/16/2009] [Revised: 03/30/2009] [Accepted: 04/01/2009] [Indexed: 11/16/2022] Open
Abstract
The RNA world hypothesis on the origin of life is generally considered as the key to solve the “chicken and egg dilemma” concerning the evolution of genes and proteins as observed in the modern organisms. This hypothesis, however, contains several serious weak points. We have a counterproposal called [GADV]-protein world hypothesis, abbreviated as GADV hypothesis, in which we have suggested that life originated from a [GADV]-protein world, which comprised proteins composed of four amino acids: Gly [G], Ala [A], Asp [D], and Val [V]. A new concept “pseudo-replication” is crucial for the description of the emergence of life. The new hypothesis not only plausibly explains how life originated from the initial chaotic protein world, but also how genes, genetic code, and proteins co-evolved.
7
Hsieh SY, Cheng CS. Finding a maximum-density path in a tree under the weight and length constraints. Inform Process Lett 2008. [DOI: 10.1016/j.ipl.2007.08.031] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/28/2022]
8
Guo FB. The distribution patterns of bases of protein-coding genes, non-coding ORFs, and intergenic sequences in Pseudomonas aeruginosa PA01 genome and its implications. J Biomol Struct Dyn 2008; 25:127-33. [PMID: 17718591 DOI: 10.1080/07391102.2007.10507161] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/28/2022]
Abstract
The distribution patterns of bases of DNA fragments in different regions of the P. aeruginosa genome are analyzed in this paper. It is shown that 5565 protein-coding genes, 17315 non-coding ORFs, and 1104 intergenic sequences fall into seven clusters based on their base frequencies. Almost all the protein-coding genes are contained in one of the seven clusters. The significant difference in base frequencies among the three codon positions in a high-GC genome, which gives rise to the division between the base distribution patterns of the six reading frames of protein-coding genes, is responsible for the appearance of the clustering phenomenon. In the light of the clustering phenomenon, the author supposes that the antisense strand ORFs, particularly those corresponding to Frame 2' and Frame 3', may not code for proteins in the P. aeruginosa genome.
Affiliation(s)
- F-B Guo
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, China.
9
Oba T, Fukushima J, Maruyama M, Iwamoto R, Ikehara K. Catalytic activities of [GADV]-peptides. Formation and establishment of [GADV]-protein world for the emergence of life. Orig Life Evol Biosph 2005; 35:447-60. [PMID: 16231208 DOI: 10.1007/s11084-005-3519-5] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.6] [Received: 10/19/2004] [Accepted: 02/08/2005] [Indexed: 11/27/2022]
Abstract
We have previously postulated a novel hypothesis for the origin of life, assuming that life on the earth originated from the "[GADV]-protein world", not from the "RNA world" (see Ikehara's review, 2002). The [GADV]-protein world is constituted from peptides and proteins with random sequences of four amino acids (glycine [G], alanine [A], aspartic acid [D] and valine [V]), which accumulated by pseudo-replication of the [GADV]-proteins. To obtain evidence for the hypothesis, we produced [GADV]-peptides by repeated heat-drying of the amino acids for 30 cycles ([GADV]-P(30)) and examined whether the peptides have some catalytic activities or not. From the results, it was found that the [GADV]-P(30) can hydrolyze several kinds of chemical bonds in molecules, such as umbelliferyl-beta-D-galactoside, glycine-p-nitroanilide and bovine serum albumin. This suggests that [GADV]-P(30) could play an important role in the accumulation of [GADV]-proteins through pseudo-replication, leading to the emergence of life. We further show that [GADV]-octapeptides with random sequences, but containing no cyclic compounds such as diketopiperazines, have catalytic activity, hydrolyzing peptide bonds in a natural protein, bovine serum albumin. The catalytic activity of the octapeptides was much higher than that of the [GADV]-P(30) produced through repeated heat-drying treatments. These results also support the [GADV]-protein-world hypothesis of the origin of life (see Ikehara's review, 2002). Possible steps for the emergence of life on the primitive earth are presented.
Affiliation(s)
- Takae Oba
- Department of Chemistry, Faculty of Science, Nara Women's University, Kita-uoya-nishi-machi, Nara 630-8506, Japan
10
Abstract
Based on the fact that RNA has not only a genetic function but also a catalytic function, the RNA world theory on the origin of life was first proposed about 20 years ago. The theory assumes that RNA was amplified by self-replication to increase RNA diversity on the primitive earth. Since then, the theory has been widely accepted as the most likely explanation for the emergence of life. In contrast, we reached another hypothesis, the [GADV]-protein world hypothesis, which is based on pseudo-replication of [GADV]-proteins. We reached this hypothesis during studies on the origins of genes and the genetic code, where [G], [A], [D], and [V] refer to Gly, Ala, Asp, and Val, respectively. In this review, possible steps to the emergence of life are discussed from the standpoint of the [GADV]-protein world hypothesis, comparing it in parallel with the RNA world theory. It is also shown that [GADV]-peptides, which were produced by repeated dry-heating cycles and by solid phase peptide synthesis, have catalytic activities, hydrolyzing peptide bonds in a natural protein, bovine serum albumin. These experimental results support the [GADV]-protein world hypothesis for the origin of life.
Affiliation(s)
- Kenji Ikehara
- Department of Chemistry, Faculty of Science, Nara Women's University, Kita-uoya-nishi-machi, Nara, Nara 630-8506, Japan.
11
12
Abstract
It is known that different codons may be unified into larger groups related to the hierarchical structure, approximate hidden symmetries, and evolutionary origin of the universal genetic code. Using a simplified, evolutionarily motivated two-letter version of the genetic code, the general principles of the most stable coding are discussed. By complete enumeration in such a reduced code it is strictly proved that maximum stability with respect to point mutations and shifts in the reading frame requires the fixation of the middle letters within codons in groups with different physico-chemical properties, thus explaining a key feature of the universal genetic code. The translational stability of the genetic code is studied by mapping the code onto a de Bruijn graph, providing both a compact visual representation of the mutual relationships between different codons, as well as between codons and protein-coding DNA sequence, and a powerful tool for investigating the stability of protein coding. Then, the results are extended to four-letter codes. As is shown, the universal genetic code mainly obeys the principles of optimal coding. These results demonstrate the hierarchical character of the optimization of the universal genetic code, with strictly optimal coding having evolved at the earliest stages of molecular evolution. Finally, the universal genetic code is compared with the other natural variants of genetic codes.
Affiliation(s)
- V R Chechetkin
- Theoretical Department of Division for Perspective Investigations, Troitsk Institute of Innovation and Thermonuclear Investigations (TRINITI), 142190 Moscow Region, Russia.
13
Ikehara K. Origins of gene, genetic code, protein and life: comprehensive view of life systems from a GNC-SNS primitive genetic code hypothesis. J Biosci 2002; 27:165-86. [PMID: 11937687 DOI: 10.1007/bf02703773] [Citation(s) in RCA: 51] [Impact Index Per Article: 2.3] [Indexed: 11/30/2022]
Abstract
We have investigated the origin of genes, the genetic code, proteins and life using six indices (hydropathy, alpha-helix, beta-sheet and beta-turn formabilities, acidic amino acid content and basic amino acid content) necessary for appropriate three-dimensional structure formation of globular proteins. From the analysis of microbial genes, we have concluded that newly-born genes are products of nonstop frames (NSF) on antisense strands of microbial GC-rich genes [GC-NSF(a)] and of SNS repeating sequences [(SNS)n] similar to the GC-NSF(a) (S means G or C, and N means any of the four bases). We have also proposed that the universal genetic code used by most organisms on the earth at present could be derived from a GNC-SNS primitive genetic code. We have further presented the [GADV]-protein world hypothesis of the origin of life as well as a hypothesis of protein production, suggesting that proteins were originally produced by random peptide formation of amino acids restricted to specific amino acid compositions termed GNC-, SNS- and GC-NSF(a)-0th order structures of proteins. The [GADV]-protein world hypothesis is primarily derived from the GNC-primitive genetic code hypothesis. It is also expected that basic properties of extant genes and proteins could be revealed by considerations based on the scenario with four stages.
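The GNC and SNS codon patterns named in this abstract can be made concrete with a short sketch. The Python below is only an illustration and is not taken from the paper: it expands each pattern into its codons and looks them up in the standard genetic code (RNA alphabet). The names `expand` and `encoded_amino_acids` are hypothetical, and the lookup table is restricted to the sixteen SNS codons.

```python
from itertools import product

# Standard genetic code, restricted to the 16 codons of the form SNS
# (S = G or C, N = any base). One-letter amino acid symbols.
STANDARD_CODE_SNS = {
    "GGC": "G", "GGG": "G", "GCC": "A", "GCG": "A",
    "GAC": "D", "GAG": "E", "GUC": "V", "GUG": "V",
    "CGC": "R", "CGG": "R", "CCC": "P", "CCG": "P",
    "CAC": "H", "CAG": "Q", "CUC": "L", "CUG": "L",
}

# Only the pattern symbols needed for the GNC and SNS patterns.
PATTERN_SYMBOLS = {"G": "G", "C": "C", "S": "GC", "N": "ACGU"}


def expand(pattern):
    """Return every codon matching a three-letter pattern such as 'GNC'."""
    choices = (PATTERN_SYMBOLS[symbol] for symbol in pattern)
    return ["".join(codon) for codon in product(*choices)]


def encoded_amino_acids(pattern):
    """Return the sorted set of amino acids encoded by the pattern's codons."""
    return sorted({STANDARD_CODE_SNS[codon] for codon in expand(pattern)})


if __name__ == "__main__":
    print(expand("GNC"))               # the four GNC codons
    print(encoded_amino_acids("GNC"))  # the four [GADV]-amino acids
    print(encoded_amino_acids("SNS"))  # amino acids covered by SNS codons
```

Under the standard code, the four GNC codons correspond to Gly, Ala, Asp and Val, and the sixteen SNS codons correspond to ten amino acids.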
Affiliation(s)
- K Ikehara
- Department of Chemistry, Faculty of Science, Nara Women's University, Kita-uoya-nishi-machi, Nara, Nara 630-8506, Japan.
14
Abstract
An increasingly comprehensive assessment is being developed of the extent and potential significance of lateral gene transfer among microbial genomes. Genomic sequences can be identified as being of putatively lateral origin by their unexpected phyletic distribution, atypical sequence composition, differential presence or absence in closely related genomes, or incongruent phylogenetic trees. These complementary approaches sometimes yield inconsistent results. Not only more data but also quantitative models and simulations are needed urgently.
Affiliation(s)
- M A Ragan
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland 4072, Australia.
15
Glansdorff N. About the last common ancestor, the universal life-tree and lateral gene transfer: a reappraisal. Mol Microbiol 2000; 38:177-85. [PMID: 11069646 DOI: 10.1046/j.1365-2958.2000.02126.x] [Citation(s) in RCA: 95] [Impact Index Per Article: 4.0] [Indexed: 11/20/2022]
Abstract
An organismal tree rooted in the bacterial branch and derived from a hyperthermophilic last common ancestor (LCA) is still widely assumed to represent the path followed by evolution from the most primeval cells to the three domains recognized among contemporary organisms: Bacteria, Archaea and Eucarya. In the past few years, however, more and more discrepancies between this pattern and individual protein trees have been brought to light. There has been an overall tendency to attribute these incongruities to widespread lateral gene transfer. However, recent developments, a reappraisal of earlier evidence and considerations of our own lead us to a quite different view. It would appear (i) that the role of lateral gene transfer was overemphasized in recent discussions of molecular phylogenies; (ii) that the LCA was probably a non-thermophilic protoeukaryote from which both Archaea and Bacteria emerged by reductive evolution but not as sister groups, in keeping with a current evolutionary scheme for the biosynthesis of membrane lipids; and (iii) that thermophilic Archaea may have been the first branch to diverge from the ancestral line.
Affiliation(s)
- N Glansdorff
- Microbiology, Free University of Brussels (VUB), Flanders Interuniversity Institute and J.-M. Wiame Microbiological Research Institute, Brussels B-1070, Belgium.
16
Wilquet V, Van de Casteele M. The role of the codon first letter in the relationship between genomic GC content and protein amino acid composition. Res Microbiol 1999; 150:21-32. [PMID: 10096131 DOI: 10.1016/s0923-2508(99)80043-6] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.0] [Indexed: 11/21/2022]
Abstract
Analysis of the statistical distribution of amino acid compositions within 22 protein families shows that a GC bias generally affects proteins with a variety of functions from the extreme thermophile Thermus. This results in evident enrichment in amino acids of the group L, V, A, P, R and G and underrepresentation of amino acids of the group I, M, F, S, T, C and W. The strong amino acid composition biases noted in Thermus proteins are not related to thermoadaptation; they were also found in mesophilic homologues encoded by GC-rich genes. The results of a comparative analysis on large samples of translated sequences from 30 organisms, representing the three major kingdoms of life and including extremophiles, indicate a universal correlation between the usage of particular amino acids and the genomic GC content. It is concluded that the codon first letter plays a dominant role in translating the genomic GC signature into protein amino acid composition and sequences.
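One simple way to look at how a GC signature is distributed over the codon letters is to measure GC content separately at each of the three codon positions. The sketch below is an illustration under stated assumptions, not the authors' analysis: the function name `gc_by_codon_position` is hypothetical, and the input is assumed to be an in-frame coding sequence that starts at a first codon position.

```python
def gc_by_codon_position(cds):
    """Return the GC fraction at codon positions 1, 2 and 3.

    Assumes `cds` is an in-frame coding sequence (DNA or RNA letters);
    any trailing bases that do not fill a whole codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop an incomplete final codon
    n_codons = usable // 3
    if n_codons == 0:
        raise ValueError("sequence must contain at least one full codon")

    counts = [0, 0, 0]
    for index in range(usable):
        if cds[index] in "GC":
            counts[index % 3] += 1  # index % 3 is the codon position (0-based)

    return tuple(count / n_codons for count in counts)


if __name__ == "__main__":
    # Four codons: GGC GCC GAC GTC
    print(gc_by_codon_position("GGCGCCGACGTC"))
```

For the example sequence, every first and third letter is G or C, while only two of the four second letters are, so the three fractions are 1.0, 0.5 and 1.0.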
Affiliation(s)
- V Wilquet
- Laboratoire de Microbiologie, Université Libre de Bruxelles (ULB), Belgium