Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Tsonis AA, Elsner JB, Tsonis PA. Periodicity in DNA coding sequences: implications in gene evolution. J Theor Biol 1991;151:323-31. [PMID: 1943144 DOI: 10.1016/s0022-5193(05)80381-9] [Citation(s) in RCA: 69] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/29/2022]



Cited by Other Article(s)

Arruda M, da Silva A, de Assis F. An Adaptive Mapping Method Using Spectral Envelope Approach for DNA Spectral Analysis. ENTROPY 2022;24:e24070978. [PMID: 35885202 PMCID: PMC9323741 DOI: 10.3390/e24070978] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 06/15/2022] [Revised: 07/07/2022] [Accepted: 07/12/2022] [Indexed: 11/16/2022]

SAVMD: An adaptive signal processing method for identifying protein coding regions. Biomed Signal Process Control 2021. [DOI: 10.1016/j.bspc.2021.102998] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]

Tsonis AA, Wang G, Zhang L, Lu W, Kayafas A, Del Rio-Tsonis K. An application of slow feature analysis to the genetic sequences of coronaviruses and influenza viruses. Hum Genomics 2021;15:26. [PMID: 33962680 PMCID: PMC8103670 DOI: 10.1186/s40246-021-00327-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2021] [Accepted: 04/19/2021] [Indexed: 12/03/2022] Open

Yin C. Latent periodicity-2 in coronavirus SARS-CoV-2 genome: Evolutionary implications. J Theor Biol 2021;515:110604. [PMID: 33508323 PMCID: PMC7835100 DOI: 10.1016/j.jtbi.2021.110604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/28/2020] [Revised: 01/02/2021] [Accepted: 01/21/2021] [Indexed: 11/25/2022]

Zheng Q, Chen T, Zhou W, Xie L, Su H. Gene prediction by the noise-assisted MEMD and wavelet transform for identifying the protein coding regions. Biocybern Biomed Eng 2021. [DOI: 10.1016/j.bbe.2020.12.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]

Han S, Liang Y, Ma Q, Xu Y, Zhang Y, Du W, Wang C, Li Y. LncFinder: an integrated platform for long non-coding RNA identification utilizing sequence intrinsic composition, structural information and physicochemical property. Brief Bioinform 2020;20:2009-2027. [PMID: 30084867 PMCID: PMC6954391 DOI: 10.1093/bib/bby065] [Citation(s) in RCA: 82] [Impact Index Per Article: 20.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2018] [Revised: 06/20/2018] [Indexed: 12/31/2022] Open

Raman Kumar M, Vaegae NK. A new numerical approach for DNA representation using modified Gabor wavelet transform for the identification of protein coding regions. Biocybern Biomed Eng 2020. [DOI: 10.1016/j.bbe.2020.03.007] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]

Michel CJ, Thompson JD. Identification of a circular code periodicity in the bacterial ribosome: origin of codon periodicity in genes? RNA Biol 2020;17:571-583. [PMID: 31960748 PMCID: PMC8647727 DOI: 10.1080/15476286.2020.1719311] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2019] [Revised: 01/10/2020] [Accepted: 01/14/2020] [Indexed: 02/09/2023] Open

Abstract

Three-base periodicity (TBP), where nucleotides and higher order n-tuples are preferentially spaced by 3, 6, 9, etc. bases, is a well-known intrinsic property of protein-coding DNA sequences. However, its origins are still not fully understood. One hypothesis is that the periodicity reflects a primordial coding system that was used before the emergence of the modern standard genetic code (SGC). Recent evidence suggests that the X circular code, a set of 20 trinucleotides allowing the reading frames in genes to be retrieved locally, represents a possible ancestor of the SGC. Motifs from the X circular code have been found in the reading frame of protein-coding regions in extant organisms from bacteria to eukaryotes, in many transfer RNA (tRNA) genes and in important functional regions of the ribosomal RNA (rRNA), notably in the peptidyl transferase centre and the decoding centre. Here, we have used a powerful correlation function to search for periodicity patterns involving the 20 trinucleotides of the X circular code in a large set of bacterial protein-coding genes, as well as in the translation machinery, including rRNA and tRNA sequences. As might be expected, we found a strong circular code periodicity 0 modulo 3 in the protein-coding genes. More surprisingly, we also identified a similar circular code periodicity in a large region of the 16S rRNA. This region includes the 3' major domain corresponding to the primordial proto-ribosome decoding centre and containing numerous sites that interact with the tRNA and messenger RNA (mRNA) during translation. Furthermore, 3D structural analysis shows that the periodicity region surrounds the mRNA channel that lies between the head and the body of the SSU. Our results support the hypothesis that the X circular code may constitute an ancestral translation code involved in reading frame retrieval and maintenance, traces of which persist in modern mRNA, tRNA and rRNA despite their long evolution and adaptation to the SGC.


Kar S, Ganguly M, Das S. USING DIT-FFT ALGORITHM FOR IDENTIFICATION OF PROTEIN CODING REGION IN EUKARYOTIC GENE. BIOMEDICAL ENGINEERING-APPLICATIONS BASIS COMMUNICATIONS 2019. [DOI: 10.4015/s1016237219500029] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]

Anzalone AV, Zairis S, Lin AJ, Rabadan R, Cornish VW. Interrogation of Eukaryotic Stop Codon Readthrough Signals by in Vitro RNA Selection. Biochemistry 2019;58:1167-1178. [PMID: 30698415 DOI: 10.1021/acs.biochem.8b01280] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]

Das L, Nanda S, Das JK. An integrated approach for identification of exon locations using recursive Gauss Newton tuned adaptive Kaiser window. Genomics 2018;111:284-296. [PMID: 30342085 DOI: 10.1016/j.ygeno.2018.10.008] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2018] [Revised: 09/11/2018] [Accepted: 10/11/2018] [Indexed: 11/27/2022]

Zhao J, Wang J, Jiang H. Detecting Periodicities in Eukaryotic Genomes by Ramanujan Fourier Transform. J Comput Biol 2018;25:963-975. [PMID: 29963923 DOI: 10.1089/cmb.2017.0252] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Computational Techniques for a Comprehensive Understanding of Different Genotype-Phenotype Factors in Biological Systems and Their Applications. Synth Biol (Oxf) 2018. [DOI: 10.1007/978-981-10-8693-9_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022] Open

Wang Y, Chen X, Sheng Y, Liu Y, Gao S. N6-adenine DNA methylation is associated with the linker DNA of H2A.Z-containing well-positioned nucleosomes in Pol II-transcribed genes in Tetrahymena. Nucleic Acids Res 2017;45:11594-11606. [PMID: 29036602 PMCID: PMC5714169 DOI: 10.1093/nar/gkx883] [Citation(s) in RCA: 74] [Impact Index Per Article: 10.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2017] [Revised: 09/12/2017] [Accepted: 09/23/2017] [Indexed: 01/01/2023] Open

Messaoudi I, Elloumi Oueslati A, Lachiri Z. Inferring Helitron Structures from 1D and 2D Representations Based on the Chaos Game Theory. Ing Rech Biomed 2017. [DOI: 10.1016/j.irbm.2017.01.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]

Morán Losada P, Fischer S, Chouvarine P, Tümmler B. Three-base periodicity of sites of sequence variation in Pseudomonas aeruginosa and Staphylococcus aureus core genomes. FEBS Lett 2016;590:3538-3543. [PMID: 27664047 DOI: 10.1002/1873-3468.12431] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/31/2016] [Revised: 09/08/2016] [Accepted: 09/12/2016] [Indexed: 11/11/2022]

Marhon SA, Kremer SC. Prediction of Protein Coding Regions Using a Wide-Range Wavelet Window Method. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2016;13:742-753. [PMID: 26415183 DOI: 10.1109/tcbb.2015.2476789] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]

Zhang X, Shen Z, Zhang G, Shen Y, Chen M, Zhao J, Wu R. Short Exon Detection via Wavelet Transform Modulus Maxima. PLoS One 2016;11:e0163088. [PMID: 27635656 PMCID: PMC5026382 DOI: 10.1371/journal.pone.0163088] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2016] [Accepted: 09/04/2016] [Indexed: 02/05/2023] Open

Howe ED, Song JS. Categorical spectral analysis of periodicity in human and viral genomes. Nucleic Acids Res 2012;41:1395-405. [PMID: 23241388 PMCID: PMC3561982 DOI: 10.1093/nar/gks1261] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Chaley M, Kutyrkin V. Profile-statistical periodicity of DNA coding regions. DNA Res 2011;18:353-62. [PMID: 21788253 PMCID: PMC3190956 DOI: 10.1093/dnares/dsr023] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

Smith A, Johnson P. Gene expression in the unicellular eukaryote Trichomonas vaginalis. Res Microbiol 2011;162:646-54. [DOI: 10.1016/j.resmic.2011.04.007] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/17/2011] [Accepted: 03/02/2011] [Indexed: 02/01/2023]

Trotta E. The 3-base periodicity and codon usage of coding sequences are correlated with gene expression at the level of transcription elongation. PLoS One 2011;6:e21590. [PMID: 21738721 PMCID: PMC3125259 DOI: 10.1371/journal.pone.0021590] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2011] [Accepted: 06/03/2011] [Indexed: 11/18/2022] Open

Abstract

Background

Gene transcription is regulated by DNA transcriptional regulatory elements, promoters and enhancers that are located outside the coding regions. Here, we examine the characteristic 3-base periodicity of the coding sequences and analyse its correlation with the genome-wide transcriptional profile of yeast.

Principal Findings

The analysis of coding sequences by a new class of indices proposed here identified two different sources of 3-base periodicity: the codon frequency and the codon sequence. In exponentially growing yeast cells, the codon-frequency component of periodicity accounts for 71.9% of the variability of the cellular mRNA by a strong association with the density of elongating mRNA polymerase II complexes. The mRNA abundance explains most of the correlation between the codon-frequency component of periodicity and protein levels. Furthermore, pyrimidine-ending codons of the four-fold degenerate small amino acids alanine, glycine and valine are associated with genes with double the transcription rate of those associated with purine-ending codons.

Conclusions

We demonstrate that the 3-base periodicity of coding sequences is higher than expected by the codon usage frequency (CUF) and that its components, associated with codon bias and amino acid composition, are correlated with gene expression, principally at the level of transcription elongation. This indicates a role of codon sequences in maximising the transcription efficiency in exponentially growing yeast cells. Moreover, the results contrast with the common Darwinian explanation that attributes the codon bias to translational selection by an adjustment of synonymous codon frequencies to the most abundant isoaccepting tRNA. Here, we show that selection on codon bias likely acts at both the transcriptional and translational level and that codon usage and the relative abundance of tRNA could drive each other in order to synergistically optimize the efficiency of gene expression.


Marhon SA, Kremer SC. Gene Prediction Based on DNA Spectral Analysis: A Literature Review. J Comput Biol 2011;18:639-76. [PMID: 21381961 DOI: 10.1089/cmb.2010.0184] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Sahu SS, Panda G. Identification of protein-coding regions in DNA sequences using a time-frequency filtering approach. GENOMICS, PROTEOMICS & BIOINFORMATICS 2011;9:45-55. [PMID: 21641562 PMCID: PMC5054166 DOI: 10.1016/s1672-0229(11)60007-7] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/19/2010] [Accepted: 10/31/2010] [Indexed: 11/13/2022]

Xu S, Rao N, Chen X, Zhou B. Inferring an organism-specific optimal threshold for predicting protein coding regions in eukaryotes based on a bootstrapping algorithm. Biotechnol Lett 2011;33:889-96. [DOI: 10.1007/s10529-011-0525-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/29/2010] [Accepted: 01/06/2011] [Indexed: 11/25/2022]

Wang L, Stein LD. Localizing triplet periodicity in DNA and cDNA sequences. BMC Bioinformatics 2010;11:550. [PMID: 21059240 PMCID: PMC2992068 DOI: 10.1186/1471-2105-11-550] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2010] [Accepted: 11/08/2010] [Indexed: 01/23/2023] Open

Abstract

Background

The protein-coding regions (coding exons) of a DNA sequence exhibit a triplet periodicity (TP) due to the fact that coding exons contain a series of three-nucleotide codons that encode specific amino acid residues. Such periodicity is usually not observed in introns and intergenic regions. If a DNA sequence is divided into small segments and a Fourier Transform is applied to each segment, a strong peak at frequency 1/3 is typically observed in the Fourier spectrum of coding segments, but not in non-coding regions. This property has been used in identifying the locations of protein-coding genes in unannotated sequence. The method is fast and requires no training. However, the need to compute the Fourier Transform across a segment (window) of arbitrary size affects the accuracy with which one can localize TP boundaries. Here, we report a technique that provides higher-resolution identification of these boundaries, and use the technique to explore the biological correlates of TP regions in the genome of the model organism C. elegans.

Results

Using both simulated TP signals and the real C. elegans sequence F56F11 as an example, we demonstrate that: (1) the Modified Wavelet Transform (MWT) can better define the boundary of a TP region than the conventional Short Time Fourier Transform (STFT); (2) the scale parameter (a) of MWT determines the precision of TP boundary localization: bigger values of a give sharper TP boundaries but result in a lower signal-to-noise ratio; (3) RNA splicing sites have weaker TP signals than coding regions; (4) TP signals in coding regions can be destroyed or recovered by frame-shift mutations; (5) 6 bp periodicities in introns and intergenic regions can generate false positive signals, which can be removed with 6 bp MWT.

Conclusions

MWT can provide more precise TP boundaries than STFT, and the boundaries can be further refined by bigger-scale MWT. Subtraction of 6 bp periodicity signals reduces the number of false positives. Experimentally introduced frame-shift mutations help recover TP signals that have been lost by possible ancient frame-shifts. More importantly, the TP signal has the potential to be used to detect the splice junctions in fully spliced mRNA sequence.
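The windowed Fourier baseline described in the Background above (split the sequence into segments and look for a peak at frequency 1/3) can be sketched as follows. This is only an illustration of that baseline, not the authors' MWT method. The function names, the window length of 99 bases, the step of 3, and the toy sequence are all assumptions introduced here.

```python
import cmath
import random


def power_at_one_third(segment):
    """Summed power of the four base-indicator sequences at frequency 1/3."""
    total = 0.0
    for base in "ACGT":
        # Single DFT coefficient at f = 1/3 for the 0/1 indicator of this base.
        coeff = sum(cmath.exp(-2j * cmath.pi * n / 3)
                    for n, b in enumerate(segment) if b == base)
        total += abs(coeff) ** 2
    return total / len(segment)


def sliding_tp_profile(seq, window=99, step=3):
    """Return (start, power) pairs for each window along seq."""
    return [(start, power_at_one_third(seq[start:start + window]))
            for start in range(0, len(seq) - window + 1, step)]


# Toy data (not a real sequence): a codon-structured stretch flanked by
# uniformly random bases. The position-specific bias is an assumption
# chosen only to create a visible period-3 signal.
rng = random.Random(0)
noncoding = "".join(rng.choice("ACGT") for _ in range(300))
coding = "".join(rng.choice("GA") + rng.choice("ACGT") + rng.choice("CT")
                 for _ in range(100))
seq = noncoding + coding + noncoding
profile = sliding_tp_profile(seq)
```

Windows that lie inside the codon-structured stretch should show much higher power than the flanks. The fixed window length is also what limits how precisely the boundaries can be placed, which is the limitation the abstract attributes to this approach.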


Hirayama S, Mizuta S. Significant deviations in the configurations of homologous tandem repeats in prokaryotic genomes. GENOMICS PROTEOMICS & BIOINFORMATICS 2010;7:163-74. [PMID: 20172489 PMCID: PMC5054416 DOI: 10.1016/s1672-0229(08)60046-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 12/01/2022]

Sánchez R, Grau R. An algebraic hypothesis about the primeval genetic code architecture. Math Biosci 2009;221:60-76. [PMID: 19607845 DOI: 10.1016/j.mbs.2009.07.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/23/2008] [Revised: 06/23/2009] [Accepted: 07/09/2009] [Indexed: 11/26/2022]

Chirila TV, Minamisawa T, Keen I, Shiba K. Effect of Motif-Programmed Artificial Proteins on the Calcium Uptake in a Synthetic Hydrogel. Macromol Biosci 2009;9:959-67. [DOI: 10.1002/mabi.200900096] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

An efficient sliding window strategy for accurate location of eukaryotic protein coding regions. Comput Biol Med 2009;39:392-5. [DOI: 10.1016/j.compbiomed.2009.01.010] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/07/2007] [Revised: 01/16/2009] [Accepted: 01/28/2009] [Indexed: 11/22/2022]

Chen K, Meng Q, Ma L, Liu Q, Tang P, Chiu C, Hu S, Yu J. A novel DNA sequence periodicity decodes nucleosome positioning. Nucleic Acids Res 2008;36:6228-36. [PMID: 18829715 PMCID: PMC2577358 DOI: 10.1093/nar/gkn626] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open

Yin C, Yau SST. Prediction of protein coding regions by the 3-base periodicity analysis of a DNA sequence. J Theor Biol 2007;247:687-94. [PMID: 17509616 DOI: 10.1016/j.jtbi.2007.03.038] [Citation(s) in RCA: 119] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/22/2006] [Revised: 03/24/2007] [Accepted: 03/26/2007] [Indexed: 11/30/2022]

Larsabal E, Danchin A. Genomes are covered with ubiquitous 11 bp periodic patterns, the "class A flexible patterns". BMC Bioinformatics 2005;6:206. [PMID: 16120222 PMCID: PMC1242344 DOI: 10.1186/1471-2105-6-206] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2005] [Accepted: 08/24/2005] [Indexed: 11/17/2022] Open

Ruvinsky A, Eskesen ST, Eskesen FN, Hurst LD. Can codon usage bias explain intron phase distributions and exon symmetry? J Mol Evol 2005;60:99-104. [PMID: 15696372 DOI: 10.1007/s00239-004-0032-9] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/28/2004] [Accepted: 08/31/2004] [Indexed: 10/25/2022]

Abstract

More introns exist between codons (phase 0) than between the first and the second bases (phase 1) or between the second and the third base (phase 2) within the codon. Many explanations have been suggested for this excess of phase 0. It has, for example, been argued to reflect an ancient utility for introns in separating exons that code for separate protein modules. There may, however, be a simple, alternative explanation. Introns typically require, for correct splicing, particular nucleotides immediately 5' in exons (typically a G) and immediately 3' in the following exon (also often a G). Introns therefore tend to be found between particular nucleotide pairs (e.g., G|G pairs) in the coding sequence. If, owing to bias in usage of different codons, these pairs are especially common at phase 0, then intron phase biases may have a trivial explanation. Here we take codon usage frequencies for a variety of eukaryotes and use these to generate random sequences. We then ask about the phase of putative intron insertion sites. Importantly, in all simulated data sets intron phase distribution is biased in favor of phase 0. In many cases the bias is of the magnitude observed in real data and can be attributed to codon usage bias. It is also known that exons may carry either the same phase (symmetric) or different phases (asymmetric) at the opposite ends. We simulated a distribution of different types of exons using frequencies of introns observed in real genes assuming random combination of intron phases at the opposite sides of exons. Surprisingly the simulated pattern was quite similar to that observed. In the simulants we typically observe a prevalence of symmetric exons carrying phase 0 at both ends, which is common for eukaryotic genes. However, at least in some species, the extent of the bias in favor of symmetric (0,0) exons is not as great in simulants as in real genes. These results emphasize the need to construct a biologically relevant null model of successful intron insertion.
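The simulation outlined in this abstract (draw random coding sequences from codon usage frequencies, then tally the phase of putative insertion sites) might look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the authors' procedure. The function name, the use of G|G as the only site, and the codon weights in `toy_usage` are hypothetical and do not come from any real organism.

```python
import random
from collections import Counter


def simulate_phase_counts(codon_freqs, n_codons=30000, site="GG", seed=0):
    """Count putative insertion sites by phase in a random coding sequence."""
    rng = random.Random(seed)
    codons = list(codon_freqs)
    weights = [codon_freqs[c] for c in codons]
    # Draw codons independently according to their usage frequencies.
    seq = "".join(rng.choices(codons, weights=weights, k=n_codons))
    counts = Counter()
    for i in range(1, len(seq)):
        if seq[i - 1:i + 1] == site:
            # The junction lies just before base i: i % 3 == 0 is between
            # codons (phase 0), 1 is after the first base, 2 after the second.
            counts[i % 3] += 1
    return counts


# Hypothetical codon weights, for illustration only.
toy_usage = {"GAG": 0.3, "GGC": 0.2, "AAG": 0.2, "CTG": 0.2, "ACC": 0.1}
phase_counts = simulate_phase_counts(toy_usage)
```

Comparing the three phase counts for a realistic codon usage table against the intron phases observed in real genes is the kind of null-model check the abstract argues for.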


Nikolaou C, Almirantis Y. Mutually symmetric and complementary triplets: differences in their use distinguish systematically between coding and non-coding genomic sequences. J Theor Biol 2003;223:477-87. [PMID: 12875825 DOI: 10.1016/s0022-5193(03)00123-1] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]

Kotlar D, Lavner Y. Gene prediction by spectral rotation measure: a new method for identifying protein-coding regions. Genome Res 2003;13:1930-7. [PMID: 12869578 PMCID: PMC403785 DOI: 10.1101/gr.1261703] [Citation(s) in RCA: 72] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/12/2003] [Accepted: 05/21/2003] [Indexed: 11/24/2022]

Fukushima A, Ikemura T, Kinouchi M, Oshima T, Kudo Y, Mori H, Kanaya S. Periodicity in prokaryotic and eukaryotic genomes identified by power spectrum analysis. Gene 2002;300:203-11. [PMID: 12468102 DOI: 10.1016/s0378-1119(02)00850-8] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]

Shiba K, Takahashi Y, Noda T. On the role of periodism in the origin of proteins. J Mol Biol 2002;320:833-40. [PMID: 12095259 DOI: 10.1016/s0022-2836(02)00567-3] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]

Wang Y, Zhang CT, Dong P. Recognizing shorter coding regions of human genes based on the statistics of stop codons. Biopolymers 2002;63:207-16. [PMID: 11787008 DOI: 10.1002/bip.10054] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Janssen CS, Barrett MP, Lawson D, Quail MA, Harris D, Bowman S, Phillips RS, Turner CM. Gene discovery in Plasmodium chabaudi by genome survey sequencing. Mol Biochem Parasitol 2001;113:251-60. [PMID: 11295179 DOI: 10.1016/s0166-6851(01)00224-9] [Citation(s) in RCA: 21] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]

Kawashima T, Amano N, Koike H, Makino S, Higuchi S, Kawashima-Ohya Y, Watanabe K, Yamazaki M, Kanehori K, Kawamoto T, Nunoshiba T, Yamamoto Y, Aramaki H, Makino K, Suzuki M. Archaeal adaptation to higher temperatures revealed by genomic sequence of Thermoplasma volcanium. Proc Natl Acad Sci U S A 2000;97:14257-62. [PMID: 11121031 PMCID: PMC18905 DOI: 10.1073/pnas.97.26.14257] [Citation(s) in RCA: 150] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Jackson JH, George R, Herring PA. Vectors of shannon information from fourier signals characterizing base periodicity in genes and genomes. Biochem Biophys Res Commun 2000;268:289-92. [PMID: 10679195 DOI: 10.1006/bbrc.2000.2112] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]

Ramensky VE, Roytberg MA, Tumanyan VG. DNA segmentation through the Bayesian approach. J Comput Biol 2000;7:215-31. [PMID: 10890398 DOI: 10.1089/10665270050081487] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Nishizawa K, Nishizawa M, Kim KS. Tendency for local repetitiveness in amino acid usages in modern proteins. J Mol Biol 1999;294:937-53. [PMID: 10588898 DOI: 10.1006/jmbi.1999.3275] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]

Tatarenkov A, Sáez AG, Ayala FJ. A compact gene cluster in Drosophila: the unrelated Cs gene is compressed between duplicated amd and Ddc. Gene 1999;231:111-20. [PMID: 10231575 DOI: 10.1016/s0378-1119(99)00096-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]

Suckow JM, Amano N, Ohfuku Y, Kakinuma J, Koike H, Suzuki M. A transcription frame-based analysis of the genomic DNA sequence of a hyper-thermophilic archaeon for the identification of genes, pseudo-genes and operon structures. FEBS Lett 1998;426:86-92. [PMID: 9598984 DOI: 10.1016/s0014-5793(98)00323-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/07/2023]

Almirantis Y, Provata A. The "clustered structure" of the purines/pyrimidines distribution in DNA distinguishes systematically between coding and non-coding sequences. Bull Math Biol 1997;59:975-92. [PMID: 9281907 DOI: 10.1007/bf02460002] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/05/2023]

Tsonis AA, Kumar P, Elsner JB, Tsonis PA. Wavelet analysis of DNA sequences. PHYSICAL REVIEW. E, STATISTICAL PHYSICS, PLASMAS, FLUIDS, AND RELATED INTERDISCIPLINARY TOPICS 1996;53:1828-1834. [PMID: 9964445 DOI: 10.1103/physreve.53.1828] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]