Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Aïssani B, D'Onofrio G, Mouchiroud D, Gardiner K, Gautier C, Bernardi G. The compositional properties of human genes. J Mol Evol 1991;32:493-503. [PMID: 1908020 DOI: 10.1007/bf02102651] [Citation(s) in RCA: 52] [Impact Index Per Article: 1.6] [Indexed: 12/29/2022]

Cited by Other Article(s)

Rudolph KLM, Schmitt BM, Villar D, White RJ, Marioni JC, Kutter C, Odom DT. Codon-Driven Translational Efficiency Is Stable across Diverse Mammalian Cell States. PLoS Genet 2016;12:e1006024. [PMID: 27166679 PMCID: PMC4864286 DOI: 10.1371/journal.pgen.1006024] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.9] [Received: 12/31/2015] [Accepted: 04/12/2016] [Indexed: 11/19/2022]

Berná L, Chaurasia A, Angelini C, Federico C, Saccone S, D'Onofrio G. The footprint of metabolism in the organization of mammalian genomes. BMC Genomics 2012;13:174. [PMID: 22568857 PMCID: PMC3384468 DOI: 10.1186/1471-2164-13-174] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 07/26/2011] [Accepted: 05/08/2012] [Indexed: 01/02/2023]

Abstract

Background

At present, five evolutionary hypotheses have been proposed to explain the great variability of genomic GC content among and within genomes: mutational bias, biased gene conversion, DNA breakpoint distribution, thermal stability, and metabolic rate. Several studies carried out on bacteria and teleostean fish pointed towards the critical role played by the environment on the metabolic rate in shaping the base composition of genomes. In mammals the debate is still open, and evidence has been produced in favor of each evolutionary hypothesis. Human genes were assigned to three large functional categories (as well as to the corresponding functional classes) according to the KOG database: (i) information storage and processing, (ii) cellular processes and signaling, and (iii) metabolism. The classification was extended to the organisms analyzed so far by performing a reciprocal Blastp search and selecting the best reciprocal hit. The base composition was calculated for each sequence of the whole CDS dataset.
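
The reciprocal-best-hit selection mentioned in this abstract can be illustrated with a minimal sketch. It assumes the best Blastp hit in each direction has already been reduced to a simple mapping; the function name and the sequence identifiers are hypothetical and are not taken from the cited paper.

```python
def reciprocal_best_hits(best_a_to_b, best_b_to_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a.

    Both arguments are hypothetical dicts mapping a query ID to the ID of
    its single best hit in the other proteome.
    """
    pairs = []
    for a, b in best_a_to_b.items():
        # Keep the pair only if the hit points back to the original query.
        if best_b_to_a.get(b) == a:
            pairs.append((a, b))
    return sorted(pairs)


# Hypothetical identifiers, for illustration only.
human_to_frog = {"h1": "f1", "h2": "f2", "h3": "f2"}
frog_to_human = {"f1": "h1", "f2": "h3"}
print(reciprocal_best_hits(human_to_frog, frog_to_human))
```

Here `h2` is dropped because its best hit `f2` points back to `h3` rather than to `h2`.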

Results

The GC3 level of the above functional categories increased from (i) to (iii). This specific compositional pattern was found, as a footprint, in all mammalian genomes, but not in the frog and lizard genomes. Comparative analysis of human versus both frog and lizard functional categories showed that genes involved in the metabolic processes underwent the highest GC3 increment. Analyzing the KOG functional classes of genes, again a well-defined intra-genomic pattern was found in all mammals. Not only genes of metabolic pathways, but also genes involved in chromatin structure and dynamics, transcription, signal transduction mechanisms and cytoskeleton, showed an average GC3 level higher than that of the whole genome. In the case of the human genome, the genes of the aforementioned functional categories showed a high probability to be associated with the chromosomal bands.

Conclusions

In the light of the different evolutionary hypotheses proposed so far, which contribute to different extents to the compositional heterogeneity of mammalian genomes, the one based on the metabolic rate seems to play no minor role. Keeping in mind similar results reported in bacteria and in teleosts, the specific compositional patterns observed in mammals highlight metabolic rate as a unifying factor that fits over a wide range of living organisms.

Medvedeva YA, Fridman MV, Oparina NJ, Malko DB, Ermakova EO, Kulakovskiy IV, Heinzel A, Makeev VJ. Intergenic, gene terminal, and intragenic CpG islands in the human genome. BMC Genomics 2010;11:48. [PMID: 20085634 PMCID: PMC2817693 DOI: 10.1186/1471-2164-11-48] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.9] [Received: 02/26/2009] [Accepted: 01/19/2010] [Indexed: 11/10/2022]

Abstract

Background

Recently, it has been discovered that the human genome contains many transcription start sites for non-coding RNA. Regulatory regions related to transcription of these non-coding RNAs are poorly studied. Some of these regulatory regions may be associated with CpG islands located far from the transcription start sites of any protein-coding gene. The human genome contains many such CpG islands; however, until now their properties have not been systematically studied.

Results

We studied CpG islands located in different regions of the human genome using methods of bioinformatics and comparative genomics. We have observed that CpG islands have a preference to overlap with exons, including exons located far from transcription start site, but usually extend well into introns. Synonymous substitution rate of CpG-containing codons becomes substantially reduced in regions where CpG islands overlap with protein-coding exons, even if they are located far downstream from transcription start site. CAGE tag analysis displayed frequent transcription start sites in all CpG islands, including those found far from transcription start sites of protein coding genes. Computational prediction and analysis of published ChIP-chip data revealed that CpG islands contain an increased number of sites recognized by Sp1 protein. CpG islands containing more CAGE tags usually also contain more Sp1 binding sites. This is especially relevant for CpG islands located in 3' gene regions. Various examples of transcription, confirmed by mRNAs or ESTs, but with no evidence of protein coding genes, were found in CAGE-enriched CpG islands located far from transcription start site of any known protein coding gene.

Conclusions

CpG islands located far from transcription start sites of protein coding genes have transcription initiation activity and display Sp1 binding properties. In exons, overlapping with these islands, the synonymous substitution rate of CpG containing codons is decreased. This suggests that these CpG islands are involved in transcription initiation, possibly of some non-coding RNAs.

Duret L, Galtier N. Biased gene conversion and the evolution of mammalian genomic landscapes. Annu Rev Genomics Hum Genet 2009;10:285-311. [PMID: 19630562 DOI: 10.1146/annurev-genom-082908-150001] [Citation(s) in RCA: 468] [Impact Index Per Article: 31.2] [Indexed: 11/09/2022]

Creanza TM, Horner DS, D'Addabbo A, Maglietta R, Mignone F, Ancona N, Pesole G. Statistical assessment of discriminative features for protein-coding and non coding cross-species conserved sequence elements. BMC Bioinformatics 2009;10 Suppl 6:S2. [PMID: 19534745 PMCID: PMC2697643 DOI: 10.1186/1471-2105-10-s6-s2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/30/2022]

Abstract

Background

The identification of protein coding elements in sets of mammalian conserved elements is one of the major challenges in current molecular biology research. Many features have been proposed for automatically distinguishing coding and non coding conserved sequences, which makes a systematic statistical assessment of their differences necessary. A comprehensive study should be composed of an association study, i.e. a comparison of the distributions of the features in the two classes, and a prediction study in which the prediction accuracies of classifiers trained on single features and on groups of features are analyzed, conditional on the compared species and on the sequence lengths.

Results

In this paper we compared distributions of a set of comparative and non comparative features and evaluated the prediction accuracy of classifiers trained for discriminating sequence elements conserved among human, mouse and rat species. The association study showed that the analyzed features are statistically different in the two classes. In order to study the influence of the sequence lengths on the feature performances, a predictive study was performed on different data sets composed of coding and non coding alignments in equal number and of equal length, with an ascending average length. We found that the most discriminant feature was a comparative measure indicating the proportion of synonymous nucleotide substitutions per synonymous site. Moreover, linear discriminant classifiers trained by using comparative features in general outperformed classifiers based on intrinsic ones. Finally, the prediction accuracy of classifiers trained on comparative features increased significantly by adding intrinsic features to the set of input variables, independently of sequence length (Kolmogorov-Smirnov P-value ≤ 0.05).

Conclusion

We observed distinct and consistent patterns for individual and combined use of comparative and intrinsic classifiers, both with respect to different lengths of sequences/alignments and with respect to error rates in the classification of coding and non-coding elements. In particular, we noted that comparative features tend to be more accurate in the classification of coding sequences – this is likely related to the fact that such features capture deviations from strictly neutral evolution expected as a consequence of the characteristics of the genetic code.

Elhaik E, Landan G, Graur D. Can GC content at third-codon positions be used as a proxy for isochore composition? Mol Biol Evol 2009;26:1829-33. [PMID: 19443854 DOI: 10.1093/molbev/msp100] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.9] [Indexed: 11/14/2022]

Different functional classes of genes are characterized by different compositional properties. FEBS Lett 2007;581:5819-24. [DOI: 10.1016/j.febslet.2007.11.052] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.5] [Received: 08/13/2007] [Revised: 11/14/2007] [Accepted: 11/16/2007] [Indexed: 11/19/2022]

Bag SK, Paul S, Ghosh S, Dutta C. Reverse polarization in amino acid and nucleotide substitution patterns between human-mouse orthologs of two compositional extrema. DNA Res 2007;14:141-54. [PMID: 17895298 PMCID: PMC2533592 DOI: 10.1093/dnares/dsm015] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/14/2022]

Gu J, Li WH. Are GC-rich isochores vanishing in mammals? Gene 2006;385:50-6. [PMID: 16987615 DOI: 10.1016/j.gene.2006.03.026] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 11/30/2005] [Accepted: 03/30/2006] [Indexed: 10/25/2022]

Fortes GG, Bouza C, Martínez P, Sánchez L. Diversity in isochore structure among cold-blooded vertebrates based on GC content of coding and non-coding sequences. Genetica 2006;129:281-9. [PMID: 16897446 DOI: 10.1007/s10709-006-0009-2] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 07/01/2005] [Accepted: 04/19/2006] [Indexed: 11/29/2022]

Belle EMS, Duret L, Galtier N, Eyre-Walker A. The decline of isochores in mammals: an assessment of the GC content variation along the mammalian phylogeny. J Mol Evol 2004;58:653-60. [PMID: 15461422 DOI: 10.1007/s00239-004-2587-x] [Citation(s) in RCA: 58] [Impact Index Per Article: 2.9] [Indexed: 11/30/2022]

Jabbari K, Bernardi G. Comparative genomics of Anopheles gambiae and Drosophila melanogaster. Gene 2004;333:183-6. [PMID: 15177694 DOI: 10.1016/j.gene.2004.02.038] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Received: 12/17/2003] [Accepted: 02/10/2004] [Indexed: 10/26/2022]

Duret L, Semon M, Piganeau G, Mouchiroud D, Galtier N. Vanishing GC-rich isochores in mammalian genomes. Genetics 2002;162:1837-47. [PMID: 12524353 PMCID: PMC1462357 DOI: 10.1093/genetics/162.4.1837] [Citation(s) in RCA: 123] [Impact Index Per Article: 5.6] [Indexed: 11/14/2022]

D'Onofrio G, Ghosh TC, Bernardi G. The base composition of the genes is correlated with the secondary structures of the encoded proteins. Gene 2002;300:179-87. [PMID: 12468099 DOI: 10.1016/s0378-1119(02)01045-4] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.0] [Indexed: 10/27/2022]

Vinogradov AE. Within-intron correlation with base composition of adjacent exons in different genomes. Gene 2001;276:143-51. [PMID: 11591481 DOI: 10.1016/s0378-1119(01)00638-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.7] [Indexed: 10/18/2022]

Bernardi G. Isochores and the evolutionary genomics of vertebrates. Gene 2000;241:3-17. [PMID: 10607893 DOI: 10.1016/s0378-1119(99)00485-0] [Citation(s) in RCA: 357] [Impact Index Per Article: 14.9] [Indexed: 11/25/2022]

Majumdar S, Gupta SK, Sundararajan VS, Ghosh TC. Compositional correlation studies among the three different codon positions in 12 bacterial genomes. Biochem Biophys Res Commun 1999;266:66-71. [PMID: 10581166 DOI: 10.1006/bbrc.1999.1774] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.0] [Indexed: 11/22/2022]

Abstract

Compositional distributions in the three codon positions of the coding sequences of 12 fully sequenced prokaryotic genomes, which are publicly available, were investigated. A universal compositional correlation was observed in most of the genomes under investigation irrespective of their overall genomic GC contents. In all the genomes, the GC contents at the first codon positions are always greater than the overall GC contents of the genomes whereas the reverse is true in the case of second codon positions. GC contents at the third codon positions are higher than the overall genomic GC contents in high GC containing genomes, and the opposite situation was found in case of low GC genomes except for Helicobacter pylori. In high-GC rich genomes, the GC contents at the first + second codon positions are less than the GC contents at the third codon positions, and they are low in low-GC genomes except for Helicobacter pylori. The distributions of four bases at the three different positions were also investigated for all 12 organisms. It was observed that in high-GC genomes G is the most dominant base and in low-GC genomes A is the most dominant base in the first codon positions. But purine bases, i.e., (A + G), predominantly occur in the first codon position. In the second codon position, A is the most dominant base in most of the organisms and G is the least dominant base in all the organisms. There is no unique regular pattern of individual bases at the third codon positions; however, there are significant differences in the occurrences of (G + C) contents in the third codon positions among the different organisms. Calculations of dinucleotide frequencies in 12 different organisms indicate that in GC-rich genomes GG, GC, CC, and CG dinucleotides are the most dominant whereas the reverse is true in case of low-GC genomes. Biological implications of these results are discussed in this paper.
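
The per-position GC contents discussed in this abstract (first, second and third codon positions, often written GC1, GC2 and GC3) can be computed as in the minimal sketch below. It assumes an in-frame coding sequence given as a plain string; the function name and the example sequence are illustrative and are not taken from the cited paper.

```python
def gc_by_codon_position(cds):
    """Return (GC1, GC2, GC3) as fractions for an in-frame coding sequence."""
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore any trailing partial codon
    fractions = []
    for offset in range(3):
        # Every third base, starting at codon position offset + 1.
        bases = seq[offset:usable:3]
        gc = sum(base in "GC" for base in bases)
        fractions.append(gc / len(bases) if bases else 0.0)
    return tuple(fractions)


# Illustrative sequence with codons ATG GCC GCG TAA.
gc1, gc2, gc3 = gc_by_codon_position("ATGGCCGCGTAA")
print(gc1, gc2, gc3)
```

Ambiguous bases such as N count as non-GC in this sketch; a real analysis would need an explicit policy for them.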

D'Onofrio G, Jabbari K, Musto H, Bernardi G. The correlation of protein hydropathy with the base composition of coding sequences. Gene 1999;238:3-14. [PMID: 10570978 DOI: 10.1016/s0378-1119(99)00257-7] [Citation(s) in RCA: 71] [Impact Index Per Article: 2.8] [Indexed: 11/20/2022]

Eyre-Walker A. Evidence of selection on silent site base composition in mammals: potential implications for the evolution of isochores and junk DNA. Genetics 1999;152:675-83. [PMID: 10353909 PMCID: PMC1460637 DOI: 10.1093/genetics/152.2.675] [Citation(s) in RCA: 133] [Impact Index Per Article: 5.3] [Indexed: 11/12/2022]

Biunno I, Rogozin IB, Appierto V, Milanesi L, Mostardini M, Mumm S, Pergolizzi R, Zucchi I, De Bellis G. Sequence and gene content in 35 kb genomic clone mapping in the human Xq27.1 region. DNA Seq 1998;8:1-15. [PMID: 9522116 DOI: 10.3109/10425179709020880] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 02/06/2023]

Chiapello H, Lisacek F, Caboche M, Hénaut A. Codon usage and gene function are related in sequences of Arabidopsis thaliana. Gene 1998;209:GC1-GC38. [PMID: 9583944 DOI: 10.1016/s0378-1119(97)00671-9] [Citation(s) in RCA: 126] [Impact Index Per Article: 4.8] [Indexed: 02/07/2023]

Bernardi G, Hughes S, Mouchiroud D. The major compositional transitions in the vertebrate genome. J Mol Evol 1997;44 Suppl 1:S44-51. [PMID: 9071011 DOI: 10.1007/pl00000051] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.3] [Indexed: 02/04/2023]

Gardiner K. Base composition and gene distribution: critical patterns in mammalian genome organization. Trends Genet 1996;12:519-24. [PMID: 9257535 DOI: 10.1016/s0168-9525(97)81400-x] [Citation(s) in RCA: 45] [Impact Index Per Article: 1.6] [Indexed: 02/05/2023]

Saccone S, Cacciò S, Kusuda J, Andreozzi L, Bernardi G. Identification of the gene-richest bands in human chromosomes. Gene 1996;174:85-94. [PMID: 8863733 DOI: 10.1016/0378-1119(96)00392-7] [Citation(s) in RCA: 73] [Impact Index Per Article: 2.6] [Indexed: 02/02/2023]

Abstract

The human genome is a mosaic of isochores, long DNA segments which are compositionally homogeneous and which can be partitioned into five families, L1, L2, H1, H2 and H3, characterized by increasing GC levels and by increasing gene concentrations. Previous investigations showed that in situ hybridization with a DNA fraction derived from the GC-richest and gene-richest isochores of the H3 family produced the highest concentration of signals on 25 R(everse) bands that include the 22 most thermal-denaturation-resistant T(elomeric) bands, a subset of R bands. Using an improved protocol for in situ hybridization and cloned H3 isochore DNA, we have now shown (i) that the number of bands which are characterized by strong hybridization signals, and which are here called T or H3+, is 28; (ii) that 31 additional R bands, here called T' or H3* bands, also contain H3 isochores, although at a lower concentration than H3+ bands; and (iii) that the remaining R bands (about 140 out of 200, at a resolution of 400 bands), here called R" or H3- bands, do not contain any detectable H3 isochores. H3+ and H3* bands contain all the gene-richest isochores of the human genome. The existence of three distinct sets of R bands is further supported (i) by the different compositional features of genes located in them; (ii) by the very low gene density of chromosomes 13 and 18, in which all R bands are H3- bands; (iii) by the compositional map of an H3* band, Xq28; (iv) by the overwhelming presence of GC-rich and GC-poor long (> 50 kb) DNA sequences in H3+/H3* and in H3-/G bands, respectively; and (v) by the large degree of coincidence of H3+ and H3* bands with CpG island-positive bands. These observations have implications for our understanding of the causes of chromosome banding and provide a classification of chromosomal bands that is related to GC level (and to gene concentration).

Zoubak S, Clay O, Bernardi G. The gene distribution of the human genome. Gene 1996;174:95-102. [PMID: 8863734 DOI: 10.1016/0378-1119(96)00393-9] [Citation(s) in RCA: 159] [Impact Index Per Article: 5.7] [Indexed: 02/02/2023]

De Sario A, Geigl EM, Palmieri G, D'Urso M, Bernardi G. A compositional map of human chromosome band Xq28. Proc Natl Acad Sci U S A 1996;93:1298-302. [PMID: 8577758 PMCID: PMC40074 DOI: 10.1073/pnas.93.3.1298] [Citation(s) in RCA: 26] [Impact Index Per Article: 0.9] [Indexed: 01/31/2023]

Musto H, Rodríguez-Maseda H, Alvarez F. Compositional correlations in the nuclear genes of the flatworm Schistosoma mansoni. J Mol Evol 1995;40:343-6. [PMID: 7723062 DOI: 10.1007/bf00163240] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 01/26/2023]

Mouchiroud D, Gautier C, Bernardi G. Frequencies of synonymous substitutions in mammals are gene-specific and correlated with frequencies of nonsynonymous substitutions. J Mol Evol 1995;40:107-13. [PMID: 7714909 DOI: 10.1007/bf00166602] [Citation(s) in RCA: 66] [Impact Index Per Article: 2.3] [Indexed: 01/26/2023]

Rodríguez-Maseda H, Musto H. The compositional compartments of the nuclear genomes of Trypanosoma brucei and T. cruzi. Gene 1994;151:221-4. [PMID: 7828878 DOI: 10.1016/0378-1119(94)90660-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 01/27/2023]

Cacciò S, Perani P, Saccone S, Kadi F, Bernardi G. Single-copy sequence homology among the GC-richest isochores of the genomes from warm-blooded vertebrates. J Mol Evol 1994;39:331-9. [PMID: 7966363 DOI: 10.1007/bf00160265] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.6] [Indexed: 01/28/2023]

Zoubak S, Richardson JH, Rynditch A, Höllsberg P, Hafler DA, Boeri E, Lever AM, Bernardi G. Regional specificity of HTLV-I proviral integration in the human genome. Gene 1994;143:155-63. [PMID: 8206368 DOI: 10.1016/0378-1119(94)90091-4] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.1] [Indexed: 01/29/2023]

Berkhout B, van Hemert FJ. The unusual nucleotide content of the HIV RNA genome results in a biased amino acid composition of HIV proteins. Nucleic Acids Res 1994;22:1705-11. [PMID: 8202375 PMCID: PMC308053 DOI: 10.1093/nar/22.9.1705] [Citation(s) in RCA: 71] [Impact Index Per Article: 2.4] [Indexed: 01/29/2023]

Sabeur G, Macaya G, Kadi F, Bernardi G. The isochore patterns of mammalian genomes and their phylogenetic implications. J Mol Evol 1993;37:93-108. [PMID: 8411213 DOI: 10.1007/bf02407344] [Citation(s) in RCA: 54] [Impact Index Per Article: 1.7] [Indexed: 01/30/2023]

Collins DW, Jukes TH. Relationship between G + C in silent sites of codons and amino acid composition of human proteins. J Mol Evol 1993;36:201-13. [PMID: 8483158 DOI: 10.1007/bf00160475] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.0] [Indexed: 01/31/2023]

Bettecken T, Aissani B, Müller CR, Bernardi G. Compositional mapping of the human dystrophin-encoding gene. Gene 1992;122:329-35. [PMID: 1487147 DOI: 10.1016/0378-1119(92)90222-b] [Citation(s) in RCA: 24] [Impact Index Per Article: 0.8] [Indexed: 12/27/2022]

Karlin S, Bucher P. Correlation analysis of amino acid usage in protein classes. Proc Natl Acad Sci U S A 1992;89:12165-9. [PMID: 1465457 PMCID: PMC50719 DOI: 10.1073/pnas.89.24.12165] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.6] [Indexed: 12/27/2022]

Eyre-Walker A. The role of DNA replication and isochores in generating mutation and silent substitution rate variance in mammals. Genet Res (Camb) 1992;60:61-7. [PMID: 1452015 DOI: 10.1017/s0016672300030676] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 12/27/2022]

Eyre-Walker A. Evidence that both G + C rich and G + C poor isochores are replicated early and late in the cell cycle. Nucleic Acids Res 1992;20:1497-501. [PMID: 1579441 PMCID: PMC312229 DOI: 10.1093/nar/20.7.1497] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.0] [Indexed: 12/27/2022]

Sueoka N. Directional mutation pressure, selective constraints, and genetic equilibria. J Mol Evol 1992;34:95-114. [PMID: 1556753 DOI: 10.1007/bf00182387] [Citation(s) in RCA: 113] [Impact Index Per Article: 3.5] [Indexed: 12/27/2022]

D'Onofrio G, Bernardi G. A universal compositional correlation among codon positions. Gene 1992;110:81-8. [PMID: 1544580 DOI: 10.1016/0378-1119(92)90447-w] [Citation(s) in RCA: 56] [Impact Index Per Article: 1.8] [Indexed: 12/27/2022]

Matassi G, Melis R, Macaya G, Bernardi G. Compositional bimodality of the nuclear genome of tobacco. Nucleic Acids Res 1991;19:5561-7. [PMID: 1658735 PMCID: PMC328957 DOI: 10.1093/nar/19.20.5561] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.0] [Indexed: 12/28/2022]

Aïssani B, Bernardi G. CpG islands, genes and isochores in the genomes of vertebrates. Gene 1991;106:185-95. [PMID: 1937049 DOI: 10.1016/0378-1119(91)90198-k] [Citation(s) in RCA: 69] [Impact Index Per Article: 2.1] [Indexed: 12/29/2022]

Krane DE, Hartl DL, Ochman H. Rapid determination of nucleotide content and its application to the study of genome structure. Nucleic Acids Res 1991;19:5181-5. [PMID: 1833723 PMCID: PMC328873 DOI: 10.1093/nar/19.19.5181] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.2] [Indexed: 12/29/2022]

VanWye JD, Bronson EC, Anderson JN. Species-specific patterns of DNA bending and sequence. Nucleic Acids Res 1991;19:5253-61. [PMID: 1923808 PMCID: PMC328884 DOI: 10.1093/nar/19.19.5253] [Citation(s) in RCA: 41] [Impact Index Per Article: 1.2] [Indexed: 12/29/2022]

Abstract

Nucleotide sequences in the GenEMBL database were analyzed using strategies designed to reveal species-specific patterns of DNA bending and DNA sequence. The results uncovered striking species-dependent patterns of bending with more variations among individual organisms than between prokaryotes and eukaryotes. The frequency of bent sites in sequences from different bacteria was related to genomic A + T content and this relationship was confirmed by electrophoretic analysis of genomic DNA. However, base composition was not an accurate predictor for DNA bending in eukaryotes. Sequences from C. elegans exhibited the highest frequency of bent sites in the database and the RNA polymerase II locus from the nematode was the most bent gene in GenEMBL. Bent DNA extended throughout most introns and gene flanking segments from C. elegans while exon regions lacked A-tract bending characteristics. Independent evidence for the strong bending character of this genome was provided by electrophoretic studies which revealed that a large number of the fragments from C. elegans DNA exhibited anomalous gel mobilities when compared to genomic fragments from over 20 other organisms. The prevalence of bent sites in this genome enabled us to detect selectively C. elegans sequences in a computer search of the database using as probes C. elegans introns, bending elements, and a 20 nucleotide consensus sequence for bent DNA. This approach was also used to provide additional examples of species-specific sequence patterns in eukaryotes where it was shown that (A)≥10 and (A.T)≥5 tracts are prevalent throughout the untranslated DNA of D. discoideum and P. falciparum, respectively. These results provide new insight into the organization of eukaryotic DNA because they show that species-specific patterns of simple sequences are found in introns and in other untranslated regions of the genome.

D'Onofrio G, Mouchiroud D, Aïssani B, Gautier C, Bernardi G. Correlations between the compositional properties of human genes, codon usage, and amino acid composition of proteins. J Mol Evol 1991;32:504-10. [PMID: 1908021 DOI: 10.1007/bf02102652] [Citation(s) in RCA: 124] [Impact Index Per Article: 3.8] [Indexed: 12/29/2022]

Mouchiroud D, D'Onofrio G, Aïssani B, Macaya G, Gautier C, Bernardi G. The distribution of genes in the human genome. Gene 1991;100:181-7. [PMID: 2055469 DOI: 10.1016/0378-1119(91)90364-h] [Citation(s) in RCA: 181] [Impact Index Per Article: 5.5] [Indexed: 12/30/2022]