1
|
Rudolph KLM, Schmitt BM, Villar D, White RJ, Marioni JC, Kutter C, Odom DT. Codon-Driven Translational Efficiency Is Stable across Diverse Mammalian Cell States. PLoS Genet 2016; 12:e1006024. [PMID: 27166679 PMCID: PMC4864286 DOI: 10.1371/journal.pgen.1006024] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.9] [Received: 12/31/2015] [Accepted: 04/12/2016] [Indexed: 11/19/2022] Open
Abstract
Whether codon usage fine-tunes mRNA translation in mammals remains controversial, with recent papers suggesting that production of proteins in specific Gene Ontological (GO) pathways can be regulated by actively modifying the codon and anticodon pools in different cellular conditions. In this work, we compared the sequence content of genes in specific GO categories with the exonic genome background. Although a substantial fraction of variability in codon usage could be explained by random sampling, almost half of GO sets showed more variability in codon usage than expected by chance. Nevertheless, by quantifying translational efficiency in healthy and cancerous tissues in human and mouse, we demonstrated that a given tRNA pool can equally well translate many different sets of mRNAs, irrespective of their cell-type specificity. This disconnect between variations in codon usage and the stability of translational efficiency is best explained by differences in GC content between gene sets. GC variation across the mammalian genome is most likely a result of the interplay between genome repair and gene duplication mechanisms, rather than selective pressures caused by codon-driven translational rates. Consequently, codon usage differences in mammalian transcriptomes are most easily explained by well-understood mutational biases acting on the underlying genome.
Affiliation(s)
- Konrad L. M. Rudolph
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, United Kingdom
- Bianca M. Schmitt
  - University of Cambridge, Cancer Research UK Cambridge Institute, Cambridge, United Kingdom
- Diego Villar
  - University of Cambridge, Cancer Research UK Cambridge Institute, Cambridge, United Kingdom
- Robert J. White
  - University of York, Department of Biology, York, United Kingdom
- John C. Marioni
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, United Kingdom
  - University of Cambridge, Cancer Research UK Cambridge Institute, Cambridge, United Kingdom
  - Wellcome Trust Sanger Institute, Cambridge, United Kingdom
- Claudia Kutter
  - University of Cambridge, Cancer Research UK Cambridge Institute, Cambridge, United Kingdom
  - Science for Life Laboratory, Karolinska Institute, Department of Microbiology, Tumor and Cell Biology, Stockholm, Sweden
- Duncan T. Odom
  - University of Cambridge, Cancer Research UK Cambridge Institute, Cambridge, United Kingdom
  - Wellcome Trust Sanger Institute, Cambridge, United Kingdom
|
2
|
Berná L, Chaurasia A, Angelini C, Federico C, Saccone S, D'Onofrio G. The footprint of metabolism in the organization of mammalian genomes. BMC Genomics 2012; 13:174. [PMID: 22568857 PMCID: PMC3384468 DOI: 10.1186/1471-2164-13-174] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 07/26/2011] [Accepted: 05/08/2012] [Indexed: 01/02/2023] Open
Abstract
Background At present five evolutionary hypotheses have been proposed to explain the great variability of the genomic GC content among and within genomes: the mutational bias, the biased gene conversion, the DNA breakpoints distribution, the thermal stability and the metabolic rate. Several studies carried out on bacteria and teleostean fish pointed towards the critical role played by the environment on the metabolic rate in shaping the base composition of genomes. In mammals the debate is still open, and evidence has been produced in favor of each evolutionary hypothesis. Human genes were assigned to three large functional categories (as well as to the corresponding functional classes) according to the KOG database: (i) information storage and processing, (ii) cellular processes and signaling, and (iii) metabolism. The classification was extended to the organisms so far analyzed by performing a reciprocal Blastp and selecting the best reciprocal hit. The base composition was calculated for each sequence of the whole CDS dataset.
Results The GC3 level of the above functional categories increased from (i) to (iii). This specific compositional pattern was found, as a footprint, in all mammalian genomes, but not in the frog and lizard ones. Comparative analysis of human versus both frog and lizard functional categories showed that genes involved in the metabolic processes underwent the highest GC3 increment. Analyzing the KOG functional classes of genes, again a well-defined intra-genomic pattern was found in all mammals. Not only genes of metabolic pathways, but also genes involved in chromatin structure and dynamics, transcription, signal transduction mechanisms and cytoskeleton, showed an average GC3 level higher than that of the whole genome. In the case of the human genome, the genes of the aforementioned functional categories showed a high probability to be associated with the chromosomal bands.
Conclusions In the light of the different evolutionary hypotheses proposed so far, which contribute to different degrees to the compositional heterogeneity of mammalian genomes, the one based on the metabolic rate appears to play more than a minor role. Keeping in mind similar results reported in bacteria and in teleosts, the specific compositional patterns observed in mammals highlight metabolic rate as a unifying factor that fits over a wide range of living organisms.
Affiliation(s)
- Luisa Berná
  - Genome Evolution and Organization - Department Animal Physiology and Evolution, Stazione Zoologica Anton Dohrn, Villa Comunale, 80121 Naples, Italy
|
3
|
Medvedeva YA, Fridman MV, Oparina NJ, Malko DB, Ermakova EO, Kulakovskiy IV, Heinzel A, Makeev VJ. Intergenic, gene terminal, and intragenic CpG islands in the human genome. BMC Genomics 2010; 11:48. [PMID: 20085634 PMCID: PMC2817693 DOI: 10.1186/1471-2164-11-48] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.9] [Received: 02/26/2009] [Accepted: 01/19/2010] [Indexed: 11/10/2022] Open
Abstract
Background Recently, it has been discovered that the human genome contains many transcription start sites for non-coding RNA. Regulatory regions related to transcription of these non-coding RNAs are poorly studied. Some of these regulatory regions may be associated with CpG islands located far from transcription start sites of any protein coding gene. The human genome contains many such CpG islands; however, until now their properties had not been systematically studied.
Results We studied CpG islands located in different regions of the human genome using methods of bioinformatics and comparative genomics. We observed that CpG islands preferentially overlap with exons, including exons located far from the transcription start site, but usually extend well into introns. The synonymous substitution rate of CpG-containing codons becomes substantially reduced in regions where CpG islands overlap with protein-coding exons, even if they are located far downstream from the transcription start site. CAGE tag analysis displayed frequent transcription start sites in all CpG islands, including those found far from transcription start sites of protein coding genes. Computational prediction and analysis of published ChIP-chip data revealed that CpG islands contain an increased number of sites recognized by Sp1 protein. CpG islands containing more CAGE tags usually also contain more Sp1 binding sites. This is especially relevant for CpG islands located in 3' gene regions. Various examples of transcription, confirmed by mRNAs or ESTs, but with no evidence of protein coding genes, were found in CAGE-enriched CpG islands located far from the transcription start site of any known protein coding gene.
Conclusions CpG islands located far from transcription start sites of protein coding genes have transcription initiation activity and display Sp1 binding properties. In exons overlapping with these islands, the synonymous substitution rate of CpG-containing codons is decreased. This suggests that these CpG islands are involved in transcription initiation, possibly of some non-coding RNAs.
Affiliation(s)
- Yulia A Medvedeva
  - Research Institute for Genetics and Selection of Industrial Microorganisms, Genetika, 1st Dorozhny proezd, 1, Moscow, 117545, Russia.
|
4
|
Duret L, Galtier N. Biased gene conversion and the evolution of mammalian genomic landscapes. Annu Rev Genomics Hum Genet 2009; 10:285-311. [PMID: 19630562 DOI: 10.1146/annurev-genom-082908-150001] [Citation(s) in RCA: 468] [Impact Index Per Article: 31.2] [Indexed: 11/09/2022]
Abstract
Recombination is typically thought of as a symmetrical process resulting in large-scale reciprocal genetic exchanges between homologous chromosomes. Recombination events, however, are also accompanied by short-scale, unidirectional exchanges known as gene conversion in the neighborhood of the initiating double-strand break. A large body of evidence suggests that gene conversion is GC-biased in many eukaryotes, including mammals and humans. AT/GC heterozygotes produce more GC- than AT-gametes, thus conferring a population advantage to GC-alleles in high-recombining regions. This apparently unimportant feature of our molecular machinery has major evolutionary consequences. Structurally, GC-biased gene conversion explains the spatial distribution of GC-content in mammalian genomes, the so-called isochore structure. Functionally, GC-biased gene conversion promotes the segregation and fixation of deleterious AT → GC mutations, thus increasing our genomic mutation load. Here we review the recent evidence for a GC-biased gene conversion process in mammals, and its consequences for genomic landscapes, molecular evolution, and human functional genomics.
Affiliation(s)
- Laurent Duret
  - Université de Lyon 1, CNRS, UMR5558, Laboratoire de Biométrie et Biologie Evolutive, F-69622, Villeurbanne, France.
|
5
|
Creanza TM, Horner DS, D'Addabbo A, Maglietta R, Mignone F, Ancona N, Pesole G. Statistical assessment of discriminative features for protein-coding and non coding cross-species conserved sequence elements. BMC Bioinformatics 2009; 10 Suppl 6:S2. [PMID: 19534745 PMCID: PMC2697643 DOI: 10.1186/1471-2105-10-s6-s2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/30/2022] Open
Abstract
Background The identification of protein coding elements in sets of mammalian conserved elements is one of the major challenges in current molecular biology research. Many features have been proposed for automatically distinguishing coding and non-coding conserved sequences, which makes a systematic statistical assessment of their differences necessary. A comprehensive study should be composed of an association study, i.e. a comparison of the distributions of the features in the two classes, and a prediction study in which the prediction accuracies of classifiers trained on single features and on groups of features are analyzed, conditional on the compared species and on the sequence lengths.
Results In this paper we compared distributions of a set of comparative and non-comparative features and evaluated the prediction accuracy of classifiers trained to discriminate sequence elements conserved among human, mouse and rat. The association study showed that the analyzed features are statistically different in the two classes. In order to study the influence of sequence length on feature performance, a predictive study was performed on different data sets composed of equal numbers of coding and non-coding alignments of equal length, with ascending average length. We found that the most discriminant feature was a comparative measure indicating the proportion of synonymous nucleotide substitutions per synonymous site. Moreover, linear discriminant classifiers trained using comparative features in general outperformed classifiers based on intrinsic ones. Finally, the prediction accuracy of classifiers trained on comparative features increased significantly when intrinsic features were added to the set of input variables, independently of sequence length (Kolmogorov-Smirnov P-value ≤ 0.05).
Conclusion We observed distinct and consistent patterns for individual and combined use of comparative and intrinsic classifiers, both with respect to different lengths of sequences/alignments and with respect to error rates in the classification of coding and non-coding elements. In particular, we noted that comparative features tend to be more accurate in the classification of coding sequences; this is likely related to the fact that such features capture deviations from strictly neutral evolution expected as a consequence of the characteristics of the genetic code.
Affiliation(s)
- Teresa M Creanza
  - Istituto di Studi sui Sistemi Intelligenti per l'Automazione, CNR, Via Amendola 122/D-I, Bari, Italy.
|
6
|
Elhaik E, Landan G, Graur D. Can GC content at third-codon positions be used as a proxy for isochore composition? Mol Biol Evol 2009; 26:1829-33. [PMID: 19443854 DOI: 10.1093/molbev/msp100] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.9] [Indexed: 11/14/2022] Open
Abstract
The isochore theory depicts the genomes of warm-blooded vertebrates as a mosaic of long genomic regions that are characterized by relatively homogeneous GC content. In the absence of genomic data, the GC content at third-codon positions of protein-coding genes (GC3) was commonly used as a proxy for the GC content of isochores. Oddly, in the postgenomic era, GC3 is still sometimes used as a proxy for the GC composition of isochores. Here, we use genic and genomic sequences from human, chimpanzee, cow, mouse, rat, chicken, and zebrafish to show that GC3 only explains a very small proportion of the variation in GC content of long genomic sequences flanking the genes (GCf), and what little correlation there is between GC3 and GCf was found to decay rapidly with distance from the gene. The coefficient of variation of GC3 was found to be much larger than that of GCf and, therefore, GC3 and GCf values are not comparable with each other. Comparisons of orthologous gene pairs from 1) human and chimpanzee and 2) mouse and rat show strong correlations between their GC3 values, but very weak correlations between their GCf values. We conclude that the GC content of third-codon position cannot be used as stand-in for isochoric composition.
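As a rough illustration of the two quantities this abstract compares, the sketch below computes GC3 for an in-frame coding sequence and a coefficient of variation over a list of values. The function names `gc3` and `coefficient_of_variation` are hypothetical, and the use of the population standard deviation is an assumption; the abstract does not state how the authors computed either measure.

```python
from statistics import mean, pstdev


def gc3(cds: str) -> float:
    """Fraction of G or C at third codon positions of an in-frame CDS."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    thirds = cds[2:usable:3]          # every third base, starting at position 3
    if not thirds:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in thirds) / len(thirds)


def coefficient_of_variation(values) -> float:
    """Population standard deviation divided by the mean (an assumed definition)."""
    return pstdev(values) / mean(values)
```

Applied to per-gene GC3 values and to the GC content of flanking regions, the second function gives the kind of spread comparison the abstract describes.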
Affiliation(s)
- Eran Elhaik
  - Department of Biology and Biochemistry, University of Houston, TX, USA
|
7
|
Different functional classes of genes are characterized by different compositional properties. FEBS Lett 2007; 581:5819-24. [DOI: 10.1016/j.febslet.2007.11.052] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.5] [Received: 08/13/2007] [Revised: 11/14/2007] [Accepted: 11/16/2007] [Indexed: 11/19/2022]
|
8
|
Bag SK, Paul S, Ghosh S, Dutta C. Reverse polarization in amino acid and nucleotide substitution patterns between human-mouse orthologs of two compositional extrema. DNA Res 2007; 14:141-54. [PMID: 17895298 PMCID: PMC2533592 DOI: 10.1093/dnares/dsm015] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/14/2022] Open
Abstract
Genome-wide analysis of sequence divergence patterns in 12,024 human-mouse orthologous pairs reveals, for the first time, that the trends in nucleotide and amino acid substitutions in orthologs of high and low GC composition are highly asymmetric and polarized to opposite directions. The entire dataset has been divided into three groups on the basis of the GC content at third codon sites of human genes: high, medium, and low. High-GC orthologs exhibit significant bias in favor of the replacements Thr → Ala, Ser → Ala, Val → Ala, Lys → Arg, Asn → Ser, Ile → Val etc., from mouse to human, whereas in low-GC orthologs, the reverse trends prevail. In general, in the high-GC group, residues encoded by A/U-rich codons of mouse proteins tend to be replaced by the residues encoded by relatively G/C-rich codons in their human orthologs, whereas the opposite trend is observed among the low-GC orthologous pairs. The medium-GC group shares some trends with the high-GC group and some with the low-GC group. The only significant trend common to all groups of orthologs, irrespective of their GC bias, is the Asp (mouse) → Glu (human) replacement. At the nucleotide level, high-GC orthologs have undergone a large excess of A/T (mouse) → G/C (human) substitutions over G/C (mouse) → A/T (human) at each codon position, whereas for low-GC orthologs, the reverse is true.
Affiliation(s)
- Sumit K. Bag
  - Bioinformatics Centre, Indian Institute of Chemical Biology, Kolkata 700 032, India
- Sandip Paul
  - Bioinformatics Centre, Indian Institute of Chemical Biology, Kolkata 700 032, India
- Subhagata Ghosh
  - Structural Biology and Bioinformatics Division, Indian Institute of Chemical Biology, 4, Raja S. C. Mullick Road, Kolkata 700 032, India
- Chitra Dutta
  - Bioinformatics Centre, Indian Institute of Chemical Biology, Kolkata 700 032, India
  - Structural Biology and Bioinformatics Division, Indian Institute of Chemical Biology, 4, Raja S. C. Mullick Road, Kolkata 700 032, India
  - To whom correspondence should be addressed. Tel. +91 33-2473-3491. Fax. +91 33-2473-0284. E-mail:
|
9
|
Abstract
Several studies of nucleotide substitution patterns in mammalian species suggested that GC-rich isochores might be vanishing in mammalian genomes. However, the number of genes and the number of genomes included in these studies might not have given a reliable broad view of the trend in GC change in mammals. It is therefore worth exploring this issue with a broader coverage of mammalian genomes using a reliable approach, the maximum likelihood approach. We have applied two maximum likelihood methods to infer the ancestral GC contents of 176 mammalian genes from representative eutherian species and at least one marsupial species. Except for a large GC decrease in marsupial genes, we found no general decreasing trend in GC content in GC-rich genes or in other genes among eutherian mammals; indeed, the GC content of GC-rich genes appears to have increased in recent times in some genomes, e.g., the rabbit. The large GC decrease in marsupials could be mainly due to the great reduction in chromosome number, which could lead to a large reduction in recombination rate and thus also a large reduction in the rate of gene conversion. Since many eutherian mammals still maintain a fairly large number of chromosomes, it is unlikely that GC-rich isochores are vanishing in these mammals.
Affiliation(s)
- Jianying Gu
  - Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
|
10
|
Fortes GG, Bouza C, Martínez P, Sánchez L. Diversity in isochore structure among cold-blooded vertebrates based on GC content of coding and non-coding sequences. Genetica 2006; 129:281-9. [PMID: 16897446 DOI: 10.1007/s10709-006-0009-2] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 07/01/2005] [Accepted: 04/19/2006] [Indexed: 11/29/2022]
Abstract
To review the general view of the different compositional structure of warm- and cold-blooded vertebrate genomes, we made use of the increasing number of genetic sequences, including coding (exons) and non-coding (introns) regions, that have been deposited in the databases in recent years. The nucleotide distributions of the third codon positions (GC3) have been analyzed in 1510 coding sequences (CDS) of fish, 1414 CDS of amphibians and 320 CDS of reptiles. Also, the relationship between the GC content of 74, 56 and 25 CDS of fish, amphibians and reptiles, respectively, and that of their corresponding introns (GCI) has been considered. In accordance with recent data, sequence analysis showed the presence of very GC3-rich CDS in these poikilotherm vertebrates. However, very high diversity in compositional patterns among different orders of fish, amphibians and reptiles was found. A significant positive correlation between GC3 and GCI was also confirmed for the genes analyzed. Nevertheless, introns proved to be poorer in GC than their corresponding CDS, this difference being larger than in the human genome. Because of the limited number of available sequences including exons and introns, we must be cautious about the results derived from them. However, the indication of higher GC richness of coding sequences than of their corresponding introns could help to explain the discrepancy between sequence analysis and the ultracentrifugation studies in cold-blooded vertebrates, which did not predict the existence of GC-rich isochores.
Affiliation(s)
- Gloria G Fortes
  - Departamento de Genética, Facultad de Veterinaria, Universidad de Santiago de Compostela, Lugo, Spain
|
11
|
Belle EMS, Duret L, Galtier N, Eyre-Walker A. The decline of isochores in mammals: an assessment of the GC content variation along the mammalian phylogeny. J Mol Evol 2004; 58:653-60. [PMID: 15461422 DOI: 10.1007/s00239-004-2587-x] [Citation(s) in RCA: 58] [Impact Index Per Article: 2.9] [Indexed: 11/30/2022]
Abstract
Whether isochores, the large-scale variation of the GC content in mammalian genomes, are being maintained has recently been questioned. It has been suggested that GC-rich isochores originated in the ancestral amniote genome but that whatever force gave rise to them is no longer effective and that isochores are now disappearing from mammalian genomes. Here we investigated the evolution of the GC content of 41 coding genes in 6 to 66 species of mammals by estimating the ancestral GC content using a method which allows for different rates of substitution between sites. We found a highly significant decrease in the GC content during early mammalian evolution, as well as a weaker but still significant decrease in the GC content of GC-rich genes later in at least three groups of mammals: primates, rodents, and carnivores. These results are of interest because they confirm the recently suggested disappearance of GC-rich isochores in some mammalian genomes, and more importantly, they suggest that this disappearance started very early in mammalian evolution.
Affiliation(s)
- Elise M S Belle
  - Centre for the Study of Evolution-School of Life Sciences, University of Sussex, Brighton BN1 9QG, UK.
|
12
|
Jabbari K, Bernardi G. Comparative genomics of Anopheles gambiae and Drosophila melanogaster. Gene 2004; 333:183-6. [PMID: 15177694 DOI: 10.1016/j.gene.2004.02.038] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Received: 12/17/2003] [Accepted: 02/10/2004] [Indexed: 10/26/2022]
Abstract
A sequence analysis of the genomes of Anopheles gambiae and Drosophila melanogaster reveals that Anopheles DNA is more heterogeneous and GC-richer than Drosophila DNA. The gene concentration across the Anopheles genome is characterized by low levels in the GC-poor part of the genome and a 3-fold increase in the GC-richest part; this gene density gradient is approximately half that of Drosophila. GC levels of introns and flanking sequences are correlated with GC(3) values (GC levels of third codon positions) of the corresponding genes with slopes much lower than unity; in other words, most introns and intergenic sequences are less GC-rich than the corresponding GC(3) values. These findings, which describe a compositional shift within Diptera, are of interest because of their parallels with the well-studied major shift in vertebrates.
Affiliation(s)
- Kamel Jabbari
  - Laboratoire de Génétique Moléculaire, Institut Jacques Monod, 2 Place Jussieu, F-75005 Paris, France
|
13
|
Duret L, Semon M, Piganeau G, Mouchiroud D, Galtier N. Vanishing GC-rich isochores in mammalian genomes. Genetics 2002; 162:1837-47. [PMID: 12524353 PMCID: PMC1462357 DOI: 10.1093/genetics/162.4.1837] [Citation(s) in RCA: 123] [Impact Index Per Article: 5.6] [Indexed: 11/14/2022] Open
Abstract
To understand the origin and evolution of isochores (the peculiar spatial distribution of GC content within mammalian genomes), we analyzed the synonymous substitution pattern in coding sequences from closely related species in different mammalian orders. In primates and cetartiodactyls, GC-rich genes are undergoing a large excess of GC → AT substitutions over AT → GC substitutions: GC-rich isochores are slowly disappearing from the genome of these two mammalian orders. In rodents, our analyses suggest both a decrease in GC content of GC-rich isochores and an increase in GC-poor isochores, but more data will be necessary to assess the significance of this pattern. These observations question the conclusions of previous works that assumed that base composition was at equilibrium. Analysis of allele frequency in human polymorphism data, however, confirmed that in the GC-rich parts of the genome, GC alleles have a higher probability of fixation than AT alleles. This fixation bias appears not strong enough to overcome the large excess of GC → AT mutations. Thus, whatever the evolutionary force (neutral or selective) at the origin of GC-rich isochores, this force is no longer effective in mammals. We propose a model based on the biased gene conversion hypothesis that accounts for the origin of GC-rich isochores in the ancestral amniote genome and for their decline in present-day mammals.
Affiliation(s)
- Laurent Duret
  - Laboratoire de Biométrie et Biologie Evolutive, UMR CNRS 5558 Université Claude Bernard Lyon 1, 69622 Villeurbanne Cedex, France.
|
14
|
D'Onofrio G, Ghosh TC, Bernardi G. The base composition of the genes is correlated with the secondary structures of the encoded proteins. Gene 2002; 300:179-87. [PMID: 12468099 DOI: 10.1016/s0378-1119(02)01045-4] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.0] [Indexed: 10/27/2022]
Abstract
The analysis of a non-redundant set of human proteins, for which both the crystallographic structures and the corresponding gene sequences are available, shows that bases at third codon position are non-uniformly distributed along the coding sequences. Significant compositional differences are found by comparing the gene regions corresponding to the different secondary structures of the proteins. Inter- and intra-structure differences were most pronounced in the GC-richest genes. These results are not compatible with any proposed hypotheses based on a neutral process of formation/maintenance of the high GC(3) levels of the genes localized in the GC-richest isochores of the human genome.
Affiliation(s)
- Giuseppe D'Onofrio
  - Laboratorio di Evoluzione Molecolare, Stazione Zoologica A. Dohrn, Naples, Italy.
|
15
|
Abstract
Within-intron difference of correlation with base composition of the adjacent exons was studied in the genomes of 34 species. For this purpose, GC-percent was determined for segments of 50 bp in length taken at both intron margins and in the internal part of the intron. It was found that in certain genomes the coefficient of correlation with GC-percent of the adjacent exon was significantly higher for the intron margin than for the internal part of the intron (homeotherms, cereals). Only part of this difference can be explained by unequal probability of insertion of transposable elements. Those multicellular organisms which have a low or no within-intron difference in correlation with the adjacent exons (anamniotes, invertebrates, dicots) show a higher local compositional heterogeneity (a greater exon/intron contrast in the GC-content). These results are evidence against the mutational bias being a possible explanation for the compositional genome heterogeneity. Thus, in the genomes with a high global heterogeneity there seems to be a selective force for compliance of intron base composition with the adjacent exons. This force is stronger in those parts of the intron that are closer to exons. In addition, the previously found positive general correlation between the genome size and average intron length was confirmed with a much larger dataset. However, within separate phylogenetic groups this rule can be broken, as it occurs in the cereals (family Poaceae), where a negative correlation was found.
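The windowing step described in this abstract can be sketched as follows. This is only an illustration: the helper names are hypothetical, and taking the internal segment from the exact center of the intron is an assumption, since the abstract says only that a 50 bp segment was taken "in the internal part of the intron".

```python
def gc_percent(seq: str) -> float:
    """GC-percent of a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def intron_segment_gc(intron: str, window: int = 50) -> dict:
    """GC-percent of the two intron margins and of one internal window.

    Assumption: the internal window is centered on the intron midpoint.
    """
    if len(intron) < 3 * window:
        raise ValueError("intron too short for three non-overlapping windows")
    mid_start = (len(intron) - window) // 2
    return {
        "five_prime_margin": gc_percent(intron[:window]),
        "internal": gc_percent(intron[mid_start:mid_start + window]),
        "three_prime_margin": gc_percent(intron[-window:]),
    }
```

Each of the three values could then be correlated with the GC-percent of the adjacent exon across many introns, which is the comparison the abstract reports.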
Affiliation(s)
- A E Vinogradov
  - Institute of Cytology, Russian Academy of Sciences, Tikhoretsky Avenue 4, 194064, St. Petersburg, Russia.
|
16
|
Abstract
The nuclear genomes of vertebrates are mosaics of isochores, very long stretches (>>300kb) of DNA that are homogeneous in base composition and are compositionally correlated with the coding sequences that they embed. Isochores can be partitioned in a small number of families that cover a range of GC levels (GC is the molar ratio of guanine+cytosine in DNA), which is narrow in cold-blooded vertebrates, but broad in warm-blooded vertebrates. This difference is essentially due to the fact that the GC-richest 10-15% of the genomes of the ancestors of mammals and birds underwent two independent compositional transitions characterized by strong increases in GC levels. The similarity of isochore patterns across mammalian orders, on the one hand, and across avian orders, on the other, indicates that these higher GC levels were then maintained, at least since the appearance of ancestors of warm-blooded vertebrates. After a brief review of our current knowledge on the organization of the vertebrate genome, evidence will be presented here in favor of the idea that the generation and maintenance of the GC-richest isochores in the genomes of warm-blooded vertebrates were due to natural selection.
Affiliation(s)
- G Bernardi
  - Laboratorio di Evoluzione Molecolare, Stazione Zoologica Anton Dohrn, Napoli, Italy.
|
17
|
Majumdar S, Gupta SK, Sundararajan VS, Ghosh TC. Compositional correlation studies among the three different codon positions in 12 bacterial genomes. Biochem Biophys Res Commun 1999; 266:66-71. [PMID: 10581166 DOI: 10.1006/bbrc.1999.1774] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.0] [Indexed: 11/22/2022]
Abstract
Compositional distributions in the three codon positions of the coding sequences of 12 fully sequenced prokaryotic genomes, which are publicly available, were investigated. A universal compositional correlation was observed in most of the genomes under investigation irrespective of their overall genomic GC contents. In all the genomes, the GC contents at the first codon positions are always greater than the overall GC contents of the genomes whereas the reverse is true in the case of second codon positions. GC contents at the third codon positions are higher than the overall genomic GC contents in high GC containing genomes, and the opposite situation was found in case of low GC genomes except for Helicobacter pylori. In high-GC rich genomes, the GC contents at the first + second codon positions are less than the GC contents at the third codon positions, and they are low in low-GC genomes except for Helicobacter pylori. The distributions of four bases at the three different positions were also investigated for all 12 organisms. It was observed that in high-GC genomes G is the most dominant base and in low-GC genomes A is the most dominant base in the first codon positions. But purine bases, i.e., (A + G), predominantly occur in the first codon position. In the second codon position, A is the most dominant base in most of the organisms and G is the least dominant base in all the organisms. There is no unique regular pattern of individual bases at the third codon positions; however, there are significant differences in the occurrences of (G + C) contents in the third codon positions among the different organisms. Calculations of dinucleotide frequencies in 12 different organisms indicate that in GC-rich genomes GG, GC, CC, and CG dinucleotides are the most dominant whereas the reverse is true in case of low-GC genomes. Biological implications of these results are discussed in this paper.
Affiliation(s)
- S Majumdar
- Distributed Information Centre, Bose Institute, P 1/12, C.I.T. Scheme, VII M Calcutta, 700 054, India
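The per-position GC contents compared in the abstract above (first, second and third codon positions) can be computed along the following lines. This is a minimal sketch, not the authors' procedure: it assumes the input is an in-frame coding sequence starting at the first codon position, ignores any trailing incomplete codon, and counts ambiguous bases as non-GC. The function name `codon_position_gc` is hypothetical.

```python
def codon_position_gc(cds: str) -> tuple:
    """Return (GC1, GC2, GC3) as percentages for an in-frame coding sequence.

    Assumptions: the sequence starts at codon position 1; a trailing
    incomplete codon is dropped; any base other than G or C counts as non-GC.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # length covered by complete codons
    result = []
    for pos in range(3):
        bases = cds[pos:usable:3]  # every third base, starting at this position
        gc = sum(1 for b in bases if b in "GC")
        result.append(100.0 * gc / len(bases) if bases else float("nan"))
    return tuple(result)
```

Pooling these values over all coding sequences of a genome and comparing them with the overall genomic GC content would reproduce the kind of comparison the abstract describes.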
18
D'Onofrio G, Jabbari K, Musto H, Bernardi G. The correlation of protein hydropathy with the base composition of coding sequences. Gene 1999; 238:3-14. [PMID: 10570978 DOI: 10.1016/s0378-1119(99)00257-7] [Citation(s) in RCA: 71] [Impact Index Per Article: 2.8] [Indexed: 11/20/2022]
Abstract
The "universal correlation" (D'Onofrio, G., Bernardi, G., 1992. A universal compositional correlation among codon positions. Gene 110, 81-88.) that holds between <GC3> and <GC1> or <GC2> (<GC> values are the average values of the coding sequences of each genome analyzed) at both the inter- and intra-genomic level, was re-analyzed on a vastly larger dataset. The results showed a slight, but significant, difference in the <GC3> vs. <GC1> correlations exhibited by prokaryotes and eukaryotes. This finding prompted an analysis of the correlation between <GC3> and the amino acid frequencies in the encoded proteins, which has shown that positive correlations exist between <GC3> values of coding sequences and the hydropathy of the corresponding proteins. These correlations are due to the fact that hydrophobic and amphipathic amino acids increase, whereas hydrophilic amino acids decrease with increasing <GC3> values. Hydropathy values of prokaryotic proteins are systematically higher than those of eukaryotes, but the slopes of the regression lines are identical. The lower hydrophobicity of eukaryotic proteins is due to differences in the amino acid composition. In particular, the twofold higher cysteine (and disulfide bond) level of eukaryotic proteins compared to prokaryotic proteins most probably compensates for their lower hydrophobicity. This supports the viewpoint that hydrophobicity plays a structural and functional role as far as protein stability is concerned.
Affiliation(s)
- G D'Onofrio
- Laboratorio di Evoluzione Molecolare, Stazione Zoologica Anton Dohrn, Napoli, Italy
19
Eyre-Walker A. Evidence of selection on silent site base composition in mammals: potential implications for the evolution of isochores and junk DNA. Genetics 1999; 152:675-83. [PMID: 10353909 PMCID: PMC1460637 DOI: 10.1093/genetics/152.2.675] [Citation(s) in RCA: 133] [Impact Index Per Article: 5.3] [Indexed: 11/12/2022] Open
Abstract
It has been suggested that mutation bias is the major determinant of base composition bias at synonymous, intron, and flanking DNA sites in mammals. Here I test this hypothesis using population genetic data from the major histocompatibility genes of several mammalian species. The results of two tests are inconsistent with the mutation hypothesis in coding, noncoding, CpG-island, and non-CpG-island DNA, but are consistent with selection or biased gene conversion. It is argued that biased gene conversion is unlikely to affect silent site base composition in mammals. The results therefore suggest that selection is acting upon silent site G + C content. This may have broad implications, since silent site base composition reflects large-scale variation in G + C content along mammalian chromosomes. The results therefore suggest that selection may be acting upon the base composition of isochores and large sections of junk DNA.
Affiliation(s)
- A Eyre-Walker
- Centre for the Study of Evolution and School of Biological Sciences, University of Sussex, Brighton, BN1 9QG, United Kingdom.
20
Biunno I, Rogozin IB, Appierto V, Milanesi L, Mostardini M, Mumm S, Pergolizzi R, Zucchi I, De Bellis G. Sequence and gene content in 35 kb genomic clone mapping in the human Xq27.1 region. DNA Sequence: The Journal of DNA Sequencing and Mapping 1998; 8:1-15. [PMID: 9522116 DOI: 10.3109/10425179709020880] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 02/06/2023]
Abstract
This paper presents detailed analysis of the entire sequence of a cosmid clone, 26H7, containing 35 kb of human DNA. This cosmid resides on the q27.1 region of the human X chromosome between the DXS1232 and DXS119 loci. Novel potential small exons were detected for which conventional gene identification strategies (Northern blot analysis and extensive cDNA library screening) proved to be inefficient. Of the standard repetitive elements we found: 8 Alu's making up 6.2% of the sequence; 10 MIR segments (4.1%); 5 LINE1 elements (4.8%), 3 MIR2 (1.0%); 2 MLT (2.9%), and 1 MSTA (0.7%) representing about 20% of the total sequence. The overall GC content was rather low, only 42% and no CpG island was detected using rare restriction enzymes. However, a CpG-rich region was identified. Computer aided analysis of the sequence inferred the presence of three possible genes: one of them was found to be homologous to the U7 RNA family elements; a second is reported in this paper, however at the moment no significant homology has been found in the data bank. The third predicted gene has not as yet been found to be detectable by RT-PCR. We also report in this paper the identification of X-chromosome specific repeated sequences.
Affiliation(s)
- I Biunno
- Consiglio Nazionale delle Ricerche, Istituto Tecnologie Biomediche Avanzate, Milano, Italy
21
Chiapello H, Lisacek F, Caboche M, Hénaut A. Codon usage and gene function are related in sequences of Arabidopsis thaliana. Gene 1998; 209:GC1-GC38. [PMID: 9583944 DOI: 10.1016/s0378-1119(97)00671-9] [Citation(s) in RCA: 126] [Impact Index Per Article: 4.8] [Indexed: 02/07/2023]
Abstract
In this paper, the relationship between codon usage and the physiological pattern of expression of a gene is investigated while considering a dataset of 815 nuclear genes of Arabidopsis thaliana. Factorial Correspondence Analysis, a commonly used multivariate statistical approach in codon usage analysis, was used in order to analyse codon usage bias gene by gene. The analysis reveals a single major trend in codon usage among genes in Arabidopsis. At one end of the trend lie genes with a highly G/C biased codon usage. This group contains mainly photosynthetic and housekeeping genes which are known to encode the most abundant proteins of the vegetal cell. At the other extreme lie genes with a weaker A/T-biased codon usage. This group contains genes with various functions which most of the time exhibit a strong tissue-specific pattern of expression in relation, for example, to stress conditions. These observations were confirmed by the detailed analysis of codon usage in the multigene family of tubulins and appear to be general in plant species, even as distant from Arabidopsis thaliana as a monocotyledonous plant such as maize.
Affiliation(s)
- H Chiapello
- Laboratoire de Biologie Cellulaire, INRA, Cedex, France
22
Bernardi G, Hughes S, Mouchiroud D. The major compositional transitions in the vertebrate genome. J Mol Evol 1997; 44 Suppl 1:S44-51. [PMID: 9071011 DOI: 10.1007/pl00000051] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.3] [Indexed: 02/04/2023]
Abstract
The vertebrate genome underwent two major compositional transitions, between therapsids and mammals and between dinosaurs and birds. These transitions concerned a sizable part (roughly one-third) of the genome, the gene-richest part of it, and consisted in an increase in GC levels (GC is the molar fraction of guanine + cytosine in DNA) which affected both coding sequences (especially third codon positions) and noncoding sequences. These major transitions were studied here by comparing GC3 levels (GC3 is the GC of third codon positions) of orthologous genes from Xenopus, chicken, calf, and man.
Affiliation(s)
- G Bernardi
- Laboratoire de Genetique Moleculaire, Institut Jacques Monod, Paris, France
23
Abstract
Recent successes in developing transcriptional maps of large genomic regions provide excellent opportunities for the investigation of mammalian genome organization. Detailed definition of organizational features will, in the short term, aid in prioritizing genomic sequencing efforts and in interpreting sequencing results and, in the long term, will surely provide insights into the structural, functional and evolutionary basis for the mammalian chromosome and chromosomal banding patterns. For such efforts, human chromosome 21 provides an excellent model system because the physical and clone maps are detailed, and several transcriptional mapping projects have provided large numbers of novel genes. It is, therefore, valuable at this point to examine these transcriptional mapping data and to compare them with the isochore model of the mammalian genome, which describes patterns in base composition and predicts gene distributions. Not only do compelling organizational patterns appear, but new questions about additional possible patterns in gene size, structure, conservation and transcription can be asked.
Affiliation(s)
- K Gardiner
- Eleanor Roosevelt Institute, Denver, CO 80206-1210, USA
24
Saccone S, Cacciò S, Kusuda J, Andreozzi L, Bernardi G. Identification of the gene-richest bands in human chromosomes. Gene 1996; 174:85-94. [PMID: 8863733 DOI: 10.1016/0378-1119(96)00392-7] [Citation(s) in RCA: 73] [Impact Index Per Article: 2.6] [Indexed: 02/02/2023]
Abstract
The human genome is a mosaic of isochores, long DNA segments which are compositionally homogeneous and which can be partitioned into five families, L1, L2, H1, H2 and H3, characterized by increasing GC levels and by increasing gene concentrations. Previous investigations showed that in situ hybridization with a DNA fraction derived from the GC-richest and gene-richest isochores of the H3 family produced the highest concentration of signals on 25 R(everse) bands that include the 22 most thermal-denaturation-resistant T(elomeric) bands, a subset of R bands. Using an improved protocol for in situ hybridization and cloned H3 isochore DNA, we have now shown (i) that the number of bands which are characterized by strong hybridization signals, and which are here called T or H3+, is 28; (ii) that 31 additional R bands, here called T' or H3* bands, also contain H3 isochores, although at a lower concentration than H3+ bands; and (iii) that the remaining R bands (about 140 out of 200, at a resolution of 400 bands), here called R" or H3- bands, do not contain any detectable H3 isochores. H3+ and H3* bands contain all the gene-richest isochores of the human genome. The existence of three distinct sets of R bands is further supported (i) by the different compositional features of genes located in them; (ii) by the very low gene density of chromosomes 13 and 18, in which all R bands are H3- bands; (iii) by the compositional map of a H3* band, Xq28; (iv) by the overwhelming presence of GC-rich and GC-poor long (> 50 kb) DNA sequences in H3+/H3* and in H3-/G bands, respectively; and (v) by the large degree of coincidence of H3+ and H3* bands with CpG island-positive bands. These observations have implications for our understanding of the causes of chromosome banding and provide a classification of chromosomal bands that is related to GC level (and to gene concentration).
Affiliation(s)
- S Saccone
- Laboratoire de Génétique Moléculaire, Institut Jacques Monod, Paris, France
25
Abstract
Linear correlations exist between the GC levels of third codon positions (GC3) of individual human genes and the GC levels of long genomic sequences and DNA molecules (50-100 kb in size) embedding the genes. These linear relationships allow the positioning of the GC3 histogram of cDNA sequences from the databases relative to the CsCl profile of human DNA. In turn, this allows an estimate of the relative concentrations of genes in genomic regions of different GC content. An estimate obtained by using current sequence data and Gaussian decompositions of the GC3 histogram and of the CsCl profile indicates that the GC-richest (non-ribosomal) component of the human genome is at least 17 times as gene-rich as the GC-poor regions. Moreover, our results suggest that the most recent physical maps of the human genome consisting of overlapping YACs cover less than 50% of the genes.
Affiliation(s)
- S Zoubak
- Laboratoire de Génétique Moléculaire, Institut Jacques Monod, Paris, France
26
De Sario A, Geigl EM, Palmieri G, D'Urso M, Bernardi G. A compositional map of human chromosome band Xq28. Proc Natl Acad Sci U S A 1996; 93:1298-302. [PMID: 8577758 PMCID: PMC40074 DOI: 10.1073/pnas.93.3.1298] [Citation(s) in RCA: 26] [Impact Index Per Article: 0.9] [Indexed: 01/31/2023] Open
Abstract
The molar fractions of guanine plus cytosine (GC) in DNA were determined for 36 yeast artificial chromosomes (YACs) which almost completely cover human chromosome band Xq28, a terminal reverse band, corresponding to about 8 Mb of DNA. This allowed the construction of the most complete compositional map to date of a chromosomal band; three regions were observed: (i) a proximal 3.5-Mb region formed by GC-poor L and GC-rich H1 isochores; (ii) a middle 2.2-Mb region essentially formed by a GC-rich H2 isochore and a very GC-rich H3 isochore separated by a GC-poor L isochore, YACs from this region being characterized by a striking compositional heterogeneity and instability; and (iii) a distal 1.3-Mb region exclusively formed by GC-poor L isochores. Gene and CpG island concentrations increased with the GC levels of the isochores, as expected. Xq28 exemplifies a subset of reverse bands which are different from the two other subsets, namely from telomeric bands, which are characterized by specific cytogenetic properties and by the predominance of H2 and H3 isochores, and from the majority of reverse bands, which do not contain H2 and H3 isochores.
Affiliation(s)
- A De Sario
- Laboratoire de Génétique Moléculaire, Institut Jacques Monod, Paris, France
27
Musto H, Rodríguez-Maseda H, Alvarez F. Compositional correlations in the nuclear genes of the flatworm Schistosoma mansoni. J Mol Evol 1995; 40:343-6. [PMID: 7723062 DOI: 10.1007/bf00163240] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 01/26/2023]
Abstract
We have investigated the genome organization in the flatworm Schistosoma mansoni. First, we analyzed the compositional distributions of the three codon positions. Second, we investigated the correlations that exist between (1) the GC levels of exons against flanking regions, (2) the GC levels of third codon positions against flanking regions, (3) the dinucleotide frequencies of exons against flanking regions, and (4) the GC levels of 5' against 3' regions. The modality of the distribution of third codon positions, together with the significant correlations found, leads us to propose that the nuclear genome of this species is compositionally compartmentalized.
Affiliation(s)
- H Musto
- Sección Bioquímica, Instituto de Biología, Facultad de Ciencias, Montevideo, Uruguay
28
Mouchiroud D, Gautier C, Bernardi G. Frequencies of synonymous substitutions in mammals are gene-specific and correlated with frequencies of nonsynonymous substitutions. J Mol Evol 1995; 40:107-13. [PMID: 7714909 DOI: 10.1007/bf00166602] [Citation(s) in RCA: 66] [Impact Index Per Article: 2.3] [Indexed: 01/26/2023]
Abstract
The frequencies of synonymous substitutions of mammalian genes cover a much wider range than previously thought. We report here that the different frequencies found in homologous genes from a given mammalian pair are correlated with those in the same homologous genes from a different mammalian pair. This indicates that the frequencies of synonymous substitutions are gene-specific (as are the frequencies of nonsynonymous substitutions), or, in other words, that "fast" and "slow" genes in one mammal are fast and slow, respectively, in any other one. Moreover, the frequencies of synonymous substitutions are correlated with the frequencies of nonsynonymous substitution in the same genes.
Affiliation(s)
- D Mouchiroud
- Laboratoire de Biométrie, Génétique et Biologie des Populations, U.R.A. 243, Université Claude Bernard, Villeurbanne, France
29
Rodríguez-Maseda H, Musto H. The compositional compartments of the nuclear genomes of Trypanosoma brucei and T. cruzi. Gene 1994; 151:221-4. [PMID: 7828878 DOI: 10.1016/0378-1119(94)90660-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 01/27/2023]
Abstract
Fractionation of DNA from Trypanosoma brucei and T. equiperdum by centrifugation in a Cs2SO4/BAMD density gradient indicated that these genomes are compositionally compartmentalized, a conclusion confirmed by the analysis of the compositional distribution of third codon positions from T. brucei and T. cruzi. In order to investigate whether this compartmentalization is accompanied by the often different properties of coding sequences, we have analyzed and compared the compositional compartments with respect to dinucleotide frequency and amino-acid usage of the encoded proteins of all gene sequences available in the GenBank database from T. brucei and T. cruzi. In all cases, the compartments displayed remarkable differences. These results are similar to findings obtained in highly compartmentalized genomes, like those of warm-blooded vertebrates.
Affiliation(s)
- H Rodríguez-Maseda
- Laboratoire de Génétique Moléculaire, Institut Jacques Monod, Paris, France
30
Cacciò S, Perani P, Saccone S, Kadi F, Bernardi G. Single-copy sequence homology among the GC-richest isochores of the genomes from warm-blooded vertebrates. J Mol Evol 1994; 39:331-9. [PMID: 7966363 DOI: 10.1007/bf00160265] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.6] [Indexed: 01/28/2023]
Abstract
We have hybridized a human DNA fraction corresponding to the GC-richest and gene-richest isochore family, H3, on compositional fractions of DNAs from 12 mammalian species and three avian species, representing eight and three orders, respectively. Under conditions in which repetitive sequences are competed out, the H3 isochore probe only or predominantly hybridized on the GC-richest fractions of main-band DNA from all the species investigated. These results indicate that single-copy sequences from the human H3 isochores share homology with sequences located in the compositionally corresponding compartments of the vertebrate genomes tested. These sequences are likely to be essentially formed by conserved coding sequences. The present results add to other lines of evidence indicating that isochore patterns are highly conserved in warm-blooded vertebrate genomes. Moreover, they refine recent reports (Sabeur et al., 1993; Kadi et al., 1993), and correct them in some details and also in demonstrating that the shrew genome does not exhibit the general mammalian pattern, but a special pattern.
Affiliation(s)
- S Cacciò
- Laboratoire de Génétique Moléculaire, Institut Jacques Monod, Paris, France
31
Zoubak S, Richardson JH, Rynditch A, Höllsberg P, Hafler DA, Boeri E, Lever AM, Bernardi G. Regional specificity of HTLV-I proviral integration in the human genome. Gene 1994; 143:155-63. [PMID: 8206368 DOI: 10.1016/0378-1119(94)90091-4] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.1] [Indexed: 01/29/2023] Open
Abstract
The location of HTLV-I (human T-cell leukemia virus type 1) proviral sequences in the genome of infected human cells was explored by hybridization of a viral probe with compositional fractions of host-cell DNAs. In the twelve cases examined, HTLV-I sequences were absent from the GC-poorest 40% of the host genome (namely, from isochores that are below 39% GC). Transcriptionally inactive proviral sequences were localized in GC-poor isochores (comprised between 39% and 42-44% GC) of the human genome, which are characterized by a constant and low gene concentration. In contrast, transcriptionally active proviral sequences were found in the GC-rich and very GC-rich isochores, which are gene rich, transcriptionally and recombinationally active, and endowed with an open chromatin structure. Since GC-rich isochores are present in R'-bands and very GC-rich isochores form T-bands, these results also provide information on the location of HTLV-I proviral sequences in human chromosomes. The results obtained with HTLV-I are in agreement with the non-random, compartmentalized integration of animal retroviral sequences that had been previously observed in other viral-host systems. They provide, however, much more detailed information on the regional location of proviral sequences in the host genome and on the correlation between their transcription and their location.
Affiliation(s)
- S Zoubak
- Laboratoire de Génétique Moléculaire, Institut Jacques Monod, Paris, France
32
Berkhout B, van Hemert FJ. The unusual nucleotide content of the HIV RNA genome results in a biased amino acid composition of HIV proteins. Nucleic Acids Res 1994; 22:1705-11. [PMID: 8202375 PMCID: PMC308053 DOI: 10.1093/nar/22.9.1705] [Citation(s) in RCA: 71] [Impact Index Per Article: 2.4] [Indexed: 01/29/2023] Open
Abstract
Extremely high frequencies of the A nucleotide are found in the RNA genomes of the lentivirus group of retroviruses. It is presently unknown what molecular force is responsible for this A-pressure. In this manuscript, we demonstrate a correlation between this 'A-pressure' and the amino acid-usage of the lentivirus family. We compared the amino acid composition of the Gag and Pol proteins of the human immunodeficiency viruses type 1 and 2 (HIV-1 and HIV-2) with that of the second group of human retroviruses; the human T-cell leukemia viruses type I and II (HTLV-I and HTLV-II). Differences in total amino acid content correlate with the preference for A-rich codons in the HIV genome. A pair-wise comparison of homologous amino acid positions in the Pol proteins indicates that both conservative and non-conservative changes can be accounted for by this A-bias. The putative molecular mechanism underlying this A-pressure and the evolutionary consequences are discussed.
Affiliation(s)
- B Berkhout
- Department of Virology, University of Amsterdam, The Netherlands
33
Sabeur G, Macaya G, Kadi F, Bernardi G. The isochore patterns of mammalian genomes and their phylogenetic implications. J Mol Evol 1993; 37:93-108. [PMID: 8411213 DOI: 10.1007/bf02407344] [Citation(s) in RCA: 54] [Impact Index Per Article: 1.7] [Indexed: 01/30/2023]
Abstract
The compositional distributions of high molecular weight DNA fragments from 20 species belonging to 9 out of the 17 eutherian orders were investigated by analytical CsCl density gradient centrifugation and by preparative fractionation in Cs2SO4/BAMD density gradients followed by analysis of the fractions in CsCl. These compositional distributions reflect those of the isochores making up the corresponding genomes. A "general distribution" was found in species belonging to eight mammalian orders. A "myomorph distribution" was found in Myomorpha, but not in the other rodent infraorders Sciuromorpha and Hystricomorpha, which share the general distribution. Two other distributions were found in a megachiropteran (but not in microchiropteran, which, again, shares the general distribution) and in pangolin (a species from the only genus of the order Pholidota), respectively. The main difference between the general distribution and all other distributions is that the former contains sizable amounts (6-10%) of GC-rich isochores (detected as DNA fragments equal to, or higher than, 1.710 g/cm3 in modal buoyant density), which are scarce, or absent, in the other distributions. This difference is remarkable because gene concentrations in mammalian genomes are paralleled by GC levels, the highest gene concentrations being present in the GC-richest isochores. The compositional distributions of mammalian genomes reported here shed light on mammalian phylogeny. Indeed, all orders investigated, with the exception of Pholidota, seem to share a common ancestor. The compositional patterns of the megachiropteran and of Myomorpha may be derived from the general pattern or have independent origins.
Affiliation(s)
- G Sabeur
- Laboratoire de Génétique Moléculaire, Institut Jacques Monod, Paris, France
34
Collins DW, Jukes TH. Relationship between G + C in silent sites of codons and amino acid composition of human proteins. J Mol Evol 1993; 36:201-13. [PMID: 8483158 DOI: 10.1007/bf00160475] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.0] [Indexed: 01/31/2023]
Abstract
We have investigated the relationship between the G + C content of silent (synonymous) sites in codons and the amino acid composition of encoded proteins for approximately 1,600 human genes. There are positive correlations between silent site G + C and the proportions of codons for Arg, Pro, Ala, Trp, His, Gln, and Leu and negative ones for Tyr, Phe, Asn, Ile, Lys, Asp, Thr, and Glu. The median proteins coded by groups of genes that differ in silent-site G + C content also differ in amino acid composition, as do some proteins coded by homologous genes. The pattern of compositional change can be largely explained by directional mutation pressure, the genetic code, and differences in the frequencies of accepted amino acid substitutions; the shifts in protein composition are likely to be selectively neutral.
Affiliation(s)
- D W Collins
- Space Sciences Laboratory, University of California, Berkeley
35
Bettecken T, Aissani B, Müller CR, Bernardi G. Compositional mapping of the human dystrophin-encoding gene. Gene 1992; 122:329-35. [PMID: 1487147 DOI: 10.1016/0378-1119(92)90222-b] [Citation(s) in RCA: 24] [Impact Index Per Article: 0.8] [Indexed: 12/27/2022]
Abstract
The genomes of warm-blooded vertebrates are mosaics of long DNA segments (> 300 kb, on the average), the isochores, homogeneous in GC levels, which belong to a small number of compositional families. In the present work, the human dystrophin-encoding gene, spanning more than 2.3 Mb in Giemsa band Xp21 (on the short arm of the X chromosome), was analyzed in its isochore organization by hybridizing cDNA probes, corresponding to eight contiguous segments of the coding sequence, on compositional fractions from human DNA. Five DNA regions of uniform (+/- 0.5%) GC content, separated by compositional discontinuities of about 2% GC, were found, so providing the first high-resolution compositional map obtained for a human genome locus and the first direct estimate of isochore size (360 kb to more than 770 kb, in the locus under consideration). One of the isochores contains 71% and another one 21% of deletion breakpoints found in patients suffering from Duchenne's and Becker's muscular dystrophies.
Affiliation(s)
- T Bettecken
- Institut für Humangenetik, Universität Würzburg, Biozentrum, Germany
36
Abstract
We present a comparative study of residue usage correlations of various organism protein sets of diverse phylogenetic species and of open reading frames of several large human viral genomes. Our correlation analysis reveals three major tendencies: (i) charge compensation reflected by the high correlation of basic with acidic residues; (ii) the positive correlations of functionally and structurally similar amino acids including many pairs of hydrophobic amino acids, all pairs of aromatic amino acids, the anionic pair (glutamate and aspartate), but not the cationic pair (lysine and arginine), moderately the hydroxyl pair (serine and threonine), the small amino acids (glycine and alanine), and many (but not all) of those having high values in the Dayhoff substitutability matrix (characteristics such as amino acid polarity or codon usage agreement, except for the wobble position, do not necessarily imply significant positive correlations); (iii) a widespread negative correlation of the aggregate strong codon group amino acids (Ala, Gly, Pro) versus the weak codon group amino acids (Lys, Ile, Tyr, Asn, Phe). Discussion and speculations relate amino acid usage correlations to protein function/structure, cellular localization, proximity in amino acid biosynthetic pathways, amino acid relative abundances, tRNA and aminoacyl synthetase availabilities, and evolutionary processes.
Affiliation(s)
- S Karlin
- Department of Mathematics, Stanford University, CA 94305
|
37
|
Eyre-Walker A. The role of DNA replication and isochores in generating mutation and silent substitution rate variance in mammals. Genet Res (Camb) 1992; 60:61-7. [PMID: 1452015 DOI: 10.1017/s0016672300030676] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 12/27/2022] Open
Abstract
It has been suggested that isochores are maintained by mutation biases, and that this leads to variation in the rate of mutation across the genome. A model of DNA replication is presented in which the probabilities of misincorporation and proofreading are affected by the composition and concentration of the free nucleotide pools. The relationship between sequence G+C content and the mutation rate is investigated. It is found that there is very little variation in the mutation rate between sequences of different G+C contents if the total concentration of the free nucleotides remains constant. However, variation in the mutation rate can be arbitrarily large if some mismatches are proofread and the total concentration of free nucleotides varies. Hence the model suggests that the maintenance of isochores by the replication of DNA in free nucleotide pools of biased composition does not lead per se to mutation rate variance. However, it is possible that changes in composition could be accompanied by changes in concentration, thus generating mutation rate variance. Furthermore, there is the possibility that germ-line selection could lead to alterations in the overall free nucleotide concentration through the cell cycle. These findings are discussed with reference to the variance in mammalian silent substitution rates.
Affiliation(s)
- A Eyre-Walker
- Institute of Cell, Animal and Population Biology, University of Edinburgh, Great Britain
|
38
|
Eyre-Walker A. Evidence that both G + C rich and G + C poor isochores are replicated early and late in the cell cycle. Nucleic Acids Res 1992; 20:1497-501. [PMID: 1579441 PMCID: PMC312229 DOI: 10.1093/nar/20.7.1497] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.0] [Indexed: 12/27/2022] Open
Abstract
Since the G + C content of a gene is correlated with that of the isochore in which it resides, and early replicating isochores are thought to be relatively G + C rich, early replicating genes should also be rich in G + C. This hypothesis is tested on a sample of 44 mammalian genes for which replication time data and sequence information are available. Early replicating genes do not appear to be more G + C rich than late replicating genes; instead, there is considerable variation in the G + C content of genes replicated during both halves of S phase. These results show that both G + C rich and G + C poor fractions of the genome are replicated early and late in the cell cycle, and suggest that isochores are not maintained by the replication of DNA sequences in compositionally biased free nucleotide pools.
Affiliation(s)
- A Eyre-Walker
- Institute of Cell, Animal and Population Biology, University of Edinburgh, UK
|
39
|
Abstract
Rates of substitution mutations in two directions, v [from an A-T or T-A nucleotide pair (AT-pair) to a G-C or C-G nucleotide pair (GC-pair)] and u [from a GC-pair to an AT-pair], are usually not the same. The net effect, v/(u + v), has previously been defined as directional mutation pressure (μD), which explains the wide interspecific variation and narrow intragenomic heterogeneity of DNA G + C content in bacteria. In this article, first, a theory of the evolution of DNA G + C content is presented that is based on the equilibrium among three components: directional mutation pressure, DNA G + C content, and selective constraints. According to this theory, consideration of both u and v as well as selective constraints is essential to explain the molecular evolution of the DNA base composition and sequence. Second, the theory of directional mutation pressure is applied to the analysis of the wide intragenomic heterogeneity of DNA G + C content in multicellular eukaryotes. The theory explains the extensive intragenomic heterogeneity of G + C content of higher eukaryotes primarily as the result of the intragenomic differences of directional mutation pressure and selective constraints rather than the result of positive selection for functional advantages of the DNA G + C content itself.
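The definition v/(u + v) in this abstract can be illustrated numerically. The following is a minimal sketch, not the author's model: it assumes strictly neutral sites (no selective constraints), constant per-generation rates, and hypothetical function names. Under those assumptions, the G + C fraction of a sequence converges to the directional mutation pressure itself.

```python
def directional_mutation_pressure(u: float, v: float) -> float:
    """Return v / (u + v), where v is the AT->GC rate and u the GC->AT rate."""
    if u < 0 or v < 0 or (u + v) == 0:
        raise ValueError("rates must be non-negative and not both zero")
    return v / (u + v)


def simulate_gc_fraction(u: float, v: float, p0: float, generations: int) -> float:
    """Iterate the G + C fraction p under mutation alone (assumed neutral sites).

    Each generation a fraction v of AT pairs becomes GC and a fraction u of
    GC pairs becomes AT, so p_next = p + v * (1 - p) - u * p.
    """
    p = p0
    for _ in range(generations):
        p = p + v * (1.0 - p) - u * p
    return p


# Illustrative rates only (not taken from the abstract).
mu_d = directional_mutation_pressure(u=0.002, v=0.001)
gc_equilibrium = simulate_gc_fraction(u=0.002, v=0.001, p0=0.5, generations=20000)
```

In this neutral sketch the starting composition `p0` does not affect the limit; selective constraints, which the abstract treats as essential, would shift the equilibrium away from v/(u + v).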
Affiliation(s)
- N Sueoka
- Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder 80309-0347
|
40
|
Abstract
We have investigated the compositional distributions of third codon positions of genes from the 16 prokaryotes and seven eukaryotes for which the largest numbers of coding sequences are available in data banks. In prokaryotes, both narrow and broad distributions were found. In eukaryotes, distributions were very broad (except for Saccharomyces cerevisiae) and remarkably different for different genomes. In low-GC genomes, third codon positions were lower in GC than first + second codon positions and trailed towards high GC; the opposite situation was found for high-GC genomes. In all genomes, first codon positions were higher in GC than second codon positions. We then investigated the compositional correlations between third and first + second codon positions in prokaryotic genomes (the 16 mentioned above plus 87 additional ones) and in genome compartments of eukaryotes. A general, common relationship was found, which also holds within the same (heterogeneous) genomes. This universal correlation is due to the fact that the relative effects of compositional constraints on different codon positions are the same, on the average, whatever the genome under consideration.
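The quantities compared in this abstract (GC levels at first, second, and third codon positions) can be computed directly from a coding sequence. The sketch below is an illustration under stated assumptions, not the authors' procedure: the function name is hypothetical, the input is assumed to be an in-frame coding sequence, and any trailing partial codon is ignored.

```python
def gc_by_codon_position(cds: str):
    """Return (GC1, GC2, GC3): the G + C fraction at each codon position.

    Assumes `cds` starts in frame; a trailing partial codon is dropped.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3
    if usable == 0:
        raise ValueError("sequence must contain at least one full codon")
    fractions = []
    for position in range(3):
        # Every third base, starting at this codon position.
        bases = seq[position:usable:3]
        gc_count = sum(1 for base in bases if base in "GC")
        fractions.append(gc_count / len(bases))
    return tuple(fractions)


# Toy sequence (ATG GCC AAG), chosen only for illustration.
gc1, gc2, gc3 = gc_by_codon_position("ATGGCCAAG")
```

Applying such a function to every gene in a genome gives one (GC1 + GC2, GC3) point per gene, which is the kind of data the correlation described above is built from.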
Affiliation(s)
- G D'Onofrio
- Laboratoire de Génétique Moléculaire, Institut Jacques Monod, Paris, France
|
41
|
Matassi G, Melis R, Macaya G, Bernardi G. Compositional bimodality of the nuclear genome of tobacco. Nucleic Acids Res 1991; 19:5561-7. [PMID: 1658735 PMCID: PMC328957 DOI: 10.1093/nar/19.20.5561] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.0] [Indexed: 12/28/2022] Open
Abstract
We have studied the compositional distribution of six genes (or small multigene families) and of one family of transposable elements, Tnt1, in DNA fractions from tobacco (Nicotiana tabacum) separated according to base composition. We have shown that gene distribution is bimodal and that such bimodality is due to the different base composition of the two parental genomes of tobacco (N. sylvestris and N. tomentosiformis) and to the different parental origin of the genes tested. These results indicate a physical separation and an absence of extensive recombination of the parental genomes, which have been together in the tobacco nucleus for a small span of their evolutionary life, and a conservation of their compositional patterns, including gene localization.
Affiliation(s)
- G Matassi
- Laboratoire de Génétique Moléculaire, Institut Jacques Monod, Paris, France
|
42
|
Abstract
We have shown that human genes associated with CpG islands increase in number as they increase in % of guanine + cytosine (GC) levels, and that most genes associated with CpG islands are located in the GC-richest compartment of the human genome. This is an independent confirmation of the concentration gradient of CpG islands (detected as HpaII tiny fragments, or HTF) which was demonstrated in the genome of warm-blooded vertebrates [Aïssani and Bernardi, Gene 106 (1991) 173-183]. We then reassessed the location of CpG islands using the data currently available and confirmed that CpG islands are most frequently located in the 5'-flanking sequences of genes and that they overlap genes to variable extents. We have shown that such extents increase with the increasing GC levels of genes, the GC-richest genes being completely included in CpG islands. Under such circumstances, we have investigated the properties of the 'extragenic' CpG islands located in the 5'-flanking segments of homologous genes from both warm- and cold-blooded vertebrates. We have confirmed that, in cold-blooded vertebrates, CpG islands are often absent; when present, they have lower GC and CpG levels; the latter attain, however, statistically expected values. Finally, we have shown that CpG doublets increase with the increasing GC of exons, introns and intergenic sequences (including 'extragenic' CpG islands) in the genomes from both warm- and cold-blooded vertebrates. The correlations found are the same for both classes of vertebrates, and are similar for exons, introns and intergenic sequences (including 'extragenic' CpG islands). The findings just outlined indicate that the origin and evolution of CpG islands in the vertebrate genome are associated with compositional transitions (GC increases) in genes and isochores.
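The abstract compares CpG levels with "statistically expected values" without stating the formula used. One common convention, assumed here purely for illustration, is the observed/expected CpG ratio: the number of CpG doublets multiplied by the sequence length, divided by the product of the C and G counts. The function name below is hypothetical.

```python
def cpg_observed_expected(seq: str) -> float:
    """Return the observed/expected CpG ratio of a DNA sequence.

    Assumed convention: (CpG count * length) / (C count * G count).
    Returns 0.0 when the sequence has no C or no G, since no CpG is possible.
    """
    s = seq.upper()
    c_count = s.count("C")
    g_count = s.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    # "CG" cannot overlap itself, so str.count finds every doublet.
    cpg_count = s.count("CG")
    return cpg_count * len(s) / (c_count * g_count)


# Toy sequences for illustration only.
depleted = cpg_observed_expected("CAGCTGCAGG")   # contains C and G but no CpG
enriched = cpg_observed_expected("CGCGCG")       # every C is followed by G
```

A ratio near 1 corresponds to CpG occurring about as often as base composition alone predicts; values well below 1 indicate CpG depletion.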
Affiliation(s)
- B Aïssani
- Laboratoire de Génétique Moléculaire, Institut Jacques Monod, Paris, France
|
43
|
Krane DE, Hartl DL, Ochman H. Rapid determination of nucleotide content and its application to the study of genome structure. Nucleic Acids Res 1991; 19:5181-5. [PMID: 1833723 PMCID: PMC328873 DOI: 10.1093/nar/19.19.5181] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.2] [Indexed: 12/29/2022] Open
Abstract
We have developed a sensitive, reliable and accurate procedure for estimating the base composition of small samples of DNAs. This method has been applied to the analysis of genomic DNAs from several sources including large regions of human DNA cloned as yeast artificial chromosomes. To determine whether the human genome is compartmentalized into large segments of homogeneous base composition, we examined the GC content of a 1.2 megabase contig spanning the cystic fibrosis gene.
Affiliation(s)
- D E Krane
- Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110
|
44
|
VanWye JD, Bronson EC, Anderson JN. Species-specific patterns of DNA bending and sequence. Nucleic Acids Res 1991; 19:5253-61. [PMID: 1923808 PMCID: PMC328884 DOI: 10.1093/nar/19.19.5253] [Citation(s) in RCA: 41] [Impact Index Per Article: 1.2] [Indexed: 12/29/2022] Open
Abstract
Nucleotide sequences in the GenEMBL database were analyzed using strategies designed to reveal species-specific patterns of DNA bending and DNA sequence. The results uncovered striking species-dependent patterns of bending with more variations among individual organisms than between prokaryotes and eukaryotes. The frequency of bent sites in sequences from different bacteria was related to genomic A + T content and this relationship was confirmed by electrophoretic analysis of genomic DNA. However, base composition was not an accurate predictor for DNA bending in eukaryotes. Sequences from C. elegans exhibited the highest frequency of bent sites in the database and the RNA polymerase II locus from the nematode was the most bent gene in GenEMBL. Bent DNA extended throughout most introns and gene flanking segments from C. elegans while exon regions lacked A-tract bending characteristics. Independent evidence for the strong bending character of this genome was provided by electrophoretic studies which revealed that a large number of the fragments from C. elegans DNA exhibited anomalous gel mobilities when compared to genomic fragments from over 20 other organisms. The prevalence of bent sites in this genome enabled us to detect selectively C. elegans sequences in a computer search of the database using as probes C. elegans introns, bending elements, and a 20 nucleotide consensus sequence for bent DNA. This approach was also used to provide additional examples of species-specific sequence patterns in eukaryotes where it was shown that (A)≥10 and (A.T)≥5 tracts are prevalent throughout the untranslated DNA of D. discoideum and P. falciparum, respectively. These results provide new insight into the organization of eukaryotic DNA because they show that species-specific patterns of simple sequences are found in introns and in other untranslated regions of the genome.
Affiliation(s)
- J D VanWye
- Department of Biological Sciences, Purdue University, West Lafayette, IN 47907
|
45
|
D'Onofrio G, Mouchiroud D, Aïssani B, Gautier C, Bernardi G. Correlations between the compositional properties of human genes, codon usage, and amino acid composition of proteins. J Mol Evol 1991; 32:504-10. [PMID: 1908021 DOI: 10.1007/bf02102652] [Citation(s) in RCA: 124] [Impact Index Per Article: 3.8] [Indexed: 12/29/2022]
Abstract
We have analyzed the correlation that exists between the GC levels of third and first or second codon position for about 1400 human coding sequences. The linear relationship that was found indicates that the large differences in GC level of third codon positions of human genes are paralleled by smaller differences in GC levels of first and second codon positions. Whereas third codon position differences correspond to very large differences in codon usage within the human genome, the first and second codon position differences correspond to smaller, yet very remarkable, differences in the amino acid composition of encoded proteins. Because GC levels of codon positions are linearly correlated with the GC levels of the isochores harboring the corresponding genes, both codon usage and amino acid composition are different for proteins encoded by genes located in isochores of different GC levels. Furthermore, we have also shown that a linear relationship with a unit slope and a correlation coefficient of 0.77 exists between GC levels of introns and exons from the 238 human genes currently available for this analysis. Introns are, however, about 5% lower in GC, on average, than exons from the same genes.
Affiliation(s)
- G D'Onofrio
- Laboratoire de Génétique Moléculaire, Institut Jacques Monod, Paris, France
|
46
|
Mouchiroud D, D'Onofrio G, Aïssani B, Macaya G, Gautier C, Bernardi G. The distribution of genes in the human genome. Gene 1991; 100:181-7. [PMID: 2055469 DOI: 10.1016/0378-1119(91)90364-h] [Citation(s) in RCA: 181] [Impact Index Per Article: 5.5] [Indexed: 12/30/2022] Open
Abstract
Previous investigations on the human genome determined: (i) the base compositions (GC levels) and the relative amounts of its isochore families; (ii) the compositional correlations (i.e., the correlations between GC levels) between third codon positions of a set of genes and the DNA fractions in which the genes were localized; and (iii) the compositional correlations between (a) third and first + second codon positions, as well as that between (b) introns and exons from the set of 'localized genes' and from all the coding sequences and genes (genomic sequences of exons + introns) available in gene banks. Here, we have shown that the correlations (iii, a and b) for 'localized genes' and genes from the bank are in full agreement, indicating that the former set is representative of the latter. We have then used the data (i) and the correlation (ii) to estimate the distribution of genes in isochore families. We have found that 34% of the genes are located in the GC-poor isochores (which represent 62% of the genome), 38% in the GC-rich isochores (31% of the genome) and 28% in the GC-richest isochores (3% of the genome). There is, therefore, a compositional gradient of gene concentration in the human genome. The gene density in the GC-richest 3% of the genome is about eight times higher than in the GC-rich 31%, and about 16 times higher than in the GC-poorest 62%.
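The density ratios quoted at the end of this abstract follow from the percentages it reports. The short sketch below reproduces that arithmetic; the variable and function names are hypothetical, and the only inputs are the rounded percentages given in the abstract, so the results are approximate.

```python
# (fraction of genes, fraction of genome) per isochore class, from the abstract.
ISOCHORE_CLASSES = {
    "GC-poor": (0.34, 0.62),
    "GC-rich": (0.38, 0.31),
    "GC-richest": (0.28, 0.03),
}


def relative_gene_density(gene_fraction: float, genome_fraction: float) -> float:
    """Gene share divided by genome share; 1.0 would mean average density."""
    return gene_fraction / genome_fraction


densities = {
    name: relative_gene_density(genes, genome)
    for name, (genes, genome) in ISOCHORE_CLASSES.items()
}

# Fold enrichment of the GC-richest class over the other two classes.
richest_vs_rich = densities["GC-richest"] / densities["GC-rich"]   # roughly 7.6
richest_vs_poor = densities["GC-richest"] / densities["GC-poor"]   # roughly 17.0
```

With these rounded inputs the ratios come out near 7.6 and 17, consistent with the "about eight times" and "about 16 times" stated in the abstract once rounding of the published percentages is taken into account.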
Affiliation(s)
- D Mouchiroud
- Laboratoire de Génétique Moléculaire, Institut Jacques Monod, Paris, France
|