1
Mejias A, Diez-Hermano S, Ganfornina MD, Gutierrez G, Sanchez D. Characterization of mammalian Lipocalin UTRs in silico: Predictions for their role in post-transcriptional regulation. PLoS One 2019; 14:e0213206. [PMID: 30840684 PMCID: PMC6402760 DOI: 10.1371/journal.pone.0213206] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/31/2018] [Accepted: 02/15/2019] [Indexed: 01/20/2023] Open
Abstract
The Lipocalin family is a group of homologous proteins characterized by a wide array of functional capabilities. As extracellular proteins, they can bind small hydrophobic ligands through a well-conserved β-barrel fold. Lipocalin evolutionary history spans many different taxa and shows great divergence even within chordates. This variability is also found in their heterogeneous tissue expression pattern. Although a handful of promoter regions have been previously described, studies on UTR regulatory roles in Lipocalin gene expression are scarce. Here we report a comprehensive bioinformatic analysis showing that complex post-transcriptional regulation exists in Lipocalin genes, as suggested by the presence of alternative UTRs with substantial sequence conservation in mammals, alongside a high diversity of transcription start sites and alternative promoters. Strong selective pressure could have operated upon Lipocalin UTRs, leading to an enrichment in particular sequence motifs that limit the choice of secondary structures. Mapping these regulatory features to the expression pattern of early- and late-diverging Lipocalins suggests that UTRs represent an additional phylogenetic signal, which may help to uncover how functional pleiotropy originated within the Lipocalin family.
Affiliation(s)
- Andres Mejias
- Departamento de Genetica, Universidad de Sevilla, Sevilla, Spain
- Sergio Diez-Hermano
- Instituto de Biologia y Genetica Molecular-Departamento de Bioquimica y Biologia Molecular y Fisiologia, Universidad de Valladolid-CSIC, Valladolid, Spain
- Departamento de Matemática Aplicada, Universidad Complutense, Madrid, Spain
- Maria D. Ganfornina
- Instituto de Biologia y Genetica Molecular-Departamento de Bioquimica y Biologia Molecular y Fisiologia, Universidad de Valladolid-CSIC, Valladolid, Spain
- Diego Sanchez
- Instituto de Biologia y Genetica Molecular-Departamento de Bioquimica y Biologia Molecular y Fisiologia, Universidad de Valladolid-CSIC, Valladolid, Spain
2
Jabbari K, Nürnberg P. A genomic view on epilepsy and autism candidate genes. Genomics 2016; 108:31-6. [PMID: 26772991 DOI: 10.1016/j.ygeno.2016.01.001] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 08/27/2015] [Revised: 12/15/2015] [Accepted: 01/01/2016] [Indexed: 01/25/2023]
Abstract
Epilepsy is a common complex disorder most frequently associated with psychiatric and neurological diseases. Massive parallel sequencing of individual or cohort genomes and exomes led to the identification of several disease-associated genes. We review here the candidate genes in epilepsy genetics with a focus on exome and gene panel data. Together with the examination of brain-expressed genes and the postsynaptic proteome, the results show that: (1) non-metabolic epilepsy and autism candidate genes tend to be AT-rich and (2) large transcript size and local AT-richness are characteristic features of genes involved in developmental brain disorders and synaptic functions. These results point to the preferential location of core epilepsy and autism candidate genes in late-replicating, GC-poor chromosomal regions (isochores). These results indicate that the genomic alterations leading to some brain disorders are confined to responsive chromatin areas harboring brain-critical genes.
Affiliation(s)
- Kamel Jabbari
- Cologne Center for Genomics, University of Cologne, Cologne, Germany
- Peter Nürnberg
- Cologne Center for Genomics, University of Cologne, Cologne, Germany
3
Ranade SS, Lin YC, Zuccolo A, Van de Peer Y, García-Gil MDR. Comparative in silico analysis of EST-SSRs in angiosperm and gymnosperm tree genera. BMC PLANT BIOLOGY 2014; 14:220. [PMID: 25143005 PMCID: PMC4160553 DOI: 10.1186/s12870-014-0220-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 04/07/2014] [Accepted: 08/05/2014] [Indexed: 05/24/2023]
Abstract
BACKGROUND Simple Sequence Repeats (SSRs) derived from Expressed Sequence Tags (ESTs) belong to the expressed fraction of the genome and are important for gene regulation, recombination, DNA replication, cell cycle and mismatch repair. Here, we present a comparative analysis of the SSR motif distribution in the 5'UTR, ORF and 3'UTR fractions of ESTs across selected genera of woody trees representing gymnosperms (17 species from seven genera) and angiosperms (40 species from eight genera). RESULTS Our analysis supports a modest contribution of EST-SSR length to genome size in gymnosperms, while EST-SSR density was not associated with genome size in either angiosperms or gymnosperms. Multiple factors seem to have contributed to the lower abundance of EST-SSRs in gymnosperms that has resulted in a non-linear relationship with genome size diversity. The AG/CT motif was found to be the most abundant in SSRs of both angiosperms and gymnosperms, with a relative increase in AT/AT in the latter. Our data also reveal a higher abundance of hexamers across the gymnosperm genera. CONCLUSIONS Our analysis provides the foundation for future comparative studies at the species level to unravel the evolutionary processes that control the SSR genesis and divergence between angiosperm and gymnosperm tree species.
Affiliation(s)
- Sonali Sachin Ranade
- Umeå Plant Science Centre (UPSC), Department of Forest Genetics and Plant Physiology, Swedish University of Agricultural Sciences, SE-901-83 Umeå, Sweden
- Yao-Cheng Lin
- Department of Plant Systems Biology (VIB) and Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 927, 9052 Ghent, Belgium
- Andrea Zuccolo
- Istituto di Genomica Applicata, Via J. Linussio 51, 33100 Udine, Italy
- Institute of Life Sciences, Scuola Superiore Sant’Anna, 56127 Pisa, Italy
- Yves Van de Peer
- Department of Plant Systems Biology (VIB) and Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 927, 9052 Ghent, Belgium
- Genomics Research Institute, University of Pretoria, Hatfield Campus, Pretoria, 0028 South Africa
- María del Rosario García-Gil
- Umeå Plant Science Centre (UPSC), Department of Forest Genetics and Plant Physiology, Swedish University of Agricultural Sciences, SE-901-83 Umeå, Sweden
4
Characterization of NOL7 gene point mutations, promoter methylation, and protein expression in cervical cancer. Int J Gynecol Pathol 2014; 31:15-24. [PMID: 22123719 DOI: 10.1097/pgp.0b013e318220ba16] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 11/26/2022]
Abstract
NOL7 is a putative tumor suppressor gene localized to 6p23, a region with frequent loss of heterozygosity in a number of cancers, including cervical cancer (CC). We have previously demonstrated that reintroduction of NOL7 into CC cells altered the angiogenic phenotype and suppressed tumor growth in vivo by 95%. Therefore, to understand its mechanism of inactivation in CC, we investigated the genetic and epigenetic regulation of NOL7. NOL7 mRNA and protein levels were assessed in 13 CC cell lines and 23 consecutive CC specimens by real-time quantitative polymerase chain reaction, western blotting, and immunohistochemistry. Methylation of the NOL7 promoter was analyzed by bisulfite sequencing and mutations were identified through direct sequencing. A CpG island with multiple CpG dinucleotides spanned the 5' untranslated region and first exon of NOL7. However, bisulfite sequencing failed to identify persistent sites of methylation. Mutational sequencing revealed that 40% of the CC specimens and 31% of the CC cell lines harbored somatic mutations that may affect the in vivo function of NOL7. Endogenous NOL7 mRNA and protein expression in CC cell lines were significantly decreased in 46% of the CC cell lines. Finally, immunohistochemistry demonstrated strong NOL7 nucleolar staining in normal tissues that decreased with histologic progression toward CC. NOL7 is inactivated in CC in accordance with the Knudson 2-hit hypothesis through loss of heterozygosity and mutation. Together with evidence of its in vivo tumor suppression, these data support the hypothesis that NOL7 is the legitimate tumor suppressor gene located on 6p23.
5
Arhondakis S, Auletta F, Bernardi G. Isochores and the regulation of gene expression in the human genome. Genome Biol Evol 2012; 3:1080-9. [PMID: 21979159 PMCID: PMC3227402 DOI: 10.1093/gbe/evr017] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Indexed: 01/10/2023] Open
Abstract
It is well established that changes in the phenotype depend much more on changes in gene expression than on changes in protein-coding genes, and that cis-regulatory sequences and chromatin structure are two major factors influencing gene expression. Here, we investigated these factors at the genome-wide level by focusing on the trinucleotide patterns in the 0.1- to 25-kb regions flanking the human genes that are present in the GC-poorest L1 and GC-richest H3 isochore families, the other families exhibiting intermediate patterns. We could show 1) that the trinucleotide patterns of the 25-kb gene-flanking regions are representative of the very different patterns already reported for the whole isochores from the L1 and H3 families and, expectedly, identical in upstream and downstream locations; 2) that the patterns of the 0.1- to 0.5-kb regions in the L1 and H3 isochores are remarkably more divergent and more specific when compared with those of the 25-kb regions, as well as different in the upstream and downstream locations; and 3) that these patterns fade into the 25-kb patterns around 5 kb in both upstream and downstream locations. The 25-kb findings indicate differences in nucleosome positioning and density in different isochore families; those of the 0.1- to 0.5-kb sequences indicate differences in the transcription factors that bind upstream and downstream of genes. These results indicate differences in the regulation of genes located in different isochore families, a point of functional and evolutionary relevance.
Affiliation(s)
- Stilianos Arhondakis
- Bioinformatics and Medical Informatics Team, Biomedical Research Foundation of the Academy of Athens, Athens, Greece
6
Mankame TP, Zhou G, Lingen MW. Identification and characterization of the human NOL7 gene promoter. Gene 2010; 456:36-44. [PMID: 20206243 PMCID: PMC3408873 DOI: 10.1016/j.gene.2010.02.008] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 12/16/2009] [Revised: 01/29/2010] [Accepted: 02/16/2010] [Indexed: 11/24/2022]
Abstract
NOL7 is a candidate tumor suppressor gene that localizes to 6p23, a chromosomal region frequently associated with loss of heterozygosity in a number of malignancies including cervical cancer (CC). Re-expression of NOL7 in CC cells suppresses in vivo tumor growth by 95% and alters the angiogenic phenotype by modulating the expression of VEGF and TSP1. Here, we describe the determination of two NOL7 transcriptional start sites (TSS), the cloning of its regulatory promoter region, and the identification of transcription factors that regulate its expression. Using 5' Rapid amplification of complementary DNA ends (RACE), two transcriptional start sites were identified. Deletion analysis determined that the essential elements required for the optimal promoter activity of NOL7 were 560 bp upstream of its translation start site. In silico analysis suggested that the promoter region contained potential binding sites for the SP1, c-Myc and RXRalpha transcription factors as well as an overall GC content of greater than 60%. Chromatin immunoprecipitation (ChIP) confirmed that SP1, c-Myc and RXRalpha bound to the NOL7 promoter region. Finally, we demonstrate that NOL7 expression was positively regulated by c-Myc and RXRalpha. These results demonstrate that the NOL7 promoter region possesses each of the key elements of a TATA-less promoter. In addition, the positive regulation of NOL7 by c-Myc and RXRalpha provides additional mechanistic insights into the potential role of NOL7 in CC and other malignancies.
Affiliation(s)
- Tanmayi P Mankame
- Department of Pathology, The University of Chicago, Chicago, IL 60637, USA
7
Urrutia AO, Ocaña LB, Hurst LD. Do Alu repeats drive the evolution of the primate transcriptome? Genome Biol 2008; 9:R25. [PMID: 18241332 PMCID: PMC2374697 DOI: 10.1186/gb-2008-9-2-r25] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Received: 10/03/2007] [Revised: 01/02/2008] [Accepted: 02/01/2008] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND Of all repetitive elements in the human genome, Alus are unusual in being enriched near to genes that are expressed across a broad range of tissues. This has led to the proposal that Alus might be modifying the expression breadth of neighboring genes, possibly by providing CpG islands, modifying transcription factor binding, or altering chromatin structure. Here we consider whether Alus have increased expression breadth of genes in their vicinity. RESULTS Contrary to the modification hypothesis, we find that those genes that have always had broad expression are richest in Alus, whereas those that are more likely to have become more broadly expressed have lower enrichment. This finding is consistent with a model in which Alus accumulate near broadly expressed genes but do not affect their expression breadth. Furthermore, this model is consistent with the finding that expression breadth of mouse genes predicts Alu density near their human orthologs. However, Alus were found to be related to some alternative measures of transcription profile divergence, although evidence is contradictory as to whether Alus associate with lowly or highly diverged genes. If Alus have any effect, it is not by provision of CpG islands, because they are especially rare near to transcriptional start sites. Previously reported Alu enrichment for genes serving certain cellular functions, suggested to be evidence of functional importance of Alus, appears to be partly a byproduct of the association with broadly expressed genes. CONCLUSION The abundance of Alus near broadly expressed genes is better explained by their preferential preservation near to housekeeping genes than by a modifying effect on expression of genes.
Affiliation(s)
- Araxi O Urrutia
- Department of Biology and Biochemistry, University of Bath, Bath, BA4 7AY, UK.
8
Ren L, Gao G, Zhao D, Ding M, Luo J, Deng H. Developmental stage related patterns of codon usage and genomic GC content: searching for evolutionary fingerprints with models of stem cell differentiation. Genome Biol 2007; 8:R35. [PMID: 17349061 PMCID: PMC1868930 DOI: 10.1186/gb-2007-8-3-r35] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.5] [Received: 09/12/2006] [Revised: 01/08/2007] [Accepted: 03/12/2007] [Indexed: 11/26/2022] Open
Abstract
Developmental-stage-related patterns of gene expression correlate with codon usage and genomic GC content in stem cell hierarchies. BACKGROUND The usage of synonymous codons shows considerable variation among mammalian genes. How and why this usage is non-random are fundamental biological questions and remain controversial. It is also important to explore whether mammalian genes that are selectively expressed at different developmental stages bear different molecular features. RESULTS In two models of mouse stem cell differentiation, we established correlations between codon usage and the patterns of gene expression. We found that the optimal codons exhibited variation (AT- or GC-ending codons) in different cell types within the developmental hierarchy. We also found that genes that were enriched (developmental-pivotal genes) or specifically expressed (developmental-specific genes) at different developmental stages had different patterns of codon usage and local genomic GC (GCg) content. Moreover, at the same developmental stage, developmental-specific genes generally used more GC-ending codons and had higher GCg content compared with developmental-pivotal genes. Further analyses suggest that the model of translational selection might be consistent with the developmental stage-related patterns of codon usage, especially for the AT-ending optimal codons. In addition, our data show that after human-mouse divergence, the influence of selective constraints is still detectable. CONCLUSION Our findings suggest that developmental stage-related patterns of gene expression are correlated with codon usage (GC3) and GCg content in stem cell hierarchies. Moreover, this paper provides evidence for the influence of natural selection at synonymous sites in the mouse genome and novel clues for linking the molecular features of genes to their patterns of expression during mammalian ontogenesis.
Affiliation(s)
- Lichen Ren
- College of Life Sciences, Shanghai Jiao Tong University, Shanghai, 200240, PR China
- Ge Gao
- Center for Bioinformatics, College of Life Sciences, National Laboratory of Protein Engineering and Plant Genetics Engineering, Peking University, Beijing, 100871, PR China
- Dongxin Zhao
- Department of Cell Biology and Genetics, College of Life Sciences, Peking University, Beijing, 100871, PR China
- Mingxiao Ding
- Department of Cell Biology and Genetics, College of Life Sciences, Peking University, Beijing, 100871, PR China
- Jingchu Luo
- Center for Bioinformatics, College of Life Sciences, National Laboratory of Protein Engineering and Plant Genetics Engineering, Peking University, Beijing, 100871, PR China
- Hongkui Deng
- Department of Cell Biology and Genetics, College of Life Sciences, Peking University, Beijing, 100871, PR China
9
Wren JD, Conway T. Meta-analysis of published transcriptional and translational fold changes reveals a preference for low-fold inductions. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2006; 10:15-27. [PMID: 16584315 DOI: 10.1089/omi.2006.10.15] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Indexed: 11/13/2022]
Abstract
The goals of this study were to gain a better quantitative understanding of the dynamic range of transcriptional and translational response observed in biological systems and to examine the reporting of regulatory events for trends and biases. A straightforward pattern-matching routine extracted 3,408 independent observations regarding transcriptional fold-changes and 1,125 regarding translational fold-changes from over 15 million MEDLINE abstracts. Approximately 95% of reported changes were ≥2-fold. Further, the historical trend of reporting individual fold-changes is declining in favor of high-throughput methods for transcription but not translation. Where it was possible to compare the average fold-changes in transcription and translation for the same gene/product (203 examples), approximately 53% were a ≤2-fold difference, suggesting a loose tendency for the two to be coupled in magnitude. We found also that approximately three-fourths of reported regulatory events have been at the transcriptional level. The frequency distribution appears to be normally distributed and peaks near 2-fold, suggesting that nature selects for a low-energy solution to regulatory responses. Because high-throughput technologies ordinarily sacrifice measurement quality for quantity, this also suggests that many regulatory events may not be reliably detectable by such technologies. Text mining of regulatory events and responses provides additional information incorporable into microarray analysis, such as prior fold-change observations and flagging genes that are regulated post-transcription. All extracted regulation and response patterns can be downloaded at the following website: www.ou.edu/microarray/oumcf/Meta_analysis.xls.
Affiliation(s)
- Jonathan D Wren
- Advanced Center for Genome Technology, Department of Botany and Microbiology, The University of Oklahoma, Norman, 73019, USA.
10
Abstract
MOTIVATION We present a novel algorithm, MaMF, for identifying transcription factor (TF) binding site motifs. The method is deterministic and depends on an indexing technique to optimize the search process. On common yeast datasets, MaMF performs competitively with other methods. We also present results on a challenging group of eight sets of human genes known to be responsive to a diverse group of TFs. In every case, MaMF finds the annotated motif among the top scoring putative motifs. We compared MaMF against other motif finders on a larger human group of 21 gene sets and found that MaMF performs better than other algorithms. We analyzed the remaining high scoring motifs and show that many correspond to other TFs that are known to co-occur with the annotated TF motifs. The significant and frequent presence of co-occurring transcription factor binding sites explains in part the difficulty of human motif finding. MaMF is a very fast algorithm, suitable for application to large numbers of interesting gene sets.
Affiliation(s)
- Lawrence S Hon
- UCSF Cancer Research Institute and Comprehensive Cancer Center, University of California San Francisco, CA, USA
11
Cheng G, Cohen L, Ndegwa D, Davis RE. The Flatworm Spliced Leader 3′-Terminal AUG as a Translation Initiator Methionine. J Biol Chem 2006; 281:733-43. [PMID: 16230357 DOI: 10.1074/jbc.m506963200] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Indexed: 11/06/2022] Open
Abstract
Spliced leader (SL) RNA trans-splicing contributes the 5' termini to mRNAs in a variety of eukaryotes. In contrast with some trans-splicing metazoan groups (e.g. nematodes), flatworm spliced leaders are variable in both sequence and length in different flatworm taxa. However, an absolutely conserved and unique feature of all flatworm spliced leaders is the presence of a 3'-terminal AUG. We previously suggested that the Schistosoma mansoni spliced leader AUG might contribute a required translation initiator methionine to recipient mRNAs. Here we identified and examined trans-spliced cDNAs from a large set of newly available schistosome cDNAs. 28% of the trans-spliced cDNAs have the SL AUG in-frame with the major open reading frame of the mRNA. We identified over 40 cDNAs (40% of the SL AUG in-frame clones) that require the SL AUG as an initiator methionine to synthesize phylogenetically conserved N-terminal residues characteristic of orthologous proteins. RNA transfection experiments using several schistosome stages demonstrated that the flatworm SL AUG can serve as a translation initiator methionine in vivo. We also present in vivo translation studies of the schistosome initiator methionine context and the effect of the spliced leader AUG added upstream and out-of-frame with the main open reading frame of recipient mRNAs. Overall, our data have provided evidence that another function of flatworm spliced leader trans-splicing is to provide some recipient mRNAs with an initiator methionine for translation initiation.
Affiliation(s)
- Guofeng Cheng
- Department of Pediatrics, University of Colorado School of Medicine, Aurora, 80045, USA
12
Wang M, Marín A. Characterization and prediction of alternative splice sites. Gene 2005; 366:219-27. [PMID: 16226402 DOI: 10.1016/j.gene.2005.07.015] [Citation(s) in RCA: 192] [Impact Index Per Article: 10.1] [Received: 10/30/2004] [Revised: 04/20/2005] [Accepted: 07/08/2005] [Indexed: 11/16/2022]
Abstract
Human alternative isoform, cryptic, skipped, and constitutive splice sites from the ALTEXTRON database were analysed regarding splice site strength, composition, GC content, position and binding site strength of polypyrimidine tract and branch site. Several features were identified which distinguish alternative isoform and cryptic splice sites, but not skipped splice sites, from constitutive ones. These include splice site strength, intron GC content, U2AF35 binding site score, and oligonucleotide frequencies. For the predictive classification of splice sites, pattern recognition models for different splicing factor binding sites and oligonucleotide frequency models (OFMs) were combined using backpropagation networks. 67.45% of acceptor sites and 71.23% of donor sites are correctly classified by networks trained for classification of constitutive and alternative isoform/cryptic splice sites. A web-application for the prediction of alternative splice sites is available at http://es.embnet.org/~mwang/assp.html.
Affiliation(s)
- Magnus Wang
- Departamento de Genética, Facultad de Biología, Universidad de Sevilla, Avenida de Reina Mercedes 6, E-41012 Sevilla, Spain.
13
Yamashita R, Suzuki Y, Sugano S, Nakai K. Genome-wide analysis reveals strong correlation between CpG islands with nearby transcription start sites of genes and their tissue specificity. Gene 2005; 350:129-36. [PMID: 15784181 DOI: 10.1016/j.gene.2005.01.012] [Citation(s) in RCA: 69] [Impact Index Per Article: 3.6] [Received: 11/11/2004] [Revised: 12/28/2004] [Accepted: 01/24/2005] [Indexed: 10/25/2022]
Abstract
It has been envisaged that CpG islands are often observed near the transcriptional start sites (TSS) of housekeeping genes. However, neither the precise positions of CpG islands relative to TSS of genes nor the correlation between the presence of the CpG islands and the expression specificity of these genes is well-understood. Using thousands of sequences with known TSS in human and mouse, we found that there is a clear peak in the distribution of CpG islands around TSS in the genes of these two species. Thus, we classified human (mouse) genes into 6600 (2948) CpG+ genes and 2619 (1830) CpG- ones, based on the presence of a CpG island within the -100: +100 region. We estimated the degree of each gene being a housekeeper by the number of cDNA libraries where its ESTs were detected. Then, the tendency that a gene lacking CpG islands around its TSS is expressed with a higher degree of tissue specificity turned out to be evolutionarily conserved. We also confirmed this tendency by analyzing the gene ontology annotation of classified genes. Since no such clear correlation was found in the control data (mRNAs, pre-mRNAs, and chromosome banding pattern), we concluded that the effect of a CpG island near the TSS should be more important than the global GC content of the region where the gene resides.
Affiliation(s)
- Riu Yamashita
- Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1, Shirokane-dai Minato-ku, Tokyo 108-8639, Japan
14
Vinogradov AE. Noncoding DNA, isochores and gene expression: nucleosome formation potential. Nucleic Acids Res 2005; 33:559-63. [PMID: 15673716 PMCID: PMC548339 DOI: 10.1093/nar/gki184] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.1] [Received: 10/05/2004] [Revised: 12/21/2004] [Accepted: 12/21/2004] [Indexed: 12/04/2022] Open
Abstract
The nucleosome formation potential of introns, intergenic spacers and exons of human genes is shown here to negatively correlate with among-tissues breadth of gene expression. The nucleosome formation potential is also found to negatively correlate with the GC content of genomic sequences; the slope of regression line is steeper in exons compared with noncoding DNA (introns and intergenic spacers). The correlation with GC content is independent of sequence length; in turn, the nucleosome formation potential of introns and intergenic spacers positively (albeit weakly) correlates with sequence length independently of GC content. These findings help explain the functional significance of the isochores (regions differing in GC content) in the human genome as a result of optimization of genomic structure for epigenetic complexity and support the notion that noncoding DNA is important for orderly chromatin condensation and chromatin-mediated suppression of tissue-specific genes.
15
Kochetov AV. AUG codons at the beginning of protein coding sequences are frequent in eukaryotic mRNAs with a suboptimal start codon context. Bioinformatics 2004; 21:837-40. [PMID: 15531618 DOI: 10.1093/bioinformatics/bti136] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.9] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION The translation start site plays an important role in the control of translation efficiency of eukaryotic mRNAs. However, mRNAs with a suboptimal context of start AUG codon are relatively abundant. It is likely that at least some mRNAs with suboptimal start codon context contain the other signals providing additional information for efficient AUG recognition. RESULTS Frequency of AUG codons at the beginning of the coding part of eukaryotic mRNAs was analyzed in relation to the context of translation start codon. It was found that the observed downstream AUG content in the mRNAs with optimal start codon context was close to the expected value, whereas it was significantly higher in the mRNAs with a suboptimal context. It is likely that downstream AUG codons can often be utilized as additional start sites to increase translation rate of mRNAs with a suboptimal context of the annotated start codon and many eukaryotic proteins can be characterized by some N-end heterogeneity.
Affiliation(s)
- Alex V Kochetov
- Institute of Cytology and Genetics Lavrentieva 10, Novosibirsk 630090 Russia.
16
Vinogradov AE. Compactness of human housekeeping genes: selection for economy or genomic design? Trends Genet 2004; 20:248-53. [PMID: 15109779 DOI: 10.1016/j.tig.2004.03.006] [Citation(s) in RCA: 123] [Impact Index Per Article: 6.2] [Indexed: 11/22/2022]
Affiliation(s)
- Alexander E Vinogradov
- Institute of Cytology, Russian Academy of Sciences, Tikhoretsky Ave 4, St Petersburg 194064, Russia.
17
Hon LS, Jain AN. Compositional structure of repetitive elements is quantitatively related to co-expression of gene pairs. J Mol Biol 2003; 332:305-10. [PMID: 12948482 DOI: 10.1016/s0022-2836(03)00926-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 01/04/2023]
Abstract
A sequence similarity metric operating on 10 kb upstream regions of gene pairs quantitatively predicts a portion of co-variation of expression of gene pairs in large-scale gene expression studies in human tumors and tumor-derived cell lines. The signal on which the metric depends most strongly originates in the compositional structure of repetitive genomic sequences (particularly Alu elements) present in these upstream regions. This effect is completely separable from effects of isochore composition on gene expression. The results implicate repetitive elements with some functional role in transcriptional regulation of the specific genes in whose promoter regions they reside and lend credence to suggestions that the general phenomenon of repetitive element insertions may be a fundamental evolutionary mechanism for modulating gene transcription.
Affiliation(s)
- Lawrence S Hon
- Cancer Research Institute, University of California, 2340 Sutter Street S-336, Box 0128, San Francisco, CA 94143-0128, USA
18
Vinogradov AE. Isochores and tissue-specificity. Nucleic Acids Res 2003; 31:5212-20. [PMID: 12930973 PMCID: PMC212799 DOI: 10.1093/nar/gkg699] [Citation(s) in RCA: 97] [Impact Index Per Article: 4.6] [Received: 03/21/2003] [Revised: 05/11/2003] [Accepted: 07/03/2003] [Indexed: 11/13/2022] Open
Abstract
The housekeeping (ubiquitously expressed) genes in the mammalian genome were shown here to be on average slightly GC-richer than tissue-specific genes. Both housekeeping and tissue-specific genes occupy similar ranges of GC content, but the former tend to concentrate in the upper part of the range. In the human genome, tissue-specific genes show two maxima, GC-poor and GC-rich. The strictly tissue-specific human genes tend to concentrate in the GC-poor region; their distribution is left-skewed and thus reciprocal to the distribution of housekeeping genes. The intermediately tissue-specific genes show an intermediate GC content and a right-skewed distribution. In both human and mouse, genes specific for some tissues (e.g., parts of the central nervous system) have a higher average GC content than housekeeping genes. Since they are not transcribed in the germ line (in contrast to housekeeping genes), and therefore have a lower probability of inheritable gene conversion, this finding contradicts the biased gene conversion (BGC) explanation for elevated GC content in the heavy isochores of the mammalian genome. Genes specific for germ-line tissues (ovary, testes) show a low average GC content, which also contradicts the BGC explanation. Both for the total data set and for most tissues taken separately, a weak positive correlation was found between gene GC content and expression level. The fraction of ubiquitously expressed genes is nearly 1.5-fold higher in the mouse than in the human. This suggests that mouse tissues are comparatively less differentiated (at the molecular level), which can be related to a less pronounced isochoric structure of the mouse genome. In each separate tissue (in both species), tissue-specific genes do not form a clear-cut frequency peak (in contrast to housekeeping genes), but constitute a continuum with a gradually increasing degree of tissue-specificity, which probably reflects the path of cell differentiation and/or an independent use of the same protein in several unrelated tissues.
Affiliation(s)
- Alexander E Vinogradov
- Institute of Cytology, Russian Academy of Sciences, Tikhoretsky Avenue 4, St Petersburg 194064, Russia.
19
Abstract
Genes are non-uniformly distributed in the human genome, reaching the highest concentration in GC-rich isochores. This is one of the fundamental aspects of the human genome organization (Gene 241/259 (2000a,b) 3/31, for a review). In the present paper the gene distribution was analyzed in relationship to the gene expression pattern and levels. In this study evidence is produced showing: (i) that a biased gene distribution towards GC-rich isochores applies to both tissue-specific and housekeeping genes; and (ii) that genes localized in GC-rich isochores have high transcriptional levels. Since gene density and transcriptional levels are correlated with each other and both are correlated with the GC content of the isochores, the biased gene distribution in the human genome presumably is the result of selection at the gene expression levels.
Affiliation(s)
- Giuseppe D'Onofrio
- Laboratorio di Evoluzione Molecolare, Stazione Zoologica Anton Dohrn, Naples, Italy.
20
Abstract
Gene expression is finely regulated at the post-transcriptional level. Features of the untranslated regions of mRNAs that control their translation, degradation and localization include stem-loop structures, upstream initiation codons and open reading frames, internal ribosome entry sites and various cis-acting elements that are bound by RNA-binding proteins.
Affiliation(s)
- Flavio Mignone
- Dipartimento di Fisiologia e Biochimica Generali, Università di Milano, Via Celoria, 26, 20133 Milano, Italy.
21
Ruiz-Chica J, Medina MA, Sánchez-Jiménez F, Ramírez FJ. Raman study of the interaction between polyamines and a GC oligonucleotide. Biochem Biophys Res Commun 2001; 285:437-46. [PMID: 11444862 DOI: 10.1006/bbrc.2001.5192] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.6] [Indexed: 11/22/2022]
Abstract
The interaction between the oligonucleotide d[G(CG)₇]·d[C(GC)₇] and the three biogenic polyamines putrescine, spermidine, and spermine under physiological conditions has been studied by Raman spectroscopy. The results indicate the formation of highly ordered aggregated structures in solution, largely stabilized by electrostatic attractions, which have been described as cholesteric phases. Aggregation seems to be preceded by a partial B → Z conformational transition for spermidine and spermine, which would allow for a deeper oligonucleotide-polyamine interaction. Interaction with the nucleic bases has also been evidenced for aggregates. At low polyamine concentrations the preferential binding sites are similar to those proposed for their interactions with ct-DNA. With increasing polyamine concentration, the oligonucleotide-polyamine interactions involve both minor and major grooves, which is consistent with the formation of cholesteric phases.
Affiliation(s)
- J Ruiz-Chica
- Departamento de Química Física, Facultad de Ciencias, Universidad de Málaga, 29071 Málaga, Spain
22
Pesole G, Gissi C, Grillo G, Licciulli F, Liuni S, Saccone C. Analysis of oligonucleotide AUG start codon context in eukariotic mRNAs. Gene 2000; 261:85-91. [PMID: 11164040 DOI: 10.1016/s0378-1119(00)00471-6] [Citation(s) in RCA: 54] [Impact Index Per Article: 2.3] [Indexed: 11/19/2022]
Abstract
The AUG start codon context features have been investigated by analyzing eukaryotic mRNAs belonging to various taxonomic groups. The functional relevance of each specific position surrounding the AUG start codon has been established as a function of the measured shift between the base composition observed at that particular position and the base composition averaged over all the 5' untranslated regions. A more detailed analysis carried out on human genes belonging to different isochores showed significant isochore-specific features that cannot be explained only by a mutational bias effect. The most represented heptamers spanning from position -3 to +4 with respect to the initiator AUG have been determined for mRNAs belonging to different taxonomic groups, and a web page utility has been set up (http://bigarea.area.ba.cnr.it:8000/BioWWW/ATG.html) to determine the relative abundance of a user-submitted oligonucleotide context in a given species or taxon.
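The heptamer context described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation or the code behind their web utility: the function names `start_context_heptamer` and `relative_abundance`, the 0-based `aug_index` argument, and the example sequences are all hypothetical, and the sketch assumes the usual convention that the A of the AUG is position +1 with no position 0, so that -3 to +4 spans seven nucleotides.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple


def start_context_heptamer(mrna: str, aug_index: int) -> Optional[str]:
    """Return the 7-mer covering positions -3..+4 around a start codon.

    aug_index is the 0-based index of the A in the AUG (an assumption of
    this sketch). DNA input is accepted and converted to RNA letters.
    Returns None when the sequence is too short to supply the full context.
    """
    seq = mrna.upper().replace("T", "U")
    if seq[aug_index:aug_index + 3] != "AUG":
        raise ValueError("no AUG at the given index")
    start = aug_index - 3   # position -3
    end = aug_index + 4     # one past position +4
    if start < 0 or end > len(seq):
        return None
    return seq[start:end]


def relative_abundance(records: Iterable[Tuple[str, int]], heptamer: str) -> float:
    """Fraction of usable records whose start context equals `heptamer`."""
    counts = Counter()
    for mrna, aug_index in records:
        context = start_context_heptamer(mrna, aug_index)
        if context is not None:
            counts[context] += 1
    total = sum(counts.values())
    return counts[heptamer] / total if total else 0.0


# Made-up example sequences, each with its start codon at index 6.
example_records = [
    ("GCCACCAUGGCG", 6),
    ("UUUACCAUGGAA", 6),
    ("AAAUUUAUGCCC", 6),
]
```

Counting contexts over a taxon-specific set of annotated mRNAs in this way would give the kind of relative abundance the abstract mentions; how the original study normalized or filtered its data is not stated here.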
Affiliation(s)
- G Pesole
- Dipartimento di Fisiologia e Biochimica Generali, Università di Milano, Via Celoria 26, 20133, Milan, Italy.
23
Abstract
The compositional evolution of vertebrate genomes is characterized: (i) by one predominant conservative mode, in which nucleotide changes occur, but the base composition of DNA sequences in general, and of coding sequences in particular, does not change; and (ii) by three different shifting or transitional modes, in which nucleotide changes are accompanied by changes in the base composition of sequences. Investigations on these evolutionary modes have shed new light on a central problem in molecular evolution, namely the role played by natural selection in modulating the mutational input. This review will present first the intragenomic shifts, the 'major shifts' and the 'minor shift', and then the 'whole-genome', or 'horizontal', shift. In each case, the shifts were preceded and followed by a conservative mode of evolution. This review expands on a previous one [Bernardi, Gene 241 (2000) 3-17], and summarizes the evidence that the changes of the compositional patterns of the genome and their maintenance are controlled by Darwinian natural selection.
Affiliation(s)
- G Bernardi
- Laboratorio di Evoluzione Molecolare, Stazione Zoologica Anton Dohrn, Napoli 80121, Italy.