1
Zhang XO, Pratt H, Weng Z. Investigating the Potential Roles of SINEs in the Human Genome. Annu Rev Genomics Hum Genet 2021; 22:199-218. [PMID: 33792357 DOI: 10.1146/annurev-genom-111620-100736] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 11/09/2022]
Abstract
Short interspersed nuclear elements (SINEs) are nonautonomous retrotransposons that occupy approximately 13% of the human genome. They are transcribed by RNA polymerase III and can be retrotranscribed and inserted back into the genome with the help of other autonomous retroelements. Because they are preferentially located close to or within gene-rich regions, they can regulate gene expression by various mechanisms that act at both the DNA and the RNA levels. In this review, we summarize recent findings on the involvement of SINEs in different types of gene regulation and discuss the potential regulatory functions of SINEs that are in close proximity to genes, Pol III-transcribed SINE RNAs, and embedded SINE sequences within Pol II-transcribed genes in the human genome. These discoveries illustrate how the human genome has exapted some SINEs into functional regulatory elements.
Affiliation(s)
- Xiao-Ou Zhang
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts 01605, USA. Current affiliation: School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Henry Pratt
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts 01605, USA
- Zhiping Weng
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts 01605, USA
2
Karakülah G, Arslan N, Yandım C, Suner A. TEffectR: an R package for studying the potential effects of transposable elements on gene expression with linear regression model. PeerJ 2019; 7:e8192. [PMID: 31824778 PMCID: PMC6899341 DOI: 10.7717/peerj.8192] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 08/15/2019] [Accepted: 11/11/2019] [Indexed: 01/24/2023]
Abstract
Introduction Recent studies highlight the crucial regulatory roles of transposable elements (TEs) on proximal gene expression in distinct biological contexts such as disease and development. However, computational tools extracting potential TE-proximal gene expression associations from RNA-sequencing data are still missing. Implementation Herein, we developed a novel R package, using a linear regression model, for studying the potential influence of TE species on proximal gene expression from a given RNA-sequencing data set. Our R package, namely TEffectR, makes use of publicly available RepeatMasker TE and Ensembl gene annotations as well as several functions of other R-packages. It calculates total read counts of TEs from sorted and indexed genome aligned BAM files provided by the user, and determines statistically significant relations between TE expression and the transcription of nearby genes under diverse biological conditions. Availability TEffectR is freely available at https://github.com/karakulahg/TEffectR along with a handy tutorial as exemplified by the analysis of RNA-sequencing data including normal and tumour tissue specimens obtained from breast cancer patients.
Affiliation(s)
- Gökhan Karakülah
  - Izmir Biomedicine and Genome Center, Izmir, Turkey; Izmir International Biomedicine and Genome Institute, Dokuz Eylül University, Izmir, Turkey
- Cihangir Yandım
  - Izmir Biomedicine and Genome Center, Izmir, Turkey; Department of Genetics and Bioengineering, Faculty of Engineering, Izmir University of Economics, Izmir, Turkey
- Aslı Suner
  - Department of Biostatistics and Medical Informatics, Faculty of Medicine, Ege University, Izmir, Turkey
3
Zeng L, Pederson SM, Kortschak RD, Adelson DL. Transposable elements and gene expression during the evolution of amniotes. Mob DNA 2018; 9:17. [PMID: 29942365 PMCID: PMC5998507 DOI: 10.1186/s13100-018-0124-5] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 03/07/2018] [Accepted: 06/01/2018] [Indexed: 01/24/2023]
Abstract
Background Transposable elements (TEs) are primarily responsible for the DNA losses and gains in genome sequences that occur over time within and between species. TEs themselves evolve, with clade specific LTR/ERV, LINEs and SINEs responsible for the bulk of species-specific genomic features. Because TEs can contain regulatory motifs, they can be exapted as regulators of gene expression. While TE insertions can provide evolutionary novelty for the regulation of gene expression, their overall impact on the evolution of gene expression is unclear. Previous investigators have shown that tissue specific gene expression in amniotes is more similar across species than within species, supporting the existence of conserved developmental gene regulation. In order to understand how species-specific TE insertions might affect the evolution/conservation of gene expression, we have looked at the association of gene expression in six tissues with TE insertions in six representative amniote genomes. Results A novel bootstrapping approach has been used to minimise the conflation of effects of repeat types on gene expression. We compared the expression of orthologs containing recent TE insertions to orthologs that contained older TE insertions, and the expression of non-orthologs containing recent TE insertions to non-orthologs with older TE insertions. Both orthologs and non-orthologs showed significant differences in gene expression associated with TE insertions. TEs were found associated with species-specific changes in gene expression, and the magnitude and direction of expression changes were noteworthy. Overall, orthologs containing species-specific TEs were associated with lower gene expression, while in non-orthologs, non-species specific TEs were associated with higher gene expression. Exceptions were SINE elements in human and chicken, which had an opposite association with gene expression compared to other species. 
Conclusions Our observed species-specific associations of TEs with gene expression support a role for TEs in speciation/response to selection by species. TEs do not exhibit consistent associations with gene expression and observed associations can vary depending on the age of TE insertions. Based on these observations, it would be prudent to refrain from extrapolating these and previously reported associations to distantly related species.
Affiliation(s)
- Lu Zeng
  - School of Biological Sciences, The University of Adelaide, North Terrace, Adelaide, 5005 Australia
- Stephen M Pederson
  - Bioinformatics Hub, The University of Adelaide, North Terrace, Adelaide, 5005 Australia
- R Daniel Kortschak
  - School of Biological Sciences, The University of Adelaide, North Terrace, Adelaide, 5005 Australia
- David L Adelson
  - School of Biological Sciences, The University of Adelaide, North Terrace, Adelaide, 5005 Australia
4
Zhang L, Xiao M, Zhou J, Yu J. Lineage-associated underrepresented permutations (LAUPs) of mammalian genomic sequences based on a Jellyfish-based LAUPs analysis application (JBLA). Bioinformatics 2018; 34:3624-3630. [DOI: 10.1093/bioinformatics/bty392] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 01/03/2018] [Accepted: 05/09/2018] [Indexed: 12/25/2022]
Affiliation(s)
- Le Zhang
  - College of Computer Science, Sichuan University, Chengdu, China
  - School of Computer and Information Science, Southwest University, Chongqing, China
- Ming Xiao
  - School of Computer and Information Science, Southwest University, Chongqing, China
  - College of Mobile Telecommunications, Chongqing University of Posts and Telecommunications, Chongqing, China
- Jingsong Zhou
  - College of Computer Science, Sichuan University, Chengdu, China
- Jun Yu
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
5
Biswas K, Acharya D, Podder S, Ghosh TC. Evolutionary rate heterogeneity between multi- and single-interface hubs across human housekeeping and tissue-specific protein interaction network: Insights from proteins' and its partners' properties. Genomics 2017; 110:283-290. [PMID: 29198610 DOI: 10.1016/j.ygeno.2017.11.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 06/23/2017] [Revised: 11/10/2017] [Accepted: 11/29/2017] [Indexed: 12/12/2022]
Abstract
Integrating gene expression into protein-protein interaction network (PPIN) leads to the construction of tissue-specific (TS) and housekeeping (HK) sub-networks, with distinctive TS- and HK-hubs. All such hub proteins are divided into multi-interface (MI) hubs and single-interface (SI) hubs, where MI hubs evolve slower than SI hubs. Here we explored the evolutionary rate difference between MI and SI proteins within TS- and HK-PPIN and observed that this difference is present only in TS, but not in HK-class. Next, we explored whether proteins' own properties or its partners' properties are more influential in such evolutionary discrepancy. Statistical analyses revealed that this evolutionary rate correlates negatively with protein's own properties like expression level, miRNA count, conformational diversity and functional properties and with its partners' properties like protein disorder and tissue expression similarity. Moreover, partial correlation and regression analysis revealed that both proteins' and its partners' properties have independent effects on protein evolutionary rate.
Affiliation(s)
- Kakali Biswas
  - Bioinformatics Centre, Bose Institute, P-1/12, C.I.T. Scheme VII M, Kolkata 700 054, India
- Debarun Acharya
  - Bioinformatics Centre, Bose Institute, P-1/12, C.I.T. Scheme VII M, Kolkata 700 054, India
- Soumita Podder
  - Bioinformatics Centre, Bose Institute, P-1/12, C.I.T. Scheme VII M, Kolkata 700 054, India; Department of Microbiology, Raiganj University, Raiganj, Uttar Dinajpur 733134, India
- Tapash Chandra Ghosh
  - Bioinformatics Centre, Bose Institute, P-1/12, C.I.T. Scheme VII M, Kolkata 700 054, India
6
Dong Y, Huang Z, Kuang Q, Wen Z, Liu Z, Li Y, Yang Y, Li M. Expression dynamics and relations with nearby genes of rat transposable elements across 11 organs, 4 developmental stages and both sexes. BMC Genomics 2017; 18:666. [PMID: 28851270 PMCID: PMC5576108 DOI: 10.1186/s12864-017-4078-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 03/16/2017] [Accepted: 08/21/2017] [Indexed: 12/20/2022]
Abstract
BACKGROUND TEs pervade mammalian genomes. However, compared with mice, fewer studies have focused on the TE expression patterns in rat, particularly the comparisons across different organs, developmental stages and sexes. In addition, TEs can influence the expression of nearby genes. The temporal and spatial influences of TEs remain unclear yet. RESULTS To evaluate the TEs transcription patterns, we profiled their transcript levels in 11 organs for both sexes across four developmental stages of rat. The results show that most short interspersed elements (SINEs) are commonly expressed in all conditions, which are also the major TE types with commonly expression patterns. In contrast, long terminal repeats (LTRs) are more likely to exhibit specific expression patterns. The expression tendency of TEs and genes are similar in most cases. For example, few specific genes and TEs are in the liver, muscle and heart. However, TEs perform superior over genes on classing organ, which imply their higher organ specificity than genes. By associating the TEs with the closest genes in genome, we find their expression levels are correlated, independent of their distance in some cases. CONCLUSIONS TEs sex-dependently associate with nearest genes. A gene would be associated with more than one TE. Our works can help to functionally annotate the genome and further understand the role of TEs in gene regulation.
Affiliation(s)
- Yongcheng Dong
  - College of Life Science, Sichuan University, Chengdu, 610064, China
- Ziyan Huang
  - College of Chemistry, Sichuan University, Chengdu, 610064, China
- Qifan Kuang
  - College of Chemistry, Sichuan University, Chengdu, 610064, China
- Zhining Wen
  - College of Chemistry, Sichuan University, Chengdu, 610064, China
- Zhibin Liu
  - College of Life Science, Sichuan University, Chengdu, 610064, China
- Yizhou Li
  - College of Chemistry, Sichuan University, Chengdu, 610064, China
- Yi Yang
  - College of Life Science, Sichuan University, Chengdu, 610064, China
- Menglong Li
  - College of Chemistry, Sichuan University, Chengdu, 610064, China
7
Ma Z, Kong X, Liu S, Yin S, Zhao Y, Liu C, Lv Z, Wang X. Combined sense-antisense Alu elements activate the EGFP reporter gene when stable transfection. Mol Genet Genomics 2017; 292:833-846. [PMID: 28357596 DOI: 10.1007/s00438-017-1312-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/11/2016] [Accepted: 03/20/2017] [Indexed: 01/28/2023]
Abstract
Alu elements in the human genome are present in more than one million copies, accounting for 10% of the genome. However, the biological functions of most Alu repeats are unknown. In this present study, we detected the effects of Alu elements on EGFP gene expression using a plasmid system to find the roles of Alu elements in human genome. We inserted 5'-4TMI-Alus-CMV promoter-4TMI-Alus (or antisense Alus)-3' sequences into the pEGFP-C1 vector to construct expression vectors. We altered the copy number of Alus, the orientation of the Alus, and the presence of an enhancer (4TMI) in the inserted 5'-4TMI-Alus-CMV promoter-4TMI-Alus (or antisense Alus)-3' sequences. These expression vectors were stably transfected into HeLa cells, and EGFP reporter gene expression was determined. Our results showed that combined sense-antisense Alu elements activated the EGFP reporter gene in the presence of enhancers and stable transfection. The combined sense-antisense Alu vectors carrying four copies of Alus downstream of inserted CMV induced much stronger EGFP gene expression than two copies. Alus downstream of inserted CMV were replaced to AluJBs (having 76% homology with Alu) to construct expression vectors. We found that combined sense-antisense Alu (or antisense AluJB) vectors induced strong EGFP gene expression after stable transfection and heat shock. To further explore combined sense-antisense Alus activating EGFP gene expression, we constructed Tet-on system vectors, mini-C1-Alu-sense-sense and mini-C1-Alu-sense-antisense (EGFP gene was driven by mini-CMV). We found that combined sense-antisense Alus activated EGFP gene in the presence of reverse tetracycline repressor (rTetR) and doxycycline (Dox). Clone experiments showed that Mini-C1-Alu-sense-antisense vector had more positive cells than that of Mini-C1-Alu-sense-sense vector. 
The results in this paper proved that Alu repetitive sequences inhibited gene expression and combined sense-antisense Alus activated EGFP reporter gene when Alu transcribes, which suggests that Alus play roles in maintaining gene expression (silencing genes or activating genes) in human genome.
Affiliation(s)
- Zhihong Ma
  - Department of Genetics, Hebei Medical University, Hebei Key Lab of Laboratory Animal, Shijiazhuang, Hebei Province, 050017, China
- Xianglong Kong
  - Department of Genetics, Hebei Medical University, Hebei Key Lab of Laboratory Animal, Shijiazhuang, Hebei Province, 050017, China
- Shufeng Liu
  - Department of Genetics, Hebei Medical University, Hebei Key Lab of Laboratory Animal, Shijiazhuang, Hebei Province, 050017, China
- Shuxian Yin
  - Department of Genetics, Hebei Medical University, Hebei Key Lab of Laboratory Animal, Shijiazhuang, Hebei Province, 050017, China
- Yuehua Zhao
  - Department of Genetics, Hebei Medical University, Hebei Key Lab of Laboratory Animal, Shijiazhuang, Hebei Province, 050017, China
- Chao Liu
  - Department of Genetics, Hebei Medical University, Hebei Key Lab of Laboratory Animal, Shijiazhuang, Hebei Province, 050017, China
- Zhanjun Lv
  - Department of Genetics, Hebei Medical University, Hebei Key Lab of Laboratory Animal, Shijiazhuang, Hebei Province, 050017, China
- Xiufang Wang
  - Department of Genetics, Hebei Medical University, Hebei Key Lab of Laboratory Animal, Shijiazhuang, Hebei Province, 050017, China
8
Ge SX. Exploratory bioinformatics investigation reveals importance of "junk" DNA in early embryo development. BMC Genomics 2017; 18:200. [PMID: 28231763 PMCID: PMC5324221 DOI: 10.1186/s12864-017-3566-0] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Received: 10/13/2016] [Accepted: 02/07/2017] [Indexed: 11/10/2022]
Abstract
BACKGROUND Instead of testing predefined hypotheses, the goal of exploratory data analysis (EDA) is to find what data can tell us. Following this strategy, we re-analyzed a large body of genomic data to study the complex gene regulation in mouse pre-implantation development (PD). RESULTS Starting with a single-cell RNA-seq dataset consisting of 259 mouse embryonic cells derived from zygote to blastocyst stages, we reconstructed the temporal and spatial gene expression pattern during PD. The dynamics of gene expression can be partially explained by the enrichment of transposable elements in gene promoters and the similarity of expression profiles with those of corresponding transposons. Long Terminal Repeats (LTRs) are associated with transient, strong induction of many nearby genes at the 2-4 cell stages, probably by providing binding sites for Obox and other homeobox factors. B1 and B2 SINEs (Short Interspersed Nuclear Elements) are correlated with the upregulation of thousands of nearby genes during zygotic genome activation. Such enhancer-like effects are also found for human Alu and bovine tRNA SINEs. SINEs also seem to be predictive of gene expression in embryonic stem cells (ESCs), raising the possibility that they may also be involved in regulating pluripotency. We also identified many potential transcription factors underlying PD and discussed the evolutionary necessity of transposons in enhancing genetic diversity, especially for species with longer generation time. CONCLUSIONS Together with other recent studies, our results provide further evidence that many transposable elements may play a role in establishing the expression landscape in early embryos. It also demonstrates that exploratory bioinformatics investigation can pinpoint developmental pathways for further study, and serve as a strategy to generate novel insights from big genomic data.
Affiliation(s)
- Steven Xijin Ge
  - Department of Mathematics and Statistics, South Dakota State University, Box 2225, Brookings, SD, 57110, USA
9
Wang X, Ma Z, Kong X, Lv Z. Effects of RNAs on chromatin accessibility and gene expression suggest RNA-mediated activation. Int J Biochem Cell Biol 2016; 79:24-32. [PMID: 27497987 DOI: 10.1016/j.biocel.2016.08.004] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 06/10/2015] [Revised: 08/02/2016] [Accepted: 08/03/2016] [Indexed: 01/20/2023]
Abstract
The study of the interaction between RNA and DNA sequences in activating genes has important significance for understanding the mechanisms of RNA-mediated activation. Here, we used in vitro chromatin reconstitution approach to observe whether RNAs increase DNase I digestion, plasmid transfection to observe whether RNAs promote gene expression, and bioinformatics analysis to predict the binding ability of RNAs to centromere DNA (constitutive heterochromatin). Synthetic RNAs (23nt) that were complementary to mouse albumin gene and total liver RNA increased DNase I digestion sensitivity of mouse albumin gene, suggesting that RNAs can increase chromatin accessibility. Transcribed sense-antisense tandem Alu elements activated an enhanced green fluorescent protein reporter gene after stable transfection. Bioinformatics analysis showed that the binding strength of RNA population to centromere DNAs is significantly lower than that of their flanking sequences, which suggests that the centromere is not easily affected by RNAs produced from other transcribed regions and may be the reason why centromeres consist of constitutive heterochromatin. The results in this paper illustrate that RNAs complementary to DNA sequences play roles in activating genes. Since RNA is mainly produced from the cell's own DNA, the work presented in this paper suggests that RNAs transcribed from DNA create feedback that activates DNA transcription.
Affiliation(s)
- Xiufang Wang
  - Department of Genetics, Hebei Medical University, Hebei Key Lab of Laboratory Animal, Shijiazhuang, Hebei Province, China
- Zhihong Ma
  - Department of Genetics, Hebei Medical University, Hebei Key Lab of Laboratory Animal, Shijiazhuang, Hebei Province, China; Clinical Laboratory, The Second Hospital of Tangshan, 21 North Jianshe Road, Tangshan, Hebei Province, China
- Xianglong Kong
  - Department of Genetics, Hebei Medical University, Hebei Key Lab of Laboratory Animal, Shijiazhuang, Hebei Province, China; Clinical Laboratory, Hebei Chest Hospital, 372 Shengli North Street, Shijiazhuang, Hebei Province, China
- Zhanjun Lv
  - Department of Genetics, Hebei Medical University, Hebei Key Lab of Laboratory Animal, Shijiazhuang, Hebei Province, China
10
Insights into the Evolution of a Snake Venom Multi-Gene Family from the Genomic Organization of Echis ocellatus SVMP Genes. Toxins (Basel) 2016; 8:toxins8070216. [PMID: 27420095 PMCID: PMC4963849 DOI: 10.3390/toxins8070216] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 06/12/2016] [Revised: 06/29/2016] [Accepted: 07/06/2016] [Indexed: 02/04/2023]
Abstract
The molecular events underlying the evolution of the Snake Venom Metalloproteinase (SVMP) family from an A Disintegrin And Metalloproteinase (ADAM) ancestor remain poorly understood. Comparative genomics may provide decisive information to reconstruct the evolutionary history of this multi-locus toxin family. Here, we report the genomic organization of Echis ocellatus genes encoding SVMPs from the PII and PI classes. Comparisons between them and between these genes and the genomic structures of Anolis carolinensis ADAM28 and E. ocellatus PIII-SVMP EOC00089 suggest that insertions and deletions of intronic regions played key roles along the evolutionary pathway that shaped the current diversity within the multi-locus SVMP gene family. In particular, our data suggest that emergence of EOC00028-like PI-SVMP from an ancestral PII(e/d)-type SVMP involved splicing site mutations that abolished both the 3′ splice AG acceptor site of intron 12* and the 5′ splice GT donor site of intron 13*, and resulted in the intronization of exon 13* and the consequent destruction of the structural integrity of the PII-SVMP characteristic disintegrin domain.
11
Knothe C, Shiratori H, Resch E, Ultsch A, Geisslinger G, Doehring A, Lötsch J. Disagreement between two common biomarkers of global DNA methylation. Clin Epigenetics 2016; 8:60. [PMID: 27222668 PMCID: PMC4877994 DOI: 10.1186/s13148-016-0227-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 03/21/2016] [Accepted: 05/10/2016] [Indexed: 12/30/2022]
Abstract
BACKGROUND The quantification of global DNA methylation has been established in epigenetic screening. As more practicable alternatives to the HPLC-based gold standard, the methylation analysis of CpG islands in repeatable elements (LINE-1) and the luminometric methylation assay (LUMA) of overall 5-methylcytosine content in "CCGG" recognition sites are most widely used. Both methods are applied as virtually equivalent, despite the hints that their results only partly agree. This triggered the present agreement assessments. RESULTS Three different human cell types (cultured MCF7 and SHSY5Y cell lines treated with different chemical modulators of DNA methylation and whole blood drawn from pain patients and healthy volunteers) were submitted to the global DNA methylation assays employing LINE-1 or LUMA-based pyrosequencing measurements. The agreement between the two bioassays was assessed using generally accepted approaches to the statistics for laboratory method comparison studies. Although global DNA methylation levels measured by the two methods correlated, five different lines of statistical evidence consistently rejected the assumption of complete agreement. Specifically, a bias was observed between the two methods. In addition, both the magnitude and direction of bias were tissue-dependent. Interassay differences could be grouped based on Bayesian statistics, and these groups allowed in turn to re-identify the originating tissue. CONCLUSIONS Although providing partly correlated measurements of DNA methylation, interchangeability of the quantitative results obtained with LINE-1 and LUMA was jeopardized by a consistent bias between the results. Moreover, the present analyses strongly indicate a tissue specificity of the differences between the two methods.
Affiliation(s)
- Claudia Knothe
  - Institute of Clinical Pharmacology, Goethe University, Theodor-Stern-Kai 7, 60590 Frankfurt am Main, Germany
- Hiromi Shiratori
  - Project Group Translational Medicine and Pharmacology TMP, Fraunhofer Institute for Molecular Biology and Applied Ecology IME, Theodor-Stern-Kai 7, 60590 Frankfurt am Main, Germany
- Eduard Resch
  - Project Group Translational Medicine and Pharmacology TMP, Fraunhofer Institute for Molecular Biology and Applied Ecology IME, Theodor-Stern-Kai 7, 60590 Frankfurt am Main, Germany
- Alfred Ultsch
  - DataBionics Research Group, University of Marburg, Hans-Meerwein-Straße, 35032 Marburg, Germany
- Gerd Geisslinger
  - Institute of Clinical Pharmacology, Goethe University, Theodor-Stern-Kai 7, 60590 Frankfurt am Main, Germany
  - Project Group Translational Medicine and Pharmacology TMP, Fraunhofer Institute for Molecular Biology and Applied Ecology IME, Theodor-Stern-Kai 7, 60590 Frankfurt am Main, Germany
- Alexandra Doehring
  - Institute of Clinical Pharmacology, Goethe University, Theodor-Stern-Kai 7, 60590 Frankfurt am Main, Germany
- Jörn Lötsch
  - Institute of Clinical Pharmacology, Goethe University, Theodor-Stern-Kai 7, 60590 Frankfurt am Main, Germany
  - Project Group Translational Medicine and Pharmacology TMP, Fraunhofer Institute for Molecular Biology and Applied Ecology IME, Theodor-Stern-Kai 7, 60590 Frankfurt am Main, Germany
12
Daniel C, Behm M, Öhman M. The role of Alu elements in the cis-regulation of RNA processing. Cell Mol Life Sci 2015; 72:4063-76. [PMID: 26223268 PMCID: PMC11113721 DOI: 10.1007/s00018-015-1990-3] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 01/02/2015] [Revised: 07/06/2015] [Accepted: 07/09/2015] [Indexed: 01/18/2023]
Abstract
The human genome is under constant invasion by retrotransposable elements. The most successful of these are the Alu elements; with a copy number of over a million, they occupy about 10 % of the entire genome. Interestingly, the vast majority of these Alu insertions are located in gene-rich regions, and one-third of all human genes contains an Alu insertion. Alu sequences are often embedded in gene sequence encoding pre-mRNAs and mature mRNAs, usually as part of their intron or UTRs. Once transcribed, they can regulate gene expression as well as increase the number of RNA isoforms expressed in a tissue or a species. They also regulate the function of other RNAs, like microRNAs, circular RNAs, and potentially long non-coding RNAs. Mechanistically, Alu elements exert their effects by influencing diverse processes, such as RNA editing, exonization, and RNA processing. In so doing, they have undoubtedly had a profound effect on human evolution.
Affiliation(s)
- Chammiran Daniel
  - Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Svante Arrheniusväg 20C, 106 91, Stockholm, Sweden
- Mikaela Behm
  - Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Svante Arrheniusväg 20C, 106 91, Stockholm, Sweden
- Marie Öhman
  - Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Svante Arrheniusväg 20C, 106 91, Stockholm, Sweden
13
Liang KC, Tseng JT, Tsai SJ, Sun HS. Characterization and distribution of repetitive elements in association with genes in the human genome. Comput Biol Chem 2015; 57:29-38. [DOI: 10.1016/j.compbiolchem.2015.02.007] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 12/31/2014] [Accepted: 02/03/2015] [Indexed: 11/27/2022]
14
Abstract
The searching of human housekeeping (HK) genes has been a long quest since the emergence of transcriptomics, and is instrumental for us to understand the structure of genome and the fundamentals of biological processes. The resolved genes are frequently used in evolution studies and as normalization standards in quantitative gene-expression analysis. Within the past 20 years, more than a dozen HK-gene studies have been conducted, yet none of them sampled human tissues completely. We believe an integration of these results will help remove false positive genes owing to the inadequate sampling. Surprisingly, we only find one common gene across 15 examined HK-gene datasets comprising 187 different tissue and cell types. Our subsequent analyses suggest that it might not be appropriate to rigidly define HK genes as expressed in all tissue types that have diverse developmental, physiological, and pathological states. It might be beneficial to use more robustly identified HK functions for filtering criteria, in which the representing genes can be a subset of genome. These genes are not necessarily the same, and perhaps need not to be the same, everywhere in our body.
Affiliation(s)
- Yijuan Zhang
- Department of Chemistry and Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
| | - Ding Li
- Department of Chemistry and Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
| | - Bingyun Sun
- Department of Chemistry and Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, British Columbia, Canada
|
15
|
Variation and constraints in species-specific promoter sequences. J Theor Biol 2014; 363:357-66. [PMID: 25149367 DOI: 10.1016/j.jtbi.2014.08.006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/26/2014] [Revised: 07/30/2014] [Accepted: 08/04/2014] [Indexed: 11/24/2022]
Abstract
A vast literature is now devoted to the search for correlations between transcription-related functions and the composition of sequences upstream of the Transcription Start Site. Little is known about the possible functional effects of nucleotide distributions on the conformational landscape of DNA in such regions. We have used suitable statistical indicators for identifying sequences that may play an important role in regulating transcription processes. In particular, we have analyzed base composition, periodicity and information content in sets of aligned promoters clustered according to functional information in order to gain insight into the main structural differences between promoters regulating genes with different functions. Our results show that when we select promoters according to some biological information, in a single species, at least in vertebrates, we observe structurally different classes of sequences. The highly variable and differentiated gene expression patterns may explain the great extent of structural differentiation observed in complex organisms. In fact, although our analysis is focused on Homo sapiens, we also provide a comparison with other species, selected at different positions in the phylogenetic tree.
|
16
|
Olins AL, Ishaque N, Chotewutmontri S, Langowski J, Olins DE. Retrotransposon Alu is enriched in the epichromatin of HL-60 cells. Nucleus 2014; 5:237-46. [PMID: 24824428 DOI: 10.4161/nucl.29141] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Indexed: 12/13/2022] Open
Abstract
Epichromatin, the surface of chromatin facing the nuclear envelope in an interphase nucleus, reveals a "rim" staining pattern with specific mouse monoclonal antibodies against histone H2A/H2B/DNA and phosphatidylserine epitopes. Employing a modified ChIP-Seq procedure on undifferentiated and differentiated human leukemic (HL-60/S4) cells, >95% of assembled epichromatin regions overlapped with Alu retrotransposons. They also exhibited enrichment of the AluS subfamily and of Alu oligomers. Furthermore, mapping epichromatin regions to the human chromosomes revealed highly similar localization patterns in the various cell states and with the different antibodies. Comparisons with available epigenetic databases suggested that epichromatin is neither "classical" heterochromatin nor highly expressing genes, implying another function at the surface of interphase chromatin. A modified chromatin immunoprecipitation procedure (xxChIP) was developed because the studied antibodies react generally with mononucleosomes and lysed chromatin. A second fixation is necessary to securely attach the antibodies to the epichromatin epitopes of the intact nucleus.
Affiliation(s)
- Ada L Olins
- Department of Pharmaceutical Sciences; College of Pharmacy; University of New England; Portland, ME USA
| | - Naveed Ishaque
- Division of Theoretical Bioinformatics; German Cancer Research Center (DKFZ); Heidelberg, Germany; Heidelberg Center for Personalized Oncology; German Cancer Research Center (DKFZ); Heidelberg, Germany
| | - Sasithorn Chotewutmontri
- German Cancer Research Center; Genomics and Proteomics Core Facility, High Throughput Sequencing Unit; Heidelberg, Germany
| | - Jörg Langowski
- Biophysik der Makromoleküle; German Cancer Research Center; Heidelberg, Germany
| | - Donald E Olins
- Department of Pharmaceutical Sciences; College of Pharmacy; University of New England; Portland, ME USA
|
17
|
Jjingo D, Conley AB, Wang J, Mariño-Ramírez L, Lunyak VV, Jordan IK. Mammalian-wide interspersed repeat (MIR)-derived enhancers and the regulation of human gene expression. Mob DNA 2014; 5:14. [PMID: 25018785 PMCID: PMC4090950 DOI: 10.1186/1759-8753-5-14] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.4] [Received: 08/30/2013] [Accepted: 04/10/2014] [Indexed: 11/26/2022] Open
Abstract
Background: Mammalian-wide interspersed repeats (MIRs) are the most ancient family of transposable elements (TEs) in the human genome. The deep conservation of MIRs initially suggested the possibility that they had been exapted to play functional roles for their host genomes. MIRs also happen to be the only TEs whose presence in-and-around human genes is positively correlated to tissue-specific gene expression. Similar associations of enhancer prevalence within genes and tissue-specific expression, along with MIRs’ previous implication as providing regulatory sequences, suggested a possible link between MIRs and enhancers.
Results: To test the possibility that MIRs contribute functional enhancers to the human genome, we evaluated the relationship between MIRs and human tissue-specific enhancers in terms of genomic location, chromatin environment, regulatory function, and mechanistic attributes. This analysis revealed MIRs to be highly concentrated in enhancers of the K562 and HeLa human cell-types. Significantly more enhancers were found to be linked to MIRs than would be expected by chance, and putative MIR-derived enhancers are characterized by a chromatin environment highly similar to that of canonical enhancers. MIR-derived enhancers show strong associations with gene expression levels, tissue-specific gene expression and tissue-specific cellular functions, including a number of biological processes related to erythropoiesis. MIR-derived enhancers were found to be a rich source of transcription factor binding sites, underscoring one possible mechanistic route for the element sequences co-option as enhancers. There is also tentative evidence to suggest that MIR-enhancer function is related to the transcriptional activity of non-coding RNAs.
Conclusions: Taken together, these data reveal enhancers to be an important cis-regulatory platform from which MIRs can exercise a regulatory function in the human genome and help to resolve a long-standing conundrum as to the reason for MIRs’ deep evolutionary conservation.
Affiliation(s)
- Daudi Jjingo
- School of Biology, Georgia Institute of Technology, Atlanta, GA, USA
| | - Andrew B Conley
- School of Biology, Georgia Institute of Technology, Atlanta, GA, USA
| | - Jianrong Wang
- School of Biology, Georgia Institute of Technology, Atlanta, GA, USA
| | - Leonardo Mariño-Ramírez
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA ; PanAmerican Bioinformatics Institute, Santa Marta, Magdalena, Colombia
| | - Victoria V Lunyak
- PanAmerican Bioinformatics Institute, Santa Marta, Magdalena, Colombia ; Buck Institute for Research on Aging, Novato, CA, USA
| | - I King Jordan
- School of Biology, Georgia Institute of Technology, Atlanta, GA, USA ; PanAmerican Bioinformatics Institute, Santa Marta, Magdalena, Colombia
|
18
|
Chiang AWT, Shaw GTW, Hwang MJ. Partitioning the human transcriptome using HKera, a novel classifier of housekeeping and tissue-specific genes. PLoS One 2013; 8:e83040. [PMID: 24376628 PMCID: PMC3869736 DOI: 10.1371/journal.pone.0083040] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/13/2013] [Accepted: 10/30/2013] [Indexed: 01/12/2023] Open
Abstract
High-throughput transcriptomic experiments have made it possible to classify genes that are ubiquitously expressed as housekeeping (HK) genes and those expressed only in selective tissues as tissue-specific (TS) genes. Although partitioning a transcriptome into HK and TS genes is conceptually problematic owing to the lack of precise definitions and gene expression profile criteria for the two, information whether a gene is an HK or a TS gene can provide an initial clue to its cellular and/or functional role. Consequently, the development of new and novel HK (TS) classification methods has been a topic of considerable interest in post-genomics research. Here, we report such a development. Our method, called HKera, differs from the others by utilizing a novel property of HK genes that we have previously uncovered, namely that the ranking order of their expression levels, as opposed to the expression levels themselves, tends to be preserved from one tissue to another. Evaluated against multiple benchmark sets of human HK genes, including one recently derived from second generation sequencing data, HKera was shown to perform significantly better than five other classifiers that use different methodologies. An enrichment analysis of pathway and gene ontology annotations showed that HKera-predicted HK and TS genes have distinct functional roles and, together, cover most of the ontology categories. These results show that HKera is a good transcriptome partitioner that can be used to search for, and obtain useful expression and functional information for, novel HK (TS) genes.
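The property this abstract relies on, that the ranking order of HK-gene expression levels tends to be preserved from one tissue to another, can be illustrated with a rank correlation. The sketch below is not the HKera classifier, whose algorithm is not described here. It only scores how well a candidate gene set keeps its expression ranking across tissues, using the mean pairwise Spearman correlation. The function names and the toy expression values are hypothetical.

```python
from itertools import combinations


def ranks(values):
    """Return 1-based ranks of `values`, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def mean_rank_preservation(expression):
    """Mean pairwise Spearman correlation across tissues.

    `expression` maps a tissue name to a list of expression values,
    with the same gene order in every tissue (hypothetical input layout).
    """
    pairs = list(combinations(expression, 2))
    return sum(spearman(expression[a], expression[b]) for a, b in pairs) / len(pairs)


# Toy data (made up): absolute levels differ, but the gene ranking is identical.
toy = {
    "liver": [10, 20, 30, 40],
    "brain": [1, 2, 3, 4],
    "heart": [5, 50, 500, 5000],
}
score = mean_rank_preservation(toy)
```

A score near 1 means the genes keep the same ordering in every tissue even when their absolute levels differ, which is the distinction between rank order and expression level that the abstract draws.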
Affiliation(s)
- Austin W. T. Chiang
- Bioinformatics Program, Taiwan International Graduate Program, Institute of Information Science, Academia Sinica, Taipei, Taiwan
- Institute of BioMedical Informatics, National Yang-Ming University, Taipei, Taiwan
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
| | - Grace T. W. Shaw
- Institute of BioMedical Informatics, National Yang-Ming University, Taipei, Taiwan
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
| | - Ming-Jing Hwang
- Bioinformatics Program, Taiwan International Graduate Program, Institute of Information Science, Academia Sinica, Taipei, Taiwan
- Institute of BioMedical Informatics, National Yang-Ming University, Taipei, Taiwan
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
|
19
|
Linker S, Hedges D. Linear decay of retrotransposon antisense bias across genes is contingent upon tissue specificity. PLoS One 2013; 8:e79402. [PMID: 24244495 PMCID: PMC3828378 DOI: 10.1371/journal.pone.0079402] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/11/2013] [Accepted: 09/28/2013] [Indexed: 12/23/2022] Open
Abstract
Retrotransposons comprise approximately half of the human genome and contribute to chromatin structure, regulatory motifs, and protein-coding sequences. Since retrotransposon insertions can disrupt functional genetic elements as well as introduce new sequence motifs to a region, they have the potential to affect the function of genes that harbour insertions as well as those nearby. Partly as a result of these effects, the distribution of retrotransposons across the genome is non-uniform and there are observed imbalances in the orientation of insertions with respect to the transcriptional direction of the containing gene. Although some of the factors underlying the observed distributions are understood, much of the variability remains unexplained. Detailed characterization of retrotransposon density in genes could help inform predictions of the functional consequence of de novo as well as polymorphic insertions. In order to characterize the relationship between genes and inserted elements, we have examined the distribution of retrotransposons and their internal motifs within tissue-specific and housekeeping genes. We have identified that the previously established retrotransposon antisense bias decays at a linear rate across genes, resulting in an equal density of sense and antisense retrotransposons near the 3'-UTR. In addition, the decay of antisense bias across genes is less pronounced among tissue-specific genes. Our results provide support for the scenario in which this linear decay in antisense bias is established by natural selection shortly after retrotransposon integration, and that total antisense bias observed is above and beyond any bias introduced by the integration process itself. Finally, we provide an example of a retrotransposon acting as an eQTL on a coincident gene, highlighting one of several possible avenues through which insertions may modulate gene function.
Affiliation(s)
- Sara Linker
- Hussman Institute for Human Genomics, Dr John T. Macdonald Foundation Department of Human Genetics, Miller School of Medicine, University of Miami, Miami, Florida, United States of America
| | - Dale Hedges
- Division of Human Genetics, Department of Internal Medicine, The Ohio State University, Columbus, Ohio, United States of America
|
20
|
Eisenberg E, Levanon EY. Human housekeeping genes, revisited. Trends Genet 2013; 29:569-74. [PMID: 23810203 DOI: 10.1016/j.tig.2013.05.010] [Citation(s) in RCA: 842] [Impact Index Per Article: 70.2] [Received: 04/15/2013] [Revised: 05/06/2013] [Accepted: 05/30/2013] [Indexed: 10/26/2022]
Abstract
Housekeeping genes are involved in basic cell maintenance and, therefore, are expected to maintain constant expression levels in all cells and conditions. Identification of these genes facilitates exposure of the underlying cellular infrastructure and increases understanding of various structural genomic features. In addition, housekeeping genes are instrumental for calibration in many biotechnological applications and genomic studies. Advances in our ability to measure RNA expression have resulted in a gradual increase in the number of identified housekeeping genes. Here, we describe housekeeping gene detection in the era of massive parallel sequencing and RNA-seq. We emphasize the importance of expression at a constant level and provide a list of 3804 human genes that are expressed uniformly across a panel of tissues. Several exceptionally uniform genes are singled out for future experimental use, such as RT-PCR control genes. Finally, we discuss both ways in which current technology can meet some of past obstacles encountered, and several as yet unmet challenges.
Affiliation(s)
- Eli Eisenberg
- Raymond and Beverly Sackler School of Physics and Astronomy, Tel-Aviv University, Tel Aviv 69978, Israel.
|
21
|
Wagstaff BJ, Hedges DJ, Derbes RS, Campos Sanchez R, Chiaromonte F, Makova KD, Roy-Engel AM. Rescuing Alu: recovery of new inserts shows LINE-1 preserves Alu activity through A-tail expansion. PLoS Genet 2012; 8:e1002842. [PMID: 22912586 PMCID: PMC3415434 DOI: 10.1371/journal.pgen.1002842] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.3] [Received: 12/05/2011] [Accepted: 05/30/2012] [Indexed: 12/15/2022] Open
Abstract
Alu elements are trans-mobilized by the autonomous non-LTR retroelement, LINE-1 (L1). Alu-induced insertion mutagenesis contributes to about 0.1% human genetic disease and is responsible for the majority of the documented instances of human retroelement insertion-induced disease. Here we introduce a SINE recovery method that provides a complementary approach for comprehensive analysis of the impact and biological mechanisms of Alu retrotransposition. Using this approach, we recovered 226 de novo tagged Alu inserts in HeLa cells. Our analysis reveals that in human cells marked Alu inserts driven by either exogenously supplied full length L1 or ORF2 protein are indistinguishable. Four percent of de novo Alu inserts were associated with genomic deletions and rearrangements and lacked the hallmarks of retrotransposition. In contrast to L1 inserts, 5′ truncations of Alu inserts are rare, as most of the recovered inserts (96.5%) are full length. De novo Alus show a random pattern of insertion across chromosomes, but further characterization revealed an Alu insertion bias exists favoring insertion near other SINEs, highly conserved elements, with almost 60% landing within genes. De novo Alu inserts show no evidence of RNA editing. Priming for reverse transcription rarely occurred within the first 20 bp (most 5′) of the A-tail. The A-tails of recovered inserts show significant expansion, with many at least doubling in length. Sequence manipulation of the construct led to the demonstration that the A-tail expansion likely occurs during insertion due to slippage by the L1 ORF2 protein. We postulate that the A-tail expansion directly impacts Alu evolution by reintroducing new active source elements to counteract the natural loss of active Alus and minimizing Alu extinction.
SINEs are mobile elements that are found ubiquitously throughout a large diversity of genomes from plants to mammals. The human SINE, Alu, is among the most successful mobile elements, with more than one million copies in the genome. Due to its high activity and ability to insert throughout the genome, Alu retrotransposition is responsible for the majority of diseases reported to be caused by mobile element activity. To further evaluate the genomic impact of SINEs, we recovered and characterized over 200 de novo Alu inserts under controlled conditions. Our data reinforce observations on the mutagenic potential of Alu, with newly retrotransposed Alu elements favoring insertion into genic and highly conserved elements. Alu-mediated deletions and rearrangements are infrequent and lack the typical hallmarks of TPRT retrotransposition, suggesting the use of an alternate method for resolving retrotransposition intermediates or an atypical insertion mechanism. Our data also provide novel insights into SINE retrotransposition biology. We found that slippage of L1 ORF2 protein during reverse transcription expands the A-tails of de novo insertions. We propose that the L1 ORF2 protein plays a major role in minimizing Alu extinction by reintroducing active Alu elements to counter the natural loss of Alu source elements.
Affiliation(s)
- Bradley J. Wagstaff
- Tulane Cancer Center, Department of Epidemiology, Tulane University, New Orleans, Louisiana, United States of America
| | - Dale J. Hedges
- Hussman Institute for Human Genomics, Dr. John T. Macdonald Foundation Department of Human Genetics, Miller School of Medicine, University of Miami, Miami, Florida, United States of America
| | - Rebecca S. Derbes
- Tulane Cancer Center, Department of Epidemiology, Tulane University, New Orleans, Louisiana, United States of America
| | - Rebeca Campos Sanchez
- Department of Biology, Center for Medical Genomics, Pennsylvania State University, University Park, Pennsylvania, United States of America
| | - Francesca Chiaromonte
- Department of Biology, Center for Medical Genomics, Pennsylvania State University, University Park, Pennsylvania, United States of America
| | - Kateryna D. Makova
- Department of Biology, Center for Medical Genomics, Pennsylvania State University, University Park, Pennsylvania, United States of America
| | - Astrid M. Roy-Engel
- Tulane Cancer Center, Department of Epidemiology, Tulane University, New Orleans, Louisiana, United States of America
|
22
|
Wang Z, Willard HF. Evidence for sequence biases associated with patterns of histone methylation. BMC Genomics 2012; 13:367. [PMID: 22857523 PMCID: PMC3532361 DOI: 10.1186/1471-2164-13-367] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 06/09/2011] [Accepted: 07/18/2012] [Indexed: 11/19/2022] Open
Abstract
Background: Combinations of histone variants and modifications, conceptually representing a histone code, have been proposed to play a significant role in gene regulation and developmental processes in complex organisms. While various mechanisms have been implicated in establishing and maintaining epigenetic patterns at specific locations in the genome, they are generally believed to be independent of primary DNA sequence on a more global scale.
Results: To address this systematically in the case of the human genome, we have analyzed primary DNA sequences underlying patterns of 19 different methylated histones in human primary T-cells and patterns of three methylated histones across additional human cell lines. We report strong sequence biases associated with most of these histone marks genome-wide in each cell type. Furthermore, the sequence characteristics for such association are distinct for different groups of histone marks.
Conclusions: These findings provide evidence of an influence of genomic sequence on patterns of histone modification associated with gene expression and chromatin programming, and they suggest that the mechanisms responsible for global histone modifications may interpret genomic sequence in various ways.
Affiliation(s)
- Zhong Wang
- Genome Biology Group, Duke Institute for Genome Sciences & Policy, Duke University, Durham, NC 27708, USA
|
23
|
Sigurdsson MI, Smith AV, Bjornsson HT, Jonsson JJ. Distribution of a marker of germline methylation differs between major families of transposon-derived repeats in the human genome. Gene 2011; 492:104-9. [PMID: 22093876 DOI: 10.1016/j.gene.2011.10.046] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 06/08/2011] [Revised: 10/18/2011] [Accepted: 10/27/2011] [Indexed: 11/18/2022]
Abstract
A potential relationship between transposon-derived repeats (TDR) and human germline methylation is of biological importance since many genes are flanked by TDR and methylation could affect the expression of nearby genes. Furthermore, DNA methylation has been suggested as a global defense mechanism against genome instability threatened by TDR. We studied the correlation between the density of HapMap methyl-associated SNPs (mSNPs), a marker of germline methylation, and proportion of TDR. After correcting for confounding variables, we found a negative correlation between proportion of Alu repeats and mSNP density for 125-1000 kb windows. Similar results were found for the most active subgroup of repeats. In contrast, a negative correlation between proportion of L1 repeats and mSNP density was found only in the larger 1000 kb windows. Using methylation data on germ cells (sperm) from the Human Epigenome Project, we found a lower proportion of Alu repeats adjacent (3-15 kb) to hypermethylated amplicons. On the contrary, there was a higher proportion of L1 repeats in the 3-5 kb of sequence flanking hypermethylated amplicons but not in the 10-15 kb flanks. Our data indicate a differential response to the major repeat families and that DNA methylation is unlikely to be a uniform global defense system against all TDR. It appears to play a role for the L1 subgroup, with sequences adjacent to L1 repeats methylated in response to their proximity. In contrast, sequences adjacent to Alu repeats appear to be hypomethylated, arguing against a role of methylation in germline defense against those elements.
Affiliation(s)
- Martin I Sigurdsson
- Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Iceland, IS-101, and Department of Genetics and Molecular Medicine, Landspitali-University Hospital, Reykjavik, IS-101, Iceland
|
24
|
Woody JL, Shoemaker RC. Gene expression: sizing it all up. Front Genet 2011; 2:70. [PMID: 22303365 PMCID: PMC3268623 DOI: 10.3389/fgene.2011.00070] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 07/11/2011] [Accepted: 09/29/2011] [Indexed: 11/13/2022] Open
Abstract
Genomic architecture appears to be a largely unexplored component of gene expression. That architecture can be related to chromatin domains, transposable element neighborhoods, epigenetic modifications of the genome, and more. Although surely not the end of the story, we are learning that when it comes to gene expression, size is also important. We have been surprised to find that certain patterns of expression, tissue specific versus constitutive, or high expression versus low expression, are often associated with physical attributes of the gene and genome. Multiple studies have shown an inverse relationship between gene expression patterns and various physical parameters of the genome such as intron size, exon size, intron number, and size of intergenic regions. An increase in expression level and breadth often correlates with a decrease in the size of physical attributes of the gene. Three models have been proposed to explain these relationships. Contradictory results were found in several organisms when expression level and expression breadth were analyzed independently. However, when both factors were combined in a single study a novel relationship was revealed. At low levels of expression, an increase in expression breadth correlated with an increase in genic, intergenic, and intragenic sizes. Contrastingly, at high levels of expression, an increase in expression breadth inversely correlated with the size of the gene. In this article we explore the several hypotheses regarding genome physical parameters and gene expression.
|
25
|
Dong B, Zhang P, Chen X, Liu L, Wang Y, He S, Chen R. Predicting housekeeping genes based on Fourier analysis. PLoS One 2011; 6:e21012. [PMID: 21687628 PMCID: PMC3110801 DOI: 10.1371/journal.pone.0021012] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 10/22/2010] [Accepted: 05/18/2011] [Indexed: 11/19/2022] Open
Abstract
Housekeeping genes (HKGs) generally have fundamental functions in basic biochemical processes in organisms, and usually have relatively steady expression levels across various tissues. They play an important role in the normalization of microarray technology. Using Fourier analysis we transformed gene expression time-series from a HeLa cell cycle gene expression dataset into Fourier spectra, and designed an effective computational method for discriminating between HKGs and non-HKGs using the support vector machine (SVM) supervised learning algorithm, which can extract significant features of the spectra, providing a basis for identifying specific gene expression patterns. Using our method we identified 510 human HKGs, and then validated them by comparison with two independent sets of tissue expression profiles. Results showed that our predicted HKG set is more reliable than three previously identified sets of HKGs.
Affiliation(s)
- Bo Dong
- Bioinformatics Laboratory and National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, People's Republic of China
- Graduate School of the Chinese Academy of Sciences, Beijing, People's Republic of China
| | - Peng Zhang
- Bioinformatics Laboratory and National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, People's Republic of China
- Graduate School of the Chinese Academy of Sciences, Beijing, People's Republic of China
| | - Xiaowei Chen
- Bioinformatics Laboratory and National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, People's Republic of China
- Graduate School of the Chinese Academy of Sciences, Beijing, People's Republic of China
| | - Li Liu
- Key Laboratory of the Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, People's Republic of China
- Graduate School of the Chinese Academy of Sciences, Beijing, People's Republic of China
| | - Yunfei Wang
- Bioinformatics Laboratory and National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, People's Republic of China
- Graduate School of the Chinese Academy of Sciences, Beijing, People's Republic of China
| | - Shunmin He
- Key Laboratory of the Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, People's Republic of China
| | - Runsheng Chen
- Bioinformatics Laboratory and National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, People's Republic of China
|
26
|
Calistri E, Livi R, Buiatti M. Evolutionary trends of GC/AT distribution patterns in promoters. Mol Phylogenet Evol 2011; 60:228-35. [PMID: 21554969 DOI: 10.1016/j.ympev.2011.04.015] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 11/19/2010] [Revised: 03/25/2011] [Accepted: 04/17/2011] [Indexed: 11/18/2022]
Abstract
Nucleotide distributions in genomes are known not to be random, showing the presence of specific motifs, long and short range correlations, periodicities, etc. In particular, motifs are critical for the recognition by specific proteins affecting chromosome organization, transcription and DNA replication, but little is known about the possible functional effects of nucleotide distributions on the conformational landscape of DNA, putatively leading to differential selective pressures throughout evolution. Promoter sequences have a fundamental role in the regulation of gene activity and a vast literature suggests that their conformational landscapes may be a critical factor in gene expression dynamics. On these grounds, with the aim of investigating the putative existence of phylogenetic patterns of promoter base distributions, we analyzed GC/AT ratios along the 1000 nucleotide sequences upstream of TSS in wide sets of promoters belonging to organisms ranging from bacteria to pluricellular eukaryotes. The data obtained showed very clear phylogenetic trends throughout evolution of promoter sequence base distributions. In particular, in all cases either GC-rich or AT-rich monotone gradients were observed: the former being present in eukaryotes, the latter in bacteria along with strand biases. Moreover, within eukaryotes, GC-rich gradients increased in length from unicellular organisms to plants, to vertebrates and, within them, from ancestral to more recent species. Finally, results were thoroughly discussed with particular attention to the possible correlation between nucleotide distribution patterns, evolution, and the putative existence of differential selection pressures, deriving from structural and/or functional constraints, between and within prokaryotes and eukaryotes.
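As a rough illustration of the kind of measurement this abstract describes (base composition along the region upstream of the TSS, pooled over a set of promoters), the sketch below computes the GC fraction in consecutive windows of aligned promoter sequences. It is not the authors' procedure. The window size, the function name and the toy sequences are assumptions, and the GC fraction is used in place of a GC/AT ratio so that windows without A or T do not cause a division by zero.

```python
def gc_profile(promoters, window=100):
    """GC fraction per non-overlapping window, pooled over all promoters.

    `promoters` is a list of equal-length DNA strings, assumed to be aligned
    so that the last base of each string is the one adjacent to the TSS.
    Windows are reported in 5'-to-3' order. Characters other than A, C, G, T
    (for example N) are ignored.
    """
    if not promoters:
        raise ValueError("at least one promoter sequence is required")
    length = len(promoters[0])
    if any(len(seq) != length for seq in promoters):
        raise ValueError("all promoter sequences must have the same length")

    profile = []
    for start in range(0, length, window):
        gc = at = 0
        for seq in promoters:
            chunk = seq[start:start + window].upper()
            gc += chunk.count("G") + chunk.count("C")
            at += chunk.count("A") + chunk.count("T")
        total = gc + at
        # A window with no unambiguous bases has no defined GC fraction.
        profile.append(gc / total if total else float("nan"))
    return profile


# Toy sequences (made up): AT-rich far from the TSS, GC-rich next to it.
toy_promoters = ["AATTGGCC", "ATATGCGC"]
toy_profile = gc_profile(toy_promoters, window=4)
```

With real data, each string would be the 1000 nucleotides upstream of an annotated TSS, and a monotone rise or fall of the profile toward the TSS would correspond to the GC-rich or AT-rich gradients mentioned in the abstract.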
Affiliation(s)
- Elisa Calistri
- Dipartimento di Biologia Evoluzionistica, Universita' degli Studi di Firenze, via Romana 19, 50125 Firenze, Italy.
|
27
|
Kitkumthorn N, Mutirangura A. Long interspersed nuclear element-1 hypomethylation in cancer: biology and clinical applications. Clin Epigenetics 2011; 2:315-30. [PMID: 22704344 PMCID: PMC3365388 DOI: 10.1007/s13148-011-0032-8] [Citation(s) in RCA: 104] [Impact Index Per Article: 7.4] [Received: 12/06/2010] [Accepted: 03/20/2011] [Indexed: 12/31/2022] Open
Abstract
Epigenetic changes in long interspersed nuclear element-1s (LINE-1s or L1s) occur early during the process of carcinogenesis. A lower methylation level (hypomethylation) of LINE-1 is common in most cancers, and the methylation level is further decreased in more advanced cancers. Consequently, several previous studies have suggested the use of LINE-1 hypomethylation levels in cancer screening, risk assessment, tumor staging, and prognostic prediction. Epigenomic changes are complex, and global hypomethylation influences LINE-1s in a generalized fashion. However, the methylation levels of some loci are dependent on their locations. The consequences of LINE-1 hypomethylation are genomic instability and alteration of gene expression. There are several mechanisms that promote both of these consequences in cis. Therefore, the methylation levels of different sets of LINE-1s may represent certain phenotypes. Furthermore, the methylation levels of specific sets of LINE-1s may indicate carcinogenesis-dependent hypomethylation. LINE-1 methylation pattern analysis can classify LINE-1s into one of three classes based on the number of methylated CpG dinucleotides. These classes include hypermethylation, partial methylation, and hypomethylation. The number of partial and hypermethylated loci, but not hypomethylated LINE-1s, is different among normal cell types. Consequently, the number of hypomethylated loci is a more promising marker than methylation level in the detection of cancer DNA. Further genome-wide studies to measure the methylation level of each LINE-1 locus may improve PCR-based methylation analysis to allow for a more specific and sensitive detection of cancer DNA or for an analysis of certain cancer phenotypes.
|
28
|
Aporntewan C, Phokaew C, Piriyapongsa J, Ngamphiw C, Ittiwut C, Tongsima S, Mutirangura A. Hypomethylation of intragenic LINE-1 represses transcription in cancer cells through AGO2. PLoS One 2011; 6:e17934. [PMID: 21423624 PMCID: PMC3057998 DOI: 10.1371/journal.pone.0017934] [Citation(s) in RCA: 83] [Impact Index Per Article: 5.9] [Received: 09/14/2010] [Accepted: 02/18/2011] [Indexed: 01/23/2023] Open
Abstract
In human cancers, the methylation of long interspersed nuclear element-1 (LINE-1 or L1) retrotransposons is reduced. This occurs within the context of genome wide hypomethylation, and although it is common, its role is poorly understood. L1s are widely distributed both inside and outside of genes, intragenic and intergenic, respectively. Interestingly, the insertion of active full-length L1 sequences into host gene introns disrupts gene expression. Here, we evaluated if intragenic L1 hypomethylation influences their host gene expression in cancer. First, we extracted data from L1base (http://l1base.molgen.mpg.de), a database containing putatively active L1 insertions, and compared intragenic and intergenic L1 characters. We found that intragenic L1 sequences have been conserved across evolutionary time with respect to transcriptional activity and CpG dinucleotide sites for mammalian DNA methylation. Then, we compared regulated mRNA levels of cells from two different experiments available from Gene Expression Omnibus (GEO), a database repository of high throughput gene expression data, (http://www.ncbi.nlm.nih.gov/geo) by chi-square. The odds ratio of down-regulated genes between demethylated normal bronchial epithelium and lung cancer was high (p<1E−27; OR = 3.14; 95% CI = 2.54–3.88), suggesting cancer genome wide hypomethylation down-regulating gene expression. Comprehensive analysis between L1 locations and gene expression showed that expression of genes containing L1s had a significantly higher likelihood to be repressed in cancer and hypomethylated normal cells. In contrast, many mRNAs derived from genes containing L1s are elevated in Argonaute 2 (AGO2 or EIF2C2)-depleted cells. Hypomethylated L1s increase L1 mRNA levels. Finally, we found that AGO2 targets intronic L1 pre-mRNA complexes and represses cancer genes. These findings represent one of the mechanisms of cancer genome wide hypomethylation altering gene expression. Hypomethylated intragenic L1s are a nuclear siRNA mediated cis-regulatory element that can repress genes. This epigenetic regulation of retrotransposons likely influences many aspects of genomic biology.
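The abstract reports an odds ratio with a 95% confidence interval. As a rough illustration of how such figures relate to a 2x2 table, the sketch below computes an odds ratio with a Wald (log-scale) interval. The Wald method and the function name are assumptions made here for illustration; the abstract only states that a chi-square comparison was used and does not say how the interval was derived.
```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% Wald CI for a 2x2 table.

    Layout (hypothetical labels):
        a = group 1, outcome present    b = group 1, outcome absent
        c = group 2, outcome present    d = group 2, outcome absent
    All four counts must be positive.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```
Calling `odds_ratio_ci(20, 10, 10, 20)` on made-up counts gives an odds ratio of 4.0 with an interval that brackets it; the counts behind the published OR of 3.14 are not given in the abstract and are not reproduced here.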
Affiliation(s)
- Chatchawit Aporntewan
- Department of Mathematics, Faculty of Science, Chulalongkorn University, Bangkok, Thailand
- Chureerat Phokaew
- Inter-Department Program of BioMedical Sciences, Faculty of Graduate School, Chulalongkorn University, Bangkok, Thailand
- Jittima Piriyapongsa
- National Center for Genetic Engineering and Biotechnology, Genome Institute, Thailand Science Park, Pathumtani, Thailand
- Chumpol Ngamphiw
- National Center for Genetic Engineering and Biotechnology, Genome Institute, Thailand Science Park, Pathumtani, Thailand
- Chupong Ittiwut
- Department of Anatomy, Faculty of Medicine, Center of Excellence in Molecular Genetics of Cancer and Human Diseases, Chulalongkorn University, Bangkok, Thailand
- Sissades Tongsima
- National Center for Genetic Engineering and Biotechnology, Genome Institute, Thailand Science Park, Pathumtani, Thailand
- Apiwat Mutirangura
- Department of Anatomy, Faculty of Medicine, Center of Excellence in Molecular Genetics of Cancer and Human Diseases, Chulalongkorn University, Bangkok, Thailand
|
29
|
Jjingo D, Huda A, Gundapuneni M, Mariño-Ramírez L, Jordan IK. Effect of the transposable element environment of human genes on gene length and expression. Genome Biol Evol 2011; 3:259-71. [PMID: 21362639 PMCID: PMC3070429 DOI: 10.1093/gbe/evr015] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Indexed: 11/17/2022] Open
Abstract
Independent lines of investigation have documented effects of both transposable elements (TEs) and gene length (GL) on gene expression. However, TE gene fractions are highly correlated with GL, suggesting that they cannot be considered independently. We evaluated the TE environment of human genes and GL jointly in an attempt to tease apart their relative effects. TE gene fractions and GL were compared with the overall level of gene expression and the breadth of expression across tissues. GL is strongly correlated with overall expression level but weakly correlated with the breadth of expression, confirming the selection hypothesis that attributes the compactness of highly expressed genes to selection for economy of transcription. However, TE gene fractions overall, and for the L1 family in particular, show stronger anticorrelations with expression level than GL, indicating that GL may not be the most important target of selection for transcriptional economy. These results suggest a specific mechanism, removal of TEs, by which highly expressed genes are selectively tuned for efficiency. MIR elements are the only family of TEs with gene fractions that show a positive correlation with tissue-specific expression, suggesting that they may provide regulatory sequences that help to control human gene expression. Consistent with this notion, MIR fractions are relatively enriched close to transcription start sites and associated with coexpression in specific sets of related tissues. Our results confirm the overall relevance of the TE environment to gene expression and point to distinct mechanisms by which different TE families may contribute to gene regulation.
Affiliation(s)
- Daudi Jjingo
- School of Biology, Georgia Institute of Technology, GA, USA
|
30
|
Chen X, Wang M, Zhang H. The use of classification trees for bioinformatics. WILEY INTERDISCIPLINARY REVIEWS. DATA MINING AND KNOWLEDGE DISCOVERY 2011; 1:55-63. [PMID: 22523608 PMCID: PMC3329156 DOI: 10.1002/widm.14] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.1] [Indexed: 05/27/2023]
Abstract
Classification trees are non-parametric statistical learning methods that incorporate feature selection and interactions, possess intuitive interpretability, are efficient, and have high prediction accuracy when used in ensembles. This paper provides a brief introduction to the classification tree-based methods, a review of the recent developments, and a survey of the applications in bioinformatics and statistical genetics.
|
31
|
Chen X, Wang M, Zhang H. The use of classification trees for bioinformatics. WILEY INTERDISCIPLINARY REVIEWS. DATA MINING AND KNOWLEDGE DISCOVERY 2011. [PMID: 22523608 DOI: 10.1002/widm.8] [Citation(s) in RCA: 435] [Impact Index Per Article: 31.1] [Indexed: 05/14/2023]
Abstract
Classification trees are non-parametric statistical learning methods that incorporate feature selection and interactions, possess intuitive interpretability, are efficient, and have high prediction accuracy when used in ensembles. This paper provides a brief introduction to the classification tree-based methods, a review of the recent developments, and a survey of the applications in bioinformatics and statistical genetics.
|
32
|
Abstract
Transcribed regions in the human genome differ from adjacent intergenic regions in transposable element density, crossover rates, and asymmetric substitution and sequence composition patterns. We tested whether these differences reflect selection or are instead a byproduct of germline transcription, using publicly available gene expression data from a variety of germline and somatic tissues. Crossover rate shows a strong negative correlation with gene expression in meiotic tissues, suggesting that crossover is inhibited by transcription. Strand-biased composition (G+T content) and A → G versus T → C substitution asymmetry are both positively correlated with germline gene expression. We find no evidence for a strand bias in allele frequency data, implying that the substitution asymmetry reflects a mutation rather than a fixation bias. The density of transposable elements is positively correlated with germline expression, suggesting that such elements preferentially insert into regions that are actively transcribed. For each of the features examined, our analyses favor a nonselective explanation for the observed trends and point to the role of germline gene expression in shaping the mammalian genome.
Affiliation(s)
- Graham McVicker
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
|
33
|
Huda A, Jordan IK. Epigenetic Regulation of Mammalian Genomes by Transposable Elements. Ann N Y Acad Sci 2009; 1178:276-84. [DOI: 10.1111/j.1749-6632.2009.05007.x] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.0] [Indexed: 01/22/2023]
|
34
|
Gaudray P, Weber G. Genetic Background of MEN1: From Genetic Homogeneity to Functional Diversity. SUPERMEN1 2009; 668:17-26. [DOI: 10.1007/978-1-4419-1664-8_2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 01/03/2023]
|
35
|
Phokaew C, Kowudtitham S, Subbalekha K, Shuangshoti S, Mutirangura A. LINE-1 methylation patterns of different loci in normal and cancerous cells. Nucleic Acids Res 2008; 36:5704-12. [PMID: 18776216 PMCID: PMC2553567 DOI: 10.1093/nar/gkn571] [Citation(s) in RCA: 86] [Impact Index Per Article: 5.1] [Indexed: 12/31/2022] Open
Abstract
This study evaluated methylation patterns of long interspersed nuclear element-1 (LINE-1) sequences from 17 loci in several cell types, including squamous cell cancer cell lines, normal oral epithelium (NOE), white blood cells and head and neck squamous cell cancers (HNSCC). Although sequences of each LINE-1 are homologous, LINE-1 methylation levels at each locus are different. Moreover, some loci demonstrate the different methylation levels between normal tissue types. Interestingly, in some chromosomal regions, wider ranges of LINE-1 methylation levels were observed. In cancerous cells, the methylation levels of most LINE-1 loci demonstrated a positive correlation with each other and with the genome-wide levels. Therefore, the loss of genome-wide methylation in cancerous cells occurs as a generalized process. However, different LINE-1 loci showed different incidences of HNSCC hypomethylation, which is a lower methylation level than NOE. Additionally, we report a closer direct association between two LINE-1s in different EPHA3 introns. Finally, hypermethylation of some LINE-1s can be found sporadically in cancer. In conclusion, even though the global hypomethylation process that occurs in cancerous cells can generally deplete LINE-1 methylation levels, LINE-1 methylation can be influenced differentially depending on where the particular sequences are located in the genome.
Affiliation(s)
- Chureerat Phokaew
- Inter-Department Program of BioMedical Sciences, Faculty of Graduate School, Chulalongkorn University, Bangkok 10330, Thailand
|
36
|
Zhu J, He F, Song S, Wang J, Yu J. How many human genes can be defined as housekeeping with current expression data? BMC Genomics 2008; 9:172. [PMID: 18416810 PMCID: PMC2396180 DOI: 10.1186/1471-2164-9-172] [Citation(s) in RCA: 98] [Impact Index Per Article: 5.8] [Received: 12/20/2007] [Accepted: 04/16/2008] [Indexed: 12/16/2022] Open
Abstract
Background: Housekeeping (HK) genes are ubiquitously expressed in all tissue/cell types and constitute a basal transcriptome for the maintenance of basic cellular functions. Partitioning transcriptomes into HK and tissue-specific (TS) genes relatively is fundamental for studying gene expression and cellular differentiation. Although many studies have aimed at large-scale and thorough categorization of human HK genes, a meaningful consensus has yet to be reached. Results: We collected two latest gene expression datasets (both EST and microarray data) from public databases and analyzed the gene expression profiles in 18 human tissues that have been well-documented by both two data types. Benchmarked by a manually-curated HK gene collection (HK408), we demonstrated that present data from EST sampling was far from saturated, and the inadequacy has limited the gene detectability and our understanding of TS expressions. Due to a likely over-stringent threshold, microarray data showed higher false negative rate compared with EST data, leading to a significant underestimation of HK genes. Based on EST data, we found that 40.0% of the currently annotated human genes were universally expressed in at least 16 of 18 tissues, as compared to only 5.1% specifically expressed in a single tissue. Our current EST-based estimate on human HK genes ranged from 3,140 to 6,909 in number, a ten-fold increase in comparison with previous microarray-based estimates. Conclusion: We concluded that a significant fraction of human genes, at least in the currently annotated data depositories, was broadly expressed. Our understanding of tissue-specific expression was still preliminary and required much more large-scale and high-quality transcriptomic data in future studies. The new HK gene list categorized in this study will be useful for genome-wide analyses on structural and functional features of HK genes.
Affiliation(s)
- Jiang Zhu
- Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China.
|
37
|
Urrutia AO, Ocaña LB, Hurst LD. Do Alu repeats drive the evolution of the primate transcriptome? Genome Biol 2008; 9:R25. [PMID: 18241332 PMCID: PMC2374697 DOI: 10.1186/gb-2008-9-2-r25] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 10/03/2007] [Revised: 01/02/2008] [Accepted: 02/01/2008] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND: Of all repetitive elements in the human genome, Alus are unusual in being enriched near to genes that are expressed across a broad range of tissues. This has led to the proposal that Alus might be modifying the expression breadth of neighboring genes, possibly by providing CpG islands, modifying transcription factor binding, or altering chromatin structure. Here we consider whether Alus have increased expression breadth of genes in their vicinity. RESULTS: Contrary to the modification hypothesis, we find that those genes that have always had broad expression are richest in Alus, whereas those that are more likely to have become more broadly expressed have lower enrichment. This finding is consistent with a model in which Alus accumulate near broadly expressed genes but do not affect their expression breadth. Furthermore, this model is consistent with the finding that expression breadth of mouse genes predicts Alu density near their human orthologs. However, Alus were found to be related to some alternative measures of transcription profile divergence, although evidence is contradictory as to whether Alus associate with lowly or highly diverged genes. If Alu have any effect it is not by provision of CpG islands, because they are especially rare near to transcriptional start sites. Previously reported Alu enrichment for genes serving certain cellular functions, suggested to be evidence of functional importance of Alus, appears to be partly a byproduct of the association with broadly expressed genes. CONCLUSION: The abundance of Alu near broadly expressed genes is better explained by their preferential preservation near to housekeeping genes rather than by a modifying effect on expression of genes.
Affiliation(s)
- Araxi O Urrutia
- Department of Biology and Biochemistry, University of Bath, Bath, BA4 7AY, UK.
|
38
|
Laviad EL, Albee L, Pankova-Kholmyansky I, Epstein S, Park H, Merrill AH, Futerman AH. Characterization of ceramide synthase 2: tissue distribution, substrate specificity, and inhibition by sphingosine 1-phosphate. J Biol Chem 2007; 283:5677-84. [PMID: 18165233 DOI: 10.1074/jbc.m707386200] [Citation(s) in RCA: 386] [Impact Index Per Article: 21.4] [Indexed: 11/06/2022] Open
Abstract
Ceramide is an important lipid signaling molecule and a key intermediate in sphingolipid biosynthesis. Recent studies have implied a previously unappreciated role for the ceramide N-acyl chain length, inasmuch as ceramides containing specific fatty acids appear to play defined roles in cell physiology. The discovery of a family of mammalian ceramide synthases (CerS), each of which utilizes a restricted subset of acyl-CoAs for ceramide synthesis, strengthens this notion. We now report the characterization of mammalian CerS2. qPCR analysis reveals that CerS2 mRNA is found at the highest level of all CerS and has the broadest tissue distribution. CerS2 has a remarkable acyl-CoA specificity, showing no activity using C16:0-CoA and very low activity using C18:0, rather utilizing longer acyl-chain CoAs (C20-C26) for ceramide synthesis. There is a good correlation between CerS2 mRNA levels and levels of ceramide and sphingomyelin containing long acyl chains, at least in tissues where CerS2 mRNA is expressed at high levels. Interestingly, the activity of CerS2 can be regulated by another bioactive sphingolipid, sphingosine 1-phosphate (S1P), via interaction of S1P with two residues that are part of an S1P receptor-like motif found only in CerS2. These findings provide insight into the biochemical basis for the ceramide N-acyl chain composition of cells, and also reveal a novel and potentially important interplay between two bioactive sphingolipids that could be relevant to the regulation of sphingolipid metabolism and the opposing functions that these lipids play in signaling pathways.
Affiliation(s)
- Elad L Laviad
- Department of Biological Chemistry, Weizmann Institute of Science, Rehovot 76100, Israel
|
39
|
Lawson MJ, Zhang L. Housekeeping and tissue-specific genes differ in simple sequence repeats in the 5'-UTR region. Gene 2007; 407:54-62. [PMID: 17964742 DOI: 10.1016/j.gene.2007.09.017] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.2] [Received: 07/29/2007] [Revised: 09/25/2007] [Accepted: 09/26/2007] [Indexed: 12/22/2022]
Abstract
SSRs (simple sequence repeats) have been shown to have a variety of effects on an organism. In this study, we compared SSRs in housekeeping and tissue-specific genes in human and mouse, in terms of SSR types and distributions in different regions including 5'-UTRs, introns, coding exons, 3'-UTRs, and upstream regions. Among all these regions, SSRs in the 5'-UTR show the most distinction between housekeeping genes and tissue-specific genes in both densities and repeat types. Specifically, SSR densities in 5'-UTRs in housekeeping genes are about 1.7 times higher than those in tissue-specific genes, in contrast to the 0.8-1.2 times differences between the two classes of genes in other regions. Tri-SSRs in 5'-UTRs of housekeeping genes are more GC rich than those of tissue-specific genes and CGG, the dominant type of tri-SSR in 5'-UTR, accounts for 74-79% of the tri-SSRs in housekeeping genes, as compared to 42-57% in tissue-specific genes. 75% of the tri-SSRs in the 5'-UTR of housekeeping genes have 4-5 repeat units, versus the 86-90% in tissue-specific genes. Taken together, our results suggest that SSRs may have an effect on gene expression and may play an important role in contributing to the different expression profiles between housekeeping and tissue-specific genes.
Affiliation(s)
- Mark J Lawson
- Department of Computer Science, Virginia Tech, Blacksburg, VA 24061, USA
|