51
Tyekucheva S, Yolken RH, McCombie WR, Parla J, Kramer M, Wheelan SJ, Sabunciyan S. Establishing the baseline level of repetitive element expression in the human cortex. BMC Genomics 2011; 12:495. [PMID: 21985647 PMCID: PMC3207997 DOI: 10.1186/1471-2164-12-495] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 12/16/2010] [Accepted: 10/10/2011] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Although nearly half of the human genome is composed of repetitive sequences, the expression profile of these elements remains largely uncharacterized. Recently developed high throughput sequencing technologies provide us with a powerful new set of tools to study repeat elements. Hence, we performed whole transcriptome sequencing to investigate the expression of repetitive elements in human frontal cortex using postmortem tissue obtained from the Stanley Medical Research Institute. RESULTS: We found that a significant number of reads from the human frontal cortex originate from repeat elements. We also noticed that Alu elements were expressed at levels higher than expected by random or background transcription. In contrast, L1 elements were expressed at lower than expected amounts. CONCLUSIONS: Repetitive elements are expressed abundantly in the human brain. This expression pattern appears to be element specific and cannot be explained by random or background transcription. These results demonstrate that our knowledge about repetitive elements is far from complete. Further characterization is required to determine the mechanism, the control, and the effects of repeat element expression.
Affiliation(s)
- Svitlana Tyekucheva
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, 450 Brookline Ave, Boston, 02115, USA
52
53
Hedges DJ, Belancio VP. Restless genomes: humans as a model organism for understanding host-retrotransposable element dynamics. Adv Genet 2011; 73:219-62. [PMID: 21310298 DOI: 10.1016/b978-0-12-380860-8.00006-9] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Indexed: 12/25/2022]
Abstract
Since their initial discovery in maize, there have been various attempts to categorize the relationship between transposable elements (TEs) and their host organisms. These have ranged from TEs being selfish parasites to their role as essential, functional components of organismal biology. Research over the past several decades has, in many respects, only served to complicate the issue even further. On the one hand, investigators have amassed substantial evidence concerning the negative effects that TE-mutagenic activity can have on host genomes and organismal fitness. On the other hand, we find an increasing number of examples, across several taxa, of TEs being incorporated into functional biological roles for their host organism. Some 45% of our own genomes are comprised of TE copies. While many of these copies are dormant, having lost their ability to mobilize, several lineages continue to actively proliferate in modern human populations. With its complement of ancestral and active TEs, the human genome exhibits key aspects of the host-TE dynamic that has played out since early on in organismal evolution. In this review, we examine what insights the particularly well-characterized human system can provide regarding the nature of the host-TE interaction.
Affiliation(s)
- Dale J Hedges
- Hussman Institute for Human Genomics, Dr. John T. Macdonald Foundation Department of Human Genetics, Miller School of Medicine, University of Miami, Miami, Florida, USA
54
Oliver KR, Greene WK. Mobile DNA and the TE-Thrust hypothesis: supporting evidence from the primates. Mob DNA 2011; 2:8. [PMID: 21627776 PMCID: PMC3123540 DOI: 10.1186/1759-8753-2-8] [Citation(s) in RCA: 77] [Impact Index Per Article: 5.9] [Received: 02/23/2011] [Accepted: 05/31/2011] [Indexed: 02/07/2023]
Abstract
Transposable elements (TEs) are increasingly being recognized as powerful facilitators of evolution. We propose the TE-Thrust hypothesis to encompass TE-facilitated processes by which genomes self-engineer coding, regulatory, karyotypic or other genetic changes. Although TEs are occasionally harmful to some individuals, genomic dynamism caused by TEs can be very beneficial to lineages. This can result in differential survival and differential fecundity of lineages. Lineages with an abundant and suitable repertoire of TEs have enhanced evolutionary potential and, if all else is equal, tend to be fecund, resulting in species-rich adaptive radiations, and/or they tend to undergo major evolutionary transitions. Many other mechanisms of genomic change are also important in evolution, and whether the evolutionary potential of TE-Thrust is realized is heavily dependent on environmental and ecological factors. The large contribution of TEs to evolutionary innovation is particularly well documented in the primate lineage. In this paper, we review numerous cases of beneficial TE-caused modifications to the genomes of higher primates, which strongly support our TE-Thrust hypothesis.
Affiliation(s)
- Keith R Oliver
- School of Biological Sciences and Biotechnology, Faculty of Science and Engineering, Murdoch University, Perth W. A. 6150, Australia
- Wayne K Greene
- School of Veterinary and Biomedical Sciences, Faculty of Health Sciences, Murdoch University, Perth W. A. 6150, Australia
55
Lai AG, Denton-Giles M, Mueller-Roeber B, Schippers JHM, Dijkwel PP. Positional information resolves structural variations and uncovers an evolutionarily divergent genetic locus in accessions of Arabidopsis thaliana. Genome Biol Evol 2011; 3:627-40. [PMID: 21622917 PMCID: PMC3157834 DOI: 10.1093/gbe/evr038] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 12/13/2022]
Abstract
Genome sequencing of closely related individuals has yielded valuable insights that link genome evolution to phenotypic variations. However, advancement in sequencing technology has also led to an escalation in the number of poor quality–drafted genomes assembled based on reference genomes that can have highly divergent or haplotypic regions. The self-fertilizing nature of Arabidopsis thaliana poses an advantage to sequencing projects because its genome is mostly homozygous. To determine the accuracy of an Arabidopsis drafted genome in less conserved regions, we performed a resequencing experiment on a ∼371-kb genomic interval in the Landsberg erecta (Ler-0) accession. We identified novel structural variations (SVs) between Ler-0 and the reference accession Col-0 using a long-range polymerase chain reaction approach to generate an Illumina data set that has positional information, that is, a data set with reads that map to a known location. Positional information is important for accurate genome assembly and the resolution of SVs particularly in highly duplicated or repetitive regions. Sixty-one regions with misassembly signatures were identified from the Ler-0 draft, suggesting the presence of novel SVs that are not represented in the draft sequence. Sixty of those were resolved by iterative mapping using our data set. Fifteen large indels (>100 bp) identified from this study were found to be located either within protein-coding regions or upstream regulatory regions, suggesting the formation of novel alleles or altered regulation of existing genes in Ler-0. We propose future genome-sequencing experiments to follow a clone-based approach that incorporates positional information to ultimately reveal haplotype-specific differences between accessions.
Affiliation(s)
- Alvina G Lai
- Institute of Molecular BioSciences, Massey University, Private Bag 11-222, Palmerston North 4442, New Zealand
56
Kim DS, Hahn Y. Identification of human-specific transcript variants induced by DNA insertions in the human genome. Bioinformatics 2010; 27:14-21. [PMID: 21037245 DOI: 10.1093/bioinformatics/btq612] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Indexed: 12/11/2022]
Abstract
MOTIVATION: Many genes in the human genome produce a wide variety of transcript variants resulting from alternative exon splicing, differential promoter usage, or altered polyadenylation site utilization that may function differently in human cells. Here, we present a bioinformatics method for the systematic identification of human-specific novel transcript variants that might have arisen after the human-chimpanzee divergence. RESULTS: The procedure involved collecting genomic insertions that are unique to the human genome when compared with orthologous chimpanzee and rhesus macaque genomic regions, and that are expressed in the transcriptome as exons evidenced by mRNAs and/or expressed sequence tags (ESTs). Using this procedure, we identified 112 transcript variants that are specific to humans; 74 were associated with known genes and the remaining transcripts were located in unannotated genomic loci. The original source of inserts was mostly transposable elements including L1, Alu, SVA, and human endogenous retroviruses (HERVs). Interestingly, some non-repetitive genomic segments were also involved in the generation of novel transcript variants. Insert contributions to the transcripts included promoters, terminal exons and insertions in exons, splice donors and acceptors and complete exon cassettes. Comparison of personal genomes revealed that at least seven loci were polymorphic in humans. The exaptation of human-specific genomic inserts as novel transcript variants may have increased human gene versatility or affected gene regulation.
Affiliation(s)
- Dong Seon Kim
- Department of Life Science (BK21 Program), Chung-Ang University, Seoul, Korea
57
Pyle AM. The tertiary structure of group II introns: implications for biological function and evolution. Crit Rev Biochem Mol Biol 2010; 45:215-32. [PMID: 20446804 DOI: 10.3109/10409231003796523] [Citation(s) in RCA: 91] [Impact Index Per Article: 6.5] [Indexed: 11/13/2022]
Abstract
Group II introns are some of the largest ribozymes in nature, and they are a major source of information about RNA assembly and tertiary structural organization. These introns are of biological significance because they are self-splicing mobile elements that have migrated into diverse genomes and played a major role in the genomic organization and metabolism of most life forms. The tertiary structure of group II introns has been the subject of many phylogenetic, genetic, biochemical and biophysical investigations, all of which are consistent with the recent crystal structure of an intact group IIC intron from the alkaliphilic eubacterium Oceanobacillus iheyensis. The crystal structure reveals that catalytic intron domain V is enfolded within the other intronic domains through an elaborate network of diverse tertiary interactions. Within the folded core, DV adopts an activated conformation that readily binds catalytic metal ions and positions them in a manner appropriate for reaction with nucleic acid targets. The tertiary structure of the group II intron reveals new information on motifs for RNA architectural organization, mechanisms of group II intron catalysis, and the evolutionary relationships among RNA processing systems. Guided by the structure and the wealth of previous genetic and biochemical work, it is now possible to deduce the probable location of DVI and the site of additional domains that contribute to the function of the highly derived group IIB and IIA introns.
Affiliation(s)
- Anna Marie Pyle
- Department of Molecular Biophysics and Biochemistry, Howard Hughes Medical Institute and Yale University, New Haven, CT, USA.
58
Paquet Y, Anderson A. Sequence composition similarities with the 7SL RNA are highly predictive of functional genomic features. Nucleic Acids Res 2010; 38:4907-16. [PMID: 20392819 PMCID: PMC2926601 DOI: 10.1093/nar/gkq234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/02/2022]
Abstract
Transposable elements derived from the 7SL RNA gene, such as Alu elements in primates, have had remarkable success in several mammalian lineages. The results presented here show a broad spectrum of functions for genomic segments that display sequence composition similarities with the 7SL RNA gene. Using thoroughly documented loci, we report that DNaseI-hypersensitive sites can be singled out in large genomic sequences by an assessment of sequence composition similarities with the 7SL RNA gene. We apply a root word frequency approach to illustrate a distinctive relationship between the sequence of the 7SL RNA gene and several classes of functional genomic features that are not presumed to be of transposable origin. Transposable elements that show noticeable similarities with the 7SL sequence include Alu sequences, as expected, but also long terminal repeats and the 5′-untranslated regions of long interspersed repetitive elements. In sequences masked for repeated elements, we find, when using the 7SL RNA gene as query sequence, distinctive similarities with promoters, exons and distal gene regulatory regions. The latter being the most notoriously difficult to detect, this approach may be useful for finding genomic segments that have regulatory functions and that may have escaped detection by existing methods.
Affiliation(s)
- Yanick Paquet
- Centre de recherche en cancérologie de l’Université Laval, L’Hôtel-Dieu de Québec, Centre hospitalier universitaire de Québec, Québec G1R 2J6 and Département de biologie, Université Laval, Québec G1K 7P4, Canada
- Alan Anderson
- Centre de recherche en cancérologie de l’Université Laval, L’Hôtel-Dieu de Québec, Centre hospitalier universitaire de Québec, Québec G1R 2J6 and Département de biologie, Université Laval, Québec G1K 7P4, Canada
- *To whom correspondence should be addressed. Tel: +418 691 5281; Fax: +418 691 5439;
59
Mobile interspersed repeats are major structural variants in the human genome. Cell 2010; 141:1171-82. [PMID: 20602999 DOI: 10.1016/j.cell.2010.05.026] [Citation(s) in RCA: 198] [Impact Index Per Article: 14.1] [Received: 01/16/2010] [Revised: 03/29/2010] [Accepted: 05/13/2010] [Indexed: 01/22/2023]
Abstract
Characterizing structural variants in the human genome is of great importance, but a genome-wide analysis to detect interspersed repeats has not been done. Thus, the degree to which mobile DNAs contribute to genetic diversity, heritable disease, and oncogenesis remains speculative. We perform transposon insertion profiling by microarray (TIP-chip) to map human L1(Ta) retrotransposons (LINE-1s) genome-wide. This identified numerous novel human L1(Ta) insertional polymorphisms with highly variant allelic frequencies. We also explored TIP-chip's usefulness to identify candidate alleles associated with different phenotypes in clinical cohorts. Our data suggest that the occurrence of new insertions is twice as high as previously estimated, and that these repeats are under-recognized as sources of human genomic and phenotypic diversity. We have just begun to probe the universe of human L1(Ta) polymorphisms, and as TIP-chip is applied to other insertions such as Alu SINEs, it will expand the catalog of genomic variants even further.
60
Lee SH, Cho SY, Shannon MF, Fan J, Rangasamy D. The impact of CpG island on defining transcriptional activation of the mouse L1 retrotransposable elements. PLoS One 2010; 5:e11353. [PMID: 20613872 PMCID: PMC2894050 DOI: 10.1371/journal.pone.0011353] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Received: 04/06/2010] [Accepted: 05/20/2010] [Indexed: 12/31/2022]
Abstract
Background: L1 retrotransposable elements are potent insertional mutagens responsible for the generation of genomic variation and diversification of mammalian genomes, but reliable estimates of the numbers of actively transposing L1 elements are mostly nonexistent. While the human and mouse genomes contain comparable numbers of L1 elements, several phylogenetic and L1Xplore analyses in the mouse genome suggest that 1,500–3,000 active L1 elements currently exist and that they are still expanding in the genome. Conversely, the human genome contains only 150 active L1 elements. In addition, there is a discrepancy among the nature and number of mouse L1 elements in L1Xplore and the mouse genome browser at the UCSC and in the literature. To date, the reason why a high copy number of active L1 elements exist in the mouse genome but not in the human genome is unknown, as are the potential mechanisms that are responsible for transcriptional activation of mouse L1 elements. Methodology/Principal Findings: We analyzed the promoter sequences of the 1,501 potentially active mouse L1 elements retrieved from the GenBank and L1Xplore databases and evaluated their transcription factor binding sites and CpG content. To this end, we found that a substantial number of mouse L1 elements contain altered transcription factor YY1 binding sites on their promoter sequences that are required for transcriptional initiation, suggesting that only half of the L1 elements are capable of being transcriptionally active. Furthermore, we present experimental evidence that previously unreported CpG islands exist in the promoters of the most active TF family of mouse L1 elements. The presence of sequence variations and polymorphisms in CpG islands of L1 promoters that arise from transition mutations indicates that CpG methylation could play a significant role in determining the activity of L1 elements in the mouse genome.
Conclusions: A comprehensive analysis of mouse L1 promoters suggests that the number of transcriptionally active elements is significantly lower than the total number of full-length copies from the three active mouse L1 families. Like human L1 elements, the CpG islands and potentially the transcription factor YY1 binding sites are likely to be required for transcriptional initiation of mouse L1 elements.
Affiliation(s)
- Sung-Hun Lee
- The John Curtin School of Medical Research, Australian National University, Canberra, Australia
- Soo-Young Cho
- Division of Molecular and Life Sciences, Hanyang University, Ansan, Republic of Korea
- M. Frances Shannon
- The John Curtin School of Medical Research, Australian National University, Canberra, Australia
- Jun Fan
- The John Curtin School of Medical Research, Australian National University, Canberra, Australia
- Danny Rangasamy
- The John Curtin School of Medical Research, Australian National University, Canberra, Australia
- * E-mail:
61
Belancio VP, Roy-Engel AM, Deininger PL. All y'all need to know 'bout retroelements in cancer. Semin Cancer Biol 2010; 20:200-10. [PMID: 20600922 DOI: 10.1016/j.semcancer.2010.06.001] [Citation(s) in RCA: 121] [Impact Index Per Article: 8.6] [Received: 01/22/2010] [Revised: 06/14/2010] [Accepted: 06/17/2010] [Indexed: 01/08/2023]
Abstract
Genetic instability is one of the principal hallmarks and causative factors in cancer. Human transposable elements (TE) have been reported to cause human diseases, including several types of cancer through insertional mutagenesis of genes critical for preventing or driving malignant transformation. In addition to retrotransposition-associated mutagenesis, TEs have been found to contribute even more genomic rearrangements through non-allelic homologous recombination. TEs also have the potential to generate a wide range of mutations derivation of which is difficult to directly trace to mobile elements, including double strand breaks that may trigger mutagenic genomic rearrangements. Genome-wide hypomethylation of TE promoters and significantly elevated TE expression in almost all human cancers often accompanied by the loss of critical DNA sensing and repair pathways suggests that the negative impact of mobile elements on genome stability should increase as human tumors evolve. The biological consequences of elevated retroelement expression, such as the rate of their amplification, in human cancers remain obscure, particularly, how this increase translates into disease-relevant mutations. This review is focused on the cellular mechanisms that control human TE-associated mutagenesis in cancer and summarizes the current understanding of TE contribution to genetic instability in human malignancies.
Affiliation(s)
- Victoria P Belancio
- Tulane University, Department of Structural and Cellular Biology, School of Medicine, Tulane Cancer Center and Tulane Center for Aging, New Orleans, LA 70112, USA
62
Mätlik K, Redik K, Speek M. L1 antisense promoter drives tissue-specific transcription of human genes. J Biomed Biotechnol 2010; 2006:71753. [PMID: 16877819 PMCID: PMC1559930 DOI: 10.1155/jbb/2006/71753] [Citation(s) in RCA: 100] [Impact Index Per Article: 7.1] [Indexed: 11/18/2022]
Abstract
Transcription of transposable elements interspersed in the genome is controlled by complex interactions between their regulatory elements and host factors. However, the same regulatory elements may be occasionally used for the transcription of host genes. One such example is the human L1 retrotransposon, which contains an antisense promoter (ASP) driving transcription into adjacent genes yielding chimeric transcripts. We have characterized 49 chimeric mRNAs corresponding to sense and antisense strands of human genes. Here we show that L1 ASP is capable of functioning as an alternative promoter, giving rise to a chimeric transcript whose coding region is identical to the ORF of mRNA of the following genes: KIAA1797, CLCN5, and SLCO1A2. Furthermore, in these cases the activity of L1 ASP is tissue-specific and may expand the expression pattern of the respective gene. The activity of L1 ASP is tissue-specific also in cases where L1 ASP produces antisense RNAs complementary to COL11A1 and BOLL mRNAs. Simultaneous assessment of the activity of L1 ASPs in multiple loci revealed the presence of L1 ASP-derived transcripts in all human tissues examined. We also demonstrate that L1 ASP can act as a promoter in vivo and predict that it has a heterogeneous transcription initiation site. Our data suggest that L1 ASP-driven transcription may increase the transcriptional flexibility of several human genes.
Affiliation(s)
- Kert Mätlik
- Department of Gene Technology, Tallinn University of Technology, Akadeemia tee 15, Tallinn 19086, Estonia
- Kaja Redik
- Department of Gene Technology, Tallinn University of Technology, Akadeemia tee 15, Tallinn 19086, Estonia
- Mart Speek
- Department of Gene Technology, Tallinn University of Technology, Akadeemia tee 15, Tallinn 19086, Estonia
- *Mart Speek:
63
Shapiro JA. Mobile DNA and evolution in the 21st century. Mob DNA 2010; 1:4. [PMID: 20226073 PMCID: PMC2836002 DOI: 10.1186/1759-8753-1-4] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.6] [Received: 08/14/2009] [Accepted: 01/25/2010] [Indexed: 01/05/2023]
Abstract
Scientific history has had a profound effect on the theories of evolution. At the beginning of the 21st century, molecular cell biology has revealed a dense structure of information-processing networks that use the genome as an interactive read-write (RW) memory system rather than an organism blueprint. Genome sequencing has documented the importance of mobile DNA activities and major genome restructuring events at key junctures in evolution: exon shuffling, changes in cis-regulatory sites, horizontal transfer, cell fusions and whole genome doublings (WGDs). The natural genetic engineering functions that mediate genome restructuring are activated by multiple stimuli, in particular by events similar to those found in the DNA record: microbial infection and interspecific hybridization leading to the formation of allotetraploids. These molecular genetic discoveries, plus a consideration of how mobile DNA rearrangements increase the efficiency of generating functional genomic novelties, make it possible to formulate a 21st century view of interactive evolutionary processes. This view integrates contemporary knowledge of the molecular basis of genetic change, major genome events in evolution, and stimuli that activate DNA restructuring with classical cytogenetic understanding about the role of hybridization in species diversification.
Affiliation(s)
- James A Shapiro
- Department of Biochemistry and Molecular Biology, University of Chicago, Gordon Center for Integrative Science W123B, 929 E 57th Street, Chicago, IL 60637, USA.
64
Keating KS, Toor N, Perlman PS, Pyle AM. A structural analysis of the group II intron active site and implications for the spliceosome. RNA 2010; 16:1-9. [PMID: 19948765 PMCID: PMC2802019 DOI: 10.1261/rna.1791310] [Citation(s) in RCA: 97] [Impact Index Per Article: 6.9] [Received: 06/23/2009] [Accepted: 08/12/2009] [Indexed: 05/20/2023]
Abstract
Group II introns are self-splicing, mobile genetic elements that have fundamentally influenced the organization of terrestrial genomes. These large ribozymes remain important for gene expression in almost all forms of bacteria and eukaryotes and they are believed to share a common ancestry with the eukaryotic spliceosome that is required for processing all nuclear pre-mRNAs. The three-dimensional structure of a group IIC intron was recently determined by X-ray crystallography, making it possible to visualize the active site and the elaborate network of tertiary interactions that stabilize the molecule. Here we describe the molecular features of the active site in detail and evaluate their correspondence with prior biochemical, genetic, and phylogenetic analyses on group II introns. In addition, we evaluate the structural significance of RNA motifs within the intron core, such as the major-groove triple helix and the domain 5 bulge. Having combined what is known about the group II intron core, we then compare it with known structural features of U6 snRNA in the eukaryotic spliceosome. This analysis leads to a set of predictions for the molecular structure of the spliceosomal active site.
Affiliation(s)
- Kevin S Keating
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut 06520, USA
65
Unique functions of repetitive transcriptomes. Int Rev Cell Mol Biol 2010; 285:115-88. [PMID: 21035099 DOI: 10.1016/b978-0-12-381047-2.00003-7] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.0] [Indexed: 12/22/2022]
Abstract
Repetitive sequences occupy a huge fraction of essentially every eukaryotic genome. Repetitive sequences cover more than 50% of mammalian genomic DNAs, whereas gene exons and protein-coding sequences occupy only ~3% and 1%, respectively. Numerous genomic repeats include genes themselves. They generally encode "selfish" proteins necessary for the proliferation of transposable elements (TEs) in the host genome. The major part of evolutionary "older" TEs accumulated mutations over time and fails to encode functional proteins. However, repeats have important functions also on the RNA level. Repetitive transcripts may serve as multifunctional RNAs by participating in the antisense regulation of gene activity and by competing with the host-encoded transcripts for cellular factors. In addition, genomic repeats include regulatory sequences like promoters, enhancers, splice sites, polyadenylation signals, and insulators, which actively reshape cellular transcriptomes. TE expression is tightly controlled by the host cells, and some mechanisms of this regulation were recently decoded. Finally, capacity of TEs to proliferate in the host genome led to the development of multiple biotechnological applications.
66
Gogvadze E, Buzdin A. Retroelements and their impact on genome evolution and functioning. Cell Mol Life Sci 2009; 66:3727-42. [PMID: 19649766 PMCID: PMC11115525 DOI: 10.1007/s00018-009-0107-2] [Citation(s) in RCA: 78] [Impact Index Per Article: 5.2] [Received: 04/29/2009] [Revised: 06/11/2009] [Accepted: 07/14/2009] [Indexed: 12/31/2022]
Abstract
Retroelements comprise a considerable fraction of eukaryotic genomes. Since their initial discovery by Barbara McClintock in maize DNA, retroelements have been found in genomes of almost all organisms. First considered as a "junk DNA" or genomic parasites, they were shown to influence genome functioning and to promote genetic innovations. For this reason, they were suggested as an important creative force in the genome evolution and adaptation of an organism to altered environmental conditions. In this review, we summarize the up-to-date knowledge of different ways of retroelement involvement in structural and functional evolution of genes and genomes, as well as the mechanisms generated by cells to control their retrotransposition.
Affiliation(s)
- Elena Gogvadze
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, 16/10 Miklukho-Maklaya st, 117997 Moscow, Russia.
67
Belancio VP, Deininger PL, Roy-Engel AM. LINE dancing in the human genome: transposable elements and disease. Genome Med 2009; 1:97. [PMID: 19863772 PMCID: PMC2784310 DOI: 10.1186/gm97] [Citation(s) in RCA: 99] [Impact Index Per Article: 6.6] [Indexed: 11/11/2022]
Abstract
Transposable elements (TEs) have been consistently underestimated in their contribution to genetic instability and human disease. TEs can cause human disease by creating insertional mutations in genes, and also contributing to genetic instability through non-allelic homologous recombination and introduction of sequences that evolve into various cis-acting signals that alter gene expression. Other outcomes of TE activity, such as their potential to cause DNA double-strand breaks or to modulate the epigenetic state of chromosomes, are less fully characterized. The currently active human transposable elements are members of the non-LTR retroelement families, LINE-1, Alu (SINE), and SVA. The impact of germline insertional mutagenesis by TEs is well established, whereas the rate of post-insertional TE-mediated germline mutations and all forms of somatic mutations remain less well quantified. The number of human diseases discovered to be associated with non-allelic homologous recombination between TEs, and particularly between Alu elements, is growing at an unprecedented rate. Improvement in the technology for detection of such events, as well as the mounting interest in the research and medical communities in resolving the underlying causes of the human diseases with unknown etiology, explain this increase. Here, we focus on the most recent advances in understanding of the impact of the active human TEs on the stability of the human genome and its relevance to human disease.
Affiliation(s)
- Victoria P Belancio
- Department of Structural and Cellular Biology, School of Medicine, Tulane Cancer Center and Tulane Center for Aging, Tulane University, SL-49 1430 Tulane Ave, New Orleans, LA 70112, USA.
|
68
|
Cordaux R, Batzer MA. The impact of retrotransposons on human genome evolution. Nat Rev Genet 2009; 10:691-703. [PMID: 19763152 DOI: 10.1038/nrg2640] [Citation(s) in RCA: 1138] [Impact Index Per Article: 75.9] [Indexed: 02/07/2023]
Abstract
Their ability to move within genomes gives transposable elements an intrinsic propensity to affect genome evolution. Non-long terminal repeat (LTR) retrotransposons, including LINE-1, Alu and SVA elements, have proliferated over the past 80 million years of primate evolution and now account for approximately one-third of the human genome. In this Review, we focus on this major class of elements and discuss the many ways that they affect the human genome: from generating insertion mutations and genomic instability to altering gene expression and contributing to genetic innovation. Increasingly detailed analyses of human and other primate genomes are revealing the scale and complexity of the past and current contributions of non-LTR retrotransposons to genomic change in the human lineage.
Affiliation(s)
- Richard Cordaux
- CNRS UMR 6556 Ecologie, Evolution, Symbiose, Université de Poitiers, 40 Avenue du Recteur Pineau, Poitiers, France
|
69
|
Rangwala SH, Zhang L, Kazazian HH. Many LINE1 elements contribute to the transcriptome of human somatic cells. Genome Biol 2009; 10:R100. [PMID: 19772661 PMCID: PMC2768975 DOI: 10.1186/gb-2009-10-9-r100] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.8] [Received: 05/20/2009] [Revised: 08/21/2009] [Accepted: 09/22/2009] [Indexed: 11/29/2022] Open
Abstract
Over 600 LINE 1 elements are shown to be transcribed in humans; 400 of these are full-length elements in the reference genome. Background: While LINE1 (L1) retroelements comprise nearly 20% of the human genome, the majority are thought to have been rendered transcriptionally inactive, due to either mutation or epigenetic suppression. How many L1 elements 'escape' these forms of repression and contribute to the transcriptome of human somatic cells? We have cloned out expressed sequence tags corresponding to the 5' and 3' flanks of L1 elements in order to characterize the population of elements that are being actively transcribed. We also examined expression of a select number of elements in different individuals. Results: We isolated expressed sequence tags from human lymphoblastoid cell lines corresponding to 692 distinct L1 element sites, including 410 full-length elements. Four of the expression tagged sites corresponding to full-length elements from the human specific L1Hs subfamily were examined in European-American individuals and found to be differentially expressed in different family members. Conclusions: A large number of different L1 element sites are expressed in human somatic tissues, and this expression varies among different individuals. Paradoxically, few elements were tagged at high frequency, indicating that the majority of expressed L1s are transcribed at low levels. Based on our preliminary expression studies of a limited number of elements in a single family, we predict a significant degree of inter-individual transcript-level polymorphism in this class of sequence.
Affiliation(s)
- Sanjida H Rangwala
- Department of Genetics, University of Pennsylvania School of Medicine, Hamilton Walk, Philadelphia, Pennsylvania 19104, USA.
|
70
|
Cruickshanks HA, Tufarelli C. Isolation of cancer-specific chimeric transcripts induced by hypomethylation of the LINE-1 antisense promoter. Genomics 2009; 94:397-406. [PMID: 19720139 DOI: 10.1016/j.ygeno.2009.08.013] [Citation(s) in RCA: 77] [Impact Index Per Article: 5.1] [Received: 12/10/2008] [Revised: 08/10/2009] [Accepted: 08/22/2009] [Indexed: 11/19/2022]
Abstract
The antisense promoter of human LINE-1 (L1) retroelements can direct transcription of adjacent unique genomic sequences generating chimeric RNAs, which can perturb transcription of neighbouring genes. As L1 elements constitute 17% of the human genome, chimeric transcription is potentially widespread, but the extent to which this occurs is largely unknown. Using a genome-wide screen we have isolated novel chimeric transcripts that are unique to breast cancer cell lines, primary tumours and colon cancer cells. Expression of the cancer-specific chimeric transcripts can be induced in non-malignant breast epithelial cells by the demethylating drug 5-azacytidine. These findings indicate that loss of L1 methylation in cancer cells is linked to the expression of L1-chimeric transcripts which may therefore constitute a useful set of markers of malignancy.
MESH Headings
- Azacitidine/pharmacology
- Breast/cytology
- Breast Neoplasms/genetics
- Breast Neoplasms/pathology
- Cell Line, Tumor/drug effects
- Cell Line, Tumor/metabolism
- Cells, Cultured/drug effects
- Cells, Cultured/metabolism
- Colonic Neoplasms/genetics
- Colonic Neoplasms/pathology
- DNA Methylation/drug effects
- Female
- Humans
- Long Interspersed Nucleotide Elements/genetics
- Promoter Regions, Genetic/genetics
- RNA, Messenger/biosynthesis
- RNA, Messenger/isolation & purification
- RNA, Neoplasm/biosynthesis
- RNA, Neoplasm/isolation & purification
- Reverse Transcriptase Polymerase Chain Reaction
- Transcription, Genetic/drug effects
Affiliation(s)
- Hazel A Cruickshanks
- Wolfson Centre for Stem Cells, Tissue Engineering and Modelling (STEM), Centre for Biomolecular Sciences, University of Nottingham, Nottingham, NG7 2RD, UK
|
71
|
Xing J, Zhang Y, Han K, Salem AH, Sen SK, Huff CD, Zhou Q, Kirkness EF, Levy S, Batzer MA, Jorde LB. Mobile elements create structural variation: analysis of a complete human genome. Genome Res 2009; 19:1516-26. [PMID: 19439515 DOI: 10.1101/gr.091827.109] [Citation(s) in RCA: 220] [Impact Index Per Article: 14.7] [Indexed: 12/15/2022]
Abstract
Structural variants (SVs) are common in the human genome. Because approximately half of the human genome consists of repetitive, transposable DNA sequences, it is plausible that these elements play an important role in generating SVs in humans. Sequencing of the diploid genome of one individual human (HuRef) affords us the opportunity to assess, for the first time, the impact of mobile elements on SVs in an individual in a thorough and unbiased fashion. In this study, we systematically evaluated more than 8000 SVs to identify mobile element-associated SVs as small as 100 bp and specific to the HuRef genome. Combining computational and experimental analyses, we identified and validated 706 mobile element insertion events (including Alu, L1, SVA elements, and nonclassical insertions), which added more than 305 kb of new DNA sequence to the HuRef genome compared with the Human Genome Project (HGP) reference sequence (hg18). We also identified 140 mobile element-associated deletions, which removed approximately 126 kb of sequence from the HuRef genome. Overall, approximately 10% of the HuRef-specific indels larger than 100 bp are caused by mobile element-associated events. More than one-third of the insertion/deletion events occurred in genic regions, and new Alu insertions occurred in exons of three human genes. Based on the number of insertions and the estimated time to the most recent common ancestor of HuRef and the HGP reference genome, we estimated the Alu, L1, and SVA retrotransposition rates to be one in 21 births, 212 births, and 916 births, respectively. This study presents the first comprehensive analysis of mobile element-related structural variants in the complete DNA sequence of an individual and demonstrates that mobile elements play an important role in generating inter-individual structural variation.
Affiliation(s)
- Jinchuan Xing
- Department of Human Genetics, Eccles Institute of Human Genetics, University of Utah, Salt Lake City, Utah 84109, USA
|
72
|
Sanzol J. Pistil-function breakdown in a new S-allele of European pear, S21*, confers self-compatibility. PLANT CELL REPORTS 2009; 28:457-67. [PMID: 19096853 DOI: 10.1007/s00299-008-0645-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 09/16/2008] [Revised: 10/30/2008] [Accepted: 11/16/2008] [Indexed: 05/07/2023]
Abstract
European pear exhibits RNase-based gametophytic self-incompatibility controlled by the polymorphic S-locus. S-allele diversity of cultivars has been extensively investigated; however, no mutant alleles conferring self-compatibility have been reported. In this study, two European pear cultivars, 'Abugo' and 'Ceremeño', were classified as self-compatible after fruit/seed setting and pollen tube growth examination. S-genotyping through S-PCR and sequencing identified a new S-RNase allele in the two cultivars, with identical deduced amino acid sequence as S(21), but differing at the nucleotide level. Test-pollinations and analysis of descendants suggested that the new allele is a self-compatible pistil-mutated variant of S(21), so it was named S(21)*. S-genotypes assigned to 'Abugo' and 'Ceremeño' were S(10)S(21)* and S(21)*S(25) respectively, of which S(25) is a new functional S-allele of European pear. Reciprocal crosses between cultivars bearing S(21) and S(21)* indicated that both alleles exhibit the same pollen function; however, cultivars bearing S(21)* had impaired pistil-S function as they failed to reject either S(21) or S(21)* pollen. RT-PCR analysis showed absence of S(21)*-RNase gene expression in styles of 'Abugo' and 'Ceremeño', suggesting a possible origin for S(21)* pistil dysfunction. Two polymorphisms found within the S-RNase genomic region (a retrotransposon insertion within the intron of S(21)* and indels at the 3'UTR) might explain the different pattern of expression between S(21) and S(21)*. Evaluation of cultivars with unknown S-genotype identified another cultivar 'Azucar Verde' bearing S(21)*, and pollen tube growth examination confirmed self-compatibility for this cultivar as well. This is the first report of a mutated S-allele conferring self-compatibility in European pear.
Affiliation(s)
- Javier Sanzol
- Unidad de Fruticultura, Centro de Investigación y Tecnología Agroalimentaria de Aragón (CITA), Zaragoza, Spain.
|
73
|
Chen C, Ara T, Gautheret D. Using Alu elements as polyadenylation sites: A case of retroposon exaptation. Mol Biol Evol 2008; 26:327-34. [PMID: 18984903 DOI: 10.1093/molbev/msn249] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.9] [Indexed: 12/20/2022] Open
Abstract
Of the 1.1 million Alu retroposons in the human genome, about 10,000 are inserted in the 3' untranslated regions (UTR) of protein-coding genes and 1% of these (107 events) are active as polyadenylation sites (PASs). Strikingly, although Alu's in 3' UTR are indifferently inserted in the forward or reverse direction, 99% of polyadenylation-active Alu sequences are forward oriented. Consensus Alu+ sequences contain sites that can give rise to polyadenylation signals and enhancers through a few point mutations. We found that the strand bias of polyadenylation-active Alu's reflects a radical difference in the fitness of sense and antisense Alu's toward cleavage/polyadenylation activity. In contrast to previous beliefs, Alu inserts do not necessarily represent weak or cryptic PASs; instead, they often constitute the major or the unique PAS in a gene, adding to the growing list of Alu exaptations. Finally, some Alu-borne PASs are intronic and produce truncated transcripts that may impact gene function and/or contribute to gene remodeling.
Affiliation(s)
- Chongjian Chen
- Institut de Génétique et Microbiologie, Université Paris, Orsay, France
|
74
|
Abstract
Long interspersed nuclear elements (LINEs) are among the most successful parasitic genetic sequences in higher organisms. Recent work has discovered many instances of LINE incorporation into exons, reminding us of the hazards they pose to genes in their vicinity as well as their potential to be co-opted for the host's purposes.
Affiliation(s)
- Kathleen H Burns
- Department of Pathology, The Johns Hopkins Hospital, 600 North Wolfe Street, Baltimore, MD 21287, USA.
|
75
|
Akagi K, Li J, Stephens RM, Volfovsky N, Symer DE. Extensive variation between inbred mouse strains due to endogenous L1 retrotransposition. Genome Res 2008; 18:869-80. [PMID: 18381897 DOI: 10.1101/gr.075770.107] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.5] [Indexed: 12/13/2022]
Abstract
Numerous inbred mouse strains comprise models for human diseases and diversity, but the molecular differences between them are mostly unknown. Several mammalian genomes have been assembled, providing a framework for identifying structural variations. To identify variants between inbred mouse strains at a single nucleotide resolution, we aligned 26 million individual sequence traces from four laboratory mouse strains to the C57BL/6J reference genome. We discovered and analyzed over 10,000 intermediate-length genomic variants (from 100 nucleotides to 10 kilobases), distinguishing these strains from the C57BL/6J reference. Approximately 85% of such variants are due to recent mobilization of endogenous retrotransposons, predominantly L1 elements, greatly exceeding that reported in humans. Many genes' structures and expression are altered directly by polymorphic L1 retrotransposons, including Drosha (also called Rnasen), Parp8, Scn1a, Arhgap15, and others, including novel genes. L1 polymorphisms are distributed nonrandomly across the genome, as they are excluded significantly from the X chromosome and from genes associated with the cell cycle, but are enriched in receptor genes. Thus, recent endogenous L1 retrotransposition has diversified genomic structures and transcripts extensively, distinguishing mouse lineages and driving a major portion of natural genetic variation.
Affiliation(s)
- Keiko Akagi
- Mouse Cancer Genetics Program, Center for Cancer Research, National Cancer Institute, Frederick, Maryland 21702, USA
|
76
|
Belancio VP, Hedges DJ, Deininger P. Mammalian non-LTR retrotransposons: for better or worse, in sickness and in health. Genome Res 2008; 18:343-58. [PMID: 18256243 DOI: 10.1101/gr.5558208] [Citation(s) in RCA: 224] [Impact Index Per Article: 14.0] [Indexed: 11/24/2022]
Abstract
Transposable elements (TEs) have shared an exceptionally long coexistence with their host organisms and have come to occupy a significant fraction of eukaryotic genomes. The bulk of the expansion occurring within mammalian genomes has arisen from the activity of type I retrotransposons, which amplify in a "copy-and-paste" fashion through an RNA intermediate. For better or worse, the sequences of these retrotransposons are now wedded to the genomes of their mammalian hosts. Although there are several reported instances of the positive contribution of mobile elements to their host genomes, these discoveries have occurred alongside growing evidence of the role of TEs in human disease and genetic instability. Here we examine, with a particular emphasis on human retrotransposon activity, several newly discovered aspects of mammalian retrotransposon biology. We consider their potential impact on host biology as well as their ultimate implications for the nature of the TE-host relationship.
Affiliation(s)
- Victoria P Belancio
- Tulane Cancer Center and Department of Epidemiology, Tulane University Health Sciences Center, New Orleans, Louisiana 70112, USA
|
77
|
Belancio VP, Roy-Engel AM, Deininger P. The impact of multiple splice sites in human L1 elements. Gene 2008; 411:38-45. [PMID: 18261861 DOI: 10.1016/j.gene.2007.12.022] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.3] [Received: 10/23/2007] [Revised: 12/20/2007] [Accepted: 12/28/2007] [Indexed: 12/17/2022]
Abstract
LINE-1 elements represent a significant proportion of mammalian genomes. The impact of their activity on the structure and function of the host genomes has been recognized from the time of their discovery as an endogenous source of insertional mutagenesis. L1 elements contain numerous functional internal polyadenylation signals and splice sites that generate a variety of processed L1 transcripts. These sites are also reported to contribute to the generation of hybrid transcripts between L1 elements and host genes. Using northern blot analysis we demonstrate that L1 splicing, but not L1 polyadenylation, is delayed during the course of L1 expression. L1 splicing can also be negatively regulated by EBV SM protein known to alter this process. These results suggest a potential for L1 mRNA processing to be regulated in a tissue- and/or development-specific manner. The delay in L1 splicing may also serve to protect host genes from the excessive burden of L1 interference with their normal expression via aberrant splicing.
Affiliation(s)
- V P Belancio
- Tulane Cancer Center, SL66, Department of Epidemiology, Tulane University Health Sciences Center, 1430 Tulane Ave., New Orleans, LA 70112, USA
|
78
|
Sivasubbu S, Balciunas D, Amsterdam A, Ekker SC. Insertional mutagenesis strategies in zebrafish. Genome Biol 2007; 8 Suppl 1:S9. [PMID: 18047701 PMCID: PMC2106850 DOI: 10.1186/gb-2007-8-s1-s9] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.9] [Indexed: 01/16/2023] Open
Abstract
We review here some recent developments in the field of insertional mutagenesis in zebrafish. We highlight the advantages and limitations of the rich body of retroviral methodologies, and we focus on the mechanisms and concepts of new transposon-based mutagenesis approaches under development, including prospects for conditional 'gene trapping' and 'gene breaking' approaches.
Affiliation(s)
- Sridhar Sivasubbu
- Institute of Genomics and Integrative Biology, Council for Scientific and Industrial Research, Mall Road, Delhi 110007, India
|
79
|
Zemojtel T, Penzkofer T, Schultz J, Dandekar T, Badge R, Vingron M. Exonization of active mouse L1s: a driver of transcriptome evolution? BMC Genomics 2007; 8:392. [PMID: 17963496 PMCID: PMC2176070 DOI: 10.1186/1471-2164-8-392] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.9] [Received: 06/18/2007] [Accepted: 10/26/2007] [Indexed: 12/02/2022] Open
Abstract
Background: Long interspersed nuclear elements (LINE-1s, L1s) have been recently implicated in the regulation of mammalian transcriptomes. Results: Here, we show that members of the three active mouse L1 subfamilies (A, GF and TF) contain, in addition to those on their sense strands, conserved functional splice sites on their antisense strands, which trigger multiple exonization events. The latter is particularly intriguing in the light of the strong antisense orientation bias of intronic L1s, implying that the toleration of antisense insertions results in an increased potential for exonization. Conclusion: In a genome-wide analysis, we have uncovered evidence suggesting that the mobility of the large number of retrotransposition-competent mouse L1s (~2400 potentially active L1s in NCBIm35) has significant potential to shape the mouse transcriptome by continuously generating insertions into transcriptional units.
Affiliation(s)
- Tomasz Zemojtel
- Department of Computational Molecular Biology, Max-Planck-Institute for Molecular Genetics, Ihnestrasse 73, D-14195 Berlin, Germany.
|
80
|
Abstract
While less than 1.5% of the mammalian genome encodes proteins, it is now evident that the vast majority is transcribed, mainly into non-protein-coding RNAs. This raises the question of what fraction of the genome is functional, i.e., composed of sequences that yield functional products, are required for the expression (regulation or processing) of these products, or are required for chromosome replication and maintenance. Many of the observed noncoding transcripts are differentially expressed, and, while most have not yet been studied, increasing numbers are being shown to be functional and/or trafficked to specific subcellular locations, as well as exhibit subtle evidence of selection. On the other hand, analyses of conservation patterns indicate that only approximately 5% (3%-8%) of the human genome is under purifying selection for functions common to mammals. However, these estimates rely on the assumption that reference sequences (usually ancient transposon-derived sequences) have evolved neutrally, which may not be the case, and if so would lead to an underestimate of the fraction of the genome under evolutionary constraint. These analyses also do not detect functional sequences that are evolving rapidly and/or have acquired lineage-specific functions. Indeed, many regulatory sequences and known functional noncoding RNAs, including many microRNAs, are not conserved over significant evolutionary distances, and recent evidence from the ENCODE project suggests that many functional elements show no detectable level of sequence constraint. Thus, it is likely that much more than 5% of the genome encodes functional information, and although the upper bound is unknown, it may be considerably higher than currently thought.
Affiliation(s)
- Michael Pheasant
- ARC Special Research Centre for Functional and Applied Genomics, Institute for Molecular Bioscience, University of Queensland, St Lucia, Queensland 4072, Australia
|
81
|
Zheng D, Frankish A, Baertsch R, Kapranov P, Reymond A, Choo SW, Lu Y, Denoeud F, Antonarakis SE, Snyder M, Ruan Y, Wei CL, Gingeras TR, Guigó R, Harrow J, Gerstein MB. Pseudogenes in the ENCODE regions: consensus annotation, analysis of transcription, and evolution. Genome Res 2007; 17:839-51. [PMID: 17568002 PMCID: PMC1891343 DOI: 10.1101/gr.5586307] [Citation(s) in RCA: 152] [Impact Index Per Article: 8.9] [Indexed: 10/23/2022]
Abstract
Arising from either retrotransposition or genomic duplication of functional genes, pseudogenes are "genomic fossils" valuable for exploring the dynamics and evolution of genes and genomes. Pseudogene identification is an important problem in computational genomics, and is also critical for obtaining an accurate picture of a genome's structure and function. However, no consensus computational scheme for defining and detecting pseudogenes has been developed thus far. As part of the ENCyclopedia Of DNA Elements (ENCODE) project, we have compared several distinct pseudogene annotation strategies and found that different approaches and parameters often resulted in rather distinct sets of pseudogenes. We subsequently developed a consensus approach for annotating pseudogenes (derived from protein coding genes) in the ENCODE regions, resulting in 201 pseudogenes, two-thirds of which originated from retrotransposition. A survey of orthologs for these pseudogenes in 28 vertebrate genomes showed that a significant fraction (approximately 80%) of the processed pseudogenes are primate-specific sequences, highlighting the increasing retrotransposition activity in primates. Analysis of sequence conservation and variation also demonstrated that most pseudogenes evolve neutrally, and processed pseudogenes appear to have lost their coding potential immediately or soon after their emergence. In order to explore the functional implication of pseudogene prevalence, we have extensively examined the transcriptional activity of the ENCODE pseudogenes. We performed systematic series of pseudogene-specific RACE analyses. These, together with complementary evidence derived from tiling microarrays and high throughput sequencing, demonstrated that at least a fifth of the 201 pseudogenes are transcribed in one or more cell lines or tissues.
Affiliation(s)
- Deyou Zheng
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06520, USA
- Corresponding author; fax (360) 838-7861.
- Adam Frankish
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1HH, United Kingdom
- Robert Baertsch
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, California 95064, USA
- Alexandre Reymond
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland
- Siew Woh Choo
- Genome Institute of Singapore, Singapore 138672, Singapore
- Yontao Lu
- Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, California 95064, USA
- France Denoeud
- Grup de Recerca en Informática Biomèdica, Institut Municipal d’Investigació Mèdica/Universitat Pompeu Fabra, Passeig Marítim de la Barceloneta, 37-49, 08003, Barcelona, Catalonia, Spain
- Stylianos E. Antonarakis
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland
- Michael Snyder
- Molecular, Cellular & Developmental Biology Department, Yale University, New Haven, Connecticut 06520, USA
- Yijun Ruan
- Genome Institute of Singapore, Singapore 138672, Singapore
- Chia-Lin Wei
- Genome Institute of Singapore, Singapore 138672, Singapore
- Roderic Guigó
- Grup de Recerca en Informática Biomèdica, Institut Municipal d’Investigació Mèdica/Universitat Pompeu Fabra, Passeig Marítim de la Barceloneta, 37-49, 08003, Barcelona, Catalonia, Spain
- Center for Genomic Regulation, Passeig Marítim de la Barceloneta, 37-49, 08003, Barcelona, Catalonia, Spain
- Jennifer Harrow
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1HH, United Kingdom
- Mark B. Gerstein
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06520, USA
- Department of Computer Science, Yale University, New Haven, Connecticut 06520, USA
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut 06520, USA
- Corresponding author; fax (360) 838-7861.
|
82
|
Abstract
Long interspersed nucleotide element (LINE)-1 retrotransposon (L1) has emerged as the largest contributor to mammalian genome mass, responsible for over 35% of the human genome. Differences in the number and activity levels of L1s contribute to interindividual variation in humans, both by affecting an individual's likelihood of acquiring new L1-mediated mutations, as well as by differentially modifying gene expression. Here, we report on recent progress in understanding L1 biology, with a focus on mechanisms of L1-mediated disease. We discuss known details of L1 life cycle, including L1 structure, transcriptional regulation, and the mechanisms of translation and retrotransposition. Current views on cell type specificity, timing, and control of retrotransposition are put forth. Finally, we discuss the role of L1 as a mutagen, using the latest findings in L1 biology to illuminate molecular mechanisms of L1-mediated gene disruption.
Affiliation(s)
- Daria V Babushok
- Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6145, USA
|
83
|
Hedges DJ, Deininger PL. Inviting instability: Transposable elements, double-strand breaks, and the maintenance of genome integrity. Mutat Res 2006; 616:46-59. [PMID: 17157332 PMCID: PMC1850990 DOI: 10.1016/j.mrfmmm.2006.11.021] [Citation(s) in RCA: 214] [Impact Index Per Article: 11.9] [Indexed: 10/23/2022]
Abstract
The ubiquity of mobile elements in mammalian genomes poses considerable challenges for the maintenance of genome integrity. The predisposition of mobile elements towards participation in genomic rearrangements is largely a consequence of their interspersed homologous nature. As tracts of nonallelic sequence homology, they have the potential to interact in a disruptive manner during both meiotic recombination and DNA repair processes, resulting in genomic alterations ranging from deletions and duplications to large-scale chromosomal rearrangements. Although the deleterious effects of transposable element (TE) insertion events have been extensively documented, it is arguably through post-insertion genomic instability that they pose the greatest hazard to their host genomes. Despite the periodic generation of important evolutionary innovations, genomic alterations involving TE sequences are far more frequently neutral or deleterious in nature. The potentially negative consequences of this instability are perhaps best illustrated by the >25 human genetic diseases that are attributable to TE-mediated rearrangements. Some of these rearrangements, such as those involving the MLL locus in leukemia and the LDL receptor in familial hypercholesterolemia, represent recurrent mutations that have independently arisen multiple times in human populations. While TE-instability has been a potent force in shaping eukaryotic genomes and a significant source of genetic disease, much concerning the mechanisms governing the frequency and variety of these events remains to be clarified. Here we survey the current state of knowledge regarding the mechanisms underlying mobile element-based genetic instability in mammals. Compared to simpler eukaryotic systems, mammalian cells appear to have several modifications to their DNA-repair ensemble that allow them to better cope with the large amount of interspersed homology that has been generated by TEs. 
In addition to the disruptive potential of nonallelic sequence homology, we also consider recent evidence suggesting that the endonuclease products of TEs may also play a key role in instigating mammalian genomic instability.
Affiliation(s)
- D J Hedges
- Tulane Cancer Center, SL66 and Department of Epidemiology, Tulane University Health Sciences Center, 1430 Tulane Avenue, New Orleans, LA 70112, USA
|
84
|
Lee Y, Ise T, Ha D, Saint Fleur A, Hahn Y, Liu XF, Nagata S, Lee B, Bera TK, Pastan I. Evolution and expression of chimeric POTE-actin genes in the human genome. Proc Natl Acad Sci U S A 2006; 103:17885-90. [PMID: 17101985 PMCID: PMC1693842 DOI: 10.1073/pnas.0608344103] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.0] [Indexed: 11/18/2022] Open
Abstract
We previously described a primate-specific gene family, POTE, that is expressed in many cancers but in a limited number of normal organs. The 13 POTE genes are dispersed among eight different chromosomes and evolved by duplications and remodeling of the human genome from an ancestral gene, ANKRD26. Based on sequence similarity, the POTE gene family members can be divided into three groups. By genome database searches, we identified an actin retroposon insertion at the carboxyl terminus of one of the ancestral POTE paralogs. By Northern blot analysis, we identified the expected 7.5-kb POTE-actin chimeric transcript in a breast cancer cell line. The protein encoded by the POTE-actin transcript is predicted to be 120 kDa in size. Using anti-POTE mAbs that recognize the amino-terminal portion of the POTE protein, we detected the 120-kDa POTE-actin fusion protein in breast cancer cell lines known to express the fusion transcript. These data demonstrate that insertion of a retroposon produced an altered functional POTE gene. This example indicates that new functional human genes can evolve by insertion of retroposons.
Affiliation(s)
- Yoomi Lee
- Laboratory of Molecular Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892-4264
- Tomoko Ise
- Laboratory of Molecular Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892-4264
- Duc Ha
- Laboratory of Molecular Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892-4264
- Ashley Saint Fleur
- Laboratory of Molecular Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892-4264
- Yoonsoo Hahn
- Laboratory of Molecular Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892-4264
- Xiu-Fen Liu
- Laboratory of Molecular Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892-4264
- Satoshi Nagata
- Laboratory of Molecular Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892-4264
- Byungkook Lee
- Laboratory of Molecular Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892-4264
- Tapan K. Bera
- Laboratory of Molecular Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892-4264
- Ira Pastan
- Laboratory of Molecular Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892-4264
|
85
|
Gasior SL, Preston G, Hedges DJ, Gilbert N, Moran JV, Deininger PL. Characterization of pre-insertion loci of de novo L1 insertions. Gene 2006; 390:190-8. [PMID: 17067767 PMCID: PMC1850991 DOI: 10.1016/j.gene.2006.08.024] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Received: 07/11/2006] [Revised: 08/21/2006] [Accepted: 08/22/2006] [Indexed: 10/24/2022]
Abstract
The human Long Interspersed Element-1 (LINE-1) and the Short Interspersed Element (SINE) Alu comprise 28% of the human genome. They share the same L1-encoded endonuclease for insertion, which recognizes an A+T-rich sequence. Under a simple model of insertion distribution, this nucleotide preference would lead to the prediction that the populations of both elements would be biased towards A+T-rich regions. Genomic L1 elements do show an A+T-rich bias. In contrast, Alu is biased towards G+C-rich regions when compared to the genome average. Several analyses have demonstrated that relatively recent insertions of both elements show less G+C content bias relative to older elements. We have analyzed the repetitive element and G+C composition of more than 100 pre-insertion loci derived from de novo L1 insertions in cultured human cancer cells, which should represent an evolutionarily unbiased set of insertions. An A+T-rich bias is observed in the 50 bp flanking the endonuclease target site, consistent with the known target site for the L1 endonuclease. The L1, Alu, and G+C content of 20 kb of the de novo pre-insertion loci shows a different set of biases than that observed for fixed L1s in the human genome. In contrast to the insertion sites of genomic L1s, the de novo L1 pre-insertion loci are relatively L1-poor, Alu-rich and G+C neutral. Finally, a statistically significant cluster of de novo L1 insertions was localized in the vicinity of the c-myc gene. These results suggest that the initial insertion preference of L1, while A+T-rich in the initial vicinity of the break site, can be influenced by the broader content of the flanking genomic region and have implications for understanding the dynamics of L1 and Alu distributions in the human genome.
Affiliation(s)
- Stephen L. Gasior
- Tulane Cancer Center and Dept. of Epidemiology, Tulane University Health Sciences Center SL-66, 1430 Tulane Ave., New Orleans, LA 70112, Phone: (504) 988-6385, Fax: (504) 988-5516
- Graeme Preston
- Tulane Cancer Center and Dept. of Epidemiology, Tulane University Health Sciences Center SL-66, 1430 Tulane Ave., New Orleans, LA 70112, Phone: (504) 988-6385, Fax: (504) 988-5516
- Dale J. Hedges
- Tulane Cancer Center and Dept. of Epidemiology, Tulane University Health Sciences Center SL-66, 1430 Tulane Ave., New Orleans, LA 70112, Phone: (504) 988-6385, Fax: (504) 988-5516
- Nicolas Gilbert
- Institut de Génétique Humaine, CNRS, UPR 1142, 141 rue de la Cardonille, 34396 Montpellier cedex 5, France
- John V. Moran
- Departments of Human Genetics and Internal Medicine, 1241 E. Catherine St., University of Michigan Medical School, Ann Arbor, Michigan 48109-0618
- Prescott L. Deininger
- Tulane Cancer Center and Dept. of Epidemiology, Tulane University Health Sciences Center SL-66, 1430 Tulane Ave., New Orleans, LA 70112, Phone: (504) 988-6385, Fax: (504) 988-5516
- *Address for Correspondence: Tulane Cancer Center, SL66, Tulane University Health Sciences Center, 1430 Tulane Ave., New Orleans, LA 70112, 504-988-6385
|
86
|
Sivasubbu S, Balciunas D, Davidson AE, Pickart MA, Hermanson SB, Wangensteen KJ, Wolbrink DC, Ekker SC. Gene-breaking transposon mutagenesis reveals an essential role for histone H2afza in zebrafish larval development. Mech Dev 2006; 123:513-29. [PMID: 16859902 DOI: 10.1016/j.mod.2006.06.002] [Citation(s) in RCA: 88] [Impact Index Per Article: 4.9] [Received: 10/18/2005] [Revised: 06/02/2006] [Accepted: 06/02/2006] [Indexed: 12/11/2022]
Abstract
We report a novel gene tagging, identification and mutagenicity ('gene-breaking') method for the zebrafish, Danio rerio. This modular approach consists of two distinct and separable molecular cassettes. The first is a gene-finding cassette. In this study, we employed a 3' gene-tagging approach that selectively 'traps' transcripts regardless of expression status, and we show that this cassette identifies both known and novel endogenous transcripts in transgenic zebrafish. The second is a transcriptional termination mutagenicity cassette assembled from a combination of a splice acceptor and polyadenylation signal to disrupt tagged transcripts upon integration into intronic sequence. We identified both novel and conserved loci as linked phenotypic mutations using this gene-breaking strategy, generating molecularly null mutations in both larval lethal and adult viable loci. We show that the Histone 2a family member z (H2afza) variant is essential for larval development through the generation of a lethal locus with a truncation of conserved carboxy-terminal residues in the protein. In principle this gene-breaking strategy is scalable for functional genomics screens and can be used in Sleeping Beauty transposon and other gene delivery systems in the zebrafish.
Affiliation(s)
- Sridhar Sivasubbu
- University of Minnesota, Department of Genetics, Cell Biology and Development, Arnold and Mabel Beckman Center for Transposon Research, 321 Church St SE, 6-160 Jackson Hall, Minneapolis, MN 55455, USA
|