51
Ladevèze V, Chaminade N, Lemeunier F, Periquet G, Aulard S. General survey of hAT transposon superfamily with highlight on hobo element in Drosophila. Genetica 2012; 140:375-92. [DOI: 10.1007/s10709-012-9687-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 07/17/2012] [Accepted: 10/10/2012] [Indexed: 11/30/2022]
52
Distinct groups of repetitive families preserved in mammals correspond to different periods of regulatory innovations in vertebrates. Biol Direct 2012; 7:36. [PMID: 23098210 PMCID: PMC3500645 DOI: 10.1186/1745-6150-7-36] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 07/09/2012] [Accepted: 10/23/2012] [Indexed: 12/16/2022]
Abstract
BACKGROUND Mammalian genomes are repositories of repetitive DNA sequences derived from transposable elements (TEs). Typically, TEs generate multiple, mostly inactive copies of themselves, commonly known as repetitive families or families of repeats. Recently, we proposed that families of TEs originate in small populations by genetic drift and that the origin of small subpopulations from larger populations can be fueled by biological innovations. RESULTS We report three distinct groups of repetitive families preserved in the human genome that expanded and declined during the three previously described periods of regulatory innovations in vertebrate genomes. The first group originated prior to the evolutionary separation of the mammalian and bird lineages and the second one during subsequent diversification of the mammalian lineages prior to the origin of eutherian lineages. The third group of families is primate-specific. CONCLUSIONS The observed correlation implies a relationship between regulatory innovations and the origin of repetitive families. Consistent with our previous hypothesis, it is proposed that regulatory innovations fueled the origin of new subpopulations in which new repetitive families became fixed by genetic drift.
53
Han JS, Shao S. Circular retrotransposition products generated by a LINE retrotransposon. Nucleic Acids Res 2012; 40:10866-77. [PMID: 22977178 PMCID: PMC3510499 DOI: 10.1093/nar/gks859] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 01/30/2023]
Abstract
Non-long terminal repeat (non-LTR) retrotransposons are highly abundant elements that are present in chromosomes throughout the eukaryotic domain of life. The long interspersed nuclear element (LINE-1) (L1) clade of non-LTR retrotransposons has been particularly successful in mammals, accounting for 30–40% of human genome sequence. The current model of LINE retrotransposition, target-primed reverse transcription, culminates in a chromosomally integrated end product. Using a budding yeast model of non-LTR retrotransposition, we show that in addition to producing these ‘classical’, chromosomally integrated products, a fungal L1 clade member (Zorro3) can generate abundant, RNA-derived episomal products. Genetic evidence suggests that these products are likely to be formed via a variation of target-primed reverse transcription. These episomal products are a previously unseen alternative fate of LINE retrotransposition, and may represent an unexpected source for de novo retrotransposition.
Affiliation(s)
- Jeffrey S Han
- Department of Embryology, Carnegie Institution for Science, Baltimore, MD 21218, USA.
54
Testori A, Caizzi L, Cutrupi S, Friard O, De Bortoli M, Cora' D, Caselle M. The role of Transposable Elements in shaping the combinatorial interaction of Transcription Factors. BMC Genomics 2012; 13:400. [PMID: 22897927 PMCID: PMC3478180 DOI: 10.1186/1471-2164-13-400] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Received: 04/12/2012] [Accepted: 06/28/2012] [Indexed: 12/22/2022]
Abstract
Background In the last few years several studies have shown that Transposable Elements (TEs) in the human genome are significantly associated with Transcription Factor Binding Sites (TFBSs) and that in several cases their expansion within the genome led to a substantial rewiring of the regulatory network. Another important feature of the regulatory network which has been thoroughly studied is the combinatorial organization of transcriptional regulation. In this paper we combine these two observations and suggest that TEs, besides rewiring the network, also played a central role in the evolution of particular patterns of combinatorial gene regulation. Results To address this issue we searched for TEs overlapping Estrogen Receptor α (ERα) binding peaks in two publicly available ChIP-seq datasets from the MCF7 cell line corresponding to different modalities of exposure to estrogen. We found a remarkable enrichment of a few specific classes of Transposons. Among these a prominent role was played by MIR (Mammalian Interspersed Repeats) transposons. These TEs underwent a dramatic expansion at the beginning of the mammalian radiation and then stabilized. We conjecture that the special affinity of ERα for the MIR class of TEs could be at the origin of the important role assumed by ERα in Mammalians. We then searched for TFBSs within the TEs overlapping ChIP-seq peaks. We found a strong enrichment of a few precise combinations of TFBS. In several cases the corresponding Transcription Factors (TFs) were known cofactors of ERα, thus supporting the idea of a co-regulatory role of TFBS within the same TE. Moreover, most of these correlations turned out to be strictly associated to specific classes of TEs thus suggesting the presence of a well-defined "transposon code" within the regulatory network. Conclusions In this work we tried to shed light into the role of Transposable Elements (TEs) in shaping the regulatory network of higher eukaryotes. 
To test this idea we focused on a particular transcription factor, the Estrogen Receptor α (ERα), and found that ERα preferentially targets a well-defined set of TEs and that these TEs host combinations of transcriptional regulators involving several known co-regulators of ERα. Moreover, a significant number of these TEs turned out to be conserved between human and mouse and located in the vicinity of important estrogen-related genes (and are thus candidate regulators of them).
Affiliation(s)
- Alessandro Testori
- Center for Molecular Systems Biology, University of Turin, Turin, Candiolo I-10060, Italy.
55
Ha HS, Moon JW, Gim JA, Jung YD, Ahn K, Oh KB, Kim TH, Seong HH, Kim HS. Identification and characterization of transposable element-mediated chimeric transcripts from porcine Refseq and EST databases. Genes Genomics 2012. [DOI: 10.1007/s13258-011-0212-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2022]
56
Transformation of a transposon into a derived prolactin promoter with function during human pregnancy. Proc Natl Acad Sci U S A 2012; 109:11246-51. [PMID: 22733751 DOI: 10.1073/pnas.1118566109] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.5] [Indexed: 01/23/2023]
Abstract
Transposable elements (TEs) are known to provide DNA for host regulatory functions, but the mechanisms underlying the transformation of TEs into cis-regulatory elements are unclear. In humans two TEs--MER20 and MER39--contribute the enhancer/promoter for decidual prolactin (dPRL), which is dramatically induced during pregnancy. We show that evolution of the strong human dPRL promoter was a multistep process that took millions of years. First, MER39 inserted near MER20 in the primate/rodent ancestor, and then there were two phases of activity enhancement in primates. Through the mapping of causal nucleotide substitutions, we demonstrate that strong promoter activity in apes involves epistasis between transcription factor binding sites (TFBSs) ancestral to MER39 and derived sites. We propose a mode of molecular evolution that describes the process by which MER20/MER39 was transformed into a strong promoter, called "epistatic capture." Epistatic capture is the stabilization of a TFBS that is ancestral but variable in outgroup lineages, and is fixed in the ingroup because of epistatic interactions with derived TFBSs. Finally, we note that evolution of human promoter activity coincides with the emergence of a unique reproductive character in apes, highly invasive placentation. Because prolactin communicates with immune cells during pregnancy, which regulate fetal invasion into maternal tissues, we speculate that ape dPRL promoter activity evolved in response to increased invasiveness of ape fetal tissue.
57
Metcalfe CJ, Filée J, Germon I, Joss J, Casane D. Evolution of the Australian lungfish (Neoceratodus forsteri) genome: a major role for CR1 and L2 LINE elements. Mol Biol Evol 2012; 29:3529-39. [PMID: 22734051 DOI: 10.1093/molbev/mss159] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.2] [Indexed: 11/13/2022]
Abstract
Haploid genomes greater than 25,000 Mb are rare; among animals, only the lungfish and some of the salamanders and crustaceans are known to have genomes this large. There is very little data on the structure of genomes this size. It is known, however, that for animal genomes up to 3,000 Mb, there is in general a good correlation between genome size and the percent of the genome composed of repetitive sequence and that this repetitive component is highly dynamic. In this study, we sampled the Australian lungfish genome using three mini-genomic libraries and found that with very little sequence, the results converged on an estimate of 40% of the genome being composed of recognizable transposable elements (TEs), chiefly from the CR1 and L2 long interspersed nuclear element clades. We further characterized the CR1 and L2 elements in the lungfish genome and show that although most CR1 elements probably represent recent amplifications, the L2 elements are more diverse and are more likely the result of a series of amplifications. We suggest that our sampling method has probably underestimated the recognizable TE content. However, on the basis of the most likely sources of error, we suggest that this very large genome is not largely composed of recently amplified, undetected TEs but may instead include a large component of older degenerate TEs. Based on these estimates, and on Thomson's (Thomson K. 1972. An attempt to reconstruct evolutionary changes in the cellular DNA content of lungfish. J Exp Zool. 180:363-372) inference that in the lineage leading to the extant Australian lungfish, there was a massive increase in genome size between 350 and 200 mya, after which the size of the genome changed little, we speculate that the very large Australian lungfish genome may be the result of a massive amplification of TEs followed by a long period with a very low rate of sequence removal and some ongoing TE activity.
Affiliation(s)
- Cushla J Metcalfe
- Laboratoire Evolution, Génomes et Spéciation, Centre National de la Recherche Scientifique, Gif-sur-Yvette, and Université Paris Diderot, Paris, France
58
Intron Retention and TE Exonization Events in ZRANB2. Comp Funct Genomics 2012; 2012:170208. [PMID: 22778693 PMCID: PMC3384923 DOI: 10.1155/2012/170208] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 03/23/2012] [Revised: 05/14/2012] [Accepted: 05/14/2012] [Indexed: 12/17/2022]
Abstract
The Zinc finger, RAN-binding domain-containing protein 2 (ZRANB2), contains arginine/serine-rich (RS) domains that mediate its function in the regulation of alternative splicing. The ZRANB2 gene contains 2 LINE elements (L3b, Plat_L3) between the 9th and 10th exons. We identified the exonization event of a LINE element (Plat_L3). Using genomic PCR, RT-PCR amplification, and sequencing of primate DNA and RNA samples, we analyzed the evolutionary features of ZRANB2 transcripts. The results indicated that 2 of the LINE elements were integrated in human and all of the tested primate samples (hominoids: 3 species; Old World monkey: 8 species; New World monkey: 6 species; prosimian: 1 species). Human, rhesus monkey, crab-eating monkey, African-green monkey, and marmoset harbor the exon derived from the LINE element (Plat_L3). RT-PCR amplification revealed the long transcripts and their differential expression patterns. Intriguingly, these long transcripts were abundantly expressed in Old World monkey lineages (rhesus, crab-eating, and African-green monkeys) and were expressed via intron retention (IR). Thus, the ZRANB2 gene produces 3 transcript variants in which the C-terminus varies through transposable element (TE) exonization and IR mechanisms. Therefore, ZRANB2 is valuable for investigating the evolutionary mechanisms of TE exonization and IR during primate evolution.
59
Abstract
Repetitive sequences, especially transposon-derived interspersed repetitive elements, account for a large fraction of the genome in most eukaryotes. Despite the repetitive nature, these transposable elements display quantitative and qualitative differences even among species of the same lineage. Although transposable elements contribute greatly as a driving force to the biological diversity during evolution, they can induce embryonic lethality and genetic disorders as a result of insertional mutagenesis and genomic rearrangement. Temporary relaxation of the epigenetic control of retrotransposons during early germline development opens a risky window that can allow retrotransposons to escape from host constraints and to propagate abundantly in the host genome. Because germline mutations caused by retrotransposon activation are heritable and thus can be deleterious to the offspring, an adaptive strategy has evolved in host cells, especially in the germline. In this review, we will attempt to summarize general defense mechanisms deployed by the eukaryotic genome, with an emphasis on pathways utilized by the male germline to confer retrotransposon silencing.
Affiliation(s)
- Jianqiang Bao
- Department of Physiology and Cell Biology, University of Nevada School of Medicine, Reno, Nevada, USA
60
Feng ZP, Chandrashekaran IR, Low A, Speed TP, Nicholson SE, Norton RS. The N-terminal domains of SOCS proteins: a conserved region in the disordered N-termini of SOCS4 and 5. Proteins 2012; 80:946-57. [PMID: 22423360 DOI: 10.1002/prot.23252] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Indexed: 12/30/2022]
Abstract
Suppressors of cytokine signaling (SOCS) proteins function as negative regulators of cytokine signaling and are involved in fine tuning the immune response. The structure and role of the SH2 domains and C-terminal SOCS box motifs of the SOCS proteins are well characterized, but the long N-terminal domains of SOCS4-7 remain poorly understood. Here, we present bioinformatic analyses of the N-terminal domains of the mammalian SOCS proteins, which indicate that these domains of SOCS4, 5, 6, and 7 are largely disordered. We have also identified a conserved region of about 70 residues in the N-terminal domains of SOCS4 and 5 that is predicted to be more ordered than the surrounding sequence. The conservation of this region can be traced as far back as lower vertebrates. As conserved regions with increased structural propensity that are located within long disordered regions often contain molecular recognition motifs, we expressed the N-terminal conserved region of mouse SOCS4 for further analysis. This region, mSOCS4₈₆₋₁₅₅, has been characterized by circular dichroism and nuclear magnetic resonance spectroscopy, both of which indicate that it is predominantly unstructured in aqueous solution, although it becomes helical in the presence of trifluoroethanol. The high degree of sequence conservation of this region across different species and between SOCS4 and SOCS5 nonetheless implies that it has an important functional role, and presumably this region adopts a more ordered conformation in complex with its partners. The recombinant protein will be a valuable tool in identifying these partners and defining the structures of these complexes.
Affiliation(s)
- Zhi-Ping Feng
- The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Victoria 3052, Australia
61
Redi CA, Capanna E. Genome size evolution: sizing mammalian genomes. Cytogenet Genome Res 2012; 137:97-112. [PMID: 22627028 DOI: 10.1159/000338820] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 01/31/2023]
Abstract
The study of genome size (GS) and its variation is so fascinating to the scientific community because it constitutes the link between the present-day analytical and molecular studies of the genome and the old trunk of the holistic and synthetic view of the genome. The GS of several taxa vary over a broad range and do not correlate with the complexity of the organisms (the C-value paradox). However, the biology of transposable elements has let us reach a satisfactory view of the molecular mechanisms that give rise to GS variation and novelties, providing a less perplexing view of the significance of the GS (C-enigma). The knowledge of the composition and structure of a genome is a pre-requisite for trying to understand the evolution of the main genome signature: its size. The radiation of mammals provides an approximately 180-million-year test case for theories of how GS evolves. It has been found from data-mining GS databases that GS is a useful cyto-taxonomical instrument at the level of orders/superorders, providing genomic signatures characterizing Monotremata, Marsupialia, Afrotheria, Xenarthra, Laurasiatheria, and Euarchontoglires. A hypothetical ancestral mammalian-like GS of 2.9-3.7 pg has been suggested. This value appears compatible with the average values calculated for the high systematic levels of the extant Monotremata (∼2.97 pg) and Marsupialia (∼4.07 pg), suggesting invasion of mobile DNA elements concurrently with the separation of the older clades of Afrotheria (∼5.5 pg) and Xenarthra (∼4.5 pg) with larger GS, leaving the Euarchontoglires (∼3.4 pg) and Laurasiatheria (∼2.8 pg) genomes with fewer transposable elements. However, the paucity of GS data (546 mammalian species sized from 5,488 living species) for species, genera, and families calls for caution. 
Considering that mammalian species may vanish even before they are known, GS data are sorely needed to phenotype the effects brought about by their variation and to validate any hypotheses on GS evolution in mammals.
Affiliation(s)
- C A Redi
- Fondazione IRCCS Policlinico San Matteo, Dipartimento di Biologia e Biotecnologie Lazzaro Spallanzani, Pavia, Italia.
62
Carbone L, Harris RA, Mootnick AR, Milosavljevic A, Martin DIK, Rocchi M, Capozzi O, Archidiacono N, Konkel MK, Walker JA, Batzer MA, de Jong PJ. Centromere remodeling in Hoolock leuconedys (Hylobatidae) by a new transposable element unique to the gibbons. Genome Biol Evol 2012; 4:648-58. [PMID: 22593550 PMCID: PMC3606032 DOI: 10.1093/gbe/evs048] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.8] [Indexed: 12/25/2022]
Abstract
Gibbons (Hylobatidae) shared a common ancestor with the other hominoids only 15–18 million years ago. Nevertheless, gibbons show very distinctive features that include heavily rearranged chromosomes. Previous observations indicate that this phenomenon may be linked to the attenuated epigenetic repression of transposable elements (TEs) in gibbon species. Here we describe the massive expansion of a repeat in almost all the centromeres of the eastern hoolock gibbon (Hoolock leuconedys). We discovered that this repeat is a new composite TE originating from the combination of portions of three other elements (L1ME5, AluSz6, and SVA_A) and thus named it LAVA. We determined that this repeat is found in all the gibbons but does not occur in other hominoids. Detailed investigation of 46 different LAVA elements revealed that the majority of them have target site duplications (TSDs) and a poly-A tail, suggesting that they have been retrotransposing in the gibbon genome. Although we did not find a direct correlation between the emergence of LAVA elements and human–gibbon synteny breakpoints, this new composite transposable element is another mark of the great plasticity of the gibbon genome. Moreover, the centromeric expansion of LAVA insertions in the hoolock closely resembles the massive centromeric expansion of the KERV-1 retroelement reported for wallaby (marsupial) interspecific hybrids. The similarity between the two phenomena is consistent with the hypothesis that evolution of the gibbons is characterized by defects in epigenetic repression of TEs, perhaps triggered by interspecific hybridization.
Affiliation(s)
- Lucia Carbone
- Children's Hospital Oakland Research Institute, Oakland, CA, USA.
63
Nilsson MA, Janke A, Murchison EP, Ning Z, Hallström BM. Expansion of CORE-SINEs in the genome of the Tasmanian devil. BMC Genomics 2012; 13:172. [PMID: 22559330 PMCID: PMC3403934 DOI: 10.1186/1471-2164-13-172] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 11/29/2011] [Accepted: 05/06/2012] [Indexed: 11/22/2022]
Abstract
Background The genome of the carnivorous marsupial, the Tasmanian devil (Sarcophilus harrisii, Order: Dasyuromorphia), was sequenced in the hopes of finding a cure for or gaining a better understanding of the contagious devil facial tumor disease that is threatening the species’ survival. To better understand the Tasmanian devil genome, we screened it for transposable elements and investigated the dynamics of short interspersed element (SINE) retroposons. Results The temporal history of Tasmanian devil SINEs, elucidated using a transposition in transposition analysis, indicates that WSINE1, a CORE-SINE present in around 200,000 copies, is the most recently active element. Moreover, we discovered a new subtype of WSINE1 (WSINE1b) that comprises at least 90% of all Tasmanian devil WSINE1s. The frequencies of WSINE1 subtypes differ in the genomes of two of the other Australian marsupial orders. A co-segregation analysis indicated that at least 66 subfamilies of WSINE1 evolved during the evolution of Dasyuromorphia. Using a substitution rate derived from WSINE1 insertions, the ages of the subfamilies were estimated and correlated with a newly established phylogeny of Dasyuromorphia. Phylogenetic analyses and divergence time estimates of mitochondrial genome data indicate a rapid radiation of the Tasmanian devil and its closest relatives, the quolls (Dasyurus), around 14 million years ago. Conclusions The radiation and abundance of CORE-SINEs in marsupial genomes indicate that they may be a major player in the evolution of marsupials. It is evident that the early phases of evolution of the carnivorous marsupial order Dasyuromorphia were characterized by a burst of SINE activity. A correlation between a speciation event and a major burst of retroposon activity is for the first time shown in a marsupial genome.
Affiliation(s)
- Maria A Nilsson
- LOEWE-Biodiversity and Climate Research Center, BiK-F, Senckenberganlage 25, Frankfurt am Main D-60325, Germany.
64
Franchini LF, de Souza FS, Low MJ, Rubinstein M. Positive selection of co-opted mobile genetic elements in a mammalian gene: If you can't beat them, join them. Mob Genet Elements 2012; 2:106-109. [PMID: 22934245 PMCID: PMC3429518 DOI: 10.4161/mge.20267] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 12/11/2022]
Abstract
The proopiomelanocortin (Pomc) gene encodes a prepropeptide with essential functions in the response to stress and energy balance, which is expressed in the pituitary and hypothalamus of vertebrate animals. Neuronal expression of Pomc is controlled by two distal enhancers named nPE1 and nPE2. Using transgenic mice, we observed that both enhancers drive identical expression patterns in the mammalian hypothalamus, starting at embryonic day 10.5, when endogenous Pomc expression commences. This overlapping enhancer activity is maintained throughout hypothalamic development and into adulthood. We also found that nPE1 and nPE2 were exapted as neuronal enhancers into the POMC locus after the sequential insertion of two unrelated retroposons. Thus, nPE1 and nPE2 are functional analogs and represent an authentic first example of convergent molecular evolution of cell-specific transcriptional enhancers. In this Commentary we discuss the following questions that remain unanswered: (1) how does transcriptional control of POMC operate in hypothalamic neurons of non-mammalian vertebrates? (2) What evolutionary forces are maintaining two discrete neuronal POMC enhancers under purifying selection for the last ~100 million years in all placental mammals? (3) What is the contribution of MaLRs to genome evolution?
Affiliation(s)
- Lucia F. Franchini
- Instituto de Investigaciones en Ingeniería Genética y Biología Molecular; Consejo Nacional de Investigaciones Científicas y Técnicas; Buenos Aires, Argentina
- Flavio S.J. de Souza
- Instituto de Investigaciones en Ingeniería Genética y Biología Molecular; Consejo Nacional de Investigaciones Científicas y Técnicas and Facultad de Ciencias Exactas y Naturales; Universidad de Buenos Aires; Buenos Aires, Argentina
- Malcolm J. Low
- Department of Molecular and Integrative Physiology; University of Michigan; Ann Arbor, MI USA
- Marcelo Rubinstein
- Instituto de Investigaciones en Ingeniería Genética y Biología Molecular; Consejo Nacional de Investigaciones Científicas y Técnicas and Facultad de Ciencias Exactas y Naturales; Universidad de Buenos Aires; Buenos Aires, Argentina
65
Klimopoulos A, Sellis D, Almirantis Y. Widespread occurrence of power-law distributions in inter-repeat distances shaped by genome dynamics. Gene 2012; 499:88-98. [PMID: 22370293 DOI: 10.1016/j.gene.2012.02.005] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 11/02/2011] [Revised: 02/05/2012] [Accepted: 02/06/2012] [Indexed: 11/25/2022]
Abstract
Repetitive DNA sequences derived from transposable elements (TE) are distributed in a non-random way, co-clustering with other classes of repeat elements, genes and other genomic components. In a previous work we reported power-law-like size distributions (linearity in log-log scale) in the spatial arrangement of Alu and LINE1 elements in the human genome. Here we investigate the large-scale features of the spatial arrangement of all principal classes of TEs in 14 genomes from phylogenetically distant organisms by studying the size distribution of inter-repeat distances. Power-law-like size distributions are found to be widespread, extending up to several orders of magnitude. In order to understand the emergence of this distributional pattern, we introduce an evolutionary scenario, which includes (i) Insertions of DNA segments (e.g., more recent repeats) into the considered sequence and (ii) Eliminations of members of the studied TE family. In the proposed model we also incorporate the potential for transposition events (characteristic of the DNA transposons' life-cycle) and segmental duplications. Simulations reproduce the main features of the observed size distributions. Furthermore, we investigate the effects of various genomic features on the presence and extent of power-law size distributions including TE class and age, mode of parental TE transmission, GC content, deletion and recombination rates in the studied genomic region, etc. Our observations corroborate the hypothesis that insertions of genomic material and eliminations of repeats are at the basis of power-laws in inter-repeat distances. The existence of these power-laws could facilitate the formation of the recently proposed "fractal globule" for the confined chromatin organization.
Affiliation(s)
- Alexandros Klimopoulos
- National Center for Scientific Research "Demokritos," Institute of Biology, 153 10 Athens, Greece.
66
Bire S, Rouleux-Bonnin F. Transposable elements as tools for reshaping the genome: it is a huge world after all! Methods Mol Biol 2012; 859:1-28. [PMID: 22367863 DOI: 10.1007/978-1-61779-603-6_1] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 05/31/2023]
Abstract
Transposable elements (TEs) are discrete pieces of DNA that can move from one site to another within genomes and sometimes between genomes. They are found in all major branches of life. Because of their wide distribution and considerable diversity, they are a considerable source of genomic variation and as such, they constitute powerful drivers of genome evolution. Moreover, it is becoming clear that the epigenetic regulation of certain genes is derived from defense mechanisms against the activity of ancestral transposable elements. TEs now tend to be viewed as natural molecular tools that can reshape the genome, which challenges the idea that TEs are natural tools used to answer biological questions. In the first part of this chapter, we review the classification and distribution of TEs, and look at how they have contributed to the structural and transcriptional reshaping of genomes. In the second part, we describe methodological innovations that have modified their contribution as molecular tools.
Affiliation(s)
- Solenne Bire
- GICC, UMR CNRS 6239, Université François Rabelais, UFR des Sciences et Techniques, Tours, France
67
Tashiro K, Teissier A, Kobayashi N, Nakanishi A, Sasaki T, Yan K, Tarabykin V, Vigier L, Sumiyama K, Hirakawa M, Nishihara H, Pierani A, Okada N. A mammalian conserved element derived from SINE displays enhancer properties recapitulating Satb2 expression in early-born callosal projection neurons. PLoS One 2011; 6:e28497. [PMID: 22174821 PMCID: PMC3234267 DOI: 10.1371/journal.pone.0028497] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.3] [Received: 07/26/2011] [Accepted: 11/09/2011] [Indexed: 02/04/2023]
Abstract
Short interspersed repetitive elements (SINEs) are highly repeated sequences that account for a significant proportion of many eukaryotic genomes and are usually considered "junk DNA". However, we previously discovered that many AmnSINE1 loci are evolutionarily conserved across mammalian genomes, suggesting that they may have acquired significant functions involved in controlling mammalian-specific traits. Notably, we identified the AS021 SINE locus, located 390 kbp upstream of Satb2. Using transgenic mice, we showed that this SINE displays specific enhancer activity in the developing cerebral cortex. The transcription factor Satb2 is expressed by cortical neurons extending axons through the corpus callosum and is a determinant of callosal versus subcortical projection. Mouse mutants reveal a crucial function for Satb2 in corpus callosum formation. In this study, we compared the enhancer activity of the AS021 locus with Satb2 expression during telencephalic development in the mouse. First, we showed that the AS021 enhancer is specifically activated in early-born Satb2(+) neurons. Second, we demonstrated that the activity of the AS021 enhancer recapitulates the expression of Satb2 at later embryonic and postnatal stages in deep-layer but not superficial-layer neurons, suggesting the possibility that the expression of Satb2 in these two subpopulations of cortical neurons is under genetically distinct transcriptional control. Third, we showed that the AS021 enhancer is activated in neurons projecting through the corpus callosum, as described for Satb2(+) neurons. Notably, AS021 drives specific expression in axons crossing through the ventral (TAG1(-)/NPY(+)) portion of the corpus callosum, confirming that it is active in a subpopulation of callosal neurons.
These data suggest that exaptation of the AS021 SINE locus might be involved in enhancement of Satb2 expression, leading to the establishment of interhemispheric communication via the corpus callosum, a eutherian-specific brain structure.
Affiliation(s)
- Kensuke Tashiro
- Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, Midori-ku, Yokohama, Kanagawa, Japan
- Anne Teissier
- Centre National de la Recherche Scientifique–Unité Mixte de Recherche 7592, Institut Jacques Monod, Université Paris Diderot, Sorbonne Paris Cité, Paris, France
- Naoki Kobayashi
- Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, Midori-ku, Yokohama, Kanagawa, Japan
- Akiko Nakanishi
- Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, Midori-ku, Yokohama, Kanagawa, Japan
- Takeshi Sasaki
- Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, Midori-ku, Yokohama, Kanagawa, Japan
- Kuo Yan
- Department of Molecular Biology of Neuronal Signals, Max-Planck-Institute for Experimental Medicine, Göttingen, Germany
- Victor Tarabykin
- Department of Molecular Biology of Neuronal Signals, Max-Planck-Institute for Experimental Medicine, Göttingen, Germany
- Lisa Vigier
- Centre National de la Recherche Scientifique–Unité Mixte de Recherche 7592, Institut Jacques Monod, Université Paris Diderot, Sorbonne Paris Cité, Paris, France
- Kenta Sumiyama
- National Institute of Genetics, Mishima, Shizuoka, Japan
- Mika Hirakawa
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto, Japan
- Hidenori Nishihara
- Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, Midori-ku, Yokohama, Kanagawa, Japan
- Alessandra Pierani
- Centre National de la Recherche Scientifique–Unité Mixte de Recherche 7592, Institut Jacques Monod, Université Paris Diderot, Sorbonne Paris Cité, Paris, France
- Norihiro Okada
- Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, Midori-ku, Yokohama, Kanagawa, Japan
|
68
|
Fan G, Li J. Regions identity between the genome of vertebrates and non-retroviral families of insect viruses. Virol J 2011; 8:511. [PMID: 22073942 PMCID: PMC3226645 DOI: 10.1186/1743-422x-8-511] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 07/03/2011] [Accepted: 11/10/2011] [Indexed: 01/06/2023]
Abstract
Background Our understanding of the evolutionary history shared by viruses and animals is limited. The recent availability of many complete insect virus and vertebrate genomes, together with the ability to screen these sequences, makes it possible to gain new insight into the evolutionary interaction between insect viruses and vertebrates. This study aims to determine whether regions of sequence identity exist between the genomes of insect viruses and vertebrates, to explain this phenomenon in terms of mobile genetic elements, and to investigate the evolutionary relationships among these short regions of identity across species. Results Some of the insect viruses studied contain variable numbers of short regions of sequence identity to vertebrate genomes, ranging in length from 28 bp to 124 bp. These regions are located at multiple sites in the vertebrate genomes. The ontology of the animal genes containing identical regions covers several processes, including chromatin remodeling, regulation of apoptosis, signaling pathways, nervous system development and some enzyme-like catalysis. Phylogenetic analysis reveals that at least some short regions of sequence identity in vertebrate genomes are derived from the ancestors of insect viruses. Conclusion Short regions of sequence identity were found in vertebrates and insect viruses. These sequences played an important role not only in the long-term evolution of vertebrates, but also in the promotion of insect viruses. This typical win-win strategy may result from natural selection.
Affiliation(s)
- Gaowei Fan
- National Center for Clinical Laboratories, Beijing Hospital, Beijing, China
|
69
|
Dufresne F, Jeffery N. A guided tour of large genome size in animals: what we know and where we are heading. Chromosome Res 2011; 19:925-38. [DOI: 10.1007/s10577-011-9248-x] [Citation(s) in RCA: 62] [Impact Index Per Article: 4.8] [Indexed: 11/30/2022]
|
70
|
Retrofitting the genome: L1 extinction follows endogenous retroviral expansion in a group of muroid rodents. J Virol 2011; 85:12315-23. [PMID: 21957310 DOI: 10.1128/jvi.05180-11] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Indexed: 11/20/2022]
Abstract
Long interspersed nuclear element 1 (LINE-1; L1) retrotransposons are the most common retroelements in mammalian genomes. Unlike individual families of endogenous retroviruses (ERVs), they have remained active throughout the mammalian radiation and are responsible for most of the retroelement movement and much genome rearrangement within mammals. They can be viewed as occupying a substantial niche within mammalian genomes. Our previous demonstration that L1s and B1 short interspersed nuclear elements (SINEs) are inactive in a group of South American rodents led us to ask if other elements have amplified to fill the empty niche. We identified a novel and highly active family of ERVs (mysTR). To determine whether loss of L1 activity was correlated with expansion of mysTR, we examined mysTR activity in four South American rodent species that have lost L1 and B1 activity and four sister species with active L1s. The copy number of recent mysTR insertions was extremely high, with an average of 4,200 copies per genome. High copy numbers exist in both L1-active and L1-extinct species, so the mysTR expansion appears to have preceded the loss of both SINE and L1 activity rather than to have filled an empty niche created by their loss. It may be coincidental that two unusual genomic events--loss of L1 activity and massive expansion of an ERV family--occur in the same group of mammals. Alternatively, it is possible that this large ERV expansion set the stage for L1 extinction.
|
71
|
Pathak D, Ali S. RsaI repetitive DNA in Buffalo Bubalus bubalis representing retrotransposons, conserved in bovids, are part of the functional genes. BMC Genomics 2011; 12:338. [PMID: 21718551 PMCID: PMC3149587 DOI: 10.1186/1471-2164-12-338] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 04/06/2011] [Accepted: 07/01/2011] [Indexed: 11/10/2022]
Abstract
BACKGROUND Repetitive sequences are the major components of eukaryotic genomes. Association of these repeats with transcribing sequences and their regulation in buffalo Bubalus bubalis has remained largely unresolved. RESULTS We cloned and sequenced RsaI repeat fragments pDp1, pDp2, pDp3 and pDp4 of 1331, 651, 603 and 339 base pairs, respectively, from the buffalo, Bubalus bubalis. Upon characterization, these fragments were found to represent retrotransposons and part of some functional genes. The resultant clones showed cross hybridization only with buffalo, cattle, goat and sheep genomic DNA. Real-time PCR detected ~2 × 10^4 copies of pDp1, ~3000 copies of pDp2 and pDp3 and ~1000 copies of pDp4 in buffalo, cattle, goat and sheep genomes. RsaI repeats are transcriptionally active in somatic tissues and spermatozoa. Accordingly, pDp1 showed maximum expression in lung, pDp2 and pDp3 both in kidney, and pDp4 in ovary. Fluorescence in situ hybridization showed repeats to be distributed all across the chromosomes. CONCLUSIONS The data suggest that RsaI repeats have been incorporated into the exonic regions of various transcribing genes, possibly contributing towards the architecture and evolution of the buffalo and related genomes. Prospects of our present work in the context of comparative and functional genomics are highlighted.
Affiliation(s)
- Deepali Pathak
- Molecular Genetics Laboratory, National Institute of Immunology, Aruna Asaf Ali Marg, New Delhi -110 067, India
|
72
|
Hua-Van A, Le Rouzic A, Boutin TS, Filée J, Capy P. The struggle for life of the genome's selfish architects. Biol Direct 2011; 6:19. [PMID: 21414203 PMCID: PMC3072357 DOI: 10.1186/1745-6150-6-19] [Citation(s) in RCA: 180] [Impact Index Per Article: 13.8] [Received: 08/25/2010] [Accepted: 03/17/2011] [Indexed: 01/28/2023]
Abstract
Transposable elements (TEs) were first discovered more than 50 years ago, but were totally ignored for a long time. Over the last few decades they have gradually attracted increasing interest from research scientists. Initially they were viewed as totally marginal and anecdotal, but TEs have been revealed as potentially harmful parasitic entities, ubiquitous in genomes, and finally as unavoidable actors in the diversity, structure, and evolution of the genome. Since Darwin's theory of evolution, and the progress of molecular biology, transposable elements may be the discovery that has most influenced our vision of (genome) evolution. In this review, we provide a synopsis of what is known about the complex interactions that exist between transposable elements and the host genome. Numerous examples of these interactions are provided, first from the standpoint of the genome, and then from that of the transposable elements. We also explore the evolutionary aspects of TEs in the light of post-Darwinian theories of evolution.
Affiliation(s)
- Aurélie Hua-Van
- Laboratoire Evolution, Génomes, Spéciation, CNRS UPR9034/Université Paris-Sud, Gif-sur-Yvette, France.
|
73
|
Recent amplification of the kangaroo endogenous retrovirus, KERV, limited to the centromere. J Virol 2011; 85:4761-71. [PMID: 21389136 DOI: 10.1128/jvi.01604-10] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Indexed: 11/20/2022]
Abstract
Mammalian retrotransposons, transposable elements that are processed through an RNA intermediate, are categorized as short interspersed elements (SINEs), long interspersed elements (LINEs), and long terminal repeat (LTR) retroelements, which include endogenous retroviruses. The ability of transposable elements to autonomously amplify led to their initial characterization as selfish or junk DNA; however, it is now known that they may acquire specific cellular functions in a genome and are implicated in host defense mechanisms as well as in genome evolution. Interactions between classes of transposable elements may exert a markedly different and potentially more significant effect on a genome than interactions between members of a single class of transposable elements. We examined the genomic structure and evolution of the kangaroo endogenous retrovirus (KERV) in the marsupial genus Macropus. The complete proviral structure of the kangaroo endogenous retrovirus, phylogenetic relationship among relative retroviruses, and expression of this virus in both Macropus rufogriseus and M. eugenii are presented for the first time. In addition, we show the relative copy number and distribution of the kangaroo endogenous retrovirus in the Macropus genus. Our data indicate that amplification of the kangaroo endogenous retrovirus occurred in a lineage-specific fashion, is restricted to the centromeres, and is not correlated with LINE depletion. Finally, analysis of KERV long terminal repeat sequences using massively parallel sequencing indicates that the recent amplification in M. rufogriseus is likely due to duplications and concerted evolution rather than a high number of independent insertion events.
|
74
|
Alkan C, Cardone MF, Catacchio CR, Antonacci F, O'Brien SJ, Ryder OA, Purgato S, Zoli M, Della Valle G, Eichler EE, Ventura M. Genome-wide characterization of centromeric satellites from multiple mammalian genomes. Genome Res 2010; 21:137-45. [PMID: 21081712 DOI: 10.1101/gr.111278.110] [Citation(s) in RCA: 71] [Impact Index Per Article: 5.1] [Indexed: 11/24/2022]
Abstract
Despite its importance in cell biology and evolution, the centromere has remained the final frontier in genome assembly and annotation due to its complex repeat structure. However, isolation and characterization of the centromeric repeats from newly sequenced species are necessary for a complete understanding of genome evolution and function. In recent years, various genomes have been sequenced, but the characterization of the corresponding centromeric DNA has lagged behind. Here, we present a computational method (RepeatNet) to systematically identify higher-order repeat structures from unassembled whole-genome shotgun sequence and test whether these sequence elements correspond to functional centromeric sequences. We analyzed genome datasets from six species of mammals representing the diversity of the mammalian lineage, namely, horse, dog, elephant, armadillo, opossum, and platypus. We define candidate monomer satellite repeats and demonstrate centromeric localization for five of the six genomes. Our analysis revealed the greatest diversity of centromeric sequences in horse and dog, in contrast to elephant and armadillo, which showed high centromeric sequence homogeneity. We could not isolate centromeric sequences within the platypus genome, suggesting that centromeres in platypus are not enriched in satellite DNA. Our method can be applied to the characterization of thousands of other vertebrate genomes anticipated for sequencing in the near future, providing an important tool for annotation of centromeres.
Affiliation(s)
- Can Alkan
- Department of Genome Sciences, Howard Hughes Medical Institute, University of Washington School of Medicine, Seattle, Washington 98195, USA
|
75
|
Dalloul RA, Long JA, Zimin AV, Aslam L, Beal K, Ann Blomberg L, Bouffard P, Burt DW, Crasta O, Crooijmans RPMA, Cooper K, Coulombe RA, De S, Delany ME, Dodgson JB, Dong JJ, Evans C, Frederickson KM, Flicek P, Florea L, Folkerts O, Groenen MAM, Harkins TT, Herrero J, Hoffmann S, Megens HJ, Jiang A, de Jong P, Kaiser P, Kim H, Kim KW, Kim S, Langenberger D, Lee MK, Lee T, Mane S, Marcais G, Marz M, McElroy AP, Modise T, Nefedov M, Notredame C, Paton IR, Payne WS, Pertea G, Prickett D, Puiu D, Qioa D, Raineri E, Ruffier M, Salzberg SL, Schatz MC, Scheuring C, Schmidt CJ, Schroeder S, Searle SMJ, Smith EJ, Smith J, Sonstegard TS, Stadler PF, Tafer H, Tu Z(J), Van Tassell CP, Vilella AJ, Williams KP, Yorke JA, Zhang L, Zhang HB, Zhang X, Zhang Y, Reed KM. Multi-platform next-generation sequencing of the domestic turkey (Meleagris gallopavo): genome assembly and analysis. PLoS Biol 2010; 8:e1000475. [PMID: 20838655 PMCID: PMC2935454 DOI: 10.1371/journal.pbio.1000475] [Citation(s) in RCA: 320] [Impact Index Per Article: 22.9] [Received: 12/21/2009] [Accepted: 07/27/2010] [Indexed: 12/11/2022]
Abstract
A synergistic combination of two next-generation sequencing platforms with a detailed comparative BAC physical contig map provided a cost-effective assembly of the genome sequence of the domestic turkey (Meleagris gallopavo). Heterozygosity of the sequenced source genome allowed discovery of more than 600,000 high quality single nucleotide variants. Despite this heterozygosity, the current genome assembly (∼1.1 Gb) includes 917 Mb of sequence assigned to specific turkey chromosomes. Annotation identified nearly 16,000 genes, with 15,093 recognized as protein coding and 611 as non-coding RNA genes. Comparative analysis of the turkey, chicken, and zebra finch genomes, and comparing avian to mammalian species, supports the characteristic stability of avian genomes and identifies genes unique to the avian lineage. Clear differences are seen in number and variety of genes of the avian immune system where expansions and novel genes are less frequent than examples of gene loss. The turkey genome sequence provides resources to further understand the evolution of vertebrate genomes and genetic variation underlying economically important quantitative traits in poultry. This integrated approach may be a model for providing both gene and chromosome level assemblies of other species with agricultural, ecological, and evolutionary interest.
Affiliation(s)
- Rami A. Dalloul
- Avian Immunobiology Laboratory, Department of Animal and Poultry Sciences, Virginia Tech, Blacksburg, Virginia, United States of America
- Julie A. Long
- Animal Biosciences and Biotechnology Laboratory, USDA Agricultural Research Service, Beltsville, Maryland, United States of America
- Aleksey V. Zimin
- Institute for Physical Science and Technology, University of Maryland, College Park, Maryland, United States of America
- Luqman Aslam
- Animal Breeding and Genomics Centre, Wageningen University, Wageningen, the Netherlands
- Kathryn Beal
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
- Le Ann Blomberg
- Animal Biosciences and Biotechnology Laboratory, USDA Agricultural Research Service, Beltsville, Maryland, United States of America
- Pascal Bouffard
- Roche Applied Science, Indianapolis, Indiana, United States of America
- David W. Burt
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Roslin, Midlothian, United Kingdom
- Oswald Crasta
- Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, Virginia, United States of America
- Chromatin Inc., Champaign, Illinois, United States of America
- Kristal Cooper
- Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, Virginia, United States of America
- Roger A. Coulombe
- Department of Veterinary Sciences, Utah State University, Logan, Utah, United States of America
- Supriyo De
- Gene Expression and Genomics Unit, National Institute on Aging, National Institutes of Health, Baltimore, Maryland, United States of America
- Mary E. Delany
- Department of Animal Science, University of California, Davis, California, United States of America
- Jerry B. Dodgson
- Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, Michigan, United States of America
- Jennifer J. Dong
- Department of Soil and Crop Sciences, Texas A&M University, College Station, Texas, United States of America
- Clive Evans
- Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, Virginia, United States of America
- Paul Flicek
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
- Liliana Florea
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland, United States of America
- Otto Folkerts
- Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, Virginia, United States of America
- Chromatin Inc., Champaign, Illinois, United States of America
- Martien A. M. Groenen
- Animal Breeding and Genomics Centre, Wageningen University, Wageningen, the Netherlands
- Tim T. Harkins
- Roche Applied Science, Indianapolis, Indiana, United States of America
- Javier Herrero
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
- Steve Hoffmann
- Department of Computer Science and Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany
- LIFE Project, University of Leipzig, Leipzig, Germany
- Hendrik-Jan Megens
- Animal Breeding and Genomics Centre, Wageningen University, Wageningen, the Netherlands
- Andrew Jiang
- Department of Animal Science, University of California, Davis, California, United States of America
- Pieter de Jong
- Children's Hospital and Research Center at Oakland, Oakland, California, United States of America
- Pete Kaiser
- Institute for Animal Health, Compton, Berkshire, United Kingdom
- Heebal Kim
- Laboratory of Bioinformatics and Population Genetics, Department of Agricultural Biotechnology, Seoul National University, Seoul, Korea
- Kyu-Won Kim
- Laboratory of Bioinformatics and Population Genetics, Department of Agricultural Biotechnology, Seoul National University, Seoul, Korea
- Sungwon Kim
- Avian Immunobiology Laboratory, Department of Animal and Poultry Sciences, Virginia Tech, Blacksburg, Virginia, United States of America
- David Langenberger
- Department of Computer Science and Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany
- Mi-Kyung Lee
- Department of Soil and Crop Sciences, Texas A&M University, College Station, Texas, United States of America
- Taeheon Lee
- Laboratory of Bioinformatics and Population Genetics, Department of Agricultural Biotechnology, Seoul National University, Seoul, Korea
- Shrinivasrao Mane
- Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, Virginia, United States of America
- Guillaume Marcais
- Institute for Physical Science and Technology, University of Maryland, College Park, Maryland, United States of America
- Manja Marz
- Department of Computer Science and Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany
- Philipps-Universität Marburg, Pharmazeutische Chemie, Marburg, Germany
- Audrey P. McElroy
- Avian Immunobiology Laboratory, Department of Animal and Poultry Sciences, Virginia Tech, Blacksburg, Virginia, United States of America
- Thero Modise
- Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, Virginia, United States of America
- Mikhail Nefedov
- Children's Hospital and Research Center at Oakland, Oakland, California, United States of America
- Cédric Notredame
- Comparative Bioinformatics, Centre for Genomic Regulation (CRG), Universitat Pompeu Fabra, Barcelona, Spain
- Ian R. Paton
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Roslin, Midlothian, United Kingdom
- William S. Payne
- Department of Microbiology and Molecular Genetics, Michigan State University, East Lansing, Michigan, United States of America
- Geo Pertea
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland, United States of America
- Dennis Prickett
- Institute for Animal Health, Compton, Berkshire, United Kingdom
- Daniela Puiu
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland, United States of America
- Dan Qioa
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia, United States of America
- Emanuele Raineri
- Comparative Bioinformatics, Centre for Genomic Regulation (CRG), Universitat Pompeu Fabra, Barcelona, Spain
- Magali Ruffier
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
- Steven L. Salzberg
- Center for Bioinformatics and Computational Biology, Department of Computer Science, University of Maryland, College Park, Maryland, United States of America
- Michael C. Schatz
- Center for Bioinformatics and Computational Biology, Department of Computer Science, University of Maryland, College Park, Maryland, United States of America
- Chantel Scheuring
- Department of Soil and Crop Sciences, Texas A&M University, College Station, Texas, United States of America
- Carl J. Schmidt
- Department of Animal and Food Sciences, University of Delaware, Newark, Delaware, United States of America
- Steven Schroeder
- Bovine Functional Genomics Laboratory, USDA Agricultural Research Service, Beltsville Agricultural Research Center, Beltsville, Maryland, United States of America
- Stephen M. J. Searle
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
- Edward J. Smith
- Avian Immunobiology Laboratory, Department of Animal and Poultry Sciences, Virginia Tech, Blacksburg, Virginia, United States of America
- Jacqueline Smith
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Roslin, Midlothian, United Kingdom
- Tad S. Sonstegard
- Bovine Functional Genomics Laboratory, USDA Agricultural Research Service, Beltsville Agricultural Research Center, Beltsville, Maryland, United States of America
- Peter F. Stadler
- Department of Computer Science and Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany
- Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
- Fraunhofer Institut für Zelltherapie und Immunologie, Leipzig, Germany
- Department of Theoretical Chemistry University of Vienna, Vienna, Austria
- Santa Fe Institute, Santa Fe, New Mexico, United States of America
- Hakim Tafer
- Department of Computer Science and Interdisciplinary Center for Bioinformatics, University of Leipzig, Leipzig, Germany
- Department of Theoretical Chemistry University of Vienna, Vienna, Austria
- Zhijian (Jake) Tu
- Department of Biochemistry, Virginia Tech, Blacksburg, Virginia, United States of America
- Curtis P. Van Tassell
- Bovine Functional Genomics Laboratory, USDA Agricultural Research Service, Beltsville Agricultural Research Center, Beltsville, Maryland, United States of America
- Animal Improvement Programs Laboratory, USDA Agricultural Research Service, Beltsville Agricultural Research Center, Beltsville, Maryland, United States of America
- Albert J. Vilella
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
- Kelly P. Williams
- Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, Virginia, United States of America
- James A. Yorke
- Institute for Physical Science and Technology, University of Maryland, College Park, Maryland, United States of America
- Liqing Zhang
- Department of Computer Science, Virginia Tech, Blacksburg, Virginia, United States of America
- Hong-Bin Zhang
- Department of Soil and Crop Sciences, Texas A&M University, College Station, Texas, United States of America
- Xiaojun Zhang
- Department of Soil and Crop Sciences, Texas A&M University, College Station, Texas, United States of America
- Yang Zhang
- Department of Soil and Crop Sciences, Texas A&M University, College Station, Texas, United States of America
- Kent M. Reed
- Department of Veterinary and Biomedical Sciences, College of Veterinary Medicine, University of Minnesota, St. Paul, Minnesota, United States of America
|
76
|
Kojima KK, Kapitonov VV, Jurka J. Recent expansion of a new Ingi-related clade of Vingi non-LTR retrotransposons in hedgehogs. Mol Biol Evol 2010; 28:17-20. [PMID: 20716533 DOI: 10.1093/molbev/msq220] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Indexed: 01/02/2023]
Abstract
Autonomous non-long terminal repeat (non-LTR) retrotransposons and their repetitive remnants are ubiquitous components of mammalian genomes. Recently, we identified non-LTR retrotransposon families, Ingi-1_AAl and Ingi-1_EE, in two hedgehog genomes. Here we rename them to Vingi-1_AAl and Vingi-1_EE and report a new clade "Vingi," which is a sister clade of Ingi that lacks the ribonuclease H domain. In the European hedgehog genome, there are 11 non-autonomous families of elements derived from Vingi-1_EE by internal deletions. No retrotransposons related to Vingi elements were found in any of the remaining 33 mammalian genomes nearly completely sequenced to date, but we identified several new families of Vingi and Ingi retrotransposons outside mammals. Our data suggest the horizontal transfer of Vingi elements to hedgehog, although vertical transfer cannot be ruled out. The compact structure and trans-mobilization of non-autonomous derivatives of Vingi can make them useful for an in vivo retrotransposition assay system.
|
77
|
Tracking marsupial evolution using archaic genomic retroposon insertions. PLoS Biol 2010; 8:e1000436. [PMID: 20668664 PMCID: PMC2910653 DOI: 10.1371/journal.pbio.1000436] [Citation(s) in RCA: 123] [Impact Index Per Article: 8.8] [Received: 01/28/2010] [Accepted: 06/15/2010] [Indexed: 01/05/2023]
Abstract
Genome-wide comparisons of shared retroposon insertion patterns resolve the phylogeny of marsupials, clearly distinguishing South American and Australian species and lending support to Didelphimorphia as the basal split. The Australasian and South American marsupial mammals, such as kangaroos and opossums, are the closest living relatives to placental mammals, having shared a common ancestor around 130 million years ago. The evolutionary relationships among the seven marsupial orders have, however, so far eluded resolution. In particular, the relationships between the four Australasian and three South American marsupial orders have been intensively debated since the South American order Microbiotheria was taxonomically moved into the group Australidelphia. Australidelphia is significantly supported by both molecular and morphological data and comprises the four Australasian marsupial orders and the South American order Microbiotheria, indicating a complex, ancient, biogeographic history of marsupials. However, the exact phylogenetic position of Microbiotheria within Australidelphia has yet to be resolved using either sequence or morphological data analysis. Here, we provide evidence from newly established and virtually homoplasy-free retroposon insertion markers for the basal relationships among marsupial orders. Fifty-three phylogenetically informative markers were retrieved after in silico and experimental screening of ∼217,000 retroposon-containing loci from opossum and kangaroo. The four Australasian orders share a single origin with Microbiotheria as their closest sister group, supporting a clear divergence between South American and Australasian marsupials. In addition, the new data place the South American opossums (Didelphimorphia) as the first branch of the marsupial tree. The exhaustive computational and experimental evidence provides important insight into the evolution of retroposable elements in the marsupial genome. 
Placing the retroposon insertion pattern in a paleobiogeographic context indicates a single marsupial migration from South America to Australia. The now firmly established phylogeny can be used to determine the direction of genomic changes and morphological transitions within marsupials. Ever since the first Europeans reached the Australian shores and were fascinated by the curious marsupials they found, the evolutionary relationships between the living Australian and South American marsupial orders have been intensively investigated. However, neither the morphological nor the more recent molecular methods produced an evolutionary consensus. Most problematic of the seven marsupial groups is the South American species Dromiciops gliroides, the only survivor of the order Microbiotheria. Several studies suggest that Dromiciops, although living in South America, is more closely related to Australian than to South American marsupials. This relationship would have required a complex migration scenario whereby several groups of ancestral South American marsupials migrated across Antarctica to Australia. We screened the genomes of the South American opossum and the Australian tammar wallaby for retroposons, unambiguous phylogenetic markers that occupy more than half of the marsupial genome. From analyses of nearly 217,000 retroposon-containing loci, we identified 53 retroposons that resolve most branches of the marsupial evolutionary tree. Dromiciops is clearly only distantly related to Australian marsupials, supporting a single Gondwanan migration of marsupials from South America to Australia. The new phylogeny offers a novel perspective in understanding the morphological and molecular transitions between the South American and Australian marsupials.
|
78
|
The role of transposable elements in the evolution of non-mammalian vertebrates and invertebrates. Genome Biol 2010; 11:R59. [PMID: 20525173 PMCID: PMC2911107 DOI: 10.1186/gb-2010-11-6-r59] [Citation(s) in RCA: 79] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2010] [Revised: 04/27/2010] [Accepted: 06/02/2010] [Indexed: 01/29/2023] Open
Abstract
Background Transposable elements (TEs) have played an important role in the diversification and enrichment of mammalian transcriptomes through various mechanisms such as exonization and intronization (the birth of new exons/introns from previously intronic/exonic sequences, respectively), and insertion into first and last exons. However, no extensive analysis has compared the effects of TEs on the transcriptomes of mammals, non-mammalian vertebrates and invertebrates. Results We analyzed the influence of TEs on the transcriptomes of five species, three invertebrates and two non-mammalian vertebrates. Compared to previously analyzed mammals, there were lower levels of TE introduction into introns, significantly lower numbers of exonizations originating from TEs and a lower percentage of TE insertion within the first and last exons. Although the transcriptomes of vertebrates exhibit significant levels of exonization of TEs, only anecdotal cases were found in invertebrates. In vertebrates, as in mammals, the exonized TEs are mostly alternatively spliced, indicating that selective pressure maintains the original mRNA product generated from such genes. Conclusions Exonization of TEs is widespread in mammals, less so in non-mammalian vertebrates, and very low in invertebrates. We assume that the exonization process depends on the length of introns. Vertebrates, unlike invertebrates, are characterized by long introns and short internal exons. Our results suggest that there is a direct link between the length of introns and exonization of TEs and that this process became more prevalent following the appearance of mammals.
|
79
|
Unique functions of repetitive transcriptomes. INTERNATIONAL REVIEW OF CELL AND MOLECULAR BIOLOGY 2010; 285:115-88. [PMID: 21035099 DOI: 10.1016/b978-0-12-381047-2.00003-7] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/22/2022]
Abstract
Repetitive sequences occupy a huge fraction of essentially every eukaryotic genome. Repetitive sequences cover more than 50% of mammalian genomic DNAs, whereas gene exons and protein-coding sequences occupy only ~3% and 1%, respectively. Numerous genomic repeats include genes themselves. They generally encode "selfish" proteins necessary for the proliferation of transposable elements (TEs) in the host genome. The major part of evolutionary "older" TEs accumulated mutations over time and fails to encode functional proteins. However, repeats have important functions also on the RNA level. Repetitive transcripts may serve as multifunctional RNAs by participating in the antisense regulation of gene activity and by competing with the host-encoded transcripts for cellular factors. In addition, genomic repeats include regulatory sequences like promoters, enhancers, splice sites, polyadenylation signals, and insulators, which actively reshape cellular transcriptomes. TE expression is tightly controlled by the host cells, and some mechanisms of this regulation were recently decoded. Finally, capacity of TEs to proliferate in the host genome led to the development of multiple biotechnological applications.
|
80
|
Veitia RA, Bottani S. Whole genome duplications and a 'function' for junk DNA? Facts and hypotheses. PLoS One 2009; 4:e8201. [PMID: 20011530 PMCID: PMC2788606 DOI: 10.1371/journal.pone.0008201] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/30/2009] [Accepted: 11/09/2009] [Indexed: 12/18/2022] Open
Abstract
Background The lack of correlation between genome size and organismal complexity is understood in terms of the massive presence of repetitive and non-coding DNA. This non-coding subgenome has long been called “junk” DNA. However, it might have important functions. Generation of junk DNA depends on proliferation of selfish DNA elements and on local or global DNA duplication followed by genic non-functionalization. Methodology/Principal Findings Evidence from genomic analyses and experimental data indicates that Whole Genome Duplications (WGD) are often followed by a return to the diploid state, through DNA deletions and intra/interchromosomal rearrangements. We use simple theoretical models and simulations to explore how a WGD accompanied by sequence deletions might affect the dosage balance often required among several gene products involved in regulatory processes. We find that potential genomic deletions leading to changes in nuclear and cell volume might potentially perturb gene dosage balance. Conclusions/Significance The potentially negative impact of DNA deletions can be buffered if deleted genic DNA is, at least temporarily, replaced by repetitive DNA so that the nuclear/cell volume remains compatible with normal living. Thus, we speculate that retention of non-functionalized non-coding DNA, and replacement of deleted DNA through proliferation of selfish elements, might help avoid dosage imbalances in cycles of polyploidization and diploidization, which are particularly frequent in plants.
|
81
|
Gogvadze E, Buzdin A. Retroelements and their impact on genome evolution and functioning. Cell Mol Life Sci 2009; 66:3727-42. [PMID: 19649766 PMCID: PMC11115525 DOI: 10.1007/s00018-009-0107-2] [Citation(s) in RCA: 78] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/29/2009] [Revised: 06/11/2009] [Accepted: 07/14/2009] [Indexed: 12/31/2022]
Abstract
Retroelements comprise a considerable fraction of eukaryotic genomes. Since their initial discovery by Barbara McClintock in maize DNA, retroelements have been found in genomes of almost all organisms. First considered as a "junk DNA" or genomic parasites, they were shown to influence genome functioning and to promote genetic innovations. For this reason, they were suggested as an important creative force in the genome evolution and adaptation of an organism to altered environmental conditions. In this review, we summarize the up-to-date knowledge of different ways of retroelement involvement in structural and functional evolution of genes and genomes, as well as the mechanisms generated by cells to control their retrotransposition.
Affiliation(s)
- Elena Gogvadze
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, 16/10 Miklukho-Maklaya st, 117997 Moscow, Russia.
|
82
|
Transposable elements in gene regulation and in the evolution of vertebrate genomes. Curr Opin Genet Dev 2009; 19:607-12. [PMID: 19914058 DOI: 10.1016/j.gde.2009.10.013] [Citation(s) in RCA: 112] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2009] [Revised: 10/20/2009] [Accepted: 10/26/2009] [Indexed: 01/30/2023]
Abstract
Repetitive DNA and in particular transposable elements have been intimately linked to eukaryotic genomes for millions of years. Once overlooked for being only a collection of selfish debris and a nuisance for sequence assembly, genomic repeats are now being recognized as a key driving force in genome evolution. Indeed, by changing the DNA landscape of genomes, transposable elements have been a rich source of innovation in genes, regulatory elements and genome structures. In this review, I will focus on recent advances that demonstrate that genomic repeats have had a global impact on vertebrate gene regulatory networks. I will also summarize results that show how transposable elements have been a major catalyst of structural rearrangements throughout evolution.
|
83
|
Luchetti A, Mantovani B. Talua SINE Biology in the Genome of the Reticulitermes Subterranean Termites (Isoptera, Rhinotermitidae). J Mol Evol 2009; 69:589-600. [DOI: 10.1007/s00239-009-9285-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/26/2009] [Accepted: 09/21/2009] [Indexed: 10/20/2022]
|
84
|
Ray DA, Platt RN, Batzer MA. Reading between the LINEs to see into the past. Trends Genet 2009; 25:475-9. [PMID: 19837475 DOI: 10.1016/j.tig.2009.09.006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/2009] [Revised: 09/15/2009] [Accepted: 09/15/2009] [Indexed: 11/24/2022]
Abstract
Transposable elements (TEs) are an important source of genome diversity and play a crucial role in genome evolution. A recent study by Zhao et al. describes novel patterns of TE diversification in the genome of the extinct mammoth Mammuthus primigenius. Analysis of Mammuthus has provided a unique genome landscape, a pivotal species for understanding TEs and genome evolution and hints at the diversity we verge on discovering by expanding our taxonomic sampling among genomes. Strategies based on this work might also revolutionize investigations of the interface between TE dynamics and genome diversity.
Affiliation(s)
- David A Ray
- Department of Biochemistry and Molecular Biology, Box 9650, Mississippi State University, Mississippi State, MS 39762, USA
|
85
|
Adelson DL, Raison JM, Edgar RC. Characterization and distribution of retrotransposons and simple sequence repeats in the bovine genome. Proc Natl Acad Sci U S A 2009; 106:12855-60. [PMID: 19625614 PMCID: PMC2722308 DOI: 10.1073/pnas.0901282106] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/05/2009] [Indexed: 12/11/2022] Open
Abstract
Interspersed repeat composition and distribution in mammals have been best characterized in the human and mouse genomes. The bovine genome contains typical eutherian mammal repeats, but also has a significant number of long interspersed nuclear element RTE (BovB) elements proposed to have been horizontally transferred from squamata. Our analysis of the BovB repeats has indicated that only a few of them are currently likely to retrotranspose in cattle. However, bovine L1 repeats (L1 BT) have many likely active copies. Comparison of substitution rates for BovB and L1 BT indicates that L1 BT is a younger repeat family than BovB. In contrast to mouse and human, L1 occurrence is not negatively correlated with G+C content. However, BovB, Bov A2, ART2A, and Bov-tA are negatively correlated with G+C, although Bov-tA's correlation is weaker. Also, by performing genome-wide correlation analysis of interspersed and simple sequence repeats, we have identified genome territories by repeat content that appear to define ancestral vs. ruminant-specific genomic regions. These ancestral regions, enriched with L2 and MIR repeats, are largely conserved between bovine and human.
Affiliation(s)
- David L Adelson
- School of Molecular and Biomedical Science, University of Adelaide, North Terrace, Adelaide, South Australia, 5005, Australia.
|
86
|
Zhao F, Qi J, Schuster SC. Tracking the past: interspersed repeats in an extinct Afrotherian mammal, Mammuthus primigenius. Genome Res 2009; 19:1384-92. [PMID: 19508981 DOI: 10.1101/gr.091363.109] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/06/2023]
Abstract
The woolly mammoth (Mammuthus primigenius) died out several thousand years ago, yet recent paleogenomic studies have successfully recovered genetic information from both the mitochondrial and nuclear genomes of this extinct species. Mammoths belong to Afrotheria, a group of mammals exhibiting extreme morphological diversity and large genome sizes. In this study, we found that the mammoth genome contains a larger proportion of interspersed repeats than any other mammalian genome reported so far, in which the proliferation of the RTE family of retrotransposons (covering 12% of the genome) may be the main reason for an increased genome size. Phylogenetic analysis showed that RTEs in mammoth are closely related to the family BovB/RTE. The incongruence of the reconstructed RTE phylogeny indicates that RTEs in mammoth may be acquired through an ancient lateral gene transfer event. A recent proliferation of SINEs was also found in the proboscidean lineage, whereas the Afrotherian-wide SINEs in mammoth have undergone a rather flat and stepwise expansion. Comparisons of the transposable elements (TEs) between mammoth and other mammals may shed light on the evolutionary history of TEs in various mammalian lineages.
Affiliation(s)
- Fangqing Zhao
- Center for Comparative Genomics and Bioinformatics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
|
87
|
Kanizay L, Dawe RK. Centromeres: long intergenic spaces with adaptive features. Funct Integr Genomics 2009; 9:287-92. [DOI: 10.1007/s10142-009-0124-0] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/16/2009] [Revised: 04/20/2009] [Accepted: 04/24/2009] [Indexed: 12/12/2022]
|
88
|
Novick PA, Basta H, Floumanhaft M, McClure MA, Boissinot S. The Evolutionary Dynamics of Autonomous Non-LTR Retrotransposons in the Lizard Anolis Carolinensis Shows More Similarity to Fish Than Mammals. Mol Biol Evol 2009; 26:1811-22. [DOI: 10.1093/molbev/msp090] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/12/2023] Open
|
89
|
Bao W, Jurka MG, Kapitonov VV, Jurka J. New superfamilies of eukaryotic DNA transposons and their internal divisions. Mol Biol Evol 2009; 26:983-93. [PMID: 19174482 PMCID: PMC2727372 DOI: 10.1093/molbev/msp013] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 01/13/2009] [Indexed: 12/23/2022] Open
Abstract
Despite their enormous diversity and abundance, all currently known eukaryotic DNA transposons belong to only 15 superfamilies. Here, we report two new superfamilies of DNA transposons, named Sola and Zator. Sola transposons encode DDD-transposases (transposase, TPase) and are flanked by 4-bp target site duplications (TSD). Elements from the Sola superfamily are distributed in a variety of species including bacteria, protists, plants, and metazoans. They can be divided into three distinct groups of elements named Sola1, Sola2, and Sola3. The elements from each group have extremely low sequence identity to each other, different termini, and different target site preferences. However, all three groups belong to a single superfamily based on significant PSI-Blast identities between their TPases. The DDD TPase sequences encoded by Sola transposons are not similar to any known TPases. The second superfamily named Zator is characterized by 3-bp TSD. The Zator superfamily is relatively rare in eukaryotic species, and it evolved from a bacterial transposon encoding a TPase belonging to the "transposase 36" family (Pfam07592). These transposons are named TP36 elements (abbreviated from transposase 36).
Affiliation(s)
- Weidong Bao
- Genetic Information Research Institute, Mountain View, CA, USA
|
90
|
Di-Poï N, Montoya-Burgos JI, Duboule D. Atypical relaxation of structural constraints in Hox gene clusters of the green anole lizard. Genome Res 2009; 19:602-10. [PMID: 19228589 DOI: 10.1101/gr.087932.108] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
Abstract
Hox genes control many aspects of embryonic development in metazoans. Previous analyses of this gene family revealed a surprising diversity in terms of gene number and organization between various animal species. In vertebrates, Hox genes are grouped into tightly organized clusters, claimed to be devoid of repetitive sequences. Here, we report the genomic organization of the four Hox loci present in the green anole lizard and show that they have massively accumulated retrotransposons, leading to gene clusters larger in size when compared to other vertebrates. In addition, similar repeats are present in many other development-related gene-containing regions, also thought to be refractory to such repetitive elements. Transposable elements are major sources of genetic variations, including alterations of gene expression, and hence this situation, so far unique among vertebrates, may have been associated with the evolution of the spectacular realm of morphological variations in the body plans of Squamata. Finally, sequence alignments highlight some divergent evolution in highly conserved DNA regions between vertebrate Hox clusters, which may coincide with the emergence of mammalian-specific features.
Affiliation(s)
- Nicolas Di-Poï
- National Research Center "Frontiers in Genetics," Department of Zoology and Animal Biology, University of Geneva, 1211 Geneva 4, Switzerland
|
91
|
Abstract
Retrotransposons, mainly LINEs, SINEs, and endogenous retroviruses, make up roughly 40% of the mammalian genome and have played an important role in genome evolution. Their prevalence in genomes reflects a delicate balance between their further expansion and the restraint imposed by the host. In any human genome only a small number of LINE1s (L1s) are active, moving their own and SINE sequences into new genomic locations and occasionally causing disease. Recent insights and new technologies promise answers to fundamental questions about the biology of transposable elements.
Affiliation(s)
- John L Goodier
- Department of Genetics, University of Pennsylvania School of Medicine, 415 Curie Boulevard, Philadelphia, PA 19104, USA.
|
92
|
Abstract
The strategic importance of the genome sequence of the gray, short-tailed opossum, Monodelphis domestica, accrues from both the unique phylogenetic position of metatherian (marsupial) mammals and the fundamental biologic characteristics of metatherians that distinguish them from other mammalian species. Metatherian and eutherian (placental) mammals are more closely related to one another than to other vertebrate groups, and owing to this close relationship they share fundamentally similar genetic structures and molecular processes. However, during their long evolutionary separation these alternative mammals have developed distinctive anatomical, physiologic, and genetic features that hold tremendous potential for examining relationships between the molecular structures of mammalian genomes and the functional attributes of their components. Comparative analyses using the opossum genome have already provided a wealth of new evidence regarding the importance of noncoding elements in the evolution of mammalian genomes, the role of transposable elements in driving genomic innovation, and the relationships between recombination rate, nucleotide composition, and the genomic distributions of repetitive elements. The genome sequence is also beginning to enlarge our understanding of the evolution and function of the vertebrate immune system, and it provides an alternative model for investigating mechanisms of genomic imprinting. Equally important, availability of the genome sequence is fostering the development of new research tools for physical and functional genomic analyses of M. domestica that are expanding its versatility as an experimental system for a broad range of research applications in basic biology and biomedically oriented research.
|
93
|
Gilbert C, Pace JK, Waters PD. Target site analysis of RTE1_LA and its AfroSINE partner in the elephant genome. Gene 2008; 425:1-8. [PMID: 18796327 DOI: 10.1016/j.gene.2008.08.013] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/14/2008] [Revised: 08/18/2008] [Accepted: 08/18/2008] [Indexed: 10/21/2022]
Abstract
SINEs retrotranspose using their partner LINE's enzymatic machinery. It has recently been proposed that AfroSINEs ending with GGTTT 3' tandem repeats were mobilized by RTE elements ending with CAA 3' tandem repeats in the Afrotherian genome. Using sequences from the elephant genome, we show that AfroSINEs derive from RTE ending with GGTTT-like 3' tandem repeats, a subgroup of RTE1_LA that only reached low copy number, and confirm that they were most likely mobilized by RTE ending with CAA(n) tandem repeats (RTE1_LA-CAA(n)). This partnership is supported by sequence similarity between two regions of the elements, overlap in the timing of their activity, common features of their target site consensus that are not shared by other members of the RTE family, and their high copy number. Detailed analyses of pre-insertion loci reveal that like many other apurinic/apyrimidinic endonuclease encoding elements, RTE1_LA-CAA(n) shows loose target site specificity. In addition, the RTE1_LA-CAA(n) target site consensus shares several structural and primary sequence features with that of LINE1, suggesting that these two elements share close functional similarity in the target primed reverse transcription (TPRT) reaction. Interestingly, although globally similar, the target site consensus of AfroSINE(Anc) and RTE1_LA-CAA(n) differ in several aspects. These differences, not observed among all SINE/LINE pairs so far examined, are most likely due to the fact that AfroSINEs and RTE1_LA-CAA(n) are terminated by a different tandem repeat motif. We propose that these differences reflect constraints imposed by base pairing interactions between the mRNA 3' terminal tandem repeats and the target DNA at the onset of TPRT. So in addition to the endonuclease nicking preference, the mRNA of these elements appears to play an important role in integration site choice through a passive, post-nicking, selective process.
Affiliation(s)
- Clément Gilbert
- Evolutionary Genomics Group, Department of Botany and Zoology, University of Stellenbosch, Stellenbosch, South Africa.
|
94
|
Smith AM, Sanchez MJ, Follows GA, Kinston S, Donaldson IJ, Green AR, Göttgens B. A novel mode of enhancer evolution: the Tal1 stem cell enhancer recruited a MIR element to specifically boost its activity. Genome Res 2008; 18:1422-32. [PMID: 18687876 PMCID: PMC2527711 DOI: 10.1101/gr.077008.108] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/25/2022]
Abstract
Altered cis-regulation is thought to underpin much of metazoan evolution, yet the underlying mechanisms remain largely obscure. The stem cell leukemia TAL1 (also known as SCL) transcription factor is essential for the normal development of blood stem cells and we have previously shown that the Tal1 +19 enhancer directs expression to hematopoietic stem cells, hematopoietic progenitors, and to endothelium. Here we demonstrate that an adjacent region 1 kb upstream (+18 element) is in an open chromatin configuration and carries active histone marks but does not function as an enhancer in transgenic mice. Instead, it boosts activity of the +19 enhancer both in stable transfection assays and during differentiation of embryonic stem (ES) cells carrying single-copy reporter constructs targeted to the Hprt locus. The +18 element contains a mammalian interspersed repeat (MIR) which is essential for the +18 function and which was transposed to the Tal1 locus approximately 160 million years ago at the time of the mammalian/marsupial branchpoint. Our data demonstrate a previously unrecognized mechanism whereby enhancer activity is modulated by a transposon exerting a "booster" function which would go undetected by conventional transgenic approaches.
Affiliation(s)
- Aileen M Smith
- University of Cambridge Department of Haematology, Cambridge Institute for Medical Research, Cambridge CB2 2XY, United Kingdom
|
95
|
Bourque G, Leong B, Vega VB, Chen X, Lee YL, Srinivasan KG, Chew JL, Ruan Y, Wei CL, Ng HH, Liu ET. Evolution of the mammalian transcription factor binding repertoire via transposable elements. Genome Res 2008; 18:1752-62. [PMID: 18682548 DOI: 10.1101/gr.080663.108] [Citation(s) in RCA: 416] [Impact Index Per Article: 26.0] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022]
Abstract
Identification of lineage-specific innovations in genomic control elements is critical for understanding transcriptional regulatory networks and phenotypic heterogeneity. We analyzed, from an evolutionary perspective, the binding regions of seven mammalian transcription factors (ESR1, TP53, MYC, RELA, POU5F1, SOX2, and CTCF) identified on a genome-wide scale by different chromatin immunoprecipitation approaches and found that only a minority of sites appear to be conserved at the sequence level. Instead, we uncovered a pervasive association with genomic repeats by showing that a large fraction of the bona fide binding sites for five of the seven transcription factors (ESR1, TP53, POU5F1, SOX2, and CTCF) are embedded in distinctive families of transposable elements. Using the age of the repeats, we established that these repeat-associated binding sites (RABS) have been associated with significant regulatory expansions throughout the mammalian phylogeny. We validated the functional significance of these RABS by showing that they are over-represented in proximity of regulated genes and that the binding motifs within these repeats have undergone evolutionary selection. Our results demonstrate that transcriptional regulatory networks are highly dynamic in eukaryotic genomes and that transposable elements play an important role in expanding the repertoire of binding sites.
Affiliation(s)
- Guillaume Bourque
- Computational and Mathematical Biology, Genome Institute of Singapore, Singapore 138672, Singapore.
|
96
|
Carr M, Nelson M, Leadbeater BSC, Baldauf SL. Three families of LTR retrotransposons are present in the genome of the choanoflagellate Monosiga brevicollis. Protist 2008; 159:579-90. [PMID: 18621583 DOI: 10.1016/j.protis.2008.05.001] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2008] [Accepted: 05/01/2008] [Indexed: 11/29/2022]
Abstract
The choanoflagellates are a ubiquitous group of nanoflagellates and the sister group of Metazoa. Examination of the initial draft version of the first choanoflagellate genome, that of Monosiga brevicollis, reveals the presence of three novel families of long terminal repeat (LTR) retrotransposons and an apparent absence of non-LTR retrotransposons and transposons. One of the newly discovered LTR families falls in the chromovirus clade of the Ty3/gypsy group while the other two families are closely related members of the Ty1/copia group. Examination of EST sequences and nucleotide analyses show that all three families are transcriptionally active and potentially functional within the genome of M. brevicollis.
Affiliation(s)
- Martin Carr
- Department of Biology, University of York, Heslington, York YO10 5YW, UK
|
97
|
Gu W, Castoe TA, Hedges DJ, Batzer MA, Pollock DD. Identification of repeat structure in large genomes using repeat probability clouds. Anal Biochem 2008; 380:77-83. [PMID: 18541131 DOI: 10.1016/j.ab.2008.05.015] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2008] [Revised: 05/01/2008] [Accepted: 05/02/2008] [Indexed: 11/28/2022]
Abstract
The identification of repeat structure in eukaryotic genomes can be time-consuming and difficult because of the large amount of information (approximately 3 × 10^9 bp) that needs to be processed and compared. We introduce a new approach based on exact word counts to evaluate, de novo, the repeat structure present within large eukaryotic genomes. This approach avoids sequence alignment and similarity search, two of the most time-consuming components of traditional methods for repeat identification. Algorithms were implemented to efficiently calculate exact counts for any length oligonucleotide in large genomes. Based on these oligonucleotide counts, oligonucleotide excess probability clouds, or "P-clouds," were constructed. P-clouds are composed of clusters of related oligonucleotides that occur, as a group, more often than expected by chance. After construction, P-clouds were mapped back onto the genome, and regions of high P-cloud density were identified as repetitive regions based on a sliding window approach. This efficient method is capable of analyzing the repeat content of the entire human genome on a single desktop computer in less than half a day, at least 10-fold faster than current approaches. The predicted repetitive regions strongly overlap with known repeat elements as well as other repetitive regions such as gene families, pseudogenes, and segmental duplicons. This method should be extremely useful as a tool for use in de novo identification of repeat structure in large newly sequenced genomes.
Affiliation(s)
- Wanjun Gu
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO 80045, USA
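The pipeline outlined in the Gu et al. abstract (exact word counts, clustering of related oligonucleotides, then a sliding-window density scan) can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the one-substitution neighbor rule, and the thresholds `core_min`, `outer_min`, `window`, and `min_hits` are all assumptions chosen for illustration, and a plain count cutoff stands in for the statistical "more often than expected by chance" criterion.

```python
from collections import Counter


def count_words(seq, k):
    """Exact counts of every length-k word (oligonucleotide) in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def neighbors(word, alphabet="ACGT"):
    """Yield all words that differ from `word` by exactly one substitution."""
    for i, c in enumerate(word):
        for a in alphabet:
            if a != c:
                yield word[:i] + a + word[i + 1:]


def build_clouds(counts, core_min, outer_min):
    """Greedy clustering (assumed rule, not the published one).

    Seed a cloud with the most frequent unassigned word whose count is
    at least core_min, then grow it through one-substitution neighbors
    whose count is at least outer_min.
    """
    assigned = {}   # word -> cloud index
    clouds = []     # list of member lists
    for word, n in counts.most_common():
        if n < core_min:
            break
        if word in assigned:
            continue
        cloud_id = len(clouds)
        assigned[word] = cloud_id
        members = [word]
        stack = [word]
        while stack:
            current = stack.pop()
            for nb in neighbors(current):
                if nb not in assigned and counts.get(nb, 0) >= outer_min:
                    assigned[nb] = cloud_id
                    members.append(nb)
                    stack.append(nb)
        clouds.append(members)
    return assigned, clouds


def repetitive_windows(seq, k, assigned, window, min_hits):
    """Start positions of windows holding at least min_hits cloud words.

    `window` is measured in word start positions, not base pairs.
    """
    hits = [1 if seq[i:i + k] in assigned else 0
            for i in range(len(seq) - k + 1)]
    starts = []
    running = sum(hits[:window])
    for start in range(len(hits) - window + 1):
        if start > 0:
            # Slide by one: add the entering position, drop the leaving one.
            running += hits[start + window - 1] - hits[start - 1]
        if running >= min_hits:
            starts.append(start)
    return starts
```

For a sequence `seq`, one would call `count_words(seq, k)`, pass the result to `build_clouds`, and then hand the returned `assigned` mapping to `repetitive_windows`. A genome-scale version would need a far more compact count structure than a Python `Counter`; the sketch only shows why no alignment step is required.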
|
98
|
Abstract
The control and coordination of eukaryotic gene expression rely on transcriptional and post-transcriptional regulatory networks. Although progress has been made in mapping the components and deciphering the function of these networks, the mechanisms by which such intricate circuits originate and evolve remain poorly understood. Here I revisit and expand earlier models and propose that genomic repeats, and in particular transposable elements, have been a rich source of material for the assembly and tinkering of eukaryotic gene regulatory systems.
Affiliation(s)
- Cédric Feschotte
- Department of Biology, Life Science Building, BOX 19498, University of Texas, Arlington, Texas 76019, USA.
|
99
|
Devor EJ, Huang L, Samollow PB. PiRNA-like RNAs in the marsupial Monodelphis domestica identify transcription clusters and likely marsupial transposon targets. Mamm Genome 2008; 19:581-6. [PMID: 18473137 DOI: 10.1007/s00335-008-9109-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 01/11/2008] [Accepted: 03/12/2008] [Indexed: 11/24/2022]
Abstract
PIWI-interacting RNAs (piRNAs) are a recently discovered class of small noncoding RNAs that have been detected in human, mouse, rat, zebrafish, and Drosophila genomes. We have utilized a size-directed small-RNA cloning procedure to clone and map more than 300 candidate piRNA-like small RNAs in the genome of the marsupial species Monodelphis domestica. Our results are consistent with those from other species in that the piRNA-like candidate sequences range in size from 28 to 31 nucleotides, show a pronounced preference for uridine at the 5' end, are transcribed from a few large clusters, appear to target transposons, and display virtually no sequence conservation.
Affiliation(s)
- Eric J Devor
- Molecular Genetics and Biophysics, Integrated DNA Technologies, 1710 Commercial Park, Coralville, IA 52241, USA.
|
100
|
Schmitz J, Zemann A, Churakov G, Kuhl H, Grützner F, Reinhardt R, Brosius J. Retroposed SNOfall--a mammalian-wide comparison of platypus snoRNAs. Genome Res 2008; 18:1005-10. [PMID: 18463303 DOI: 10.1101/gr.7177908] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.6] [Indexed: 11/25/2022]
Abstract
Diversification of mammalian species began more than 160 million years ago when the egg-laying monotremes diverged from live-bearing mammals. The duck-billed platypus (Ornithorhynchus anatinus) and echidnas are the only potential contemporary witnesses of this period and, thereby, provide a unique insight into mammalian genome evolution. It has become clear that small RNAs are major regulatory agents in eukaryotic cells, and the significant role of non-protein-coding (npc) RNAs in transcription, processing, and translation is now well accepted. Here we show that the platypus genome contains more than 200 small nucleolar (sno) RNAs among hundreds of other diverse npcRNAs. Their comparison among key mammalian groups and other vertebrates enabled us to reconstruct a complete temporal pathway of acquisition and loss of these snoRNAs. In platypus we found cis- and trans-duplication distribution patterns for snoRNAs, which have not been described in any other vertebrates but are known to occur in nematodes. An exciting novelty in platypus is a snoRNA-derived retroposon (termed snoRTE) that facilitates a very effective dispersal of an H/ACA snoRNA via RTE-mediated retroposition. From more than 40,000 detected full-length and truncated genomic copies of this snoRTE, at least 21 are processed into mature snoRNAs. High-copy retroposition via multiple host gene-promoted transcription units is a novel pathway for combining housekeeping function and SINE-like dispersal and reveals a new dimension in the evolution of novel snoRNA function.
Affiliation(s)
- Jürgen Schmitz
- Institute of Experimental Pathology (ZMBE), University of Münster, Münster 48149, Germany.
|