551
Hisano M, Yamada S, Tanaka H, Nishimune Y, Nozaki M. Genomic structure and promoter activity of the testis haploid germ cell-specific intronless genes, Tact1 and Tact2. Mol Reprod Dev 2003; 65:148-56. [PMID: 12704725 DOI: 10.1002/mrd.10276] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.8] [Indexed: 11/08/2022]
Abstract
The Tact1 and Tact2 genes, each of which encodes an actin-like protein, are exclusively expressed and translated in haploid germ cells in testis. To characterize the haploid germ cell-specific gene structure, a mouse genomic library was screened with a Tact1 cDNA as a probe, and four independent phage clones containing the Tact1 gene were isolated. Southern hybridization and sequencing analyses revealed that Tact1 and Tact2 were single copy genes contained on a common fragment in a head-to-head orientation, and that the distance between these genes was less than 2 kb. Comparison of the nucleotide sequences of genomic DNA and cDNA demonstrated that Tact1 and Tact2 lack introns, although all known actin or actin-related genes in mammals contain introns. Human Tact orthologues also lack introns and are located within 6.4 kb in a head-to-head orientation. These findings indicate that Tact1 and Tact2, or one of these genes, arose by retroposition of a spliced mRNA transcribed from an actin progenitor gene prior to the divergence of rodents and primates. The Tact1 and Tact2 genes are unusual retroposons in that they have retained an open reading frame and are expressed in testicular germ cells, whereas almost all retroposons become pseudogenes. Analysis of transgenic mice revealed that a 2 kb sequence between the two genes bidirectionally controls haploid germ cell-specific expression. Comparison of the murine Tact genes with their human orthologues showed a high level of identity between the two species in the 5'-upstream and non-coding sequences as well as in the coding region, indicating that conserved elements in these regions may be involved in the regulation of haploid germ cell-specific expression. The promoter region contains no TATA-, CCAAT- or GC-boxes, although there are potential cAMP response element (CRE)-like motifs in the 5'-upstream region and the 5'-untranslated region in Tact1 and Tact2, respectively. Transient promoter analyses indicate that CREMtau may activate Tact1 and Tact2 expression in germ cells.
Affiliation(s)
- Mizue Hisano
- Department of Science for Laboratory Animal Experimentation, Research Institute for Microbial Diseases, Osaka University, 3-1 Yamadaoka, Suita, Osaka 565-0871, Japan
552
Affiliation(s)
- Tobias Mourier
- Department of Evolutionary Biology, Zoological Institute, University of Copenhagen, Universitetsparken 15, DK-2100 Copenhagen Ø, Denmark.
553
Zhang Z, Gerstein M. Identification and characterization of over 100 mitochondrial ribosomal protein pseudogenes in the human genome. Genomics 2003; 81:468-80. [PMID: 12706105 DOI: 10.1016/s0888-7543(03)00004-1] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022]
Abstract
The human (nuclear) genome encodes at least 79 mitochondrial ribosomal proteins (MRPs), which are imported into the mitochondria. Using a comprehensive approach, we find 41 of these give rise to 120 pseudogenes in the genome. The majority of the MRP pseudogenes are of processed origin and can be aligned to match the entire coding region of the functional MRP mRNAs. One processed pseudogene was found to have originated from an alternatively spliced mRNA transcript. We also found two duplicated pseudogenes that are transcribed in the cell as confirmed by screening the human EST database. We observed a significant correlation between the number of processed pseudogenes and the gene CDS length (R = -0.40; p < 0.001), i.e., the relatively shorter genes tend to have more processed pseudogenes. There is also a weaker correlation between the number of processed pseudogenes and the gene CDS GC content. Our study provides a catalogue of human MRP pseudogenes, which will be useful in the study of functional MRP genes. It also provides a molecular record of the evolution of these genes. More details are available at http://pseudogene.org/.
Affiliation(s)
- Zhaolei Zhang
- Department of Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Avenue, New Haven, CT 06520-8114, USA
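The abstract above reports a negative correlation (R = -0.40) between CDS length and the number of processed pseudogenes per gene. As a minimal sketch of how such a coefficient can be computed, the following assumes a Pearson correlation (the abstract does not say which statistic was used) and uses made-up toy values, not the paper's data; the names `pearson_r`, `cds_lengths` and `pseudogene_counts` are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data (NOT taken from the paper): CDS length in nucleotides
# and number of processed pseudogenes for five imaginary genes.
cds_lengths = [300, 450, 600, 900, 1200]
pseudogene_counts = [8, 5, 6, 2, 1]

r = pearson_r(cds_lengths, pseudogene_counts)
print(f"r = {r:.2f}")  # negative: shorter genes have more pseudogenes here
```

A negative value of `r` corresponds to the trend described in the abstract, where shorter genes tend to have more processed pseudogenes.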
554
Strichman-Almashanu LZ, Bustin M, Landsman D. Retroposed copies of the HMG genes: a window to genome dynamics. Genome Res 2003; 13:800-12. [PMID: 12727900 PMCID: PMC430908 DOI: 10.1101/gr.893803] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.5] [Indexed: 11/24/2022]
Abstract
Retroposed copies (RPCs) of genes are functional (intronless paralogs) or nonfunctional (processed pseudogenes) copies derived from mRNA through a process of retrotransposition. Previous studies found that gene families involved in mRNA translation or nuclear function were more likely to have large numbers of RPCs. Here we characterize RPCs of the few families coding for the abundant high-mobility-group (HMG) proteins in humans. Using an algorithm we developed, we identified and studied 219 HMG RPCs. For slightly more than 10% of these RPCs, we found evidence indicating expression. Furthermore, eight of these are potentially new members of the HMG families of proteins. For three RPCs, the evidence indicated expression as part of other transcripts; in all of these, we found the presence of alternative splicing or multiple polyadenylation signals. RPC distribution among the HMGs was not even, with 33-65 each for HMGB1, HMGB3, HMGN1, and HMGN2, and 0-6 each for HMGA1, HMGA2, HMGB2, and HMGN3. Analysis of the sequences flanking the RPCs revealed that the junction between the target site duplications and the 5'-flanking sequences exhibited the same TT/AAAA consensus found for the L1 endonuclease, supporting an L1-mediated retrotransposition mechanism. Finally, because our algorithm included aligning RPC flanking sequences with the corresponding HMG genomic sequence, we were able to identify transcribed regions of HMG genes that were not part of the published mRNA sequences.
Affiliation(s)
- Liora Z Strichman-Almashanu
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
555
Brouha B, Schustak J, Badge RM, Lutz-Prigge S, Farley AH, Moran JV, Kazazian HH. Hot L1s account for the bulk of retrotransposition in the human population. Proc Natl Acad Sci U S A 2003; 100:5280-5. [PMID: 12682288 PMCID: PMC154336 DOI: 10.1073/pnas.0831042100] [Citation(s) in RCA: 803] [Impact Index Per Article: 36.5] [Received: 11/19/2002] [Accepted: 02/20/2003] [Indexed: 11/18/2022] Open
Abstract
Although LINE-1 (long interspersed nucleotide element-1, L1) retrotransposons comprise 17% of the human genome, an exhaustive search of the December 2001 "freeze" of the haploid human genome working draft sequence (95% complete) yielded only 90 L1s with intact ORFs. We demonstrate that 38 of 86 (44%) L1s are polymorphic as to their presence in human populations. We cloned 82 (91%) of the 90 L1s and found that 40 of the 82 (49%) are active in a cultured cell retrotransposition assay. From these data, we predict that there are 80-100 retrotransposition-competent L1s in an average human being. Remarkably, 84% of assayed retrotransposition capability was present in six highly active L1s (hot L1s). By comparison, four of five full-length L1s involved in recent human insertions had retrotransposition activity comparable to the six hot L1s in the human genome working draft sequence. Thus, our data indicate that most L1 retrotransposition in the human population stems from hot L1s, with the remaining elements playing a lesser role in genome plasticity.
Affiliation(s)
- Brook Brouha
- Department of Genetics, University of Pennsylvania School of Medicine, Philadelphia, PA 19104, USA
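The percentages quoted in the abstract above can be re-derived from the stated counts. The short sketch below only checks that arithmetic; the helper name `percent` is hypothetical, and nothing here reproduces the authors' extrapolation to 80-100 active L1s per individual.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole)


# Counts as stated in the abstract.
polymorphic = percent(38, 86)  # polymorphic L1s among the 86 tested
cloned = percent(82, 90)       # intact-ORF L1s that were cloned
active = percent(40, 82)       # cloned L1s active in the cell culture assay

print(polymorphic, cloned, active)  # 44 91 49
```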
556
Zhang SM, Loker ES. The FREP gene family in the snail Biomphalaria glabrata: additional members, and evidence consistent with alternative splicing and FREP retrosequences. Fibrinogen-related proteins. Dev Comp Immunol 2003; 27:175-187. [PMID: 12590969 DOI: 10.1016/s0145-305x(02)00091-5] [Citation(s) in RCA: 67] [Impact Index Per Article: 3.0] [Indexed: 05/24/2023]
Abstract
Fibrinogen-related proteins (FREPs) found in hemolymph of the snail Biomphalaria glabrata are hypothesized to be involved in non-self recognition. Among 150 cloned FREP cDNAs examined, we have identified three additional FREP members, FREPs 3.3, 12.1 and 13.1, bringing the total of FREP subfamilies to 13. The new FREPs each encode two immunoglobulin superfamily domains and a fibrinogen domain. Additionally, five truncated cDNAs with >99% nucleotide identity in coding regions to FREPs 3.2, 12.1 or 13.1 were identified. The truncated forms, the first reported for FREPs, lack a partial exon, one complete exon, or two complete exons plus the 3'UTR. Our preferred hypothesis is that all five truncated cDNAs observed arise from alternative splicing of full-length FREP genes. Genomic sequences lacking at least two introns and corresponding to the 3' ends of the cDNAs of FREP12.1 and its two truncated forms were also recovered. Although these could be the source of the truncated cDNAs, they are believed to be retrosequences.
Affiliation(s)
- Si-Ming Zhang
- Department of Biology, University of New Mexico, Albuquerque, NM 87131, USA
557
Morales JF, Snow ET, Murnane JP. Environmental factors affecting transcription of the human L1 retrotransposon. II. Stressors. Mutagenesis 2003; 18:151-8. [PMID: 12621071 DOI: 10.1093/mutage/18.2.151] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022] Open
Abstract
Retrotransposons have clearly molded the structure of the human genome. The reverse transcriptase coded for by long interspersed nuclear elements (LINEs) accounts for 35% of the human genome, with 8-9 × 10^5 copies of the most common human LINE element, L1Hs. Retrotransposons cycle through an RNA intermediate with transcription as the rate limiting step. Because various retrotransposons have been demonstrated to be induced by environmental stimuli, we investigated the response of the L1Hs promoter to various agents. L1Hs promoter activity was analyzed by transfecting an L1Hs-expressing cell line with plasmids containing one of two L1Hs promoters fused to the LacZ reporter gene. L1Hs promoter activity was then monitored with a beta-galactosidase assay. Treatment with UV light and heat shock resulted in a small increase in beta-galactosidase activity from one promoter, while treatment with tetradecanoylphorbol 13-acetate resulted in small increases in beta-galactosidase activity from both promoters. No increase in beta-galactosidase activity was observed after exposure to X-rays or hydrogen peroxide.
Affiliation(s)
- José F Morales
- Radiation Oncology Research Laboratory, University of California-San Francisco, 1855 Folsom Street, MCB 200, San Francisco, CA 94103, USA
558
Prak ETL, Dodson AW, Farkash EA, Kazazian HH. Tracking an embryonic L1 retrotransposition event. Proc Natl Acad Sci U S A 2003; 100:1832-7. [PMID: 12569170 PMCID: PMC149919 DOI: 10.1073/pnas.0337627100] [Citation(s) in RCA: 75] [Impact Index Per Article: 3.4] [Indexed: 11/18/2022] Open
Abstract
Long interspersed nuclear elements 1 (L1) are active retrotransposons that reside in many species, including humans and rodents. L1 elements produce an RNA intermediate that is reverse transcribed to DNA and inserted in a new genomic location. We have tagged an active human L1 element (L1(RP)) with a gene encoding enhanced GFP (EGFP). Expression of GFP occurs only if L1-EGFP has undergone a cycle of transcription, reverse transcription, and integration into a transcriptionally permissive genomic region. We show here that L1-EGFP can undergo retrotransposition in vivo and produce fluorescence in mouse testis. The retrotransposition event characterized here has occurred at a very early stage in the development of an L1-EGFP transgenic founder mouse.
Affiliation(s)
- Eline T Luning Prak
- Department of Pathology and Laboratory Medicine, University of Pennsylvania, 405B Stellar Chance Laboratories, 422 Curie Boulevard, Philadelphia, PA 19104, USA.
559
Bon E, Casaregola S, Blandin G, Llorente B, Neuvéglise C, Munsterkotter M, Guldener U, Mewes HW, Van Helden J, Dujon B, Gaillardin C. Molecular evolution of eukaryotic genomes: hemiascomycetous yeast spliceosomal introns. Nucleic Acids Res 2003; 31:1121-35. [PMID: 12582231 PMCID: PMC150231 DOI: 10.1093/nar/gkg213] [Citation(s) in RCA: 99] [Impact Index Per Article: 4.5] [Received: 12/11/2002] [Accepted: 12/19/2002] [Indexed: 11/12/2022] Open
Abstract
As part of the exploratory sequencing program Génolevures, visual scrutinisation and bioinformatic tools were used to detect spliceosomal introns in seven hemiascomycetous yeast species. A total of 153 putative novel introns were identified. Introns are rare in yeast nuclear genes (<5% have an intron), mainly located at the 5' end of ORFs, and not highly conserved in sequence. They all share a clear non-random vocabulary: conserved splice sites and conserved nucleotide contexts around splice sites. Homologues of metazoan snRNAs and putative homologues of SR splicing factors were identified, confirming that the spliceosomal machinery is highly conserved in eukaryotes. Several introns' features were tested as possible markers for phylogenetic analysis. We found that intron sizes vary widely within each genome, and according to the phylogenetic position of the yeast species. The evolutionary origin of spliceosomal introns was examined by analysing the degree of conservation of intron positions in homologous yeast genes. Most introns appeared to exist in the last common ancestor of present day yeast species, and then to have been differentially lost during speciation. However, in some cases, it is difficult to exclude a possible sliding event affecting a pre-existing intron or a gain of a novel intron. Taken together, our results indicate that the origin of spliceosomal introns is complex within a given genome, and that present day introns may have resulted from a dynamic flux between intron conservation, intron loss and intron gain during the evolution of hemiascomycetous yeasts.
Affiliation(s)
- Elisabeth Bon
- Laboratoire de Génétique Moléculaire et Cellulaire CNRS-INRA, Institut National Agronomique Paris-Grignon, F-78850 Thiverval-Grignon, France
560
Morgan K, Conklin D, Pawson AJ, Sellar R, Ott TR, Millar RP. A transcriptionally active human type II gonadotropin-releasing hormone receptor gene homolog overlaps two genes in the antisense orientation on chromosome 1q.12. Endocrinology 2003; 144:423-36. [PMID: 12538601 DOI: 10.1210/en.2002-220622] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.2] [Indexed: 11/19/2022]
Abstract
GnRH-II peptide hormone exhibits complete sequence conservation across vertebrate species, including man. Type-II GnRH receptor genes have been characterized recently in nonhuman primates, but the human receptor gene homolog contains a frameshift, a premature stop codon (UGA), and a 3' overlap of the RBM8A gene on chromosome 1q.12. A retrotransposed pseudogene, RBM8B, retains partial receptor sequence. In this study, bioinformatics show that the human receptor gene promoter overlaps the peroxisomal protein 11-beta gene promoter and the premature UGA is positionally conserved in chimpanzee. A CGA [arginine (Arg)] occurs in porcine DNA, but UGA is shifted one codon to the 5' direction in bovine DNA, suggesting independent evolution of premature stop codons. In contrast to marmoset tissue RNA, exon- and strand-specific probes are required to distinguish differently spliced human receptor gene transcripts in cell lines (HP75, IMR-32). RBM8B is not transcribed. Sequencing of cDNAs for spliced receptor mRNAs showed no evidence for alteration of the premature UGA by RNA editing, but alternative splicing circumvents the frameshift to encode a two-membrane-domain protein before this UGA. A stem-loop motif resembling a selenocysteine insertion sequence and a potential alternative translation initiation site might enable expression of further proteins involved in interactions within the GnRH system.
Affiliation(s)
- Kevin Morgan
- Medical Research Council Human Reproductive Sciences Unit, University of Edinburgh Academic Centre, Edinburgh EH16 4SB, United Kingdom.
561
Harrison PM, Milburn D, Zhang Z, Bertone P, Gerstein M. Identification of pseudogenes in the Drosophila melanogaster genome. Nucleic Acids Res 2003; 31:1033-7. [PMID: 12560500 PMCID: PMC149191 DOI: 10.1093/nar/gkg169] [Citation(s) in RCA: 69] [Impact Index Per Article: 3.1] [Indexed: 11/13/2022] Open
Abstract
Pseudogenes are copies of genes that cannot produce a protein. They can be detected from disruptions to their apparent coding sequence, caused by frameshifts and premature stop codons. They are classed as either processed pseudogenes (made by reverse transcription from an mRNA) or duplicated pseudogenes, arising from duplication in the genomic DNA and subsequent disablement. Historically, there is anecdotal evidence that the fruit fly (Drosophila melanogaster) has few pseudogenes. Investigators have linked this to a high deletion rate of genomic DNA, for which there is evidence from genetic experiments on genome size. Here, we apply a homology-based pipeline that was developed previously to identify pseudogenes in other eukaryotic genomes, to the fruit fly, so as to derive the first complete survey of its pseudogene population. We find approximately 100 pseudogenes, with at least a sixth of these as candidate processed pseudogenes. This gives a much lower proportion of pseudogenes (compared with the size of the proteome) than in the genomes of other eukaryotes for which data are available (human, nematode and budding yeast). Closest matching proteins to Drosophila pseudogenes are significantly longer than the average protein in its proteome (up to approximately 60% more than the average protein's length), in contrast to the situation in the three other eukaryotic genomes. This may be due to the persistence of fragments of longer genes. In the fly pseudogene population, we found most pseudogenes for serine proteases (which are more abundant in the Drosophila lineage compared with the other eukaryotes), immunoglobulin-motif-containing proteins and cytochromes P450. Data on the sequences and positions of the putative pseudogenes are available at: http://www.pseudogene.org/fly. 
The detection of a small number of pseudogenes in the Drosophila genome and the higher mean length for the closest matching proteins to pseudogenes (possibly because remnants of genes encoding longer proteins are more likely to persist) are further evidence for a high deletion rate of genomic DNA in the fruit fly. The data are useful for molecular evolution study in Drosophila.
Affiliation(s)
- Paul M Harrison
- Department of Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Avenue, PO Box 208114, New Haven, CT 06520-8114, USA.
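The abstract above notes that pseudogenes can be detected from disruptions to their apparent coding sequence, namely frameshifts and premature stop codons. The sketch below illustrates only that idea on a bare nucleotide string; it is not the authors' homology-based pipeline, and the function names and the length-based frameshift proxy are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def premature_stops(cds):
    """Return 0-based codon indices of in-frame stop codons before the last codon."""
    seq = cds.upper()
    n_codons = len(seq) // 3
    # The final complete codon is excluded: a stop there is the normal terminator.
    return [i for i in range(n_codons - 1) if seq[3 * i:3 * i + 3] in STOP_CODONS]


def length_suggests_frameshift(cds):
    """Crude proxy (an assumption): a length not divisible by 3 hints at an indel.

    A real pipeline would locate frameshifts by aligning to the parent protein.
    """
    return len(cds) % 3 != 0


intact = "ATGAAAGGGTAA"        # ATG AAA GGG TAA
disrupted = "ATGAAATAGGGGTAA"  # ATG AAA TAG GGG TAA, in-frame stop at codon 2

print(premature_stops(intact))     # []
print(premature_stops(disrupted))  # [2]
```

Only in-frame codons are examined, so the out-of-frame `TGA` inside `ATGAAA` is not reported.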
562
Szak ST, Pickeral OK, Landsman D, Boeke JD. Identifying related L1 retrotransposons by analyzing 3' transduced sequences. Genome Biol 2003; 4:R30. [PMID: 12734010 PMCID: PMC156586 DOI: 10.1186/gb-2003-4-5-r30] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.2] [Received: 12/09/2002] [Revised: 03/06/2003] [Accepted: 03/24/2003] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: A large fraction of the human genome is attributable to L1 retrotransposon sequences. Not only do L1s themselves make up a significant portion of the genome, but L1-encoded proteins are thought to be responsible for the transposition of other repetitive elements and processed pseudogenes. In addition, L1s can mobilize non-L1, 3'-flanking DNA in a process called 3' transduction. Using computational methods, we collected DNA sequences from the human genome for which we have high confidence of their mobilization through L1-mediated 3' transduction.
RESULTS: The precursors of L1s with transduced sequence can often be identified, allowing us to reconstruct L1 element families in which a single parent L1 element begot many progeny L1s. Of the L1s exhibiting a sequence structure consistent with 3' transduction (L1 with transduction-derived sequence, L1-TD), the vast majority were located in duplicated regions of the genome and thus did not necessarily represent unique insertion events. Of the remaining L1-TDs, some lack a clear polyadenylation signal, but the alignment between the parent-progeny sequences nevertheless ends in an A-rich tract of DNA.
CONCLUSIONS: Sequence data suggest that during the integration into the genome of RNA representing an L1-TD, reverse transcription may be primed internally at A-rich sequences that lie downstream of the L1 3' untranslated region. The occurrence of L1-mediated transduction in the human genome may be less frequent than previously thought, and an accurate estimate is confounded by the frequent occurrence of segmental genomic duplications.
Affiliation(s)
- Suzanne T Szak
- National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Current address: Biogen Inc, Cambridge, MA 02142, USA
- Oxana K Pickeral
- National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Department of Molecular Biology and Genetics, Johns Hopkins University School of Medicine, 725 N. Wolfe St., Baltimore, MD 21205, USA
- Current address: Human Genome Sciences Inc., Rockville, MD 20850, USA
- David Landsman
- National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Jef D Boeke
- Department of Molecular Biology and Genetics, Johns Hopkins University School of Medicine, 725 N. Wolfe St., Baltimore, MD 21205, USA
563
Birth of ‘human-specific’ genes during primate evolution. Contemporary Issues in Genetics and Evolution 2003. [DOI: 10.1007/978-94-010-0229-5_9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 12/23/2022]
564
Javaud C, Dupuy F, Maftah A, Julien R, Petit JM. The fucosyltransferase gene family: an amazing summary of the underlying mechanisms of gene evolution. Contemporary Issues in Genetics and Evolution 2003. [DOI: 10.1007/978-94-010-0229-5_6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 01/01/2023]
565
566
Turgeon B, Lang BF, Meloche S. The protein kinase ERK3 is encoded by a single functional gene: genomic analysis of the ERK3 gene family. Genomics 2002; 80:673-80. [PMID: 12504858 DOI: 10.1006/geno.2002.7013] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.7] [Indexed: 11/22/2022]
Abstract
Extracellular signal-regulated kinase 3 (ERK3) is a distantly related member of the mitogen-activated protein (MAP) kinase family of serine/threonine kinases. Here, we report the characterization of the genomic loci encoding ERK3 in mice and humans. The mouse ERK3 gene (Mapk6) spans more than 20 kb and is split into six exons. Its structure is similar to that of the human MAPK6 gene, which extends over 40 kb. We also identified and characterized a mouse Mapk6 processed pseudogene. In humans, database analysis has revealed the presence of six MAPK6 processed pseudogenes localized on four different chromosomes. We further show that the structure of MAPK6 is closely related to that of the gene encoding the homologous protein kinase p63(MAPK) (MAPK4), suggesting that the two genes arose by duplication. Our analysis demonstrates that the ERK3 subfamily of MAP kinase genes is composed of two functional genes, MAPK6 and MAPK4, and several pseudogenes.
Affiliation(s)
- Benjamin Turgeon
- Institut de Recherches Cliniques de Montréal, Department of Pharmacology, Université de Montréal, Montréal, Québec, H2W 1R7, Canada
567
Abstract
L1 elements are ubiquitous human transposons that replicate via an RNA intermediate. We have reconstituted the initial stages of L1 element transposition in vitro. The reaction requires only the L1 ORF2 protein, L1 3' RNA, a target DNA and appropriate buffer components. We detect branched molecules consisting of junctions between transposon 3' end cDNA and the target DNA, resulting from priming at a nick in the target DNA. 5' junctions of transposon cDNA and target DNA are also observed. The nicking and reverse transcription steps in the reaction can be uncoupled, as priming at pre-existing nicks and even double-strand breaks can occur. We find evidence for specific positioning of the L1 RNA with the ORF2 protein, probably mediated in part by the polyadenosine portion of L1 RNA. Polyguanosine, similar to a conserved region of the L1 3' UTR, potently inhibits L1 endonuclease (L1 EN) activity. L1 EN activity is also repressed in the context of the full-length ORF2 protein, but it and a second cryptic nuclease activity are released by ORF2p proteolysis. Additionally, heterologous RNA species such as Alu element RNA and L1 transcripts with 3' extensions are substrates for the reaction.
Affiliation(s)
- Gregory J. Cost
- Department of Molecular Biology and Genetics, Johns Hopkins University School of Medicine, 725 N.Wolfe Street, 617 Hunterian, Baltimore, MD 21205, USA and Génétique des Interactions Macromoléculaires, CNRS URA2171, Institut Pasteur, 25–28 rue Docteur Roux, 75724 Paris Cedex 15, France Corresponding author e-mail:
- Qinghua Feng
- Department of Molecular Biology and Genetics, Johns Hopkins University School of Medicine, 725 N.Wolfe Street, 617 Hunterian, Baltimore, MD 21205, USA and Génétique des Interactions Macromoléculaires, CNRS URA2171, Institut Pasteur, 25–28 rue Docteur Roux, 75724 Paris Cedex 15, France Corresponding author e-mail:
- Alain Jacquier
- Department of Molecular Biology and Genetics, Johns Hopkins University School of Medicine, 725 N.Wolfe Street, 617 Hunterian, Baltimore, MD 21205, USA and Génétique des Interactions Macromoléculaires, CNRS URA2171, Institut Pasteur, 25–28 rue Docteur Roux, 75724 Paris Cedex 15, France Corresponding author e-mail:
- Jef D. Boeke
- Department of Molecular Biology and Genetics, Johns Hopkins University School of Medicine, 725 N.Wolfe Street, 617 Hunterian, Baltimore, MD 21205, USA and Génétique des Interactions Macromoléculaires, CNRS URA2171, Institut Pasteur, 25–28 rue Docteur Roux, 75724 Paris Cedex 15, France Corresponding author e-mail:
568
Abstract
We characterized members of the LINE (UnaL2) and SINE (UnaSINE1) families from the eel genome and found that these LINE/SINE partners share similar 3' tails. A retrotransposition assay in HeLa cells demonstrated that the 3' conserved tail of UnaL2 is necessary for its retrotransposition. This 3' tail is recognized in trans by the UnaL2 reverse transcriptase at a surprisingly high rate, and that of UnaSINE1 can also be recognized, thus providing experimental evidence that a SINE can be mobilized by the retrotransposition machinery of a partner LINE. We also demonstrated that short repeats at the 3' end of UnaL2 are required for retrotransposition suggesting that UnaL2 retrotransposes in a manner reminiscent of the reverse transcriptase activity of telomerases.
Affiliation(s)
- Masaki Kajikawa
- Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, Japan
569
Pavlícek A, Paces J, Zíka R, Hejnar J. Length distribution of long interspersed nucleotide elements (LINEs) and processed pseudogenes of human endogenous retroviruses: implications for retrotransposition and pseudogene detection. Gene 2002; 300:189-94. [PMID: 12468100 DOI: 10.1016/s0378-1119(02)01047-8] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.7] [Indexed: 10/27/2022]
Abstract
Deciphering the human genome includes reliable identification and structural characterization of individual retrotransposon elements. The most active group of autonomous transposable elements, the long interspersed nuclear elements (LINE), transpose themselves as well as other RNAs, including those of human endogenous retroviruses (HERV). During this transposition, however, the LINE-encoded reverse transcriptase (RT) often abortively dissociates from the RNA template, leaving a prematurely terminated, 5' truncated copy. We have analyzed the length distributions of LINEs and of processed pseudogenes derived from HERV-W. As expected, we have found that the majority of 5' truncated LINEs and HERV-W processed pseudogenes show a prevalence of very short elements terminated close to the 3' end. On the other hand, the number of complete elements is far above the expectation. The characteristic distribution in both cases indicates two important conclusions: (i) dissociation of LINE RT from the template cannot be fully explained by low processivity of RT modelled as a stochastic, Poisson-type process. (ii) Currently cited numbers of pseudogenes within the human genome are underestimated, since a large percentage of pseudogenes are terminated in the 3' untranslated region and remain undetectable in translated homology searches of protein databases against the human genome.
Affiliation(s)
- Adam Pavlícek
- Institute of Molecular Genetics, Academy of Sciences of the Czech Republic, Flemingovo nam. 2, Prague 6, CZ-16637, Czech Republic
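The abstract above contrasts the observed length distribution with what a stochastic, Poisson-type model of reverse transcriptase dissociation would predict. One simple reading of such a model, assumed here for illustration and not taken from the paper, is that the enzyme starts at the 3' end and falls off with a constant probability at each nucleotide, giving a geometric distribution of copy lengths plus a point mass for full-length copies. The function name and the parameter values below are hypothetical.

```python
def truncation_distribution(full_length, p_dissociate):
    """Probability of each copy length under constant per-nucleotide dissociation.

    Index k (0 <= k < full_length) is the probability that exactly k nucleotides
    were copied before dissociation; the last entry is the probability of a
    full-length copy.
    """
    if full_length < 1 or not 0 < p_dissociate < 1:
        raise ValueError("need full_length >= 1 and 0 < p_dissociate < 1")
    survive = 1 - p_dissociate
    probs = [survive ** k * p_dissociate for k in range(full_length)]
    probs.append(survive ** full_length)  # enzyme reached the 5' end
    return probs


# Hypothetical values: a 6000-nt element and a 0.1% dissociation chance per nt.
dist = truncation_distribution(6000, 0.001)
full_length_fraction = dist[-1]
print(f"expected full-length fraction: {full_length_fraction:.4f}")
```

Under these assumptions short, 3'-terminal copies dominate and full-length copies are rare, so an excess of complete elements, as reported in the abstract, would not be explained by this simple model alone.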
570
Dasilva C, Hadji H, Ozouf-Costaz C, Nicaud S, Jaillon O, Weissenbach J, Roest Crollius H. Remarkable compartmentalization of transposable elements and pseudogenes in the heterochromatin of the Tetraodon nigroviridis genome. Proc Natl Acad Sci U S A 2002; 99:13636-41. [PMID: 12368471 PMCID: PMC129727 DOI: 10.1073/pnas.202284199] [Citation(s) in RCA: 55] [Impact Index Per Article: 2.4] [Received: 05/13/2002] [Indexed: 11/18/2022] Open
Abstract
Tetraodon nigroviridis is among the smallest known vertebrate genomes and as such represents an interesting model for studying genome architecture and evolution. Previous studies have shown that Tetraodon contains several types of tandem and dispersed repeats, but that their overall contribution is less than 10% of the genome. Using genomic library hybridization, fluorescent in situ hybridization, and whole genome shotgun and directed sequencing, we have investigated the global and local organization of repeat sequences in Tetraodon. We show that both tandem and dispersed repeat elements are compartmentalized in specific regions that correspond to the short arms of small subtelocentric chromosomes. The concentration of repeats in these heterochromatic regions is in sharp contrast to their paucity in euchromatin. In addition, we have identified a number of pseudogenes that have arisen through either duplication of genes or the retro-transcription of mRNAs. These pseudogenes are amplified to high numbers, some with more than 200 copies, and remain almost exclusively located in the same heterochromatic regions as transposable elements. The sequencing of one such heterochromatic region reveals a complex pattern of duplications and inversions, reminiscent of active and frequent rearrangements that can result in the truncation and hence inactivation of transposable elements. This tight compartmentalization of repeats and pseudogenes is absent in large vertebrate genomes such as mammals and is reminiscent of genomes that remain compact during evolution such as Drosophila and Arabidopsis.
Affiliation(s)
- Corinne Dasilva
- Genoscope and Centre National de la Recherche Scientifique, Unité Mixte de Recherche 8030, 2 Rue Gaston Crémieux, 91057 Evry Cedex, France
571
Ike A, Yamada S, Tanaka H, Nishimune Y, Nozaki M. Structure and promoter activity of the gene encoding ornithine decarboxylase antizyme expressed exclusively in haploid germ cells in testis (OAZt/Oaz3). Gene 2002; 298:183-93. [PMID: 12426106 DOI: 10.1016/s0378-1119(02)00978-2] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.6] [Indexed: 11/23/2022]
Abstract
Ornithine decarboxylase antizyme 1 and 2 (OAZ1 and OAZ2) are expressed ubiquitously, and control the intracellular concentration of polyamines. Their testicular isoform, OAZt/Oaz3, is specifically expressed in differentiated haploid germ cells. We have identified and characterized the gene encoding OAZt in mice. The mouse OAZt gene contains, as do the human ortholog and paralogs, five exons and four introns. Comparison of the mouse OAZt with the human ortholog gene revealed that exon sizes are identical and nucleotide sequences in exons are highly homologous (83% identity). The major transcriptional start site was determined by primer extension assay. Promoter activity was confirmed by transgenic mouse assays, using the upstream region of the mouse OAZt gene fused to an EGFP reporter gene. The OAZt essential promoter, located between -133 and +242, has two CREs and an Inr, and lacks a TATA box. These elements are conserved in the human ortholog but not in the paralogs, indicating that such a short upstream region including two CREs and Inr is sufficient to drive endogenous OAZt mRNA expression in the haploid testicular germ cells.
Affiliation(s)
- Akiko Ike
- Department of Science for Laboratory Animal Experimentation, Research Institute for Microbial Diseases, Osaka University, 3-1 Yamadaoka, Suita, Osaka 565-0871, Japan
572
Majumdar M, Bharadwaj A, Ghosh I, Ramachandran S, Datta K. Evidence for the presence of HABP1 pseudogene in multiple locations of mammalian genome. DNA Cell Biol 2002; 21:727-35. [PMID: 12443542 DOI: 10.1089/104454902760599708] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022] Open
Abstract
The gene encoding hyaluronan-binding protein 1 (HABP1) is expressed ubiquitously in different rat tissues, and is present in eukaryotic species from yeast to humans. Fluorescence in situ hybridization indicates that this is localized in human chromosome 17p13.3. Here, we report the presence of homologous sequences of HABP1 cDNA, termed processed HABP1 pseudogene in humans. This is concluded from an additional PCR product of ~0.5 kb, along with the expected band at approximately 5 kb as observed by PCR amplification of human genomic DNA with HABP1-specific primers. Partial sequencing of the 5-kb PCR product and comparison of the HABP1 cDNA with the sequence obtained from Genbank accession number AC004148 indicated that the HABP1 gene is comprised of six exons and five introns. The 0.5-kb additional PCR product was confirmed to be homologous to HABP1 cDNA by Southern hybridization, sequencing, and by a sequence homology search. Search analysis with HABP1 cDNA sequence further revealed the presence of similar sequence in chromosomes 21 and 11, which could generate ~0.5 kb with the primers used. In this report, we describe the presence of several copies of the pseudogene of HABP1 spread over different chromosomes that vary in length and similarity to the HABP1 cDNA sequence. These are 1013 bp in chromosome 21 with 85.4% similarity, 1071 bp in chromosome 11 with 87.2% similarity, 818 bp in chromosome 15 with 82.3% similarity, and 323 bp in chromosome 4 with 84% similarity to HABP1 cDNA. We have also identified similar HABP1 pseudogenes in the rat and mouse genome. The human pseudogene sequence of HABP1 possesses a 10-bp direct repeat of "AGAAAAATAA" in chromosome 21, a 12-bp direct repeat of "AG/CAAATTA/CAA/TTA" in chromosome 4, and an 8-bp direct repeat of "ACAAAG/TCT" in chromosome 15. In the case of chromosome 11, there is an inverted repeat of "AGCCTGGGCGACAGAGCGAGA" ~50 bp upstream of the HABP1 pseudogene sequence.
All of the HABP1 pseudogene sequences lack 5' promoter sequence and possess multiple mutations leading to the insertion of premature stop codons in all three reading frames. Rat and mouse homologs of the HABP1 pseudogene also contain multiple mutations, leading to the insertion of premature stop codons confirming the identity of a processed pseudogene.
Affiliation(s)
- M Majumdar
- Biochemistry Laboratory, School of Environmental Sciences, Jawaharlal Nehru University, New Delhi, India
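The abstract above uses premature stop codons in all three reading frames as evidence that a sequence is a pseudogene. The following is a minimal sketch of how such a scan can be done on the forward strand; it is not the authors' procedure, and the function name and toy sequences are hypothetical. It assumes a plain DNA string and the standard stop codons TAA, TAG, and TGA.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code stop codons


def stop_positions_by_frame(seq):
    """Return {frame: [0-based start index of each stop codon]} for the
    three forward reading frames of a DNA sequence."""
    seq = seq.upper()
    positions = {}
    for frame in range(3):
        positions[frame] = [
            i
            for i in range(frame, len(seq) - 2, 3)
            if seq[i:i + 3] in STOP_CODONS
        ]
    return positions


def has_stop_in_every_frame(seq):
    """True when each forward frame contains at least one stop codon."""
    return all(stop_positions_by_frame(seq).values())


if __name__ == "__main__":
    toy = "ATGTAACTGATTTAGC"  # hypothetical sequence, not from HABP1
    print(stop_positions_by_frame(toy))
    print(has_stop_in_every_frame(toy))
```

A real analysis would align the candidate to the parent cDNA first, so that stops are judged relative to the original reading frame rather than to an arbitrary start position.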
573
Abstract
The eukaryotic genome has undergone a series of epidemics of amplification of mobile elements that have resulted in most eukaryotic genomes containing much more of this 'junk' DNA than actual coding DNA. The majority of these elements utilize an RNA intermediate and are termed retroelements. Most of these retroelements appear to amplify in evolutionary waves that insert in the genome and then gradually diverge. In humans, almost half of the genome is recognizably derived from retroelements, with the two elements that are currently actively amplifying, L1 and Alu, making up about 25% of the genome and contributing extensively to disease. The mechanisms of this amplification process are beginning to be understood, although there are still more questions than answers. Insertion of new retroelements may directly damage the genome, and the presence of multiple copies of these elements throughout the genome has longer-term influences on recombination events in the genome and more subtle influences on gene expression.
Affiliation(s)
- Prescott L Deininger
- Tulane Cancer Center, Department of Environmental Health Sciences, Tulane University Health Sciences Center, New Orleans, Louisiana 70112, USA.
574
Zhang Z, Harrison P, Gerstein M. Identification and analysis of over 2000 ribosomal protein pseudogenes in the human genome. Genome Res 2002; 12:1466-82. [PMID: 12368239 PMCID: PMC187539 DOI: 10.1101/gr.331902] [Citation(s) in RCA: 149] [Impact Index Per Article: 6.5] [Received: 04/03/2002] [Accepted: 08/12/2002] [Indexed: 11/24/2022]
Abstract
Mammals have 79 ribosomal proteins (RP). Using a systematic procedure based on sequence-homology, we have comprehensively identified pseudogenes of these proteins in the human genome. Our assignments are available at http://www.pseudogene.org or http://bioinfo.mbb.yale.edu/genome/pseudogene. In total, we found 2090 processed pseudogenes and 16 duplications of RP genes. In relation to the matching parent protein, each of the processed pseudogenes has an average relative sequence length of 97% and an average sequence identity of 76%. A small number (258) of them do not contain obvious disablements (stop codons or frameshifts) and, therefore, could be mistaken as functional genes, and 178 are disrupted by one or more repetitive elements. On average, processed pseudogenes have a longer truncation at the 5' end than the 3' end, consistent with the target-primed-reverse-transcription (TPRT) mechanism. Interestingly, on chromosome 16, an RPL26 processed pseudogene was found in the intron region of a functional RPS2 gene. The large-scale distribution of RP pseudogenes throughout the genome appears to result, chiefly, from random insertions with the numbers on each chromosome, consequently, proportional to its size. In contrast to RP genes, the RP pseudogenes have the highest density in GC-intermediate regions (41%-46%) of the genome, with the density pattern being between that of LINEs and Alus. This can be explained by a negative selection theory as we observed that GC-rich RP pseudogenes decay faster in GC-poor regions. Also, we observed a correlation between the number of processed pseudogenes and the GC content of the associated functional gene, i.e., relatively GC-poor RPs have more processed pseudogenes. This ranges from 145 pseudogenes for RPL21 down to 3 pseudogenes for RPL14. We were able to date the RP pseudogenes based on their sequence divergence from present-day RP genes, finding an age distribution similar to that for Alus. 
The distribution is consistent with a decline in retrotransposition activity in the hominid lineage during the last 40 Myr. We discuss the implications for retrotransposon stability and genome dynamics based on these new findings.
Affiliation(s)
- Zhaolei Zhang
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06520, USA
575
Buzdin A, Ustyugova S, Gogvadze E, Vinogradova T, Lebedev Y, Sverdlov E. A new family of chimeric retrotranscripts formed by a full copy of U6 small nuclear RNA fused to the 3' terminus of L1. Genomics 2002; 80:402-6. [PMID: 12376094 DOI: 10.1006/geno.2002.6843] [Citation(s) in RCA: 90] [Impact Index Per Article: 3.9] [Indexed: 11/22/2022]
Abstract
Long interspersed nuclear elements (LINE-1, L1) constitute a large family of mammalian retrotransposons that have been replicating and evolving in mammals for more than 100 million years and now compose 17% of the human genome. They have an important creative role in human genomic evolution through mechanisms such as new integrations, generation of processed pseudogenes, and transfer of non-L1 DNA flanking their 3' ends to new genomic locations. Here we present evidence that the L1 integration machinery was used for the creation of a new family of chimeric retrotranscripts, which contain a full copy of U6 small nuclear RNA and a 3' part of L1 at their 5' and 3' ends, respectively. There are at least 56 members of this family in the human genome. The integrations of such fused retrotranscripts into the human genome took place until recently. Here we report one U6-L1 insertion that is polymorphic in humans. We also propose a mechanism used to generate chimeric retrotranscripts.
Affiliation(s)
- Anton Buzdin
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia.
576
Abstract
The heterogeneous, short RNAs produced from the high-copy, short mobile elements (SINEs) interact with proteins to form RNA-protein (RNP) complexes. In particular, the BC1 RNA, which is transcribed to high levels specifically in brain and testis from one locus of the ID SINE family, exists as a discrete RNP complex. We expressed a series of altered BC1, and other SINE-related RNAs, in several cell lines and tested for the mobility of the resulting RNP complexes in a native PAGE assay to determine which portions of these SINE RNAs contribute to protein binding. When different SINE RNAs were substituted for the BC1 ID sequence, the resulting RNPs exhibited the same mobility as BC1. This indicates that the protein(s) binding to the ID portion of BC1 is not sequence specific and may be more dependent upon the secondary structure of the RNA. It also suggests that all SINE RNAs may bind a similar set of cellular proteins. Deletion of the A-rich region of BC1 RNA has a marked effect on the mobility of the RNP. Rodent cell lines exhibit a slightly different mobility for this shifted complex when compared to human cell lines, reflecting evolutionary differences in one or more of the protein components. On the basis of the mobility change observed in RNP complexes when the A-rich region is removed, we decided to examine poly(A) binding protein (PABP) as a candidate member of the RNP. An antibody against the C terminus of PABP is able to immunoprecipitate BC1 RNA, confirming PABP's presence in the BC1 RNP. Given the ubiquitous role of poly(A) regions in the retrotransposition process, these data suggest that PABP may contribute to the SINE retrotransposition process.
Affiliation(s)
- Neva West
- Tulane Cancer Center, SL-66, Department of Environmental Health Sciences, Tulane University Health Sciences Center, 1430 Tulane Avenue, New Orleans, LA 70112, USA
577
Cristofari G, Bampi C, Wilhelm M, Wilhelm FX, Darlix JL. A 5'-3' long-range interaction in Ty1 RNA controls its reverse transcription and retrotransposition. EMBO J 2002; 21:4368-79. [PMID: 12169639 PMCID: PMC126173 DOI: 10.1093/emboj/cdf436] [Citation(s) in RCA: 39] [Impact Index Per Article: 1.7] [Indexed: 01/06/2023] Open
Abstract
LTR-retrotransposons are abundant components of all eukaryotic genomes and appear to be key players in their evolution. They share with retroviruses a reverse transcription step during their replication cycle. To better understand the replication of retrotransposons as well as their similarities to and differences from retroviruses, we set up an in vitro model system to examine minus-strand cDNA synthesis of the yeast Ty1 LTR-retrotransposon. Results show that the 5' and 3' ends of Ty1 genomic RNA interact through 14 nucleotide 5'-3' complementary sequences (CYC sequences). This 5'-3' base pairing results in an efficient initiation of reverse transcription in vitro. Transposition of a marked Ty1 element and Ty1 cDNA synthesis in yeast rely on the ability of the CYC sequences to base pair. This 5'-3' interaction is also supported by phylogenic analysis of all full-length Ty1 and Ty2 elements present in the Saccharomyces cerevisiae genome. These novel findings lead us to propose that circularization of the Ty1 genomic RNA controls initiation of reverse transcription and may limit reverse transcription of defective retroelements.
Affiliation(s)
- Marcelle Wilhelm
- LaboRetro, INSERM U412, Ecole Normale Supérieure de Lyon, 46 Allée d’Italie, 69364 Lyon Cedex 07 and Institut de Biologie Moléculaire et Cellulaire, 15, rue R. Descartes, 67084 Strasbourg, France
- François-Xavier Wilhelm
- LaboRetro, INSERM U412, Ecole Normale Supérieure de Lyon, 46 Allée d’Italie, 69364 Lyon Cedex 07 and Institut de Biologie Moléculaire et Cellulaire, 15, rue R. Descartes, 67084 Strasbourg, France
- Jean-Luc Darlix
- LaboRetro, INSERM U412, Ecole Normale Supérieure de Lyon, 46 Allée d’Italie, 69364 Lyon Cedex 07 and Institut de Biologie Moléculaire et Cellulaire, 15, rue R. Descartes, 67084 Strasbourg, France
578
Abstract
LINE-1 (L1) retrotransposition continues to impact the human genome, yet little is known about how L1 integrates into DNA. Here, we developed a plasmid-based rescue system and have used it to recover 37 new L1 retrotransposition events from cultured human cells. Sequencing of the insertions revealed the usual L1 structural hallmarks; however, in four instances, retrotransposition generated large target site deletions. Remarkably, three of those resulted in the formation of chimeric L1s, containing the 5' end of an endogenous L1 fused precisely to our engineered L1. Thus, our data demonstrate multiple pathways for L1 integration in cultured cells, and show that L1 is not simply an insertional mutagen, but that its retrotransposition can result in significant deletions of genomic sequence.
Affiliation(s)
- Nicolas Gilbert
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109, USA.
579
Abstract
The LINE-1 (L1) retrotransposon, the most important human mobile element, shapes the genome in many ways. Now two groups provide evidence that L1 retrotransposition is associated with large genomic deletions and inversions in transformed cells. If these events occur at a similar frequency in vivo, they have had a substantial effect on human genome evolution.
Affiliation(s)
- Haig H Kazazian
- Department of Genetics, School of Medicine, 475 Clinical Research Building, 415 Curie Boulevard, University of Pennsylvania, Philadelphia 19105, USA.
580
Symer DE, Connelly C, Szak ST, Caputo EM, Cost GJ, Parmigiani G, Boeke JD. Human L1 retrotransposition is associated with genetic instability in vivo. Cell 2002; 110:327-38. [PMID: 12176320 DOI: 10.1016/s0092-8674(02)00839-5] [Citation(s) in RCA: 352] [Impact Index Per Article: 15.3] [Indexed: 01/16/2023]
Abstract
Retrotransposons have shaped eukaryotic genomes for millions of years. To analyze the consequences of human L1 retrotransposition, we developed a genetic system to recover many new L1 insertions in somatic cells. Forty-two de novo integrants were recovered that faithfully mimic many aspects of L1s that accumulated since the primate radiation. Their structures experimentally demonstrate an association between L1 retrotransposition and various forms of genetic instability. Numerous L1 element inversions, extra nucleotide insertions, exon deletions, a chromosomal inversion, and flanking sequence comobilization (called 5' transduction) were identified. In a striking number of integrants, short identical sequences were shared between the donor and the target site's 3' end, suggesting a mechanistic model that helps explain the structure of L1 insertions.
Affiliation(s)
- David E Symer
- Department of Molecular Biology and Genetics, Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA
581
Brouha B, Meischl C, Ostertag E, de Boer M, Zhang Y, Neijens H, Roos D, Kazazian HH. Evidence consistent with human L1 retrotransposition in maternal meiosis I. Am J Hum Genet 2002; 71:327-36. [PMID: 12094329 PMCID: PMC379165 DOI: 10.1086/341722] [Citation(s) in RCA: 97] [Impact Index Per Article: 4.2] [Received: 04/04/2002] [Accepted: 05/10/2002] [Indexed: 11/04/2022] Open
Abstract
We have used a unique polymorphic 3' transduction to show that a human L1, or LINE-1 (long interspersed nucleotide element-1), retrotransposition event most likely occurred in the maternal primary oocyte during meiosis I. We characterized a truncated L1 retrotransposon with a 3' transduction that was inserted, in a Dutch male patient, into the X-linked gene CYBB, thereby causing chronic granulomatous disease. We used the unique flanking sequence to localize the precursor L1 locus, LRE3, to chromosome 2q24.1. In a cell culture assay, the retrotransposition frequency of LRE3 is greater than that for any other element that has been tested to date. The patient's mother had two LRE3 alleles that differed slightly in the 3'-flanking genomic DNA. The patient had a single LRE3 allele that was identical to one of the maternal alleles; however, the patient's insertion matched the maternal LRE3 allele that he did not inherit. Other data indicate that there is only a small chance that the father (unavailable for analysis) carries the precursor LRE3 allele. In addition, paternal origin of the insertion would have required that an LRE3 mRNA transcribed before meiosis II be carried separately from its precursor LRE3 allele in the fertilizing sperm. Since the mother carries a potential precursor allele and the insertion was on the patient's maternal X chromosome, it is highly likely that the insertion originated during maternal meiosis I.
Affiliation(s)
- Brook Brouha
- Department of Genetics, University of Pennsylvania School of Medicine, 475 Clinical Research Building, 415 Curie Boulevard, Philadelphia, PA 19104, USA
582
Angata T, Kerr SC, Greaves DR, Varki NM, Crocker PR, Varki A. Cloning and characterization of human Siglec-11. A recently evolved signaling molecule that can interact with SHP-1 and SHP-2 and is expressed by tissue macrophages, including brain microglia. J Biol Chem 2002; 277:24466-74. [PMID: 11986327 DOI: 10.1074/jbc.m202833200] [Citation(s) in RCA: 153] [Impact Index Per Article: 6.7] [Indexed: 11/06/2022] Open
Abstract
Siglecs are sialic acid-recognizing animal lectins of the immunoglobulin superfamily. We have cloned and characterized a novel human molecule, Siglec-11, that belongs to the subgroup of CD33/Siglec-3-related Siglecs. As with others in this subgroup, the cytosolic domain of Siglec-11 is phosphorylated at tyrosine residue(s) upon pervanadate treatment of cells and then recruits the protein-tyrosine phosphatases SHP-1 and SHP-2. However, Siglec-11 has several novel features relative to the other CD33/Siglec-3-related Siglecs. First, it binds specifically to alpha2-8-linked sialic acids. Second, unlike other CD33/Siglec-3-related Siglecs, Siglec-11 was not found on peripheral blood leukocytes. Instead, we observed its expression on macrophages in various tissues, such as liver Kupffer cells. Third, it was also expressed on brain microglia, thus becoming the second Siglec to be found in the nervous system. Fourth, whereas the Siglec-11 gene is on human chromosome 19, it lies outside the previously described CD33/Siglec-3-related Siglec cluster on this chromosome. Fifth, analyses of genome data bases indicate that Siglec-11 has no mouse ortholog and that it is likely to be the last canonical human Siglec to be reported. Finally, although Siglec-11 shows marked sequence similarity to human Siglec-10 in its extracellular domain, the cytosolic tail appears only distantly related. Analysis of genomic regions surrounding the Siglec-11 gene suggests that it is actually a chimeric molecule that arose from relatively recent gene duplication and recombination events, involving the extracellular domain of a closely related ancestral Siglec gene (which subsequently became a pseudogene) and a transmembrane and cytosolic tail derived from another ancestral Siglec.
MESH Headings
- Amino Acid Sequence
- Antigens, CD/analysis
- Antigens, CD/chemistry
- Antigens, CD/genetics
- Antigens, CD/metabolism
- Antigens, Differentiation, Myelomonocytic/analysis
- Antigens, Differentiation, Myelomonocytic/chemistry
- Antigens, Differentiation, Myelomonocytic/genetics
- Antigens, Differentiation, Myelomonocytic/metabolism
- Appendix/cytology
- Appendix/metabolism
- Base Sequence
- Brain/physiology
- Cloning, Molecular
- Evolution, Molecular
- Humans
- Intracellular Signaling Peptides and Proteins
- Lectins/chemistry
- Lectins/genetics
- Lectins/metabolism
- Macrophages/physiology
- Membrane Proteins
- Microglia/physiology
- Molecular Sequence Data
- Organ Specificity
- Palatine Tonsil/cytology
- Palatine Tonsil/metabolism
- Protein Tyrosine Phosphatase, Non-Receptor Type 11
- Protein Tyrosine Phosphatase, Non-Receptor Type 6
- Protein Tyrosine Phosphatases/metabolism
- Pseudogenes
- RNA, Messenger/genetics
- Reverse Transcriptase Polymerase Chain Reaction
- Sequence Alignment
- Sequence Homology, Amino Acid
- Sialic Acid Binding Ig-like Lectin 3
- Transcription, Genetic
Affiliation(s)
- Takashi Angata
- Glycobiology Research and Training Center, Department of Medicine, University of California, San Diego, La Jolla, California 92093-0687, USA
583
Esnault C, Casella JF, Heidmann T. A Tetrahymena thermophila ribozyme-based indicator gene to detect transposition of marked retroelements in mammalian cells. Nucleic Acids Res 2002; 30:e49. [PMID: 12034850 PMCID: PMC117211 DOI: 10.1093/nar/30.11.e49] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.6] [Indexed: 11/13/2022] Open
Abstract
We devised an indicator gene for retrotransposition based on an autocatalytic ribozyme element--the Tetrahymena thermophila 23S rRNA group I intron--which can self-splice in vitro and does not require--at variance with nuclear mRNA introns--any specific pathway and cellular component for the completion of the splicing process. Several constructs, with the Tetrahymena intron adequately modified so as to be inserted at various positions within a neomycin-containing cassette under conditions that restore the neomycin-coding sequence after splicing out of the intron, were assayed for splicing efficiency in mammalian cells in culture. We show, both by northern blot analysis and by the recovery of neomycin activity upon retroviral transduction of the cassettes, that splicing efficiency depends on both the local base pairing and the global position of the intron within the neomycin transcript, and that some constructs are functional. We further show that they allow the efficient sorting out of retrotransposition events when assayed, as a control, with a human LINE retrotransposon. These indicator genes should be of great help in elucidating the mechanisms of transposition of a series of retroelements associated with transcripts not prone to nuclear mRNA intron splicing and previously not opened to any retrotransposition assay.
MESH Headings
- 3T3 Cells
- Animals
- Genes, Reporter/genetics
- HeLa Cells
- Humans
- Introns/genetics
- Long Interspersed Nucleotide Elements/genetics
- Mice
- Neomycin
- Nucleic Acid Conformation
- RNA Splicing/genetics
- RNA, Catalytic/chemistry
- RNA, Catalytic/genetics
- RNA, Catalytic/metabolism
- RNA, Messenger/genetics
- RNA, Messenger/metabolism
- RNA, Protozoan/chemistry
- RNA, Protozoan/genetics
- RNA, Protozoan/metabolism
- RNA, Ribosomal, 23S/chemistry
- RNA, Ribosomal, 23S/genetics
- RNA, Ribosomal, 23S/metabolism
- Recombination, Genetic/genetics
- Retroelements/genetics
- Retroviridae/genetics
- Tetrahymena thermophila/enzymology
- Tetrahymena thermophila/genetics
- Transduction, Genetic
Affiliation(s)
- Cécile Esnault
- Unité des Rétrovirus Endogènes et Eléments Rétroïdes des Eucaryotes Supérieurs, CNRS UMR 1573, Institut Gustave Roussy, 39 rue Camille Desmoulins, 94805 Villejuif Cedex, France
584
Abstract
SINEs and LINEs are short and long interspersed retrotransposable elements, respectively, that invade new genomic sites using RNA intermediates. SINEs and LINEs are found in almost all eukaryotes (although not in Saccharomyces cerevisiae) and together account for at least 34% of the human genome. The noncoding SINEs depend on reverse transcriptase and endonuclease functions encoded by partner LINEs. With the completion of many genome sequences, including our own, the database of SINEs and LINEs has taken a great leap forward. The new data pose new questions that can only be answered by detailed studies of the mechanism of retroposition. Current work ranges from the biochemistry of reverse transcription and integration in vitro, through target site selection in vivo and nucleocytoplasmic transport of the RNA and ribonucleoprotein intermediates, to mechanisms of genomic turnover. Two particularly exciting new ideas are that SINEs may help cells survive physiological stress, and that the evolution of SINEs and LINEs has been shaped by the forces of RNA interference. Taken together, these studies promise to explain the birth and death of SINEs and LINEs, and the contribution of these repetitive sequence families to the evolution of genomes.
Affiliation(s)
- Alan M Weiner
- Department of Biochemistry, HSB J417, University of Washington, Box 357350, Seattle, WA 98195-7350, USA.
585
Harrison PM, Gerstein M. Studying genomes through the aeons: protein families, pseudogenes and proteome evolution. J Mol Biol 2002; 318:1155-74. [PMID: 12083509 DOI: 10.1016/s0022-2836(02)00109-2] [Citation(s) in RCA: 121] [Impact Index Per Article: 5.3] [Indexed: 11/18/2022]
Abstract
Protein families can be used to understand many aspects of genomes, both their "live" and their "dead" parts (i.e. genes and pseudogenes). Surveys of genomes have revealed that, in every organism, there are always a few large families and many small ones, with the overall distribution following a power-law. This commonality is equally true for both genes and pseudogenes, and exists despite the fact that the specific families that are enlarged differ greatly between organisms. Furthermore, because of family structure there is great redundancy in proteomes, a fact linked to the large number of dispensable genes for each organism and the small size of the minimal, indispensable sub-proteome. Pseudogenes in prokaryotes represent families that are in the process of being dispensed with. In particular, the genome sequences of certain pathogenic bacteria (Mycobacterium leprae, Yersinia pestis and Rickettsia prowazekii) show how an organism can undergo reductive evolution on a large scale (i.e. the dying out of families) as a result of niche change. There appears to be less pressure to delete pseudogenes in eukaryotes. These can be divided into two varieties, duplicated and processed, where the latter involves reverse transcription from an mRNA intermediate. We discuss these collectively in yeast, worm, fly, and human. The fly has few pseudogenes apparently because of its high rate of genomic DNA deletion. In the other three organisms, the distribution of pseudogenes on the chromosome and amongst different families is highly non-uniform. Pseudogenes tend not to occur in the middle of chromosome arms, and tend to be associated with lineage-specific (as opposed to highly conserved) families that have environmental-response functions. This may be because, rather than being dead, they may form a reservoir of diverse "extra parts" that can be resurrected to help an organism adapt to its surroundings. 
In yeast, there may be a novel mechanism involving the [PSI+] prion that potentially enables this resurrection. In worm, the pseudogenes tend to arise out of families (e.g. chemoreceptors) that are greatly expanded in it compared to the fly. The human genome stands out in having many processed pseudogenes. These have a character very different from those of the duplicated variety, to a large extent just representing random insertions. Thus, their occurrence tends to be roughly in proportion to the amount of mRNA for a particular protein and to reflect the extent of the intergenic sequences. Further information about pseudogenes is available at http://genecensus.org/pseudogene
Affiliation(s)
- Paul M Harrison
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520-8114, USA
|
586
|
Chambeyron S, Bucheton A, Busseau I. Tandem UAA repeats at the 3'-end of the transcript are essential for the precise initiation of reverse transcription of the I factor in Drosophila melanogaster. J Biol Chem 2002; 277:17877-82. [PMID: 11882661 DOI: 10.1074/jbc.m200996200] [Citation(s) in RCA: 20] [Impact Index Per Article: 0.9] [Indexed: 11/06/2022]
Abstract
Non-long terminal repeat retrotransposons, widespread among eukaryotic genomes, transpose by reverse transcription of an RNA intermediate. Some of them, like L1 in the human, terminate at the 3'-end with a poly(dA) stretch whereas others, like the I factor in Drosophila melanogaster, have instead a short sequence repeated in tandem. This suggests different requirements for the initiation of reverse transcription. Here, we have used an RNA circularization/reverse transcription-PCR technique to analyze the 5'- and 3'-ends of the full-length transcripts produced by the I factor at the time of active retrotransposition. These transcripts are capped and polyadenylated similar to conventional messenger RNAs. We have analyzed the 3'-ends of transcripts and transposed copies produced by I elements mutated at the 3'-ends. Transcripts devoid of tandem UAA repeats, although capable of building the components of the retrotransposition machinery, are inefficiently used as retrotransposition intermediates. Such transcripts produce rare new integrated copies issued from the inaccurate initiation of reverse transcription near the 3'-end of the element. The tandem UAA repeats at the 3'-end of the transcripts of I are required for the efficient and precise initiation of reverse transcription. This strong specificity of the I factor reverse transcriptase for its own transcript has implications for the impact of I factor retrotransposition on the host genome.
Affiliation(s)
- Séverine Chambeyron
- Institut de Génétique Humaine, CNRS, 141 Rue de la Cardonille, 34396 Montpellier Cedex 5, France
|
587
|
Casaregola S, Neuvéglise C, Bon E, Gaillardin C. Ylli, a non-LTR retrotransposon L1 family in the dimorphic yeast Yarrowia lipolytica. Mol Biol Evol 2002; 19:664-77. [PMID: 11961100 DOI: 10.1093/oxfordjournals.molbev.a004125] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.1] [Indexed: 11/12/2022]
Abstract
During the course of a random sequencing project of the genome of the dimorphic yeast Yarrowia lipolytica, we have identified sequences that were repeated in the genome and that matched the reverse transcriptase (RT) sequence of non-long terminal repeat (non-LTR) retrotransposons. Extension of sequencing on each side of this zone of homology allowed the definition of an element over 6 kb long. The conceptual translation of this sequence revealed two open reading frames (ORFs) that displayed several characteristics of non-LTR retrotransposons: a Cys-rich motif in the ORF1, an N-terminal endonuclease, a central RT, and a C-terminal zinc finger domain in the ORF2. We called this element Ylli (for Y. lipolytica LINE). A total of 19 distinct repeats carrying the 3' untranslated region (UTR) and all ending with a poly-A tail were detected. Most of them were very short, 17 being 134 bp long or less. The number of copies of Ylli was estimated to be around 100 if these short repeats are 5' truncations. No 5' UTR was clearly identified, indicating that entire and therefore active elements might be very rare in the Y. lipolytica strain tested. Ylli does not seem to have any insertion specificity. Phylogenetic analysis of the RT domain unambiguously placed Ylli within the L1 clade. It forms a monophyletic group with the Zorro non-LTR retrotransposons discovered in another dimorphic yeast Candida albicans. BLAST comparisons showed that ORF2 of Ylli is closely related to that of the slime mold Dictyostelium discoideum L1 family, TRE.
Affiliation(s)
- Serge Casaregola
- Collection de Levures d'Intérêt Biotechnologique, Laboratoire de Génétique Moleculaire et Cellulaire, INRA UR216, CNRS URA1925, INA-PG, F-78850 Thiverval-Grignon, France.
|
588
|
Nigumann P, Redik K, Mätlik K, Speek M. Many human genes are transcribed from the antisense promoter of L1 retrotransposon. Genomics 2002; 79:628-34. [PMID: 11991712 DOI: 10.1006/geno.2002.6758] [Citation(s) in RCA: 174] [Impact Index Per Article: 7.6] [Indexed: 12/31/2022]
Abstract
Human L1 retrotransposon has two transcription-regulatory regions: an internal or sense promoter driving transcription of the full-length L1, and an antisense promoter (ASP) driving transcription in the opposite direction into adjacent cellular sequences yielding chimeric transcripts. Both promoters are located in the 5'-untranslated region (5'-UTR) of L1. Chimeric transcripts derived from the L1 ASP are highly represented in expressed-sequence tag (EST) databases. Using a bioinformatics approach, we have characterized 10 chimeric ESTs (cESTs) derived from the EST division of GenBank. These cESTs contained 3' regions similar or identical to known cellular mRNA sequences. They were accurately spliced and preferentially expressed in tumor cell lines. Analysis of the hundreds of cESTs suggests that the L1 ASP-driven transcription is a common phenomenon not only for tumor cells but also for normal ones and may involve transcriptional interference or epigenetic control of different cellular genes.
Affiliation(s)
- Pilvi Nigumann
- Center for Gene Technology, Tallinn Technical University and National Institute of Chemical Physics and Biophysics, Tallinn EE12618, Estonia
|
589
|
Beck P, Dingermann T, Winckler T. Transfer RNA gene-targeted retrotransposition of Dictyostelium TRE5-A into a chromosomal UMP synthase gene trap. J Mol Biol 2002; 318:273-85. [PMID: 12051837 DOI: 10.1016/s0022-2836(02)00097-9] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.8] [Indexed: 11/22/2022]
Abstract
The genome of the eukaryotic microorganism Dictyostelium discoideum hosts a family of seven non-long terminal repeat retrotransposons (TREs) that show remarkable insertion preferences near tRNA genes. We developed an in vivo assay to detect tRNA gene-targeted retrotransposition of endogenous TREs in a reporter strain of D. discoideum. A tRNA gene positioned within an artificial intron was placed into the D. discoideum UMP synthase gene. This construct was inserted into the D. discoideum genome and presented as a landmark for de novo TRE insertions. We show that the tRNA gene-tagged UMP synthase gene was frequently disrupted by de novo insertions of endogenous TRE5-A copies, thus rendering the resulting mutants resistant to 5-fluoroorotic acid selection. Approximately 96% of all isolated 5-FOA-resistant clones contained TRE5-A insertions, whereas the remaining 4% resulted from transposition-independent mutations. The inserted TRE5-As showed complex structural variations and were found about 50 bp upstream of the reporter tRNA gene, similar to previously analysed genomic copies of TRE5-A. No integration by other members of the TRE family was observed. We found that only 51% of the de novo insertions were derived from autonomous TRE5-A.1 copies. The remaining 49% of new insertions were due to TRE5-A.2 elements, which lack the proteins required for reverse transcription and integration, but retain functional promoter sequences.
Affiliation(s)
- Peter Beck
- Institut für Pharmazeutische Biologie, Universität Frankfurt/M. (Biozentrum), Marie-Curie-Strasse 9 D-60439 Frankfurt am Main, Germany
|
590
|
Abstract
This study examines the intragenomic spread of the human endogenous retrovirus family HERV-W from insertions present within the draft sequence of the human genome. Identification of shared diagnostic differences and phylogenetic analyses revealed the existence of three main subfamilies. The average divergence between sequences for each of the subfamilies suggests that most of the HERV-W elements were inserted within the genome during a short period of evolutionary time. Each one of the subfamilies consists of two types of insertions, the expected proviral sequences and other sequences resembling the structure of processed retrogenes. These HERV-W retrosequences extend from the R region of the 5' long-terminal repeat (LTR) to the R region of the 3' LTR (as viral genomic RNAs), end in poly(A) 3' tails, and are flanked by direct repeats longer than the proviral integrations. Furthermore, several of the HERV-W retrosequences are 5'-truncated at different sites. I suggest the involvement of the L1 machinery in these integrations and discuss the characteristic features of the evolutionary history of HERV-W, with emphasis on the putative impact of HERV-W retrosequence integrations on the mammalian genome.
Affiliation(s)
- Javier Costas
- Departamento de Bioloxía Fundamental, Universidade de Santiago de Compostela, E-15782 Santiago de Compostela, Spain.
|
591
|
Pavlícek A, Paces J, Elleder D, Hejnar J. Processed pseudogenes of human endogenous retroviruses generated by LINEs: their integration, stability, and distribution. Genome Res 2002; 12:391-9. [PMID: 11875026 PMCID: PMC155283 DOI: 10.1101/gr.216902] [Citation(s) in RCA: 82] [Impact Index Per Article: 3.6] [Indexed: 01/16/2023]
Abstract
We report here the presence of numerous processed pseudogenes derived from the W family of endogenous retroviruses in the human genome. These pseudogenes are structurally colinear with the retroviral mRNA followed by a poly(A) tail. Our analysis of insertion sites of HERV-W processed pseudogenes shows a strong preference for the insertion motif of long interspersed nuclear element (LINE) retrotransposons. The genomic distribution, stability during evolution, and frequent truncations at the 5' end resemble those of the pseudogenes generated by LINEs. We therefore suggest that HERV-W processed pseudogenes arose by multiple and independent LINE-mediated retrotransposition of retroviral mRNA. These data document that the majority of HERV-W copies are actually nontranscribed promoterless pseudogenes. The current search for HERV-Ws associated with several human diseases should concentrate on a small subset of transcriptionally competent elements.
Affiliation(s)
- Adam Pavlícek
- Institute of Molecular Genetics, Academy of Sciences of the Czech Republic, Prague 6, CZ-16637, Czech Republic
|
592
|
Harrison P, Kumar A, Lan N, Echols N, Snyder M, Gerstein M. A small reservoir of disabled ORFs in the yeast genome and its implications for the dynamics of proteome evolution. J Mol Biol 2002; 316:409-19. [PMID: 11866506 DOI: 10.1006/jmbi.2001.5343] [Citation(s) in RCA: 91] [Impact Index Per Article: 4.0] [Indexed: 11/22/2022]
Abstract
We surveyed the sequenced Saccharomyces cerevisiae genome (strain S288C) comprehensively for open reading frames (ORFs) that could encode full-length proteins but contain obvious mid-sequence disablements (frameshifts or premature stop codons). These pseudogenic features are termed disabled ORFs (dORFs). Using homology to annotated yeast ORFs and non-yeast proteins plus a simple region extension procedure, we have found 183 dORFs. Combined with the 38 existing annotations for potential dORFs, we have a total pool of up to 221 dORFs, corresponding to less than approximately 3% of the proteome. Additionally, we found 20 pairs of annotated ORFs for yeast that could be merged into a single ORF (termed a mORF) by read-through of the intervening stop codon, and may comprise a complete ORF in other yeast strains. Focussing on a core pool of 98 dORFs with a verifying protein homology, we find that most dORFs are substantially decayed, with approximately 90% having two or more disablements, and approximately 60% having four or more. dORFs are much more yeast-proteome specific than live yeast genes (having about half the chance that they are related to a non-yeast protein). They show a dramatically increased density at the telomeres of chromosomes, relative to genes. A microarray study shows that some dORFs are expressed even though they carry multiple disablements, and thus may be more resistant to nonsense-mediated decay. Many of the dORFs may be involved in responding to environmental stresses, as the largest functional groups include growth inhibition, flocculation, and the SRP/TIP1 family. Our results have important implications for proteome evolution. The characteristics of the dORF population suggest the sorts of genes that are likely to fall in and out of usage (and vary in copy number) in a strain-specific way and highlight the role of subtelomeric regions in engendering this diversity. 
Our results also have important implications for the effects of the [PSI+] prion. The dORFs disabled by only a single stop and the mORFs (together totalling 35) provide an estimate for the extent of the sequence population that can be resurrected readily through the demonstrated ability of the [PSI+] prion to cause nonsense-codon read-through. Also, the dORFs and mORFs that we find have properties (e.g. growth inhibition, flocculation, vanadate resistance, stress response) that are potentially related to the ability of [PSI+] to engender substantial phenotypic variation in yeast strains under different environmental conditions. (See genecensus.org/pseudogene for further information.)
Affiliation(s)
- Paul Harrison
- Department of Molecular Biophysics & Biochemistry, Yale University, New Haven, CT 06520-8114, USA
|
593
|
Takahashi H, Fujiwara H. Transplantation of target site specificity by swapping the endonuclease domains of two LINEs. EMBO J 2002; 21:408-17. [PMID: 11823433 PMCID: PMC125841 DOI: 10.1093/emboj/21.3.408] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.3] [Indexed: 11/12/2022]
Abstract
Long interspersed elements (LINEs) are ubiquitous genomic elements in higher eukaryotes. Here we develop a novel assay to analyze in vivo LINE retrotransposition using the telomeric repeat-specific elements SART1 and TRAS1. We demonstrate by PCR that silkworm SART1, which is expressed from a recombinant baculovirus, transposes in Sf9 cells into the chromosomal (TTAGG)n sequences, at the same specific nucleotide position as in the silkworm genome. Thus authentic retrotransposition by complete reverse transcription of the entire RNA transcription unit and occasional 5' truncation is observed. The retrotransposition requires conserved domains in both open reading frames (ORFs), including the ORF1 cysteine-histidine motifs. In contrast to human L1, recognition of the 3' untranslated region sequence is crucial for SART1 retrotransposition, which results in efficient trans-complementation. Swapping the endonuclease domain from TRAS1 into SART1 converts insertion specificity to that of TRAS1. Thus the primary determinant of in vivo target selection is the endonuclease domain, suggesting that modified LINEs could be used as gene therapy vectors, which deliver only genes of interest but not retrotransposons themselves in trans to specific genomic locations.
Affiliation(s)
- Hidekazu Takahashi
- Department of Integrated Biosciences, Graduate School of Frontier Sciences, University of Tokyo, Bioscience Building 501, Kashiwa, Chiba 277-8562, Japan
Present address: Department of Molecular Biology and Genetics, Johns Hopkins School of Medicine, Baltimore, MD 21205, USA
- Haruhiko Fujiwara
- Department of Integrated Biosciences, Graduate School of Frontier Sciences, University of Tokyo, Bioscience Building 501, Kashiwa, Chiba 277-8562, Japan
|
594
|
Goodin JL, Rutherford CL. Identification of differentially expressed genes during cyclic adenosine monophosphate-induced neuroendocrine differentiation in the human prostatic adenocarcinoma cell line LNCaP. Mol Carcinog 2002. [DOI: 10.1002/mc.10025] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 01/30/2023]
|
595
|
Abstract
L1 retrotransposons comprise 17% of the human genome. Although most L1s are inactive, some elements remain capable of retrotransposition. L1 elements have a long evolutionary history dating to the beginnings of eukaryotic existence. Although many aspects of their retrotransposition mechanism remain poorly understood, they likely integrate into genomic DNA by a process called target primed reverse transcription. L1s have shaped mammalian genomes through a number of mechanisms. First, they have greatly expanded the genome both by their own retrotransposition and by providing the machinery necessary for the retrotransposition of other mobile elements, such as Alus. Second, they have shuffled non-L1 sequence throughout the genome by a process termed transduction. Third, they have affected gene expression by a number of mechanisms. For instance, they occasionally insert into genes and cause disease both in humans and in mice. L1 elements have proven useful as phylogenetic markers and may find other practical applications in gene discovery following insertional mutagenesis in mice and in the delivery of therapeutic genes.
Affiliation(s)
- E M Ostertag
- Department of Genetics, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania 19104, USA.
|
596
|
Chen HH, Liu TYC, Huang CJ, Choo KB. Generation of two homologous and intronless zinc-finger protein genes, zfp352 and zfp353, with different expression patterns by retrotransposition. Genomics 2002; 79:18-23. [PMID: 11827453 DOI: 10.1006/geno.2001.6664] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.8] [Indexed: 11/22/2022]
Abstract
We have previously reported a mouse zinc-finger protein gene, Zfp352 (formerly 2czf48), that is expressed in early mouse embryos. Here, we report the genomic structure of Zfp352 and its lung-specific homolog, Zfp353. The two genes map on different chromosomes at 4C6 and 8B3.1. Both genes are intronless, except for the presence of a single 4.6-kb intron in the 5' untranslated region of Zfp352. The genes use different RNA start sites located 1.2 kb apart within the 5' homologous region. LINE1 sequences are structurally associated with the genes and form an integral part of Zfp353 transcripts, suggesting previous retrotransposition events. We propose a model of evolution of the genes. The main feature of the model is the presence of a fortuitous upstream promoter and an intron in the first retrotransposition site, creating a pre-Zfp352 gene with a 5' untranslated region intron. A second retrotransposition event copying from the pre-Zfp352 retroposon and removing the fortuitous intron resulted in the intronless Zfp353 at a different chromosomal location and with a different mode of expression. The model may be applicable to other genes with a similar structure with a single intron in the 5' untranslated region. The exact role of LINE1 in the retrotransposition events remains to be elucidated.
Affiliation(s)
- Huang-Hui Chen
- Recombinant DNA Laboratory, Department of Medical Research and Education, Veterans General Hospital-Taipei, Shih Pai, Taipei, Taiwan 11217
|
597
|
Szak ST, Pickeral OK, Makalowski W, Boguski MS, Landsman D, Boeke JD. Molecular archeology of L1 insertions in the human genome. Genome Biol 2002; 3:research0052. [PMID: 12372140 PMCID: PMC134481 DOI: 10.1186/gb-2002-3-10-research0052] [Citation(s) in RCA: 158] [Impact Index Per Article: 6.9] [Received: 02/12/2002] [Revised: 07/02/2002] [Accepted: 08/13/2002] [Indexed: 12/03/2022]
Abstract
BACKGROUND: As the rough draft of the human genome sequence nears a finished product and other genome-sequencing projects accumulate sequence data exponentially, bioinformatics is emerging as an important tool for studies of transposon biology. In particular, L1 elements exhibit a variety of sequence structures after insertion into the human genome that are amenable to computational analysis. We carried out a detailed analysis of the anatomy and distribution of L1 elements in the human genome using a new computer program, TSDfinder, designed to identify transposon boundaries precisely.
RESULTS: Structural variants of L1 elements shared similar trends in the length and quality of their target site duplications (TSDs) and poly(A) tails. Furthermore, we found no correlation between the composition and genomic location of the pre-insertion locus and the resulting anatomy of the L1 insertion. We verified that L1 insertions with TSDs have the 5'-TTAAAA-3' cleavage site associated with L1 endonuclease activity. In addition, the second target DNA cut required for L1 insertion weakly matches the consensus pattern TTAAAA. On the other hand, the L1-internal breakpoints of deleted and inverted L1 elements do not resemble L1 endonuclease cleavage sites. Finally, the genome sequence data indicate that whereas singly inverted elements are common, doubly inverted elements are almost never found.
CONCLUSIONS: The sequence data give no indication that the creation of L1 structural variants depends on characteristics of the insertion locus. In addition, the formation of 5' truncated and 5' inverted L1s is probably not due to the action of the L1 endonuclease.
Affiliation(s)
- Suzanne T Szak
- National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Current addresses: Biogen, Inc., Cambridge, MA 02142, USA
- These authors contributed equally to this work
- Oxana K Pickeral
- National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Department of Molecular Biology and Genetics, Johns Hopkins University School of Medicine, 725 N Wolfe St, Baltimore, MD 21205, USA
- Human Genome Sciences, Inc., Rockville, MD 20850, USA
- These authors contributed equally to this work
- Wojciech Makalowski
- National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Department of Biology, The Pennsylvania State University, 0208 Mueller Lab, University Park, PA 16802, USA
- Mark S Boguski
- National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Department of Molecular Biology and Genetics, Johns Hopkins University School of Medicine, 725 N Wolfe St, Baltimore, MD 21205, USA
- Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, Seattle, WA 98109, USA
- David Landsman
- National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Jef D Boeke
- Department of Molecular Biology and Genetics, Johns Hopkins University School of Medicine, 725 N Wolfe St, Baltimore, MD 21205, USA
|
598
|
Kitanaka J, Wang XB, Kitanaka N, Hembree CM, Uhl GR. Genomic organization of the murine G protein beta subunit genes and related processed pseudogenes. DNA Seq 2001; 12:345-54. [PMID: 11913780 DOI: 10.3109/10425170109084458] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/13/2022]
Abstract
The functional significance of heterotrimeric guanine nucleotide binding proteins (G proteins) in many physiological processes, including the molecular mechanisms of drug addiction, has been described. In investigating the changes in mRNA expression after acute psychostimulant administration, we previously identified a cDNA encoding a G protein beta1 subunit (Gbeta1) that was increased up to four-fold in certain brain regions after administration of psychostimulants. The mouse Gbeta1 gene (mouse genetic symbol GNB1) was mapped to chromosome 4, but little was known of its genetic features. To characterize the GNB1 gene further, we have cloned and analyzed the genomic structures of the mouse GNB1 gene and its homologous sequences. The GNB1 gene spans at least 50 kb and consists of 12 exons and 11 introns. The exon/intron boundaries were determined and found to follow the GT/AG rule. Exons 3-11 encode the Gbeta1 protein, and exon 2 is an alternative exon, resulting in two putative splice variants. Although GNB1 has an additional intron (intron 11) compared with GNB2 and GNB3, the intron positions within the protein coding region of GNB1, GNB2 and GNB3 are identical, suggesting that GNB1 diverged from the ancestral gene family earlier than GNB2 and GNB3. We also found 5'-truncated processed pseudogenes with 71-89% similarity to the GNB1 mRNA sequence, suggesting that truncated cDNA copies, reverse-transcribed from a processed GNB1 mRNA, might have been integrated into several new locations in the mouse genome.
Affiliation(s)
- J Kitanaka
- Molecular Neurobiology Branch, Intramural Research Program, National Institute on Drug Abuse, National Institutes of Health, Baltimore, MD 21224, USA.
|
599
|
Lenoir A, Lavie L, Prieto JL, Goubely C, Coté JC, Pélissier T, Deragon JM. The evolutionary origin and genomic organization of SINEs in Arabidopsis thaliana. Mol Biol Evol 2001; 18:2315-22. [PMID: 11719581 DOI: 10.1093/oxfordjournals.molbev.a003778] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.1] [Indexed: 11/13/2022]
Abstract
We have characterized the two families of SINE retroposons present in Arabidopsis thaliana. The origin, distribution, organization, and evolutionary history of RAthE1 and RAthE2 elements were studied and compared to the well-characterized SINE S1 element from Brassica. Our studies show that RAthE1, RAthE2, and S1 retroposons were generated independently from three different tRNAs. The RAthE1 and RAthE2 families are older than the S1 family and are present in all tested Cruciferae species. The evolutionary history of the RAthE1 family is unusual for SINEs. The 144 RAthE1 elements of the Arabidopsis genome cannot be classified in distinct subfamilies of different evolutionary ages as is the case for S1, RAthE2, and mammalian SINEs. Instead, most RAthE1 elements were probably derived steadily from a single source gene that was maintained intact and active for at least 12-20 Myr, a result suggesting that the RAthE1 source gene was under selection. The distribution of RAthE1 and RAthE2 elements on the Arabidopsis physical map was studied. We observed that, in contrast to other Arabidopsis transposable elements, SINEs are not concentrated in the heterochromatic regions. Instead, SINEs are grouped in the euchromatic chromosome territories several hundred kilobase pairs long. In these territories, SINE elements are closely associated with genes. A retroposition partnership between Arabidopsis SINEs and LINEs is proposed.
Affiliation(s)
- A Lenoir
- Centre National de la Recherche Scientifique, Université Blaise Pascal Clermont-Ferrand II, Aubière cedex, France
|
600
|
Witte CP, Le QH, Bureau T, Kumar A. Terminal-repeat retrotransposons in miniature (TRIM) are involved in restructuring plant genomes. Proc Natl Acad Sci U S A 2001; 98:13778-83. [PMID: 11717436 PMCID: PMC61118 DOI: 10.1073/pnas.241341898] [Citation(s) in RCA: 142] [Impact Index Per Article: 5.9] [Received: 07/05/2001] [Indexed: 11/18/2022]
Abstract
A new group of long terminal repeat (LTR) retrotransposons, termed terminal-repeat retrotransposons in miniature (TRIM), is described that is present in both monocotyledonous and dicotyledonous plants. TRIM elements have terminal direct repeat sequences between approximately 100 and 250 bp in length that encompass an internal domain of approximately 100-300 bp. The internal domain contains primer binding site and polypurine tract motifs but lacks the coding domains required for mobility. Thus TRIM elements are not capable of autonomous transposition and probably require the help of mobility-related proteins encoded by other retrotransposons. The structural organization of TRIM elements suggests an evolutionary relationship to either LTR retrotransposons or retroviruses. The past mobility of TRIM elements is indicated by the presence of flanking 5-bp direct repeats found typically at LTR retrotransposon insertion sites, the high degree of sequence conservation between elements from different genomic locations, and the identification of related to empty sites (RESites). TRIM elements seem to be involved actively in the restructuring of plant genomes, affecting the promoter, coding region and intron-exon structure of genes. In solanaceous species and maize, TRIM elements provided target sites for further retrotransposon insertions. In Arabidopsis, evidence is provided that the TRIM element also can be involved in the transduction of host genes.
Affiliation(s)
- C P Witte
- Scottish Crop Research Institute, Invergowrie, DD2 5DA Dundee, Scotland
|