1
Komissarov AS, Galkina SA, Koshel EI, Kulak MM, Dyomin AG, O'Brien SJ, Gaginskaya ER, Saifitdinova AF. New high copy tandem repeat in the content of the chicken W chromosome. Chromosoma 2017; 127:73-83. [PMID: 28951974 DOI: 10.1007/s00412-017-0646-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 04/04/2017] [Revised: 09/13/2017] [Accepted: 09/18/2017] [Indexed: 11/26/2022]
Abstract
The content of repetitive DNA in avian genomes is considerably lower than in other investigated vertebrates. The first descriptions of tandem repeats were based on the results of routine biochemical and molecular biological experiments. Both satellite DNA and interspersed repetitive elements were annotated using a library-based approach and de novo repeat identification in the assembled genome. The development of deep-sequencing methods provides high-quality datasets without preassembly, allowing one to annotate repetitive elements from the unassembled part of genomes. In this work, we search the chicken assembly and annotate high copy number tandem repeats from unassembled short raw reads. The tandem repeat (GGAAA)n has been identified and found to be the second most abundant in the chicken genome, after the telomeric repeat (TTAGGG)n. Furthermore, the (GGAAA)n repeat forms expanded arrays on both arms of the chicken W chromosome. Our results highlight the complexity of repetitive sequences and update data on the organization of the W sex chromosome in chicken.
Affiliation(s)
- Aleksey S Komissarov
  - Theodosius Dobzhansky Center for Genome Bioinformatics, Saint Petersburg State University, Sredniy av. 41, 199034, Saint Petersburg, Russia
- Svetlana A Galkina
  - Department of Genetics and Biotechnology, Saint Petersburg State University, Universitetskaya emb. 7/9, 199034, Saint Petersburg, Russia
  - Saint Petersburg Association of Scientists and Scholars, Universitetskaya emb. 5, Saint Petersburg, 199034, Russia
- Elena I Koshel
  - Department of Cytology and Histology, Saint Petersburg State University, Universitetskaya emb. 7/9, 199034, Saint Petersburg, Russia
- Maria M Kulak
  - Department of Cytology and Histology, Saint Petersburg State University, Universitetskaya emb. 7/9, 199034, Saint Petersburg, Russia
- Aleksander G Dyomin
  - Saint Petersburg Association of Scientists and Scholars, Universitetskaya emb. 5, Saint Petersburg, 199034, Russia
  - Chromas Research Resource Center, Saint Petersburg State University, Oranienbaumskoye sh. 2, 198504, Saint Petersburg, Russia
- Stephen J O'Brien
  - Theodosius Dobzhansky Center for Genome Bioinformatics, Saint Petersburg State University, Sredniy av. 41, 199034, Saint Petersburg, Russia
  - Oceanographic Center, Nova Southeastern University, Fort Lauderdale, Florida, 33004, USA
- Elena R Gaginskaya
  - Department of Cytology and Histology, Saint Petersburg State University, Universitetskaya emb. 7/9, 199034, Saint Petersburg, Russia
- Alsu F Saifitdinova
  - Chromas Research Resource Center, Saint Petersburg State University, Oranienbaumskoye sh. 2, 198504, Saint Petersburg, Russia
  - International Centre of Reproductive Medicine, Komendantskiy av. 53-1, Saint Petersburg, 197350, Russia
2
Doll RF, Bruce A, Smith FI. Regulation of the human acid beta-glucosidase promoter in multiple cell types. Biochimica et Biophysica Acta 1995; 1261:57-67. [PMID: 7893761 DOI: 10.1016/0167-4781(94)00215-o] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.5] [Indexed: 01/27/2023]
Abstract
Acid beta-glucosidase (beta Glc) is a housekeeping enzyme whose expression is ubiquitous, but differs greatly according to tissue of origin. Expression of a reporter gene under the control of a 622 bp fragment of the beta Glc promoter correlated roughly with the relative amount of beta Glc mRNA detected in five different cell lines, suggesting that elements within this region play a role in determining differential expression of the beta Glc gene. Experiments using deletion mutants revealed that differential expression of beta Glc is not due to the presence of promoter elements that are active in only certain cell types, but rather due to subtle changes in the magnitude of the effect of the different elements. Strikingly, regulatory elements located upstream of the TATA box are dispensable in several cell types, whereas elements located within exon 1 of the beta Glc gene are essential for reporter gene expression in cultured cells. At least two exon 1 elements regulate mRNA levels, and one double stranded probe containing exon 1 sequences binds a factor present in extracts from HeLa and glioblastoma cells. Additionally, at least two of the exon 1 elements act in an orientation-independent fashion. Thus, it is likely that at least a subset of the exon 1 elements act as transcriptional enhancers.
Affiliation(s)
- R F Doll
- Biomedical Sciences Department, Eunice Kennedy Shriver Center for Mental Retardation, Waltham, MA 02254
3
Guo X, Zhang YP, Mitchell DA, Denhardt DT, Chambers AF. Identification of a ras-activated enhancer in the mouse osteopontin promoter and its interaction with a putative ETS-related transcription factor whose activity correlates with the metastatic potential of the cell. Mol Cell Biol 1995; 15:476-87. [PMID: 7799957 PMCID: PMC231995 DOI: 10.1128/mcb.15.1.476] [Citation(s) in RCA: 64] [Impact Index Per Article: 2.2] [Indexed: 01/27/2023]
Abstract
The role of RAS in transducing signals from an activated receptor into altered gene expression is becoming clear, though some links in the chain are still missing. Cells possessing activated RAS express higher levels of osteopontin (OPN), an alpha v beta 3 integrin-binding secreted phosphoprotein implicated in a number of developmental, physiological, and pathological processes. We report that in T24 H-ras-transformed NIH 3T3 cells enhanced transcription contributes to the increased expression of OPN. Transient transfection studies, DNA-protein binding assays, and methylation protection experiments have identified a novel ras-activated enhancer, distinct from known ras response elements, that appears responsible for part of the increase in OPN transcription in cells with an activated RAS. In electrophoretic mobility shift assays, the protein-binding motif GGAGGCAGG was found to be essential for the formation of several complexes, one of which (complex A) was generated at elevated levels by cell lines that are metastatic. Southwestern blotting and UV light cross-linking studies indicated the presence of several proteins able to interact with this sequence. The proteins that form these complexes have molecular masses estimated at approximately 16, 28, 32, 45, 80, and 100 kDa. Because the approximately 16-kDa protein was responsible for complex A formation, we have designated it MATF for metastasis-associated transcription factor. The GGANNNAGG motif is also found in some other promoters, suggesting that they may be similarly controlled by MATF.
Affiliation(s)
- X Guo
- Department of Biological Sciences, Rutgers University, Piscataway, New Jersey 08855-1059
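The GGANNNAGG consensus mentioned in the abstract above can be looked for on both strands of a promoter sequence with a short script. This is only a minimal sketch, assuming the input is a plain DNA string; the names `find_motif` and `reverse_complement` are illustrative and do not come from the paper, and a sequence match alone says nothing about protein binding.

```python
import re

# Lookahead so that overlapping occurrences are all reported.
MOTIF = re.compile(r"(?=(GGA[ACGT]{3}AGG))")
MOTIF_LENGTH = 9
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(seq):
    """List (start, strand, matched_sequence) for GGANNNAGG on both strands.

    `start` is the 0-based position on the forward strand of the
    leftmost base covered by the match.
    """
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in MOTIF.finditer(seq)]
    for m in MOTIF.finditer(reverse_complement(seq)):
        # Convert a coordinate on the reverse strand back to the forward strand.
        start = len(seq) - (m.start() + MOTIF_LENGTH)
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


print(find_motif("TTGGAGGCAGGTT"))  # [(2, '+', 'GGAGGCAGG')]
```

The example input embeds the GGAGGCAGG motif reported in the abstract between two arbitrary flanking bases on each side.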
4
Borstnik B, Pumpernik D, Lukman D, Ugarković D, Plohl M. Tandemly repeated pentanucleotides in DNA sequences of eucaryotes. Nucleic Acids Res 1994; 22:3412-7. [PMID: 8078778 PMCID: PMC523737 DOI: 10.1093/nar/22.16.3412] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.3] [Indexed: 01/28/2023]
Abstract
Genetic sequence data banks were scanned in order to retrieve tandemly repeated pentanucleotides (pnts). It was found that among 102 (=(1024-4)/2/5) possible distinct pnts roughly each fourth is involved in tandem repeats. It is shown that tandemly repeated pnts are composed of frequently occurring di- and trinucleotides and that those pnts which occur frequently in the form of mono- or di-pnts form also tandem repeats either in the form of satellites or in the form of shorter tandem repeats. Human satellite III is taken as a specific example. It is shown that the first guanine within GG-AAT pnt exhibits the highest mutability. Sequential distribution of base changes gives evidence that the mutations do not occur at random positions but in a correlated fashion so that long stretches of original pnts remain intact. It is found that pnts related to the satellite III are present in introns and flanking regions of some structural genes, but are not preserved between orthologous genes of related species. The results corroborate the most plausible mechanism of their evolution--rapid amplification followed by successive divergence of repeat units by various mutational processes.
Affiliation(s)
- B Borstnik
- National Institute of Chemistry, Ljubljana, Slovenia
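The count of 102 distinct pentanucleotides quoted in the abstract above, (1024-4)/2/5, can be checked by brute force. The sketch below assumes the intended reading is that two repeat units are the same when one is a cyclic rotation of the other or of its reverse complement, and that the four homopolymers are excluded; the function names are illustrative only.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def rotations(seq):
    """All cyclic rotations of seq."""
    return [seq[i:] + seq[:i] for i in range(len(seq))]


def canonical_unit(seq):
    """Smallest string among the rotations of seq and of its reverse complement."""
    return min(rotations(seq) + rotations(reverse_complement(seq)))


def distinct_repeat_units(k):
    """Canonical non-homopolymer k-mers, up to rotation and strand.

    For prime k (such as 5) homopolymers are the only k-mers with a
    shorter internal period, so no further filtering is needed.
    """
    units = set()
    for letters in product("ACGT", repeat=k):
        seq = "".join(letters)
        if len(set(seq)) == 1:  # skip AAAAA, CCCCC, GGGGG, TTTTT
            continue
        units.add(canonical_unit(seq))
    return units


print(len(distinct_repeat_units(5)))  # 102
```

Under this convention the satellite III unit GGAAT and its complementary-strand form ATTCC map to the same canonical unit, which is why the 1020 non-homopolymer pentamers collapse by a factor of 10.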
5
Tagle DA, Stanhope MJ, Siemieniak DR, Benson P, Goodman M, Slightom JL. The beta globin gene cluster of the prosimian primate Galago crassicaudatus: nucleotide sequence determination of the 41-kb cluster and comparative sequence analyses. Genomics 1992; 13:741-60. [PMID: 1639402 DOI: 10.1016/0888-7543(92)90150-q] [Citation(s) in RCA: 24] [Impact Index Per Article: 0.8] [Indexed: 12/28/2022]
Abstract
The nucleotide sequence of the beta globin gene cluster of the prosimian Galago crassicaudatus has been determined. A total sequence spanning 41,101 bp contains and links together previously published sequences of the five galago beta-like globin genes (5'-epsilon-gamma-psi eta-delta-beta-3'). A computer-aided search for middle interspersed repetitive sequences identified 10 LINE (L1) elements, including a 5' truncated repeat that is orthologous to the full-length L1 element found in the human epsilon-gamma intergenic region. SINE elements that were identified included one Alu type I repeat, four Alu type II repeats, and two methionine tRNA-derived Monomer (type III) elements. Alu type II and Monomer sequences are unique to the galago genome. Structural analyses of the cluster sequence reveal that it is relatively A+T rich (about 62%) and regions with high G+C content are associated primarily with globin coding regions. Comparative analyses with the beta globin cluster sequences of human, rabbit, and mouse reveal extensive sequence homologies in their genic regions, but only human, galago, and rabbit sequences share extensive intergenic sequence homologies. Divergence analyses of aligned intergenic and flanking sequences from orthologous human, galago, and rabbit sequences show a gradation in the rate of nucleotide sequence evolution along the cluster where sequences 5' of the epsilon globin gene region show the least sequence divergence and sequences just 5' of the beta globin gene region show the greatest sequence divergence.
Affiliation(s)
- D A Tagle
- Department of Molecular Biology, Wayne State University School of Medicine, Detroit, Michigan 48201
6
Michel D, Chatelain G, Herault Y, Brun G. The long repetitive polypurine/polypyrimidine sequence (TTCCC)48 forms DNA triplex with PU-PU-PY base triplets in vivo. Nucleic Acids Res 1992; 20:439-43. [PMID: 1741277 PMCID: PMC310405 DOI: 10.1093/nar/20.3.439] [Citation(s) in RCA: 28] [Impact Index Per Article: 0.9] [Indexed: 12/28/2022]
Abstract
Polypurine/polypyrimidine repetitive sequences occur with high frequency in eucaryotic genomes, particularly around transcription units. Since such sequences are known to adopt triple stranded-structures under appropriate conditions in vitro, it is of major interest to know if they occur in vivo, and thus if they can have some biological importance by inducing structural constraints in the genomic DNA. To this end, we have isolated a (TTCCC)48 sequence, present in the promoter of an avian gene, and tested its ability to form PU-PY-PY and PU-PU-PY triple helices in vitro, through the oligonucleotide gel shift technique and single strand-specific nuclease footprinting. We have then developed an oligonucleotide protection assay, which can be adapted to in vivo investigations. This strategy leads us to conclude that in vivo conditions allow preponderant formation of triplex of the PU-PU-PY class.
Affiliation(s)
- D Michel
- Laboratoire de Biologie Moléculaire et Cellulaire CNRS, UMR 49, Ecole Normale Supérieure de Lyon, France
7
VanWye JD, Bronson EC, Anderson JN. Species-specific patterns of DNA bending and sequence. Nucleic Acids Res 1991; 19:5253-61. [PMID: 1923808 PMCID: PMC328884 DOI: 10.1093/nar/19.19.5253] [Citation(s) in RCA: 41] [Impact Index Per Article: 1.2] [Indexed: 12/29/2022]
Abstract
Nucleotide sequences in the GenEMBL database were analyzed using strategies designed to reveal species-specific patterns of DNA bending and DNA sequence. The results uncovered striking species-dependent patterns of bending with more variations among individual organisms than between prokaryotes and eukaryotes. The frequency of bent sites in sequences from different bacteria was related to genomic A + T content and this relationship was confirmed by electrophoretic analysis of genomic DNA. However, base composition was not an accurate predictor for DNA bending in eukaryotes. Sequences from C. elegans exhibited the highest frequency of bent sites in the database and the RNA polymerase II locus from the nematode was the most bent gene in GenEMBL. Bent DNA extended throughout most introns and gene flanking segments from C. elegans while exon regions lacked A-tract bending characteristics. Independent evidence for the strong bending character of this genome was provided by electrophoretic studies which revealed that a large number of the fragments from C. elegans DNA exhibited anomalous gel mobilities when compared to genomic fragments from over 20 other organisms. The prevalence of bent sites in this genome enabled us to selectively detect C. elegans sequences in a computer search of the database using as probes C. elegans introns, bending elements, and a 20 nucleotide consensus sequence for bent DNA. This approach was also used to provide additional examples of species-specific sequence patterns in eukaryotes where it was shown that (A)≥10 and (A.T)≥5 tracts are prevalent throughout the untranslated DNA of D. discoideum and P. falciparum, respectively. These results provide new insight into the organization of eukaryotic DNA because they show that species-specific patterns of simple sequences are found in introns and in other untranslated regions of the genome.
Affiliation(s)
- J D VanWye
- Department of Biological Sciences, Purdue University, West Lafayette, IN 47907
8
Bezouska K, Crichlow G, Rose J, Taylor M, Drickamer K. Evolutionary conservation of intron position in a subfamily of genes encoding carbohydrate-recognition domains. J Biol Chem 1991. [DOI: 10.1016/s0021-9258(18)98999-4] [Citation(s) in RCA: 66] [Impact Index Per Article: 2.0] [Indexed: 11/25/2022]
9
McHugh KP, Madsen CS, de Kloet SR. A highly repeated retropseudogene-like sequence in DNA of the redbreasted merganser (Mergus serrator). Gene 1990; 87:193-7. [PMID: 2332168 DOI: 10.1016/0378-1119(90)90301-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 12/31/2022]
Abstract
Two highly repeated nucleotide sequences (RBMI and RBMII) cloned from an EcoRI digest of DNA of the redbreasted merganser (Mergus serrator) account for approx. 5 to 10% of the DNA of M. serrator and the closely related Mergus merganser. Complete DNA digestion of seven members of the Mergini with EcoRI produces distinct, relatively species-specific patterns of a few high-Mr (greater than 1.5 kb) fragments of RBMI-like material. In such digests RBMII forms ladder-type patterns with monomers of approx. 200 bp. The sequence of a cloned 2.6-kb RBMI fragment from M. serrator contains several extended (up to 70 bp) and modified poly(dA) sequences, two open reading frames in opposite orientation to the longest poly(dA) sequence and two direct 10-bp repeats suggesting that RBMI is a rearranged retropseudogene-like element.
Affiliation(s)
- K P McHugh
- Institute of Molecular Biophysics, Florida State University, Tallahassee 32306
10
Jeltsch JM, Turcotte B, Garnier JM, Lerouge T, Krozowski Z, Gronemeyer H, Chambon P. Characterization of multiple mRNAs originating from the chicken progesterone receptor gene. Evidence for a specific transcript encoding form A. J Biol Chem 1990. [DOI: 10.1016/s0021-9258(19)39689-9] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.2] [Indexed: 11/27/2022]
11
Zopf D, Dineva B, Betz H, Gundelfinger ED. Isolation of the chicken middle-molecular weight neurofilament (NF-M) gene and characterization of its promoter. Nucleic Acids Res 1990; 18:521-9. [PMID: 2106668 PMCID: PMC333457 DOI: 10.1093/nar/18.3.521] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.1] [Indexed: 12/30/2022]
Abstract
We have isolated and sequenced genomic DNA clones covering the coding region of the chicken mid-size neurofilament (NF-M) gene and greater than 1 kb of its 5' upstream region. The NF-M gene contains two introns, both of which are located within the highly conserved C-terminal region of the rod domain. The 5' end of the corresponding mRNA was assigned to a G residue 40 nucleotides upstream of the translation start site and at an appropriate distance from a potential TATA box. To functionally analyze the NF-M promoter, constructs carrying 112, 222, and 1026 nucleotides of the 5' upstream region in front of a luciferase reporter gene were tested for their capability to direct luciferase expression after transient transfection into various cell lines. Significant luciferase activity was recorded both in rat phaeochromocytoma (PC12) cells and murine fibroblasts. In PC12 cells, in which neurite outgrowth is induced by nerve growth factor (NGF), expression was stimulated up to 13-fold within 3 days of NGF treatment. This closely resembles expression of the endogenous NF-M gene in response to this hormone.
Affiliation(s)
- D Zopf
- ZMBH, Universität Heidelberg, FRG
12
O'Hara PJ, Grant FJ. The human factor VII gene is polymorphic due to variation in repeat copy number in a minisatellite. Gene 1988; 66:147-58. [PMID: 2970988 DOI: 10.1016/0378-1119(88)90232-6] [Citation(s) in RCA: 25] [Impact Index Per Article: 0.7] [Indexed: 01/03/2023]
Abstract
The gene coding for human factor VII, a vitamin K-dependent coagulation factor, contains five minisatellite imperfect tandem repeats with monomer element lengths ranging from 14 to 37 bp, and copy numbers ranging from 6 to 52. Three of these repeats are entirely within introns, one is entirely in an untranslated portion of an exon, and one spans an exon-intron border and contains coding sequence. A consensus sequence derived from a comparison of the monomers is similar to a core sequence found in other minisatellites. All of the minisatellites display higher-order periodicities. At least one of these minisatellites is polymorphic. A variation in repeat copy number has been observed in a tandem-repeat region in the seventh factor-VII intron.
13
Calzone FJ, Lee JJ, Le N, Britten RJ, Davidson EH. A long, nontranslatable poly(A) RNA stored in the egg of the sea urchin Strongylocentrotus purpuratus. Genes Dev 1988; 2:305-18. [PMID: 2454211 DOI: 10.1101/gad.2.3.305] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.5] [Indexed: 01/01/2023]
Abstract
Nontranslatable transcripts containing interspersed repetitive sequence elements constitute a major fraction of the poly(A) RNA stored in the cytoplasm of both the sea urchin egg and the amphibian oocyte. We report the first complete sequence of a representative interspersed maternal RNA transcript, called ISp1. The transcript is about 3.7 kb in length [including poly(A) tail]; and the 5' half consists of a cluster of repetitive sequences, whereas the 3' half is single copy. Other repetitive sequences occur in the 5' and 3' regions flanking the transcription unit. In several cloned alleles, the flanking repetitive and single-copy sequences differ, indicating a high degree of insertional and deletional rearrangement around, as well as within, the transcription unit. No significant open reading frames exist in any region of the ISp1 transcript, nor is it spliced to give rise to translatable mRNA in egg or embryo. A 620-nucleotide repetitive sequence element at the 5' end of the ISp1 transcript is also represented in a large number of other long interspersed maternal poly(A) RNAs. In addition, this sequence appears in a prevalent set of small polyadenylated RNAs about 600-nucleotides in length, which disappear almost completely by the gastrula stage of development. The structural features of the ISp1 RNA uncovered in this work exclude several hypotheses of interspersed maternal poly(A) RNA origin and function.
Affiliation(s)
- F J Calzone
- Division of Biology, California Institute of Technology, Pasadena 91125
14
Delaey B, Dirckx L, Decourt JL, Claessens F, Peeters B, Rombauts W. Rat prostatic binding protein: the complete sequence of the C2 gene and its flanking regions. Nucleic Acids Res 1987; 15:1627-41. [PMID: 2881277 PMCID: PMC340571 DOI: 10.1093/nar/15.4.1627] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.3] [Indexed: 01/03/2023]
Abstract
The complete sequence (2879 bp) of the androgen-controlled rat prostatic binding protein C2 gene and 1023 bp of the 5'- and 2127 bp of the 3'-flanking regions have been determined. The gene contains three exons (93, 203 and 147 bp) and two introns (1630 and 806 bp). It is flanked by two homopurine-homopyrimidine stretches of 55 and 131 nucleotides respectively, located at positions -405 and 4151. These sequences are remarkably sensitive towards S1-nuclease, indicating an altered DNA conformation under superhelical stress. Several palindromes and dyad structures are observed in the 5'-upstream region of the gene and at position -457, and 80% homology to the consensus sequence of a glucocorticoid receptor binding site is found.
15
Human homologs of TU transposon sequences: polypurine/polypyrimidine sequence elements that can alter DNA conformation in vitro and in vivo. Mol Cell Biol 1987. [PMID: 3025605 DOI: 10.1128/mcb.6.11.3632] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022]
Abstract
We previously have shown that homologs of the outer domain segment of the inverted repeat termini (IVR-OD) of the sea urchin TU transposons are conserved among multiple eucaryotic species, including humans. We report here that two cloned human DNA IVR-OD homologs, Hut2 and Hut17, consist of a series of tandem repeats of the trimer AGG/TCC, forming segments (313 and 221 base pairs in length, respectively) of polypurine/polypyrimidine (pPu/pPy or "Puppy") asymmetry in the two DNA strands; these are punctuated at certain sites with variant trimers, which are different for the two clones. Sequences homologous to the Hut2 pPu/pPy tract exist at multiple sites in the DNA of a wide variety of eucaryotes. Hybridization of human DNA with a Hut2 probe or with a previously described chicken DNA pPu/pPy sequence indicates that pPu/pPy sequences can be grouped into families distinguishable by the extent of their homology with each probe at different hybridization stringencies. Moreover, particular pPu/pPy tracts show species-specific differences in their distribution. Both the Hut2 and Hut17 pPu/pPy tracts are cleaved by S1 nuclease when tested on supercoiled plasmids. Most if not all of the 313-base-pair Hut2 pPu/pPy tract is also sensitive to S1 in its native location in HeLa cell chromatin, indicating that the sequence contains conformational information that can be expressed in vivo. This view is supported by evidence that exogenously derived Hut2 pPu/pPy tracts introduced into mouse L cells and integrated in chromatin can assume an S1-sensitive conformation.
16
Hoffman-Liebermann B, Liebermann D, Troutt A, Kedes LH, Cohen SN. Human homologs of TU transposon sequences: polypurine/polypyrimidine sequence elements that can alter DNA conformation in vitro and in vivo. Mol Cell Biol 1986; 6:3632-42. [PMID: 3025605 PMCID: PMC367124 DOI: 10.1128/mcb.6.11.3632-3642.1986] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.5] [Indexed: 01/03/2023]
17
Abstract
A Fourier transform g(n) of a sequence of bases along a given stretch of DNA is defined. The transform is invariant to the labelling of the bases and can therefore be used as a measure of periodicity for segments of DNA with differing base content. It can also be conveniently used to search for base periodicities within large DNA data bases.
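The abstract above does not give the formula for g(n), so the sketch below is not the author's definition. It shows one common construction with the same stated property: the power spectra of the per-base indicator sequences are summed, so renaming the bases leaves the result unchanged. The function names `base_spectrum` and `dominant_period` are hypothetical, and the direct summation is quadratic in sequence length, which is only suitable for short segments.

```python
import cmath


def base_spectrum(seq):
    """Sum over bases of |DFT of the base's indicator sequence|^2.

    Returns the values for frequencies n = 0 .. len(seq) // 2. Because
    the sum runs over whichever symbols occur, relabelling the bases
    does not change the spectrum.
    """
    total = len(seq)
    spectrum = []
    for n in range(total // 2 + 1):
        power = 0.0
        for base in set(seq):
            coeff = sum(
                cmath.exp(-2j * cmath.pi * n * j / total)
                for j, symbol in enumerate(seq)
                if symbol == base
            )
            power += abs(coeff) ** 2
        spectrum.append(power)
    return spectrum


def dominant_period(seq):
    """Period (in bases) of the strongest non-zero frequency component."""
    spectrum = base_spectrum(seq)
    # n = 0 only reflects base composition, so it is skipped.
    n_best = max(range(1, len(spectrum)), key=spectrum.__getitem__)
    return len(seq) / n_best


print(dominant_period("GGAAA" * 20))  # 5.0
```

For the 100-base test sequence the only non-zero components fall at n = 20 and n = 40, and the n = 20 term is the larger one, giving a period of 5.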
18
Willard C, Wong E, Hess JF, Shen CK, Chapman B, Wilson AC, Schmid CW. Comparison of human and chimpanzee zeta 1 globin genes. J Mol Evol 1985; 22:309-15. [PMID: 3003369 DOI: 10.1007/bf02115686] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.4] [Indexed: 01/03/2023]
Abstract
The DNA base sequences of the entire chimpanzee zeta 1 globin gene and an additional 1 kb of DNA flanking both the human and chimpanzee genes have been determined. Whereas the human zeta 1 gene contains a termination codon in the sixth position, the chimpanzee gene appears to be functional. This finding confirms Proudfoot et al.'s suggestion that the human zeta 1 gene was recently inactivated. Like the corresponding human zeta 1 and zeta 2 genes, the first and second introns of the chimpanzee zeta 1 gene are occupied largely by tandem repeats of short oligonucleotides. These tandem repeats have undergone several rearrangements since the divergence of the human and chimpanzee zeta 1 genes.
19
Christophe D, Cabrer B, Bacolla A, Targovnik H, Pohl V, Vassart G. An unusually long poly(purine)-poly(pyrimidine) sequence is located upstream from the human thyroglobulin gene. Nucleic Acids Res 1985; 13:5127-44. [PMID: 2991855 PMCID: PMC321854 DOI: 10.1093/nar/13.14.5127] [Citation(s) in RCA: 102] [Impact Index Per Article: 2.6] [Indexed: 01/03/2023]
Abstract
A region of human genomic DNA encompassing the 5' end of the thyroglobulin gene has been sequenced and the position of the transcriptional start site has been determined. The 5' non-translated portion of the mRNA displays a quasi-palindromic sequence which could allow this region to adopt a hairpin structure. The first exon of the gene encodes a 19-amino-acid signal peptide and the first 3 amino acids of the mature protein. Apart from the canonical TATA-Box and from a CAAT-Box homology, the promoter region contains a 209 bp-long poly(purine)-poly(pyrimidine) sequence located between positions -512 and -304 relative to the transcription start. When contained in a supercoiled plasmid, this sequence exhibits sensitivity to S1 nuclease at two distinct positions. A precise mapping of the borders of the sensitive regions was achieved by extending primers from both ends of the sequence after digestion by the enzyme. The resulting data can be explained by a model involving the formation of a triple helix structure.
20
Abstract
We isolated clones and determined the sequence of portions of mouse and human cellular DNA which cross-hybridize strongly with the IR3 repetitive region of Epstein-Barr virus. The sequences were found to be tandem arrays of a simple sequence based on the triplet GGA, very similar to the IR3 repeat. The cellular repeats have distinct differences from the viral repeat region, however, and their sequences do not appear capable of being translated into a purely glycine-plus-alanine protein domain like the portion of the Epstein-Barr nuclear antigen coded by IR3. Although the relationship between IR3 and the cellular repeats is left unclear, the cellular repeats have many interesting features. The tandem arrays are about 1 to several kilobases long, much shorter than satellite tandem repeats and larger than other interspersed, tandem repeats. Each of the repeats is a distinct variation, perhaps diverged from a common sequence, (GGA)n. This family is present in the genomes of all species tested and appears to be a ubiquitous feature of all higher eucaryotic genomes.
|
22
|
Braga EA, Avdonina TA, Zhurkin VB, Nosikov VV. Structural organization of rat ribosomal RNA genes: interspersed sequences and their putative role in the alignment of nucleosomes. Gene 1985; 36:249-62. [PMID: 3000877 DOI: 10.1016/0378-1119(85)90180-5] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.3] [Indexed: 01/03/2023]
Abstract
We have observed four regions containing highly repetitive interspersed sequences in the nontranscribed spacer (NTS) of the rat rRNA genes. Two of them (A and B) are located at a distance of 3-5 kb upstream from the transcription start point and two others (C and D) at a distance of 2-5 kb downstream from the 3' end of the 28S rRNA gene. These repetitive sequences are widely dispersed in the genome and are included both in small-copy regions and in the families of extended reiterated sequences. The sequences of three fragments were determined: one from the C2 region, 1100 bp in length and two from A and C1 regions, 110-120 bp long. These regions are characterized by the presence of 'simple' sequences, such as (AC)n, (ACC)n, (GAG)n, (GGGA)n, (TAAG)n, and also of long blocks, (G)n and (A)n. In the C2 region two palindromes, 16 and 14 nt long, were found, one of them including a XhoI site. Mobile element B2 was observed in regions B and C. All four regions, A, B, C and D, contain sets of simple sequences, among which some common elements have been found. Theoretical prediction of the nucleosomal disposition in the C region indicates that the combination of simple sequences existing in the given area secures fixed positions of the nucleosomes, one of the nucleosomes being formed on the B2 element. Moreover, a striking periodicity, with the repeat length close to that of the rat nucleosomal DNA, has been observed. A hypothesis is put forward that the simple sequences can dictate the location of nucleosomes on the adjoining DNA sequences, thereby regulating the gene activity.
|
23
|
Khandekar P, Saidapet C, Krauskopf M, Zarraga AM, Lin WL, Mendola C, Siddiqui MA. Co-ordinate control of gene expression. Muscle-specific 7 S RNA contains sequences homologous to 3'-untranslated regions of myosin genes and repetitive DNA. J Mol Biol 1984; 180:417-35. [PMID: 6084716 DOI: 10.1016/0022-2836(84)90020-2] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.5] [Indexed: 01/18/2023]
Abstract
We have cloned and sequenced a complementary DNA copy (pSS48) of a novel muscle-specific, low molecular weight RNA, 7 S RNA, isolated from embryonic chick cardiac muscle cells. The hybridization pattern of plasmid pSS48 DNA to chick genomic DNA suggests that 7 S RNA is derived from the repetitive chick DNA with a repetition frequency of about 300 copies per haploid genome. Under low stringency, pSS48 DNA also hybridizes with high specificity to the single copy gene for chick myosin light chain (MLC) and to myosin heavy chain (MHC), and possibly to other co-ordinately expressed genes for chick muscle proteins. The sequence analysis of recombinant plasmids pSS48, pML10 and pMHC8, for 7 S RNA, MLC mRNA and MHC RNA, respectively, indicated that short nucleotide stretches homologous to 7 S RNA reside in the 3' untranslated regions of the respective genes. The 7 S RNA sequence appears to be highly specific for the chick muscle tissue, since RNA and DNA from several sources did not hybridize to pSS48 DNA. Furthermore, the 7 S RNA-like sequence(s) appears in chick blastodermal cells preferentially earlier than the onset of transcription of genes for major muscle proteins. These results, taken together, suggest a possible function for 7 S RNA in expression of muscle-specific genes during chick development.
|
25
|
Bennett KL, Hill RE, Pietras DF, Woodworth-Gutai M, Kane-Haas C, Houston JM, Heath JK, Hastie ND. Most highly repeated dispersed DNA families in the mouse genome. Mol Cell Biol 1984; 4:1561-71. [PMID: 6208477 PMCID: PMC368948 DOI: 10.1128/mcb.4.8.1561-1571.1984] [Citation(s) in RCA: 42] [Impact Index Per Article: 1.1] [Indexed: 01/19/2023]
Abstract
The construction of a small library of mouse repetitive DNA has been previously reported (Pietras et al., Nucleic Acids Res. 11:6965-6983, 1983). Here we report that the 35 plasmids in this library corresponding to highly repeated (greater than 30,000 copies per genome) dispersed DNA sequences can be grouped into no more than 5 distinct families. These families together comprise 8 to 10% of the mouse genome. They include the previously described small elements B1, B2, and R and the large MIF-1 element. Twelve of the 35 clones contain evolutionarily conserved (EC) sequences. One EC clone in our library mostly consists of alternating dCdT residues; another consists of tandem repeats of the sequence CCTCT. The majority of B1s and B2s in the genome appear to be homogeneous, whereas R sequences, ECs, and MIF-1s are heterogeneous. Two earlier reports showed highly repeated mammalian DNA sequences in the herpesvirus genome (Peden et al., Cell 31:71-80, 1982; Puga et al., Cell 31:81-87, 1982). We show that sequences homologous to our EC clones are present in the herpesvirus genome, although these polypyrimidine stretches are not detected in poxvirus, adenovirus, and simian virus 40 genomes. We detect transcripts containing homology to all of these sequences in a nuclear transcription assay. Also, we show that small, polyadenylated RNA molecules homologous to B2 sequences are expressed in undifferentiated embryonal carcinoma cells but not in their differentiated derivatives. The significance of these findings is discussed.
|
26
|
Sun L, Paulson KE, Schmid CW, Kadyk L, Leinwand L. Non-Alu family interspersed repeats in human DNA and their transcriptional activity. Nucleic Acids Res 1984; 12:2669-90. [PMID: 6546796 PMCID: PMC318698 DOI: 10.1093/nar/12.6.2669] [Citation(s) in RCA: 106] [Impact Index Per Article: 2.7] [Indexed: 11/14/2022]
Abstract
Randomly selected human genomic clones have been surveyed for the presence of non-Alu family interspersed repeats. Four such families of repeats have been isolated and characterized with respect to repetition frequency, interspersion, base sequence, sequence divergence, in vitro RNA polymerase III transcription, elongation of transcripts in isolated nuclei, and in vivo transcription. The two most abundant of the four families correspond to previously reported families of repeats, namely the Kpn I family and poly(CA). We conclude that most of the highly repetitive (greater than 50,000 copies) human interspersed repeats have already been identified. Two lower-abundance repeat families are also described here. The abundance with which each of these families is represented in nuclear RNA qualitatively corresponds to its genomic reiteration frequency. Further, the complementary strands of each repeat family are approximately symmetrically transcribed. The abundance of these repeats in cytoplasmic RNA is qualitatively less than in nuclear RNA. The bulk of the in vivo transcriptional activity of these repeats thus appears to be nonspecific read-through from other promoters.
|
27
|
Smith EJ, Bizub D, Scholl DR, Skalka AM. Characterization of a solitary long terminal repeat of avian endogenous virus origin. Virology 1984; 134:493-6. [PMID: 6545075 DOI: 10.1016/0042-6822(84)90319-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.4] [Indexed: 11/20/2022]
Abstract
A recombinant lambda phage library constructed with a partial EcoRI digest of DNA from a normal RPRL line 15B chicken was screened using a 32P-labeled plasmid containing Rous-associated virus (pRAV-2). Nucleotide sequence analyses of a fragment of one subclone revealed the presence of a solitary long terminal repeat (LTR) that is similar to the LTRs of avian endogenous retroviruses ev1 and ev2. This LTR is flanked by unique 6 bp direct repeats characteristic of the target site for duplication of avian leukosis viruses.
|
28
|
Frolova EI, Zalmanzon ES. A study of viral genomes in cells transformed by the nononcogenic human adenovirus type 5 and highly oncogenic bovine adenovirus type 3. Curr Top Microbiol Immunol 1984; 111:65-89. [PMID: 6488880 DOI: 10.1007/978-3-642-69549-0_3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 01/20/2023]
|
29
|
Dybvig K, Clark CD, Aliperti G, Schlesinger MJ. A chicken repetitive DNA sequence that is highly sensitive to single-strand specific endonucleases. Nucleic Acids Res 1983; 11:8495-508. [PMID: 6231528 PMCID: PMC326598 DOI: 10.1093/nar/11.23.8495] [Citation(s) in RCA: 39] [Impact Index Per Article: 1.0] [Indexed: 01/19/2023]
Abstract
A DNA sequence consisting of the 5-mer AGAGG repeated tandemly 32 times has been detected in a chicken genomic clone and found to be present in about 2000 copies per chicken genome. This sequence was highly susceptible to single-strand specific endonucleases isolated from Aspergillus oryzae (S1) and mung bean, but cleavage by a single-strand specific endonuclease isolated from Neurospora crassa occurred only at a pH below 5.5. Endonucleolytic cutting of the AGAGG sequence by the single-strand specific enzymes required a supercoiled substrate and was independent of ionic strength.
|
30
|
Gebhard W, Zachau HG. Simple DNA sequences and dispersed repetitive elements in the vicinity of mouse immunoglobulin K light chain genes. J Mol Biol 1983; 170:567-73. [PMID: 6313945 DOI: 10.1016/s0022-2836(83)80161-2] [Citation(s) in RCA: 26] [Impact Index Per Article: 0.6] [Indexed: 01/19/2023]
Abstract
The simple DNA sequences (T-G)20, (T-T-T-G-C)20 and (G-C-C-T-C-T)30 were found in the vicinity of mouse immunoglobulin genes and of dispersed repetitive elements such as the R, B1 and B2 sequences. On the basis of sequence data and of blot hybridizations with salmon and mouse DNA and with defined mouse DNA fragments, possible functional and evolutionary aspects of simple DNA sequences are discussed.
|