1
Copy number polymorphism in the α-globin gene cluster of European rabbit (Oryctolagus cuniculus). Heredity (Edinb) 2011; 108:531-6. [PMID: 22146981 DOI: 10.1038/hdy.2011.118] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 01/26/2023] Open
Abstract
Comparative genomic studies have revealed that mammals typically possess two or more tandemly duplicated copies of the α-globin (HBA) gene. The domestic rabbit represents an exception to this general rule, as this species was found to possess a single HBA gene. Previous electrophoretic surveys of HBA polymorphism in natural populations of the European rabbit (Oryctolagus cuniculus) revealed extensive geographic variation in the frequencies of three main electromorphs. The variation in frequency of two electromorphs is mainly partitioned between two distinct subspecies of European rabbit, and a third is restricted to the hybrid zone between the two rabbit subspecies in Iberia. Here we report the results of a survey of nucleotide polymorphism, which revealed HBA copy number polymorphism in Iberian populations of the European rabbit. By characterizing patterns of HBA polymorphism in populations from the native range of the European rabbit, we were able to identify the specific amino-acid substitutions that distinguish the previously characterized electromorphs. Within the hybrid zone, we observed the existence of a second HBA gene duplicate, named HBA2, that mostly represents a novel sequence haplotype, which occurs at higher frequency within the hybrid zone, and thus appears to have arisen in hybrids of the two distinct subspecies. Although this novel gene is also present in other wild Iberian populations, it is almost absent from French populations, which suggests a recent ancestry associated with the establishment of the post-Pleistocene contact zone between the two European rabbit subspecies.
2
Evidence for contrasting modes of selection at interacting globin genes in the European rabbit (Oryctolagus cuniculus). Heredity (Edinb) 2008; 100:602-9. [PMID: 18493260 DOI: 10.1038/hdy.2008.26] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/08/2022] Open
Abstract
In hybrid zones between genetically differentiated populations, variation in locus-specific rates of introgression may reflect adaptation to different environments or adaptation to different genetic backgrounds. The European rabbit, Oryctolagus cuniculus, is well-suited to studies of such hybrid zone dynamics because it is composed of two genetically divergent subspecies that hybridize in a zone of secondary contact in central Iberia. A species-wide survey of allozyme variation revealed a broad range of locus-specific divergence levels (F(ST) ranged from 0 to 0.54, mean F(ST)=0.16). Interestingly, the two loci that fell at opposite ends of the distribution of F(ST) values, haemoglobin alpha-chain (HBA) and haemoglobin beta-chain (HBB), encode interacting subunits of the haemoglobin protein. The contrasting patterns of spatial variation at these two loci could not be reconciled under a neutral model of population structure. The HBA gene exhibited higher-than-expected levels of population differentiation, consistent with a history of spatially varying selection. The HBB gene exhibited lower-than-expected levels of population differentiation, consistent with some form of spatially uniform selection. Patterns of linkage disequilibrium and allele frequency variation do not appear to fit any simple model of two-locus epistatic selection.
3
Wang Z, Wei GH, Liu DP, Liang CC. Unravelling the world of cis-regulatory elements. Med Biol Eng Comput 2007; 45:709-18. [PMID: 17541666 DOI: 10.1007/s11517-007-0195-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 11/26/2006] [Accepted: 05/03/2007] [Indexed: 12/16/2022]
Abstract
Genome-wide comparisons indicate that studying coding regions alone is not enough to explain the biological complexity of an organism; genetic variants and epigenetic differences in cis-regulatory elements are crucial for elucidating many complicated biological phenomena. Their various regulatory functions also play indispensable roles in forming organismal polymorphism. Recent studies have shown that cis-regulatory elements can regulate gene expression as nuclear organizers, take part in functional noncoding transcription, and produce regulatory noncoding RNA molecules. Novel high-throughput strategies and in silico analyses are making a large amount of data on cis-regulatory elements available. In particular, computational methods could help to combine reductionist studies with network-level biomedical investigations and begin an era of understanding organismal regulatory events at the systems biology level.
Affiliation(s)
- Zhao Wang
- National Laboratory of Medical Molecular Biology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences and Peking Union Medical College, Dong Dan San Tiao 5, 100005 Beijing, China
4
Pipkin ME, Lichtenheld MG. A reliable method to display authentic DNase I hypersensitive sites at long-ranges in single-copy genes from large genomes. Nucleic Acids Res 2006; 34:e34. [PMID: 16510851 PMCID: PMC1388096 DOI: 10.1093/nar/gkl006] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Indexed: 11/14/2022] Open
Abstract
The study of eukaryotic gene transcription depends on methods to discover distal cis-acting control sequences. Comparative bioinformatics is one powerful strategy to reveal these domains, but still requires conventional wet-bench techniques to elucidate their specificity and function. The DNase I hypersensitivity assay (DHA) is also a method to identify regulatory domains, but can also suggest their function. Technically however, the classical DHA is constrained to mapping gene loci in small increments of approximately 20 kb. This limitation hinders efficient and comprehensive analysis of distal gene regions. Here, we report an improved method termed mega-DHA that extends the range of existing DHAs to facilitate assaying intervals that approach 100 kb. We demonstrate its feasibility for efficient analysis of single-copy genes within a large and complex genome by assaying 230 kb of the human ADAMTS14-perforin-paladin gene cluster in four experiments. The results identify distinct networks of regulatory domains specific to expression of perforin and its two neighboring genes.
Affiliation(s)
- Matthew E. Pipkin
- Department of Microbiology and Immunology, University of Miami, Miller School of Medicine, Miami, FL, USA
- Mathias G. Lichtenheld
- Department of Microbiology and Immunology, University of Miami, Miller School of Medicine, Miami, FL, USA
- The Sylvester Comprehensive Cancer Center, University of Miami, Miller School of Medicine, Miami, FL, USA
- The Center for HIV Research, University of Miami, Miller School of Medicine, Miami, FL, USA
- To whom correspondence should be addressed at Mathias G. Lichtenheld, Department of Microbiology and Immunology, University of Miami Miller School of Medicine, 1580 N.W. 10th Avenue, Batchelor Children Research Institute Room 738, Miami, FL 33136, USA. Tel: +1 305 243 3301; Fax: +1 305 243 7211;
5
Abstract
The genomes from three mammals (human, mouse, and rat), two worms, and several yeasts have been sequenced, and more genomes will be completed in the near future for comparison with those of the major model organisms. Scientists have used various methods to align and compare the sequenced genomes to address critical issues in genome function and evolution. This review covers some of the major new insights about gene content, gene regulation, and the fraction of mammalian genomes that are under purifying selection and presumed functional. We review the evolutionary processes that shape genomes, with particular attention to variation in rates within genomes and along different lineages. Internet resources for accessing and analyzing the treasure trove of sequence alignments and annotations are reviewed, and we discuss critical problems to address in new bioinformatic developments in comparative genomics.
Affiliation(s)
- Webb Miller
- The Center for Comparative Genomics and Bioinformatics, The Huck Institutes of Life Sciences, Department of Biology, Pennsylvania State University, University Park, Pennsylvania, USA.
6
Abstract
The recent completion of the human genome sequence has enabled the identification of a large fraction of our gene catalogue and their physical chromosomal position. However, current efforts lag at defining the cis-regulatory sequences that control the spatial and temporal patterns of each gene's expression. This task remains difficult due to our lack of knowledge of the vocabulary controlling gene regulation and the vast genomic search space, with greater than 95% of our genome being noncoding. Recent comparative genomic-based strategies are beginning to aid in the identification of functional sequences based on their high levels of evolutionary conservation. This has proven successful for comparisons between closely related species such as human-primate or human-mouse, but also holds true for distant evolutionary comparisons, such as human-fish or human-bird. In this review we provide support for the utility of cross-species sequence comparisons by illustrating several applications of this strategy, including the identification of new genes and functional non-coding sequences. We also discuss emerging concepts as this field matures, such as how to properly select which species for comparison, which may differ significantly between independent studies.
Affiliation(s)
- Marcelo A Nobrega
- Genome Sciences Department, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA
7
Schwartz S, Elnitski L, Li M, Weirauch M, Riemer C, Smit A, Green ED, Hardison RC, Miller W. MultiPipMaker and supporting tools: Alignments and analysis of multiple genomic DNA sequences. Nucleic Acids Res 2003; 31:3518-24. [PMID: 12824357 PMCID: PMC168985 DOI: 10.1093/nar/gkg579] [Citation(s) in RCA: 169] [Impact Index Per Article: 8.0] [Indexed: 11/12/2022] Open
Abstract
Analysis of multiple sequence alignments can generate important, testable hypotheses about the phylogenetic history and cellular function of genomic sequences. We describe the MultiPipMaker server, which aligns multiple, long genomic DNA sequences quickly and with good sensitivity (available at http://bio.cse.psu.edu/ since May 2001). Alignments are computed between a contiguous reference sequence and one or more secondary sequences, which can be finished or draft sequence. The outputs include a stacked set of percent identity plots, called a MultiPip, comparing the reference sequence with subsequent sequences, and a nucleotide-level multiple alignment. New tools are provided to search MultiPipMaker output for conserved matches to a user-specified pattern and for conserved matches to position weight matrices that describe transcription factor binding sites (singly and in clusters). We illustrate the use of MultiPipMaker to identify candidate regulatory regions in WNT2 and then demonstrate by transfection assays that they are functional. Analysis of the alignments also confirms the phylogenetic inference that horses are more closely related to cats than to cows.
Affiliation(s)
- Scott Schwartz
- Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA 16802, USA
8
Hardison RC, Roskin KM, Yang S, Diekhans M, Kent WJ, Weber R, Elnitski L, Li J, O'Connor M, Kolbe D, Schwartz S, Furey TS, Whelan S, Goldman N, Smit A, Miller W, Chiaromonte F, Haussler D. Covariation in frequencies of substitution, deletion, transposition, and recombination during eutherian evolution. Genome Res 2003; 13:13-26. [PMID: 12529302 PMCID: PMC430971 DOI: 10.1101/gr.844103] [Citation(s) in RCA: 225] [Impact Index Per Article: 10.7] [Received: 09/26/2002] [Accepted: 11/14/2002] [Indexed: 11/24/2022]
Abstract
Six measures of evolutionary change in the human genome were studied, three derived from the aligned human and mouse genomes in conjunction with the Mouse Genome Sequencing Consortium, consisting of (1) nucleotide substitution per fourfold degenerate site in coding regions, (2) nucleotide substitution per site in relics of transposable elements active only before the human-mouse speciation, and (3) the nonaligning fraction of human DNA that is nonrepetitive or in ancestral repeats; and three derived from human genome data alone, consisting of (4) SNP density, (5) frequency of insertion of transposable elements, and (6) rate of recombination. Features 1 and 2 are measures of nucleotide substitutions at two classes of "neutral" sites, whereas 4 is a measure of recent mutations. Feature 3 is a measure dominated by deletions in mouse, whereas 5 represents insertions in human. It was found that all six vary significantly in megabase-sized regions genome-wide, and many vary together. This indicates that some regions of a genome change slowly by all processes that alter DNA, and others change faster. Regional variation in all processes is correlated with, but not completely accounted for by, GC content in human and the difference between GC content in human and mouse.
Affiliation(s)
- Ross C Hardison
- Department of Biochemistry, The Pennsylvania State University, University Park, Pennsylvania 16802, USA.
9
Chiaromonte F, Yang S, Elnitski L, Yap VB, Miller W, Hardison RC. Association between divergence and interspersed repeats in mammalian noncoding genomic DNA. Proc Natl Acad Sci U S A 2001; 98:14503-8. [PMID: 11717405 PMCID: PMC64711 DOI: 10.1073/pnas.251423898] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.3] [Indexed: 11/18/2022] Open
Abstract
The amount of noncoding genomic DNA sequence that aligns between human and mouse varies substantially in different regions of their genomes, and the amount of repetitive DNA also varies. In this report, we show that divergence in noncoding nonrepetitive DNA is strongly correlated with the amount of repetitive DNA in a region. We investigated aligned DNA in four large genomic regions with finished human sequence and almost or completely finished mouse sequence. These regions, totaling 5.89 Mb of DNA, are on different chromosomes and vary in their base composition. An analysis based on sliding windows of 10 kb shows that the fraction of aligned noncoding nonrepetitive DNA and the fraction of repetitive DNA are negatively correlated, both at the level of an entire region and locally within it. This conclusion is strongly supported by a randomization study, in which repetitive elements are removed and randomly relocated along the sequences. Thus, regions of noncoding genomic DNA that accumulated fewer point mutations since the primate-rodent divergence also suffered fewer retrotransposition events. These results indicate that some regions of the genome are more "flexible" over the time scale of mammalian evolution, being able to accommodate many point mutations and insertions, whereas other regions are more "rigid" and accumulate fewer changes. Stronger conservation is generally interpreted as indicating more extensive or more important function. The evidence presented here of correlated variation in the rates of different evolutionary processes across noncoding DNA must be considered in assessing such conservation for evidence of selection.
Affiliation(s)
- F Chiaromonte
- Department of Statistics, Pennsylvania State University, University Park, PA 16802, USA
10
Abstract
With the continuing accomplishments of the human genome project, high-throughput strategies to identify DNA sequences that are important in mammalian gene regulation are becoming increasingly feasible. In contrast to the historic, labour-intensive, wet-laboratory methods for identifying regulatory sequences, many modern approaches are heavily focused on the computational analysis of large genomic data sets. Data from inter-species genomic sequence comparisons and genome-wide expression profiling, integrated with various computational tools, are poised to contribute to the decoding of genomic sequence and to the identification of those sequences that orchestrate gene regulation. In this review, we highlight several genomic approaches that are being used to identify regulatory sequences in mammalian genomes.
Affiliation(s)
- L A Pennacchio
- Genome Sciences Department, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, California 94720, USA
11
Ferrand N, Azevedo M, Mougel F. A diallelic short tandem repeat (CCCCG)4 or 5, located in intron 1 of rabbit alpha-globin gene. Anim Genet 2000; 31:74-5. [PMID: 10690373 DOI: 10.1111/j.1365-2052.2000.579-9.x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/28/2022]
Affiliation(s)
- N Ferrand
- Departamento de Zoologia e Antropologia, Faculdade de Ciências, Universidade do Porto, Portugal.
12
Endrizzi M, Huang S, Scharf JM, Kelter AR, Wirth B, Kunkel LM, Miller W, Dietrich WF. Comparative sequence analysis of the mouse and human Lgn1/SMA interval. Genomics 1999; 60:137-51. [PMID: 10486205 DOI: 10.1006/geno.1999.5910] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.2] [Indexed: 11/22/2022]
Abstract
Human chromosome 5q11.2-q13.3 and its ortholog on mouse chromosome 13 contain candidate genes for an inherited human neurodegenerative disorder called spinal muscular atrophy (SMA) and for an inherited mouse susceptibility to infection with Legionella pneumophila (Lgn1). These homologous genomic regions also have unusual repetitive organizations that create practical difficulties in mapping and raise interesting issues about the evolutionary origin of the repeats. In an attempt to analyze this region in detail, and as a way to identify additional candidate genes for these diseases, we have determined the sequence of 179 kb of the mouse Lgn1/SMA interval. We have analyzed this sequence using BLAST searches and various exon prediction programs to identify potential genes. Since these methods can generate false-positive exon declarations, our alignments of the mouse sequence with available human orthologous sequence allowed us to discriminate rapidly among this collection of potential coding regions by indicating which regions were well conserved and were more likely to represent actual coding sequence. As a result of our analysis, we accurately mapped two additional genes in the SMA interval that can be tested for involvement in the pathogenesis of SMA. While no new Lgn1 candidates emerged, we have identified new genetic markers that exclude Smn as an Lgn1 candidate. In addition to providing important resources for studying SMA and Lgn1, our data provide further evidence of the value of sequencing the mouse genome as a means to help with the annotation of the human genomic sequence and vice versa.
Affiliation(s)
- M Endrizzi
- Department of Genetics, Harvard Medical School, 200 Longwood Avenue, Boston, Massachusetts 02115, USA
13
Abstract
BACKGROUND Nucleotide substitution rates and G + C content vary considerably among mammalian genes. It has been proposed that the mammalian genome comprises a mosaic of regions - termed isochores - with differing G + C content. The regional variation in gene G + C content might therefore be a reflection of the isochore structure of chromosomes, but the factors influencing the variation of nucleotide substitution rate are still open to question. RESULTS To examine whether nucleotide substitution rates and gene G + C content are influenced by the chromosomal location of genes, we compared human and murid (mouse or rat) orthologues known to belong to one of the chromosomal (autosomal) segments conserved between these species. Multiple members of gene families were excluded from the dataset. Sets of neighbouring genes were defined as those lying within 1 centiMorgan (cM) of each other on the mouse genetic map. For both synonymous substitution rates and G + C content at silent sites, neighbouring genes were found to be significantly more similar to each other than sets of genes randomly drawn from the dataset. Moreover, we demonstrated that the regional similarities in G + C content (isochores) and synonymous substitution rate were independent of each other. CONCLUSIONS Our results provide the first substantial statistical evidence for the existence of a regional variation in the synonymous substitution rate within the mammalian genome, indicating that different chromosomal regions evolve at different rates. This regional phenomenon which shapes gene evolution could reflect the existence of 'evolutionary rate units' along the chromosome.
Affiliation(s)
- G Matassi
- Institute of Genetics, University of Nottingham, Queens Medical Centre, Nottingham, NG7 2UH, UK.
14
Abstract
Transcriptional repression in eukaryotes often involves tens or hundreds of kilobase pairs, two to three orders of magnitude more than the bacterial operator/repressor model does. Classical repression, represented by this model, was maintained over the whole span of evolution under different guises, and consists of repressor factors interacting primarily with promoters and, in later evolution, also with enhancers. The use of much larger amounts of DNA in the other mode of repression, here called the sectorial mode ('superrepression'), results in the conceptual transfer of so-called junk DNA to the domain of functional DNA. This contribution to the solution of the c-value paradox involves perhaps 15% of genomic 'junk,' and encompasses the bulk of the introns, thought to fill a stabilizing role in sectorially repressed chromatin structures. In the case of developmental genes, such structures appear to be heterochromatoid in character. However, solid clues regarding general structural features of superrepressed terminal differentiation genes remain elusive. The competition among superrepressible DNA sectors for sectorially binding factors offers, in principle, a molecular mechanism for developmental switches. Position effect variegation may be considered an abnormal manifestation of normal processes that underlie development and involve heterochromatoid sectorial repression, which is apparently required for local elimination or modulation of morphological features (morpholysis). Sectorial repression of genes participating either in development or in terminal differentiation is considered instrumental in establishing stable cell types, and provides a basis for the distinction between determination and cell type specification. The gamut of possible stable cell types may have been broadened by the appearance in evolution of heavy isochores.
Additional types of relatively frequent GC-rich cis-acting DNA motifs may offer reiterated binding sites to factors endowed with a selective (though not individually strong) affinity for these motifs. The majority of sequence motifs thought to be used in superrepression need not be individually maintained by natural selection. It is re-emphasized that the dispensability of sequences is not an indicator of their nonfunctionality and that in many cases, along noncoding sequences, nucleotides tend to fill functions collectively, rather than individually.
Affiliation(s)
- E Zuckerkandl
- Institute of Molecular Medical Sciences, Palo Alto, CA 94306, USA
15
Shewchuk BM, Hardison RC. CpG islands from the alpha-globin gene cluster increase gene expression in an integration-dependent manner. Mol Cell Biol 1997; 17:5856-66. [PMID: 9315643 PMCID: PMC232433 DOI: 10.1128/mcb.17.10.5856] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.4] [Indexed: 02/05/2023] Open
Abstract
In contrast to other globin genes, the human and rabbit alpha-globin genes are expressed in transfected erythroid and nonerythroid cells in the absence of an enhancer. This enhancer-independent expression of the alpha-globin gene requires extensive sequences not only from the 5' flanking sequence but also from the intragenic region. However, the features of these internal sequences that are responsible for their positive effect are unclear. We tested several possible determinants of this activity. One possibility is that a previously identified array of discrete binding sites for known and potential regulatory proteins within the alpha-globin gene comprise an intragenic enhancer specific for the alpha-globin promoter, but directed rearrangements of the sequences show that this is not the case. Alternatively, the promoter may extend into the gene, with the function of the discrete binding sites being dependent on maintenance of their proper positions and orientations relative to the 5' flanking sequence. However, the positive effects observed in gene fusions do not localize to a discrete region of the alpha-globin gene and the results of internal deletions and point mutations argue against a required role of the targeted discrete binding sites. A third possibility is that the CpG island, which includes both the 5' flanking and intragenic regions associated with the positive activity, may itself have a more general effect on expression in transfected cells. Indeed, we show that the size of the CpG island in constructs correlates with the level of gene expression. Furthermore, the alpha-globin promoter is more active in the context of a previously inactive CpG island than in an A+T-rich context, showing that the CpG island provides an environment more permissive for expression. These effects are seen only after integration, suggesting a possible mechanism at the level of chromatin structure.
Affiliation(s)
- B M Shewchuk
- Department of Biochemistry and Molecular Biology, The Center for Gene Regulation, The Pennsylvania State University, University Park, 16802, USA
16
Hardison RC, Oeltjen J, Miller W. Long human-mouse sequence alignments reveal novel regulatory elements: a reason to sequence the mouse genome. Genome Res 1997; 7:959-66. [PMID: 9331366 DOI: 10.1101/gr.7.10.959] [Citation(s) in RCA: 209] [Impact Index Per Article: 7.7] [Indexed: 02/05/2023]
Affiliation(s)
- R C Hardison
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
17
Bailey AD, Shen CC, Shen CK. Molecular origin of the mosaic sequence arrangements of higher primate alpha-globin duplication units. Proc Natl Acad Sci U S A 1997; 94:5177-82. [PMID: 9144211 PMCID: PMC24652 DOI: 10.1073/pnas.94.10.5177] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.4] [Indexed: 02/04/2023] Open
Abstract
The human adult alpha-globin locus consists of three pairs of homology blocks (X, Y, and Z) interspersed with three nonhomology blocks (I, II, and III), and three Alu family repeats, Alu1, Alu2, and Alu3. It has been suggested that an ancient primate alpha-globin-containing unit was ancestral to the X, Y, and Z and the Alu1/Alu2 repeats. However, the evolutionary origin of the three nonhomologous blocks has remained obscure. We have now analyzed the sequence organization of the entire adult alpha-globin locus of gibbon (Hylobates lar). DNA segments homologous to human block I occur in both duplication units of the gibbon alpha-globin locus. Detailed interspecies sequence comparisons suggest that nonhomologous blocks I and II, as well as another sequence, IV, were all part of the ancestral alpha-globin-containing unit prior to its tandem duplication. However, sometime thereafter, block I was deleted from the human alpha1-globin-containing unit, and block II was also deleted from the alpha2-globin-containing unit in both human and gibbon. These were probably independent events both mediated by independent illegitimate recombination processes. Interestingly, the end points of these deletions coincide with potential insertion sites of Alu family repeats. These results suggest that the shaping of DNA segments in eukaryotic genomes involved the retroposition of repetitive DNA elements in conjunction with simple DNA recombination processes.
Affiliation(s)
- A D Bailey
- Section of Molecular and Cellular Biology, University of California, Davis, CA 95616, USA
18
James-Pederson M, Yost S, Shewchuk B, Zeigler T, Miller R, Hardison R. Flanking and intragenic sequences regulating the expression of the rabbit alpha-globin gene. J Biol Chem 1995; 270:3965-73. [PMID: 7533158 DOI: 10.1074/jbc.270.8.3965] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.5] [Indexed: 01/25/2023] Open
Abstract
Despite their descent from a common ancestral gene and the requirement for coordinated, tissue-specific regulation, the alpha- and beta-globin genes in many mammals are regulated in distinctly different ways. Unlike the beta-globin gene, the rabbit alpha-globin gene is transiently expressed at a high level without an added enhancer in transfected erythroid and non-erythroid cells. By examining a series of alpha/beta fusion genes, we show that internal sequences of the rabbit alpha-globin gene (within the first two exons and introns) are required along with the 5' flank for this enhancer-independent expression. Furthermore, deletion of the introns of the alpha-globin gene, or replacement by introns of the beta-globin gene, results in severely decreased expression of the transfecting genes. Hybrid constructs between segments of the alpha-globin gene and a luciferase gene confirm that internal alpha-globin sequences are needed for high level production of RNA in transfected cells. The flanking and internal sequences implicated in regulation of the rabbit alpha-globin gene coincide with a prominent CpG-rich island and may comprise an extended promoter (including both flanking and intragenic sequences) that is active in transfected cells without an enhancer.
Affiliation(s)
- M James-Pederson
- Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park 16802
|
19
|
Nuclear protein-binding sites in a transcriptional control region of the rabbit alpha-globin gene. Mol Cell Biol 1993. [PMID: 8355692 DOI: 10.1128/mcb.13.9.5439] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.5] [Indexed: 11/20/2022]
Abstract
The 5'-flanking and internal regions of the rabbit alpha-globin gene, which constitute a CpG island, are required for enhancer-independent expression in transfected cells. In this study, electrophoretic mobility shift assays revealed that a battery of nuclear proteins from both erythroid and nonerythroid cells bind specifically to these regulatory regions. Assays based on exonuclease III digestion, methylation interference, and DNase I footprinting identified sequences bound by proteins in crude nuclear extracts and by purified transcription factor Sp1. In the 5' flank, recognition sites for the transcription factors alpha-IRP (positions -53 to -44 relative to the cap site), CP1 (-73 to -69), and Sp1 (-95 to -90) are bound by proteins in K562 cell nuclear extracts, as are three extended upstream regions. Two recognition sites for Sp1 in intron 1 are also bound both by proteins in crude nuclear extracts and by purified Sp1. The sequences CCAC in intron 2 and C5 in the 3'-untranslated region also bind proteins. A major binding site found in exon 1, TATGGCGC, matches in sequence and methylation interference pattern the binding site for nuclear protein YY1, and binding is inhibited through competition by YY1-specific oligonucleotides. The protein-binding sites flanking and internal to the rabbit alpha-globin gene may form an extended promoter.
|
20
|
Yost SE, Shewchuk B, Hardison R. Nuclear protein-binding sites in a transcriptional control region of the rabbit alpha-globin gene. Mol Cell Biol 1993; 13:5439-49. [PMID: 8355692 PMCID: PMC360253 DOI: 10.1128/mcb.13.9.5439-5449.1993] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.3] [Indexed: 01/30/2023]
Affiliation(s)
- S E Yost
- Department of Molecular and Cell Biology, Pennsylvania State University, University Park 16802
|
21
|
Turker MS, Cooper GE, Bishop PL. Region-specific rates of molecular evolution: a fourfold reduction in the rate of accumulation of "silent" mutations in transcribed versus nontranscribed regions of homologous DNA fragments derived from two closely related mouse species. J Mol Evol 1993; 36:31-40. [PMID: 8433377 DOI: 10.1007/bf02407304] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.3] [Indexed: 01/30/2023]
Abstract
We have sequenced homologous DNA fragments of 2.7 and 2.8 kbp derived from the closely related mouse species Mus musculus domesticus (M. domesticus) and Mus musculus musculus (M. musculus), respectively. These two species diverged approximately 1 million years ago. Each DNA fragment contains 1.35 kbp of the 3' end of the constitutively expressed 2.2-kbp aprt (adenine phosphoribosyltransferase) gene and a similarly sized nontranscribed region downstream of the aprt gene. The aprt gene region contains protein coding sequences (0.35 kbp), intronic sequences (0.75 kbp), and a 3' nontranslated sequence (0.25 kbp). Both the M. domesticus and M. musculus downstream regions share three partial copies of the B1 repetitive element with the M. musculus downstream region containing an additional complete copy of this element. A comparison of the 2.7- and 2.8-kbp DNA fragments revealed a total of 63 molecular alterations (i.e., mutations) that were approximately fourfold more abundant in the nontranscribed downstream region than in the transcribed aprt gene. Of the 11 mutations observed in the transcribed region, 7 were found in introns, 3 in the 3' untranslated sequence, and 1 was a synonymous change in an exon. A comparison of the human and M. domesticus aprt genes has previously revealed no homology in either the intronic or 3' nontranslated regions with the exception of a 26-bp sequence in intron 3 and sequences at the exon/intron boundaries necessary for correct mRNA splicing (Broderick et al., Proc. Natl. Acad. Sci. USA, 84:3349, 1987). Therefore, there does not appear to be selective pressure for sequences within these regions. We conclude that there is a lower rate of accumulation of "silent" mutations in the transcribed mouse aprt gene than in a contiguous nontranscribed downstream region. A possible molecular mechanism involving preferential DNA repair for the transcribed region is discussed.
Affiliation(s)
- M S Turker
- Department of Pathology, University of Kentucky College of Medicine, Lexington 40536
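The "approximately fourfold" figure in the Turker et al. abstract above can be roughly checked from the counts it reports. The sketch below is only an illustration, not the authors' calculation: it assumes that all mutations not assigned to the transcribed region (63 - 11) fall in the nontranscribed downstream region, and that the downstream region is whatever remains of each fragment (2.7 or 2.8 kbp) after subtracting the 1.35 kbp of aprt sequence. All variable and function names are hypothetical.

```python
# Counts quoted in the abstract.
total_mutations = 63
transcribed_mutations = 11  # 7 intronic + 3 in the 3' untranslated sequence + 1 synonymous

# Assumption: every remaining mutation lies in the nontranscribed downstream region.
nontranscribed_mutations = total_mutations - transcribed_mutations

transcribed_kbp = 1.35  # transcribed aprt portion of each fragment, from the abstract

# Assumption: the downstream region is the rest of the 2.7- or 2.8-kbp fragment.
nontranscribed_kbp_options = (2.7 - transcribed_kbp, 2.8 - transcribed_kbp)


def density_ratio(nontranscribed_kbp: float) -> float:
    """Mutations per kbp in the nontranscribed region divided by the
    mutations per kbp in the transcribed region."""
    nontranscribed_density = nontranscribed_mutations / nontranscribed_kbp
    transcribed_density = transcribed_mutations / transcribed_kbp
    return nontranscribed_density / transcribed_density


ratios = [density_ratio(kbp) for kbp in nontranscribed_kbp_options]

for kbp, ratio in zip(nontranscribed_kbp_options, ratios):
    print(f"downstream region {kbp:.2f} kbp: density ratio {ratio:.2f}")
```

Under these assumptions the ratio of mutation densities comes out between about 4.4 and 4.7, which is in line with the abstract's "approximately fourfold" description.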
|