1
|
Halligan DL, Keightley PD. Ubiquitous selective constraints in the Drosophila genome revealed by a genome-wide interspecies comparison. Genome Res 2006; 16:875-84. [PMID: 16751341 PMCID: PMC1484454 DOI: 10.1101/gr.5022906]
Abstract
Non-coding DNA comprises approximately 80% of the euchromatic portion of the Drosophila melanogaster genome. Non-coding sequences are known to contain functionally important elements controlling gene expression, but the proportion of sites that are selectively constrained is still largely unknown. We have compared the complete D. melanogaster and Drosophila simulans genome sequences to estimate mean selective constraint (the fraction of mutations that are eliminated by selection) in coding and non-coding DNA by standardizing to substitution rates in putatively unconstrained sequences. We show that constraint is positively correlated with intronic and intergenic sequence length and is generally remarkably strong in non-coding DNA, implying that more than half of all point mutations in the Drosophila genome are deleterious. This fraction is also likely to be an underestimate if many substitutions in non-coding DNA are adaptively driven to fixation. We also show that substitutions in long introns and intergenic sequences are clustered, such that there is an excess of substitutions <8 bp apart and a deficit farther apart. These results suggest that there are blocks of constrained nucleotides, presumably involved in gene expression control, that are concentrated in long non-coding sequences. Furthermore, we infer that there is more than three times as much functional non-coding DNA as protein-coding DNA in the Drosophila genome. Most deleterious mutations therefore occur in non-coding DNA, and these may make an important contribution to a wide variety of evolutionary processes.
Affiliation(s)
- Daniel L Halligan
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom.
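The abstract above defines selective constraint as the fraction of mutations eliminated by selection, estimated by standardizing to substitution rates in putatively unconstrained sequences. A minimal sketch of that calculation follows, assuming constraint is computed as one minus the ratio of the observed to the neutral substitution rate. The function name and the input rates are hypothetical illustrations, not values or code from the paper.

```python
def selective_constraint(observed_rate: float, neutral_rate: float) -> float:
    """Fraction of mutations removed by selection, relative to a neutral standard.

    Assumes constraint = 1 - (observed rate / neutral rate).
    """
    if neutral_rate <= 0:
        raise ValueError("neutral_rate must be positive")
    return 1.0 - observed_rate / neutral_rate


# Hypothetical per-site substitution rates, chosen only for illustration.
neutral = 0.125     # putatively unconstrained reference sites
intergenic = 0.05   # some class of non-coding sites

constraint = selective_constraint(intergenic, neutral)
print(f"estimated constraint: {constraint:.2f}")
```

Under this definition a value near 0 means the sites evolve at about the neutral rate, and a value near 1 means almost all mutations are removed. A negative value would indicate faster-than-neutral evolution, which is one reason adaptive substitutions can make the estimate conservative, as the abstract notes.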
|
2
|
Comeron JM. Selective and mutational patterns associated with gene expression in humans: influences on synonymous composition and intron presence. Genetics 2004; 167:1293-304. [PMID: 15280243 PMCID: PMC1470943 DOI: 10.1534/genetics.104.026351]
Abstract
We report the results of a comprehensive study of the influence of gene expression on synonymous codons, amino acid composition, and intron presence and size in human protein-coding genes. First, in addition to a strong effect of isochores, we have detected the influence of transcription-associated mutational biases (TAMB) on gene composition. Genes expressed in different tissues show diverse degrees of TAMB, with genes expressed in testis showing the greatest influence. Second, the study of tissues with no evidence of TAMB reveals a consistent set of optimal synonymous codons favored in highly expressed genes. This result exposes the consequences of natural selection on synonymous composition to increase efficiency of translation in the human lineage. Third, overall amino acid composition of proteins closely resembles tRNA abundance but there is no difference in amino acid composition in differentially expressed genes. Fourth, there is a negative relationship between expression and CDS length. Significantly, this is observed only among genes with introns, suggesting that the cause for this relationship in humans cannot be associated only with costs of amino acid biosynthesis. Fifth, we show that broadly and highly expressed genes have more, although shorter, introns. The selective advantage for having more introns in highly expressed genes is likely counterbalanced by containment of transcriptional costs and a minimum exon size for proper splicing.
Affiliation(s)
- Josep M Comeron
- Department of Biological Sciences, University of Iowa, Iowa City, Iowa 52242, USA.
|
3
|
Gomulski LM, Brogna S, Babaratsas A, Gasperi G, Zacharopoulou A, Savakis C, Bourtzis K. Molecular Basis of the Size Polymorphism of the First Intron of the Adh-1 Gene of the Mediterranean Fruit Fly, Ceratitis capitata. J Mol Evol 2004; 58:732-42. [PMID: 15461430 DOI: 10.1007/s00239-004-2596-9]
Abstract
The first intron of the gene encoding one of the alcohol dehydrogenase isoenzymes (ADH-1) in Ceratitis capitata is highly polymorphic in size. Five size variants of this intron were isolated from different strains and populations and characterized. Restriction map and sequence analysis showed that the intron size polymorphism is due to the presence or absence of (a) a copy of a defective mariner-like element, postdoc; (b) an approximately 550-bp 3' indel which exhibits no similarity to any known sequence; and (c) a central duplication of 704 bp consisting of part of the 3' end of the postdoc element, the region between postdoc and the 3' indel, and the first 20 bp of the 3' indel. The homologous Adh-1 intron was amplified from the congeneric species, Ceratitis rosa, in order to obtain an outgroup for comparative and phylogenetic analyses. The C. rosa introns were polymorphic in size, ranging from about 1100 to 2000 bp, the major difference between them being the presence or absence of a mariner-like element Crmar2, unrelated to the postdoc element. Phylogenetic analysis suggests that the shorter intron variants in C. capitata may represent the ancestral form of the intron, the longest variants apparently being the most recent.
Affiliation(s)
- Ludvik M Gomulski
- Department of Animal Biology, University of Pavia, Piazza Botta 9, I-27100 Pavia, Italy
|
4
|
Ptak SE, Petrov DA. How intron splicing affects the deletion and insertion profile in Drosophila melanogaster. Genetics 2002; 162:1233-44. [PMID: 12454069 PMCID: PMC1462315 DOI: 10.1093/genetics/162.3.1233]
Abstract
Studies of "dead-on-arrival" transposable elements in Drosophila melanogaster found that deletions outnumber insertions approximately 8:1 with a median size for deletions of approximately 10 bp. These results are consistent with the deletion and insertion profiles found in most other Drosophila pseudogenes. In contrast, a recent study of D. melanogaster introns found a deletion/insertion ratio of 1.35:1, with 84% of deletions being shorter than 10 bp. This discrepancy could be explained if deletions, especially long deletions, are more frequently strongly deleterious than insertions and are eliminated disproportionately from intron sequences. To test this possibility, we use analysis and simulations to examine how deletions and insertions of different lengths affect different components of splicing and determine the distribution of deletions and insertions that preserve the original exons. We find that, consistent with our predictions, longer deletions affect splicing at a much higher rate compared to insertions and short deletions. We also explore other potential constraints in introns and show that most of these also disproportionately affect large deletions. Altogether we demonstrate that constraints in introns may explain much of the difference in the pattern of deletions and insertions observed in Drosophila introns and pseudogenes.
Affiliation(s)
- Susan E Ptak
- Department of Biological Sciences, Stanford University, California 94305, USA.
|
5
|
Comeron JM, Kreitman M. The correlation between intron length and recombination in Drosophila. Dynamic equilibrium between mutational and selective forces. Genetics 2000; 156:1175-90. [PMID: 11063693 PMCID: PMC1461334 DOI: 10.1093/genetics/156.3.1175]
Abstract
Intron length is negatively correlated with recombination in both Drosophila melanogaster and humans. This correlation is not likely to be the result of mutational processes alone: evolutionary analysis of intron length polymorphism in D. melanogaster reveals equivalent ratios of deletion to insertion in regions of high and low recombination. The polymorphism data do reveal, however, an excess of deletions relative to insertions (i.e., a deletion bias), with an overall deletion-to-insertion events ratio of 1.35. We propose two types of selection favoring longer intron lengths. First, the natural mutational bias toward deletion must be opposed by strong selection in very short introns to maintain the minimum intron length needed for the intron splicing reaction. Second, selection will favor insertions in introns that increase recombination between mutations under the influence of selection in adjacent exons. Mutations that increase recombination, even slightly, will be selectively favored because they reduce interference among selected mutations. Interference selection acting on intron length mutations must be very weak, as indicated by frequency spectrum analysis of Drosophila intron length polymorphism, making the equilibrium for intron length sensitive to changes in the recombinational environment and population size. One consequence of this sensitivity is that the advantage of longer introns is expected to decrease inversely with the rate of recombination, thus leading to a negative correlation between intron length and recombination rate. Also in accord with this model, intron length differs between closely related Drosophila species, with the longest variant present more often in D. melanogaster than in D. simulans. We suggest that the study of the proposed dynamic model, taking into account interference among selected sites, might shed light on many aspects of the comparative biology of genome sizes including the C value paradox.
Affiliation(s)
- J M Comeron
- Department of Ecology and Evolution, University of Chicago, Chicago, Illinois 60637, USA.
|
6
|
Intron-exon structures. ACTA ACUST UNITED AC 1998. [DOI: 10.1016/s1067-5701(98)80020-x]
|
7
|
Ala-Kokko L, Kvist AP, Metsäranta M, Kivirikko KI, de Crombrugghe B, Prockop DJ, Vuorio E. Conservation of the sizes of 53 introns and over 100 intronic sequences for the binding of common transcription factors in the human and mouse genes for type II procollagen (COL2A1). Biochem J 1995; 308 (Pt 3):923-9. [PMID: 8948452 PMCID: PMC1136812 DOI: 10.1042/bj3080923]
Abstract
Over 11,000 bp of previously undefined sequences of the human COL2A1 gene were defined. The results made it possible to compare the intron structures of a highly complex gene from man and mouse. Surprisingly, the sizes of the 53 introns of the two genes were highly conserved with a mean difference of 13%. After alignment of the sequences, 69% of the intron sequences were identical. The introns contained consensus sequences for the binding of over 100 different transcription factors that were conserved in the introns of the two genes. The first intron of the gene contained 80 conserved consensus sequences and the remaining 52 introns of the gene contained 106 conserved sequences for the binding of transcription factors. The 5'-end of intron 2 in both genes had a potential for forming a stem loop in RNA transcripts.
Affiliation(s)
- L Ala-Kokko
- Collagen Research Unit, University of Oulu, Finland
|
8
|
Wälchli C, Koller E, Trueb J, Trueb B. Structural comparison of the chicken genes for alpha 1(VI) and alpha 2(VI) collagen. Eur J Biochem 1992; 205:583-9. [PMID: 1572359 DOI: 10.1111/j.1432-1033.1992.tb16816.x]
Abstract
The chicken alpha 1(VI) polypeptide is encoded by a single gene spanning 21 kbp of genomic DNA. This gene is composed of 34 exons and 33 introns. Its structure is closely related to that of the alpha 2(VI) collagen gene, suggesting that the two genes evolved by gene duplication. Both genes contain 19 exons coding for the triple-helical domain. These exons are multiples of 9 bp (27, 36, 45, 54, 63 and 90 bp) and encode an integral number of collagenous Gly-Xaa-Yaa triplets. Since there is no convincing correlation to a building block of 54 bp, it is unlikely that type VI collagen has evolved from a primordial 54-bp module as suggested for all fibrillar collagens.
Affiliation(s)
- C Wälchli
- Laboratorium für Biochemie I, Eidgenössische Technische Hochschule, Zürich, Switzerland
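The exon-size argument in the abstract above can be checked with simple arithmetic: one Gly-Xaa-Yaa triplet is three amino acids, so it takes 9 bp to encode. The sketch below uses the exon sizes listed in the abstract; the variable and function names are illustrative assumptions. It counts the triplets per exon and shows which sizes are whole multiples of the proposed 54-bp module.

```python
# Exon sizes (bp) listed in the abstract for the triple-helical domain.
EXON_SIZES_BP = [27, 36, 45, 54, 63, 90]

BP_PER_TRIPLET = 9   # Gly-Xaa-Yaa = 3 codons x 3 bp
MODULE_BP = 54       # primordial module proposed for fibrillar collagens


def triplets_in_exon(size_bp: int) -> int:
    """Number of whole Gly-Xaa-Yaa triplets encoded by an exon of this size."""
    if size_bp % BP_PER_TRIPLET != 0:
        raise ValueError(f"{size_bp} bp is not a multiple of {BP_PER_TRIPLET} bp")
    return size_bp // BP_PER_TRIPLET


triplets = {size: triplets_in_exon(size) for size in EXON_SIZES_BP}
multiples_of_54 = [size for size in EXON_SIZES_BP if size % MODULE_BP == 0]

print(triplets)
print(multiples_of_54)
```

Every listed size encodes a whole number of triplets, but only the 54-bp exon itself is a multiple of 54 bp. This matches the authors' point that the exon sizes do not point to a 54-bp building block.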
|
9
|
Nah H, Upholt W. Type II collagen mRNA containing an alternatively spliced exon predominates in the chick limb prior to chondrogenesis. J Biol Chem 1991. [DOI: 10.1016/s0021-9258(18)54517-8]
|
10
|
Goldfarb LG, Brown P, McCombie WR, Goldgaber D, Swergold GD, Wills PR, Cervenakova L, Baron H, Gibbs CJ, Gajdusek DC. Transmissible familial Creutzfeldt-Jakob disease associated with five, seven, and eight extra octapeptide coding repeats in the PRNP gene. Proc Natl Acad Sci U S A 1991; 88:10926-30. [PMID: 1683708 PMCID: PMC53045 DOI: 10.1073/pnas.88.23.10926]
Abstract
The PRNP gene, encoding the amyloid precursor protein that is centrally involved in Creutzfeldt-Jakob disease (CJD), has an unstable region of five variant tandem octapeptide coding repeats between codons 51 and 91. We screened a total of 535 individuals for the presence of extra repeats in this region, including patients with sporadic and familial forms of spongiform encephalopathy, members of their families, other neurological and non-neurological patients, and normal controls. We identified three CJD families (in each of which the proband's disease was neuropathologically confirmed and experimentally transmitted to primates) that were heterozygous for alleles with 10, 12, or 13 repeats, some of which had "wobble" nucleotide substitutions. We also found one individual with 9 repeats and no nucleotide substitutions who had no evidence of neurological disease. These observations, together with data on published British patients with 11 and 14 repeats, strongly suggest that the occurrence of 10 or more octapeptide repeats in the encoded amyloid precursor protein predisposes to CJD.
Affiliation(s)
- L G Goldfarb
- Laboratory of Central Nervous System Studies, National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD 20892
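The repeat counts in the title and the abstract above are related by simple addition: the normal allele carries five octapeptide coding repeats, so five, seven, and eight extra repeats correspond to alleles with 10, 12, and 13 repeats. The short sketch below only restates that arithmetic. The constant names are illustrative assumptions.

```python
NORMAL_REPEATS = 5                 # repeats in the usual PRNP allele (from the abstract)
EXTRA_REPEATS = [5, 7, 8]          # extra repeats named in the title
REPORTED_ALLELES = [10, 12, 13]    # allele sizes reported in the abstract

# Total repeats per allele = normal count + extra repeats.
totals = [NORMAL_REPEATS + extra for extra in EXTRA_REPEATS]

print(totals)
print(totals == REPORTED_ALLELES)
```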
|
11
|
Sandell LJ, Morris N, Robbins JR, Goldring MB. Alternatively spliced type II procollagen mRNAs define distinct populations of cells during vertebral development: differential expression of the amino-propeptide. J Cell Biol 1991; 114:1307-19. [PMID: 1894696 PMCID: PMC2289128 DOI: 10.1083/jcb.114.6.1307]
Abstract
Type II collagen is a major component of cartilage providing structural integrity to the tissue. Type II procollagen can be expressed in two forms by differential splicing of the primary gene transcript. The two mRNAs either include (type IIA) or exclude (type IIB) an exon (exon 2) encoding the major portion of the amino (NH2)-propeptide (Ryan, M. C., and L. J. Sandell. 1990. J. Biol. Chem. 265:10334-10339). The expression of the two procollagens was examined in order to establish a potential functional significance for the two type II procollagen mRNAs. First, to establish whether the two mRNAs are functional, we showed that both mRNAs can be translated and the proteins secreted into the extracellular environment. Both proteins were identified as type II procollagens. Secondly, to test the hypothesis that differential expression of type II procollagens may be a marker for a distinct population of cells, specific procollagen mRNAs were localized in tissue by in situ hybridization to oligonucleotides spanning the exon junctions. Embryonic vertebral column was chosen as a source of tissue undergoing rapid chondrogenesis, allowing the examination of a variety of cell types related to cartilage. In this tissue, each procollagen mRNA had a distinct tissue distribution during chondrogenesis with type IIB expressed in chondrocytes and type IIA expressed in cells surrounding cartilage in prechondrocytes. The morphology of the cells expressing the two collagen types was distinct: the cells expressing type IIA are narrow, elongated, and "fibroblastic" in appearance while the cells expressing type IIB are large and round. The expression of type IIB appears to be correlated with abundant synthesis and accumulation of cartilagenous extracellular matrix. The expression of type IIB is spatially correlated with the high level expression of the cartilage proteoglycan, aggrecan, establishing type IIB procollagen and aggrecan as markers for the chondrocyte phenotype.
Transcripts of type II collagen, primarily type IIA, are also expressed in embryonic spinal ganglion. While small amounts of type II collagen have been previously detected in noncartilagenous tissues, the detection of this new form of the collagen in relatively high abundance in embryonic nerve tissue is unique. Taken together, these findings imply a potential functional difference between type IIA and type IIB procollagens and indicate that the removal of exon 2 from the pre-mRNA, and consequently the NH2-propeptide from the collagen molecule, may be an important step in chondrogenesis. In addition, type II procollagen, specifically type IIA, may function in noncartilage tissues, particularly during development.
Affiliation(s)
- L J Sandell
- Department of Orthopaedics, University of Washington, Seattle
|
12
|
Metsäranta M, Toman D, de Crombrugghe B, Vuorio E. Mouse type II collagen gene. Complete nucleotide sequence, exon structure, and alternative splicing. J Biol Chem 1991. [DOI: 10.1016/s0021-9258(18)55382-5]
|
13
|
Huang MC, Seyer JM, Thompson JP, Spinella DG, Cheah KS, Kang AH. Genomic organization of the human procollagen alpha 1(II) collagen gene. Eur J Biochem 1991; 195:593-600. [PMID: 1999183 DOI: 10.1111/j.1432-1033.1991.tb15742.x]
Abstract
The nucleotide sequence of the human procollagen alpha 1(II) collagen gene extending from within the first intron through exon 15, and part of the 15th intron, has been determined. This sequence analysis (7056 bases) identifies the intron/exon organization of the region of this gene encoding the N-propeptide and part of the triple-helical domain. Structural comparison of this with the genes of other human fibrillar collagens shows considerable diversity in terms of the size and number of introns and exons that encode the N-propeptide domain. Although the genomic structure of the human procollagen alpha 1(II) gene is quite different from the rat procollagen alpha 1(II) gene, the nucleotide coding sequences are 89% identical.
Affiliation(s)
- M C Huang
- Department of Medicine, University of Tennessee
|
14
|
Cheah KS, Au PK, Lau ET, Little PF, Stubbs L. The mouse Col2a-1 gene is highly conserved and is linked to Int-1 on chromosome 15. Mamm Genome 1991; 1:171-83. [PMID: 1797232 DOI: 10.1007/bf00351064]
Abstract
Type II collagen is the major extracellular matrix component of cartilage and correct expression of the alpha 1(II) collagen gene is important for vertebrate skeletal development. In order to provide the basis for studying the control of type II collagen gene expression in embryogenesis and in mouse models of human connective tissue disease, the complete mouse Col2a-1 gene has been isolated in a single cosmid clone, cosMco1.2, and partially characterized. The gene is approximately 30 kb and is highly conserved in exon/intron structure and nucleotide and amino acid sequence (greater than 80% homology) when compared with the human, rat, bovine and chicken equivalents. A high degree of conservation was also found in the 5' flanking region of the rat, human and mouse alpha 1(II) collagen genes, including the presence of several G + C and C + T rich, direct repeat motifs. The sites of transcription start, termination codon and polyadenylation have also been identified. Unlike chicken, bovine and human, where polyA attachment is at a single site, for the mouse Col2a-1 gene two polyadenylation sites are utilized. Col2a-1 has also been localized by interspecies backcross analysis to the central portion of mouse Chromosome (Chr) 15, approximately 8 centiMorgans (cM) proximal of Int-1 and 18 cM distal of Myc. Col2a-1 is therefore included in a linkage group which is conserved on human Chr 12q.
Affiliation(s)
- K S Cheah
- Department of Biochemistry, Hong Kong University
|
15
|
Tryggvason K, Soininen R, Hostikka SL, Ganguly A, Huotari M, Prockop DJ. Structure of the human type IV collagen genes. Ann N Y Acad Sci 1990; 580:97-111. [PMID: 2186699 DOI: 10.1111/j.1749-6632.1990.tb17922.x]
|
16
|
Baldwin CT, Reginato AM, Smith C, Jimenez SA, Prockop DJ. Structure of cDNA clones coding for human type II procollagen. The alpha 1(II) chain is more similar to the alpha 1(I) chain than two other alpha chains of fibrillar collagens. Biochem J 1989; 262:521-8. [PMID: 2803268 PMCID: PMC1133299 DOI: 10.1042/bj2620521]
Abstract
Overlapping cDNA clones were isolated for human type II procollagen. Nucleotide sequencing of the clones provided over 2.5 kb of new coding sequences for the human pro alpha 1(II) gene and the first complete amino acid sequence of type II procollagen from any species. Comparison with published data for cDNA clones covering the entire lengths of the human type I and type III procollagens made it possible to compare in detail the coding sequences and primary structures of the three most abundant human fibrillar collagens. The results indicated that the marked preference in the third base codons for glycine, proline and alanine previously seen in other fibrillar collagens was maintained in type II procollagen. The domains of the pro alpha 1(II) chain are about the same size as the same domains of the pro alpha chains of type I and type III procollagens. However, the major triple-helical domain is 15 amino acid residues less than the triple-helical domain of type III procollagen. Comparison of hydropathy profiles indicated that the alpha chain domain of type II procollagen is more similar to the alpha chain domain of the pro alpha 1(I) chain than to the pro alpha 2(I) chain or the pro alpha 1(III) chain. The results therefore suggest that selective pressure in the evolution of the pro alpha 1(II) and pro alpha 1(I) genes is more similar than the selective pressure in the evolution of the pro alpha 2(I) and pro alpha 1(III) genes.
Affiliation(s)
- C T Baldwin
- Department of Biochemistry and Molecular Biology, Jefferson Medical College, Thomas Jefferson University, Philadelphia, PA 19107
|
17
|
Soininen R, Huotari M, Ganguly A, Prockop DJ, Tryggvason K. Structural Organization of the Gene for the α1 Chain of Human Type IV Collagen. J Biol Chem 1989. [DOI: 10.1016/s0021-9258(18)80034-5]
|
18
|
Abstract
The structure of the type II collagen gene is extremely well conserved but contains a cluster of high frequency polymorphisms in a 2.2 kb area. Here we report the nucleotide sequence of this DNA area, essential for the PCR-facilitated RFLP-analyses of this gene. In the structural analyses we found four differences in the deduced human amino acid sequence when compared to the published bovine amino acid sequence. The donor and acceptor signals and branch point signals required for the splicing events were in agreement with mammalian consensus sequences. The frequency of inverted repeats which could provoke the DNA strand to loop formation and consequently to deletion mutations did not differ from that found in other sequenced genes coding for fibrillar collagens.
Affiliation(s)
- M Vikkula
- Laboratory of Molecular Genetics, National Public Health Institute, Helsinki, Finland
|
19
|
Abstract
Collagens are a structurally and functionally heterogeneous group of proteins encoded by a family of genes that share evolutionary history. Collagen gene expression is regulated both in a developmental, tissue-specific manner and in response to a variety of biologic and pharmacologic inducers. In the present review we have attempted to synthesize a conceptual overview of the available information from studies aimed at deciphering the molecular mechanisms of collagen gene expression. We have chosen to focus our discussion mainly, although not exclusively, on observations relating to the type I collagen gene for a number of practical reasons. The underlying theme that emerges from this survey of the literature is that the regulation of collagen gene expression is complex, utilizing transcriptional, posttranscriptional and translational mechanisms. Although the transcriptional control mechanisms that involve activation and modulation of collagen gene transcription by RNA polymerase II appear to predominate, preferential stabilization of collagen mRNAs and modulation of translational discrimination appear to play significant roles in the regulation of collagen biosynthesis under some physiological situations. Molecular organization of the regulatory regions of collagen genes reveals a mosaic of subdomains with overlapping sequence motifs, involved in positive and negative transcriptional regulation. The precise identity of the cis-acting subdomains of the promoter/enhancer-proximal DNA of collagen genes and how they interact with the trans-acting nuclear protein(s) have yet to be elucidated and will remain the focus of future studies.
Affiliation(s)
- R Raghow
- Department of Pharmacology, University of Tennessee, Memphis
|
21
|
Abstract
A single Ly-5 gene is known to generate a variety of transmembrane glycoprotein isoforms that distinguish various cell lineages and stages of differentiation within the hematopoietic developmental compartment of the mouse. Systems homologous to Ly-5 are known in rats and in humans. The complete exon-intron organization of the Ly-5 gene is described in this report. The Ly-5 gene occupies about 120 kilobases of chromosome 1 and comprises 34 exons, of which 32 (Ex-3 to Ex-34) are protein coding. Ex-1, Ex-2, and parts of Ex-3 and Ex-34 are untranslated. In all cDNA clones examined, either Ex-1 or Ex-2 was represented, but not both, implying that Ex-1 and Ex-2 in Ly-5 mRNA may be mutually exclusive. Primer extension and S1 nuclease protection mapping were used to identify initiation (cap) sites for transcription. The finding of putative cap sites for Ex-1 and Ex-2, and of corresponding TATA-like sequences, suggests the presence of two promoters. In both Ex-1+ and Ex-2+ cDNA clones the next exon is Ex-3, which has a translation-initiating codon. The intron between Ex-3 and Ex-4 is unusually long, about 50 kilobases. Evidence is given that Ex-5, like Ex-6 and Ex-7 (studied previously), is another alternative exon that is selectively programmed, alone or together with Ex-6 or Ex-7 or both, to generate actual or potential Ly-5 isoforms by alternative splicing.
Affiliation(s)
- Y Saga
- Memorial Sloan-Kettering Cancer Center, New York, New York 10021
|
22
|
Nah HD, Rodgers BJ, Kulyk WM, Kream BE, Kosher RA, Upholt WB. In situ hybridization analysis of the expression of the type II collagen gene in the developing chicken limb bud. Coll Relat Res 1988; 8:277-94. [PMID: 2850886 DOI: 10.1016/s0174-173x(88)80001-3]
Abstract
In situ hybridization with [32P]- or [35S]-labeled double-stranded DNA or single-stranded RNA probes was used to investigate the temporal and spatial distribution of cartilage-characteristic type II collagen mRNA during embryonic chick limb development and cartilage differentiation in vivo. When the type II collagen probes were hybridized to sections through embryonic limb buds at the earliest stages of their development (stages 18-25), an accumulation of silver grains representing type II collagen mRNA first became detectable in the proximal central core of the limb coincident with the prechondrogenic condensation of mesenchymal cells that characterizes the onset of cartilage differentiation. At later stages of development (stage 32; 7 days) intense hybridization signals with the type II collagen probes were localized over the well differentiated cartilage rudiments, whereas few or no silver grains above background were observed over the non-chondrogenic tissues. In contrast, sections hybridized with a probe complementary to mRNA for the alpha 1 chain of type I collagen exhibited an intense hybridization signal over the perichondrium and little or no signal over the cartilage primordia. At all stages of development examined, [32P]-labeled double-stranded DNA probes or single-stranded RNA probes labeled with either [32P] or [35S] provided adequate hybridization signals. Several experimental protocols were employed to control for the potential cross-hybridization and non-specific hybridization of the type II collagen probes. These included the utilization of labeled noncomplementary "sense-strand" type II collagen RNA as a control probe for nonspecific background, and prehybridization with a large excess of appropriate unlabeled RNA to block sequences in heterologous collagen RNAs that might cross-hybridize to the specific labeled probe.
Affiliation(s)
- H D Nah
- Department of Bio Structure and Function, University of Connecticut Health Center, Farmington 06032
|
23
|
Abstract
In considering the origin and evolution of proteins, the possibility that proteins evolved from exons coding for specific structure-function modules is attractive for its economy and simplicity but is not systematically supported by the available data. However, the number of correspondences between exons and units of protein structure-function that have so far been identified appears to be greater than expected by chance alone. The available data also show (i) that exons are fairly limited in size but are large enough to specify structure-function modules in proteins; (ii) that the position of introns for homologous domains in the same gene is reasonably stable, but there is also evidence for mechanisms that alter the position or existence of introns; and (iii) that it is possible that the observed relationship of exons to protein structure represents a degenerate state of an ancestral correspondence between exons and structure-function modules in proteins.
Affiliation(s)
- T W Traut
- Department of Biochemistry, University of North Carolina School of Medicine, Chapel Hill 27599-7260
|
24
|
Nardelli D, van het Schip FD, Gerber-Huber S, Haefliger JA, Gruber M, Ab G, Wahli W. Comparison of the organization and fine structure of a chicken and a Xenopus laevis vitellogenin gene. J Biol Chem 1987. [DOI: 10.1016/s0021-9258(18)47735-6] [Citation(s) in RCA: 27] [Impact Index Per Article: 0.7] [Indexed: 10/22/2022] Open
|
25
|
Qian S, Zhang JY, Kay MA, Jacobs-Lorena M. Structural analysis of the Drosophila rpA1 gene, a member of the eucaryotic 'A' type ribosomal protein family. Nucleic Acids Res 1987; 15:987-1003. [PMID: 3103101 PMCID: PMC340503 DOI: 10.1093/nar/15.3.987] [Citation(s) in RCA: 66] [Impact Index Per Article: 1.7] [Indexed: 01/04/2023] Open
Abstract
The expression of ribosomal protein (r-protein) genes is uniquely regulated at the translational level during early development of Drosophila. Here we report results of a detailed analysis of the r-protein rpA1 gene. A cloned DNA sequence coding for rpA1 has been identified by hybrid-selected translation and amino acid composition analysis. The rpA1 gene was localized to polytene chromosome band 53CD. The nucleotide sequence of the rpA1 gene and its cDNA have been determined. rpA1 is a single copy gene and sequence comparison between the gene and its cDNA indicates that this r-protein gene is intronless. Allelic restriction site polymorphisms outside of the gene were observed, while the coding sequence is well conserved between two Drosophila strains. The protein has unusual domains rich in Ala and charged residues. The rpA1 is homologous to the "A" family of eucaryotic acidic r-proteins which are known to play a key role in the initiation and elongation steps of protein synthesis.
|
26
|
Affiliation(s)
- A Danchin
- Unité de Régulation de l'Expression Génétique, Institut Pasteur, Paris, France
|
27
|
Morgan WR, Ward DC. Three splicing patterns are used to excise the small intron common to all minute virus of mice RNAs. J Virol 1986; 60:1170-4. [PMID: 3783817 PMCID: PMC253380 DOI: 10.1128/jvi.60.3.1170-1174.1986] [Citation(s) in RCA: 68] [Impact Index Per Article: 1.7] [Indexed: 01/07/2023] Open
Abstract
We identified three splicing patterns used to excise the small intron common to all three transcripts encoded by minute virus of mice. Sequence analysis of minute virus of mice-specific cDNAs indicated that two donor and two acceptor splice sites were used: in pattern 1, the most frequent, nucleotide 2280 was spliced to nucleotide 2377; in pattern 2, nucleotides 2317 and 2399 were joined. Oligonucleotide probes, each specific for one of the four possible splice junction sequences, were synthesized and hybridized to viral mRNAs immobilized on nitrocellulose filters. The probes specific for splice patterns 1 and 2 hybridized to all three viral mRNAs, as did a third oligomer specific for a splicing pattern in which nucleotides 2280 and 2399 were joined. The fourth potential splicing pattern, linking nucleotides 2317 and 2377, was not detected. The presence of three splicing patterns in the transcripts designated R2 and R3 would allow the translation of five distinct polypeptides from these two mRNAs.
|