1
Whitelaw CA, Barbazuk WB, Pertea G, Chan AP, Cheung F, Lee Y, Zheng L, van Heeringen S, Karamycheva S, Bennetzen JL, SanMiguel P, Lakey N, Bedell J, Yuan Y, Budiman MA, Resnick A, Van Aken S, Utterback T, Riedmuller S, Williams M, Feldblyum T, Schubert K, Beachy R, Fraser CM, Quackenbush J. Enrichment of gene-coding sequences in maize by genome filtration. Science 2004; 302:2118-20. [PMID: 14684821 DOI: 10.1126/science.1090047] [Citation(s) in RCA: 171] [Impact Index Per Article: 8.6] [Indexed: 11/02/2022]
Abstract
Approximately 80% of the maize genome comprises highly repetitive sequences interspersed with single-copy, gene-rich sequences, and standard genome sequencing strategies are not readily adaptable to this type of genome. Methodologies that enrich for genic sequences might more rapidly generate useful results from complex genomes. Equivalent numbers of clones from maize selected by techniques called methylation filtering and High C0t selection were sequenced to generate approximately 200,000 reads (approximately 132 megabases), which were assembled into contigs. Combination of the two techniques resulted in a sixfold reduction in the effective genome size and a fourfold increase in the gene identification rate in comparison to a nonenriched library.
Affiliation(s)
- C A Whitelaw
- The Institute for Genomic Research (TIGR), 9712 Medical Center Drive, Rockville, MD 20850, USA
2
Yan L, Echenique V, Busso C, SanMiguel P, Ramakrishna W, Bennetzen JL, Harrington S, Dubcovsky J. Cereal genes similar to Snf2 define a new subfamily that includes human and mouse genes. Mol Genet Genomics 2002; 268:488-99. [PMID: 12471446 DOI: 10.1007/s00438-002-0765-3] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.7] [Received: 03/21/2002] [Accepted: 09/23/2002] [Indexed: 10/27/2022]
Abstract
Genes from the SNF2 family play important roles in transcriptional regulation, maintenance of chromosome integrity and DNA repair. This study describes the molecular cloning and characterization of cereal genes from this family. The predicted proteins exhibit a novel C-terminal domain that defines a new subfamily designated SNF2P that includes human and mouse proteins. Comparison between genomic and cDNA sequences showed that cereal Snf2P genes consisted of 17 exons, including one only 8 bp long. Two barley alleles differed by the presence of a 7.7-kb non-LTR retrotransposon in intron 6. An alternative annotation of the orthologous Arabidopsis gene would improve its similarity with the other members of the subfamily. Intron 2 was not spliced out in approximately half of the rice Snf2P mRNAs present in leaves, resulting in a premature stop codon. Transcripts from the barley and wheat Snf2P genes were found in apexes, leaves, sheaths, roots and spikes. The Snf2P genes exist as single copies on wheat chromosome arm 5A(m)L and in the colinear regions on barley chromosome arm 4HL and rice chromosome 3. High-density genetic mapping and RT-PCR suggest that Snf2P is not a candidate gene for the tightly linked vernalization gene Vrn2.
Affiliation(s)
- L Yan
- Dept. of Agronomy and Range Science, One Shields Avenue, University of California, Davis, CA 95616, USA
3
Abstract
Retrotransposons, transposable elements related to animal retroviruses, are found in all eukaryotes investigated and make up the majority of many plant genomes. Their ubiquity points to their importance, especially in their contribution to the large-scale structure of complex genomes. The nature and frequency of retro-element appearance, activation and amplification are poorly understood in all higher eukaryotes. Here we employ a novel approach to determine the insertion dates for 17 of 23 retrotransposons found near the maize adh1 gene, and two others from unlinked sites in the maize genome, by comparison of long terminal repeat (LTR) divergences with the sequence divergence between adh1 in maize and sorghum. All retrotransposons examined have inserted within the last six million years, most in the last three million years. The structure of the adh1 region appears to be standard relative to the other gene-containing regions of the maize genome, thus suggesting that retrotransposon insertions have increased the size of the maize genome from approximately 1200 Mb to 2400 Mb in the last three million years. Furthermore, the results indicate an increased mutation rate in retrotransposons compared with genes.
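The dating approach described in this abstract rests on the fact that the two LTRs of a single retrotransposon are identical when the element inserts, so the divergence between them grows with time since insertion. A minimal sketch of that calculation follows. It assumes pre-aligned LTRs of equal length, uses the Jukes-Cantor correction (an assumption made here for simplicity, not necessarily the correction used in the study), and takes a placeholder substitution rate. In the study the rate is calibrated from the adh1 divergence between maize and sorghum; the function names, the example sequences, and the value `1e-8` are all hypothetical.

```python
import math


def jukes_cantor_distance(p: float) -> float:
    """Convert an observed fraction of differing sites into substitutions per site.

    Jukes-Cantor is an assumption made for this sketch.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def ltr_insertion_age(ltr5: str, ltr3: str, rate: float) -> float:
    """Estimate years since insertion from two aligned LTRs of one element.

    rate is substitutions per site per year. Both LTRs accumulate changes
    independently after insertion, hence the factor of 2 in the denominator.
    """
    if not ltr5 or len(ltr5) != len(ltr3):
        raise ValueError("LTRs must be non-empty and aligned to equal length")
    if rate <= 0:
        raise ValueError("rate must be positive")
    mismatches = sum(a != b for a, b in zip(ltr5, ltr3))
    k = jukes_cantor_distance(mismatches / len(ltr5))
    return k / (2.0 * rate)


# Hypothetical inputs: a 100-bp toy LTR pair differing at one site, and a
# placeholder rate standing in for one calibrated from a gene comparison.
HYPOTHETICAL_RATE = 1e-8
ltr5 = "ACGT" * 25
ltr3 = "T" + ltr5[1:]
age = ltr_insertion_age(ltr5, ltr3, HYPOTHETICAL_RATE)
print(f"Estimated insertion age: {age:,.0f} years")
```

Real LTRs would first need to be aligned, and gapped positions excluded, before counting mismatches; the sketch skips that step.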
Affiliation(s)
- P SanMiguel
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana 47907-1392, USA
4
Abstract
For the most part, studies of grass genome structure have been limited to the generation of whole-genome genetic maps or the fine structure and sequence analysis of single genes or gene clusters. We have investigated large contiguous segments of the genomes of maize, sorghum, and rice, primarily focusing on intergenic spaces. Our data indicate that much (>50%) of the maize genome is composed of interspersed repetitive DNAs, primarily nested retrotransposons that insert between genes. These retroelements are less abundant in smaller genome plants, including rice and sorghum. Although 5- to 200-kb blocks of methylated, presumably heterochromatic, retrotransposons flank most maize genes, rice and sorghum genes are often adjacent. Similar genes are commonly found in the same relative chromosomal locations and orientations in each of these three species, although there are numerous exceptions to this collinearity (i.e., rearrangements) that can be detected at the levels of both the recombinational map and cloned DNA. Evolutionarily conserved sequences are largely confined to genes and their regulatory elements. Our results indicate that a knowledge of grass genome structure will be a useful tool for gene discovery and isolation, but the general rules and biological significance of grass genome organization remain to be determined. Moreover, the nature and frequency of exceptions to the general patterns of grass genome structure and collinearity are still largely unknown and will require extensive further investigation.
Affiliation(s)
- J L Bennetzen
- Department of Biological Sciences, Purdue University, West Lafayette, IN 47907-1392, USA
5
Abstract
Previously, we have demonstrated microcolinearity of gene composition and orientation in sh2/a1-homologous regions of the rice, sorghum, and maize genomes. However, the sh2 and a1 homologues are only about 20 kb apart in both rice and sorghum, while they are separated by about 140 kb in maize. In order to further define sequence organization and conservation in sh2/a1-homologous regions, we have completely sequenced a 42,446-bp segment of sorghum DNA. Four genes were identified: a homologue of sh2, two homologues of a1, and a putative transcriptional regulatory gene. A solo long terminal repeat of the retroelement Leviathan was detected between the two a1 homologues, and eight miniature inverted repeat transposable elements were found in this region. Comparison of the sorghum sequence with the sequence of the homologous segment from rice indicated that only the identified genes were evolutionarily conserved between these two species, which have evolved independently for over 50 million years. The introns of the a1 homologues have evolved faster than the introns of the sh2 homologue. The a1 tandem duplication appears to be an ancient event that may have preceded the ancestral divergence of maize, sorghum, and rice.
Affiliation(s)
- M Chen
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana 47907, USA
6
Abstract
Isolates of Streptococcus suis serotype 5 collected from three sows and nine of their pigs at birth were analyzed by genomic DNA fingerprinting. The cleavage patterns of DNA from S. suis isolated from the sows matched the cleavage patterns of DNA from S. suis isolated from their respective pigs.
Affiliation(s)
- S F Amass
- Department of Veterinary Clinical Sciences, Purdue University, West Lafayette, Indiana 47907, USA.
7
Chen M, SanMiguel P, de Oliveira AC, Woo SS, Zhang H, Wing RA, Bennetzen JL. Microcolinearity in sh2-homologous regions of the maize, rice, and sorghum genomes. Proc Natl Acad Sci U S A 1997; 94:3431-5. [PMID: 9096411 PMCID: PMC20387 DOI: 10.1073/pnas.94.7.3431] [Citation(s) in RCA: 168] [Impact Index Per Article: 6.2] [Indexed: 02/04/2023]
Abstract
Large regions of genomic colinearity have been demonstrated among grass species by recombinational mapping, but the degree of chromosomal conservation at the sub-centimorgan level has not been extensively investigated. We cloned the rice and sorghum genes homologous to the sh2 locus of maize on bacterial artificial chromosomes (BACs), and observed that a homologue of the maize a1 gene was also present on each of these BACs. In sorghum, we found a direct duplication of a1 homologues separated by about 10 kb. In maize, sh2 and a1 are approximately 140 kb apart and transcribed in the same direction, with sh2 upstream of a1. In rice and sorghum, this arrangement is fully conserved. However, the sh2 and a1 homologues are separated by about 19 kb in both rice and sorghum. We found low-copy-number and repetitive DNAs between the sh2 and a1 homologues of sorghum and rice. The sh2 and a1 homologues cross-hybridized, but the repetitive DNA and most low-copy-number sequences between these genes did not. These results indicate that maize, sorghum, and rice have conserved gene order and composition in the sh2-a1 region, but have acquired extensive qualitative and quantitative differences in the sequences between these genes.
Affiliation(s)
- M Chen
- Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
8
Avramova Z, Tikhonov A, SanMiguel P, Jin YK, Liu C, Woo SS, Wing RA, Bennetzen JL. Gene identification in a complex chromosomal continuum by local genomic cross-referencing. Plant J 1996; 10:1163-1168. [PMID: 9011097 DOI: 10.1046/j.1365-313x.1996.10061163.x] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.0] [Indexed: 05/22/2023]
Abstract
Most higher plants have complex genomes containing large quantities of repetitive DNA interspersed with low-copy-number sequences. Many of these repetitive DNAs are mobile and have homology to RNAs in various cell types. This can make it difficult to identify the genes in a long chromosomal continuum. It was decided to use genic sequence conservation and grass genome co-linearity as tools for gene identification. A bacterial artificial chromosome (BAC) clone containing sorghum genomic DNA was selected using a maize Adh1 probe. The 165 kb sorghum BAC was tested for hybridization to a set of clones representing the contiguous 280 kb of DNA flanking maize Adh1. None of the repetitive maize DNAs hybridized, but most of the low-copy-number sequences did. A low-copy-number sequence that did cross-hybridize was found to be a gene, while one that did not was found to be a low-copy-number retrotransposon that was named Reina. Regions of cross-hybridization were co-linear between the two genomes, but closer together in the smaller sorghum genome. These results indicate that local genomic cross-referencing by hybridization of orthologous clones can be an efficient and rapid technique for gene identification and studies of genome organization.
Affiliation(s)
- Z Avramova
- Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
9
SanMiguel P, Tikhonov A, Jin YK, Motchoulskaia N, Zakharov D, Melake-Berhan A, Springer PS, Edwards KJ, Lee M, Avramova Z, Bennetzen JL. Nested retrotransposons in the intergenic regions of the maize genome. Science 1996; 274:765-8. [PMID: 8864112 DOI: 10.1126/science.274.5288.765] [Citation(s) in RCA: 798] [Impact Index Per Article: 28.5] [Indexed: 02/02/2023]
Abstract
The relative organization of genes and repetitive DNAs in complex eukaryotic genomes is not well understood. Diagnostic sequencing indicated that a 280-kilobase region containing the maize Adh1-F and u22 genes is composed primarily of retrotransposons inserted within each other. Ten retroelement families were discovered, with reiteration frequencies ranging from 10 to 30,000 copies per haploid genome. These retrotransposons accounted for more than 60 percent of the Adh1-F region and at least 50 percent of the nuclear DNA of maize. These elements were largely intact and are dispersed throughout the gene-containing regions of the maize genome.
Affiliation(s)
- P SanMiguel
- Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
10
Bennetzen JL, SanMiguel P, Liu CN, Chen M, Tikhonov A, Costa de Oliveira A, Jin YK, Avramova Z, Woo SS, Zhang H, Wing RA. The Hybaid Lecture. Microcollinearity and segmental duplication in the evolution of grass nuclear genomes. Symp Soc Exp Biol 1996; 50:1-3. [PMID: 9039427] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2023]
Abstract
Recent studies have shown that grass genomes have very similar gene compositions and regions of conserved gene order, as exemplified by collinear genetic maps of DNA markers. We have begun the detailed study of sequence organization in large (100-500 kb) segments of the nuclear genomes of maize, sorghum and rice. Our results indicate collinearity of genes in the regions homoeologous to the maize adh1 and sh2-a1 genes. Comparable genes were found to be physically closer to each other in grasses with small genomes (rice and sorghum) than they are in maize. In several instances, we have found evidence of tandem and 'distantly tandem' duplications of segments containing maize and sorghum genes. These duplications complicate characterizations of microcollinearity and could also interfere with some map-based approaches to gene isolation.
Affiliation(s)
- J L Bennetzen
- Department of Biological Sciences, Purdue University, West Lafayette, IN 47907-1392, USA
11
Avramova Z, SanMiguel P, Georgieva E, Bennetzen JL. Matrix attachment regions and transcribed sequences within a long chromosomal continuum containing maize Adh1. Plant Cell 1995; 7:1667-80. [PMID: 7580257 PMCID: PMC161028 DOI: 10.1105/tpc.7.10.1667] [Citation(s) in RCA: 24] [Impact Index Per Article: 0.8] [Indexed: 05/21/2023]
Abstract
We provide evidence for the location of matrix attachment sites along a contiguous region of 280 kb on maize chromosome 1. We define nine potential loops that vary in length from 6 kb to > 75 kb. The distribution of the different classes of DNA within this continuum with respect to the predicted structural loops reveals an interesting correlation: the long stretches of mixed classes of highly repetitive DNAs are often segregated into topologically sequestered units, whereas low-copy-number DNAs (including the alcohol dehydrogenase1 [adh1] gene) are positioned in separate loops. Contrary to expectations, several classes of highly repeated elements with representatives in this region were found to be transcribed, and some of these exhibited tissue-specific patterns of expression.
Affiliation(s)
- Z Avramova
- Department of Biological Sciences, Purdue University, West Lafayette, Indiana 47907, USA
12
Bennetzen JL, Schrick K, Springer PS, Brown WE, SanMiguel P. Active maize genes are unmodified and flanked by diverse classes of modified, highly repetitive DNA. Genome 1994; 37:565-76. [PMID: 7958822 DOI: 10.1139/g94-081] [Citation(s) in RCA: 144] [Impact Index Per Article: 4.8] [Indexed: 01/28/2023]
Abstract
We have characterized the copy number, organization, and genomic modification of DNA sequences within and flanking several maize genes. We found that highly repetitive DNA sequences were tightly linked to most of these genes. The highly repetitive sequences were not found within the coding regions but could be found within 6 kb either 3' or 5' to the structural genes. These highly repetitive regions were each composed of unique combinations of different short repetitive sequences. Highly repetitive DNA blocks were not interrupted by any detected single copy DNA. The 13 classes of highly repetitive DNA identified were found to vary little between diverse Zea isolates. The level of DNA methylation in and near these genes was determined by scoring the digestibility of 63 recognition/cleavage sites with restriction enzymes that were sensitive to 5-methylation of cytosines in the sequences 5'-CG-3' and 5'-CNG-3'. All but four of these sites were digestible in chromosomal DNA. The four undigested sites were localized to extragenic DNA within or near highly repetitive DNA, while the other 59 sites were in low copy number DNAs. Pulsed field gel analysis indicated that the majority of cytosine modified tracts range from 20 to 200 kb in size. Single copy sequences hybridized to the unmodified domains, while highly repetitive sequences hybridized to the modified regions. Middle repetitive sequences were found in both domains.
Affiliation(s)
- J L Bennetzen
- Department of Biological Sciences, Purdue University, West Lafayette, IN 47907