101
Lin H, Xia P, Wing RA, Zhang Q, Luo M. Dynamic intra-japonica subspecies variation and resource application. Mol Plant 2012; 5:218-30. [PMID: 21984334 DOI: 10.1093/mp/ssr085] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 05/22/2023]
Abstract
We constructed a physical map of O. sativa ssp. japonica cv. ZH11 and compared it and its random sample sequences with the Nipponbare RefSeq derived from the same subspecies. This comparison showed that the two japonica genomes were highly syntenic but revealed substantial differences in terms of structural variations, rates of substitutions and indels, and transposable element content. For example, contractions/expansions as large as 450 kb and repeat sequences that were present in high copy numbers only in ZH11 were detected. In tri-alignment regions using the indica variety 93-11 sequence as an outgroup, we found that: (1) the substitution rates of the two japonica-indica inter-subspecies comparisons were close to each other but almost an order of magnitude higher than the substitution rate between the japonica rice varieties ZH11 and Nipponbare; (2) of the substitutions found between ZH11 and Nipponbare, 47.2% occurred in ZH11 and 52.6% in Nipponbare; (3) of the indels found between ZH11 and Nipponbare, those that occurred in ZH11 were 15.8 times as numerous as those in Nipponbare. Of the indels that occurred in ZH11, 75.67% were insertions and 24.33% were deletions. Of the indels that occurred in Nipponbare, 48.23% were insertions and 51.77% were deletions. The ZH11 comparative map covered four Nipponbare physical gaps, detected assembly errors in the Nipponbare sequence, and was integrated with the FSTs of a large ZH11 T-DNA insertion mutant library. ZH11 BAC clones can be browsed, searched, and obtained at our website, http://GResource.hzau.edu.cn.
Affiliation(s)
- Haiyan Lin
- National Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China
102
Proost S, Fostier J, De Witte D, Dhoedt B, Demeester P, Van de Peer Y, Vandepoele K. i-ADHoRe 3.0--fast and sensitive detection of genomic homology in extremely large data sets. Nucleic Acids Res 2011; 40:e11. [PMID: 22102584 PMCID: PMC3258164 DOI: 10.1093/nar/gkr955] [Citation(s) in RCA: 138] [Impact Index Per Article: 10.6] [Indexed: 11/14/2022]
Abstract
Comparative genomics is a powerful means to gain insight into the evolutionary processes that shape the genomes of related species. As the number of sequenced genomes increases, the development of software to perform accurate cross-species analyses becomes indispensable. However, many implementations that have the ability to compare multiple genomes exhibit unfavorable computational and memory requirements, limiting the number of genomes that can be analyzed in one run. Here, we present a software package to unveil genomic homology based on the identification of conservation of gene content and gene order (collinearity), i-ADHoRe 3.0, and its application to eukaryotic genomes. The use of efficient algorithms and support for parallel computing enable the analysis of large-scale data sets. Unlike other tools, i-ADHoRe can process the Ensembl data set, containing 49 species, in 1 h. Furthermore, the profile search is more sensitive in detecting degenerate genomic homology than chaining pairwise collinearity information based on transitive homology. From ultra-conserved collinear regions between mammals and birds, by integrating coexpression information and protein–protein interactions, we identified more than 400 regions in the human genome showing significant functional coherence. The different algorithmic improvements ensure that i-ADHoRe 3.0 will remain a powerful tool to study genome evolution.
103
Saski CA, Feltus FA, Staton ME, Blackmon BP, Ficklin SP, Kuhn DN, Schnell RJ, Shapiro H, Motamayor JC. A genetically anchored physical framework for Theobroma cacao cv. Matina 1-6. BMC Genomics 2011; 12:413. [PMID: 21846342 PMCID: PMC3173454 DOI: 10.1186/1471-2164-12-413] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 04/26/2011] [Accepted: 08/16/2011] [Indexed: 12/16/2022]
Abstract
Background The fermented dried seeds of Theobroma cacao (cacao tree) are the main ingredient in chocolate. World cocoa production was estimated to be 3 million tons in 2010 with an annual estimated average growth rate of 2.2%. The cacao bean production industry is currently under threat from a rise in fungal diseases including black pod, frosty pod, and witches' broom. In order to address these issues, genome-sequencing efforts have been initiated recently to facilitate identification of genetic markers and genes that could be utilized to accelerate the release of robust T. cacao cultivars. However, problems inherent with assembly and resolution of distal regions of complex eukaryotic genomes, such as gaps, chimeric joins, and unresolvable repeat-induced compressions, have been unavoidably encountered with the sequencing strategies selected. Results Here, we describe the construction of a BAC-based integrated genetic-physical map of the T. cacao cultivar Matina 1-6 which is designed to augment and enhance these sequencing efforts. Three BAC libraries, each comprised of 10× coverage, were constructed and fingerprinted. 230 genetic markers from a high-resolution genetic recombination map and 96 Arabidopsis-derived conserved ortholog set (COS) II markers were anchored using pooled overgo hybridization. A dense tile path consisting of 29,383 BACs was selected and end-sequenced. The physical map consists of 154 contigs and 4,268 singletons. Forty-nine contigs are genetically anchored and ordered to chromosomes for a total span of 307.2 Mbp. The unanchored contigs (105) span 67.4 Mbp and therefore the estimated genome size of T. cacao is 374.6 Mbp. A comparative analysis with A. thaliana, V. vinifera, and P. trichocarpa suggests that comparisons of the genome assemblies of these distantly related species could provide insights into genome structure, evolutionary history, conservation of functional sites, and improvements in physical map assembly. A comparison between the two T. cacao cultivars Matina 1-6 and Criollo indicates a high degree of collinearity in their genomes, yet rearrangements were also observed. Conclusions The results presented in this study are a stand-alone resource for functional exploitation and enhancement of Theobroma cacao but are also expected to complement and augment ongoing genome-sequencing efforts. This resource will serve as a template for refinement of the T. cacao genome through gap-filling, targeted re-sequencing, and resolution of repetitive DNA arrays.
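The genome-size estimate in this abstract is the sum of the anchored and unanchored contig spans. The short Python check below only re-adds the figures quoted in the abstract; the variable names are illustrative and the anchored fraction is a derived value, not one reported by the authors.

```python
# Figures quoted in the abstract (contig counts and spans in Mbp).
anchored_contigs, anchored_span_mbp = 49, 307.2
unanchored_contigs, unanchored_span_mbp = 105, 67.4

# Contig total and genome-size estimate, as described in the abstract.
total_contigs = anchored_contigs + unanchored_contigs
genome_size_mbp = round(anchored_span_mbp + unanchored_span_mbp, 1)

# Derived here for illustration only: share of the estimate that is anchored.
anchored_fraction = anchored_span_mbp / genome_size_mbp

print(total_contigs, genome_size_mbp, f"{anchored_fraction:.1%}")
```

The totals agree with the 154 contigs and 374.6 Mbp stated in the abstract.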
Affiliation(s)
- Christopher A Saski
- Subtropical Horticulture Research Station, USDA-ARS, 13601 Old Cutler Road, Miami, FL 33158, USA
104
Jacquemin J, Chaparro C, Laudié M, Berger A, Gavory F, Goicoechea JL, Wing RA, Cooke R. Long-range and targeted ectopic recombination between the two homeologous chromosomes 11 and 12 in Oryza species. Mol Biol Evol 2011; 28:3139-50. [PMID: 21616911 DOI: 10.1093/molbev/msr144] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Indexed: 12/18/2022]
Abstract
Whole genome duplication (WGD) and subsequent evolution of gene pairs have been shown to have shaped the present day genomes of most, if not all, plants and to have played an essential role in the evolution of many eukaryotic genomes. Analysis of the rice (Oryza sativa ssp. japonica) genome sequence suggested an ancestral WGD ∼50-70 Ma common to all cereals and a segmental duplication between chromosomes 11 and 12 as recently as 5 Ma. More recent studies based on coding sequences have demonstrated that gene conversion is responsible for the high sequence conservation which suggested such a recent duplication. We previously showed that gene conversion has been a recurrent process throughout the Oryza genus and in closely related species and that orthologous duplicated regions are also highly conserved in other cereal genomes. We have extended these studies to compare megabase regions of genomic (coding and noncoding) sequences between two cultivated (O. sativa, Oryza glaberrima) and one wild (Oryza brachyantha) rice species using a novel approach of topological incongruency. The high levels of intraspecies conservation of both gene and nongene sequences, particularly in O. brachyantha, indicate long-range conversion events less than 4 Ma in all three species. These observations demonstrate megabase-scale conversion initiated within a highly rearranged region located at ∼2.1 Mb from the chromosome termini and emphasize the importance of gene conversion in cereal genome evolution.
Affiliation(s)
- J Jacquemin
- Laboratoire Génome et Développement des Plantes, Unité Mixte de Recherche Centre National de la Recherche Scientifique/Institut de Recherche pour le Développement/Université de Perpignan Via Domitia, Université de Perpignan, Perpignan-Cedex, France.
105
Tang H, Lyons E, Pedersen B, Schnable JC, Paterson AH, Freeling M. Screening synteny blocks in pairwise genome comparisons through integer programming. BMC Bioinformatics 2011; 12:102. [PMID: 21501495 PMCID: PMC3088904 DOI: 10.1186/1471-2105-12-102] [Citation(s) in RCA: 101] [Impact Index Per Article: 7.8] [Received: 12/08/2010] [Accepted: 04/18/2011] [Indexed: 12/01/2022]
Abstract
Background It is difficult to accurately interpret chromosomal correspondences such as true orthology and paralogy due to significant divergence of genomes from a common ancestor. Analyses are particularly problematic among lineages that have repeatedly experienced whole genome duplication (WGD) events. To compare multiple "subgenomes" derived from genome duplications, we need to relax the traditional requirements of "one-to-one" syntenic matchings of genomic regions in order to reflect "one-to-many" or more generally "many-to-many" matchings. However, this relaxation may result in the identification of synteny blocks that are derived from ancient shared WGDs that are not of interest. For many downstream analyses, we need to eliminate weak, low scoring alignments from pairwise genome comparisons. Our goal is to objectively select a subset of synteny blocks whose total scores are maximized while respecting the duplication history of the genomes in comparison. We call this "quota-based" screening of synteny blocks in order to appropriately fill a quota of syntenic relationships within one genome or between two genomes having WGD events. Results We have formulated the synteny block screening as an optimization problem known as "Binary Integer Programming" (BIP), which is solved using existing linear programming solvers. The computer program QUOTA-ALIGN performs this task by creating a clear objective function that maximizes the compatible set of synteny blocks under given constraints on overlaps and depths (corresponding to the duplication history in respective genomes). Such a procedure is useful for any pairwise synteny alignments, but is most useful in lineages affected by multiple WGDs, like plants or fish lineages. For example, there should be a 1:2 ploidy relationship between genome A and B if genome B had an independent WGD subsequent to the divergence of the two genomes. We show through simulations and real examples using plant genomes in the rosid superorder that the quota-based screening can eliminate ambiguous synteny blocks and focus on specific genomic evolutionary events, like the divergence of lineages (in cross-species comparisons) and the most recent WGD (in self comparisons). Conclusions The QUOTA-ALIGN algorithm screens a set of synteny blocks to retain only those compatible with a user specified ploidy relationship between two genomes. These blocks, in turn, may be used for additional downstream analyses such as identifying true orthologous regions in interspecific comparisons. There are two major contributions of QUOTA-ALIGN: 1) reducing the block screening task to a BIP problem, which is novel; 2) providing an efficient software pipeline starting from all-against-all BLAST to the screened synteny blocks with dot plot visualizations. Python code and full documentation are publicly available at http://github.com/tanghaibao/quota-alignment. The QUOTA-ALIGN program is also integrated as a major component in SynMap (http://genomevolution.com/CoGe/SynMap.pl), offering easier access to thousands of genomes for non-programmers.
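The quota-based screening described in this abstract can be illustrated with a toy version of the optimization: choose the subset of blocks with the highest total score such that no region is covered more deeply than the quota allows. The Python sketch below is not the QUOTA-ALIGN implementation. It swaps the linear programming solver for brute-force enumeration, which is only feasible for a handful of blocks, and the `Block` class, `screen_blocks` function, depth parameters, and example coordinates are all hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Block:
    name: str
    score: int
    a: tuple  # (start, end) on genome A, inclusive
    b: tuple  # (start, end) on genome B, inclusive


def max_depth(intervals):
    """Largest number of intervals covering any single position."""
    events = []
    for start, end in intervals:
        events.append((start, 1))
        events.append((end + 1, -1))
    depth = best = 0
    for _, delta in sorted(events):  # closings sort before openings at ties
        depth += delta
        best = max(best, depth)
    return best


def screen_blocks(blocks, depth_a, depth_b):
    """Brute-force stand-in for the BIP: maximize the total score subject
    to coverage-depth limits along each genome (assumed formulation)."""
    best_score, best_subset = 0, ()
    for size in range(1, len(blocks) + 1):
        for subset in combinations(blocks, size):
            if max_depth([blk.a for blk in subset]) > depth_a:
                continue
            if max_depth([blk.b for blk in subset]) > depth_b:
                continue
            score = sum(blk.score for blk in subset)
            if score > best_score:
                best_score, best_subset = score, subset
    return best_score, [blk.name for blk in best_subset]


# Hypothetical blocks: x and y are strong matches of one A region to two
# B regions; z is a weak extra match; w competes with x for a B region.
blocks = [
    Block("x", 90, (0, 100), (0, 100)),
    Block("y", 80, (0, 100), (200, 300)),
    Block("z", 30, (20, 80), (400, 450)),
    Block("w", 20, (300, 350), (50, 90)),
]

# 1:2 relationship: an A region may match two B regions, a B region one.
result = screen_blocks(blocks, depth_a=2, depth_b=1)
print(result)
```

With these made-up inputs the weak block z is dropped because it would give a third match to the same genome A region, and w is dropped because it overlaps x on genome B.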
Affiliation(s)
- Haibao Tang
- Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA.
106
Soderlund C, Bomhoff M, Nelson WM. SyMAP v3.4: a turnkey synteny system with application to plant genomes. Nucleic Acids Res 2011; 39:e68. [PMID: 21398631 PMCID: PMC3105427 DOI: 10.1093/nar/gkr123] [Citation(s) in RCA: 215] [Impact Index Per Article: 16.5] [Indexed: 01/24/2023]
Abstract
SyMAP (Synteny Mapping and Analysis Program) was originally developed to compute synteny blocks between a sequenced genome and an FPC map, and has been extended to support pairs of sequenced genomes. SyMAP uses MUMmer to compute the raw hits between the two genomes, which are then clustered and filtered using the optional gene annotation. The filtered hits are input to the synteny algorithm, which was designed to discover duplicated regions and form larger-scale synteny blocks, where intervening micro-rearrangements are allowed. SyMAP provides extensive interactive Java displays at all levels of resolution along with simultaneous displays of multiple aligned pairs. The synteny blocks from multiple chromosomes may be displayed in a high-level dot plot or three-dimensional view, and the user may then drill down to see the details of a region, including the alignments of the hits to the gene annotation. These capabilities are illustrated by showing their application to the study of genome duplication, differential gene loss and transitive homology between sorghum, maize and rice. The software may be used from a website or standalone for the best performance. A project manager is provided to organize and automate the analysis of multi-genome groups. The software is freely distributed at http://www.agcol.arizona.edu/software/symap.
Affiliation(s)
- Carol Soderlund
- BIO5 Institute, 1657 Helen Street, University of Arizona, Tucson, AZ 85721, USA.
107
Fang GC, Blackmon BP, Henry DC, Staton ME, Saski CA, Hodges SA, Tomkins JP, Luo H. Genomic tools development for Aquilegia: construction of a BAC-based physical map. BMC Genomics 2010; 11:621. [PMID: 21059242 PMCID: PMC3091760 DOI: 10.1186/1471-2164-11-621] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Received: 05/26/2010] [Accepted: 11/08/2010] [Indexed: 12/16/2022]
Abstract
BACKGROUND The genus Aquilegia, consisting of approximately 70 taxa, is a member of the basal eudicot lineage, Ranunculales, which is evolutionarily intermediate between monocots and core eudicots, and represents a relatively unstudied clade in the angiosperm phylogenetic tree that bridges the gap between these two major plant groups. Aquilegia species are closely related and their distribution covers highly diverse habitats. These provide rich resources to better understand the genetic basis of adaptation to different pollinators and habitats that in turn leads to rapid speciation. To gain insights into the genome structure and facilitate gene identification, comparative genomics and whole-genome shotgun sequencing assembly, BAC-based genomics resources are of crucial importance. RESULTS BAC-based genomic resources, including two BAC libraries, a physical map with anchored markers and BAC end sequences, were established from A. formosa. The physical map was composed of a total of 50,155 BAC clones in 832 contigs and 3939 singletons, covering 21X genome equivalents. These contigs spanned a physical length of 689.8 Mb (~2.3X of the genome) suggesting the complex heterozygosity of the genome. A set of 197 markers was developed from ESTs induced by drought-stress, or involved in anthocyanin biosynthesis or floral development, and was integrated into the physical map. Among these were 87 genetically mapped markers that anchored 54 contigs, spanning 76.4 Mb (25.5%) across the genome. Analysis of a selection of 12,086 BAC end sequences (BESs) from the minimal tiling path (MTP) allowed a preview of the Aquilegia genome organization, including identification of transposable elements, simple sequence repeats and gene content. Common repetitive elements previously reported in both monocots and core eudicots were identified in Aquilegia suggesting the value of this genome in connecting the two major plant clades. Comparison with sequenced plant genomes indicated a higher similarity to grapevine (Vitis vinifera) than to rice and Arabidopsis in the transcriptomes. CONCLUSIONS The A. formosa BAC-based genomic resources provide valuable tools to study the Aquilegia genome. Further integration of other existing genomics resources, such as ESTs, into the physical map should enable better understanding of the molecular mechanisms underlying adaptive radiation and elaboration of floral morphology.
Affiliation(s)
- Guang-Chen Fang
- Department of Genetics and Biochemistry, Clemson University, SC 29634, USA
108
González VM, Rodríguez-Moreno L, Centeno E, Benjak A, Garcia-Mas J, Puigdomènech P, Aranda MA. Genome-wide BAC-end sequencing of Cucumis melo using two BAC libraries. BMC Genomics 2010; 11:618. [PMID: 21054843 PMCID: PMC3091759 DOI: 10.1186/1471-2164-11-618] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Received: 07/13/2010] [Accepted: 11/05/2010] [Indexed: 11/10/2022]
Abstract
Background Although melon (Cucumis melo L.) is an economically important fruit crop, no genome-wide sequence information is openly available at the current time. We therefore sequenced BAC-ends representing a total of 33,024 clones, half of them from a previously described melon BAC library generated with restriction endonucleases and the remainder from a new random-shear BAC library. Results We generated a total of 47,140 high-quality BAC-end sequences (BES), 91.7% of which were paired-BES. Both libraries were assembled independently and then cross-assembled to obtain a final set of 33,372 non-redundant, high-quality sequences. These were grouped into 6,411 contigs (4.5 Mb) and 26,961 non-assembled BES (14.4 Mb), representing ~4.2% of the melon genome. The sequences were used to screen genomic databases, identifying 7,198 simple sequence repeats (corresponding to one microsatellite every 2.6 kb) and 2,484 additional repeats of which 95.9% represented transposable elements. The sequences were also used to screen expressed sequence tag (EST) databases, revealing 11,372 BES that were homologous to ESTs. This suggests that ~30% of the melon genome consists of coding DNA. We observed regions of microsynteny between melon paired-BES and six other dicotyledonous plant genomes. Conclusion The analysis of nearly 50,000 BES from two complementary genomic libraries covered ~4.2% of the melon genome, providing insight into properties such as microsatellite and transposable element distribution, and the percentage of coding DNA. The observed synteny between melon paired-BES and six other plant genomes showed that useful comparative genomic data can be derived through large scale BAC-end sequencing by anchoring a small proportion of the melon genome to other sequenced genomes.
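The counts and densities in this abstract can be cross-checked against one another. The Python snippet below is a minimal consistency check using only figures quoted above; the variable names are illustrative, and the implied genome size is a back-calculation from the quoted ~4.2% coverage rather than a value stated in the abstract.

```python
# Figures quoted in the abstract.
contigs, singleton_bes = 6411, 26961
contig_mb, singleton_mb = 4.5, 14.4
ssr_count = 7198
genome_fraction = 0.042  # "~4.2% of the melon genome"

# Non-redundant sequence count and total sequence length.
total_sequences = contigs + singleton_bes
total_mb = round(contig_mb + singleton_mb, 1)

# Microsatellite spacing in kb, from total length and SSR count.
kb_per_ssr = total_mb * 1000 / ssr_count

# Back-calculated genome size (an inference, not a reported figure).
implied_genome_mb = total_mb / genome_fraction

print(total_sequences, total_mb, round(kb_per_ssr, 1), round(implied_genome_mb))
```

The sequence total and the spacing of roughly one microsatellite every 2.6 kb match the abstract.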
Affiliation(s)
- Víctor M González
- Molecular Genetics Department, Center for Research in Agricultural Genomics CRAG (CSIC-IRTA-UAB), Jordi Girona, 18-26, 08034 Barcelona, Spain
109
Febrer M, Goicoechea JL, Wright J, McKenzie N, Song X, Lin J, Collura K, Wissotski M, Yu Y, Ammiraju JSS, Wolny E, Idziak D, Betekhtin A, Kudrna D, Hasterok R, Wing RA, Bevan MW. An integrated physical, genetic and cytogenetic map of Brachypodium distachyon, a model system for grass research. PLoS One 2010; 5:e13461. [PMID: 20976139 PMCID: PMC2956642 DOI: 10.1371/journal.pone.0013461] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.2] [Received: 04/22/2010] [Accepted: 08/27/2010] [Indexed: 11/18/2022]
Abstract
The pooid subfamily of grasses includes some of the most important crop, forage and turf species, such as wheat, barley and Lolium. Developing genomic resources, such as whole-genome physical maps, for analysing the large and complex genomes of these crops and for facilitating biological research in grasses is an important goal in plant biology. We describe a bacterial artificial chromosome (BAC)-based physical map of the wild pooid grass Brachypodium distachyon and integrate this with whole genome shotgun sequence (WGS) assemblies using BAC end sequences (BES). The resulting physical map contains 26 contigs spanning the 272 Mb genome. BES from the physical map were also used to integrate a genetic map. This provides an independent validation and confirmation of the published WGS assembly. Mapped BACs were used in Fluorescence In Situ Hybridisation (FISH) experiments to align the integrated physical map and sequence assemblies to chromosomes with high resolution. The physical, genetic and cytogenetic maps, integrated with whole genome shotgun sequence assemblies, enhance the accuracy and durability of this important genome sequence and will directly facilitate gene isolation.
Affiliation(s)
- Jose Luis Goicoechea
- The Arizona Genomics Institute, School of Plant Sciences and the BIO5 Institute for Collaborative Research, The University of Arizona, Tucson, Arizona, United States of America
- Xiang Song
- The Arizona Genomics Institute, School of Plant Sciences and the BIO5 Institute for Collaborative Research, The University of Arizona, Tucson, Arizona, United States of America
- Jinke Lin
- The Arizona Genomics Institute, School of Plant Sciences and the BIO5 Institute for Collaborative Research, The University of Arizona, Tucson, Arizona, United States of America
- Kristi Collura
- The Arizona Genomics Institute, School of Plant Sciences and the BIO5 Institute for Collaborative Research, The University of Arizona, Tucson, Arizona, United States of America
- Marina Wissotski
- The Arizona Genomics Institute, School of Plant Sciences and the BIO5 Institute for Collaborative Research, The University of Arizona, Tucson, Arizona, United States of America
- Yeisoo Yu
- The Arizona Genomics Institute, School of Plant Sciences and the BIO5 Institute for Collaborative Research, The University of Arizona, Tucson, Arizona, United States of America
- Jetty S. S. Ammiraju
- The Arizona Genomics Institute, School of Plant Sciences and the BIO5 Institute for Collaborative Research, The University of Arizona, Tucson, Arizona, United States of America
- Elzbieta Wolny
- Department of Plant Anatomy and Cytology, University of Silesia, Katowice, Poland
- Dominika Idziak
- Department of Plant Anatomy and Cytology, University of Silesia, Katowice, Poland
- Alexander Betekhtin
- Department of Plant Anatomy and Cytology, University of Silesia, Katowice, Poland
- Dave Kudrna
- The Arizona Genomics Institute, School of Plant Sciences and the BIO5 Institute for Collaborative Research, The University of Arizona, Tucson, Arizona, United States of America
- Robert Hasterok
- Department of Plant Anatomy and Cytology, University of Silesia, Katowice, Poland
- Rod A. Wing
- The Arizona Genomics Institute, School of Plant Sciences and the BIO5 Institute for Collaborative Research, The University of Arizona, Tucson, Arizona, United States of America
110
Vergara IA, Chen N. Large synteny blocks revealed between Caenorhabditis elegans and Caenorhabditis briggsae genomes using OrthoCluster. BMC Genomics 2010; 11:516. [PMID: 20868500 PMCID: PMC2997010 DOI: 10.1186/1471-2164-11-516] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 07/09/2009] [Accepted: 09/24/2010] [Indexed: 11/10/2022]
Abstract
BACKGROUND Accurate identification of synteny blocks is an important step in comparative genomics towards the understanding of genome architecture and expression. Most computer programs developed in the last decade for identifying synteny blocks have limitations. To address these limitations, we recently developed a robust program called OrthoCluster, and an online database OrthoClusterDB. In this work, we have demonstrated the application of OrthoCluster in identifying synteny blocks between the genomes of Caenorhabditis elegans and Caenorhabditis briggsae, two closely related hermaphrodite nematodes. RESULTS Initial identification and analysis of synteny blocks using OrthoCluster enabled us to systematically improve the genome annotation of C. elegans and C. briggsae, identifying 52 potential novel genes in C. elegans, 582 in C. briggsae, and 949 novel orthologous relationships between these two species. Using the improved annotation, we have detected 3,058 perfect synteny blocks that contain no mismatches between C. elegans and C. briggsae. Among these synteny blocks, the majority are mapped to homologous chromosomes, as previously reported. The largest perfect synteny block contains 42 genes, which spans 201.2 kb in Chromosome V of C. elegans. On average, perfect synteny blocks span 18.8 kb in length. When some mismatches (interruptions) are allowed, synteny blocks ("imperfect synteny blocks") that are much larger in size are identified. We have shown that the majority (80%) of the C. elegans and C. briggsae genomes are covered by imperfect synteny blocks. The largest imperfect synteny block spans 6.14 Mb in Chromosome X of C. elegans and there are 11 synteny blocks that are larger than 1 Mb in size. On average, imperfect synteny blocks span 63.6 kb in length, larger than previously reported. CONCLUSIONS We have demonstrated that OrthoCluster can be used to accurately identify synteny blocks and have found that synteny blocks between C. elegans and C. briggsae are almost three-fold larger than previously identified.
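The idea of a "perfect" synteny block with no mismatches can be made concrete with a simplified sketch. The Python function below is not OrthoCluster's algorithm. It assumes one-to-one orthologs, ignores strand, and treats a block as a maximal run of adjacent genes in one genome whose orthologs are also adjacent, in a consistent direction, on a single chromosome of the other genome. The function name, the `min_genes` parameter, and the gene identifiers are hypothetical.

```python
def perfect_blocks(order_a, ortholog_pos, min_genes=2):
    """Return maximal runs of consecutive genome-A genes whose orthologs
    are consecutive on one genome-B chromosome (simplified definition).

    order_a: gene ids in genome-A order along one chromosome.
    ortholog_pos: gene id -> (chromosome, index) of its genome-B ortholog.
    """
    blocks, current, step = [], [], 0
    for gene in order_a:
        pos = ortholog_pos.get(gene)
        if pos is None:
            # A gene without an ortholog is a mismatch and ends the run.
            if current:
                blocks.append(current)
            current, step = [], 0
            continue
        if current:
            prev_chrom, prev_idx = ortholog_pos[current[-1]]
            delta = pos[1] - prev_idx
            # Extend only if adjacent in B and the direction is unchanged.
            if pos[0] == prev_chrom and delta in (1, -1) and step in (0, delta):
                current.append(gene)
                step = delta
                continue
            blocks.append(current)
        current, step = [gene], 0
    if current:
        blocks.append(current)
    return [blk for blk in blocks if len(blk) >= min_genes]


# Hypothetical example: a4 has no ortholog, a5-a6 are inverted in genome B,
# and a7 maps to a distant location.
order_a = ["a1", "a2", "a3", "a4", "a5", "a6", "a7"]
ortholog_pos = {
    "a1": ("I", 10), "a2": ("I", 11), "a3": ("I", 12),
    "a5": ("II", 8), "a6": ("II", 7),
    "a7": ("I", 30),
}
found = perfect_blocks(order_a, ortholog_pos)
print(found)
```

Allowing a limited number of interruptions inside a run, rather than ending it at the first mismatch, would correspond to the larger "imperfect" blocks the abstract describes.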
Affiliation(s)
- Ismael A Vergara
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, Canada
111
Hurwitz BL, Kudrna D, Yu Y, Sebastian A, Zuccolo A, Jackson SA, Ware D, Wing RA, Stein L. Rice structural variation: a comparative analysis of structural variation between rice and three of its closest relatives in the genus Oryza. Plant J 2010; 63:990-1003. [PMID: 20626650 DOI: 10.1111/j.1365-313x.2010.04293.x] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.5] [Indexed: 05/22/2023]
Abstract
Rapid progress in comparative genomics among the grasses has revealed similar gene content and order despite exceptional differences in chromosome size and number. Large- and small-scale genomic variations are of particular interest, especially among cultivated and wild species, as they encode rapidly evolving features that may be important in adaptation to particular environments. We present a genome-wide study of intermediate-sized structural variation (SV) among rice (Oryza sativa) and three of its closest relatives in the genus Oryza (Oryza nivara, Oryza rufipogon and Oryza glaberrima). We computationally identified regional expansions, contractions and inversions in the Oryza species genomes relative to O. sativa by combining data from paired-end clone alignments to the O. sativa reference genome and physical maps. A subset of the computational predictions was validated using a new approach for BAC size determination. The result was a confirmed catalog of 674 expansions (25-38 Mb), 611 contractions (4-19 Mb), and 140 putative inversions (14-19 Mb) between the three Oryza species and O. sativa. In the expanded regions unique to O. sativa we found enrichment in transposable elements (TEs): long terminal repeats (LTRs) were randomly located across the chromosomes, and their insertion times corresponded to the date of the A genome radiation. Also, rice-expanded regions contained an over-representation of single-copy genes related to defense factors in the environment. This catalog of confirmed SV in reference to O. sativa provides an entry point for future research in genome evolution, speciation, domestication and novel gene discovery.
Affiliation(s)
- Bonnie L Hurwitz
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721, USA
112
Pham SK, Pevzner PA. DRIMM-Synteny: decomposing genomes into evolutionary conserved segments. Bioinformatics 2010; 26:2509-16. [PMID: 20736338 DOI: 10.1093/bioinformatics/btq465] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.3] [Indexed: 11/15/2022]
Abstract
MOTIVATION The rapidly increasing set of sequenced genomes highlights the importance of identifying the synteny blocks in multiple and/or highly duplicated genomes. Most synteny block reconstruction algorithms use genes shared over all genomes to construct the synteny blocks for multiple genomes. However, the number of genes shared among all genomes quickly decreases with the increase in the number of genomes. RESULTS We propose the Duplications and Rearrangements In Multiple Mammals (DRIMM)-Synteny algorithm to address this bottleneck and apply it to analyzing genomic architectures of yeast, plant and mammalian genomes. We further combine synteny block generation with rearrangement analysis to reconstruct the ancestral preduplicated yeast genome. CONTACT kspham@cs.ucsd.edu SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Son K Pham
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, California, USA.
113
Holding DR, Meeley RB, Hazebroek J, Selinger D, Gruis F, Jung R, Larkins BA. Identification and characterization of the maize arogenate dehydrogenase gene family. J Exp Bot 2010; 61:3663-73. [PMID: 20558569 PMCID: PMC2921203 DOI: 10.1093/jxb/erq179] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Received: 04/22/2010] [Revised: 05/27/2010] [Accepted: 05/28/2010] [Indexed: 05/18/2023]
Abstract
In plants, the amino acids tyrosine and phenylalanine are synthesized from arogenate by arogenate dehydrogenase and arogenate dehydratase, respectively, with the relative flux to each being tightly controlled. Here the characterization of a maize opaque endosperm mutant (mto140), which also shows retarded vegetative growth, is described. The opaque phenotype co-segregates with a Mutator transposon insertion in an arogenate dehydrogenase gene (zmAroDH-1), and this led to the characterization of the four-member family of maize arogenate dehydrogenase genes (zmAroDH-1 to zmAroDH-4), which share highly similar sequences. A Mutator insertion at an equivalent position in AroDH-3, the most closely related family member to AroDH-1, is also associated with opaque endosperm and stunted vegetative growth phenotypes. Overlapping but differential expression patterns, as well as subtle mutant effects on the accumulation of tyrosine and phenylalanine in endosperm, embryo, and leaf tissues, suggest that the functional redundancy of this gene family provides metabolic plasticity for the synthesis of these important amino acids. mto140/arodh-1 seeds show a general reduction in zein storage protein accumulation and an elevated lysine phenotype typical of other opaque endosperm mutants, but the mutant is distinct because its phenotype does not result from quantitative or qualitative defects in the accumulation of specific zeins but rather from a disruption in amino acid biosynthesis.
Affiliation(s)
- David R Holding
- Center for Plant Science Innovation, University of Nebraska, 1901 Vine St., Lincoln, NE 68588, USA.
|
114
|
Li Q, Li L, Yang X, Warburton ML, Bai G, Dai J, Li J, Yan J. Relationship, evolutionary fate and function of two maize co-orthologs of rice GW2 associated with kernel size and weight. BMC Plant Biology 2010; 10:143. [PMID: 20626916 PMCID: PMC3017803 DOI: 10.1186/1471-2229-10-143] [Citation(s) in RCA: 124] [Impact Index Per Article: 8.9] [Received: 01/31/2010] [Accepted: 07/14/2010] [Indexed: 05/18/2023]
Abstract
BACKGROUND: In rice, the GW2 gene, found on chromosome 2, controls grain width and weight. Two homologs of this gene, ZmGW2-CHR4 and ZmGW2-CHR5, have been found in maize. In this study, we investigated the relationship, evolutionary fate and putative function of these two maize genes.
RESULTS: The two genes are located on duplicated maize chromosomal regions that show co-orthologous relationships with the rice region containing GW2. ZmGW2-CHR5 is more closely related to the sorghum counterpart than to ZmGW2-CHR4. Sequence comparisons between the two genes in eight diverse maize inbred lines revealed that the functional protein domain of both genes is completely conserved, with no non-synonymous polymorphisms identified. This suggests that both genes may have conserved functions, a hypothesis that was further confirmed through linkage, association, and expression analyses. Linkage analysis showed that ZmGW2-CHR4 is located within a consistent quantitative trait locus (QTL) for one-hundred kernel weight (HKW). Association analysis with a diverse panel of 121 maize inbred lines identified one single nucleotide polymorphism (SNP) in the promoter region of ZmGW2-CHR4 that was significantly associated with kernel width (KW) and HKW across all three field experiments examined in this study. SNPs or insertion/deletion polymorphisms (InDels) in other regions of ZmGW2-CHR4 and ZmGW2-CHR5 were also found to be significantly associated with at least one of the four yield-related traits (kernel length (KL), kernel thickness (KT), KW and HKW). None of the polymorphisms in either maize gene are similar to each other or to the 1 bp InDel causing phenotypic variation in rice. Expression levels of both maize genes vary over ear and kernel developmental stages, and the expression level of ZmGW2-CHR4 is significantly negatively correlated with KW.
CONCLUSIONS: The sequence, linkage, association and expression analyses collectively showed that the two maize genes represent chromosomal duplicates, both of which function to control some of the phenotypic variation for kernel size and weight in maize, as does their counterpart in rice. However, the different polymorphisms identified in the two maize genes and in the rice gene indicate that they may cause phenotypic variation through different mechanisms.
Affiliation(s)
- Qing Li
- National Maize Improvement Center of China, Key Laboratory of Crop Genomics and Genetic Improvement (Ministry of Agriculture), China Agricultural University, 100193 Beijing, China
- Lin Li
- National Maize Improvement Center of China, Key Laboratory of Crop Genomics and Genetic Improvement (Ministry of Agriculture), China Agricultural University, 100193 Beijing, China
- Xiaohong Yang
- National Maize Improvement Center of China, Key Laboratory of Crop Genomics and Genetic Improvement (Ministry of Agriculture), China Agricultural University, 100193 Beijing, China
- Marilyn L Warburton
- USDA-ARS Corn Host Plant Resistance Research Unit, Box 9555, Mississippi State, MS 39762
- Guanghong Bai
- National Maize Improvement Center of China, Key Laboratory of Crop Genomics and Genetic Improvement (Ministry of Agriculture), China Agricultural University, 100193 Beijing, China
- College of Agriculture, Xinjiang Agricultural University, Urumqi, 830052 Xinjiang, China
- Jingrui Dai
- National Maize Improvement Center of China, Key Laboratory of Crop Genomics and Genetic Improvement (Ministry of Agriculture), China Agricultural University, 100193 Beijing, China
- Jiansheng Li
- National Maize Improvement Center of China, Key Laboratory of Crop Genomics and Genetic Improvement (Ministry of Agriculture), China Agricultural University, 100193 Beijing, China
- Jianbing Yan
- National Maize Improvement Center of China, Key Laboratory of Crop Genomics and Genetic Improvement (Ministry of Agriculture), China Agricultural University, 100193 Beijing, China
- International Maize and Wheat Improvement Center (CIMMYT), Apdo. Postal 6-641, 06600 Mexico, D.F., Mexico
|
115
|
Mahmood K, Konagurthu AS, Song J, Buckle AM, Webb GI, Whisstock JC. EGM: encapsulated gene-by-gene matching to identify gene orthologs and homologous segments in genomes. Bioinformatics 2010; 26:2076-84. [DOI: 10.1093/bioinformatics/btq339] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 02/06/2023] Open
|
116
|
Shen C, Bai Y, Wang S, Zhang S, Wu Y, Chen M, Jiang D, Qi Y. Expression profile of PIN, AUX/LAX and PGP auxin transporter gene families in Sorghum bicolor under phytohormone and abiotic stress. FEBS J 2010; 277:2954-69. [PMID: 20528920 DOI: 10.1111/j.1742-4658.2010.07706.x] [Citation(s) in RCA: 83] [Impact Index Per Article: 5.9] [Indexed: 01/10/2023]
Abstract
Auxin is transported by the influx carriers auxin resistant 1/like aux1 (AUX/LAX), and the efflux carriers pin-formed (PIN) and P-glycoprotein (PGP), which play a major role in polar auxin transport. Several auxin transporter genes have been characterized in dicotyledonous Arabidopsis, but most are unknown in monocotyledons, especially in sorghum. Here, we analyze the chromosome distribution, gene duplication and intron/exon structure of the SbPIN, SbLAX and SbPGP gene families, and examine their phylogenetic relationships in Arabidopsis, rice and sorghum. Real-time PCR analysis demonstrated that most of these genes were differentially expressed in the organs of sorghum. SbPIN3 and SbPIN9 were highly expressed in flowers, SbLAX2 and SbPGP17 were mainly expressed in stems, and SbPGP7 was strongly expressed in roots. This suggests that individual genes might participate in specific organ development. The expression profiles of these gene families were analyzed after treatment with: (a) the phytohormones indole-3-acetic acid and brassinosteroid; (b) the polar auxin transport inhibitors 1-naphthoxyacetic acid, 1-naphthylphthalamic acid and 2,3,5-triiodobenzoic acid; and (c) abscisic acid and the abiotic stresses of high salinity and drought. Most of the auxin transporter genes were strongly induced by indole-3-acetic acid and brassinosteroid, providing new evidence for the synergism of these phytohormones. Interestingly, most genes showed similar trends in expression under polar auxin transport inhibitors and each also responded to abscisic acid, salt and drought. This study provides new insights into the auxin transporters of sorghum.
Affiliation(s)
- ChenJia Shen
- State Key Laboratory of Plant Physiology and Biochemistry, Zhejiang University, Hangzhou, China
|
117
|
Wang S, Bai Y, Shen C, Wu Y, Zhang S, Jiang D, Guilfoyle TJ, Chen M, Qi Y. Auxin-related gene families in abiotic stress response in Sorghum bicolor. Funct Integr Genomics 2010; 10:533-46. [PMID: 20499123 DOI: 10.1007/s10142-010-0174-3] [Citation(s) in RCA: 174] [Impact Index Per Article: 12.4] [Received: 01/01/2010] [Revised: 04/20/2010] [Accepted: 04/27/2010] [Indexed: 10/19/2022]
Abstract
Sorghum, a C4 model plant, has been studied to develop an understanding of the molecular mechanism of resistance to stress. The auxin-response genes, auxin/indole-3-acetic acid (Aux/IAA), auxin-response factor (ARF), Gretchen Hagen3 (GH3), small auxin-up RNAs, and lateral organ boundaries (LBD), are involved in growth/development and stress/defense responses in Arabidopsis and rice, but they have not been studied in sorghum. In the present paper, the chromosome distribution, gene duplication, promoters, intron/exon structure, and phylogenetic relationships of Aux/IAA, ARF, GH3, and LBD genes in sorghum are presented. Furthermore, real-time PCR analysis demonstrated that these genes are differentially expressed in the leaf and root of sorghum and indicated the expression profile of these gene families under IAA, brassinosteroid (BR), salt, and drought treatments. The SbGH3 and SbLBD genes, expressed at low levels under natural conditions, were highly induced by salt and drought stress, consistent with their products being involved in both abiotic stresses. Three genes, SbIAA1, SbGH3-13, and SbLBD32, were highly induced under all four treatments: IAA, BR, salt, and drought. The analysis provided new evidence for the role of auxin in stress response and implied that there is cross talk between auxin, BR, and abiotic stress signaling pathways.
Affiliation(s)
- SuiKang Wang
- State Key Laboratory of Plant Physiology and Biochemistry, College of Life Sciences, Zhejiang University, Zijingang Campus, Hangzhou 310058, China
|
118
|
Rödelsperger C, Dieterich C. CYNTENATOR: progressive gene order alignment of 17 vertebrate genomes. PLoS One 2010; 5:e8861. [PMID: 20126624 PMCID: PMC2812507 DOI: 10.1371/journal.pone.0008861] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.8] [Received: 08/20/2009] [Accepted: 12/23/2009] [Indexed: 01/21/2023] Open
Abstract
Whole genome gene order evolution in higher eukaryotes was initially considered as a random process. Gene order conservation or conserved synteny was seen as a feature of common descent and did not imply the existence of functional constraints. This view had to be revised in the light of results from sequencing dozens of vertebrate genomes. It became apparent that other factors exist that constrain gene order in some genomic regions over long evolutionary time periods. Outside of these regions, genomes diverge more rapidly in terms of gene content and order. We have developed CYNTENATOR, a progressive gene order alignment software, to identify genomic regions of conserved synteny over a large set of diverging species. CYNTENATOR does not depend on nucleotide-level alignments and a priori homology assignment. Our software implements an improved scoring function that utilizes the underlying phylogeny. In this manuscript, we report on our progressive gene order alignment approach and give a comparison to previous software and an analysis of 17 vertebrate genomes for conservation in gene order. CYNTENATOR has a runtime complexity of and a space complexity of with being the gene number in a genome. CYNTENATOR performs as well as state-of-the-art software on simulated pairwise gene order comparisons, but is the only algorithm that works in practice for aligning dozens of vertebrate-sized gene orders. Lineage-specific characterization of gene order across 17 vertebrate genomes revealed mechanisms for maintaining conserved synteny such as enhancers and coregulation by bidirectional promoters. Genes outside conserved synteny blocks show enrichments for genes involved in responses to external stimuli such as immunity and olfactory response in primate genome comparisons. We even see significant gene ontology term enrichments for breakpoint regions of ancestral nodes close to the root of the phylogeny. Additionally, our analysis of transposable elements has revealed a significant accumulation of LINE-1 elements in mammalian breakpoint regions. In summary, CYNTENATOR is a flexible and scalable tool for the identification of conserved gene orders across multiple species over long evolutionary distances.
Affiliation(s)
- Christian Rödelsperger
- Institute for Medical Genetics, Charité-Universitätsmedizin, Berlin, Germany
- Max Planck Institute for Molecular Genetics, Berlin, Germany
- Christoph Dieterich
- Bioinformatics in Quantitative Biology, Berlin Institute for Medical Systems Biology, Berlin, Germany
|
119
|
Ammiraju JS, Song X, Luo M, Sisneros N, Angelova A, Kudrna D, Kim H, Yu Y, Goicoechea JL, Lorieux M, Kurata N, Brar D, Ware D, Jackson S, Wing RA. The Oryza BAC resource: a genus-wide and genome scale tool for exploring rice genome evolution and leveraging useful genetic diversity from wild relatives. Breeding Science 2010; 60:536-543. [DOI: 10.1270/jsbbs.60.536] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Indexed: 05/24/2023]
Affiliation(s)
- Jetty S.S. Ammiraju
- Arizona Genomics Institute, School of Plant Sciences, BIO5 Institute, University of Arizona
- Xiang Song
- Arizona Genomics Institute, School of Plant Sciences, BIO5 Institute, University of Arizona
- Meizhong Luo
- College of Life Sciences and Technology, Huazhong Agricultural University
- Nicholas Sisneros
- Arizona Genomics Institute, School of Plant Sciences, BIO5 Institute, University of Arizona
- Angelina Angelova
- Arizona Genomics Institute, School of Plant Sciences, BIO5 Institute, University of Arizona
- David Kudrna
- Arizona Genomics Institute, School of Plant Sciences, BIO5 Institute, University of Arizona
- HyeRan Kim
- Plant Genomics Institute, Chungnam National University
- Yeisoo Yu
- Arizona Genomics Institute, School of Plant Sciences, BIO5 Institute, University of Arizona
- Jose Luis Goicoechea
- Arizona Genomics Institute, School of Plant Sciences, BIO5 Institute, University of Arizona
- Mathias Lorieux
- Agrobiodiversity and Biotechnology Project, International Center for Tropical Agriculture (CIAT)
- Darshan Brar
- Department of Plant Breeding and Genetics, International Rice Research Institute (IRRI)
- Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor
- USDA-ARS NAA Plant, Soil and Nutrition Laboratory Research Unit
- Rod A. Wing
- Arizona Genomics Institute, School of Plant Sciences, BIO5 Institute, University of Arizona
|
120
|
Sequencing, mapping, and analysis of 27,455 maize full-length cDNAs. PLoS Genet 2009; 5:e1000740. [PMID: 19936069 PMCID: PMC2774520 DOI: 10.1371/journal.pgen.1000740] [Citation(s) in RCA: 132] [Impact Index Per Article: 8.8] [Received: 07/03/2009] [Accepted: 10/24/2009] [Indexed: 11/29/2022] Open
Abstract
Full-length cDNA (FLcDNA) sequencing establishes the precise primary structure of individual gene transcripts. From two libraries representing 27 B73 tissues and abiotic stress treatments, 27,455 high-quality FLcDNAs were sequenced. The average transcript length was 1.44 kb including 218 bases and 321 bases of 5′ and 3′ UTR, respectively, with 8.6% of the FLcDNAs encoding predicted proteins of fewer than 100 amino acids. Approximately 94% of the FLcDNAs were stringently mapped to the maize genome. Although nearly two-thirds of this genome is composed of transposable elements (TEs), only 5.6% of the FLcDNAs contained TE sequences in coding or UTR regions. Approximately 7.2% of the FLcDNAs are putative transcription factors, suggesting that rare transcripts are well-enriched in our FLcDNA set. Protein similarity searching identified 1,737 maize transcripts not present in rice, sorghum, Arabidopsis, or poplar annotated genes. A strict FLcDNA assembly generated 24,467 non-redundant sequences, of which 88% have non-maize protein matches. The FLcDNAs were also assembled with 41,759 FLcDNAs in GenBank from other projects, where semi-strict parameters were used to identify 13,368 potentially unique non-redundant sequences from this project. The libraries, ESTs, and FLcDNA sequences produced from this project are publicly available. The annotated EST and FLcDNA assemblies are available through the maize FLcDNA web resource (www.maizecdna.org).
To complement the completion of sequencing the maize B73 genome, we sequenced 27,455 full-length cDNAs (FLcDNA) from two maize B73 libraries representing the gene transcripts from most tissues and common abiotic stress conditions. The FLcDNAs are beneficial in determining the exon/intron structure of genes by aligning them to the sequenced genome; 94% of our FLcDNAs aligned to the maize genome. The 27,455 FLcDNAs were compared to gene sequences for rice, sorghum, Arabidopsis, and poplar; 22,874 were found in all four sets, and 1,737 were unique to maize. Two-thirds of the maize genome is composed of a type of repetitive sequence called “transposable elements”; only 5.6% of the FLcDNA sequence contained any segment homologous to these repeats. In addition to our set, there are three other sets of maize FLcDNAs for a total of 69,306 gene transcripts, where many of them are from different maize lines (i.e. FLcDNAs often have only slight differences reflecting divergence). We assembled these together using parameters that would allow most alleles and recently diverged gene transcripts to align together, resulting in 46,739 unique gene transcripts.
|
121
|
Wei F, Stein JC, Liang C, Zhang J, Fulton RS, Baucom RS, De Paoli E, Zhou S, Yang L, Han Y, Pasternak S, Narechania A, Zhang L, Yeh CT, Ying K, Nagel DH, Collura K, Kudrna D, Currie J, Lin J, Kim H, Angelova A, Scara G, Wissotski M, Golser W, Courtney L, Kruchowski S, Graves TA, Rock SM, Adams S, Fulton LA, Fronick C, Courtney W, Kramer M, Spiegel L, Nascimento L, Kalyanaraman A, Chaparro C, Deragon JM, Miguel PS, Jiang N, Wessler SR, Green PJ, Yu Y, Schwartz DC, Meyers BC, Bennetzen JL, Martienssen RA, McCombie WR, Aluru S, Clifton SW, Schnable PS, Ware D, Wilson RK, Wing RA. Detailed analysis of a contiguous 22-Mb region of the maize genome. PLoS Genet 2009; 5:e1000728. [PMID: 19936048 PMCID: PMC2773423 DOI: 10.1371/journal.pgen.1000728] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.5] [Received: 08/21/2009] [Accepted: 10/16/2009] [Indexed: 12/20/2022] Open
Abstract
Most of our understanding of plant genome structure and evolution has come from the careful annotation of small (e.g., 100 kb) sequenced genomic regions or from automated annotation of complete genome sequences. Here, we sequenced and carefully annotated a contiguous 22 Mb region of maize chromosome 4 using an improved pseudomolecule for annotation. The sequence segment was comprehensively ordered, oriented, and confirmed using the maize optical map. Nearly 84% of the sequence is composed of transposable elements (TEs) that are mostly nested within each other, of which most families are low-copy. We identified 544 gene models using multiple levels of evidence, as well as five miRNA genes. Gene fragments, many captured by TEs, are prevalent within this region. Elimination of gene redundancy from a tetraploid maize ancestor that originated a few million years ago is responsible in this region for most disruptions of synteny with sorghum and rice. Consistent with other sub-genomic analyses in maize, small RNA mapping showed that many small RNAs match TEs and that most TEs match small RNAs. These results, performed on approximately 1% of the maize genome, demonstrate the feasibility of refining the B73 RefGen_v1 genome assembly by incorporating optical map, high-resolution genetic map, and comparative genomic data sets. Such improvements, along with those of gene and repeat annotation, will serve to promote future functional genomic and phylogenomic research in maize and other grasses.
Affiliation(s)
- Fusheng Wei
- Arizona Genomics Institute, School of Plant Sciences and Department of Ecology and Evolutionary Biology, BIO5 Institute for Collaborative Research, University of Arizona, Tucson, Arizona, United States of America
- Joshua C. Stein
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Chengzhi Liang
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Jianwei Zhang
- Arizona Genomics Institute, School of Plant Sciences and Department of Ecology and Evolutionary Biology, BIO5 Institute for Collaborative Research, University of Arizona, Tucson, Arizona, United States of America
- Robert S. Fulton
- The Genome Center and Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Regina S. Baucom
- Department of Genetics, University of Georgia, Athens, Georgia, United States of America
- Emanuele De Paoli
- Department of Plant and Soil Sciences and Delaware Biotechnology Institute, University of Delaware, Newark, Delaware, United States of America
- Shiguo Zhou
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, University of Wisconsin Madison, Madison, Wisconsin, United States of America
- Lixing Yang
- Department of Genetics, University of Georgia, Athens, Georgia, United States of America
- Yujun Han
- Department of Plant Biology, University of Georgia, Athens, Georgia, United States of America
- Shiran Pasternak
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Apurva Narechania
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Lifang Zhang
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Cheng-Ting Yeh
- Department of Agronomy and Center for Plant Genomics, Iowa State University, Ames, Iowa, United States of America
- Kai Ying
- Department of Agronomy and Center for Plant Genomics, Iowa State University, Ames, Iowa, United States of America
- Dawn H. Nagel
- Department of Plant Biology, University of Georgia, Athens, Georgia, United States of America
- Kristi Collura
- Arizona Genomics Institute, School of Plant Sciences and Department of Ecology and Evolutionary Biology, BIO5 Institute for Collaborative Research, University of Arizona, Tucson, Arizona, United States of America
- David Kudrna
- Arizona Genomics Institute, School of Plant Sciences and Department of Ecology and Evolutionary Biology, BIO5 Institute for Collaborative Research, University of Arizona, Tucson, Arizona, United States of America
- Jennifer Currie
- Arizona Genomics Institute, School of Plant Sciences and Department of Ecology and Evolutionary Biology, BIO5 Institute for Collaborative Research, University of Arizona, Tucson, Arizona, United States of America
- Jinke Lin
- Arizona Genomics Institute, School of Plant Sciences and Department of Ecology and Evolutionary Biology, BIO5 Institute for Collaborative Research, University of Arizona, Tucson, Arizona, United States of America
- HyeRan Kim
- Arizona Genomics Institute, School of Plant Sciences and Department of Ecology and Evolutionary Biology, BIO5 Institute for Collaborative Research, University of Arizona, Tucson, Arizona, United States of America
- Angelina Angelova
- Arizona Genomics Institute, School of Plant Sciences and Department of Ecology and Evolutionary Biology, BIO5 Institute for Collaborative Research, University of Arizona, Tucson, Arizona, United States of America
- Gabriel Scara
- Arizona Genomics Institute, School of Plant Sciences and Department of Ecology and Evolutionary Biology, BIO5 Institute for Collaborative Research, University of Arizona, Tucson, Arizona, United States of America
- Marina Wissotski
- Arizona Genomics Institute, School of Plant Sciences and Department of Ecology and Evolutionary Biology, BIO5 Institute for Collaborative Research, University of Arizona, Tucson, Arizona, United States of America
- Wolfgang Golser
- Arizona Genomics Institute, School of Plant Sciences and Department of Ecology and Evolutionary Biology, BIO5 Institute for Collaborative Research, University of Arizona, Tucson, Arizona, United States of America
- Laura Courtney
- The Genome Center and Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Scott Kruchowski
- The Genome Center and Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Tina A. Graves
- The Genome Center and Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Susan M. Rock
- The Genome Center and Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Stephanie Adams
- The Genome Center and Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Lucinda A. Fulton
- The Genome Center and Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Catrina Fronick
- The Genome Center and Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
- William Courtney
- The Genome Center and Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Melissa Kramer
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Lori Spiegel
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Lydia Nascimento
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Ananth Kalyanaraman
- School of Electrical Engineering and Computer Science, Washington State University, Pullman, Washington, United States of America
- Cristian Chaparro
- Université de Perpignan Via Domitia, CNRS UMR 5096, Perpignan, France
- Jean-Marc Deragon
- Université de Perpignan Via Domitia, CNRS UMR 5096, Perpignan, France
- Phillip San Miguel
- Department of Horticulture and Landscape Architecture, Purdue University, West Lafayette, Indiana, United States of America
- Ning Jiang
- Department of Horticulture, Michigan State University, East Lansing, Michigan, United States of America
- Susan R. Wessler
- Department of Plant Biology, University of Georgia, Athens, Georgia, United States of America
- Pamela J. Green
- Department of Plant and Soil Sciences and Delaware Biotechnology Institute, University of Delaware, Newark, Delaware, United States of America
- Yeisoo Yu
- Arizona Genomics Institute, School of Plant Sciences and Department of Ecology and Evolutionary Biology, BIO5 Institute for Collaborative Research, University of Arizona, Tucson, Arizona, United States of America
- David C. Schwartz
- Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, University of Wisconsin Madison, Madison, Wisconsin, United States of America
- Blake C. Meyers
- Department of Plant and Soil Sciences and Delaware Biotechnology Institute, University of Delaware, Newark, Delaware, United States of America
- Jeffrey L. Bennetzen
- Department of Genetics, University of Georgia, Athens, Georgia, United States of America
- Robert A. Martienssen
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- W. Richard McCombie
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Srinivas Aluru
- Department of Electrical and Computer Engineering, Iowa State University, Ames, Iowa, United States of America
- Sandra W. Clifton
- The Genome Center and Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Patrick S. Schnable
- Department of Agronomy and Center for Plant Genomics, Iowa State University, Ames, Iowa, United States of America
- Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Richard K. Wilson
- The Genome Center and Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Rod A. Wing
- Arizona Genomics Institute, School of Plant Sciences and Department of Ecology and Evolutionary Biology, BIO5 Institute for Collaborative Research, University of Arizona, Tucson, Arizona, United States of America
|
122
|
Gu YQ, Ma Y, Huo N, Vogel JP, You FM, Lazo GR, Nelson WM, Soderlund C, Dvorak J, Anderson OD, Luo MC. A BAC-based physical map of Brachypodium distachyon and its comparative analysis with rice and wheat. BMC Genomics 2009; 10:496. [PMID: 19860896 PMCID: PMC2774330 DOI: 10.1186/1471-2164-10-496] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.6] [Received: 04/28/2009] [Accepted: 10/27/2009] [Indexed: 11/13/2022] Open
Abstract
Background: Brachypodium distachyon (Brachypodium) has been recognized as a new model species for comparative and functional genomics of cereal and bioenergy crops because it possesses many biological attributes desirable in a model, such as a small genome size, short stature, self-pollinating habit, and short generation cycle. To maximize the utility of Brachypodium as a model for basic and applied research it is necessary to develop genomic resources for it. A BAC-based physical map is one of them. A physical map will facilitate analysis of genome structure, comparative genomics, and assembly of the entire genome sequence. Results: A total of 67,151 Brachypodium BAC clones were fingerprinted with the SNaPshot HICF fingerprinting method and a genome-wide physical map of the Brachypodium genome was constructed. The map consisted of 671 contigs, and 2,161 clones remained as singletons. The contigs and singletons spanned 414 Mb. A total of 13,970 gene-related sequences were detected in the BAC end sequences (BES). These gene tags aligned 345 contigs with 336 Mb of rice genome sequence, showing that Brachypodium and rice genomes are generally highly colinear. Divergent regions were mainly in the rice centromeric regions. A dot-plot of Brachypodium contigs against the rice genome sequences revealed remnants of the whole-genome duplication caused by paleotetraploidy, which were previously found in rice and sorghum. Brachypodium contigs were anchored to the wheat deletion bin maps with the BES gene-tags, opening the door to Brachypodium-Triticeae comparative genomics. Conclusion: The construction of the Brachypodium physical map and its comparison with the rice genome sequence demonstrated the utility of the SNaPshot-HICF method in the construction of BAC-based physical maps. The map represents an important genomic resource for the completion of the Brachypodium genome sequence and for grass comparative genomics.
A draft of the physical map and its comparisons with rice and wheat are available at .
Affiliation(s)
- Yong Q Gu
- Genomics and Gene Discovery Research Unit, USDA-ARS, Western Regional Research Center, 800 Buchanan Street, Albany, CA 94710, USA.
|
123
|
Ng MP, Vergara IA, Frech C, Chen Q, Zeng X, Pei J, Chen N. OrthoClusterDB: an online platform for synteny blocks. BMC Bioinformatics 2009; 10:192. [PMID: 19549318 PMCID: PMC2711082 DOI: 10.1186/1471-2105-10-192] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/10/2009] [Accepted: 06/23/2009] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND The recent availability of an expanding collection of genome sequences driven by technological advances has facilitated comparative genomics and in particular the identification of synteny among multiple genomes. However, the development of effective and easy-to-use methods for identifying such conserved gene clusters among multiple genomes (synteny blocks), as well as of databases that host synteny blocks from various groups of species (especially eukaryotes) and also allow users to run synteny-identification programs, lags behind. DESCRIPTIONS OrthoClusterDB is a new online platform for the identification and visualization of synteny blocks. OrthoClusterDB consists of two key web pages: Run OrthoCluster and View Synteny. The Run OrthoCluster page serves as web front-end to OrthoCluster, a recently developed program for synteny block detection. Run OrthoCluster offers full control over the functionalities of OrthoCluster, such as specifying synteny block size, considering order and strandedness of genes within synteny blocks, including or excluding nested synteny blocks, handling one-to-many orthologous relationships, and comparing multiple genomes. In contrast, the View Synteny page gives access to perfect and imperfect synteny blocks precomputed for a large number of genomes, without the need for users to retrieve and format input data. Additionally, genes are cross-linked with public databases for effective browsing. For both Run OrthoCluster and View Synteny, identified synteny blocks can be browsed at the whole genome, chromosome, and individual gene level. OrthoClusterDB is freely accessible. CONCLUSION We have developed an online system for the identification and visualization of synteny blocks among multiple genomes. The system is freely available at http://genome.sfu.ca/orthoclusterdb/.
Affiliation(s)
- Man-Ping Ng
- Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, Canada.
|
124
|
Hachiya T, Osana Y, Popendorf K, Sakakibara Y. Accurate identification of orthologous segments among multiple genomes. Bioinformatics 2009; 25:853-60. [DOI: 10.1093/bioinformatics/btp070] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
|
125
|
Abstract
Recent advances in both clone fingerprinting and draft sequencing technology have made it increasingly common for species to have a bacterial artificial chromosome (BAC) fingerprint map, BAC end sequences (BESs) and draft genomic sequence. The FPC (fingerprinted contigs) software package contains three modules that maximize the value of these resources. The BSS (blast some sequence) module provides a way to easily view the results of aligning draft sequence to the BESs, and integrates the results with the following two modules. The MTP (minimal tiling path) module uses sequence and fingerprints to determine a minimal tiling path of clones. The DSI (draft sequence integration) module aligns draft sequences to FPC contigs, displays them alongside the contigs and identifies potential discrepancies; the alignment can be based on either individual BES alignments to the draft, or on the locations of BESs that have been assembled into the draft. FPC also supports high-throughput fingerprint map generation as its time-intensive functions have been parallelized for Unix-based desktops or servers with multiple CPUs. Simulation results are provided for the MTP, DSI and parallelization. These features are in the FPC V9.3 software package, which is freely available.
Affiliation(s)
- William Nelson
- Arizona Genomics Computational Laboratory, BIO5 Institute, University of Arizona, Tucson, AZ, USA
|
126
|
Peng Q, Alekseyev MA, Tesler G, Pevzner PA. Decoding Synteny Blocks and Large-Scale Duplications in Mammalian and Plant Genomes. LECTURE NOTES IN COMPUTER SCIENCE 2009. [DOI: 10.1007/978-3-642-04241-6_19] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/26/2023]
|
127
|
Ling X, He X, Xin D, Han J, Han J. Efficiently identifying max-gap clusters in pairwise genome comparison. J Comput Biol 2008; 15:593-609. [PMID: 18631023 DOI: 10.1089/cmb.2008.0010] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
The spatial clustering of genes across different genomes has been used to study important problems in comparative genomics, from identification of operons to detection of homologous regions. A set of formal models and algorithms of so-called max-gap clusters have been proposed recently. These algorithms guarantee the completeness of the results, and the simplicity of the model enables a rigorous statistical test of significance. These features overcome the weakness of many previous methods, which are often heuristic in nature. We developed a very efficient algorithm to compute max-gap clusters in pairwise genome comparison. Our algorithm is an order-of-magnitude faster than the previous algorithm based on the same model under a number of different settings. In our evaluation on two bacterial genomes, we showed that our method could identify known operons as well as some novel structures in the genome. We also demonstrated that the current framework for conserved spatial clustering of genes can be used to detect homologous regions in higher organisms, through the comparison of human and mouse genomes.
Affiliation(s)
- Xu Ling
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA.
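The max-gap cluster model summarized in the abstract above can be illustrated with a small sketch. This is not the authors' efficient algorithm; it is a naive top-down refinement, written under the assumptions that each genome is given as an ordered list of unique gene names, that shared names denote one-to-one homologs, and that a "gap" is the number of intervening genes between neighbouring cluster members. The function names and the `min_size` parameter are hypothetical.

```python
def split_runs(genes, pos, max_gap):
    """Split genes into runs in which neighbouring members (ordered by pos)
    are separated by at most max_gap intervening genes."""
    ordered = sorted(genes, key=pos.__getitem__)
    runs, current = [], [ordered[0]]
    for gene in ordered[1:]:
        if pos[gene] - pos[current[-1]] - 1 <= max_gap:
            current.append(gene)
        else:
            runs.append(current)
            current = [gene]
    runs.append(current)
    return runs


def max_gap_clusters(order_a, order_b, max_gap, min_size=2):
    """Naive max-gap clustering of two gene orders (illustrative only).

    A gene set is reported when it cannot be split by the gap rule in
    either genome; otherwise its pieces are refined further.
    """
    pos_a = {g: i for i, g in enumerate(order_a)}
    pos_b = {g: i for i, g in enumerate(order_b)}
    shared = [g for g in order_a if g in pos_b]
    clusters = []
    pending = [shared] if shared else []
    while pending:
        genes = pending.pop()
        # Split by genome A first, then split each run by genome B.
        pieces = [piece
                  for run in split_runs(genes, pos_a, max_gap)
                  for piece in split_runs(run, pos_b, max_gap)]
        if len(pieces) == 1:
            if len(genes) >= min_size:
                clusters.append(set(genes))
        else:
            pending.extend(pieces)  # every piece is strictly smaller
    return clusters


genome_a = ["a", "b", "c", "x", "y", "d", "e"]
genome_b = ["c", "a", "b", "p", "q", "r", "e", "d"]
clusters = max_gap_clusters(genome_a, genome_b, max_gap=1)
```

With `max_gap=1` the toy genomes yield two clusters, {a, b, c} and {d, e}; gene order inside a cluster is free to differ between genomes, which is the defining property of the model. The repeated splitting is quadratic in the worst case, which is the kind of cost an optimized algorithm would avoid.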
|
128
|
Fan C, Zhang Y, Yu Y, Rounsley S, Long M, Wing RA. The subtelomere of Oryza sativa chromosome 3 short arm as a hot bed of new gene origination in rice. MOLECULAR PLANT 2008; 1:839-50. [PMID: 19825586 PMCID: PMC2902912 DOI: 10.1093/mp/ssn050] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/27/2008] [Accepted: 07/15/2008] [Indexed: 05/22/2023]
Abstract
Despite general observations of non-random genomic distribution of new genes, it is unclear whether or not new genes preferentially occur in certain genomic regions driven by related molecular mechanisms. Using 1.5 Mb of genomic sequences from short arms of chromosome 3 of Oryza glaberrima and O. punctata, we conducted a comparative genomic analysis with the reference O. sativa ssp. japonica genome. We identified a 60-kb segment located in the middle of the subtelomeric region of chromosome 3, which is unique to the species O. sativa. The region contained gene duplicates that occurred in Asian cultivated rice species that diverged from the ancestor of Asian and African cultivated rice one million years ago (MYA). For the 12 genes and one complete retrotransposon identified in this segment in O. sativa ssp. japonica, we searched for their parental genes. The high similarity between duplicated paralogs further supports the recent origination of these genes. We found that this segment was recently generated through multiple independent gene recombination and transposon insertion events. Among the 12 genes, we found that five had chimeric gene structures derived from multiple parental genes. Nine out of the 12 new genes seem to be functional, as suggested by Ka/Ks analysis and the presence of cDNA and/or MPSS data. Furthermore, for the eight transcribed genes, at least two genes could be classified as defense or stress response-related genes. Given these findings, and the fact that subtelomeres are associated with high rates of recombination and transcription, it is likely that subtelomeres may facilitate gene recombination and transposon insertions and serve as hot spots for new gene origination in rice genomes.
Affiliation(s)
- Chuanzhu Fan
- Arizona Genomics Institute, Department of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA
- Yong Zhang
- Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
- Yeisoo Yu
- Arizona Genomics Institute, Department of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA
- Steve Rounsley
- BIO5 Institute for Collaborative Research, University of Arizona, Tucson, AZ 85721, USA
- Manyuan Long
- Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
- To whom correspondence should be addressed. E-mail , fax 773-702-9740, tel. 773-702-0557. E-mail , fax 520-621-1259, tel. 520-626-9595
- Rod A. Wing
- Arizona Genomics Institute, Department of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA
- To whom correspondence should be addressed. E-mail , fax 773-702-9740, tel. 773-702-0557. E-mail , fax 520-621-1259, tel. 520-626-9595
|
129
|
Lehmann J, Stadler PF, Prohaska SJ. SynBlast: assisting the analysis of conserved synteny information. BMC Bioinformatics 2008; 9:351. [PMID: 18721485 PMCID: PMC2543028 DOI: 10.1186/1471-2105-9-351] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2008] [Accepted: 08/24/2008] [Indexed: 01/06/2023] Open
Abstract
Motivation In recent years more than 20 vertebrate genomes have been sequenced, and the rate at which genomic DNA information becomes available is rapidly accelerating. Gene duplication and gene loss events inherently limit the accuracy of orthology detection based on sequence similarity alone. Fully automated methods for orthology annotation do exist but often fail to identify individual members in cases of large gene families, or to distinguish missing data from traceable gene losses. This situation can be improved in many cases by including conserved synteny information. Results Here we present the SynBlast pipeline, which is designed to construct and evaluate local synteny information. SynBlast uses the genomic region around a focal reference gene to retrieve candidates for homologous regions from a collection of target genomes and ranks them according to the available evidence for homology. The pipeline is intended as a tool to aid high-quality manual annotation, in particular in those cases where automatic procedures fail. We demonstrate how SynBlast is applied to retrieving orthologous and paralogous clusters using the vertebrate Hox and ParaHox clusters as examples. Software The SynBlast package written in Perl is available under the GNU General Public License at .
Affiliation(s)
- Jörg Lehmann
- Bioinformatics Group, Department of Computer Science, University of Leipzig, Härtelstrasse 16-18, D-04107 Leipzig, Germany.
|
130
|
Wei F, Coe E, Nelson W, Bharti AK, Engler F, Butler E, Kim H, Goicoechea JL, Chen M, Lee S, Fuks G, Sanchez-Villeda H, Schroeder S, Fang Z, McMullen M, Davis G, Bowers JE, Paterson AH, Schaeffer M, Gardiner J, Cone K, Messing J, Soderlund C, Wing RA. Physical and genetic structure of the maize genome reflects its complex evolutionary history. PLoS Genet 2007; 3:e123. [PMID: 17658954 PMCID: PMC1934398 DOI: 10.1371/journal.pgen.0030123] [Citation(s) in RCA: 228] [Impact Index Per Article: 14.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2007] [Accepted: 06/11/2007] [Indexed: 11/21/2022] Open
Abstract
Maize (Zea mays L.) is one of the most important cereal crops and a model for the study of genetics, evolution, and domestication. To better understand maize genome organization and to build a framework for genome sequencing, we constructed a sequence-ready fingerprinted contig-based physical map that covers 93.5% of the genome, of which 86.1% is aligned to the genetic map. The fingerprinted contig map contains 25,908 genic markers that enabled us to align nearly 73% of the anchored maize genome to the rice genome. The distribution pattern of expressed sequence tags correlates to that of recombination. In collinear regions, 1 kb in rice corresponds to an average of 3.2 kb in maize, yet maize has a 6-fold genome size expansion. This can be explained by the fact that most rice regions correspond to two regions in maize as a result of its recent polyploid origin. Inversions account for the majority of chromosome structural variations during subsequent maize diploidization. We also find clear evidence of ancient genome duplication predating the divergence of the progenitors of maize and rice. Reconstructing the paleoethnobotany of the maize genome indicates that the progenitors of modern maize contained ten chromosomes.

As a cash crop and a model biological system, maize is of great public interest. To facilitate maize molecular breeding and its basic biology research, we built a high-resolution physical map with two different fingerprinting methods on the same set of bacterial artificial chromosome clones. The physical map was integrated with a high-density genetic map and further serves as a framework for the maize genome-sequencing project. Comparative genomics showed that the euchromatic regions between rice and maize are highly conserved. We physically delimited these conserved regions and thus detected many genome rearrangements. We extensively defined the duplication blocks within the maize genome. These blocks allowed us to reconstruct the chromosomes of the maize progenitor. We found that the maize genome has experienced two rounds of genome duplication, an ancient one before maize–rice divergence and a recent one after tetraploidization.
Affiliation(s)
- Fusheng Wei
- Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
- Ed Coe
- Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
- Plant Genetics Research Unit, Agricultural Research Service, United States Department of Agriculture, Columbia, Missouri, United States of America
- William Nelson
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
- Arizona Genomics Computational Laboratory, University of Arizona, Tucson, Arizona, United States of America
- Arvind K Bharti
- Plant Genome Initiative at Rutgers, Waksman Institute, Rutgers, The State University of New Jersey, Piscataway, New Jersey, United States of America
- Fred Engler
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
- Arizona Genomics Computational Laboratory, University of Arizona, Tucson, Arizona, United States of America
- Ed Butler
- Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
- HyeRan Kim
- Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
- Jose Luis Goicoechea
- Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
- Mingsheng Chen
- Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
- Seunghee Lee
- Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
- Galina Fuks
- Plant Genome Initiative at Rutgers, Waksman Institute, Rutgers, The State University of New Jersey, Piscataway, New Jersey, United States of America
- Hector Sanchez-Villeda
- Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
- Steven Schroeder
- Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
- Zhiwei Fang
- Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
- Michael McMullen
- Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
- Plant Genetics Research Unit, Agricultural Research Service, United States Department of Agriculture, Columbia, Missouri, United States of America
- Georgia Davis
- Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
- John E Bowers
- Plant Genome Mapping Laboratory, Departments of Crop and Soil Science, Plant Biology, and Genetics, University of Georgia, Athens, Georgia, United States of America
- Andrew H Paterson
- Plant Genome Mapping Laboratory, Departments of Crop and Soil Science, Plant Biology, and Genetics, University of Georgia, Athens, Georgia, United States of America
- Mary Schaeffer
- Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
- Plant Genetics Research Unit, Agricultural Research Service, United States Department of Agriculture, Columbia, Missouri, United States of America
- Jack Gardiner
- Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
- Karen Cone
- Division of Biological Sciences, University of Missouri, Columbia, Missouri, United States of America
- Joachim Messing
- Plant Genome Initiative at Rutgers, Waksman Institute, Rutgers, The State University of New Jersey, Piscataway, New Jersey, United States of America
- Carol Soderlund
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
- Arizona Genomics Computational Laboratory, University of Arizona, Tucson, Arizona, United States of America
- * To whom correspondence should be addressed. E-mail: (CS); (RAW)
- Rod A Wing
- Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
- * To whom correspondence should be addressed. E-mail: (CS); (RAW)
|
131
|
Kim H, Hurwitz B, Yu Y, Collura K, Gill N, SanMiguel P, Mullikin JC, Maher C, Nelson W, Wissotski M, Braidotti M, Kudrna D, Goicoechea JL, Stein L, Ware D, Jackson SA, Soderlund C, Wing RA. Construction, alignment and analysis of twelve framework physical maps that represent the ten genome types of the genus Oryza. Genome Biol 2008; 9:R45. [PMID: 18304353 PMCID: PMC2374706 DOI: 10.1186/gb-2008-9-2-r45] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/14/2007] [Revised: 02/12/2008] [Accepted: 02/28/2008] [Indexed: 01/31/2023] Open
Abstract
Bacterial artificial chromosome (BAC) fingerprint and end-sequenced physical maps representing the ten genome types of Oryza are presented. We describe the establishment and analysis of a genus-wide comparative framework composed of 12 bacterial artificial chromosome fingerprint and end-sequenced physical maps representing the 10 genome types of Oryza aligned to the O. sativa ssp. japonica reference genome sequence. Over 932 Mb of end sequence was analyzed for repeats, simple sequence repeats, miRNA and single nucleotide variations, providing the most extensive analysis of Oryza sequence to date.
Affiliation(s)
- HyeRan Kim
- Arizona Genomics Institute, Department of Plant Sciences, University of Arizona, Tucson, Arizona 85721, USA.
|
132
|
Ammiraju JSS, Zuccolo A, Yu Y, Song X, Piegu B, Chevalier F, Walling JG, Ma J, Talag J, Brar DS, SanMiguel PJ, Jiang N, Jackson SA, Panaud O, Wing RA. Evolutionary dynamics of an ancient retrotransposon family provides insights into evolution of genome size in the genus Oryza. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2007; 52:342-51. [PMID: 17764506 DOI: 10.1111/j.1365-313x.2007.03242.x] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/17/2023]
Abstract
Long terminal repeat (LTR) retrotransposons constitute a significant portion of most eukaryote genomes and can dramatically change genome size and organization. Although LTR retrotransposon content variation is well documented, the dynamics of genomic flux caused by their activity are poorly understood on an evolutionary time scale. This is primarily because of the lack of an experimental system composed of closely related species whose divergence times are within the limits of the ability to detect ancestrally related retrotransposons. The genus Oryza, with 24 species, ten genome types, different ploidy levels and over threefold genome size variation, constitutes an ideal experimental system to explore genus-level transposon dynamics. Here we present data on the discovery and characterization of an LTR retrotransposon family named RWG in the genus Oryza. Comparative analysis of transposon content (approximately 20 to 27,000 copies) and transpositional history of this family across the genus revealed a broad spectrum of independent and lineage-specific changes that have implications for the evolution of genome size and organization. In particular, we provide evidence that the basal GG genome of Oryza (O. granulata) has expanded by nearly 25% by a burst of the RWG lineage Gran3 subsequent to speciation. Finally we describe the recent evolutionary origin of Dasheng, a large retrotransposon derivative of the RWG family, specifically found in the A, B and C genome lineages of Oryza.
Affiliation(s)
- Jetty S S Ammiraju
- Arizona Genomics Institute, Department of Plant Sciences, BIO5 Institute, University of Arizona, Tucson, AZ 85721, USA
|
133
|
Shultz JL, Ali S, Ballard L, Lightfoot DA. Development of a physical map of the soybean pathogen Fusarium virguliforme based on synteny with Fusarium graminearum genomic DNA. BMC Genomics 2007; 8:262. [PMID: 17683537 PMCID: PMC1978504 DOI: 10.1186/1471-2164-8-262] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/09/2007] [Accepted: 08/03/2007] [Indexed: 01/12/2023] Open
Abstract
BACKGROUND Reference genome sequences within the major taxa can be used to assist the development of genomic tools for related organisms. A major constraint in the use of these sequenced and annotated genomes is divergent evolution. Divergence of organisms from a common ancestor may have occurred millions of years ago, leading to apparently unrelated and non-syntenic genomes when sequence alignment is attempted. RESULTS A series of programs was written to prepare 36 Mbp of Fusarium graminearum sequence in 19 scaffolds as a reference genome. Exactly 4,152 bacterial artificial chromosome (BAC) end sequences from 2,178 large-insert Fusarium virguliforme clones were tested against this sequence. A total of 94 maps of F. graminearum sequence scaffolds, annotated exonic fragments and associated F. virguliforme sequences resulted. CONCLUSION Developed here was a technique that allowed the comparison of genomes based on small, 15 bp regions of shared identity. The main power of this method lay in its ability to align diverged sequences. This work is unique in that discontinuous sequences were used for the analysis and information not readily apparent, such as match direction, is presented. The 94 maps and JAVA programs are freely available on the Web and by request.
Affiliation(s)
- Jeffry L Shultz
- USDA-ARS, Crop Genetics and Production Research Unit, PO Box 345, Stoneville, MS 38776, USA.
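The idea of comparing sequences through short 15 bp regions of shared identity, with match direction reported, can be sketched as follows. This is not the authors' JAVA software; it is a minimal illustration that assumes exact 15-mer matches, uppercase A/C/G/T input, and hypothetical function names. A real pipeline would also need to handle ambiguous bases and filter repetitive seeds.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def shared_kmer_hits(query, reference, k=15):
    """List exact k-mer matches between a query and a reference.

    Each hit is (query_offset, reference_offset, strand). For strand "-",
    query_offset is measured on the reverse-complemented query.
    """
    # Index every k-mer of the reference by its start position.
    index = {}
    for i in range(len(reference) - k + 1):
        index.setdefault(reference[i:i + k], []).append(i)

    hits = []
    for strand, seq in (("+", query), ("-", reverse_complement(query))):
        for j in range(len(seq) - k + 1):
            for i in index.get(seq[j:j + k], ()):
                hits.append((j, i, strand))
    return hits


# Toy data: a 15-mer embedded in a reference, probed from both strands.
seed = "ACGGTCATTGCAAGC"
reference = "TTTTT" + seed + "TTTTT"
forward_hits = shared_kmer_hits("GG" + seed, reference)
reverse_hits = shared_kmer_hits(reverse_complement(seed), reference)
```

Recording the strand alongside each hit is what makes match direction visible, for example when a BAC end sequence maps to a scaffold in reverse orientation.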
|
134
|
Kim H, San Miguel P, Nelson W, Collura K, Wissotski M, Walling JG, Kim JP, Jackson SA, Soderlund C, Wing RA. Comparative physical mapping between Oryza sativa (AA genome type) and O. punctata (BB genome type). Genetics 2007; 176:379-90. [PMID: 17339227 PMCID: PMC1893071 DOI: 10.1534/genetics.106.068783] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/22/2006] [Accepted: 02/09/2007] [Indexed: 11/18/2022] Open
Abstract
A comparative physical map of the AA genome (Oryza sativa) and the BB genome (O. punctata) was constructed by aligning a physical map of O. punctata, deduced from 63,942 BAC end sequences (BESs) and 34,224 fingerprints, onto the O. sativa genome sequence. The level of conservation of each chromosome between the two species was determined by calculating a ratio of BES alignments. The alignment result suggests more divergence of intergenic and repeat regions in comparison to gene-rich regions. Further, this characteristic enabled localization of heterochromatic and euchromatic regions for each chromosome of both species. The alignment identified 16 locations containing expansions, contractions, inversions, and transpositions. By aligning 40% of the punctata BES on the map, 87% of the punctata FPC map covered 98% of the O. sativa genome sequence. The genome size of O. punctata was estimated to be 8% larger than that of O. sativa with individual chromosome differences of 1.5-16.5%. The sums of expansions and of contractions observed in regions >500 kb were similar, suggesting that most of the contractions/expansions contributing to the genome size difference between the two species are small, thus preserving the macro-collinearity between these species, which diverged approximately 2 million years ago.
Affiliation(s)
- HyeRan Kim
- Arizona Genomics Institute, University of Arizona, Tucson, Arizona 85721, USA
|
135
|
Valdivia ER, Sampedro J, Lamb JC, Chopra S, Cosgrove DJ. Recent proliferation and translocation of pollen group 1 allergen genes in the maize genome. PLANT PHYSIOLOGY 2007; 143:1269-81. [PMID: 17220362 PMCID: PMC1820917 DOI: 10.1104/pp.106.092544] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/13/2023]
Abstract
The dominant allergenic components of grass pollen are known by immunologists as group 1 allergens. These constitute a set of closely related proteins from the beta-expansin family and have been shown to have cell wall-loosening activity. Group 1 allergens may facilitate the penetration of pollen tubes through the grass stigma and style. In maize (Zea mays), group 1 allergens are divided into two classes, A and B. We have identified 15 genes encoding group 1 allergens in maize, 11 genes in class A and four genes in class B, as well as seven pseudogenes. The genes in class A can be divided by sequence relatedness into two complexes, whereas the genes in class B constitute a single complex. Most of the genes identified are represented in pollen-specific expressed sequence tag libraries and are under purifying selection, despite the presence of multiple copies that are nearly identical. Group 1 allergen genes are clustered in at least six different genomic locations. The single class B location and one of the class A locations show synteny with the rice (Oryza sativa) regions where orthologous genes are found. Both classes are expressed at high levels in mature pollen but at low levels in immature flowers. The set of genes encoding maize group 1 allergens is more complex than originally anticipated. If this situation is common in grasses, it may account for the large number of protein variants, or group 1 isoallergens, identified previously in turf grass pollen by immunologists.
Affiliation(s)
- Elene R Valdivia
- Department of Biology, Penn State University, University Park, Pennsylvania 16802, USA
|