1
Vaughan MJ, Nelson W, Soderlund C, Maier RM, Pryor BM. Assessing fungal community structure from mineral surfaces in Kartchner Caverns using multiplexed 454 pyrosequencing. Microb Ecol 2015; 70:175-187. [PMID: 25608778 DOI: 10.1007/s00248-014-0560-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 02/24/2014] [Accepted: 12/23/2014] [Indexed: 06/04/2023]
Abstract
Research on the distribution and structure of fungal communities in caves is lacking. Kartchner Caverns is a wet and mineralogically diverse carbonate cave located in an escarpment of Mississippian Escabrosa limestone in the Whetstone Mountains, Arizona, USA. Fungal diversity from speleothem and rock wall surfaces was examined with 454 FLX Titanium sequencing technology using the Internal Transcribed Spacer 1 as a fungal barcode marker. Fungal diversity was estimated and compared between speleothem and rock wall surfaces, and its variation with distance from the natural entrance of the cave was quantified. Effects of environmental factors and nutrient concentrations in speleothem drip water at different sample sites on fungal diversity were also examined. Sequencing revealed 2,219 fungal operational taxonomic units (OTUs) at the 95% similarity level. Speleothems supported a higher fungal richness and diversity than rock walls. However, community membership and the taxonomic distribution of fungal OTUs at the class level did not differ significantly between speleothems and rock walls. Both OTU richness and diversity decreased significantly with increasing distance from the natural cave entrance. Community membership and taxonomic distribution of fungal OTUs also differed significantly between the sampling sites closest to the entrance and those furthest away. There was no significant effect of temperature, CO2 concentration, or drip water nutrient concentration on fungal community structure on either speleothems or rock walls. Together, these results suggest that proximity to the natural entrance is a critical factor in determining fungal community structure on mineral surfaces in Kartchner Caverns.
2
Abstract
BACKGROUND The analysis of transcriptome data involves many steps and various programs, along with organization of large amounts of data and results. Without a methodical approach for storage, analysis and query, the resulting ad hoc analysis can lead to human error, loss of data and results, inefficient use of time, and lack of verifiability, repeatability, and extensibility. METHODOLOGY The Transcriptome Computational Workbench (TCW) provides Java graphical interfaces for methodical analysis of both single and comparative transcriptome data without the use of a reference genome (e.g. for non-model organisms). The singleTCW interface steps the user through importing transcript sequences (e.g. Illumina) or assembling long sequences (e.g. Sanger, 454, transcripts), annotating the sequences, and performing differential expression analysis using published statistical programs in R. The data, metadata, and results are stored in a MySQL database. The multiTCW interface builds a comparison database by importing sequence and annotation from one or more singleTCW databases, executes the ESTScan program to translate the sequences into proteins, and then incorporates one or more clusterings, where the clustering options are to execute the OrthoMCL program, compute transitive closure, or import clusters. Both singleTCW and multiTCW allow extensive query and display of the results, where singleTCW displays the alignment of annotation hits to transcript sequences, and multiTCW displays multiple transcript alignments with MUSCLE or pairwise alignments. The query programs can be executed on the desktop for fastest analysis, or from the web for sharing the results. CONCLUSION It is now affordable to buy a multi-processor machine, and easy to install Java and MySQL. By simply downloading the TCW, the user can interactively analyze, query and view their data. The TCW allows in-depth data mining of the results, which can lead to a better understanding of the transcriptome.
TCW is freely available from www.agcol.arizona.edu/software/tcw.
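The abstract lists "compute transitive closure" as one of the clustering options. As a rough illustration of what that means, the sketch below groups sequence IDs with a union-find structure so that any two IDs connected by a chain of pairwise hits end up in the same cluster. It is not TCW's implementation: the function name, the input format (a list of ID pairs) and the example IDs are all assumptions made for this example.

```python
from collections import defaultdict


def transitive_closure_clusters(pairs):
    """Cluster IDs so that IDs joined by any chain of pairs share a cluster.

    `pairs` is a hypothetical input format: an iterable of (id_a, id_b)
    tuples, e.g. pairwise similarity hits that passed some cutoff.
    """
    parent = {}

    def find(x):
        # Register unseen IDs as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters = defaultdict(set)
    for seq_id in list(parent):
        clusters[find(seq_id)].add(seq_id)

    # Sort members and clusters so the output is deterministic.
    return sorted((sorted(members) for members in clusters.values()),
                  key=lambda members: members[0])


# Hypothetical pairwise hits between transcripts from three datasets.
hits = [("elm_01", "oak_07"), ("oak_07", "ash_03"), ("elm_02", "oak_09")]
clusters = transitive_closure_clusters(hits)
print(clusters)
```

Note that `elm_01` and `ash_03` land in the same cluster even though they were never compared directly. That is the defining property of transitive closure, and also why it can produce very large clusters when the pairwise cutoff is loose.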
Affiliation(s)
- Carol Soderlund
- BIO5 Institute, University of Arizona, Tucson, Arizona, USA.
3
Büchel K, McDowell E, Nelson W, Descour A, Gershenzon J, Hilker M, Soderlund C, Gang DR, Fenning T, Meiners T. An elm EST database for identifying leaf beetle egg-induced defense genes. BMC Genomics 2012; 13:242. [PMID: 22702658 PMCID: PMC3439254 DOI: 10.1186/1471-2164-13-242] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 11/21/2011] [Accepted: 05/15/2012] [Indexed: 01/07/2023]
Abstract
Background Plants can defend themselves against herbivorous insects prior to the onset of larval feeding by responding to the eggs laid on their leaves. In the European field elm (Ulmus minor), egg laying by the elm leaf beetle (Xanthogaleruca luteola) activates the emission of volatiles that attract specialised egg parasitoids, which in turn kill the eggs. Little is known about the transcriptional changes that insect eggs trigger in plants and how such indirect defense mechanisms are orchestrated in the context of other biological processes. Results Here we present the first large scale study of egg-induced changes in the transcriptional profile of a tree. Five cDNA libraries were generated from leaves of (i) untreated control elms, and elms treated with (ii) egg laying and feeding by elm leaf beetles, (iii) feeding, (iv) artificial transfer of egg clutches, and (v) methyl jasmonate. A total of 361,196 expressed sequence tags (ESTs) were identified which clustered into 52,823 unique transcripts (Unitrans) and were stored in a database with a public web interface. Among the analyzed Unitrans, 73% could be annotated by homology to known genes in the UniProt (Plant) database, particularly to those from Vitis, Ricinus, Populus and Arabidopsis. Comparative in silico analysis among the different treatments revealed differences in Gene Ontology term abundances. Defense- and stress-related gene transcripts were present in high abundance in leaves after herbivore egg laying, but transcripts involved in photosynthesis showed decreased abundance. Many pathogen-related genes and genes involved in phytohormone signaling were expressed, indicative of jasmonic acid biosynthesis and activation of jasmonic acid responsive genes.
Cross-comparisons between different libraries based on expression profiles allowed the identification of genes with a potential relevance in egg-induced defenses, as well as other biological processes, including signal transduction, transport and primary metabolism. Conclusion Here we present a dataset for a large-scale study of the mechanisms of plant defense against insect eggs in a co-evolved, natural ecological plant–insect system. The EST database analysis provided here is a first step in elucidating the transcriptional responses of elm to elm leaf beetle infestation, and adds further to our knowledge on insect egg-induced transcriptomic changes in plants. The sequences identified in our comparative analysis give many hints about novel defense mechanisms directed towards eggs.
Affiliation(s)
- Kerstin Büchel
- Freie Universität Berlin, Applied Zoology / Animal Ecology, Berlin, Germany
4
Kour A, Greer K, Valent B, Orbach MJ, Soderlund C. MGOS: development of a community annotation database for Magnaporthe oryzae. Mol Plant Microbe Interact 2012; 25:271-278. [PMID: 22074346 DOI: 10.1094/mpmi-07-11-0183] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 05/31/2023]
Abstract
Magnaporthe oryzae causes rice blast disease, which is the most serious disease of cultivated rice worldwide. We previously developed the Magnaporthe grisea-Oryza sativa (MGOS) database as a repository for the M. oryzae and rice genome sequences together with a comprehensive set of functional interaction data generated by a major consortium of U.S. researchers. The MGOS database has now undergone a major redesign to include data from the international blast research community, accessible with a new intuitive, easy-to-use interface. Registered database users can manually annotate gene sequences and features as well as add mutant data and literature on individual gene pages. Over 900 genes have been manually curated based on various biological databases and the scientific literature. Gene names and descriptions, gene ontology annotations, published and unpublished information on mutants and their phenotypes, responses in diverse microarray analyses, and related literature have been incorporated. Thus far, 362 M. oryzae genes have associated information on mutants. MGOS is now poised to become a one-stop repository for all structural and functional data available on all genes of this critically important rice pathogen.
Affiliation(s)
- Anupreet Kour
- School of Plant Sciences, Division of Plant Pathology and Microbiology, The University of Arizona, Tucson 85721, USA
5
Abstract
SyMAP (Synteny Mapping and Analysis Program) was originally developed to compute synteny blocks between a sequenced genome and an FPC map, and has been extended to support pairs of sequenced genomes. SyMAP uses MUMmer to compute the raw hits between the two genomes, which are then clustered and filtered using the optional gene annotation. The filtered hits are input to the synteny algorithm, which was designed to discover duplicated regions and form larger-scale synteny blocks, where intervening micro-rearrangements are allowed. SyMAP provides extensive interactive Java displays at all levels of resolution along with simultaneous displays of multiple aligned pairs. The synteny blocks from multiple chromosomes may be displayed in a high-level dot plot or three-dimensional view, and the user may then drill down to see the details of a region, including the alignments of the hits to the gene annotation. These capabilities are illustrated by showing their application to the study of genome duplication, differential gene loss and transitive homology between sorghum, maize and rice. The software may be used from a website or standalone for the best performance. A project manager is provided to organize and automate the analysis of multi-genome groups. The software is freely distributed at http://www.agcol.arizona.edu/software/symap.
Affiliation(s)
- Carol Soderlund
- BIO5 Institute, 1657 Helen Street, University of Arizona, Tucson, AZ 85721, USA.
6
Soderlund C. Computational techniques for elucidating plant-pathogen interactions from large-scale experiments on fungi and oomycetes. Brief Bioinform 2009; 10:654-63. [PMID: 19933211 DOI: 10.1093/bib/bbp053] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022]
Abstract
Eukaryotic plant pathogens are responsible for the destruction of billions of dollars' worth of crops each year. With large-scale genomics of both pathogens and hosts and the corresponding computational analysis, biologists are now able to gain knowledge about many pathogenic and defense genes concurrently. To study the interactions between these two organism groups, it is necessary to design experiments to elucidate the genes being expressed during the invasion of the pathogen into the host. For the most part, this does not require new software development, though it does require the use of existing software in novel ways. This article provides a broad overview of several key and illustrative experiments and the corresponding computational analyses, outlining the knowledge gained in each. It goes on to describe databases for plant-pathogen data and important initiatives such as Plant-Associated Microbe Gene Ontology. It discusses how various emerging approaches will increase the power of computers in host-pathogen interaction studies.
Affiliation(s)
- Carol Soderlund
- BIO5 Institute, 1657 Helen Street, University of Arizona, Tucson AZ 85721, USA.
7
Schnable PS, Ware D, Fulton RS, Stein JC, Wei F, Pasternak S, Liang C, Zhang J, Fulton L, Graves TA, Minx P, Reily AD, Courtney L, Kruchowski SS, Tomlinson C, Strong C, Delehaunty K, Fronick C, Courtney B, Rock SM, Belter E, Du F, Kim K, Abbott RM, Cotton M, Levy A, Marchetto P, Ochoa K, Jackson SM, Gillam B, Chen W, Yan L, Higginbotham J, Cardenas M, Waligorski J, Applebaum E, Phelps L, Falcone J, Kanchi K, Thane T, Scimone A, Thane N, Henke J, Wang T, Ruppert J, Shah N, Rotter K, Hodges J, Ingenthron E, Cordes M, Kohlberg S, Sgro J, Delgado B, Mead K, Chinwalla A, Leonard S, Crouse K, Collura K, Kudrna D, Currie J, He R, Angelova A, Rajasekar S, Mueller T, Lomeli R, Scara G, Ko A, Delaney K, Wissotski M, Lopez G, Campos D, Braidotti M, Ashley E, Golser W, Kim H, Lee S, Lin J, Dujmic Z, Kim W, Talag J, Zuccolo A, Fan C, Sebastian A, Kramer M, Spiegel L, Nascimento L, Zutavern T, Miller B, Ambroise C, Muller S, Spooner W, Narechania A, Ren L, Wei S, Kumari S, Faga B, Levy MJ, McMahan L, Van Buren P, Vaughn MW, Ying K, Yeh CT, Emrich SJ, Jia Y, Kalyanaraman A, Hsia AP, Barbazuk WB, Baucom RS, Brutnell TP, Carpita NC, Chaparro C, Chia JM, Deragon JM, Estill JC, Fu Y, Jeddeloh JA, Han Y, Lee H, Li P, Lisch DR, Liu S, Liu Z, Nagel DH, McCann MC, SanMiguel P, Myers AM, Nettleton D, Nguyen J, Penning BW, Ponnala L, Schneider KL, Schwartz DC, Sharma A, Soderlund C, Springer NM, Sun Q, Wang H, Waterman M, Westerman R, Wolfgruber TK, Yang L, Yu Y, Zhang L, Zhou S, Zhu Q, Bennetzen JL, Dawe RK, Jiang J, Jiang N, Presting GG, Wessler SR, Aluru S, Martienssen RA, Clifton SW, McCombie WR, Wing RA, Wilson RK. The B73 Maize Genome: Complexity, Diversity, and Dynamics. Science 2009; 326:1112-5. [PMID: 19965430 DOI: 10.1126/science.1178534] [Citation(s) in RCA: 2467] [Impact Index Per Article: 164.5] [Indexed: 01/08/2023]
8
Gu YQ, Ma Y, Huo N, Vogel JP, You FM, Lazo GR, Nelson WM, Soderlund C, Dvorak J, Anderson OD, Luo MC. A BAC-based physical map of Brachypodium distachyon and its comparative analysis with rice and wheat. BMC Genomics 2009; 10:496. [PMID: 19860896 PMCID: PMC2774330 DOI: 10.1186/1471-2164-10-496] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.6] [Received: 04/28/2009] [Accepted: 10/27/2009] [Indexed: 11/13/2022]
Abstract
Background Brachypodium distachyon (Brachypodium) has been recognized as a new model species for comparative and functional genomics of cereal and bioenergy crops because it possesses many biological attributes desirable in a model, such as a small genome size, short stature, self-pollinating habit, and short generation cycle. To maximize the utility of Brachypodium as a model for basic and applied research it is necessary to develop genomic resources for it. A BAC-based physical map is one of them. A physical map will facilitate analysis of genome structure, comparative genomics, and assembly of the entire genome sequence. Results A total of 67,151 Brachypodium BAC clones were fingerprinted with the SNaPshot HICF fingerprinting method and a genome-wide physical map of the Brachypodium genome was constructed. The map consisted of 671 contigs and 2,161 clones remained as singletons. The contigs and singletons spanned 414 Mb. A total of 13,970 gene-related sequences were detected in the BAC end sequences (BES). These gene tags aligned 345 contigs with 336 Mb of rice genome sequence, showing that Brachypodium and rice genomes are generally highly colinear. Divergent regions were mainly in the rice centromeric regions. A dot-plot of Brachypodium contigs against the rice genome sequences revealed remnants of the whole-genome duplication caused by paleotetraploidy, which were previously found in rice and sorghum. Brachypodium contigs were anchored to the wheat deletion bin maps with the BES gene-tags, opening the door to Brachypodium-Triticeae comparative genomics. Conclusion The construction of the Brachypodium physical map, and its comparison with the rice genome sequence demonstrated the utility of the SNaPshot-HICF method in the construction of BAC-based physical maps. The map represents an important genomic resource for the completion of Brachypodium genome sequence and grass comparative genomics. 
A draft of the physical map and its comparisons with rice and wheat are available at .
Affiliation(s)
- Yong Q Gu
- Genomics and Gene Discovery Research Unit, USDA-ARS, Western Regional Research Center, 800 Buchanan Street, Albany, CA 94710, USA.
9
Abstract
Background New sequencing technologies are rapidly emerging. Many laboratories are simultaneously working with the traditional Sanger ESTs and experimenting with ESTs generated by the 454 Life Science sequencers. Though Sanger ESTs have been used to generate contigs for many years, no program takes full advantage of the 5' and 3' mate-pair information; hence, many tentative transcripts are assembled into two separate contigs. The new 454 technology has the benefit of high-throughput expression profiling, but introduces time and space problems for assembling large contigs. Results The PAVE (Program for Assembling and Viewing ESTs) assembler takes advantage of the 5' and 3' mate-pair information by requiring that the mate-pairs be assembled into the same contig and joined by n's if the two sub-contigs do not overlap. It handles the depth of 454 data sets by "burying" similar ESTs during assembly, which retains the expression level information while circumventing time and space problems. PAVE uses MegaBLAST for the clustering step and CAP3 for assembly; however, it assembles incrementally to enforce the mate-pair constraint, bury ESTs, and reduce incorrect joins and splits. The PAVE data management system uses a MySQL database to store multiple libraries of ESTs along with their metadata; the management system allows multiple assemblies with variations on libraries and parameters. Analysis routines provide standard annotation for the contigs including a measure of differentially expressed genes across the libraries. A Java viewer program is provided for display and analysis of the results. Our results clearly show the benefit of using the PAVE assembler to explicitly use mate-pair information and bury ESTs for large contigs. Conclusion The PAVE assembler provides a software package for assembling Sanger and/or 454 ESTs. The assembly software, data management software, Java viewer and user's guide are freely available.
Affiliation(s)
- Carol Soderlund
- BIO5 Institute, University of Arizona, Tucson, AZ 85721, USA.
10
Abstract
Recent advances in both clone fingerprinting and draft sequencing technology have made it increasingly common for species to have a bacterial artificial chromosome (BAC) fingerprint map, BAC end sequences (BESs) and draft genomic sequence. The FPC (fingerprinted contigs) software package contains three modules that maximize the value of these resources. The BSS (blast some sequence) module provides a way to easily view the results of aligning draft sequence to the BESs, and integrates the results with the following two modules. The MTP (minimal tiling path) module uses sequence and fingerprints to determine a minimal tiling path of clones. The DSI (draft sequence integration) module aligns draft sequences to FPC contigs, displays them alongside the contigs and identifies potential discrepancies; the alignment can be based on either individual BES alignments to the draft, or on the locations of BESs that have been assembled into the draft. FPC also supports high-throughput fingerprint map generation as its time-intensive functions have been parallelized for Unix-based desktops or servers with multiple CPUs. Simulation results are provided for the MTP, DSI and parallelization. These features are in the FPC V9.3 software package, which is freely available.
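To make the idea of a minimal tiling path concrete, the sketch below treats each clone as an interval on a contig and greedily picks the clone that extends coverage furthest at each step. This is a deliberate simplification and not the FPC MTP algorithm, which according to the abstract works from sequence and fingerprint evidence. The `(name, left, right)` coordinates, the function name and the example clones are assumptions for illustration, and no minimum overlap between neighbouring clones is enforced.

```python
def greedy_tiling_path(clones, start, end):
    """Return names of a smallest set of clones covering [start, end].

    `clones` is a hypothetical list of (name, left, right) tuples giving
    each clone's approximate extent on a contig. Returns None when a gap
    makes full coverage impossible.
    """
    ordered = sorted(clones, key=lambda clone: clone[1])
    path = []
    covered = start
    i = 0
    while covered < end:
        best = None
        # Among clones starting at or before the covered point,
        # keep the one that reaches furthest to the right.
        while i < len(ordered) and ordered[i][1] <= covered:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered:
            return None  # no clone extends coverage: there is a gap
        path.append(best[0])
        covered = best[2]
    return path


# Hypothetical clone extents (arbitrary units along one contig).
clones = [
    ("a", 0, 100),
    ("b", 40, 120),
    ("c", 90, 210),
    ("d", 110, 180),
    ("e", 200, 300),
]
path = greedy_tiling_path(clones, 0, 300)
print(path)
```

For plain interval cover the greedy choice is optimal, so the example selects three clones rather than all five. A practical tiling path would also require a minimum overlap between adjacent clones so that the joins can be verified, which this sketch leaves out.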
Affiliation(s)
- William Nelson
- Arizona Genomics Computational Laboratory, BIO5 Institute, University of Arizona, Tucson, AZ, USA
11
Nelson W, Luo M, Ma J, Estep M, Estill J, He R, Talag J, Sisneros N, Kudrna D, Kim H, Ammiraju JSS, Collura K, Bharti AK, Messing J, Wing RA, SanMiguel P, Bennetzen JL, Soderlund C. Methylation-sensitive linking libraries enhance gene-enriched sequencing of complex genomes and map DNA methylation domains. BMC Genomics 2008; 9:621. [PMID: 19099592 PMCID: PMC2628917 DOI: 10.1186/1471-2164-9-621] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 07/03/2008] [Accepted: 12/19/2008] [Indexed: 11/30/2022]
Abstract
Background Many plant genomes are resistant to whole-genome assembly due to an abundance of repetitive sequence, leading to the development of gene-rich sequencing techniques. Two such techniques are hypomethylated partial restriction (HMPR) and methylation spanning linker libraries (MSLL). These libraries differ from other gene-rich datasets in having larger insert sizes, and the MSLL clones are designed to provide reads localized to "epigenetic boundaries" where methylation begins or ends. Results A large-scale study in maize generated 40,299 HMPR sequences and 80,723 MSLL sequences, including MSLL clones exceeding 100 kb. The paired end reads of MSLL and HMPR clones were shown to be effective in linking existing gene-rich sequences into scaffolds. In addition, it was shown that the MSLL clones can be used for anchoring these scaffolds to a BAC-based physical map. The MSLL end reads effectively identified epigenetic boundaries, as indicated by their preferential alignment to regions upstream and downstream from annotated genes. The ability to precisely map long stretches of fully methylated DNA sequence is a unique outcome of MSLL analysis, and was also shown to provide evidence for errors in gene identification. MSLL clones were observed to be significantly more repeat-rich in their interiors than in their end reads, confirming the correlation between methylation and retroelement content. Both MSLL and HMPR reads were found to be substantially gene-enriched, with the SalI MSLL libraries being the most highly enriched (31% align to an EST contig), while the HMPR clones exhibited exceptional depletion of repetitive DNA (to ~11%). These two techniques were compared with other gene-enrichment methods, and shown to be complementary. Conclusion MSLL technology provides an unparalleled approach for mapping the epigenetic status of repetitive blocks and for identifying sequences mis-identified as genes. 
Although the types and natures of epigenetic boundaries are barely understood at this time, MSLL technology flags both approximate boundaries and methylated genes that deserve additional investigation. MSLL and HMPR sequences provide a valuable resource for maize genome annotation, and are a uniquely valuable complement to any plant genome sequencing project. In order to make these results fully accessible to the community, a web display was developed that shows the alignment of MSLL, HMPR, and other gene-rich sequences to the BACs; this display is continually updated with the latest ESTs and BAC sequences.
Affiliation(s)
- William Nelson
- Arizona Genomics Computational Laboratory, BIO5 Institute, University of Arizona, Tucson, Arizona, USA.
12
Wei F, Coe E, Nelson W, Bharti AK, Engler F, Butler E, Kim H, Goicoechea JL, Chen M, Lee S, Fuks G, Sanchez-Villeda H, Schroeder S, Fang Z, McMullen M, Davis G, Bowers JE, Paterson AH, Schaeffer M, Gardiner J, Cone K, Messing J, Soderlund C, Wing RA. Physical and genetic structure of the maize genome reflects its complex evolutionary history. PLoS Genet 2007; 3:e123. [PMID: 17658954 PMCID: PMC1934398 DOI: 10.1371/journal.pgen.0030123] [Citation(s) in RCA: 228] [Impact Index Per Article: 14.3] [Received: 02/22/2007] [Accepted: 06/11/2007] [Indexed: 11/21/2022]
Abstract
Maize (Zea mays L.) is one of the most important cereal crops and a model for the study of genetics, evolution, and domestication. To better understand maize genome organization and to build a framework for genome sequencing, we constructed a sequence-ready fingerprinted contig-based physical map that covers 93.5% of the genome, of which 86.1% is aligned to the genetic map. The fingerprinted contig map contains 25,908 genic markers that enabled us to align nearly 73% of the anchored maize genome to the rice genome. The distribution pattern of expressed sequence tags correlates to that of recombination. In collinear regions, 1 kb in rice corresponds to an average of 3.2 kb in maize, yet maize has a 6-fold genome size expansion. This can be explained by the fact that most rice regions correspond to two regions in maize as a result of its recent polyploid origin. Inversions account for the majority of chromosome structural variations during subsequent maize diploidization. We also find clear evidence of ancient genome duplication predating the divergence of the progenitors of maize and rice. Reconstructing the paleoethnobotany of the maize genome indicates that the progenitors of modern maize contained ten chromosomes. As a cash crop and a model biological system, maize is of great public interest. To facilitate maize molecular breeding and its basic biology research, we built a high-resolution physical map with two different fingerprinting methods on the same set of bacterial artificial chromosome clones. The physical map was integrated to a high-density genetic map and further serves as a framework for the maize genome-sequencing project. Comparative genomics showed that the euchromatic regions between rice and maize are very conserved. Physically we delimited these conserved regions and thus detected many genome rearrangements. We defined extensively the duplication blocks within the maize genome. 
These blocks allowed us to reconstruct the chromosomes of the maize progenitor. We detected that maize genome has experienced two rounds of genome duplications, an ancient one before maize–rice divergence and a recent one after tetraploidization.
Affiliation(s)
- Fusheng Wei
- Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
| | - Ed Coe
- Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
- Plant Genetics Research Unit, Agricultural Research Service, United States Department of Agriculture, Columbia, Missouri, United States of America
| | - William Nelson
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
- Arizona Genomics Computational Laboratory, University of Arizona, Tucson, Arizona, United States of America
| | - Arvind K Bharti
- Plant Genome Initiative at Rutgers, Waksman Institute, Rutgers, The State University of New Jersey, Piscataway, New Jersey, United States of America
| | - Fred Engler
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
- Arizona Genomics Computational Laboratory, University of Arizona, Tucson, Arizona, United States of America
| | - Ed Butler
- Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
| | - HyeRan Kim
- Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
| | - Jose Luis Goicoechea
- Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
- Mingsheng Chen
- Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
- Seunghee Lee
- Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
- Galina Fuks
- Plant Genome Initiative at Rutgers, Waksman Institute, Rutgers, The State University of New Jersey, Piscataway, New Jersey, United States of America
- Hector Sanchez-Villeda
- Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
- Steven Schroeder
- Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
- Zhiwei Fang
- Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
- Michael McMullen
- Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
- Plant Genetics Research Unit, Agricultural Research Service, United States Department of Agriculture, Columbia, Missouri, United States of America
- Georgia Davis
- Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
- John E Bowers
- Plant Genome Mapping Laboratory, Departments of Crop and Soil Science, Plant Biology, and Genetics, University of Georgia, Athens, Georgia, United States of America
- Andrew H Paterson
- Plant Genome Mapping Laboratory, Departments of Crop and Soil Science, Plant Biology, and Genetics, University of Georgia, Athens, Georgia, United States of America
- Mary Schaeffer
- Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
- Plant Genetics Research Unit, Agricultural Research Service, United States Department of Agriculture, Columbia, Missouri, United States of America
- Jack Gardiner
- Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
- Karen Cone
- Division of Biological Sciences, University of Missouri, Columbia, Missouri, United States of America
- Joachim Messing
- Plant Genome Initiative at Rutgers, Waksman Institute, Rutgers, The State University of New Jersey, Piscataway, New Jersey, United States of America
- Carol Soderlund
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
- Arizona Genomics Computational Laboratory, University of Arizona, Tucson, Arizona, United States of America
- * To whom correspondence should be addressed. E-mail: (CS); (RAW)
- Rod A Wing
- Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
- Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America
- BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
- * To whom correspondence should be addressed. E-mail: (CS); (RAW)
|
13
|
Kim H, Hurwitz B, Yu Y, Collura K, Gill N, SanMiguel P, Mullikin JC, Maher C, Nelson W, Wissotski M, Braidotti M, Kudrna D, Goicoechea JL, Stein L, Ware D, Jackson SA, Soderlund C, Wing RA. Construction, alignment and analysis of twelve framework physical maps that represent the ten genome types of the genus Oryza. Genome Biol 2008; 9:R45. [PMID: 18304353 PMCID: PMC2374706 DOI: 10.1186/gb-2008-9-2-r45] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.6] [Received: 12/14/2007] [Revised: 02/12/2008] [Accepted: 02/28/2008] [Indexed: 01/31/2023]
Abstract
Bacterial artificial chromosome (BAC) fingerprint and end-sequenced physical maps representing the ten genome types of Oryza are presented. We describe the establishment and analysis of a genus-wide comparative framework composed of 12 bacterial artificial chromosome fingerprint and end-sequenced physical maps representing the 10 genome types of Oryza aligned to the O. sativa ssp. japonica reference genome sequence. Over 932 Mb of end sequence was analyzed for repeats, simple sequence repeats, miRNA and single nucleotide variations, providing the most extensive analysis of Oryza sequence to date.
Affiliation(s)
- HyeRan Kim
- Arizona Genomics Institute, Department of Plant Sciences, University of Arizona, Tucson, Arizona 85721, USA.
|
14
|
Kim H, San Miguel P, Nelson W, Collura K, Wissotski M, Walling JG, Kim JP, Jackson SA, Soderlund C, Wing RA. Comparative physical mapping between Oryza sativa (AA genome type) and O. punctata (BB genome type). Genetics 2007; 176:379-90. [PMID: 17339227 PMCID: PMC1893071 DOI: 10.1534/genetics.106.068783] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.6] [Received: 11/22/2006] [Accepted: 02/09/2007] [Indexed: 11/18/2022]
Abstract
A comparative physical map of the AA genome (Oryza sativa) and the BB genome (O. punctata) was constructed by aligning a physical map of O. punctata, deduced from 63,942 BAC end sequences (BESs) and 34,224 fingerprints, onto the O. sativa genome sequence. The level of conservation of each chromosome between the two species was determined by calculating a ratio of BES alignments. The alignment result suggests more divergence of intergenic and repeat regions in comparison to gene-rich regions. Further, this characteristic enabled localization of heterochromatic and euchromatic regions for each chromosome of both species. The alignment identified 16 locations containing expansions, contractions, inversions, and transpositions. By aligning 40% of the punctata BES on the map, 87% of the punctata FPC map covered 98% of the O. sativa genome sequence. The genome size of O. punctata was estimated to be 8% larger than that of O. sativa with individual chromosome differences of 1.5-16.5%. The sum of expansions and contractions observed in regions >500 kb were similar, suggesting that most of the contractions/expansions contributing to the genome size difference between the two species are small, thus preserving the macro-collinearity between these species, which diverged approximately 2 million years ago.
Affiliation(s)
- HyeRan Kim
- Arizona Genomics Institute, University of Arizona, Tucson, Arizona 85721, USA
|
15
|
Gowda M, Venu RC, Jia Y, Stahlberg E, Pampanwar V, Soderlund C, Wang GL. Use of robust-long serial analysis of gene expression to identify novel fungal and plant genes involved in host-pathogen interactions. Methods Mol Biol 2007; 354:131-44. [PMID: 17172751 DOI: 10.1385/1-59259-966-4:131] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 05/13/2023]
Abstract
Identification of important transcripts from fungal pathogens and host plants is indispensable for fully understanding the molecular events occurring during fungal-plant interactions. Recently, we developed an improved LongSAGE method called robust-long serial analysis of gene expression (RL-SAGE) for deep transcriptome analysis of fungal and plant genomes. Using this method, we made 10 RL-SAGE libraries from two plant species (Oryza sativa and Zea mays) and one fungal pathogen (Magnaporthe grisea). Many of the transcripts identified from these libraries were novel in comparison with their corresponding EST collections. Bioinformatic tools and databases for analyzing the RL-SAGE data were developed. Our results demonstrate that RL-SAGE is an effective approach for large-scale identification of expressed genes in fungal and plant genomes.
Affiliation(s)
- Malali Gowda
- Department of Plant Pathology, The Ohio State University, Columbus, USA
|
16
|
Abstract
Previous approaches to comparing gene and chromosome organization between two genomes have been based on genetic maps or genomic sequences. We have developed a system to align an FPC-based physical map to a genomic sequence based on BAC end sequences and sequence-tagged hybridization markers and to align two FPC maps to one another based on shared markers and fingerprints. The system, called SyMAP (Synteny Mapping and Analysis Program), consists of an algorithm to compute synteny blocks and Web-based graphics to visualize the results. The approach to calculating the anchors (corresponding elements on the respective maps) maximizes the inclusion of anchors with different rates of divergence. Chains (putative syntenic sets of anchors) are computed using a dynamic programming algorithm, which includes off-diagonal anchors that result from map coordinate errors and small inversions. As the gap parameters (the distances allowed between anchors in a chain) can vary over different data sets and be difficult to set manually, they are automatically computed per data set. The criterion for a chain to be acceptable is based on the number of anchors and the Pearson correlation coefficient. Neighboring chains are merged into synteny blocks for display. This algorithm has been tested with three data sets that vary in the number of BACs, BAC end sequences, hybridization markers, distance between anchors, and number and antiquity of genome duplication events. The Web-based graphics uses Java for a highly interactive display that allows the user to interrogate the evidence of synteny.
Affiliation(s)
- Carol Soderlund
- Arizona Genomics Computational Laboratory, The Bio5 Institute, University of Arizona, Tucson, Arizona 85721, USA.
|
17
|
Nelson WM, Dvorak J, Luo MC, Messing J, Wing RA, Soderlund C. Efficacy of clone fingerprinting methodologies. Genomics 2006; 89:160-5. [PMID: 17011744 DOI: 10.1016/j.ygeno.2006.08.008] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 06/12/2006] [Revised: 08/15/2006] [Accepted: 08/18/2006] [Indexed: 10/24/2022]
Abstract
With the development of new high-information content fingerprinting techniques for constructing BAC-based physical maps, physical map construction is accelerating and it is important to determine which methodologies work best. In a recent publication (Z. Xu et al., 2004, Genomics 84:941-951), Xu et al. evaluated five different techniques (one agarose-based and four using multiple enzymes) and concluded that a two-enzyme technique was superior. In addition, they found that no benefit was gained from fingerprinting more than 10x coverage. In this paper we report our own extensive simulation results, which lead to contrasting conclusions. Our data indicate that the five-enzyme method known as SNaPshot is the most effective and that the assembly can in fact be significantly improved with greater than 10x coverage.
Affiliation(s)
- William M Nelson
- Arizona Genomics Computational Laboratory, BIO5 Institute, University of Arizona, Tucson, AZ 85721, USA
|
18
|
Soderlund C, Haller K, Pampanwar V, Ebbole D, Farman M, Orbach MJ, Wang GL, Wing R, Xu JR, Brown D, Mitchell T, Dean R. MGOS: A resource for studying Magnaporthe grisea and Oryza sativa interactions. Mol Plant Microbe Interact 2006; 19:1055-61. [PMID: 17022169 DOI: 10.1094/mpmi-19-1055] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 05/12/2023]
Abstract
The MGOS (Magnaporthe grisea Oryza sativa) web-based database contains data from Oryza sativa and Magnaporthe grisea interaction experiments in which M. grisea is the fungal pathogen that causes rice blast disease. To study the interactions, a consortium of fungal and rice geneticists was formed to construct a comprehensive set of experiments that would elucidate the gene expression of both rice and M. grisea during the infection cycle. These experiments included constructing and sequencing cDNA and robust-long serial analysis of gene expression libraries from both host and pathogen during different stages of infection in both resistant and susceptible interactions, generating >50,000 M. grisea mutants and applying them to susceptible rice strains to test for pathogenicity, and constructing a dual O. sativa-M. grisea microarray. MGOS was developed as a central web-based repository for all the experimental data along with the rice and M. grisea genomic sequence. Community-based annotation is available for the M. grisea genes to aid in the study of the interactions.
Affiliation(s)
- Carol Soderlund
- Arizona Genomics Computational Laboratory, Bio5 Institute, University of Arizona, Tucson 85721, USA.
|
19
|
Vital J, García Suárez A, Sauri Barraza J, Soderlund C, Gangnet N, Gille O. Equilibrio sagital y su aplicación en patologías de columna vertebral [Sagittal balance and its application in spinal column pathologies]. Rev Esp Cir Ortop Traumatol (Engl Ed) 2006. [DOI: 10.1016/s1888-4415(06)76431-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
|
20
|
Ammiraju JSS, Luo M, Goicoechea JL, Wang W, Kudrna D, Mueller C, Talag J, Kim H, Sisneros NB, Blackmon B, Fang E, Tomkins JB, Brar D, MacKill D, McCouch S, Kurata N, Lambert G, Galbraith DW, Arumuganathan K, Rao K, Walling JG, Gill N, Yu Y, SanMiguel P, Soderlund C, Jackson S, Wing RA. The Oryza bacterial artificial chromosome library resource: construction and analysis of 12 deep-coverage large-insert BAC libraries that represent the 10 genome types of the genus Oryza. Genome Res 2005; 16:140-7. [PMID: 16344555 PMCID: PMC1356138 DOI: 10.1101/gr.3766306] [Citation(s) in RCA: 181] [Impact Index Per Article: 9.5] [Indexed: 11/25/2022]
Abstract
Rice (Oryza sativa L.) is the most important food crop in the world and a model system for plant biology. With the completion of a finished genome sequence we must now functionally characterize the rice genome by a variety of methods, including comparative genomic analysis between cereal species and within the genus Oryza. Oryza contains two cultivated and 22 wild species that represent 10 distinct genome types. The wild species contain an essentially untapped reservoir of agriculturally important genes that must be harnessed if we are to maintain a safe and secure food supply for the 21st century. As a first step to functionally characterize the rice genome from a comparative standpoint, we report the construction and analysis of a comprehensive set of 12 BAC libraries that represent the 10 genome types of Oryza. To estimate the number of clones required to generate 10 genome equivalent BAC libraries we determined the genome sizes of nine of the 12 species using flow cytometry. Each library represents a minimum of 10 genome equivalents, has an average insert size range between 123 and 161 kb, an average organellar content of 0.4%-4.1% and nonrecombinant content between 0% and 5%. Genome coverage was estimated mathematically and empirically by hybridization and extensive contig and BAC end sequence analysis. A preliminary analysis of BAC end sequences of clones from these libraries indicated that LTR retrotransposons are the predominant class of repeat elements in Oryza and a roughly linear relationship of these elements with genome size was observed.
Affiliation(s)
- Jetty S S Ammiraju
- Arizona Genomics Institute, University of Arizona, Tucson, Arizona 85721 USA
|
21
|
Nelson WM, Bharti AK, Butler E, Wei F, Fuks G, Kim H, Wing RA, Messing J, Soderlund C. Whole-genome validation of high-information-content fingerprinting. Plant Physiol 2005; 139:27-38. [PMID: 16166258 PMCID: PMC1203355 DOI: 10.1104/pp.105.061978] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.7] [Indexed: 05/04/2023]
Abstract
Fluorescent-based high-information-content fingerprinting (HICF) techniques have recently been developed for physical mapping. These techniques make use of automated capillary DNA sequencing instruments to enable both high-resolution and high-throughput fingerprinting. In this article, we report the construction of a whole-genome HICF FPC map for maize (Zea mays subsp. mays cv B73), using a variant of HICF in which a type IIS restriction enzyme is used to generate the fluorescently labeled fragments. The HICF maize map was constructed from the same three maize bacterial artificial chromosome libraries as previously used for the whole-genome agarose FPC map, providing a unique opportunity for direct comparison of the agarose and HICF methods; as a result, it was found that HICF has substantially greater sensitivity in forming contigs. An improved assembly procedure is also described that uses automatic end-merging of contigs to reduce the effects of contamination and repetitive bands. Several new features in FPC v7.2 are presented, including shared-memory multiprocessing, which allows dramatically faster assemblies, and automatic end-merging, which permits more accurate assemblies. It is further shown that sequenced clones may be digested in silico and located accurately on the HICF assembly, despite size deviations that prevent the precise prediction of experimental fingerprints. Finally, repetitive bands are isolated, and their effect on the assembly is studied.
Affiliation(s)
- William M Nelson
- Arizona Genomics Computational Laboratory, BIO5 Institute, University of Arizona, Tucson, 85721, USA
|
22
|
Buell CR, Yuan Q, Ouyang S, Liu J, Zhu W, Wang A, Maiti R, Haas B, Wortman J, Pertea M, Jones KM, Kim M, Overton L, Tsitrin T, Fadrosh D, Bera J, Weaver B, Jin S, Johri S, Reardon M, Webb K, Hill J, Moffat K, Tallon L, Van Aken S, Lewis M, Utterback T, Feldblyum T, Zismann V, Iobst S, Hsiao J, de Vazeille AR, Salzberg SL, White O, Fraser C, Yu Y, Kim H, Rambo T, Currie J, Collura K, Kernodle-Thompson S, Wei F, Kudrna K, Ammiraju JSS, Luo M, Goicoechea JL, Wing RA, Henry D, Oates R, Palmer M, Pries G, Saski C, Simmons J, Soderlund C, Nelson W, de la Bastide M, Spiegel L, Nascimento L, Huang E, Preston R, Zutavern T, Palmer L, O'Shaughnessy A, Dike S, McCombie WR, Minx P, Cordum H, Wilson R, Jin W, Lee HR, Jiang J, Jackson S. Sequence, annotation, and analysis of synteny between rice chromosome 3 and diverged grass species. Genome Res 2005; 15:1284-91. [PMID: 16109971 PMCID: PMC1199543 DOI: 10.1101/gr.3869505] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.3] [Indexed: 01/10/2023]
Abstract
Rice (Oryza sativa L.) chromosome 3 is evolutionarily conserved across the cultivated cereals and shares large blocks of synteny with maize and sorghum, which diverged from rice more than 50 million years ago. To begin to completely understand this chromosome, we sequenced, finished, and annotated 36.1 Mb (approximately 97%) from O. sativa subsp. japonica cv Nipponbare. Annotation features of the chromosome include 5915 genes, of which 913 are related to transposable elements. A putative function could be assigned to 3064 genes, with another 757 genes annotated as expressed, leaving 2094 that encode hypothetical proteins. Similarity searches against the proteome of Arabidopsis thaliana revealed putative homologs for 67% of the chromosome 3 proteins. Further searches of a nonredundant amino acid database, the Pfam domain database, plant Expressed Sequence Tags, and genomic assemblies from sorghum and maize revealed only 853 nontransposable element related proteins from chromosome 3 that lacked similarity to other known sequences. Interestingly, 426 of these have a paralog within the rice genome. A comparative physical map of the wild progenitor species, Oryza nivara, with japonica chromosome 3 revealed a high degree of sequence identity and synteny between these two species, which diverged approximately 10,000 years ago. Although no major rearrangements were detected, the deduced size of the O. nivara chromosome 3 was 21% smaller than that of japonica. Synteny between rice and other cereals using an integrated maize physical map and wheat genetic map was strikingly high, further supporting the use of rice and, in particular, chromosome 3, as a model for comparative studies among the cereals.
Affiliation(s)
- C Robin Buell
- The Institute for Genomic Research, Rockville, Maryland 20850, USA.
|
23
|
Jantasuriyarat C, Gowda M, Haller K, Hatfield J, Lu G, Stahlberg E, Zhou B, Li H, Kim H, Yu Y, Dean RA, Wing RA, Soderlund C, Wang GL. Large-scale identification of expressed sequence tags involved in rice and rice blast fungus interaction. Plant Physiol 2005; 138:105-15. [PMID: 15888683 PMCID: PMC1104166 DOI: 10.1104/pp.104.055624] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Indexed: 05/02/2023]
Abstract
To better understand the molecular basis of the defense response against the rice blast fungus (Magnaporthe grisea), a large-scale expressed sequence tag (EST) sequencing approach was used to identify genes involved in the early infection stages in rice (Oryza sativa). Six cDNA libraries were constructed using infected leaf tissues harvested from 6 conditions: resistant, partially resistant, and susceptible reactions at both 6 and 24 h after inoculation. Two additional libraries were constructed using uninoculated leaves and leaves from the lesion mimic mutant spl11. A total of 68,920 ESTs were generated from 8 libraries. Clustering and assembly analyses resulted in 13,570 unique sequences from 10,934 contigs and 2,636 singletons. Gene function classification showed that 42% of the ESTs were predicted to have putative gene function. Comparison of the pathogen-challenged libraries with the uninoculated control library revealed an increase in the percentage of genes in the functional categories of defense and signal transduction mechanisms and cell cycle control, cell division, and chromosome partitioning. In addition, hierarchical clustering analysis grouped the eight libraries based on their disease reactions. A total of 7,748 new and unique ESTs were identified from our collection compared with the KOME full-length cDNA collection. Interestingly, we found that rice ESTs are more closely related to sorghum (Sorghum bicolor) ESTs than to barley (Hordeum vulgare), wheat (Triticum aestivum), and maize (Zea mays) ESTs. The large cataloged collection of rice ESTs in this study provides a solid foundation for further characterization of the rice defense response and is a useful public genomic resource for rice functional genomics studies.
|
24
|
Abstract
Many clone-based physical maps have been built with the FingerPrinted Contig (FPC) software, which is written in C and runs locally for fast and flexible analysis. If the maps were viewable only from FPC, they would not be as useful to the whole community, since FPC must be installed on the user's machine and the database downloaded. Hence, we have created a set of Web tools so users can easily view the FPC data and perform salient queries with standard browsers. This set includes the following four programs: WebFPC, a view of the contigs; WebChrom, the location of the contigs and genetic markers along the chromosome; WebBSS, locating user-supplied sequence on the map; and WebFCmp, comparing fingerprints. For additional FPC support, we have developed an FPC module for BioPerl and an FPC browser using the Generic Model Organism Project (GMOD) genome browser (GBrowse), where the FPC BioPerl module generates the data files for input into GBrowse. This provides an alternative to the WebChrom/WebFPC view. These tools are available to download along with documentation. The tools have been implemented for both the rice (Oryza sativa) and maize (Zea mays) FPC maps, which both contain the locations of clones, markers, genetic markers, and sequenced clones (along with links to sites that contain additional information).
Affiliation(s)
- Vishal Pampanwar
- Arizona Genomics Computational Laboratory, BIO5 Institute, University of Arizona, Tucson, Arizona 85721, USA
|
25
|
Gardiner J, Schroeder S, Polacco ML, Sanchez-Villeda H, Fang Z, Morgante M, Landewe T, Fengler K, Useche F, Hanafey M, Tingey S, Chou H, Wing R, Soderlund C, Coe EH. Anchoring 9,371 maize expressed sequence tagged unigenes to the bacterial artificial chromosome contig map by two-dimensional overgo hybridization. Plant Physiol 2004; 134:1317-26. [PMID: 15020742 PMCID: PMC419808 DOI: 10.1104/pp.103.034538] [Citation(s) in RCA: 55] [Impact Index Per Article: 2.8] [Indexed: 05/17/2023]
Abstract
Our goal is to construct a robust physical map for maize (Zea mays) comprehensively integrated with the genetic map. We have used a two-dimensional 24 x 24 overgo pooling strategy to anchor maize expressed sequence tagged (EST) unigenes to 165,888 bacterial artificial chromosomes (BACs) on high-density filters. A set of 70,716 public maize ESTs seeded derivation of 10,723 EST unigene assemblies. From these assemblies, 10,642 overgo sequences of 40 bp were applied as hybridization probes. BAC addresses were obtained for 9,371 overgo probes, representing an 88% success rate. More than 96% of the successful overgo probes identified two or more BACs, while 5% identified more than 50 BACs. The majority of BACs identified (79%) were hybridized with one or two overgos. A small number of BACs hybridized with eight or more overgos, suggesting that these BACs must be gene rich. Approximately 5,670 overgos identified BACs assembled within one contig, indicating that these probes are highly locus specific. A total of 1,795 megabases (Mb; 87%) of the total 2,050 Mb in BAC contigs were associated with one or more overgos, which are serving as sequence-tagged sites for single nucleotide polymorphism development. Overgo density ranged from less than one overgo per megabase to greater than 20 overgos per megabase. The majority of contigs (52%) hit by overgos contained three to nine overgos per megabase. Analysis of approximately 1,022 Mb of genetically anchored BAC contigs indicates that 9,003 of the total 13,900 overgo-contig sites are genetically anchored. Our results indicate overgos are a powerful approach for generating gene-specific hybridization probes that are facilitating the assembly of an integrated genetic and physical map for maize.
Affiliation(s)
- Jack Gardiner
- Department of Agronomy, University of Missouri, Columbia, Missouri 65211, USA.
|
26
|
Chen M, Presting G, Barbazuk WB, Goicoechea JL, Blackmon B, Fang G, Kim H, Frisch D, Yu Y, Sun S, Higingbottom S, Phimphilai J, Phimphilai D, Thurmond S, Gaudette B, Li P, Liu J, Hatfield J, Main D, Farrar K, Henderson C, Barnett L, Costa R, Williams B, Walser S, Atkins M, Hall C, Budiman MA, Tomkins JP, Luo M, Bancroft I, Salse J, Regad F, Mohapatra T, Singh NK, Tyagi AK, Soderlund C, Dean RA, Wing RA. An integrated physical and genetic map of the rice genome. Plant Cell 2002; 14:537-45. [PMID: 11910002 PMCID: PMC150577 DOI: 10.1105/tpc.010485] [Citation(s) in RCA: 239] [Impact Index Per Article: 10.9] [Received: 11/25/2001] [Accepted: 02/05/2002] [Indexed: 05/17/2023]
Abstract
Rice was chosen as a model organism for genome sequencing because of its economic importance, small genome size, and syntenic relationship with other cereal species. We have constructed a bacterial artificial chromosome fingerprint-based physical map of the rice genome to facilitate the whole-genome sequencing of rice. Most of the rice genome (approximately 90.6%) was anchored genetically by overgo hybridization, DNA gel blot hybridization, and in silico anchoring. Genome sequencing data also were integrated into the rice physical map. Comparison of the genetic and physical maps reveals that recombination is suppressed severely in centromeric regions as well as on the short arms of chromosomes 4 and 10. This integrated high-resolution physical map of the rice genome will greatly facilitate whole-genome sequencing by helping to identify a minimum tiling path of clones to sequence. Furthermore, the physical map will aid map-based cloning of agronomically important genes and will provide an important tool for the comparative analysis of grass genomes.
Affiliation(s)
- Mingsheng Chen
- Clemson University Genomics Institute, 100 Jordan Hall, Clemson, South Carolina 29634-5727, USA
|
27
|
Deloukas P, Matthews LH, Ashurst J, Burton J, Gilbert JG, Jones M, Stavrides G, Almeida JP, Babbage AK, Bagguley CL, Bailey J, Barlow KF, Bates KN, Beard LM, Beare DM, Beasley OP, Bird CP, Blakey SE, Bridgeman AM, Brown AJ, Buck D, Burrill W, Butler AP, Carder C, Carter NP, Chapman JC, Clamp M, Clark G, Clark LN, Clark SY, Clee CM, Clegg S, Cobley VE, Collier RE, Connor R, Corby NR, Coulson A, Coville GJ, Deadman R, Dhami P, Dunn M, Ellington AG, Frankland JA, Fraser A, French L, Garner P, Grafham DV, Griffiths C, Griffiths MN, Gwilliam R, Hall RE, Hammond S, Harley JL, Heath PD, Ho S, Holden JL, Howden PJ, Huckle E, Hunt AR, Hunt SE, Jekosch K, Johnson CM, Johnson D, Kay MP, Kimberley AM, King A, Knights A, Laird GK, Lawlor S, Lehvaslaiho MH, Leversha M, Lloyd C, Lloyd DM, Lovell JD, Marsh VL, Martin SL, McConnachie LJ, McLay K, McMurray AA, Milne S, Mistry D, Moore MJ, Mullikin JC, Nickerson T, Oliver K, Parker A, Patel R, Pearce TA, Peck AI, Phillimore BJ, Prathalingam SR, Plumb RW, Ramsay H, Rice CM, Ross MT, Scott CE, Sehra HK, Shownkeen R, Sims S, Skuce CD, Smith ML, Soderlund C, Steward CA, Sulston JE, Swann M, Sycamore N, Taylor R, Tee L, Thomas DW, Thorpe A, Tracey A, Tromans AC, Vaudin M, Wall M, Wallis JM, Whitehead SL, Whittaker P, Willey DL, Williams L, Williams SA, Wilming L, Wray PW, Hubbard T, Durbin RM, Bentley DR, Beck S, Rogers J. The DNA sequence and comparative analysis of human chromosome 20. Nature 2001; 414:865-71. [PMID: 11780052 DOI: 10.1038/414865a] [Citation(s) in RCA: 148] [Impact Index Per Article: 6.4] [Indexed: 12/22/2022]
Abstract
The finished sequence of human chromosome 20 comprises 59,187,298 base pairs (bp) and represents 99.4% of the euchromatic DNA. A single contig of 26 megabases (Mb) spans the entire short arm, and five contigs separated by gaps totalling 320 kb span the long arm of this metacentric chromosome. An additional 234,339 bp of sequence has been determined within the pericentromeric region of the long arm. We annotated 727 genes and 168 pseudogenes in the sequence. About 64% of these genes have a 5' and a 3' untranslated region and a complete open reading frame. Comparative analysis of the sequence of chromosome 20 to whole-genome shotgun-sequence data of two other vertebrates, the mouse Mus musculus and the puffer fish Tetraodon nigroviridis, provides an independent measure of the efficiency of gene annotation, and indicates that this analysis may account for more than 95% of all coding exons and almost all genes.
Affiliation(s)
- P Deloukas
- The Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK.
|
28
|
Bentley DR, Deloukas P, Dunham A, French L, Gregory SG, Humphray SJ, Mungall AJ, Ross MT, Carter NP, Dunham I, Scott CE, Ashcroft KJ, Atkinson AL, Aubin K, Beare DM, Bethel G, Brady N, Brook JC, Burford DC, Burrill WD, Burrows C, Butler AP, Carder C, Catanese JJ, Clee CM, Clegg SM, Cobley V, Coffey AJ, Cole CG, Collins JE, Conquer JS, Cooper RA, Culley KM, Dawson E, Dearden FL, Durbin RM, de Jong PJ, Dhami PD, Earthrowl ME, Edwards CA, Evans RS, Gillson CJ, Ghori J, Green L, Gwilliam R, Halls KS, Hammond S, Harper GL, Heathcott RW, Holden JL, Holloway E, Hopkins BL, Howard PJ, Howell GR, Huckle EJ, Hughes J, Hunt PJ, Hunt SE, Izmajlowicz M, Jones CA, Joseph SS, Laird G, Langford CF, Lehvaslaiho MH, Leversha MA, McCann OT, McDonald LM, McDowall J, Maslen GL, Mistry D, Moschonas NK, Neocleous V, Pearson DM, Phillips KJ, Porter KM, Prathalingam SR, Ramsey YH, Ranby SA, Rice CM, Rogers J, Rogers LJ, Sarafidou T, Scott DJ, Sharp GJ, Shaw-Smith CJ, Smink LJ, Soderlund C, Sotheran EC, Steingruber HE, Sulston JE, Taylor A, Taylor RG, Thorpe AA, Tinsley E, Warry GL, Whittaker A, Whittaker P, Williams SH, Wilmer TE, Wooster R, Wright CL. The physical maps for sequencing human chromosomes 1, 6, 9, 10, 13, 20 and X. Nature 2001; 409:942-3. [PMID: 11237015 DOI: 10.1038/35057165] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.3] [Indexed: 11/09/2022]
Abstract
We constructed maps for eight chromosomes (1, 6, 9, 10, 13, 20, X and (previously) 22), representing one-third of the genome, by building landmark maps, isolating bacterial clones and assembling contigs. By this approach, we could establish the long-range organization of the maps early in the project, and all contig extension, gap closure and problem-solving was simplified by containment within local regions. The maps currently represent more than 94% of the euchromatic (gene-containing) regions of these chromosomes in 176 contigs, and contain 96% of the chromosome-specific markers in the human gene map. By measuring the remaining gaps, we can assess chromosome length and coverage in sequenced clones.
MESH Headings
- Chromosomes, Human, Pair 1
- Chromosomes, Human, Pair 10
- Chromosomes, Human, Pair 13
- Chromosomes, Human, Pair 20
- Chromosomes, Human, Pair 6
- Contig Mapping
- Genome, Human
- Humans
- X Chromosome
|
29
|
McPherson JD, Marra M, Hillier L, Waterston RH, Chinwalla A, Wallis J, Sekhon M, Wylie K, Mardis ER, Wilson RK, Fulton R, Kucaba TA, Wagner-McPherson C, Barbazuk WB, Gregory SG, Humphray SJ, French L, Evans RS, Bethel G, Whittaker A, Holden JL, McCann OT, Dunham A, Soderlund C, Scott CE, Bentley DR, Schuler G, Chen HC, Jang W, Green ED, Idol JR, Maduro VV, Montgomery KT, Lee E, Miller A, Emerling S, Gibbs R, Scherer S, Gorrell JH, Sodergren E, Clerc-Blankenburg K, Tabor P, Naylor S, Garcia D, de Jong PJ, Catanese JJ, Nowak N, Osoegawa K, Qin S, Rowen L, Madan A, Dors M, Hood L, Trask B, Friedman C, Massa H, Cheung VG, Kirsch IR, Reid T, Yonescu R, Weissenbach J, Bruls T, Heilig R, Branscomb E, Olsen A, Doggett N, Cheng JF, Hawkins T, Myers RM, Shang J, Ramirez L, Schmutz J, Velasquez O, Dixon K, Stone NE, Cox DR, Haussler D, Kent WJ, Furey T, Rogic S, Kennedy S, Jones S, Rosenthal A, Wen G, Schilhabel M, Gloeckner G, Nyakatura G, Siebert R, Schlegelberger B, Korenberg J, Chen XN, Fujiyama A, Hattori M, Toyoda A, Yada T, Park HS, Sakaki Y, Shimizu N, Asakawa S, Kawasaki K, Sasaki T, Shintani A, Shimizu A, Shibuya K, Kudoh J, Minoshima S, Ramser J, Seranski P, Hoff C, Poustka A, Reinhardt R, Lehrach H. A physical map of the human genome. Nature 2001; 409:934-41. [PMID: 11237014 DOI: 10.1038/35057157] [Citation(s) in RCA: 549] [Impact Index Per Article: 23.9] [Indexed: 12/13/2022]
Abstract
The human genome is by far the largest genome to be sequenced, and its size and complexity present many challenges for sequence assembly. The International Human Genome Sequencing Consortium constructed a map of the whole genome to enable the selection of clones for sequencing and for the accurate assembly of the genome sequence. Here we report the construction of the whole-genome bacterial artificial chromosome (BAC) map and its integration with previous landmark maps and information from mapping efforts focused on specific chromosomal regions. We also describe the integration of sequence data with the map.
Affiliation(s)
- J D McPherson
- Washington University School of Medicine, Genome Sequencing Center, Department of Genetics, St. Louis, Missouri 63108, USA.
|
30
|
Abstract
Contigs have been assembled, and over 2800 clones selected for sequencing for human chromosomes 9, 10 and 13. Using the FPC (FingerPrinted Contig) software, the contigs are assembled with markers and complete digest fingerprints, and the contigs are ordered and localised by a global framework. Publicly available resources have been used, such as, the 1998 International Gene Map for the framework and the GSC Human BAC fingerprint database for the majority of the fingerprints. Additional markers and fingerprints are generated in-house to supplement this data. To support the scale up of building maps, FPC V4.7 has been extended to use markers with the fingerprints for assembly of contigs, new clones and markers can be automatically added to existing contigs, and poorly assembled contigs are marked accordingly. To test the automatic assembly, a simulated complete digest of 110 Mb of concatenated human sequence was used to create datasets with varying coverage, length of clones, and types of error. When no error was introduced and a tolerance of 7 was used in assembly, the largest contig with no false positive overlaps has 9534 clones with 37 out-of-order clones, that is, the starting coordinates of adjacent clones are in the wrong order. This paper describes the new features in FPC, the scenario for building the maps of chromosomes 9, 10 and 13, and the results from the simulation.
Affiliation(s)
- C Soderlund
- Clemson University Genomic Institute, Clemson, South Carolina 29634-5808, USA.
|
32
|
Steingruber HE, Dunham A, Coffey AJ, Clegg SM, Howell GR, Maslen GL, Scott CE, Gwilliam R, Hunt PJ, Sotheran EC, Huckle EJ, Hunt SE, Dhami P, Soderlund C, Leversha MA, Bentley DR, Ross MT. High-resolution landmark framework for the sequence-ready mapping of Xq23-q26.1. Genome Res 1999; 9:751-62. [PMID: 10447510 PMCID: PMC310799] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/13/2023]
Abstract
We have established a landmark framework map over 20-25 Mb of the long arm of the human X chromosome using yeast artificial chromosome (YAC) clones. The map has approximately one landmark per 45 kb of DNA and stretches from DXS7531 in proximal Xq23 to DXS895 in proximal Xq26, connecting to published framework maps on its proximal and distal sides. There are three gaps in the framework map resulting from the failure to obtain clone coverage from the YAC resources available. Estimates of the maximum sizes of these gaps have been obtained. The four YAC contigs have been positioned and oriented using somatic-cell hybrids and fluorescence in situ hybridization, and the largest is estimated to cover approximately 15 Mb of DNA. The framework map is being used to assemble a sequence-ready map in large-insert bacterial clones, as part of an international effort to complete the sequence of the X chromosome. PAC and BAC contigs currently cover 18 Mb of the region, and from these, 12 Mb of finished sequence is available.
Affiliation(s)
- H E Steingruber
- The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, UK
|
33
|
Deloukas P, Schuler GD, Gyapay G, Beasley EM, Soderlund C, Rodriguez-Tomé P, Hui L, Matise TC, McKusick KB, Beckmann JS, Bentolila S, Bihoreau M, Birren BB, Browne J, Butler A, Castle AB, Chiannilkulchai N, Clee C, Day PJ, Dehejia A, Dibling T, Drouot N, Duprat S, Fizames C, Fox S, Gelling S, Green L, Harrison P, Hocking R, Holloway E, Hunt S, Keil S, Lijnzaad P, Louis-Dit-Sully C, Ma J, Mendis A, Miller J, Morissette J, Muselet D, Nusbaum HC, Peck A, Rozen S, Simon D, Slonim DK, Staples R, Stein LD, Stewart EA, Suchard MA, Thangarajah T, Vega-Czarny N, Webber C, Wu X, Hudson J, Auffray C, Nomura N, Sikela JM, Polymeropoulos MH, James MR, Lander ES, Hudson TJ, Myers RM, Cox DR, Weissenbach J, Boguski MS, Bentley DR. A physical map of 30,000 human genes. Science 1998; 282:744-6. [PMID: 9784132 DOI: 10.1126/science.282.5389.744] [Citation(s) in RCA: 434] [Impact Index Per Article: 16.7] [Indexed: 11/02/2022]
Abstract
A map of 30,181 human gene-based markers was assembled and integrated with the current genetic map by radiation hybrid mapping. The new gene map contains nearly twice as many genes as the previous release, includes most genes that encode proteins of known function, and is twofold to threefold more accurate than the previous version. A redesigned, more informative and functional World Wide Web site (www.ncbi.nlm.nih.gov/genemap) provides the mapping information and associated data and annotations. This resource constitutes an important infrastructure and tool for the study of complex genetic traits, the positional cloning of disease genes, the cross-referencing of mammalian genomes, and validated human transcribed sequences for large-scale studies of gene expression.
Affiliation(s)
- P Deloukas
- Sanger Centre, Hinxton Hall, Hinxton, Cambridge CB10 1SA UK
|
34
|
Abstract
MOTIVATION: Extensions have been made to the RHMAPPER-1.1 package. One set of extensions computes the totally linked markers and uses the results as input to the salient RHMAPPER functions. The second set of extensions uses TKperl to provide an interactive interface for ease of querying the database and displaying maps. AVAILABILITY: The extensions can be obtained via ftp.sanger.ac.uk/pub/zmapper. SUPPLEMENTARY INFORMATION: The User's Manual can be viewed from http://www.sanger.ac.uk/Users/cari/Z.shtml. CONTACT: cari@sanger.ac.uk
Affiliation(s)
- C Soderlund
- The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.
|
35
|
Abstract
MOTIVATION: To meet the demands of large-scale sequencing, thousands of clones must be fingerprinted and assembled into contigs. To determine the order of clones, a typical experiment is to digest the clones with one or more restriction enzymes and measure the resulting fragments. The probability of two clones overlapping is based on the similarity of their fragments. A contig contains two or more overlapping clones and a minimal tiling path of clones is selected to be sequenced. Interactive software with algorithmic support is necessary to assemble the clones into contigs quickly. RESULTS: FPC (fingerprinted contigs) is an interactive program for building contigs from restriction fingerprinted clones. FPC uses an algorithm to cluster clones into contigs based on their probability of coincidence score. For each contig, it builds a consensus band (CB) map which is similar to a restriction map; but it does not try to resolve all the errors. The CB map is used to assign coordinates to the clones based on their alignment to the map and to provide a detailed visualization of the clone overlap. FPC has editing facilities for the user to refine the coordinates and to remove poorly fingerprinted clones. Functions are available for updating an FPC database with new clones. Contigs can easily be merged, split or deleted. Markers can be added to clones and are displayed with the appropriate contig. Sequence-ready clones can be selected and their sequencing status displayed. As such, FPC is an integrated program for the assembly of sequence-ready clones for large-scale sequencing projects.
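As a rough illustration of the fragment-similarity idea in the abstract above, the following Python sketch counts the bands two fingerprints share within a tolerance. It is not FPC's probability of coincidence score, only a simpler stand-in; the function name `shared_bands`, the band values, and the tolerance of 7 are assumptions chosen for illustration.

```python
def shared_bands(bands_a, bands_b, tolerance):
    """Count bands that pair up between two fingerprints within a tolerance.

    Hypothetical helper for illustration; not part of FPC.
    """
    a, b = sorted(bands_a), sorted(bands_b)
    i = j = matches = 0
    # Walk both sorted band lists in parallel, pairing each band at most once.
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tolerance:
            matches += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return matches


# Made-up band values for two clones.
clone_x = [100, 250, 400, 900]
clone_y = [103, 260, 398, 700]
print(shared_bands(clone_x, clone_y, tolerance=7))
```

A higher shared-band count suggests a more likely overlap; a real scoring scheme would also account for how many matches are expected by chance.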
|
36
|
Schuler GD, Boguski MS, Stewart EA, Stein LD, Gyapay G, Rice K, White RE, Rodriguez-Tomé P, Aggarwal A, Bajorek E, Bentolila S, Birren BB, Butler A, Castle AB, Chiannilkulchai N, Chu A, Clee C, Cowles S, Day PJR, Dibling T, East C, Drouot N, Dunham I, Duprat S, Edwards C, Fan JB, Fang N, Fizames C, Garrett C, Green L, Hadley D, Harris M, Harrison P, Brady S, Hicks A, Holloway E, Hui L, Hussain S, Louis-Dit-Sully C, Ma J, MacGilvery A, Mader C, Maratukulam A, Matise TC, McKusick KB, Morissette J, Mungall A, Muselet D, Nusbaum HC, Page DC, Peck A, Perkins S, Piercy M, Qin F, Quackenbush J, Ranby S, Reif T, Rozen S, Sanders C, She X, Silva J, Slonim DK, Soderlund C, Sun WL, Tabar P, Thangarajah T, Vega-Czarny N, Vollrath D, Voyticky S, Wilmer T, Wu X, Adams MD, Auffray C, Walter NAR, Brandon R, Dehejia A, Goodfellow PN, Houlgatte R, Hudson JR, Ide SE, Iorio KR, Lee WY, Seki N, Nagase T, Ishikawa K, Nomura N, Phillips C, Polymeropoulos MH, Sandusky M, Schmitt K, Berry R, Swanson K, Torres R, Venter JC, Sikela JM, Beckmann JS, Weissenbach J, Myers RM, Cox DR, James MR, Bentley D, Deloukas P, Lander ES, Hudson TJ. A Gene Map of the Human Genome. Science 1996. [DOI: 10.1126/science.274.5287.540] [Citation(s) in RCA: 717] [Impact Index Per Article: 25.6] [Indexed: 11/02/2022]
|
37
|
Schuler GD, Boguski MS, Stewart EA, Stein LD, Gyapay G, Rice K, White RE, Rodriguez-Tomé P, Aggarwal A, Bajorek E, Bentolila S, Birren BB, Butler A, Castle AB, Chiannilkulchai N, Chu A, Clee C, Cowles S, Day PJ, Dibling T, Drouot N, Dunham I, Duprat S, East C, Edwards C, Fan JB, Fang N, Fizames C, Garrett C, Green L, Hadley D, Harris M, Harrison P, Brady S, Hicks A, Holloway E, Hui L, Hussain S, Louis-Dit-Sully C, Ma J, MacGilvery A, Mader C, Maratukulam A, Matise TC, McKusick KB, Morissette J, Mungall A, Muselet D, Nusbaum HC, Page DC, Peck A, Perkins S, Piercy M, Qin F, Quackenbush J, Ranby S, Reif T, Rozen S, Sanders C, She X, Silva J, Slonim DK, Soderlund C, Sun WL, Tabar P, Thangarajah T, Vega-Czarny N, Vollrath D, Voyticky S, Wilmer T, Wu X, Adams MD, Auffray C, Walter NA, Brandon R, Dehejia A, Goodfellow PN, Houlgatte R, Hudson JR, Ide SE, Iorio KR, Lee WY, Seki N, Nagase T, Ishikawa K, Nomura N, Phillips C, Polymeropoulos MH, Sandusky M, Schmitt K, Berry R, Swanson K, Torres R, Venter JC, Sikela JM, Beckmann JS, Weissenbach J, Myers RM, Cox DR, James MR, Bentley D, Deloukas P, Lander ES, Hudson TJ. A gene map of the human genome. Science 1996; 274:540-6. [PMID: 8849440] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/02/2023]
Abstract
The human genome is thought to harbor 50,000 to 100,000 genes, of which about half have been sampled to date in the form of expressed sequence tags. An international consortium was organized to develop and map gene-based sequence tagged site markers on a set of two radiation hybrid panels and a yeast artificial chromosome library. More than 16,000 human genes have been mapped relative to a framework map that contains about 1000 polymorphic genetic markers. The gene map unifies the existing genetic and physical maps with the nucleotide and protein sequence databases in a fashion that should speed the discovery of genes underlying inherited human disease. The integrated resource is available through a site on the World Wide Web at http://www.ncbi.nlm.nih.gov/SCIENCE96/.
Affiliation(s)
- G D Schuler
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
|
38
|
Abstract
SAM (system for assembling markers) is a system which supports man-machine problem solving for iteratively ordering a set of markers. SAM aids the user in partially ordering a set of markers based on incomplete and uncertain data. As data is added and modified, SAM aids the user in updating the previously assembled maps. The input is a file of clones and for each clone, a list of the markers contained within it. The objective is to order the set of markers such that the markers contained in each clone are consecutive. The user directs the map building by selecting functions to assemble a region of markers, order the clones to fit the order of the markers and position new markers within an ordered set of markers. The user can edit the input data, edit the assembled map and add clones to the map based on their marker content. The results are displayed graphically and can be saved in a solution file. Based on the partial map, the user designs new experiments or edits the existing data to fill gaps and resolve ambiguities. When a previously assembled map is loaded into SAM, it is automatically updated with the new or altered data. SAM treats all markers as points, but has special features for multiple copy and long markers so that they can be used in the map building process. This system has supported the building of a YAC map of human chromosome 22 at the Sanger Centre, where use of Alu-PCR product markers is a major component in determining clone overlap and where we have an on-going effort to accumulate data from various sources. SAM is also being used at various other laboratories.
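The ordering objective stated in the abstract above (the markers contained in each clone should be consecutive in the map) can be checked with a few lines of Python. This is a minimal sketch under stated assumptions, not SAM's own algorithm: the function names `clone_is_consecutive` and `violating_clones`, and the marker and clone identifiers, are hypothetical, and every marker is assumed to appear exactly once in the proposed order.

```python
def clone_is_consecutive(order, clone_markers):
    """Return True if a clone's markers occupy adjacent positions in `order`."""
    position = {marker: i for i, marker in enumerate(order)}
    idx = sorted(position[m] for m in set(clone_markers))
    # With unique positions, the span equals the count only when there is no gap.
    return not idx or idx[-1] - idx[0] + 1 == len(idx)


def violating_clones(order, clones):
    """List the clones whose markers are not consecutive in the proposed order."""
    return [name for name, markers in clones.items()
            if not clone_is_consecutive(order, markers)]


# Hypothetical input: each clone maps to the markers it contains.
clones = {
    "cA": ["m1", "m2"],
    "cB": ["m2", "m3", "m4"],
    "cC": ["m1", "m3"],
}
print(violating_clones(["m1", "m2", "m3", "m4"], clones))
```

An order with no violating clones satisfies the objective. With incomplete or uncertain data, no such order may exist, which is the situation the abstract says interactive editing is meant to handle.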
|
39
|
Abstract
GRAM (Genomic Restriction map AsseMbly) takes as input single-digest restriction fragments for a set of overlapping clones and outputs one or more plausible partially ordered restriction maps. For each restriction map, GRAM shows the corresponding alignment of the input clone fragments. Due to the error and uncertainty in experimental data, this problem is computationally difficult to solve; therefore, the principal objective in the design of GRAM is to facilitate man-machine collaborative problem solving. GRAM quickly approximates a solution, as follows. (i) A clustering algorithm determines a probable set of restriction fragments. (ii) An assembly algorithm permutes the set of restriction fragments such that the maximal number of clone fragments are contiguous. The output of the GRAM algorithm is displayed for the user to query and edit. This paper describes the stochastic assembly algorithm and shows how it works with the interactive graphics to support man-machine problem solving. In order to test and verify the performance of GRAM, we have developed a program called genfragII to simulate the digestion of clones and fragments; this program is described and results are presented. GRAM is also being used for a number of genome mapping projects.
Affiliation(s)
- C Soderlund
- Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, NM 87545
|
40
|
White O, Soderlund C, Shanmugan P, Fields C. Information contents and dinucleotide compositions of plant intron sequences vary with evolutionary origin. Plant Mol Biol 1992; 19:1057-64. [PMID: 1511130 DOI: 10.1007/bf00040537] [Citation(s) in RCA: 26] [Impact Index Per Article: 0.8] [Indexed: 05/09/2023]
Abstract
The DNA sequence composition of 526 dicot and 345 monocot intron sequences has been characterized using computational methods. Splice site information content and bulk intron and exon dinucleotide composition were determined. Positions 4 and 5 of 5' splice sites contain different statistically significant levels of information in the two groups. Basal levels of information in introns are higher in dicots than in monocots. Two dinucleotide groups, WW (AA, AU, UA, UU) and SS (CC, CG, GC, GG), have significantly different frequencies in exons and introns of the two plant groups. These results suggest that the mechanisms of splice-site recognition and binding may differ between dicot and monocot plants.
Affiliation(s)
- O White
- Computing Research Laboratory, New Mexico State University, Las Cruces 88003-0001
|