Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Nelson WM, Bharti AK, Butler E, Wei F, Fuks G, Kim H, Wing RA, Messing J, Soderlund C. Whole-genome validation of high-information-content fingerprinting. Plant Physiol 2005;139:27-38. [PMID: 16166258 PMCID: PMC1203355 DOI: 10.1104/pp.105.061978] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.7] [Indexed: 05/04/2023]

Cited by Other Article(s)

Alaux M, Rogers J, Letellier T, Flores R, Alfama F, Pommier C, Mohellibi N, Durand S, Kimmel E, Michotey C, Guerche C, Loaec M, Lainé M, Steinbach D, Choulet F, Rimbert H, Leroy P, Guilhot N, Salse J, Feuillet C, Paux E, Eversole K, Adam-Blondon AF, Quesneville H. Linking the International Wheat Genome Sequencing Consortium bread wheat reference genome sequence to wheat genetic and phenomic data. Genome Biol 2018;19:111. [PMID: 30115101 PMCID: PMC6097284 DOI: 10.1186/s13059-018-1491-4] [Citation(s) in RCA: 136] [Impact Index Per Article: 22.7] [Received: 04/13/2018] [Accepted: 07/23/2018] [Indexed: 01/24/2023] Open

Affiliation(s)

Michael Alaux URGI, INRA, Université Paris-Saclay, 78026, Versailles, France
Jane Rogers International Wheat Genome Sequencing Consortium (IWGSC), 18 High Street, Little Eversden, Cambridge, CB23 1HE, UK
Thomas Letellier URGI, INRA, Université Paris-Saclay, 78026, Versailles, France
Raphaël Flores URGI, INRA, Université Paris-Saclay, 78026, Versailles, France
Françoise Alfama URGI, INRA, Université Paris-Saclay, 78026, Versailles, France
Cyril Pommier URGI, INRA, Université Paris-Saclay, 78026, Versailles, France
Nacer Mohellibi URGI, INRA, Université Paris-Saclay, 78026, Versailles, France
Sophie Durand URGI, INRA, Université Paris-Saclay, 78026, Versailles, France
Erik Kimmel URGI, INRA, Université Paris-Saclay, 78026, Versailles, France
Célia Michotey URGI, INRA, Université Paris-Saclay, 78026, Versailles, France
Claire Guerche URGI, INRA, Université Paris-Saclay, 78026, Versailles, France
Mikaël Loaec URGI, INRA, Université Paris-Saclay, 78026, Versailles, France
Mathilde Lainé URGI, INRA, Université Paris-Saclay, 78026, Versailles, France
Delphine Steinbach URGI, INRA, Université Paris-Saclay, 78026, Versailles, France. Present address: GQE-Le Moulon UMR 320, INRA, Université Paris-Sud, Université Paris-Saclay, CNRS, AgroParisTech, Ferme du Moulon, 91190, Gif-sur-Yvette, France
Frédéric Choulet GDEC, INRA, Université Clermont Auvergne, 63000, Clermont-Ferrand, France
Hélène Rimbert GDEC, INRA, Université Clermont Auvergne, 63000, Clermont-Ferrand, France
Philippe Leroy GDEC, INRA, Université Clermont Auvergne, 63000, Clermont-Ferrand, France
Nicolas Guilhot GDEC, INRA, Université Clermont Auvergne, 63000, Clermont-Ferrand, France
Jérôme Salse GDEC, INRA, Université Clermont Auvergne, 63000, Clermont-Ferrand, France
Catherine Feuillet GDEC, INRA, Université Clermont Auvergne, 63000, Clermont-Ferrand, France. Present address: Inari Agriculture, 200 Sydney Street, Cambridge, MA, 02139, USA
Etienne Paux GDEC, INRA, Université Clermont Auvergne, 63000, Clermont-Ferrand, France
Kellye Eversole International Wheat Genome Sequencing Consortium (IWGSC), 5207 Wyoming Road, Bethesda, Maryland, 20816, USA
Anne-Françoise Adam-Blondon URGI, INRA, Université Paris-Saclay, 78026, Versailles, France
Hadi Quesneville URGI, INRA, Université Paris-Saclay, 78026, Versailles, France

Wei X, Xu Z, Wang G, Hou J, Ma X, Liu H, Liu J, Chen B, Luo M, Xie B, Li R, Ruan J, Liu X. pBACode: a random-barcode-based high-throughput approach for BAC paired-end sequencing and physical clone mapping. Nucleic Acids Res 2017;45:e52. [PMID: 27980066 PMCID: PMC5397170 DOI: 10.1093/nar/gkw1261] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 07/15/2016] [Accepted: 12/09/2016] [Indexed: 12/14/2022] Open

Affiliation(s)

Xiaolin Wei MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China; PTN (Peking University-Tsinghua University-National Institute of Biological Sciences) Joint Graduate Program, Beijing 100084, China; School of Life Sciences, Peking University, Beijing 100084, China
Zhichao Xu MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China; PTN (Peking University-Tsinghua University-National Institute of Biological Sciences) Joint Graduate Program, Beijing 100084, China
Guixing Wang Beidaihe Central Experiment Station, Chinese Academy of Fishery Sciences, Qinhuangdao 066100, China
Jilun Hou Beidaihe Central Experiment Station, Chinese Academy of Fishery Sciences, Qinhuangdao 066100, China
Xiaopeng Ma MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China; PTN (Peking University-Tsinghua University-National Institute of Biological Sciences) Joint Graduate Program, Beijing 100084, China
Haijin Liu Beidaihe Central Experiment Station, Chinese Academy of Fishery Sciences, Qinhuangdao 066100, China
Jiadong Liu National Key Laboratory of Crop Genetic Improvement and College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
Bo Chen National Key Laboratory of Crop Genetic Improvement and College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
Meizhong Luo National Key Laboratory of Crop Genetic Improvement and College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
Bingyan Xie Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Beijing 100081, China
Ruiqiang Li Novogene Bioinformatics Institute, Beijing 100083, China
Jue Ruan Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
Xiao Liu MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China

Zhang J, Kudrna D, Mu T, Li W, Copetti D, Yu Y, Goicoechea JL, Lei Y, Wing RA. Genome puzzle master (GPM): an integrated pipeline for building and editing pseudomolecules from fragmented sequences. Bioinformatics 2016;32:3058-3064. [PMID: 27318200 PMCID: PMC5048067 DOI: 10.1093/bioinformatics/btw370] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 04/07/2016] [Accepted: 06/06/2016] [Indexed: 12/16/2022] Open

Akpinar BA, Magni F, Yuce M, Lucas SJ, Šimková H, Šafář J, Vautrin S, Bergès H, Cattonaro F, Doležel J, Budak H. The physical map of wheat chromosome 5DS revealed gene duplications and small rearrangements. BMC Genomics 2015;16:453. [PMID: 26070810 PMCID: PMC4465308 DOI: 10.1186/s12864-015-1641-y] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 10/20/2014] [Accepted: 05/19/2015] [Indexed: 11/24/2022] Open

Zhang J, Shao C, Zhang L, Liu K, Gao F, Dong Z, Xu P, Chen S. A first generation BAC-based physical map of the half-smooth tongue sole (Cynoglossus semilaevis) genome. BMC Genomics 2014;15:215. [PMID: 24650389 PMCID: PMC3998196 DOI: 10.1186/1471-2164-15-215] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 11/15/2013] [Accepted: 03/10/2014] [Indexed: 02/06/2023] Open

Abstract

Background

Half-smooth tongue sole (Cynoglossus semilaevis Günther) has been exploited as a commercially important cultured marine flatfish, and females grow 2–3 times faster than males. Genetic studies, especially on the chromosomal sex-determining system of this species, have been carried out in the last decade. Although the genome of half-smooth tongue sole is relatively small (626.9 Mb), there are still some difficulties in the high-quality assembly of next-generation genome sequencing reads without the assistance of a physical map, especially for the W chromosome of this fish, owing to its abundance of repetitive sequences. The objective of this study is to construct a bacterial artificial chromosome (BAC)-based physical map for half-smooth tongue sole with the method of high information content fingerprinting (HICF).

Results

A physical map of half-smooth tongue sole was constructed with 30,294 valid fingerprints (7.5× genome coverage) with a tolerance of 4 and an initial cutoff of 1e-60. A total of 29,709 clones were assembled into 1,485 contigs with an average length of 539 kb and an N50 length of 664 kb. There were 394 contigs longer than the N50 length, and these contigs will be a useful resource for future integration with the linkage map and whole genome sequence assembly. The estimated physical length of the assembled contigs was 797 Mb, representing approximately 1.27-fold coverage of the half-smooth tongue sole genome. The largest contig contained 410 BAC clones with a physical length of 3.48 Mb. Almost all of the 676 BAC clones (99.9%) in the 21 randomly selected contigs were positively validated by PCR assays, thereby confirming the reliability of the assembly.

Conclusions

A first generation BAC-based physical map of half-smooth tongue sole was constructed with high reliability. The map will promote genetic improvement programs of this fish, especially integration of physical and genetic maps, fine mapping of important genes and/or QTL, comparative and evolutionary genomics studies, as well as whole genome sequence assembly.
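
The contig statistics reported in the Results of this abstract can be checked with a few lines of arithmetic. The following is a minimal Python sketch, not the authors' code: the `n50` helper and the toy contig lengths are hypothetical illustrations, and only the totals (797 Mb, 626.9 Mb, 1,485 contigs, 539 kb) are taken from the abstract.

```python
def n50(lengths):
    """Return the N50: the length L such that contigs of length >= L
    together contain at least half of the total assembled length."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contig lengths given")


# Hypothetical toy contig lengths in kb (NOT data from the paper).
toy_contigs_kb = [900, 700, 600, 400, 300, 100]
toy_n50 = n50(toy_contigs_kb)

# Figures quoted in the abstract.
map_length_mb = 797.0       # estimated physical length of assembled contigs
genome_size_mb = 626.9      # genome size of half-smooth tongue sole
fold_coverage = map_length_mb / genome_size_mb

# Cross-check: number of contigs times average contig length, in Mb.
approx_total_mb = 1485 * 539 / 1000

print(toy_n50)
print(round(fold_coverage, 2))
print(round(approx_total_mb))
```

With the abstract's figures, the ratio of map length to genome size rounds to the reported 1.27, and the product of contig count and average length lands close to the reported 797 Mb.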

Varshney RK, Mir RR, Bhatia S, Thudi M, Hu Y, Azam S, Zhang Y, Jaganathan D, You FM, Gao J, Riera-Lizarazu O, Luo MC. Integrated physical, genetic and genome map of chickpea (Cicer arietinum L.). Funct Integr Genomics 2014;14:59-73. [PMID: 24610029 PMCID: PMC4273598 DOI: 10.1007/s10142-014-0363-6] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 11/13/2013] [Revised: 01/27/2014] [Accepted: 01/31/2014] [Indexed: 10/25/2022]

Wang X, Liu Q, Wang H, Luo CX, Wang G, Luo M. A BAC based physical map and genome survey of the rice false smut fungus Villosiclava virens. BMC Genomics 2013;14:883. [PMID: 24341590 PMCID: PMC3878662 DOI: 10.1186/1471-2164-14-883] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 03/02/2013] [Accepted: 12/04/2013] [Indexed: 01/08/2023] Open

Abstract

Background

Rice false smut caused by Villosiclava virens is a devastating fungal disease that spreads in major rice-growing regions throughout the world. However, the genomic information for this fungal pathogen is limited and the pathogenic mechanism of this disease is still not clear. To facilitate genetic, molecular and genomic studies of this fungal pathogen, we constructed the first BAC-based physical map and performed the first genome survey for this species.

Results

High molecular weight genomic DNA was isolated from young mycelia of the Villosiclava virens strain UV-8b and a high-quality, large-insert and deep-coverage Bacterial Artificial Chromosome (BAC) library was constructed with the restriction enzyme HindIII. The BAC library consisted of 5,760 clones, representing 22.7-fold coverage of the UV-8b genome, with an average insert size of 140 kb and an empty clone rate of less than 1%. BAC fingerprinting generated successful fingerprints for 2,290 BAC clones. Using the fingerprints, a genome-wide BAC physical map was constructed that contained 194 contigs (2,035 clones) spanning 51.2 Mb in physical length. Bidirectional-end sequencing of 4,512 BAC clones generated 6,560 high quality BAC end sequences (BESs), with a total length of 3,030,658 bp, representing 8.54% of the genome sequence. Analysis of the BESs revealed general genome information, including 51.52% GC content, 22.51% repetitive sequences, a simple sequence repeat (SSR) density of 376.12/Mb and approximately 36.01% coding regions. Sequence comparisons to other available fungal genome sequences through BESs showed high similarities to Metarhizium anisopliae, Trichoderma reesei, Nectria haematococca and Cordyceps militaris, which were generally in agreement with the 18S rRNA gene analysis results.

Conclusion

This study provides the first BAC-based physical map and genome information for the important rice fungal pathogen Villosiclava virens. The BAC clones, physical map and genome information will serve as fundamental resources to accelerate the genetic, molecular and genomic studies of this pathogen, including positional cloning, comparative genomic analysis and whole genome sequencing. The BAC library and physical map have been opened to researchers as public genomic resources (http://gresource.hzau.edu.cn/resource/resource.html).
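
The library coverage and BES figures in the Results of this abstract imply a genome size that is not stated directly. The sketch below, assuming the usual relation coverage = clones × average insert size / genome size, back-calculates that size in two independent ways. The variable names are illustrative and the derived genome size is an inference from the quoted numbers, not a value reported in the abstract.

```python
# Figures quoted in the abstract.
n_clones = 5760             # clones in the BAC library
insert_kb = 140             # average insert size, kb
library_fold = 22.7         # reported genome coverage of the library
bes_total_bp = 3_030_658    # total length of BAC end sequences
bes_fraction = 0.0854       # BESs as a fraction of the genome (8.54%)

# Genome size implied by the library coverage, in Mb.
genome_from_library_mb = n_clones * insert_kb / library_fold / 1000

# Genome size implied by the BES fraction, in Mb.
genome_from_bes_mb = bes_total_bp / bes_fraction / 1e6

print(round(genome_from_library_mb, 1))
print(round(genome_from_bes_mb, 1))
```

Both estimates come out at roughly 35.5 Mb, so the two sets of figures in the abstract are mutually consistent.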

Breen J, Wicker T, Shatalina M, Frenkel Z, Bertin I, Philippe R, Spielmeyer W, Šimková H, Šafář J, Cattonaro F, Scalabrin S, Magni F, Vautrin S, Bergès H, Paux E, Fahima T, Doležel J, Korol A, Feuillet C, Keller B. A physical map of the short arm of wheat chromosome 1A. PLoS One 2013;8:e80272. [PMID: 24278269 PMCID: PMC3836966 DOI: 10.1371/journal.pone.0080272] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 05/08/2013] [Accepted: 10/11/2013] [Indexed: 12/31/2022] Open

Genomic resources for gene discovery, functional genome annotation, and evolutionary studies of maize and its close relatives. Genetics 2013;195:723-37. [PMID: 24037269 DOI: 10.1534/genetics.113.157115] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Indexed: 11/18/2022] Open

A 4-gigabase physical map unlocks the structure and evolution of the complex genome of Aegilops tauschii, the wheat D-genome progenitor. Proc Natl Acad Sci U S A 2013;110:7940-5. [PMID: 23610408 DOI: 10.1073/pnas.1219082110] [Citation(s) in RCA: 178] [Impact Index Per Article: 16.2] [Indexed: 11/18/2022] Open

Physical mapping integrated with syntenic analysis to characterize the gene space of the long arm of wheat chromosome 1A. PLoS One 2013;8:e59542. [PMID: 23613713 PMCID: PMC3628912 DOI: 10.1371/journal.pone.0059542] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 12/20/2012] [Accepted: 02/15/2013] [Indexed: 12/02/2022] Open

Navabi ZK, Huebert T, Sharpe AG, O’Neill CM, Bancroft I, Parkin IAP. Conserved microstructure of the Brassica B Genome of Brassica nigra in relation to homologous regions of Arabidopsis thaliana, B. rapa and B. oleracea. BMC Genomics 2013;14:250. [PMID: 23586706 PMCID: PMC3765694 DOI: 10.1186/1471-2164-14-250] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 01/09/2013] [Accepted: 04/04/2013] [Indexed: 11/17/2022] Open

Bozdag S, Close TJ, Lonardi S. A graph-theoretical approach to the selection of the minimum tiling path from a physical map. IEEE/ACM Trans Comput Biol Bioinform 2013;10:352-360. [PMID: 23929859 DOI: 10.1109/tcbb.2013.26] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 06/02/2023]

Palti Y, Genet C, Gao G, Hu Y, You FM, Boussaha M, Rexroad CE, Luo MC. A second generation integrated map of the rainbow trout (Oncorhynchus mykiss) genome: analysis of conserved synteny with model fish genomes. Mar Biotechnol (NY) 2012;14:343-357. [PMID: 22101344 DOI: 10.1007/s10126-011-9418-z] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Received: 08/09/2011] [Accepted: 10/18/2011] [Indexed: 05/31/2023]

Advances in BAC-based physical mapping and map integration strategies in plants. J Biomed Biotechnol 2012;2012:184854. [PMID: 22500080 PMCID: PMC3303678 DOI: 10.1155/2012/184854] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Received: 07/31/2011] [Revised: 10/26/2011] [Accepted: 11/11/2011] [Indexed: 12/29/2022] Open

de Boer JM, Borm TJA, Jesse T, Brugmans B, Wiggers-Perebolte L, de Leeuw L, Tang X, Bryan GJ, Bakker J, van Eck HJ, Visser RGF. A hybrid BAC physical map of potato: a framework for sequencing a heterozygous genome. BMC Genomics 2011;12:594. [PMID: 22142254 PMCID: PMC3261212 DOI: 10.1186/1471-2164-12-594] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 06/18/2011] [Accepted: 12/05/2011] [Indexed: 12/15/2022] Open

Abstract

Background

Potato is the world's third most important food crop, yet cultivar improvement and genomic research in general remain difficult because of the heterozygous and tetraploid nature of its genome. The development of physical map resources that can facilitate genomic analyses in potato has so far been very limited. Here we present the methods of construction and the general statistics of the first two genome-wide BAC physical maps of potato, which were made from the heterozygous diploid clone RH89-039-16 (RH).

Results

First, a gel electrophoresis-based physical map was made by AFLP fingerprinting of 64478 BAC clones, which were aligned into 4150 contigs with an estimated total length of 1361 Mb. Screening of BAC pools, followed by the KeyMaps in silico anchoring procedure, identified 1725 AFLP markers in the physical map, and 1252 BAC contigs were anchored to the ultradense potato genetic map. A second, sequence-tag-based physical map was constructed from 65919 whole genome profiling (WGP) BAC fingerprints and these were aligned into 3601 BAC contigs spanning 1396 Mb. The 39733 BAC clones that overlap between both physical maps provided anchors to 1127 contigs in the WGP physical map, and reduced the number of contigs to around 2800 in each map separately. Both physical maps were 1.64 times longer than the 850 Mb potato genome. Genome heterozygosity and incomplete merging of BAC contigs are two factors that can explain this map inflation. The contig information of both physical maps was united in a single table that describes the hybrid potato physical map.

Conclusions

The AFLP physical map has already been used by the Potato Genome Sequencing Consortium for sequencing 10% of the heterozygous genome of clone RH on a BAC-by-BAC basis. By layering a new WGP physical map on top of the AFLP physical map, a genetically anchored genome-wide framework of 322434 sequence tags has been created. This reference framework can be used for anchoring and ordering of genomic sequences of clone RH (and other potato genotypes), and opens the possibility to finish sequencing of the RH genome in a more efficient way via high throughput next generation approaches.

Zuccolo A, Bowers JE, Estill JC, Xiong Z, Luo M, Sebastian A, Goicoechea JL, Collura K, Yu Y, Jiao Y, Duarte J, Tang H, Ayyampalayam S, Rounsley S, Kudrna D, Paterson AH, Pires JC, Chanderbali A, Soltis DE, Chamala S, Barbazuk B, Soltis PS, Albert VA, Ma H, Mandoli D, Banks J, Carlson JE, Tomkins J, dePamphilis CW, Wing RA, Leebens-Mack J. A physical map for the Amborella trichopoda genome sheds light on the evolution of angiosperm genome structure. Genome Biol 2011;12:R48. [PMID: 21619600 PMCID: PMC3219971 DOI: 10.1186/gb-2011-12-5-r48] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Received: 11/21/2010] [Revised: 05/19/2011] [Accepted: 05/27/2011] [Indexed: 01/19/2023] Open

Palti Y, Genet C, Luo MC, Charlet A, Gao G, Hu Y, Castaño-Sánchez C, Tabet-Canale K, Krieg F, Yao J, Vallejo RL, Rexroad CE. A first generation integrated map of the rainbow trout genome. BMC Genomics 2011;12:180. [PMID: 21473775 PMCID: PMC3079668 DOI: 10.1186/1471-2164-12-180] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.6] [Received: 12/22/2010] [Accepted: 04/07/2011] [Indexed: 01/13/2023] Open

Abstract

Background

Rainbow trout (Oncorhynchus mykiss) are the most widely cultivated cold freshwater fish in the world and an important model species for many research areas. Coupling great interest in this species as a research model with the need for genetic improvement of aquaculture production efficiency traits justifies the continued development of genomics research resources. Many quantitative trait loci (QTL) have been identified for production and life-history traits in rainbow trout. An integrated physical and genetic map is needed to facilitate fine mapping of QTL and the selection of positional candidate genes for incorporation in marker-assisted selection (MAS) programs for improving rainbow trout aquaculture production.

Results

The first generation integrated map of the rainbow trout genome is composed of 238 BAC contigs anchored to chromosomes of the genetic map. It covers more than 10% of the genome across segments from all 29 chromosomes. Anchoring of 203 contigs to chromosomes of the National Center for Cool and Cold Water Aquaculture (NCCCWA) genetic map was achieved through mapping of 288 genetic markers derived from BAC end sequences (BES), screening of the BAC library with previously mapped markers and matching of SNPs with BES reads. In addition, 35 contigs were anchored to linkage groups of the INRA (French National Institute of Agricultural Research) genetic map through markers that were not informative for linkage analysis in the NCCCWA mapping panel. The ratio of physical to genetic linkage distances varied substantially among chromosomes and BAC contigs with an average of 3,033 Kb/cM.

Conclusions

The integrated map described here provides a framework for a robust composite genome map for rainbow trout. This resource is needed for genomic analyses in this research model and economically important species and will facilitate comparative genome mapping with other salmonids and with model fish species. This resource will also facilitate efforts to assemble a whole-genome reference sequence for rainbow trout.

van Oeveren J, de Ruiter M, Jesse T, van der Poel H, Tang J, Yalcin F, Janssen A, Volpin H, Stormo KE, Bogden R, van Eijk MJT, Prins M. Sequence-based physical mapping of complex genomes by whole genome profiling. Genome Res 2011;21:618-25. [PMID: 21324881 DOI: 10.1101/gr.112094.110] [Citation(s) in RCA: 72] [Impact Index Per Article: 5.5] [Indexed: 12/14/2022]

Frenkel Z, Paux E, Mester D, Feuillet C, Korol A. LTC: a novel algorithm to improve the efficiency of contig assembly for physical mapping in complex genomes. BMC Bioinformatics 2010;11:584. [PMID: 21118513 PMCID: PMC3098104 DOI: 10.1186/1471-2105-11-584] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Received: 09/07/2010] [Accepted: 11/30/2010] [Indexed: 11/25/2022] Open

Abstract

Background

Physical maps are the substrate of genome sequencing and map-based cloning and their construction relies on the accurate assembly of BAC clones into large contigs that are then anchored to genetic maps with molecular markers. High Information Content Fingerprinting has become the method of choice for large and repetitive genomes such as those of maize, barley, and wheat. However, the high level of repeated DNA present in these genomes requires the application of very stringent criteria to ensure a reliable assembly with the FingerPrinted Contig (FPC) software, which often results in short contig lengths (of 3-5 clones before merging) as well as an unreliable assembly in some difficult regions. Difficulties can originate from a non-linear topological structure of clone overlaps, low power of clone ordering algorithms, and the absence of tools to identify sources of gaps in Minimal Tiling Paths (MTPs).

Results

To address these problems, we propose a novel approach that: (i) reduces the rate of false connections and Q-clones by using a new cutoff calculation method; (ii) obtains reliable clusters robust to the exclusion of single clone or clone overlap; (iii) explores the topological contig structure by considering contigs as networks of clones connected by significant overlaps; (iv) performs iterative clone clustering combined with ordering and order verification using re-sampling methods; and (v) uses global optimization methods for clone ordering and Band Map construction. The elements of this new analytical framework called Linear Topological Contig (LTC) were applied on datasets used previously for the construction of the physical map of wheat chromosome 3B with FPC. The performance of LTC vs. FPC was compared also on the simulated BAC libraries based on the known genome sequences for chromosome 1 of rice and chromosome 1 of maize.

Conclusions

The results show that compared to other methods, LTC enables the construction of highly reliable and longer contigs (5-12 clones before merging), the detection of "weak" connections in contigs and their "repair", and the elongation of contigs obtained by other assembly methods.

A first generation BAC-based physical map of the Asian seabass (Lates calcarifer). PLoS One 2010;5:e11974. [PMID: 20700486 PMCID: PMC2916840 DOI: 10.1371/journal.pone.0011974] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Received: 04/20/2010] [Accepted: 07/12/2010] [Indexed: 11/19/2022] Open

Scalabrin S, Troggio M, Moroldo M, Pindo M, Felice N, Coppola G, Prete G, Malacarne G, Marconi R, Faes G, Jurman I, Grando S, Jesse T, Segala C, Valle G, Policriti A, Fontana P, Morgante M, Velasco R. Physical mapping in highly heterozygous genomes: a physical contig map of the Pinot Noir grapevine cultivar. BMC Genomics 2010;11:204. [PMID: 20346114 PMCID: PMC2865496 DOI: 10.1186/1471-2164-11-204] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Received: 09/15/2008] [Accepted: 03/26/2010] [Indexed: 11/10/2022] Open

Luo MC, Ma Y, You FM, Anderson OD, Kopecký D, Simková H, Safár J, Dolezel J, Gill B, McGuire PE, Dvorak J. Feasibility of physical map construction from fingerprinted bacterial artificial chromosome libraries of polyploid plant species. BMC Genomics 2010;11:122. [PMID: 20170511 PMCID: PMC2836288 DOI: 10.1186/1471-2164-11-122] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Received: 11/10/2009] [Accepted: 02/19/2010] [Indexed: 01/30/2023] Open

Abstract

Background

The presence of closely related genomes in polyploid species makes the assembly of total genomic sequence from shotgun sequence reads produced by the current sequencing platforms exceedingly difficult, if not impossible. Genomes of polyploid species could be sequenced following the ordered-clone sequencing approach employing contigs of bacterial artificial chromosome (BAC) clones and BAC-based physical maps. Although BAC contigs can currently be constructed for virtually any diploid organism with the SNaPshot high-information-content-fingerprinting (HICF) technology, it is currently unknown if this is also true for polyploid species. It is possible that BAC clones from orthologous regions of homoeologous chromosomes would share numerous restriction fragments and be therefore included into common contigs. Because of this and other concerns, physical mapping utilizing the SNaPshot HICF of BAC libraries of polyploid species has not been pursued and the possibility of doing so has not been assessed. The sole exception has been in common wheat, an allohexaploid in which it is possible to construct single-chromosome or single-chromosome-arm BAC libraries from DNA of flow-sorted chromosomes and bypass the obstacles created by polyploidy.

Results

The potential of the SNaPshot HICF technology for physical mapping of polyploid plants utilizing global BAC libraries was evaluated by assembling contigs of fingerprinted clones in an in silico merged BAC library composed of single-chromosome libraries of two wheat homoeologous chromosome arms, 3AS and 3DS, and complete chromosome 3B. Because the chromosome arm origin of each clone was known, it was possible to estimate the fidelity of contig assembly. On average 97.78% or more clones, depending on the library, were from a single chromosome arm. A large portion of the remaining clones was shown to be library contamination from other chromosomes, a feature that is unavoidable during the construction of single-chromosome BAC libraries.

Conclusions

The negligibly low level of incorporation of clones from homoeologous chromosome arms into a contig during contig assembly suggested that it is feasible to construct contigs and physical maps using global BAC libraries of wheat and almost certainly also of other plant polyploid species with genome sizes comparable to that of wheat. Because of the high purity of the resulting assembled contigs, they can be directly used for genome sequencing. It is currently unknown but possible that equally good BAC contigs can be also constructed for polyploid species containing smaller, more gene-rich genomes.

Zhou S, Wei F, Nguyen J, Bechner M, Potamousis K, Goldstein S, Pape L, Mehan MR, Churas C, Pasternak S, Forrest DK, Wise R, Ware D, Wing RA, Waterman MS, Livny M, Schwartz DC. A single molecule scaffold for the maize genome. PLoS Genet 2009;5:e1000711. [PMID: 19936062 PMCID: PMC2774507 DOI: 10.1371/journal.pgen.1000711] [Citation(s) in RCA: 115] [Impact Index Per Article: 7.7] [Received: 07/01/2009] [Accepted: 10/05/2009] [Indexed: 11/18/2022] Open

Abstract

About 85% of the maize genome consists of highly repetitive sequences that are interspersed by low-copy, gene-coding sequences. The maize community has dealt with this genomic complexity by the construction of an integrated genetic and physical map (iMap), but this resource alone was not sufficient for ensuring the quality of the current sequence build. For this purpose, we constructed a genome-wide, high-resolution optical map of the maize inbred line B73 genome containing >91,000 restriction sites (averaging 1 site/∼23 kb) accrued from mapping genomic DNA molecules. Our optical map comprises 66 contigs, averaging 31.88 Mb in size and spanning 91.5% (2,103.93 Mb/∼2,300 Mb) of the maize genome. A new algorithm was created that considered both optical map and unfinished BAC sequence data for placing 60/66 (2,032.42 Mb) optical map contigs onto the maize iMap. The alignment of optical maps against numerous data sources yielded comprehensive results that proved revealing and productive. For example, gaps were uncovered and characterized within the iMap, the FPC (fingerprinted contigs) map, and the chromosome-wide pseudomolecules. Such alignments also suggested amended placements of FPC contigs on the maize genetic map and proactively guided the assembly of chromosome-wide pseudomolecules, especially within complex genomic regions. Lastly, we think that the full integration of B73 optical maps with the maize iMap would greatly facilitate maize sequence finishing efforts that would make it a valuable reference for comparative studies among cereals, or other maize inbred lines and cultivars.

The maize genome contains abundant repeats interspersed by low-copy, gene-coding sequences that make it a challenge to sequence; consequently, current BAC sequence assemblies average 11 contigs per clone. The iMap deals with such complexity by the judicious integration of IBM genetic and B73 physical maps, but the B73 genome structure could differ from the IBM population because of genetic recombination and subsequent rearrangements. Accordingly, we report a genome-wide, high-resolution optical map of maize B73 genome that was constructed from the direct analysis of genomic DNA molecules without using genetic markers. The integration of optical and iMap resources with comparisons to FPC maps enabled a uniquely comprehensive and scalable assessment of a given BAC's sequence assembly, its placement within a FPC contig, and the location of this FPC contig within a chromosome-wide pseudomolecule. As such, the overall utility of the maize optical map for the validation of sequence assemblies has been significant and demonstrates the inherent advantages of single molecule platforms. Construction of the maize optical map represents the first physical map of a eukaryotic genome larger than 400 Mb that was created de novo from individual genomic DNA molecules.
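
The coverage figures quoted in the abstract above can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the abstract; the function names are hypothetical, and the genome size of 2,300 Mb is itself given there as approximate.

```python
def fraction_covered(spanned_mb, genome_mb):
    """Fraction of the genome spanned by the map."""
    return spanned_mb / genome_mb


def total_from_mean(n_contigs, mean_contig_mb):
    """Total span implied by a contig count and a (rounded) mean contig size."""
    return n_contigs * mean_contig_mb


# Values quoted in the abstract.
spanned_mb = 2103.93      # total size of the optical map contigs
genome_mb = 2300.0        # approximate maize genome size
n_contigs = 66
mean_contig_mb = 31.88
placed_mb = 2032.42       # size of the 60 contigs placed on the iMap

print(f"genome coverage: {fraction_covered(spanned_mb, genome_mb):.1%}")  # 91.5%
print(f"span implied by the mean: {total_from_mean(n_contigs, mean_contig_mb):.2f} Mb")
print(f"share of optical map placed on iMap: {fraction_covered(placed_mb, spanned_mb):.1%}")
```

The span implied by the rounded mean differs from the reported total by a fraction of a megabase, which is consistent with rounding of the average contig size.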

Affiliation(s)

Shiguo Zhou Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
Fusheng Wei Department of Plant Sciences, Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
John Nguyen Departments of Mathematics, Biology, and Computer Science, University of Southern California, Los Angeles, California, United States of America
Mike Bechner Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
Konstantinos Potamousis Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
Steve Goldstein Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
Louise Pape Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
Michael R. Mehan Departments of Mathematics, Biology, and Computer Science, University of Southern California, Los Angeles, California, United States of America
Chris Churas Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
Shiran Pasternak Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
Dan K. Forrest Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
Roger Wise Corn Insects and Crop Genetics Research, United States Department of Agriculture–Agricultural Research Service and Department of Plant Pathology, Iowa State University, Ames, Iowa, United States of America
Doreen Ware Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America; Plant, Soil, and Nutrition Research, United States Department of Agriculture–Agricultural Research Service, Ithaca, New York, United States of America
Rod A. Wing Department of Plant Sciences, Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America
Michael S. Waterman Departments of Mathematics, Biology, and Computer Science, University of Southern California, Los Angeles, California, United States of America
Miron Livny Computer Sciences Department, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
David C. Schwartz Laboratory for Molecular and Computational Genomics, Department of Chemistry, Laboratory of Genetics, UW Biotechnology Center, University of Wisconsin–Madison, Madison, Wisconsin, United States of America

The physical and genetic framework of the maize B73 genome. PLoS Genet 2009;5:e1000715. [PMID: 19936061 PMCID: PMC2774505 DOI: 10.1371/journal.pgen.1000715] [Citation(s) in RCA: 84] [Impact Index Per Article: 5.6] [Received: 07/14/2009] [Accepted: 10/12/2009] [Indexed: 11/19/2022] Open

Abstract

Maize is a major cereal crop and an important model system for basic biological research. Knowledge gained from maize research can also be used to genetically improve its grass relatives such as sorghum, wheat, and rice. The primary objective of the Maize Genome Sequencing Consortium (MGSC) was to generate a reference genome sequence that was integrated with both the physical and genetic maps. Using a previously published integrated genetic and physical map, combined with in-coming maize genomic sequence, new sequence-based genetic markers, and an optical map, we dynamically picked a minimum tiling path (MTP) of 16,910 bacterial artificial chromosome (BAC) and fosmid clones that were used by the MGSC to sequence the maize genome. The final MTP resulted in a significantly improved physical map that reduced the number of contigs from 721 to 435, incorporated a total of 8,315 mapped markers, and ordered and oriented the majority of FPC contigs. The new integrated physical and genetic map covered 2,120 Mb (93%) of the 2,300-Mb genome, of which 405 contigs were anchored to the genetic map, totaling 2,103.4 Mb (99.2% of the 2,120 Mb physical map). More importantly, 336 contigs, comprising 94.0% of the physical map (∼1,993 Mb), were ordered and oriented. Finally we used all available physical, sequence, genetic, and optical data to generate a golden path (AGP) of chromosome-based pseudomolecules, herein referred to as the B73 Reference Genome Sequence version 1 (B73 RefGen_v1).

Maize has been a cultural icon and staple food crop of Americans since the discovery of the new world in 1492. Contemporary society is now faced with growing demands for food and fuel in the face of global climate change and the potential for increased disease pressure. To provide a comprehensive foundation to systematically understand maize biology with the goal of breeding higher yielding, disease-resistant, and drought-tolerant cultivars, our consortium sequenced the B73 genome of maize. In this study, we used a comprehensive physical and genetic framework map to develop a minimum tiling path (MTP) of over 16,000 BAC clones across the genome. The MTP was generated dynamically and integrated numerous data types, such as in-coming genome sequence, over 8,000 sequence-based genetic markers, and the maize optical map. This allowed us to genetically anchor, order, and orient the majority of the maize physical map and genome sequence to the genetic map. Post-genome sequencing, we constructed a golden path (AGP) of sequence-based pseudomolecules representing the ten chromosomes of the maize B73 genome (B73 RefGen_v1). This unprecedented integration of genetic, physical, and genomic sequence into one framework will greatly facilitate all aspects of plant biological research.

Sequencing, mapping, and analysis of 27,455 maize full-length cDNAs. PLoS Genet 2009;5:e1000740. [PMID: 19936069 PMCID: PMC2774520 DOI: 10.1371/journal.pgen.1000740] [Citation(s) in RCA: 132] [Impact Index Per Article: 8.8] [Received: 07/03/2009] [Accepted: 10/24/2009] [Indexed: 11/29/2022] Open

Abstract

Full-length cDNA (FLcDNA) sequencing establishes the precise primary structure of individual gene transcripts. From two libraries representing 27 B73 tissues and abiotic stress treatments, 27,455 high-quality FLcDNAs were sequenced. The average transcript length was 1.44 kb including 218 bases and 321 bases of 5′ and 3′ UTR, respectively, with 8.6% of the FLcDNAs encoding predicted proteins of fewer than 100 amino acids. Approximately 94% of the FLcDNAs were stringently mapped to the maize genome. Although nearly two-thirds of this genome is composed of transposable elements (TEs), only 5.6% of the FLcDNAs contained TE sequences in coding or UTR regions. Approximately 7.2% of the FLcDNAs are putative transcription factors, suggesting that rare transcripts are well-enriched in our FLcDNA set. Protein similarity searching identified 1,737 maize transcripts not present in rice, sorghum, Arabidopsis, or poplar annotated genes. A strict FLcDNA assembly generated 24,467 non-redundant sequences, of which 88% have non-maize protein matches. The FLcDNAs were also assembled with 41,759 FLcDNAs in GenBank from other projects, where semi-strict parameters were used to identify 13,368 potentially unique non-redundant sequences from this project. The libraries, ESTs, and FLcDNA sequences produced from this project are publicly available. The annotated EST and FLcDNA assemblies are available through the maize FLcDNA web resource (www.maizecdna.org).

To complement the completion of sequencing the maize B73 genome, we sequenced 27,455 full-length cDNAs (FLcDNA) from two maize B73 libraries representing the gene transcripts from most tissues and common abiotic stress conditions. The FLcDNAs are beneficial in determining the exon/intron structure of genes by aligning them to the sequenced genome; 94% of our FLcDNAs aligned to the maize genome. The 27,455 FLcDNAs were compared to gene sequences for rice, sorghum, Arabidopsis, and poplar; 22,874 were found in all four sets, and 1,737 were unique to maize. Two-thirds of the maize genome is composed of a type of repetitive sequence called “transposable elements”; only 5.6% of the FLcDNA sequence contained any segment homologous to these repeats. In addition to our set, there are three other sets of maize FLcDNAs for a total of 69,306 gene transcripts, where many of them are from different maize lines (i.e. FLcDNAs often have only slight differences reflecting divergence). We assembled these together using parameters that would allow most alleles and recently diverged gene transcripts to align together, resulting in 46,739 unique gene transcripts.

Gu YQ, Ma Y, Huo N, Vogel JP, You FM, Lazo GR, Nelson WM, Soderlund C, Dvorak J, Anderson OD, Luo MC. A BAC-based physical map of Brachypodium distachyon and its comparative analysis with rice and wheat. BMC Genomics 2009;10:496. [PMID: 19860896 PMCID: PMC2774330 DOI: 10.1186/1471-2164-10-496] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.6] [Received: 04/28/2009] [Accepted: 10/27/2009] [Indexed: 11/13/2022] Open

Abstract

Background

Brachypodium distachyon (Brachypodium) has been recognized as a new model species for comparative and functional genomics of cereal and bioenergy crops because it possesses many biological attributes desirable in a model, such as a small genome size, short stature, self-pollinating habit, and short generation cycle. To maximize the utility of Brachypodium as a model for basic and applied research it is necessary to develop genomic resources for it. A BAC-based physical map is one of them. A physical map will facilitate analysis of genome structure, comparative genomics, and assembly of the entire genome sequence.

Results

A total of 67,151 Brachypodium BAC clones were fingerprinted with the SNaPshot HICF fingerprinting method and a genome-wide physical map of the Brachypodium genome was constructed. The map consisted of 671 contigs and 2,161 clones remained as singletons. The contigs and singletons spanned 414 Mb. A total of 13,970 gene-related sequences were detected in the BAC end sequences (BES). These gene tags aligned 345 contigs with 336 Mb of rice genome sequence, showing that Brachypodium and rice genomes are generally highly colinear. Divergent regions were mainly in the rice centromeric regions. A dot-plot of Brachypodium contigs against the rice genome sequences revealed remnants of the whole-genome duplication caused by paleotetraploidy, which were previously found in rice and sorghum. Brachypodium contigs were anchored to the wheat deletion bin maps with the BES gene-tags, opening the door to Brachypodium-Triticeae comparative genomics.

Conclusion

The construction of the Brachypodium physical map, and its comparison with the rice genome sequence demonstrated the utility of the SNaPshot-HICF method in the construction of BAC-based physical maps. The map represents an important genomic resource for the completion of Brachypodium genome sequence and grass comparative genomics. A draft of the physical map and its comparisons with rice and wheat are available at .

Palti Y, Luo MC, Hu Y, Genet C, You FM, Vallejo RL, Thorgaard GH, Wheeler PA, Rexroad CE. A first generation BAC-based physical map of the rainbow trout genome. BMC Genomics 2009;10:462. [PMID: 19814815 PMCID: PMC2763887 DOI: 10.1186/1471-2164-10-462] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.3] [Received: 07/17/2009] [Accepted: 10/08/2009] [Indexed: 01/09/2023] Open

Abstract

Background

Rainbow trout (Oncorhynchus mykiss) are the most-widely cultivated cold freshwater fish in the world and an important model species for many research areas. Coupling great interest in this species as a research model with the need for genetic improvement of aquaculture production efficiency traits justifies the continued development of genomics research resources. Many quantitative trait loci (QTL) have been identified for production and life-history traits in rainbow trout. A bacterial artificial chromosome (BAC) physical map is needed to facilitate fine mapping of QTL and the selection of positional candidate genes for incorporation in marker-assisted selection (MAS) for improving rainbow trout aquaculture production. This resource will also facilitate efforts to obtain and assemble a whole-genome reference sequence for this species.

Results

The physical map was constructed from DNA fingerprinting of 192,096 BAC clones using the 4-color high-information content fingerprinting (HICF) method. The clones were assembled into physical map contigs using the FingerPrinted Contigs (FPC) program. The map is composed of 4,173 contigs and 9,379 singletons. The total number of unique fingerprinting fragments (consensus bands) in contigs is 1,185,157, which corresponds to an estimated physical length of 2.0 Gb. The map assembly was validated by 1) comparison with probe hybridization results and agarose gel fingerprinting contigs; and 2) anchoring large contigs to the microsatellite-based genetic linkage map.

Conclusion

The production and validation of the first BAC physical map of the rainbow trout genome is described in this paper. We are currently integrating this map with the NCCCWA genetic map using more than 200 microsatellites isolated from BAC end sequences and by identifying BACs that harbor more than 300 previously mapped markers. The availability of an integrated physical and genetic map will enable detailed comparative genome analyses, fine mapping of QTL, positional cloning, selection of positional candidate genes for economically important traits and the incorporation of MAS into rainbow trout breeding programs.
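
The Results above convert a count of consensus bands into an estimated physical length. The sketch below shows the arithmetic implied by the two quoted figures. The per-band length it derives is only what those rounded numbers imply, not a value stated in the abstract, and the function names are hypothetical.

```python
def mean_bp_per_band(total_length_bp, n_consensus_bands):
    """Average physical length represented by one consensus band."""
    return total_length_bp / n_consensus_bands


def estimated_length_bp(n_consensus_bands, bp_per_band):
    """Physical length estimate: band count times average length per band."""
    return n_consensus_bands * bp_per_band


# Figures quoted in the abstract.
n_bands = 1_185_157
reported_length_bp = 2.0e9  # "2.0 Gb", a rounded value

# Implied average length per band, assuming the reported length is exact.
implied = mean_bp_per_band(reported_length_bp, n_bands)
print(f"implied average length per consensus band: {implied:.0f} bp")

# Going the other way reproduces the reported map length.
print(f"estimated map length: {estimated_length_bp(n_bands, implied) / 1e9:.1f} Gb")
```

Because the 2.0 Gb figure is rounded, the implied per-band length should be read as approximate.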

Bozdag S, Close TJ, Lonardi S. A compartmentalized approach to the assembly of physical maps. BMC Bioinformatics 2009;10:217. [PMID: 19604400 PMCID: PMC2717093 DOI: 10.1186/1471-2105-10-217] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 12/30/2008] [Accepted: 07/15/2009] [Indexed: 12/30/2022] Open

Aggarwal R, Benatti TR, Gill N, Zhao C, Chen MS, Fellers JP, Schemerhorn BJ, Stuart JJ. A BAC-based physical map of the Hessian fly genome anchored to polytene chromosomes. BMC Genomics 2009;10:293. [PMID: 19573234 PMCID: PMC2709663 DOI: 10.1186/1471-2164-10-293] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Received: 01/13/2009] [Accepted: 07/02/2009] [Indexed: 11/27/2022] Open

Abstract

Background

The Hessian fly (Mayetiola destructor) is an important insect pest of wheat. It has tractable genetics, polytene chromosomes, and a small genome (158 Mb). Investigation of the Hessian fly presents excellent opportunities to study plant-insect interactions and the molecular mechanisms underlying genome imprinting and chromosome elimination. A physical map is needed to improve the ability to perform both positional cloning and comparative genomic analyses with the fully sequenced genomes of other dipteran species.

Results

An FPC-based genome-wide physical map of the Hessian fly was constructed and anchored to the insect's polytene chromosomes. Bacterial artificial chromosome (BAC) clones corresponding to 12-fold coverage of the Hessian fly genome were fingerprinted, using high information content fingerprinting (HICF) methodology, and end-sequenced. Fluorescence in situ hybridization (FISH) co-localized two BAC clones from each of the 196 longest contigs on the polytene chromosomes. An additional 70 contigs were positioned using a single FISH probe. The 266 FISH-mapped contigs were evenly distributed and covered 60% of the genome (95,668 kb). The ends of the fingerprinted BACs were then sequenced to develop the capacity to create sequence-tagged site (STS) markers on the BACs in the map. Only 3.64% of the BAC-end sequence (BES) was composed of transposable elements, helicases, ribosomal repeats, simple sequence repeats, and sequences of low complexity. A relatively large fraction (14.27%) of the BES was composed of multi-copy gene sequences. Nearly 1% of the end sequence was composed of simple sequence repeats (SSRs).

Conclusion

This physical map provides the foundation for high-resolution genetic mapping, map-based cloning, and assembly of complete genome sequencing data. The results indicate that restriction fragment length heterogeneity in BAC libraries used to construct physical maps lowers the length and the depth of the contigs, but is not an absolute barrier to the successful application of the technology. This map will serve as a genomic resource for accelerating gene discovery, genome sequencing, and the assembly of BAC sequences. The Hessian fly BAC-clone assembly, and the names and positions of the BAC clones used in the FISH experiments are publicly available at .

Scalabrin S, Morgante M, Policriti A. Automated FingerPrint Background removal: FPB. BMC Bioinformatics 2009;10:127. [PMID: 19405935 PMCID: PMC2689866 DOI: 10.1186/1471-2105-10-127] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 09/29/2008] [Accepted: 04/30/2009] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

The construction of a whole-genome physical map has been an essential component of numerous genome projects initiated since the inception of the Human Genome Project. Its usefulness has been proved for whole-genome shotgun projects as a post-assembly validation, and recently it has also been used in the assembly step to constrain BAC positions. Fingerprinting is usually the method of choice for construction of physical maps. A clone fingerprint is composed of true peaks representing real fragments and background peaks, mainly composed of E. coli genomic DNA, partial digestions, star activity by-products, and machine background. High-throughput fingerprinting leads to the production of thousands of BAC clone fingerprints per day. That is why background peak removal has become an important issue and needs to be automated, especially in capillary electrophoresis-based fingerprints.

RESULTS

At the moment, the only tools available for such a task are GenoProfiler and its descendant FPMiner. The large variation in fingerprint quality that is usually present in large fingerprinting projects is a major obstacle to the correct removal of background peaks. The methods adopted so far address it only partially, and all require a long manual optimization of parameters. Thus, we implemented a new data-independent tool, FPB (FingerPrint Background removal), suitable for large-scale projects as well as for mapping of a few clones.

CONCLUSION

FPB is freely available at http://www.appliedgenomics.org/tools.php. FPB was used to remove the background from all fingerprints of three grapevine physical map projects. The first project consists of about 50,000 fingerprints, the second one consists of about 70,000 fingerprints, and the third one consists of about 45,000 fingerprints. In all cases a successful assembly was built.

Nelson W, Soderlund C. Integrating sequence with FPC fingerprint maps. Nucleic Acids Res 2009;37:e36. [PMID: 19181701 PMCID: PMC2655663 DOI: 10.1093/nar/gkp034] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.1] [Indexed: 02/05/2023] Open

Messing J. Synergy of two reference genomes for the grass family. Plant Physiol 2009;149:117-24. [PMID: 19126702 PMCID: PMC2613724 DOI: 10.1104/pp.108.128520] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 08/28/2008] [Accepted: 10/10/2008] [Indexed: 05/19/2023]

Nelson W, Luo M, Ma J, Estep M, Estill J, He R, Talag J, Sisneros N, Kudrna D, Kim H, Ammiraju JSS, Collura K, Bharti AK, Messing J, Wing RA, SanMiguel P, Bennetzen JL, Soderlund C. Methylation-sensitive linking libraries enhance gene-enriched sequencing of complex genomes and map DNA methylation domains. BMC Genomics 2008;9:621. [PMID: 19099592 PMCID: PMC2628917 DOI: 10.1186/1471-2164-9-621] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 07/03/2008] [Accepted: 12/19/2008] [Indexed: 11/30/2022] Open

Abstract

Background

Many plant genomes are resistant to whole-genome assembly due to an abundance of repetitive sequence, leading to the development of gene-rich sequencing techniques. Two such techniques are hypomethylated partial restriction (HMPR) and methylation spanning linker libraries (MSLL). These libraries differ from other gene-rich datasets in having larger insert sizes, and the MSLL clones are designed to provide reads localized to "epigenetic boundaries" where methylation begins or ends.

Results

A large-scale study in maize generated 40,299 HMPR sequences and 80,723 MSLL sequences, including MSLL clones exceeding 100 kb. The paired end reads of MSLL and HMPR clones were shown to be effective in linking existing gene-rich sequences into scaffolds. In addition, it was shown that the MSLL clones can be used for anchoring these scaffolds to a BAC-based physical map. The MSLL end reads effectively identified epigenetic boundaries, as indicated by their preferential alignment to regions upstream and downstream from annotated genes. The ability to precisely map long stretches of fully methylated DNA sequence is a unique outcome of MSLL analysis, and was also shown to provide evidence for errors in gene identification. MSLL clones were observed to be significantly more repeat-rich in their interiors than in their end reads, confirming the correlation between methylation and retroelement content. Both MSLL and HMPR reads were found to be substantially gene-enriched, with the SalI MSLL libraries being the most highly enriched (31% align to an EST contig), while the HMPR clones exhibited exceptional depletion of repetitive DNA (to ~11%). These two techniques were compared with other gene-enrichment methods, and shown to be complementary.

Conclusion

MSLL technology provides an unparalleled approach for mapping the epigenetic status of repetitive blocks and for identifying sequences mis-identified as genes. Although the types and natures of epigenetic boundaries are barely understood at this time, MSLL technology flags both approximate boundaries and methylated genes that deserve additional investigation. MSLL and HMPR sequences provide a valuable resource for maize genome annotation, and are a uniquely valuable complement to any plant genome sequencing project. In order to make these results fully accessible to the community, a web display was developed that shows the alignment of MSLL, HMPR, and other gene-rich sequences to the BACs; this display is continually updated with the latest ESTs and BAC sequences.

Holding DR, Larkins BA. Zein Storage Proteins. Molecular Genetic Approaches to Maize Improvement 2008. [DOI: 10.1007/978-3-540-68922-5_19] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 12/13/2022]

Kurtz S, Narechania A, Stein JC, Ware D. A new method to compute K-mer frequencies and its application to annotate large repetitive plant genomes. BMC Genomics 2008;9:517. [PMID: 18976482 PMCID: PMC2613927 DOI: 10.1186/1471-2164-9-517] [Citation(s) in RCA: 168] [Impact Index Per Article: 10.5] [Received: 07/21/2008] [Accepted: 10/31/2008] [Indexed: 12/02/2022] Open

Abstract

Background

The challenges of accurate gene prediction and enumeration are further aggravated in large genomes that contain highly repetitive transposable elements (TEs). Yet TEs play a substantial role in genome evolution and are themselves an important subject of study. Repeat annotation, based on counting occurrences of k-mers, has been previously used to distinguish TEs from low-copy genic regions; but currently available software solutions are impractical due to high memory requirements or specialization for specific user-tasks.

Results

Here we introduce the Tallymer software, a flexible and memory-efficient collection of programs for k-mer counting and indexing of large sequence sets. Unlike previous methods, Tallymer is based on enhanced suffix arrays. This gives much greater flexibility in the choice of the k-mer size. Tallymer can process large data sizes of several billion bases. We used it in a variety of applications to study the genomes of maize and other plant species. In particular, Tallymer was used to index a set of whole genome shotgun sequences from maize (B73) (total size 10⁹ bp). We analyzed k-mer frequencies for a wide range of k. At this low genome coverage (≈ 0.45×) highly repetitive 20-mers constituted 44% of the genome but represented only 1% of all possible k-mers. Similarly low complexity was seen in the repeat fractions of sorghum and rice. When applying our method to other maize data sets, High-C₀t derived sequences showed the greatest enrichment for low-copy sequences. Among annotated TEs, the most highly repetitive were of the Ty3/gypsy class of retrotransposons, followed by the Ty1/copia class, and DNA transposons. Among expressed sequence tags (ESTs), a notable fraction contained high-copy k-mers, suggesting that transposons are still active in maize. Retrotransposons in Mo17 and McC cultivars were readily detected using the B73 20-mer frequency index, indicating their conservation despite extensive rearrangement across cultivars. Among one hundred annotated bacterial artificial chromosomes (BACs), k-mer frequency could be used to detect transposon-encoded genes with 92% sensitivity, compared to 96% using alignment-based repeat masking, while both methods showed 92% specificity.

Conclusion

The Tallymer software was effective in a variety of applications to aid genome annotation in maize, despite limitations imposed by the relatively low coverage of sequence available. For more information on the software, see .
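
The idea of k-mer frequency analysis described in this abstract can be illustrated with a minimal sketch. It does not reproduce Tallymer: Tallymer is based on enhanced suffix arrays, whereas the code below swaps in a plain in-memory hash table (Python's `Counter`), which is only practical for small inputs. The function names, the toy sequence, and the repeat threshold are hypothetical.

```python
from collections import Counter


def count_kmers(seq, k):
    """Count every overlapping k-mer in seq using a hash table (not a suffix array)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def repetitive_fraction(seq, k, min_count):
    """Fraction of k-mer positions whose k-mer occurs at least min_count times."""
    counts = count_kmers(seq, k)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    repetitive = sum(c for c in counts.values() if c >= min_count)
    return repetitive / total


# Toy sequence (invented): a short tandem repeat followed by unique sequence.
seq = "ACGTACGTACGTTTGA"
counts = count_kmers(seq, 4)
n_distinct = len(counts)
n_distinct_repetitive = sum(1 for c in counts.values() if c >= 2)

print(f"distinct 4-mers: {n_distinct}, of which repeated: {n_distinct_repetitive}")
print(f"fraction of positions covered by repeated 4-mers: "
      f"{repetitive_fraction(seq, 4, 2):.3f}")
```

In this toy case 4 of the 8 distinct 4-mers are repeated, yet they account for 9 of the 13 k-mer positions. This is the same kind of contrast the abstract reports between the share of the genome covered by highly repetitive 20-mers and their share of distinct k-mers.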

Moroldo M, Paillard S, Marconi R, Fabrice L, Canaguier A, Cruaud C, De Berardinis V, Guichard C, Brunaud V, Le Clainche I, Scalabrin S, Testolin R, Di Gaspero G, Morgante M, Adam-Blondon AF. A physical map of the heterozygous grapevine 'Cabernet Sauvignon' allows mapping candidate genes for disease resistance. BMC Plant Biol 2008;8:66. [PMID: 18554400 PMCID: PMC2442077 DOI: 10.1186/1471-2229-8-66] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.8] [Received: 02/26/2008] [Accepted: 06/13/2008] [Indexed: 05/18/2023]

Abstract

BACKGROUND

Whole-genome physical maps facilitate genome sequencing, sequence assembly, mapping of candidate genes, and the design of targeted genetic markers. An automated protocol was used to construct a Vitis vinifera 'Cabernet Sauvignon' physical map. The quality of the resulting map was evaluated with regard to the effect of high heterozygosity on the accuracy of contig assembly. Its usefulness for the genome-wide mapping of genes for disease resistance, which is an important trait for grapevine, was then assessed.

RESULTS

The physical map included 29,727 BAC clones assembled into 1,770 contigs, spanning 715,684 kbp and corresponding to 1.5-fold the genome size. Map inflation was due to high heterozygosity, which caused either the separation of allelic BACs into two different contigs or local mis-assembly in contigs containing BACs from the two haplotypes. Genetic markers anchored 395 contigs, or 255,476 kbp, to chromosomes. The fully automated assembly and anchorage procedures were validated by BAC-by-BAC BLAST of the end sequences against the grape genome sequence, which revealed that 7.3% of contigs were chimeric. The distribution across the physical map of candidate genes for non-host and host resistance, and for defence signalling pathways, was then studied. NBS-LRR and RLK genes for host resistance were found in 424 contigs; 133 of them (32%) were assigned to chromosomes, on which they are mostly organised in clusters. Non-host and defence signalling genes were found in 99 contigs dispersed across the genome without a discernible pattern.

CONCLUSION

Despite some limitations that interfere with the correct assembly of heterozygous clones into contigs, the 'Cabernet Sauvignon' physical map is a useful and reliable intermediary step between a genetic map and the genome sequence. This tool was successfully exploited for a quick mapping of complex families of genes, and it strengthened previous clues of co-localisation of major NBS-LRR clusters and disease resistance loci in grapevine.

Affiliation(s)

Marco Moroldo UMR de Génomique Végétale, INRA-CNRS-UEVE, 2, Rue Gaston Crémieux, CP5708, 91057 Evry Cedex, France
Sophie Paillard UMR de Génomique Végétale, INRA-CNRS-UEVE, 2, Rue Gaston Crémieux, CP5708, 91057 Evry Cedex, France UMR118, INRA-Agrocampus, University of Rennes, Amélioration des Plantes et Biotechnologies Végétales, F-35650 Le Rheu, France
Raffaella Marconi Dipartimento di Scienze Agrarie e Ambientali, University of Udine, via delle Scienze 208, 33100 Udine, Italy
Legeai Fabrice Unité de Recherche Génomique-Info, URGI, Tour Evry 2, 523, Place des Terrasses de l'Agora, 91034 Evry Cedex, France
Aurelie Canaguier UMR de Génomique Végétale, INRA-CNRS-UEVE, 2, Rue Gaston Crémieux, CP5708, 91057 Evry Cedex, France
Corinne Cruaud Genoscope, 2, rue Gaston Crémieux, CP5706, 91057 Evry Cedex, France
Veronique De Berardinis Genoscope, 2, rue Gaston Crémieux, CP5706, 91057 Evry Cedex, France
Cecile Guichard UMR de Génomique Végétale, INRA-CNRS-UEVE, 2, Rue Gaston Crémieux, CP5708, 91057 Evry Cedex, France
Veronique Brunaud UMR de Génomique Végétale, INRA-CNRS-UEVE, 2, Rue Gaston Crémieux, CP5708, 91057 Evry Cedex, France
Isabelle Le Clainche UMR de Génomique Végétale, INRA-CNRS-UEVE, 2, Rue Gaston Crémieux, CP5708, 91057 Evry Cedex, France
Simone Scalabrin Dipartimento di Scienze Matematiche, University of Udine, via delle Scienze 208, 33100 Udine, Italy Istituto di Genomica Applicata, Parco Scientifico e Tecnologico Luigi Danieli, via Jacopo Linussio 51, 33100 Udine, Italy
Raffaele Testolin Dipartimento di Scienze Agrarie e Ambientali, University of Udine, via delle Scienze 208, 33100 Udine, Italy Istituto di Genomica Applicata, Parco Scientifico e Tecnologico Luigi Danieli, via Jacopo Linussio 51, 33100 Udine, Italy
Gabriele Di Gaspero Dipartimento di Scienze Agrarie e Ambientali, University of Udine, via delle Scienze 208, 33100 Udine, Italy Istituto di Genomica Applicata, Parco Scientifico e Tecnologico Luigi Danieli, via Jacopo Linussio 51, 33100 Udine, Italy
Michele Morgante Dipartimento di Scienze Agrarie e Ambientali, University of Udine, via delle Scienze 208, 33100 Udine, Italy Istituto di Genomica Applicata, Parco Scientifico e Tecnologico Luigi Danieli, via Jacopo Linussio 51, 33100 Udine, Italy
Anne-Francoise Adam-Blondon UMR de Génomique Végétale, INRA-CNRS-UEVE, 2, Rue Gaston Crémieux, CP5708, 91057 Evry Cedex, France

Mun JH, Kwon SJ, Yang TJ, Kim HS, Choi BS, Baek S, Kim JS, Jin M, Kim JA, Lim MH, Lee SI, Kim HI, Kim H, Lim YP, Park BS. The first generation of a BAC-based physical map of Brassica rapa. BMC Genomics 2008;9:280. [PMID: 18549474 PMCID: PMC2432078 DOI: 10.1186/1471-2164-9-280] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2007] [Accepted: 06/12/2008] [Indexed: 11/30/2022] Open

Wei F, Coe E, Nelson W, Bharti AK, Engler F, Butler E, Kim H, Goicoechea JL, Chen M, Lee S, Fuks G, Sanchez-Villeda H, Schroeder S, Fang Z, McMullen M, Davis G, Bowers JE, Paterson AH, Schaeffer M, Gardiner J, Cone K, Messing J, Soderlund C, Wing RA. Physical and genetic structure of the maize genome reflects its complex evolutionary history. PLoS Genet 2008;3:e123. [PMID: 17658954 PMCID: PMC1934398 DOI: 10.1371/journal.pgen.0030123] [Citation(s) in RCA: 228] [Impact Index Per Article: 14.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2007] [Accepted: 06/11/2007] [Indexed: 11/21/2022] Open

Abstract

Maize (Zea mays L.) is one of the most important cereal crops and a model for the study of genetics, evolution, and domestication. To better understand maize genome organization and to build a framework for genome sequencing, we constructed a sequence-ready fingerprinted contig-based physical map that covers 93.5% of the genome, of which 86.1% is aligned to the genetic map. The fingerprinted contig map contains 25,908 genic markers that enabled us to align nearly 73% of the anchored maize genome to the rice genome. The distribution pattern of expressed sequence tags correlates to that of recombination. In collinear regions, 1 kb in rice corresponds to an average of 3.2 kb in maize, yet maize has a 6-fold genome size expansion. This can be explained by the fact that most rice regions correspond to two regions in maize as a result of its recent polyploid origin. Inversions account for the majority of chromosome structural variations during subsequent maize diploidization. We also find clear evidence of ancient genome duplication predating the divergence of the progenitors of maize and rice. Reconstructing the paleoethnobotany of the maize genome indicates that the progenitors of modern maize contained ten chromosomes.

As a cash crop and a model biological system, maize is of great public interest. To facilitate maize molecular breeding and basic biological research, we built a high-resolution physical map using two different fingerprinting methods on the same set of bacterial artificial chromosome clones. The physical map was integrated with a high-density genetic map and serves as a framework for the maize genome-sequencing project. Comparative genomics showed that the euchromatic regions of rice and maize are highly conserved. We physically delimited these conserved regions and thereby detected many genome rearrangements. We extensively defined the duplication blocks within the maize genome. These blocks allowed us to reconstruct the chromosomes of the maize progenitor. We found that the maize genome has experienced two rounds of genome duplication, an ancient one before maize–rice divergence and a recent one after tetraploidization.

Affiliation(s)

Fusheng Wei Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
Ed Coe Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America Plant Genetics Research Unit, Agricultural Research Service, United States Department of Agriculture, Columbia, Missouri, United States of America
William Nelson Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America Arizona Genomics Computational Laboratory, University of Arizona, Tucson, Arizona, United States of America
Arvind K Bharti Plant Genome Initiative at Rutgers, Waksman Institute, Rutgers, The State University of New Jersey, Piscataway, New Jersey, United States of America
Fred Engler Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America Arizona Genomics Computational Laboratory, University of Arizona, Tucson, Arizona, United States of America
Ed Butler Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
HyeRan Kim Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
Jose Luis Goicoechea Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
Mingsheng Chen Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
Seunghee Lee Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America
Galina Fuks Plant Genome Initiative at Rutgers, Waksman Institute, Rutgers, The State University of New Jersey, Piscataway, New Jersey, United States of America
Hector Sanchez-Villeda Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
Steven Schroeder Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
Zhiwei Fang Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
Michael McMullen Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America Plant Genetics Research Unit, Agricultural Research Service, United States Department of Agriculture, Columbia, Missouri, United States of America
Georgia Davis Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
John E Bowers Plant Genome Mapping Laboratory, Departments of Crop and Soil Science, Plant Biology, and Genetics, University of Georgia, Athens, Georgia, United States of America
Andrew H Paterson Plant Genome Mapping Laboratory, Departments of Crop and Soil Science, Plant Biology, and Genetics, University of Georgia, Athens, Georgia, United States of America
Mary Schaeffer Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America Plant Genetics Research Unit, Agricultural Research Service, United States Department of Agriculture, Columbia, Missouri, United States of America
Jack Gardiner Division of Plant Sciences, University of Missouri, Columbia, Missouri, United States of America
Karen Cone Division of Biological Sciences, University of Missouri, Columbia, Missouri, United States of America
Joachim Messing Plant Genome Initiative at Rutgers, Waksman Institute, Rutgers, The State University of New Jersey, Piscataway, New Jersey, United States of America
Carol Soderlund Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America Arizona Genomics Computational Laboratory, University of Arizona, Tucson, Arizona, United States of America * To whom correspondence should be addressed. E-mail: (CS); (RAW)
Rod A Wing Arizona Genomics Institute, University of Arizona, Tucson, Arizona, United States of America Department of Plant Sciences, University of Arizona, Tucson, Arizona, United States of America BIO5 Institute, University of Arizona, Tucson, Arizona, United States of America * To whom correspondence should be addressed. E-mail: (CS); (RAW)

Mathewson CA, Schein JE, Marra MA. Large-scale BAC clone restriction digest fingerprinting. Curr Protoc Hum Genet 2008;Chapter 5:Unit 5.19. [PMID: 18428413 DOI: 10.1002/0471142905.hg0519s53] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]

Chanderbali AS, Albert VA, Ashworth VETM, Clegg MT, Litz RE, Soltis DE, Soltis PS. Persea americana (avocado): bringing ancient flowers to fruit in the genomics era. Bioessays 2008;30:386-96. [PMID: 18348249 DOI: 10.1002/bies.20721] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]

Choi JH, Kim S, Tang H, Andrews J, Gilbert DG, Colbourne JK. A machine-learning approach to combined evidence validation of genome assemblies. Bioinformatics 2008;24:744-50. [PMID: 18204064 DOI: 10.1093/bioinformatics/btm608] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/25/2023]

Abstract

MOTIVATION

While it is common to refer to 'the genome sequence' as if it were a single, complete and contiguous DNA string, it is in fact an assembly of millions of small, partially overlapping DNA fragments. Sophisticated computer algorithms (assemblers and scaffolders) merge these DNA fragments into contigs, and place these contigs into sequence scaffolds using the paired-end sequences derived from large-insert DNA libraries. Each step in this automated process is susceptible to producing errors; hence, the resulting draft assembly represents (in practice) only a likely assembly that requires further validation. Knowing which parts of the draft assembly are likely free of errors is critical if researchers are to draw reliable conclusions from the assembled sequence data.

RESULTS

We develop a machine-learning method to detect assembly errors in sequence assemblies. Several in silico measures for assembly validation have been proposed by various researchers. Using three benchmarking Drosophila draft genomes, we evaluate these techniques along with some new measures that we propose, including the good-minus-bad coverage (GMB), the good-to-bad-ratio (RGB), the average Z-score (AZ) and the average absolute Z-score (ASZ). Our results show that the GMB measure performs better than the others in both sensitivity and specificity for assembly error detection. Nevertheless, no single method performs well enough to reliably detect genomic regions that require further experimental verification. To utilize the advantages of all these measures, we develop a novel machine-learning approach that combines these individual measures to achieve a higher prediction accuracy (i.e. greater than 90%). Our combined evidence approach avoids the difficult and often ad hoc selection of the many parameters that the individual measures require, and significantly improves the overall precision on the benchmarking data sets.

Han Y, Gasic K, Korban SS. Multiple-copy cluster-type organization and evolution of genes encoding O-methyltransferases in the apple. Genetics 2007;176:2625-35. [PMID: 17717198 PMCID: PMC1950660 DOI: 10.1534/genetics.107.073650] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

A transgenomic cytogenetic sorghum (Sorghum propinquum) bacterial artificial chromosome fluorescence in situ hybridization map of maize (Zea mays L.) pachytene chromosome 9, evidence for regions of genome hyperexpansion. Genetics 2007;177:1509-26. [PMID: 17947405 DOI: 10.1534/genetics.107.080846] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022] Open

Li Y, Uhm T, Ren C, Wu C, Santos TS, Lee MK, Yan B, Santos F, Zhang A, Scheuring C, Sanchez A, Millena AC, Nguyen HT, Kou H, Liu D, Zhang HB. A plant-transformation-competent BIBAC/BAC-based map of rice for functional analysis and genetic engineering of its genomic sequence. Genome 2007;50:278-88. [PMID: 17502901 DOI: 10.1139/g07-006] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]

Troggio M, Malacarne G, Coppola G, Segala C, Cartwright DA, Pindo M, Stefanini M, Mank R, Moroldo M, Morgante M, Grando MS, Velasco R. A dense single-nucleotide polymorphism-based genetic linkage map of grapevine (Vitis vinifera L.) anchoring Pinot Noir bacterial artificial chromosome contigs. Genetics 2007;176:2637-50. [PMID: 17603124 PMCID: PMC1950661 DOI: 10.1534/genetics.106.067462] [Citation(s) in RCA: 111] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/30/2006] [Accepted: 06/14/2007] [Indexed: 11/18/2022] Open

Xu P, Wang S, Liu L, Thorsen J, Kucuktas H, Liu Z. A BAC-based physical map of the channel catfish genome. Genomics 2007;90:380-8. [PMID: 17582737 DOI: 10.1016/j.ygeno.2007.05.008] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2007] [Revised: 05/11/2007] [Accepted: 05/16/2007] [Indexed: 01/12/2023]

Kelleher CT, Chiu R, Shin H, Bosdet IE, Krzywinski MI, Fjell CD, Wilkin J, Yin T, DiFazio SP, Ali J, Asano JK, Chan S, Cloutier A, Girn N, Leach S, Lee D, Mathewson CA, Olson T, O'Connor K, Prabhu AL, Smailus DE, Stott JM, Tsai M, Wye NH, Yang GS, Zhuang J, Holt RA, Putnam NH, Vrebalov J, Giovannoni JJ, Grimwood J, Schmutz J, Rokhsar D, Jones SJM, Marra MA, Tuskan GA, Bohlmann J, Ellis BE, Ritland K, Douglas CJ, Schein JE. A physical map of the highly heterozygous Populus genome: integration with the genome sequence and genetic map and analysis of haplotype variation. Plant J 2007;50:1063-78. [PMID: 17488239 DOI: 10.1111/j.1365-313x.2007.03112.x] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/15/2023]

Kim H, San Miguel P, Nelson W, Collura K, Wissotski M, Walling JG, Kim JP, Jackson SA, Soderlund C, Wing RA. Comparative physical mapping between Oryza sativa (AA genome type) and O. punctata (BB genome type). Genetics 2007;176:379-90. [PMID: 17339227 PMCID: PMC1893071 DOI: 10.1534/genetics.106.068783] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/22/2006] [Accepted: 02/09/2007] [Indexed: 11/18/2022] Open

Wendl MC. Algebraic correction methods for computational assessment of clone overlaps in DNA fingerprint mapping. BMC Bioinformatics 2007;8:127. [PMID: 17442113 PMCID: PMC1868038 DOI: 10.1186/1471-2105-8-127] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/02/2007] [Accepted: 04/18/2007] [Indexed: 12/31/2022] Open