Reference Citation Analysis

For:	Zerbino DR, Birney E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res 2008;18:821-9. [PMID: 18349386 DOI: 10.1101/gr.074492.107] [Citation(s) in RCA: 7090] [Impact Index Per Article: 443.1] [Indexed: 02/06/2023]

Number

Cited by Other Article(s)

7001

Peng Y, Leung HCM, Yiu SM, Chin FYL. IDBA – A Practical Iterative de Bruijn Graph De Novo Assembler. Lecture Notes in Computer Science 2010. [DOI: 10.1007/978-3-642-12683-3_28] [Citation(s) in RCA: 159] [Impact Index Per Article: 11.4] [Indexed: 12/04/2022]

7002

Nagarajan N, Pop M. Sequencing and genome assembly using next-generation technologies. Methods Mol Biol 2010;673:1-17. [PMID: 20835789 DOI: 10.1007/978-1-60761-842-3_1] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Indexed: 12/13/2022]

7003

Ng PC, Kirkness EF. Whole genome sequencing. Methods Mol Biol 2010;628:215-26. [PMID: 20238084 DOI: 10.1007/978-1-60327-367-1_12] [Citation(s) in RCA: 102] [Impact Index Per Article: 7.3] [Indexed: 02/07/2023]

7004

Toward next-generation sequencing of mitochondrial genomes — Focus on parasitic worms of animals and biotechnological implications. Biotechnol Adv 2010;28:151-9. [DOI: 10.1016/j.biotechadv.2009.11.002] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.2] [Received: 08/11/2009] [Revised: 10/28/2009] [Accepted: 11/04/2009] [Indexed: 11/21/2022]

7005

Milos PM. Emergence of single-molecule sequencing and potential for molecular diagnostic applications. Expert Rev Mol Diagn 2009;9:659-66. [PMID: 19817551 DOI: 10.1586/erm.09.50] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Indexed: 12/13/2022]

7006

Zerbino DR, McEwen GK, Margulies EH, Birney E. Pebble and rock band: heuristic resolution of repeats and scaffolding in the velvet short-read de novo assembler. PLoS One 2009;4:e8407. [PMID: 20027311 PMCID: PMC2793427 DOI: 10.1371/journal.pone.0008407] [Citation(s) in RCA: 150] [Impact Index Per Article: 10.0] [Received: 09/04/2009] [Accepted: 10/21/2009] [Indexed: 11/22/2022]

7007

Genome diversity of Pseudomonas aeruginosa PAO1 laboratory strains. J Bacteriol 2009;192:1113-21. [PMID: 20023018 DOI: 10.1128/jb.01515-09] [Citation(s) in RCA: 189] [Impact Index Per Article: 12.6] [Indexed: 12/23/2022]

7008

Li R, Zhu H, Ruan J, Qian W, Fang X, Shi Z, Li Y, Li S, Shan G, Kristiansen K, Li S, Yang H, Wang J, Wang J. De novo assembly of human genomes with massively parallel short read sequencing. Genome Res 2009;20:265-72. [PMID: 20019144 DOI: 10.1101/gr.097261.109] [Citation(s) in RCA: 2099] [Impact Index Per Article: 139.9] [Indexed: 11/25/2022]

7009

Cock PJA, Fields CJ, Goto N, Heuer ML, Rice PM. The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Res 2009;38:1767-71. [PMID: 20015970 PMCID: PMC2847217 DOI: 10.1093/nar/gkp1137] [Citation(s) in RCA: 968] [Impact Index Per Article: 64.5] [Indexed: 11/20/2022]

7010

Metzker ML. Sequencing technologies — the next generation. Nat Rev Genet 2009;11:31-46. [PMID: 19997069 DOI: 10.1038/nrg2626] [Citation(s) in RCA: 4010] [Impact Index Per Article: 267.3] [Indexed: 02/06/2023]

7011

Parks M, Cronn R, Liston A. Increasing phylogenetic resolution at low taxonomic levels using massively parallel sequencing of chloroplast genomes. BMC Biol 2009;7:84. [PMID: 19954512 PMCID: PMC2793254 DOI: 10.1186/1741-7007-7-84] [Citation(s) in RCA: 358] [Impact Index Per Article: 23.9] [Received: 11/12/2009] [Accepted: 12/02/2009] [Indexed: 11/12/2022]

Abstract

BACKGROUND

Molecular evolutionary studies share the common goal of elucidating historical relationships, and the common challenge of adequately sampling taxa and characters. Particularly at low taxonomic levels, recent divergence, rapid radiations, and conservative genome evolution yield limited sequence variation, and dense taxon sampling is often desirable. Recent advances in massively parallel sequencing make it possible to rapidly obtain large amounts of sequence data, and multiplexing makes extensive sampling of megabase sequences feasible. Is it possible to efficiently apply massively parallel sequencing to increase phylogenetic resolution at low taxonomic levels?

RESULTS

We reconstruct the infrageneric phylogeny of Pinus from 37 nearly-complete chloroplast genomes (average 109 kilobases each of an approximately 120 kilobase genome) generated using multiplexed massively parallel sequencing. 30/33 ingroup nodes resolved with ≥ 95% bootstrap support; this is a substantial improvement relative to prior studies, and shows massively parallel sequencing-based strategies can produce sufficient high quality sequence to reach support levels originally proposed for the phylogenetic bootstrap. Resampling simulations show that at least the entire plastome is necessary to fully resolve Pinus, particularly in rapidly radiating clades. Meta-analysis of 99 published infrageneric phylogenies shows that whole plastome analysis should provide similar gains across a range of plant genera. A disproportionate amount of phylogenetic information resides in two loci (ycf1, ycf2), highlighting their unusual evolutionary properties.

CONCLUSION

Plastome sequencing is now an efficient option for increasing phylogenetic resolution at lower taxonomic levels in plant phylogenetic and population genetic analyses. With continuing improvements in sequencing capacity, the strategies herein should revolutionize efforts requiring dense taxon and character sampling, such as phylogeographic analyses and species-level DNA barcoding.


7012

Sánchez CC, Smith TPL, Wiedmann RT, Vallejo RL, Salem M, Yao J, Rexroad CE. Single nucleotide polymorphism discovery in rainbow trout by deep sequencing of a reduced representation library. BMC Genomics 2009;10:559. [PMID: 19939274 PMCID: PMC2790473 DOI: 10.1186/1471-2164-10-559] [Citation(s) in RCA: 94] [Impact Index Per Article: 6.3] [Received: 07/28/2009] [Accepted: 11/25/2009] [Indexed: 11/21/2022]

Abstract

Background

To enhance capabilities for genomic analyses in rainbow trout, such as genomic selection, a large suite of polymorphic markers that are amenable to high-throughput genotyping protocols must be identified. Expressed Sequence Tags (ESTs) have been used for single nucleotide polymorphism (SNP) discovery in salmonids. In those strategies, the salmonid semi-tetraploid genomes often led to assemblies of paralogous sequences and therefore resulted in a high rate of false positive SNP identification. Sequencing genomic DNA using primers identified from ESTs proved to be an effective but time-consuming method of SNP identification in rainbow trout and is therefore not suitable for high-throughput SNP discovery. In this study, we employed a high-throughput strategy that used pyrosequencing technology to generate data from a reduced representation library constructed with genomic DNA pooled from 96 unrelated rainbow trout that represent the National Center for Cool and Cold Water Aquaculture (NCCCWA) broodstock population.

Results

The reduced representation library consisted of 440 bp fragments resulting from complete digestion with the restriction enzyme HaeIII; sequencing produced 2,000,000 reads providing an average 6-fold coverage of the estimated 150,000 unique genomic restriction fragments (300,000 fragment ends). Three independent data analyses identified 22,022 to 47,128 putative SNPs on 13,140 to 24,627 independent contigs. A set of 384 putative SNPs, randomly selected from the sets produced by the three analyses, was genotyped on individual fish to determine the validation rate of putative SNPs among analyses, distinguish apparent SNPs that actually represent paralogous loci in the tetraploid genome, examine Mendelian segregation, and place the validated SNPs on the rainbow trout linkage map. Approximately 48% (183) of the putative SNPs were validated; 167 markers were successfully incorporated into the rainbow trout linkage map. In addition, 2% of the sequences from the validated markers were associated with rainbow trout transcripts.

Conclusion

The use of reduced representation libraries and pyrosequencing technology proved to be an effective strategy for the discovery of a high number of putative SNPs in rainbow trout; however, modifications to the technique to decrease the false discovery rate resulting from the evolutionarily recent genome duplication would be desirable.
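
The validation figures quoted in the Results of this abstract can be checked with simple arithmetic. The sketch below is only an illustration: the helper name `percent` and the variable names are hypothetical, and the counts are the ones stated in the abstract.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Counts taken from the abstract; variable names are illustrative only.
genotyped = 384  # putative SNPs selected for genotyping
validated = 183  # putative SNPs that validated
mapped = 167     # validated markers placed on the linkage map

validation_rate = percent(validated, genotyped)
mapped_rate = percent(mapped, validated)

print(f"validated: {validation_rate:.1f}% of genotyped SNPs")
print(f"mapped:    {mapped_rate:.1f}% of validated SNPs")
```

The first ratio rounds to the "approximately 48%" reported in the abstract; the second is a derived figure that the abstract does not state directly.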


7013

Imelfort M, Edwards D. De novo sequencing of plant genomes using second-generation technologies. Brief Bioinform 2009;10:609-18. [DOI: 10.1093/bib/bbp039] [Citation(s) in RCA: 84] [Impact Index Per Article: 5.6] [Indexed: 12/16/2022]

7014

Bickel PJ, Brown JB, Huang H, Li Q. An overview of recent developments in genomics and associated statistical methods. Philosophical Transactions. Series A, Mathematical, Physical, and Engineering Sciences 2009;367:4313-37. [PMID: 19805447 DOI: 10.1098/rsta.2009.0164] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 05/21/2023]

7015

Pepke S, Wold B, Mortazavi A. Computation for ChIP-seq and RNA-seq studies. Nat Methods 2009;6:S22-32. [PMID: 19844228 DOI: 10.1038/nmeth.1371] [Citation(s) in RCA: 400] [Impact Index Per Article: 26.7] [Indexed: 12/14/2022]

7016

Leinonen R, Akhtar R, Birney E, Bonfield J, Bower L, Corbett M, Cheng Y, Demiralp F, Faruque N, Goodgame N, Gibson R, Hoad G, Hunter C, Jang M, Leonard S, Lin Q, Lopez R, Maguire M, McWilliam H, Plaister S, Radhakrishnan R, Sobhany S, Slater G, Ten Hoopen P, Valentin F, Vaughan R, Zalunin V, Zerbino D, Cochrane G. Improvements to services at the European Nucleotide Archive. Nucleic Acids Res 2009;38:D39-45. [PMID: 19906712 PMCID: PMC2808951 DOI: 10.1093/nar/gkp998] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.9] [Indexed: 12/29/2022]

7017

Medvedev P, Brudno M. Maximum likelihood genome assembly. J Comput Biol 2009;16:1101-16. [PMID: 19645596 DOI: 10.1089/cmb.2009.0047] [Citation(s) in RCA: 71] [Impact Index Per Article: 4.7] [Indexed: 11/13/2022]

7018

Nielsen CB, Jackman SD, Birol I, Jones SJM. ABySS-Explorer: visualizing genome sequence assemblies. IEEE Transactions on Visualization and Computer Graphics 2009;15:881-8. [PMID: 19834150 DOI: 10.1109/tvcg.2009.116] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 05/07/2023]

7019

Marguerat S, Bähler J. RNA-seq: from technology to biology. Cellular and Molecular Life Sciences: CMLS 2009. [PMID: 19859660 DOI: 10.1007/s00018-009-0180-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/22/2023]

7020

Horner DS, Pavesi G, Castrignano T, De Meo PD, Liuni S, Sammeth M, Picardi E, Pesole G. Bioinformatics approaches for genomics and post genomics applications of next-generation sequencing. Brief Bioinform 2009;11:181-97. [DOI: 10.1093/bib/bbp046] [Citation(s) in RCA: 111] [Impact Index Per Article: 7.4] [Indexed: 12/13/2022]

7021

Arner E, Hayashizaki Y, Daub CO. NGSView: an extensible open source editor for next-generation sequencing data. Bioinformatics 2009;26:125-6. [PMID: 19855106 PMCID: PMC2796816 DOI: 10.1093/bioinformatics/btp611] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 11/15/2022]

7022

Kerstens HHD, Crooijmans RPMA, Veenendaal A, Dibbits BW, Chin-A-Woeng TFC, den Dunnen JT, Groenen MAM. Large scale single nucleotide polymorphism discovery in unsequenced genomes using second generation high throughput sequencing technology: applied to turkey. BMC Genomics 2009;10:479. [PMID: 19835600 PMCID: PMC2772860 DOI: 10.1186/1471-2164-10-479] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.3] [Received: 05/15/2009] [Accepted: 10/16/2009] [Indexed: 01/18/2023]

Abstract

BACKGROUND

The development of second generation sequencing methods has enabled large scale DNA variation studies at moderate cost. For the high throughput discovery of single nucleotide polymorphisms (SNPs) in species lacking a sequenced reference genome, we set up an analysis pipeline based on a short read de novo sequence assembler and a program designed to identify variation within short reads. To illustrate the potential of this technique, we present the results obtained with a randomly sheared, enzymatically generated, 2-3 kbp genome fraction of six pooled Meleagris gallopavo (turkey) individuals.

RESULTS

A total of 100 million 36 bp reads were generated, representing approximately 5-6% (approximately 62 Mbp) of the turkey genome, with an estimated sequence depth of 58. Reads consisting of bases called with less than 1% error probability were selected and assembled into contigs. Subsequently, high throughput discovery of nucleotide variation was performed using sequences with more than 90% reliability by using the assembled contigs that were 50 bp or longer as the reference sequence. We identified more than 7,500 SNPs with a high probability of representing true nucleotide variation in turkeys. Increasing the reference genome by adding publicly available turkey BAC-end sequences increased the number of SNPs to over 11,000. A comparison with the sequenced chicken genome indicated that the assembled turkey contigs were distributed uniformly across the turkey genome. Genotyping of a representative sample of 340 SNPs resulted in a SNP conversion rate of 95%. The correlation of the minor allele count (MAC) and observed minor allele frequency (MAF) for the validated SNPs was 0.69.

CONCLUSION

We provide an efficient and cost-effective approach for the identification of thousands of high quality SNPs in species currently lacking a sequenced genome and applied this to turkey. The methodology addresses a random fraction of the genome, resulting in an even distribution of SNPs across the targeted genome.
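
The sequence depth quoted in the Results of this abstract follows from the read count, read length, and size of the sampled genome fraction. The sketch below works through that arithmetic; the function name `sequencing_depth` is hypothetical, and the inputs are the rounded figures given in the abstract, so the result is only approximate.

```python
def sequencing_depth(n_reads: int, read_length_bp: int, target_size_bp: float) -> float:
    """Average depth = total sequenced bases / size of the sequenced target."""
    return n_reads * read_length_bp / target_size_bp


# Rounded figures from the abstract: 100 million reads of 36 bp,
# covering a genome fraction of approximately 62 Mbp.
depth = sequencing_depth(n_reads=100_000_000, read_length_bp=36, target_size_bp=62e6)
print(f"estimated depth: {depth:.1f}x")
```

With these inputs the estimate comes out at roughly 58-fold, in line with the depth reported in the abstract.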


7023

Sense from sequence reads: methods for alignment and assembly. Nat Methods 2009;6:S6-S12. [DOI: 10.1038/nmeth.1376] [Citation(s) in RCA: 266] [Impact Index Per Article: 17.7] [Indexed: 02/06/2023]

7024

Sudbery I, Stalker J, Simpson JT, Keane T, Rust AG, Hurles ME, Walter K, Lynch D, Teboul L, Brown SD, Li H, Ning Z, Nadeau JH, Croniger CM, Durbin R, Adams DJ. Deep short-read sequencing of chromosome 17 from the mouse strains A/J and CAST/Ei identifies significant germline variation and candidate genes that regulate liver triglyceride levels. Genome Biol 2009;10:R112. [PMID: 19825173 PMCID: PMC2784327 DOI: 10.1186/gb-2009-10-10-r112] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.3] [Received: 07/05/2009] [Revised: 08/26/2009] [Accepted: 10/13/2009] [Indexed: 11/10/2022]

7025

Zhou X, Su Z, Sammons RD, Peng Y, Tranel PJ, Stewart CN, Yuan JS. Novel software package for cross-platform transcriptome analysis (CPTRA). BMC Bioinformatics 2009;10 Suppl 11:S16. [PMID: 19811681 PMCID: PMC3226187 DOI: 10.1186/1471-2105-10-s11-s16] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 12/16/2022]

Abstract

Background

Next-generation sequencing techniques enable several novel transcriptome profiling approaches. Recent studies indicated that digital gene expression profiling based on short sequence tags has superior performance as compared to other transcriptome analysis platforms including microarrays. However, the transcriptomic analysis with tag-based methods often depends on available genome sequence. The use of tag-based methods in species without genome sequence should be complemented by other methods such as cDNA library sequencing. The combination of different next generation sequencing techniques like 454 pyrosequencing and Illumina Genome Analyzer (Solexa) will enable high-throughput and accurate global gene expression profiling in species with limited genome information. The combination of transcriptome data acquisition methods requires cross-platform transcriptome data analysis platforms, including a new software package for data processing.

Results

Here we present a software package, CPTRA: Cross-Platform TRanscriptome Analysis, to analyze transcriptome profiling data from separate methods. The software package is available at http://people.tamu.edu/~syuan/cptra/cptra.html. It was applied to a case study of non-target site glyphosate resistance in horseweed, and the data were mined to discover resistance target gene(s). For the software, the input data included a long-read sequence dataset with proper annotation, and a short-read sequence tag dataset for the quantification of transcripts. By combining the two datasets, the software carries out the unique sequence tag identification, tag counting for transcript quantification, and cross-platform sequence matching functions, whereby the short sequence tags can be annotated with a function, level of expression, and Gene Ontology (GO) classification. Multiple sequence search algorithms were implemented and compared. The analysis highlighted the importance of transport genes in glyphosate resistance and identified several candidate genes for down-stream analysis.

Conclusion

CPTRA is a powerful software package for next generation sequencing-based transcriptome profiling in species with limited genome information. According to our case study, the strategy can greatly broaden the application of next generation sequencing for transcriptome analysis in species without a reference genome sequence.


7026

Maccallum I, Przybylski D, Gnerre S, Burton J, Shlyakhter I, Gnirke A, Malek J, McKernan K, Ranade S, Shea TP, Williams L, Young S, Nusbaum C, Jaffe DB. ALLPATHS 2: small genomes assembled accurately and with high continuity from short paired reads. Genome Biol 2009;10:R103. [PMID: 19796385 PMCID: PMC2784318 DOI: 10.1186/gb-2009-10-10-r103] [Citation(s) in RCA: 141] [Impact Index Per Article: 9.4] [Received: 06/19/2009] [Revised: 08/20/2009] [Accepted: 10/01/2009] [Indexed: 11/10/2022]

7027

Nagarajan N, Pop M. Parametric complexity of sequence assembly: theory and applications to next generation sequencing. J Comput Biol 2009;16:897-908. [PMID: 19580519 DOI: 10.1089/cmb.2009.0005] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.7] [Indexed: 11/13/2022]

7028

Stabler RA, He M, Dawson L, Martin M, Valiente E, Corton C, Lawley TD, Sebaihia M, Quail MA, Rose G, Gerding DN, Gibert M, Popoff MR, Parkhill J, Dougan G, Wren BW. Comparative genome and phenotypic analysis of Clostridium difficile 027 strains provides insight into the evolution of a hypervirulent bacterium. Genome Biol 2009;10:R102. [PMID: 19781061 PMCID: PMC2768977 DOI: 10.1186/gb-2009-10-9-r102] [Citation(s) in RCA: 364] [Impact Index Per Article: 24.3] [Received: 06/08/2009] [Revised: 06/29/2009] [Accepted: 09/25/2009] [Indexed: 11/10/2022]

7029

Diguistini S, Liao NY, Platt D, Robertson G, Seidel M, Chan SK, Docking TR, Birol I, Holt RA, Hirst M, Mardis E, Marra MA, Hamelin RC, Bohlmann J, Breuil C, Jones SJ. De novo genome sequence assembly of a filamentous fungus using Sanger, 454 and Illumina sequence data. Genome Biol 2009;10:R94. [PMID: 19747388 PMCID: PMC2768983 DOI: 10.1186/gb-2009-10-9-r94] [Citation(s) in RCA: 123] [Impact Index Per Article: 8.2] [Received: 06/05/2009] [Accepted: 09/11/2009] [Indexed: 12/16/2022]

7030

Rodrigue S, Malmstrom RR, Berlin AM, Birren BW, Henn MR, Chisholm SW. Whole genome amplification and de novo assembly of single bacterial cells. PLoS One 2009;4:e6864. [PMID: 19724646 PMCID: PMC2731171 DOI: 10.1371/journal.pone.0006864] [Citation(s) in RCA: 183] [Impact Index Per Article: 12.2] [Received: 06/12/2009] [Accepted: 07/27/2009] [Indexed: 12/21/2022]

7031

Chen K, Wallis JW, McLellan MD, Larson DE, Kalicki JM, Pohl CS, McGrath SD, Wendl MC, Zhang Q, Locke DP, Shi X, Fulton RS, Ley TJ, Wilson RK, Ding L, Mardis ER. BreakDancer: an algorithm for high-resolution mapping of genomic structural variation. Nat Methods 2009;6:677-81. [PMID: 19668202 PMCID: PMC3661775 DOI: 10.1038/nmeth.1363] [Citation(s) in RCA: 1017] [Impact Index Per Article: 67.8] [Received: 04/06/2009] [Accepted: 07/13/2009] [Indexed: 11/09/2022]

7032

Giannakis M, Bäckhed HK, Chen SL, Faith JJ, Wu M, Guruge JL, Engstrand L, Gordon JI. Response of gastric epithelial progenitors to Helicobacter pylori isolates obtained from Swedish patients with chronic atrophic gastritis. J Biol Chem 2009;284:30383-94. [PMID: 19723631 PMCID: PMC2781593 DOI: 10.1074/jbc.m109.052738] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Indexed: 01/01/2023]

Abstract

Helicobacter pylori infection is associated with gastric adenocarcinoma in some humans, especially those that develop an antecedent condition, chronic atrophic gastritis (ChAG). Gastric epithelial progenitors (GEPs) in transgenic gnotobiotic mice with a ChAG-like phenotype harbor intracellular collections of H. pylori. To characterize H. pylori adaptations to ChAG, we sequenced the genomes of 24 isolates obtained from 6 individuals, each sampled over a 4-year interval, as they did or did not progress from normal gastric histology to ChAG and/or adenocarcinoma. H. pylori populations within study participants were largely clonal and remarkably stable regardless of disease state. GeneChip studies of the responses of a cultured mouse gastric stem cell-like line (mGEPs) to infection with sequenced strains yielded a 695-member dataset of transcripts that are (i) differentially expressed after infection with ChAG-associated isolates, but not with a “normal” or a heat-killed ChAG isolate, and (ii) enriched in genes and gene functions associated with tumorigenesis in general and gastric carcinogenesis in specific cases. Transcriptional profiling of a ChAG strain during mGEP infection disclosed a set of responses, including up-regulation of hopZ, an adhesin belonging to a family of outer membrane proteins. Expression profiles of wild-type and ΔhopZ strains revealed a number of pH-regulated genes modulated by HopZ, including hopP, which binds sialylated glycans produced by GEPs in vivo. Genetic inactivation of hopZ produced a fitness defect in the stomachs of gnotobiotic transgenic mice but not in wild-type littermates. This study illustrates an approach for identifying GEP responses specific to ChAG-associated H. pylori strains and bacterial genes important for survival in a model of the ChAG gastric ecosystem.


7033

Morozova O, Hirst M, Marra MA. Applications of new sequencing technologies for transcriptome analysis. Annu Rev Genomics Hum Genet 2009;10:135-51. [PMID: 19715439 DOI: 10.1146/annurev-genom-082908-145957] [Citation(s) in RCA: 340] [Impact Index Per Article: 22.7] [Indexed: 11/09/2022]

7034

Soderlund C, Johnson E, Bomhoff M, Descour A. PAVE: program for assembling and viewing ESTs. BMC Genomics 2009;10:400. [PMID: 19709403 PMCID: PMC2748094 DOI: 10.1186/1471-2164-10-400] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Received: 02/26/2009] [Accepted: 08/26/2009] [Indexed: 11/10/2022]

Abstract

Background

New sequencing technologies are rapidly emerging. Many laboratories are simultaneously working with the traditional Sanger ESTs and experimenting with ESTs generated by the 454 Life Science sequencers. Though Sanger ESTs have been used to generate contigs for many years, no program takes full advantage of the 5' and 3' mate-pair information; hence, many tentative transcripts are assembled into two separate contigs. The new 454 technology has the benefit of high-throughput expression profiling, but introduces time and space problems for assembling large contigs.

Results

The PAVE (Program for Assembling and Viewing ESTs) assembler takes advantage of the 5' and 3' mate-pair information by requiring that the mate-pairs be assembled into the same contig and joined by n's if the two sub-contigs do not overlap. It handles the depth of 454 data sets by "burying" similar ESTs during assembly, which retains the expression level information while circumventing time and space problems. PAVE uses MegaBLAST for the clustering step and CAP3 for assembly; however, it assembles incrementally to enforce the mate-pair constraint, bury ESTs, and reduce incorrect joins and splits. The PAVE data management system uses a MySQL database to store multiple libraries of ESTs along with their metadata; the management system allows multiple assemblies with variations on libraries and parameters. Analysis routines provide standard annotation for the contigs including a measure of differentially expressed genes across the libraries. A Java viewer program is provided for display and analysis of the results. Our results clearly show the benefit of using the PAVE assembler to explicitly use mate-pair information and bury ESTs for large contigs.

Conclusion

The PAVE assembler provides a software package for assembling Sanger and/or 454 ESTs. The assembly software, data management software, Java viewer and user's guide are freely available.


7035

Gibbons JG, Janson EM, Hittinger CT, Johnston M, Abbot P, Rokas A. Benchmarking next-generation transcriptome sequencing for functional and evolutionary genomics. Mol Biol Evol 2009;26:2731-44. [PMID: 19706727 DOI: 10.1093/molbev/msp188] [Citation(s) in RCA: 129] [Impact Index Per Article: 8.6] [Indexed: 12/25/2022]

7036

Studholme DJ, Ibanez SG, MacLean D, Dangl JL, Chang JH, Rathjen JP. A draft genome sequence and functional screen reveals the repertoire of type III secreted proteins of Pseudomonas syringae pathovar tabaci 11528. BMC Genomics 2009;10:395. [PMID: 19703286 PMCID: PMC2745422 DOI: 10.1186/1471-2164-10-395] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.9] [Received: 03/04/2009] [Accepted: 08/24/2009] [Indexed: 11/28/2022]

Abstract

Background

Pseudomonas syringae is a widespread bacterial pathogen that causes disease on a broad range of economically important plant species. Pathogenicity of P. syringae strains is dependent on the type III secretion system, which secretes a suite of up to about thirty virulence 'effector' proteins into the host cytoplasm where they subvert the eukaryotic cell physiology and disrupt host defences. P. syringae pathovar tabaci naturally causes disease on wild tobacco, the model member of the Solanaceae (a family that includes many crop species), as well as on soybean.

Results

We used the 'next-generation' Illumina sequencing platform and the Velvet short-read assembly program to generate a 145X deep 6,077,921 nucleotide draft genome sequence for P. syringae pathovar tabaci strain 11528. From our draft assembly, we predicted 5,300 potential genes encoding proteins at least 100 amino acids long, of which 303 (5.72%) had no significant sequence similarity to those encoded by the three previously fully sequenced P. syringae genomes. Of the core set of Hrp Outer Proteins that are conserved in three previously fully sequenced P. syringae strains, most were also conserved in strain 11528, including AvrE1, HopAH2, HopAJ2, HopAK1, HopAN1, HopI, HopJ1, HopX1, HrpK1 and HrpW1. However, the hrpZ1 gene is partially deleted and hopAF1 is completely absent in 11528. The draft genome of strain 11528 also encodes close homologues of HopO1, HopT1, HopAH1, HopR1, HopV1, HopAG1, HopAS1, HopAE1, HopAR1, HopF1, and HopW1 and a degenerate HopM1'. Using a functional screen, we confirmed that hopO1, hopT1, hopAH1, hopM1', hopAE1, hopAR1, and hopAI1' are part of the virulence-associated HrpL regulon, though the hopAI1' and hopM1' sequences were degenerate with premature stop codons. We also discovered two additional HrpL-regulated effector candidates and an HrpL-regulated distant homologue of avrPto1.

Conclusion

The draft genome sequence facilitates the continued development of P. syringae pathovar tabaci on wild tobacco as an attractive model system for studying bacterial disease on plants. The catalogue of effectors sheds further light on the evolution of pathogenicity and host-specificity as well as providing a set of molecular tools for the study of plant defence mechanisms. We also discovered several large genomic regions in Pta 11528 that do not share detectable nucleotide sequence similarity with previously sequenced Pseudomonas genomes. These regions may include horizontally acquired islands that possibly contribute to pathogenicity or epiphytic fitness of Pta 11528.

7037

Parker HG, VonHoldt BM, Quignon P, Margulies EH, Shao S, Mosher DS, Spady TC, Elkahloun A, Cargill M, Jones PG, Maslen CL, Acland GM, Sutter NB, Kuroki K, Bustamante CD, Wayne RK, Ostrander EA. An expressed fgf4 retrogene is associated with breed-defining chondrodysplasia in domestic dogs. Science 2009;325:995-8. [PMID: 19608863 PMCID: PMC2748762 DOI: 10.1126/science.1173275] [Citation(s) in RCA: 238] [Impact Index Per Article: 15.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2022]

Affiliation(s)

Heidi G. Parker Cancer Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892 USA
Bridgett M. VonHoldt Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA, 90095 USA
Pascale Quignon Cancer Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892 USA
Elliott H. Margulies Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892 USA
Stephanie Shao Cancer Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892 USA
Dana S. Mosher Cancer Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892 USA
Tyrone C. Spady Cancer Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892 USA
Abdel Elkahloun Cancer Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892 USA
Michele Cargill Affymetrix Corporation, 3420 Central Expwy, Santa Clara, CA 95051 USA
Paul G. Jones The WALTHAM® Centre for Pet Nutrition, Waltham on the Wolds, Leicestershire, UK, LE14 4RT
Cheryl L. Maslen Division of Cardiovascular Medicine, Oregon Health & Science University, Portland, OR 97239 USA
Gregory M. Acland Baker Institute for Animal Health, Cornell University, Ithaca, NY 14853, USA; College of Veterinary Medicine, Cornell University, Ithaca, NY 14853, USA
Nathan B. Sutter College of Veterinary Medicine, Cornell University, Ithaca, NY 14853, USA
Keiichi Kuroki Comparative Orthopaedic Laboratory, University of Missouri, Columbia, MO 65211
Carlos D. Bustamante Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY, USA
Robert K. Wayne Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA, 90095 USA
Elaine A. Ostrander Cancer Genetics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892 USA

7038

Varshney RK, Nayak SN, May GD, Jackson SA. Next-generation sequencing technologies and their implications for crop genetics and breeding. Trends Biotechnol 2009;27:522-30. [PMID: 19679362 DOI: 10.1016/j.tibtech.2009.05.006] [Citation(s) in RCA: 401] [Impact Index Per Article: 26.7] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/05/2009] [Revised: 05/21/2009] [Accepted: 05/27/2009] [Indexed: 10/20/2022]

7039

Bozdag S, Close TJ, Lonardi S. A compartmentalized approach to the assembly of physical maps. BMC Bioinformatics 2009;10:217. [PMID: 19604400 PMCID: PMC2717093 DOI: 10.1186/1471-2105-10-217] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/30/2008] [Accepted: 07/15/2009] [Indexed: 12/30/2022] Open

7040

Du J, Bjornson RD, Zhang ZD, Kong Y, Snyder M, Gerstein MB. Integrating sequencing technologies in personal genomics: optimal low cost reconstruction of structural variants. PLoS Comput Biol 2009;5:e1000432. [PMID: 19593373 PMCID: PMC2700963 DOI: 10.1371/journal.pcbi.1000432] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/19/2008] [Accepted: 06/04/2009] [Indexed: 12/02/2022] Open

Abstract

The goal of human genome re-sequencing is obtaining an accurate assembly of an individual's genome. Recently, there has been great excitement in the development of many technologies for this (e.g. medium and short read sequencing from companies such as 454 and SOLiD, and high-density oligo-arrays from Affymetrix and NimbleGen), with even more expected to appear. The costs and sensitivities of these technologies differ considerably from each other. As an important goal of personal genomics is to reduce the cost of re-sequencing to an affordable point, it is worthwhile to consider optimally integrating technologies. Here, we build a simulation toolbox that will help us optimally combine different technologies for genome re-sequencing, especially in reconstructing large structural variants (SVs). SV reconstruction is considered the most challenging step in human genome re-sequencing. (It is sometimes even harder than de novo assembly of small genomes because of the duplications and repetitive sequences in the human genome.) To this end, we formulate canonical problems that are representative of issues in reconstruction and are of small enough scale to be computationally tractable and simulatable. Using semi-realistic simulations, we show how we can combine different technologies to optimally solve the assembly at low cost. With mapability maps, our simulations efficiently handle the inhomogeneous repeat-containing structure of the human genome and the computational complexity of practical assembly algorithms. They quantitatively show how combining different read lengths is more cost-effective than using one length, how an optimal mixed sequencing strategy for reconstructing large novel SVs usually also gives accurate detection of SNPs/indels, how paired-end reads can improve reconstruction efficiency, and how adding in arrays is more efficient than just sequencing for disentangling some complex SVs. Our strategy should facilitate the sequencing of human genomes at maximum accuracy and low cost.

In recent years, the development of high throughput sequencing and array technologies has enabled the accurate re-sequencing of individual genomes, especially in identifying and reconstructing the variants in an individual's genome compared to a “reference”. The costs and sensitivities of these technologies differ considerably from each other, and even more technologies are expected to appear in the near future. To both reduce the total cost of re-sequencing to an affordable point and be adaptive to these constantly evolving bio-technologies, we propose to build a computationally efficient simulation framework that can help us optimize the combination of different technologies to perform low cost comparative genome re-sequencing, especially in reconstructing large structural variants, which is considered in many respects the most challenging step in genome re-sequencing. Our simulation results quantitatively show how much improvement one can gain in reconstructing large structural variants by integrating different technologies in optimal ways. We envision that in the future, more experimental technologies will be incorporated into this simulation framework and its results can provide informative guidelines for the actual experimental design to achieve optimal genome re-sequencing output at low costs.

7041

Chen XS, Collins LJ, Biggs PJ, Penny D. High throughput genome-wide survey of small RNAs from the parasitic protists Giardia intestinalis and Trichomonas vaginalis. Genome Biol Evol 2009;1:165-75. [PMID: 20333187 PMCID: PMC2817412 DOI: 10.1093/gbe/evp017] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 06/30/2009] [Indexed: 12/26/2022] Open

7042

Manske HM, Kwiatkowski DP. SNP-o-matic. Bioinformatics 2009;25:2434-5. [PMID: 19574284 PMCID: PMC2735664 DOI: 10.1093/bioinformatics/btp403] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]

7043

Zhao F, Hou H, Bao Q, Wu J. PGA4genomics for comparative genome assembly based on genetic algorithm optimization. Genomics 2009;94:284-6. [PMID: 19573591 DOI: 10.1016/j.ygeno.2009.06.006] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2009] [Accepted: 06/19/2009] [Indexed: 11/28/2022]

7044

Ye K, Schulz MH, Long Q, Apweiler R, Ning Z. Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads. Bioinformatics 2009;25:2865-71. [PMID: 19561018 PMCID: PMC2781750 DOI: 10.1093/bioinformatics/btp394] [Citation(s) in RCA: 1485] [Impact Index Per Article: 99.0] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

7045

Schröder J, Schröder H, Puglisi SJ, Sinha R, Schmidt B. SHREC: a short-read error correction method. Bioinformatics 2009;25:2157-63. [PMID: 19542152 DOI: 10.1093/bioinformatics/btp379] [Citation(s) in RCA: 115] [Impact Index Per Article: 7.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/19/2022] Open

7046

Dutilh BE, Huynen MA, Strous M. Increasing the coverage of a metapopulation consensus genome by iterative read mapping and assembly. Bioinformatics 2009;25:2878-81. [PMID: 19542148 PMCID: PMC2781756 DOI: 10.1093/bioinformatics/btp377] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]

7047

Hurd PJ, Nelson CJ. Advantages of next-generation sequencing versus the microarray in epigenetic research. Brief Funct Genomic Proteomic 2009;8:174-83. [PMID: 19535508 DOI: 10.1093/bfgp/elp013] [Citation(s) in RCA: 157] [Impact Index Per Article: 10.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/20/2022]

7048

Schmidt B, Sinha R, Beresford-Smith B, Puglisi SJ. A fast hybrid short read fragment assembly algorithm. Bioinformatics 2009;25:2279-80. [PMID: 19535537 DOI: 10.1093/bioinformatics/btp374] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

7049

Birol I, Jackman SD, Nielsen CB, Qian JQ, Varhol R, Stazyk G, Morin RD, Zhao Y, Hirst M, Schein JE, Horsman DE, Connors JM, Gascoyne RD, Marra MA, Jones SJM. De novo transcriptome assembly with ABySS. Bioinformatics 2009;25:2872-7. [PMID: 19528083 DOI: 10.1093/bioinformatics/btp367] [Citation(s) in RCA: 297] [Impact Index Per Article: 19.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022] Open

7050

Directed evolution of ionizing radiation resistance in Escherichia coli. J Bacteriol 2009;191:5240-52. [PMID: 19502398 DOI: 10.1128/jb.00502-09] [Citation(s) in RCA: 87] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open