51
Wright IA, Travers SA. RAMICS: trainable, high-speed and biologically relevant alignment of high-throughput sequencing reads to coding DNA. Nucleic Acids Res 2014; 42:e106. [PMID: 24861618 PMCID: PMC4117746 DOI: 10.1093/nar/gku473] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 12/30/2022] Open
Abstract
The challenge presented by high-throughput sequencing necessitates the development of novel tools for accurate alignment of reads to reference sequences. Current approaches focus on using heuristics to map reads quickly to large genomes, rather than generating highly accurate alignments in coding regions. Such approaches are, thus, unsuited for applications such as amplicon-based analysis and the realignment phase of exome sequencing and RNA-seq, where accurate and biologically relevant alignment of coding regions is critical. To facilitate such analyses, we have developed a novel tool, RAMICS, that is tailored to mapping large numbers of sequence reads to short lengths (<10 000 bp) of coding DNA. RAMICS utilizes profile hidden Markov models to discover the open reading frame of each sequence and aligns to the reference sequence in a biologically relevant manner, distinguishing between genuine codon-sized indels and frameshift mutations. This approach facilitates the generation of highly accurate alignments, accounting for the error biases of the sequencing machine used to generate reads, particularly at homopolymer regions. Performance improvements are gained through the use of graphics processing units, which increase the speed of mapping through parallelization. RAMICS substantially outperforms all other mapping approaches tested in terms of alignment quality while maintaining highly competitive speed performance.
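The distinction the abstract draws between codon-sized indels and frameshift mutations can be illustrated with a very small sketch. This is not the RAMICS method (which the abstract says relies on profile hidden Markov models); it only assumes that an indel whose length is a multiple of three preserves the reading frame, and the function name `classify_indel` is hypothetical.

```python
def classify_indel(length: int) -> str:
    """Label an indel by whether it preserves the reading frame.

    Assumption for illustration only: an indel whose length is a
    multiple of 3 is treated as codon-sized; anything else shifts
    the frame.
    """
    if length <= 0:
        raise ValueError("indel length must be a positive number of bases")
    return "codon-sized" if length % 3 == 0 else "frameshift"


if __name__ == "__main__":
    for n in (1, 2, 3, 6):
        print(n, classify_indel(n))
```

A real aligner would also have to weigh sequencing error, for example at homopolymer runs, before accepting a frameshift as genuine.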
Affiliation(s)
- Imogen A Wright
  - South African National Bioinformatics Institute, South African Medical Research Council Bioinformatics Unit, University of the Western Cape, Bellville 7535, South Africa
- Simon A Travers
  - South African National Bioinformatics Institute, South African Medical Research Council Bioinformatics Unit, University of the Western Cape, Bellville 7535, South Africa
52
Farrell A, Coleman BI, Benenati B, Brown KM, Blader IJ, Marth GT, Gubbels MJ. Whole genome profiling of spontaneous and chemically induced mutations in Toxoplasma gondii. BMC Genomics 2014; 15:354. [PMID: 24885922 PMCID: PMC4035079 DOI: 10.1186/1471-2164-15-354] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 12/23/2013] [Accepted: 05/02/2014] [Indexed: 12/18/2022] Open
Abstract
Background: Next generation sequencing is helping to overcome limitations in organisms less accessible to classical or reverse genetic methods by facilitating whole genome mutational analysis studies. One traditionally intractable group, the Apicomplexa, contains several important pathogenic protozoan parasites, including the Plasmodium species that cause malaria. Here we apply whole genome analysis methods to the relatively accessible model apicomplexan, Toxoplasma gondii, to optimize forward genetic methods for chemical mutagenesis using N-ethyl-N-nitrosourea (ENU) and ethylmethane sulfonate (EMS) at varying dosages.
Results: By comparing three different lab-strains we show that spontaneously generated mutations reflect genome composition, without nucleotide bias. However, the single nucleotide variations (SNVs) are not distributed randomly over the genome; most of these mutations reside either in non-coding sequence or are silent with respect to protein coding. This is in contrast to the random genomic distribution of mutations induced by chemical mutagenesis. Additionally, we report a genome wide transition vs transversion ratio (ti/tv) of 0.91 for spontaneous mutations in Toxoplasma, with slightly higher ratios of 1.20 and 1.06 for variants induced by ENU and EMS, respectively. We also show that in the Toxoplasma system, surprisingly, both ENU and EMS have a proclivity for inducing mutations at A/T base pairs (78.6% and 69.6%, respectively).
Conclusions: The number of SNVs between related laboratory strains is relatively low and managed by purifying selection away from changes to amino acid sequence. From an experimental mutagenesis point of view, both ENU (24.7%) and EMS (29.1%) are more likely to generate variation within exons than would naturally accumulate over time in culture (19.1%), demonstrating the utility of these approaches for yielding proportionally greater changes to the amino acid sequence. These results will not only direct the methods of future chemical mutagenesis in Toxoplasma, but also aid in designing forward genetic approaches in less accessible pathogenic protozoa.
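The transition/transversion ratio (ti/tv) reported in this abstract can be computed from a list of single nucleotide changes. The sketch below is a generic illustration, not the authors' pipeline: it assumes each SNV is given as a (reference base, alternate base) pair, and the names `is_transition` and `ti_tv_ratio` are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    pair = {ref.upper(), alt.upper()}
    return len(pair) == 2 and (pair <= PURINES or pair <= PYRIMIDINES)


def ti_tv_ratio(snvs):
    """Return transitions / transversions for (ref, alt) base pairs.

    Pairs where ref == alt are ignored, since they are not substitutions.
    """
    ti = tv = 0
    for ref, alt in snvs:
        if ref.upper() == alt.upper():
            continue
        if is_transition(ref, alt):
            ti += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions; ratio is undefined")
    return ti / tv


if __name__ == "__main__":
    # Hypothetical toy input: two transitions and two transversions.
    example = [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]
    print(ti_tv_ratio(example))
```

Each base has one possible transition and two possible transversions, so purely random substitution would give a ratio near 0.5; this is a general property of the four-letter alphabet, not a claim taken from the paper.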
Affiliation(s)
- Marc-Jan Gubbels
  - Department of Biology, Boston College, Higgins Hall 355, 140 Commonwealth Avenue, Chestnut Hill, MA 02467, USA.
53
Wang BL, Ghaderi A, Zhou H, Agresti J, Weitz DA, Fink GR, Stephanopoulos G. Microfluidic high-throughput culturing of single cells for selection based on extracellular metabolite production or consumption. Nat Biotechnol 2014; 32:473-8. [PMID: 24705516 PMCID: PMC4412259 DOI: 10.1038/nbt.2857] [Citation(s) in RCA: 236] [Impact Index Per Article: 23.6] [Received: 10/30/2013] [Accepted: 02/21/2014] [Indexed: 11/09/2022]
Abstract
Phenotyping single cells based on the products they secrete or consume is a key bottleneck in many biotechnology applications, such as combinatorial metabolic engineering for the overproduction of secreted metabolites. Here we present a flexible high-throughput approach that uses microfluidics to compartmentalize individual cells for growth and analysis in monodisperse nanoliter aqueous droplets surrounded by an immiscible fluorinated oil phase. We use this system to identify xylose-overconsuming Saccharomyces cerevisiae cells from a population containing one such cell per 10^4 cells and to screen a genomic library to identify multiple copies of the xylose isomerase gene as a genomic change contributing to high xylose consumption, a trait important for lignocellulosic feedstock utilization. We also enriched L-lactate-producing Escherichia coli clones 5,800× from a population containing one L-lactate producer per 10^4 D-lactate producers. Our approach has broad applications for single-cell analyses, such as in strain selection for the overproduction of fuels, chemicals and pharmaceuticals.
Affiliation(s)
- Benjamin L Wang
  - [1] Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA. [2]
- Adel Ghaderi
  - [1] Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA. [2]
- Hang Zhou
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Jeremy Agresti
  - Department of Physics and School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts, USA
- David A Weitz
  - Department of Physics and School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts, USA
- Gerald R Fink
  - Whitehead Institute for Biomedical Research, Cambridge, Massachusetts, USA
- Gregory Stephanopoulos
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
54
Illumina-based analysis of endophytic bacterial diversity and space-time dynamics in sugar beet on the north slope of Tianshan mountain. Appl Microbiol Biotechnol 2014; 98:6375-85. [PMID: 24752839 DOI: 10.1007/s00253-014-5720-9] [Citation(s) in RCA: 53] [Impact Index Per Article: 5.3] [Received: 01/14/2014] [Revised: 03/20/2014] [Accepted: 03/21/2014] [Indexed: 10/25/2022]
Abstract
Plants harbor complex and variable microbial communities. Endophytic bacteria play an important role, with further potential, in developing sustainable systems of crop production. To examine how endophytic bacteria in sugar beet (Beta vulgaris L.) vary across both host growth period and location, PCR-based Illumina sequencing was applied to reveal the diversity and stability of endophytic bacteria in sugar beet on the north slope of Tianshan mountain, China. A total of 60.84 M effective sequences of the 16S rRNA gene V3 region were obtained from sugar beet samples. These sequences revealed a large number of operational taxonomic units (OTUs) in sugar beet, that is, 19-121 OTUs per beet sample at a 3 % cutoff level and a sequencing depth of 30,000 sequences. We identified 13 classes from the resulting 449,585 sequences. Alphaproteobacteria were the dominant class in all sugar beets, followed by Acidobacteria, Gemmatimonadetes and Actinobacteria. A marked difference in the diversity of endophytic bacteria in sugar beet for different growth periods was evident. The greatest number of OTUs was detected during rosette formation (109 OTUs) and tuber growth (146 OTUs). Endophytic bacterial diversity was reduced during seedling growth (66 OTUs) and sucrose accumulation (95 OTUs). Forty-three OTUs were common to all four periods. There were more tags of Alphaproteobacteria and Gammaproteobacteria in Shihezi than in Changji. The dynamics of endophytic bacterial communities were influenced by plant genotype and plant growth stage. To the best of our knowledge, this study is the first application of PCR-based Illumina pyrosequencing to characterize and compare multiple sugar beet samples.
55
Harper M, Gronenberg L, Liao J, Lee C. Comprehensive detection of genes causing a phenotype using phenotype sequencing and pathway analysis. PLoS One 2014; 9:e88072. [PMID: 24586303 PMCID: PMC3935835 DOI: 10.1371/journal.pone.0088072] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/27/2013] [Accepted: 01/06/2014] [Indexed: 12/30/2022] Open
Abstract
Discovering all the genetic causes of a phenotype is an important goal in functional genomics. We combine an experimental design for detecting independent genetic causes of a phenotype with a high-throughput sequencing analysis that maximizes sensitivity for comprehensively identifying them. Testing this approach on a set of 24 mutant strains generated for a metabolic phenotype with many known genetic causes, we show that this pathway-based phenotype sequencing analysis greatly improves sensitivity of detection compared with previous methods, and reveals a wide range of pathways that can cause this phenotype. We demonstrate our approach on a metabolic re-engineering phenotype, the PEP/OAA metabolic node in E. coli, which is crucial to a substantial number of metabolic pathways and under renewed interest for biofuel research. Out of 2157 mutations in these strains, pathway-phenoseq discriminated just five gene groups (12 genes) as statistically significant causes of the phenotype. Experimentally, these five gene groups, and the next two high-scoring pathway-phenoseq groups, either have a clear connection to the PEP metabolite level or offer an alternative path of producing oxaloacetate (OAA), and thus clearly explain the phenotype. These high-scoring gene groups also show strong evidence of positive selection pressure, compared with strictly neutral selection in the rest of the genome.
Affiliation(s)
- Marc Harper
  - Institute for Genomics and Proteomics, University of California Los Angeles, Los Angeles, California, United States of America
- Luisa Gronenberg
  - Department of Chemical and Biomolecular Engineering, University of California Los Angeles, Los Angeles, California, United States of America
- James Liao
  - Institute for Genomics and Proteomics, University of California Los Angeles, Los Angeles, California, United States of America
  - Department of Chemical and Biomolecular Engineering, University of California Los Angeles, Los Angeles, California, United States of America
- Christopher Lee
  - Institute for Genomics and Proteomics, University of California Los Angeles, Los Angeles, California, United States of America
  - Dept. of Chemistry & Biochemistry, University of California Los Angeles, Los Angeles, California, United States of America
  - Dept. of Computer Science, University of California Los Angeles, Los Angeles, California, United States of America
  - Molecular Biology Institute, University of California Los Angeles, Los Angeles, California, United States of America
56
Chen W, Chen H, Zheng T, Yu R, Terzaghi WB, Li Z, Deng XW, Xu J, He H. Highly efficient genotyping of rice biparental populations by GoldenGate assays based on parental resequencing. TAG. Theoretical and Applied Genetics. Theoretische und Angewandte Genetik 2014; 127:297-307. [PMID: 24190103 DOI: 10.1007/s00122-013-2218-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 05/12/2013] [Accepted: 10/14/2013] [Indexed: 05/04/2023]
Abstract
A new time- and cost-effective strategy was developed for medium-density SNP genotyping of rice biparental populations, using GoldenGate assays based on parental resequencing. Since the advent of molecular markers, crop researchers and breeders have dedicated huge amounts of effort to detecting quantitative trait loci (QTL) in biparental populations for genetic analysis and marker-assisted selection (MAS). In this study, we developed a new time- and cost-effective strategy for genotyping a population of progeny from a rice cross using medium-density single nucleotide polymorphisms (SNPs). Using this strategy, 728,362 "high quality" SNPs were identified by resequencing Teqing and Lemont, the parents of the population. We selected 384 informative SNPs that were evenly distributed across the genome for genotyping the biparental population using the Illumina GoldenGate assay. 335 (87.2 %) validated SNPs were used for further genetic analyses. After removing segregation distortion markers, 321 SNPs were used for linkage map construction and QTL mapping. This strategy generated SNP markers distributed more evenly across the genome than previous SSR assays. Taking the GW5 gene that controls grain shape as an example, our strategy provided higher accuracy (0.8 Mb) and significance (LOD 5.5 and 10.1) in QTL mapping than SSR analysis. Our study thus provides a rapid and efficient strategy for genetic studies and QTL mapping using SNP genotyping assays.
Affiliation(s)
- Wei Chen
  - Peking-Yale Joint Center for Plant Molecular Genetics and Agro-Biotechnology, National Laboratory of Protein Engineering and Plant Genetic Engineering, College of Life Sciences, Peking University, Beijing, 100871, China
57
Abstract
Next-generation sequencing platforms have made it possible to very rapidly map genetic mutations in Arabidopsis using whole-genome resequencing against pooled members of an F2 mapping population. In the case of recessive mutations, all individuals expressing the phenotype will be homozygous for the mutant genome at the locus responsible for the phenotype, while all other loci segregate roughly equally for both parental lines due to recombination. Importantly, genomic regions flanking the recessive mutation will be in linkage disequilibrium and therefore also be homozygous due to genetic hitchhiking. This information can be exploited to quickly and effectively identify the causal mutation. To this end, sequence data generated from members of the pooled population exhibiting the mutant phenotype are first aligned to the reference genome. Polymorphisms between the mutant and mapping line are then identified and used to determine the homozygous, nonrecombinant region harboring the mutation. Polymorphisms in the identified region are filtered to provide a short list of markers potentially responsible for the phenotype of interest, which is followed by validation at the bench. Although the focus of recent studies has been on the mapping of point mutations exhibiting recessive phenotypes, the techniques employed can be extended to incorporate more complicated scenarios such as dominant mutations and those caused by insertions or deletions in genomic sequence. This chapter describes detailed procedures for performing next-generation mapping against an Arabidopsis mutant and discusses how different mutations might be approached.
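The step of locating the homozygous, nonrecombinant region can be sketched as a sliding-window scan over mutant-allele frequencies in the pooled sample. This is a simplified illustration and not the chapter's actual procedure: it assumes markers arrive as (position, mutant allele frequency) pairs sorted along one chromosome, and `candidate_region` and its window size `n` are hypothetical.

```python
def candidate_region(markers, n=3):
    """Find the run of n consecutive markers with the highest mean
    mutant-allele frequency.

    markers: list of (position, frequency) tuples, sorted by position.
    Returns (start_position, end_position, mean_frequency).
    """
    if n <= 0 or len(markers) < n:
        raise ValueError("need at least n markers and n > 0")
    best = None
    for i in range(len(markers) - n + 1):
        window = markers[i:i + n]
        mean = sum(freq for _, freq in window) / n
        if best is None or mean > best[2]:
            best = (window[0][0], window[-1][0], mean)
    return best


if __name__ == "__main__":
    # Hypothetical pooled-F2 data: frequency near 0.5 far from the
    # mutation, approaching 1.0 in the linked region.
    pooled = [(100, 0.50), (200, 0.55), (300, 0.90),
              (400, 1.00), (500, 0.95), (600, 0.60)]
    print(candidate_region(pooled, n=3))
```

In practice the window would be defined in base pairs or centimorgans rather than marker counts, and the polymorphisms inside the selected interval would then be filtered as the abstract describes.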
58
Worthey EA. Analysis and annotation of whole-genome or whole-exome sequencing-derived variants for clinical diagnosis. Current Protocols in Human Genetics 2013; 79:9.24.1-9.24.24. [PMID: 24510652 DOI: 10.1002/0471142905.hg0924s79] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Indexed: 01/07/2023]
Abstract
Over the last several years, next-generation sequencing (NGS) has transformed genomic research through substantial advances in technology and reduction in the cost of sequencing, and also in the systems required for analysis of these large volumes of data. This technology is now being used as a standard molecular diagnostic test under particular circumstances in some clinical settings. The advances in sequencing have come so rapidly that the major bottleneck in identification of causal variants is no longer the sequencing but rather the analysis and interpretation. Interpretation of genetic findings in a clinical setting is scarcely a new challenge, but the task is increasingly complex in clinical genome-wide sequencing given the dramatic increase in dataset size and complexity. This increase requires the development of novel or repositioned analysis tools, methodologies, and processes. This unit provides an overview of these items. Specific challenges related to implementation in a clinical setting are discussed.
Affiliation(s)
- Elizabeth A Worthey
  - Department of Pediatrics, Medical College of Wisconsin, Milwaukee, Wisconsin
  - The Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, Wisconsin
  - Department of Computer Science, University of Wisconsin, Milwaukee, Wisconsin
59
Men L, Yan S, Liu G. De novo characterization of Larix gmelinii (Rupr.) Rupr. transcriptome and analysis of its gene expression induced by jasmonates. BMC Genomics 2013; 14:548. [PMID: 23941306 PMCID: PMC3765852 DOI: 10.1186/1471-2164-14-548] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 01/24/2013] [Accepted: 08/03/2013] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND: Larix gmelinii is a dominant tree species in China's boreal forests and plays an important role in the coniferous ecosystem. It is also one of the most economically important tree species in the Chinese timber industry due to the excellent water resistance and anti-corrosion properties of its wood products. Unfortunately, in Northeast China, L. gmelinii often suffers from serious attacks by diseases and insects. The application of exogenous volatile semiochemicals may induce and enhance its resistance against insect or disease attacks; however, little is known regarding the genes and molecular mechanisms related to induced resistance.
RESULTS: We performed de novo sequencing and assembly of the L. gmelinii transcriptome using a short read sequencing technology (Illumina). Chemical defenses of L. gmelinii seedlings were induced with jasmonic acid (JA) or methyl jasmonate (MeJA) for 6 hours. Transcriptomes were compared between seedlings induced by JA, MeJA and untreated controls using a tag-based digital gene expression profiling system. In a single run, 25,977,782 short reads were produced and 51,157 unigenes were obtained with a mean length of 517 nt. We sequenced 3 digital gene expression libraries and generated between 3.5 and 5.9 million raw tags, and obtained 52,040 reliable reference genes after removing redundancy. The expression of disease/insect-resistance genes (e.g., phenylalanine ammonia-lyase, coumarate 3-hydroxylase, lipoxygenase, allene oxide synthase and allene oxide cyclase) was up-regulated. The expression profiles of some abundant genes under different elicitor treatments were studied using real-time qRT-PCR. The results showed that the expression levels of disease/insect-resistance genes in the seedling samples induced by JA and MeJA were higher than those in the control group. The seedlings induced with MeJA showed the strongest increases in disease/insect-resistance gene expression.
CONCLUSIONS: Both JA- and MeJA-induced seedlings of L. gmelinii showed significantly increased expression of disease/insect-resistance genes. MeJA seemed to have a stronger induction effect than JA on expression of disease/insect-resistance related genes. This study provides sequence resources for L. gmelinii research and will help us to better understand the functions of disease/insect-resistance genes and the molecular mechanisms of secondary metabolism in L. gmelinii.
Affiliation(s)
- Lina Men
  - State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, No. 26 Hexing Road, Harbin 150040, P. R. China.
60
Bouyioukos C, Moscou MJ, Champouret N, Hernández-Pinzón I, Ward ER, Wulff BBH. Characterisation and analysis of the Aegilops sharonensis transcriptome, a wild relative of wheat in the Sitopsis section. PLoS One 2013; 8:e72782. [PMID: 23951332 PMCID: PMC3738571 DOI: 10.1371/journal.pone.0072782] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 04/26/2013] [Accepted: 07/11/2013] [Indexed: 12/19/2022] Open
Abstract
Aegilops sharonensis Eig (Sharon goatgrass) is a wild diploid relative of wheat within the Sitopsis section of Aegilops. This species represents an untapped reservoir of genetic diversity for traits of agronomic importance, especially as a source of novel disease resistance. To gain a foothold in this genetic resource, we sequenced the cDNA from leaf tissue of two geographically distinct Ae. sharonensis accessions (1644 and 2232) using the 454 Life Sciences platform. We compared the results of two different assembly programs using different parameter sets to generate 13 distinct assemblies in an attempt to maximize representation of the gene space in de novo transcriptome assembly. The most sensitive assembly (71,029 contigs; N50 674 nts) retrieved 18,684 unique best reciprocal BLAST hits (BRBH) against six previously characterised grass proteomes while the most specific assembly (30,609 contigs; N50 815 nts) retrieved 15,687 BRBH. We combined these two assemblies into a set of 62,243 non-redundant sequences and identified 139 belonging to plant disease resistance genes of the nucleotide binding leucine-rich repeat class. Based on the non-redundant sequences, we predicted 37,743 single nucleotide polymorphisms (SNP), equivalent to one per 1,142 bp. We estimated the level of heterozygosity as 1.6% in accession 1644 and 30.1% in 2232. The Ae. sharonensis leaf transcriptome provides a rich source of sequence and SNPs for this wild wheat relative. These sequences can be used with existing monocot genome sequences and EST sequence collections (e.g. barley, Brachypodium, wheat, rice, maize and Sorghum) to assist with genetic and physical mapping and candidate gene identification in Ae. sharonensis. These resources provide an initial framework to further build on and characterise the genetic and genomic structure of Ae. sharonensis.
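The N50 statistic used here to compare assemblies can be computed directly from contig lengths. The sketch below uses the common definition (the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length); the abstract does not state which variant the authors used, and the function name `n50` is an assumption.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest and accumulated until
    the running total reaches at least half of the assembly length;
    the length of the contig that crosses that point is the N50.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


if __name__ == "__main__":
    # Hypothetical contig lengths in nucleotides.
    print(n50([80, 70, 50, 40, 30, 20]))
```

Because N50 rewards fewer, longer contigs, the higher N50 of the smaller assembly in the abstract (815 nts against 674 nts) is what the statistic would be expected to show when short contigs are removed.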
Affiliation(s)
- Eric R. Ward
  - The Sainsbury Laboratory, Norwich, United Kingdom
61
Toepel J, Illmer-Kephalides M, Jaenicke S, Straube J, May P, Goesmann A, Kruse O. New insights into Chlamydomonas reinhardtii hydrogen production processes by combined microarray/RNA-seq transcriptomics. Plant Biotechnology Journal 2013; 11:717-33. [PMID: 23551401 DOI: 10.1111/pbi.12062] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 10/11/2012] [Revised: 01/07/2013] [Accepted: 02/09/2013] [Indexed: 05/06/2023]
Abstract
Hydrogen production with Chlamydomonas reinhardtii induced by sulphur starvation is a multiphase process during which the cell's internal metabolism is completely remodelled. The first cellular response is characterized by induction of genes with regulatory functions, followed by a total remodelling of the metabolism to provide reduction equivalents for cellular processes. We were able to characterize all major processes that provide energy and reduction equivalents during hydrogen production. Furthermore, C. reinhardtii showed a strong transcript increase for gene models responsible for stress response and detoxification of oxygen radicals. Finally, we were able to determine potential bottlenecks and target genes for manipulation to increase hydrogen production or to prolong the hydrogen production phase. The investigation of transcriptomic changes during the time course of hydrogen production in C. reinhardtii with microarrays and RNA-seq revealed new insights into the regulation and remodelling of the cell's internal metabolism. Both methods showed a good correlation. The microarray platform can be used as a reliable standard tool for routine gene expression analysis. RNA-seq additionally allowed a detailed time-dependent study of gene expression and determination of new genes involved in the hydrogen production process.
Affiliation(s)
- Jörg Toepel
  - Algae Biotechnology & Bioenergy Group, Department of Biology/Center for Biotechnology, Bielefeld University, Bielefeld, Germany
62
Zhou W, Hu Y, Sui Z, Fu F, Wang J, Chang L, Guo W, Li B. Genome survey sequencing and genetic background characterization of Gracilariopsis lemaneiformis (Rhodophyta) based on next-generation sequencing. PLoS One 2013; 8:e69909. [PMID: 23875008 PMCID: PMC3713064 DOI: 10.1371/journal.pone.0069909] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.9] [Received: 01/17/2013] [Accepted: 06/13/2013] [Indexed: 12/15/2022] Open
Abstract
Gracilariopsis lemaneiformis has a high economic value and is one of the most important aquaculture species in China. Despite its economic importance, it has remained largely unstudied at the genomic level. In this study, we conducted a genome survey of Gp. lemaneiformis using next-generation sequencing (NGS) technologies. In total, 18.70 Gb of high-quality sequence data with an estimated genome size of 97 Mb were obtained by HiSeq 2000 sequencing for Gp. lemaneiformis. These reads were assembled into 160,390 contigs with an N50 length of 3.64 kb, which were further assembled into 125,685 scaffolds with a total length of 81.17 Mb. Genome analysis predicted 3490 genes and a GC content of 48%. The identified genes have an average transcript length of 1,429 bp, an average coding sequence size of 1,369 bp, 1.36 exons per gene, exon length of 1,008 bp, and intron length of 191 bp. In the initial assembled scaffolds, transposable elements constituted 54.64% (44.35 Mb) of the genome, and 7737 simple sequence repeats (SSRs) were identified. Among these SSRs, the trinucleotide repeat type was the most abundant (up to 73.20% of total SSRs), followed by the di- (17.41%), tetra- (5.49%), hexa- (2.90%), and penta- (1.00%) nucleotide repeat types. These characteristics suggest that Gp. lemaneiformis is a model organism for genetic study. This is the first report of genome-wide characterization within this taxon.
Affiliation(s)
- Wei Zhou
  - Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
- Yiyi Hu
  - Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
- Zhenghong Sui
  - Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
- Feng Fu
  - Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
  - Ocean School, Yantai University, Yantai, China
- Jinguo Wang
  - Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
- Lianpeng Chang
  - Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
- Weihua Guo
  - Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
- Binbin Li
  - Key Laboratory of Marine Genetics and Breeding, College of Marine Life Sciences, Ocean University of China, Qingdao, China
63
Alexander A, Steel D, Slikas B, Hoekzema K, Carraher C, Parks M, Cronn R, Baker CS. Low diversity in the mitogenome of sperm whales revealed by next-generation sequencing. Genome Biol Evol 2013; 5:113-29. [PMID: 23254394 PMCID: PMC3595033 DOI: 10.1093/gbe/evs126] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Indexed: 12/05/2022] Open
Abstract
Large population sizes and global distributions generally associate with high mitochondrial DNA control region (CR) diversity. The sperm whale (Physeter macrocephalus) is an exception, showing low CR diversity relative to other cetaceans; however, diversity levels throughout the remainder of the sperm whale mitogenome are unknown. We sequenced 20 mitogenomes from 17 sperm whales representative of worldwide diversity using Next Generation Sequencing (NGS) technologies (Illumina GAIIx, Roche 454 GS Junior). Resequencing of three individuals with both NGS platforms and partial Sanger sequencing showed low discrepancy rates (454-Illumina: 0.0071%; Sanger-Illumina: 0.0034%; and Sanger-454: 0.0023%) confirming suitability of both NGS platforms for investigating low mitogenomic diversity. Using the 17 sperm whale mitogenomes in a phylogenetic reconstruction with 41 other species, including 11 new dolphin mitogenomes, we tested two hypotheses for the low CR diversity. First, the hypothesis that CR-specific constraints have reduced diversity solely in the CR was rejected as diversity was low throughout the mitogenome, not just in the CR (overall diversity π = 0.096%; protein-coding 3rd codon = 0.22%; CR = 0.35%), and CR phylogenetic signal was congruent with protein-coding regions. Second, the hypothesis that slow substitution rates reduced diversity throughout the sperm whale mitogenome was rejected as sperm whales had significantly higher rates of CR evolution and no evidence of slow coding region evolution relative to other cetaceans. The estimated time to most recent common ancestor for sperm whale mitogenomes was 72,800 to 137,400 years ago (95% highest probability density interval), consistent with previous hypotheses of a bottleneck or selective sweep as likely causes of low mitogenome diversity.
Affiliation(s)
- Alana Alexander
- Marine Mammal Institute, Hatfield Marine Science Center, Oregon State University, OR, USA.
|
64
|
Lin WD, Chang KP, Wang CH, Chen SJ, Fan PC, Weng WC, Lin WC, Tsai Y, Tsai CH, Chou IC, Tsai FJ. Molecular aspects of Dravet syndrome patients in Taiwan. Clin Chim Acta 2013; 421:34-40. [DOI: 10.1016/j.cca.2013.02.015] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 07/17/2012] [Revised: 02/10/2013] [Accepted: 02/12/2013] [Indexed: 01/08/2023]
|
65
|
Effects of GC bias in next-generation-sequencing data on de novo genome assembly. PLoS One 2013; 8:e62856. [PMID: 23638157 PMCID: PMC3639258 DOI: 10.1371/journal.pone.0062856] [Citation(s) in RCA: 162] [Impact Index Per Article: 14.7] [Received: 09/13/2012] [Accepted: 03/26/2013] [Indexed: 11/23/2022]
Abstract
Next-generation-sequencing (NGS) has revolutionized the field of genome assembly because of its much higher data throughput and much lower cost compared with traditional Sanger sequencing. However, NGS poses new computational challenges to de novo genome assembly. Among the challenges, GC bias in NGS data is known to aggravate genome assembly. However, it is not clear to what extent GC bias affects genome assembly in general. In this work, we conduct a systematic analysis on the effects of GC bias on genome assembly. Our analyses reveal that GC bias only lowers assembly completeness when the degree of GC bias is above a threshold. At a strong GC bias, the assembly fragmentation due to GC bias can be explained by the low coverage of reads in the GC-poor or GC-rich regions of a genome. This effect is observed for all the assemblers under study. Increasing the total amount of NGS data thus rescues the assembly fragmentation because of GC bias. However, the amount of data needed for a full rescue depends on the distribution of GC contents. Both low and high coverage depths due to GC bias lower the accuracy of assembly. These pieces of information provide guidance toward a better de novo genome assembly in the presence of GC bias.
|
66
|
Jeong IS, Yoon UH, Lee GS, Ji HS, Lee HJ, Han CD, Hahn JH, An G, Kim TH. SNP-based analysis of genetic diversity in anther-derived rice by whole genome sequencing. Rice (New York, N.Y.) 2013; 6:6. [PMID: 24280451 PMCID: PMC4883692 DOI: 10.1186/1939-8433-6-6] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 09/27/2012] [Accepted: 03/06/2013] [Indexed: 05/10/2023]
Abstract
BACKGROUND: Anther culture has the advantage of yielding homozygous progeny through induced doubling of haploid chromosomes and of improving selection efficiency for valuable agronomic traits. Therefore, anther culture is widely used to breed new varieties and to induce genetic variation in several crops, including rice. Genome sequencing technologies allow the detection of a massive number of DNA polymorphisms, such as SNPs and indels, between closely related cultivars. These DNA polymorphisms permit the rapid identification of genetic diversity among cultivars and of the genomic locations of heritable traits. To estimate sequence diversity derived from anther culture, we performed whole-genome resequencing of five Korean rice accessions, including three anther culture lines (BLB, HY-04 and HY-08), their progenitor cultivar (Hwayeong), and an additional japonica cultivar (Dongjin). RESULTS: A total of 1,165 × 10⁶ raw reads were generated with over 58× coverage, which detected 1,154,063 DNA polymorphisms between the Korean rice accessions and Nipponbare. We observed that in Hwayeong and its progenies, 0.64 SNP was found per kb of the Nipponbare genome, while Dongjin, bred by a conventional breeding method, had a lower number of SNPs (0.45 SNP/kb). Among the 1,154,063 DNA polymorphisms, 29,269 non-synonymous SNPs were located on 30,013 genes, and these genes were functionally classified based on gene ontology (GO). We also analyzed line-specific SNPs, which were estimated at 1 to 3% of the total SNPs. The frequency of non-synonymous SNPs in each accession ranged from 26 SNPs in Hwayeong to 214 SNPs in HY-04. CONCLUSIONS: The genetic difference we detected between the progenies derived from anther culture and their mother cultivar is due to somaclonal variation during the tissue culture process, such as karyotype change, chromosome rearrangement, gene amplification and deletion, transposable element activity, and DNA methylation. 
Detection of genome-wide DNA polymorphisms by high-throughput sequencer enabled to identify sequence diversity derived from anther culturing and genomic locations of heritable traits. Furthermore, it will provide an invaluable resource to identify molecular markers and genes associated with diverse traits of agronomical importance.
Affiliation(s)
- In-Seon Jeong
- Rural Development Administration, Genomics Division, National Academy of Agricultural Science, Suwon, 441-707 Republic of Korea
- Ung-Han Yoon
- Rural Development Administration, Genomics Division, National Academy of Agricultural Science, Suwon, 441-707 Republic of Korea
- Gang-Seob Lee
- Rural Development Administration, Genomics Division, National Academy of Agricultural Science, Suwon, 441-707 Republic of Korea
- Hyeon-So Ji
- Rural Development Administration, Genomics Division, National Academy of Agricultural Science, Suwon, 441-707 Republic of Korea
- Hyun-Ju Lee
- Rural Development Administration, Genomics Division, National Academy of Agricultural Science, Suwon, 441-707 Republic of Korea
- Chang-Deok Han
- Department of Biochemistry, Gyeongsang National University, Jinju, 660-701 Republic of Korea
- Jang-Ho Hahn
- Rural Development Administration, Genomics Division, National Academy of Agricultural Science, Suwon, 441-707 Republic of Korea
- Gynheung An
- Department of Plant Molecular Systems Biotechnology and Crop Biotech Institute, Kyung Hee University, Yongin, 446-701 Republic of Korea
- Tae-Ho Kim
- Rural Development Administration, Genomics Division, National Academy of Agricultural Science, Suwon, 441-707 Republic of Korea
|
67
|
Fulton R, d’Offay J, Eberle R. Bovine herpesvirus-1: Comparison and differentiation of vaccine and field strains based on genomic sequence variation. Vaccine 2013; 31:1471-9. [DOI: 10.1016/j.vaccine.2013.01.013] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 11/19/2012] [Revised: 01/03/2013] [Accepted: 01/04/2013] [Indexed: 11/29/2022]
|
68
|
Chin ELH, da Silva C, Hegde M. Assessment of clinical analytical sensitivity and specificity of next-generation sequencing for detection of simple and complex mutations. BMC Genet 2013; 14:6. [PMID: 23418865 PMCID: PMC3599218 DOI: 10.1186/1471-2156-14-6] [Citation(s) in RCA: 61] [Impact Index Per Article: 5.5] [Received: 02/01/2012] [Accepted: 02/08/2013] [Indexed: 09/03/2023]
Abstract
Background Detecting mutations in disease genes by full gene sequence analysis is common in clinical diagnostic laboratories. Sanger dideoxy terminator sequencing allows for rapid development and implementation of sequencing assays in the clinical laboratory, but it has limited throughput, and due to cost constraints, only allows analysis of one or at most a few genes in a patient. Next-generation sequencing (NGS), on the other hand, has evolved rapidly, although to date it has mainly been used for large-scale genome sequencing projects and is beginning to be used in the clinical diagnostic testing. One advantage of NGS is that many genes can be analyzed easily at the same time, allowing for mutation detection when there are many possible causative genes for a specific phenotype. In addition, regions of a gene typically not tested for mutations, like deep intronic and promoter mutations, can also be detected. Results Here we use 20 previously characterized Sanger-sequenced positive controls in disease-causing genes to demonstrate the utility of NGS in a clinical setting using standard PCR based amplification to assess the analytical sensitivity and specificity of the technology for detecting all previously characterized changes (mutations and benign SNPs). The positive controls chosen for validation range from simple substitution mutations to complex deletion and insertion mutations occurring in autosomal dominant and recessive disorders. The NGS data was 100% concordant with the Sanger sequencing data identifying all 119 previously identified changes in the 20 samples. Conclusions We have demonstrated that NGS technology is ready to be deployed in clinical laboratories. However, NGS and associated technologies are evolving, and clinical laboratories will need to invest significantly in staff and infrastructure to build the necessary foundation for success.
Affiliation(s)
- Ephrem L H Chin
- Department of Human Genetics, Emory University, Michael Street, Atlanta, GA, USA
|
69
|
Iyer R, Stepanov VG, Iken B. Isolation and molecular characterization of a novel Pseudomonas putida strain capable of degrading organophosphate and aromatic compounds. ACTA ACUST UNITED AC 2013. [DOI: 10.4236/abc.2013.36065] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 11/20/2022]
|
70
|
Tabata R, Kamiya T, Shigenobu S, Yamaguchi K, Yamada M, Hasebe M, Fujiwara T, Sawa S. Identification of an EMS-induced causal mutation in a gene required for boron-mediated root development by low-coverage genome re-sequencing in Arabidopsis. Plant Signaling & Behavior 2013; 8:e22534. [PMID: 23104114 PMCID: PMC3745560 DOI: 10.4161/psb.22534] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 09/01/2012] [Revised: 10/12/2012] [Accepted: 10/12/2012] [Indexed: 05/22/2023]
Abstract
Next-generation sequencing (NGS) technologies enable the rapid production of an enormous quantity of sequence data. These powerful new technologies allow the identification of mutations by whole-genome sequencing. However, most reported NGS-based mapping methods, which are based on bulked segregant analysis, are costly and laborious. To address these limitations, we designed a versatile NGS-based mapping method that consists of a combination of low- to medium-coverage multiplex SOLiD (Sequencing by Oligonucleotide Ligation and Detection) and classical genetic rough mapping. Using only low to medium coverage reduces the SOLiD sequencing costs and, since just 10 to 20 mutant F2 plants are required for rough mapping, the operation is simple enough to handle in a laboratory with limited space and funding. As a proof of principle, we successfully applied this method to identify the CTR1, which is involved in boron-mediated root development, from among a population of high boron requiring Arabidopsis thaliana mutants. Our work demonstrates that this NGS-based mapping method is a moderately priced and versatile method that can readily be applied to other model organisms.
Affiliation(s)
- Ryo Tabata
- Graduate School of Science and Technology; Kumamoto University; Kumamoto, Japan
- Takehiro Kamiya
- Department of Applied Biological Chemistry; Graduate School of Agricultural and Life Sciences; University of Tokyo; Tokyo, Japan
- Shuji Shigenobu
- Functional Genomics Facility; National Institute for Basic Biology; Okazaki, Japan
- Katsushi Yamaguchi
- Functional Genomics Facility; National Institute for Basic Biology; Okazaki, Japan
- Masashi Yamada
- Department of Biology and IGSP Center for Systems Biology; Duke University; Durham, NC USA
- Mitsuyasu Hasebe
- Division of Evolutionary Biology; National Institute for Basic Biology; Okazaki, Japan
- School of Life Science; The Graduate University for Advanced Studies; Okazaki, Japan
- ERATO; Japan Science and Technology Agency; Okazaki, Japan
- Toru Fujiwara
- Department of Applied Biological Chemistry; Graduate School of Agricultural and Life Sciences; University of Tokyo; Tokyo, Japan
- Shinichiro Sawa
- Graduate School of Science and Technology; Kumamoto University; Kumamoto, Japan
|
71
|
Hayes M, Pyon YS, Li J. A model-based clustering method for genomic structural variant prediction and genotyping using paired-end sequencing data. PLoS One 2012; 7:e52881. [PMID: 23300804 PMCID: PMC3531386 DOI: 10.1371/journal.pone.0052881] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 08/28/2012] [Accepted: 11/22/2012] [Indexed: 01/08/2023]
Abstract
Structural variation (SV) has been reported to be associated with numerous diseases such as cancer. With the advent of next generation sequencing (NGS) technologies, various types of SV can be potentially identified. We propose a model based clustering approach utilizing a set of features defined for each type of SV events. Our method, termed SVMiner, not only provides a probability score for each candidate, but also predicts the heterozygosity of genomic deletions. Extensive experiments on genome-wide deep sequencing data have demonstrated that SVMiner is robust against the variability of a single cluster feature, and it significantly outperforms several commonly used SV detection programs. SVMiner can be downloaded from http://cbc.case.edu/svminer/.
Affiliation(s)
- Matthew Hayes
- Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, Ohio, United States of America
- Yoon Soo Pyon
- Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, Ohio, United States of America
- Jing Li
- Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, Ohio, United States of America
|
72
|
d'Offay JM, Fulton RW, Eberle R. Complete genome sequence of the NVSL BoHV-1.1 Cooper reference strain. Arch Virol 2012; 158:1109-13. [PMID: 23254967 DOI: 10.1007/s00705-012-1574-6] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 10/30/2012] [Accepted: 11/06/2012] [Indexed: 11/28/2022]
Abstract
The only complete genome sequence available for bovine herpesvirus 1 (BoHV-1) is a composite sequence derived from four different BoHV-1.1 strains and one BoHV-1.2 strain. Such a chimeric genome sequence is problematic for molecular genetic studies on this virus. We report here the complete genome sequence for the BoHV-1.1 NVSL reference strain Cooper. Although similar to the published chimeric genome sequence, there are a number of nucleotide substitutions and deletions/insertions across the genome, many of which affect coding sequences.
Affiliation(s)
- Jean M d'Offay
- Department of Veterinary Pathobiology, Center for Veterinary Health Sciences, Oklahoma State University, 250 McElroy Hall, Stillwater, OK 74078, USA.
|
73
|
Whole Genome Sequencing and a New Bioinformatics Platform Allow for Rapid Gene Identification in D. melanogaster EMS Screens. Biology 2012; 1:766-77. [PMID: 24832518 PMCID: PMC4009818 DOI: 10.3390/biology1030766] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 10/05/2012] [Revised: 11/14/2012] [Accepted: 11/20/2012] [Indexed: 11/17/2022]
Abstract
Forward genetic screens in Drosophila melanogaster using ethyl methanesulfonate (EMS) mutagenesis are a powerful approach for identifying genes that modulate specific biological processes in an in vivo setting. The mapping of genes that contain randomly-induced point mutations has become more efficient in Drosophila thanks to the maturation and availability of many types of genetic tools. However, classic approaches to gene mapping are relatively slow and ultimately require extensive Sanger sequencing of candidate chromosomal loci. With the advent of new high-throughput sequencing techniques, it is increasingly efficient to directly re-sequence the whole genome of model organisms. This approach, in combination with traditional chromosomal mapping, has the potential to greatly simplify and accelerate mutation identification in mutants generated in EMS screens. Here we show that next-generation sequencing (NGS) is an accurate and efficient tool for high-throughput sequencing and mutation discovery in Drosophila melanogaster. As a test case, mutant strains of Drosophila that exhibited long-term survival of severed peripheral axons were identified in a forward EMS mutagenesis. All mutants were recessive and fell into a single lethal complementation group, which suggested that a single gene was responsible for the protective axon degenerative phenotype. Whole genome sequencing of these genomes identified the underlying gene ect4. To improve the process of genome wide mutation identification, we developed Genomes Management Application (GEM.app, https://genomics.med.miami.edu), a graphical online user interface to a custom query framework. Using a custom GEM.app query, we were able to identify that each mutant carried a unique non-sense mutation in the gene ect4 (dSarm), which was recently shown by Osterloh et al. to be essential for the activation of axonal degeneration. 
Our results demonstrate the current advantages and limitations of NGS in Drosophila and we introduce GEM.app as a simple yet powerful genomics analysis tool for the Drosophila community. At a current cost of <$1,000 per genome, NGS should thus become a standard gene discovery tool in EMS induced genetic forward screens.
|
74
|
MacConaill LE, Van Hummelen P, Meyerson M, Hahn WC. Clinical implementation of comprehensive strategies to characterize cancer genomes: opportunities and challenges. Cancer Discov 2012; 1:297-311. [PMID: 21935500 DOI: 10.1158/2159-8290.cd-11-0110] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Indexed: 12/19/2022]
Abstract
An increasing number of anticancer therapeutic agents target specific mutant proteins that are expressed by many different tumor types. Recent evidence suggests that the selection of patients whose tumors harbor specific genetic alterations identifies the subset of patients who are most likely to benefit from the use of such agents. As the number of genetic alterations that provide diagnostic and/or therapeutic information increases, the comprehensive characterization of cancer genomes will be necessary to understand the spectrum of distinct genomic alterations in cancer, to identify patients who are likely to respond to particular therapies, and to facilitate the selection of treatment modalities. Rapid developments in new technologies for genomic analysis now provide the means to perform comprehensive analyses of cancer genomes. In this article, we review the current state of cancer genome analysis and discuss the challenges and opportunities necessary to implement these technologies in a clinical setting.
Affiliation(s)
- Laura E MacConaill
- Center for Cancer Genome Discovery, Dana-Farber Cancer Institute and Harvard Medical School, Boston, Massachusetts 02215, USA
|
75
|
Obholzer N, Swinburne IA, Schwab E, Nechiporuk AV, Nicolson T, Megason SG. Rapid positional cloning of zebrafish mutations by linkage and homozygosity mapping using whole-genome sequencing. Development 2012; 139:4280-90. [PMID: 23052906 DOI: 10.1242/dev.083931] [Citation(s) in RCA: 68] [Impact Index Per Article: 5.7] [Indexed: 11/20/2022]
Abstract
Forward genetic screens in zebrafish have identified >9000 mutants, many of which are potential disease models. Most mutants remain molecularly uncharacterized because of the high cost, time and labor investment required for positional cloning. These costs limit the benefit of previous genetic screens and discourage future screens. Drastic improvements in DNA sequencing technology could dramatically improve the efficiency of positional cloning in zebrafish and other model organisms, but the best strategy for cloning by sequencing has yet to be established. Using four zebrafish inner ear mutants, we developed and compared two approaches for 'cloning by sequencing': one based on bulk segregant linkage (BSFseq) and one based on homozygosity mapping (HMFseq). Using BSFseq we discovered that mutations in lmx1b and jagged1b cause abnormal ear morphogenesis. With HMFseq we validated that the disruption of cdh23 abolishes the ear's sensory functions and identified a candidate lesion in lhfpl5a predicted to cause nonsyndromic deafness. The success of HMFseq shows that the high intrastrain polymorphism rate in zebrafish eliminates the need for time-consuming map crosses. Additionally, we analyzed diversity in zebrafish laboratory strains to find areas of elevated diversity and areas of fixed homozygosity, reinforcing recent findings that genome diversity is clustered. We present a database of >15 million sequence variants that provides much of this approach's power. In our four test cases, only a single candidate single nucleotide polymorphism (SNP) remained after subtracting all database SNPs from a mutant's critical region. The saturation of the common SNP database and our open source analysis pipeline MegaMapper will improve the pace at which the zebrafish community makes unique discoveries relevant to human health.
Affiliation(s)
- Nikolaus Obholzer
- Department of Systems Biology, Harvard Medical School, 200 Longwood Avenue, Boston, MA 02115, USA
|
76
|
[Application of next generation sequencing in microRNA detection]. Yi Chuan = Hereditas 2012; 34:784-92. [PMID: 22698751 DOI: 10.3724/sp.j.1005.2012.00784] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/25/2022]
Abstract
MicroRNAs (miRNAs) are a class of ~22-nt non-coding RNAs. They are evolutionarily conserved and play essential roles in the post-transcriptional regulation of gene expression. Rapidly developing next-generation sequencing (NGS) has important applications in miRNA detection. This review focuses on the mechanisms of three NGS platforms and their applications in miRNA detection. Compared with traditional methods, NGS offers major advantages: high throughput, precision, accuracy, and repeatability. Its applications include the discovery of new miRNAs and the detection of miRNA*, miRNA editing, isomiRs, and target mRNAs. As NGS develops, the cost of sequencing is declining, which makes it possible for NGS to be widely used in the coming years. Next-generation sequencing will greatly advance miRNA research.
|
77
|
Wong A, Rodrigue N, Kassen R. Genomics of adaptation during experimental evolution of the opportunistic pathogen Pseudomonas aeruginosa. PLoS Genet 2012; 8:e1002928. [PMID: 23028345 PMCID: PMC3441735 DOI: 10.1371/journal.pgen.1002928] [Citation(s) in RCA: 110] [Impact Index Per Article: 9.2] [Received: 12/22/2011] [Accepted: 07/15/2012] [Indexed: 01/03/2023]
Abstract
Adaptation is likely to be an important determinant of the success of many pathogens, for example when colonizing a new host species, when challenged by antibiotic treatment, or in governing the establishment and progress of long-term chronic infection. Yet, the genomic basis of adaptation is poorly understood in general, and for pathogens in particular. We investigated the genetics of adaptation to cystic fibrosis-like culture conditions in the presence and absence of fluoroquinolone antibiotics using the opportunistic pathogen Pseudomonas aeruginosa. Whole-genome sequencing of experimentally evolved isolates revealed parallel evolution at a handful of known antibiotic resistance genes. While the level of antibiotic resistance was largely determined by these known resistance genes, the costs of resistance were instead attributable to a number of mutations that were specific to individual experimental isolates. Notably, stereotypical quinolone resistance mutations in DNA gyrase often co-occurred with other mutations that, together, conferred high levels of resistance but no consistent cost of resistance. This result may explain why these mutations are so prevalent in clinical quinolone-resistant isolates. In addition, genes involved in cyclic-di-GMP signalling were repeatedly mutated in populations evolved in viscous culture media, suggesting a shared mechanism of adaptation to this CF–like growth environment. Experimental evolutionary approaches to understanding pathogen adaptation should provide an important complement to studies of the evolution of clinical isolates. Pathogens face a hostile and often novel environment when infecting a new host, and adaptation to this environment can be critical to a pathogen's survival. The genetic basis of pathogen adaptation is in turn important for treatment, since the consistency with which therapies succeed may depend on the extent to which a pathogen adapts via the same routes in different patients. 
In this study, we investigate adaptation of the bacterium Pseudomonas aeruginosa to laboratory conditions that resemble the lungs of cystic fibrosis patients and to quinolone antibiotics. We find that a handful of genes and genetic pathways are repeatedly involved in adaptation to each condition. Nonetheless, other, less common mutations can play important roles in determining fitness, complicating strategies aimed at reducing the prevalence of antibiotic resistance.
Affiliation(s)
- Alex Wong
- Department of Biology, Carleton University, Ottawa, Canada.
|
78
|
Lin Y, Li Z, Ozsolak F, Kim SW, Arango-Argoty G, Liu TT, Tenenbaum SA, Bailey T, Monaghan AP, Milos PM, John B. An in-depth map of polyadenylation sites in cancer. Nucleic Acids Res 2012; 40:8460-71. [PMID: 22753024 PMCID: PMC3458571 DOI: 10.1093/nar/gks637] [Citation(s) in RCA: 115] [Impact Index Per Article: 9.6] [Received: 11/08/2011] [Revised: 05/16/2012] [Accepted: 06/06/2012] [Indexed: 12/22/2022]
Abstract
We present a comprehensive map of over 1 million polyadenylation sites and quantify their usage in major cancers and tumor cell lines using direct RNA sequencing. We built the Expression and Polyadenylation Database to enable the visualization of the polyadenylation maps in various cancers and to facilitate the discovery of novel genes and gene isoforms that are potentially important to tumorigenesis. Analyses of polyadenylation sites indicate that a large fraction (∼30%) of mRNAs contain alternative polyadenylation sites in their 3' untranslated regions, independent of the cell type. The shortest 3' untranslated region isoforms are preferentially upregulated in cancer tissues, genome-wide. Candidate targets of alternative polyadenylation-mediated upregulation of short isoforms include POLR2K, and signaling cascades of cell-cell and cell-extracellular matrix contact, particularly involving regulators of Rho GTPases. Polyadenylation maps also helped to improve 3' untranslated region annotations and identify candidate regulatory marks such as sequence motifs, H3K36Me3 and Pabpc1 that are isoform dependent and occur in a position-specific manner. In summary, these results highlight the need to go beyond monitoring only the cumulative transcript levels for a gene, to separately analysing the expression of its RNA isoforms.
Affiliation(s)
- Yuefeng Lin
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, Helicos BioSciences Corporation, One Kendall Square, Cambridge, MA 02139, College of Nanoscale Science and Engineering, University at Albany-SUNY, Albany, NY, USA, Institute for Molecular Bioscience, the University of Queensland, Queensland, Australia and Department of Neurobiology, University of Pittsburgh, 3501 Fifth Avenue, Pittsburgh, PA 15260, USA
- Zhihua Li
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, Helicos BioSciences Corporation, One Kendall Square, Cambridge, MA 02139, College of Nanoscale Science and Engineering, University at Albany-SUNY, Albany, NY, USA, Institute for Molecular Bioscience, the University of Queensland, Queensland, Australia and Department of Neurobiology, University of Pittsburgh, 3501 Fifth Avenue, Pittsburgh, PA 15260, USA
- Fatih Ozsolak
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, Helicos BioSciences Corporation, One Kendall Square, Cambridge, MA 02139, College of Nanoscale Science and Engineering, University at Albany-SUNY, Albany, NY, USA, Institute for Molecular Bioscience, the University of Queensland, Queensland, Australia and Department of Neurobiology, University of Pittsburgh, 3501 Fifth Avenue, Pittsburgh, PA 15260, USA
- Sang Woo Kim
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, Helicos BioSciences Corporation, One Kendall Square, Cambridge, MA 02139, College of Nanoscale Science and Engineering, University at Albany-SUNY, Albany, NY, USA, Institute for Molecular Bioscience, the University of Queensland, Queensland, Australia and Department of Neurobiology, University of Pittsburgh, 3501 Fifth Avenue, Pittsburgh, PA 15260, USA
- Gustavo Arango-Argoty
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, Helicos BioSciences Corporation, One Kendall Square, Cambridge, MA 02139, College of Nanoscale Science and Engineering, University at Albany-SUNY, Albany, NY, USA, Institute for Molecular Bioscience, the University of Queensland, Queensland, Australia and Department of Neurobiology, University of Pittsburgh, 3501 Fifth Avenue, Pittsburgh, PA 15260, USA
- Teresa T. Liu
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, Helicos BioSciences Corporation, One Kendall Square, Cambridge, MA 02139, College of Nanoscale Science and Engineering, University at Albany-SUNY, Albany, NY, USA, Institute for Molecular Bioscience, the University of Queensland, Queensland, Australia and Department of Neurobiology, University of Pittsburgh, 3501 Fifth Avenue, Pittsburgh, PA 15260, USA
- Scott A. Tenenbaum
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, Helicos BioSciences Corporation, One Kendall Square, Cambridge, MA 02139, College of Nanoscale Science and Engineering, University at Albany-SUNY, Albany, NY, USA, Institute for Molecular Bioscience, the University of Queensland, Queensland, Australia and Department of Neurobiology, University of Pittsburgh, 3501 Fifth Avenue, Pittsburgh, PA 15260, USA
- Timothy Bailey
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, Helicos BioSciences Corporation, One Kendall Square, Cambridge, MA 02139, College of Nanoscale Science and Engineering, University at Albany-SUNY, Albany, NY, USA, Institute for Molecular Bioscience, the University of Queensland, Queensland, Australia and Department of Neurobiology, University of Pittsburgh, 3501 Fifth Avenue, Pittsburgh, PA 15260, USA
- A. Paula Monaghan
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, Helicos BioSciences Corporation, One Kendall Square, Cambridge, MA 02139, College of Nanoscale Science and Engineering, University at Albany-SUNY, Albany, NY, USA, Institute for Molecular Bioscience, the University of Queensland, Queensland, Australia and Department of Neurobiology, University of Pittsburgh, 3501 Fifth Avenue, Pittsburgh, PA 15260, USA
- Patrice M. Milos
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, Helicos BioSciences Corporation, One Kendall Square, Cambridge, MA 02139, College of Nanoscale Science and Engineering, University at Albany-SUNY, Albany, NY, USA, Institute for Molecular Bioscience, the University of Queensland, Queensland, Australia and Department of Neurobiology, University of Pittsburgh, 3501 Fifth Avenue, Pittsburgh, PA 15260, USA
- Bino John
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, Helicos BioSciences Corporation, One Kendall Square, Cambridge, MA 02139, College of Nanoscale Science and Engineering, University at Albany-SUNY, Albany, NY, USA, Institute for Molecular Bioscience, the University of Queensland, Queensland, Australia and Department of Neurobiology, University of Pittsburgh, 3501 Fifth Avenue, Pittsburgh, PA 15260, USA
| |
Collapse
|
79
Elsharawy A, Forster M, Schracke N, Keller A, Thomsen I, Petersen BS, Stade B, Stähler P, Schreiber S, Rosenstiel P, Franke A. Improving mapping and SNP-calling performance in multiplexed targeted next-generation sequencing. BMC Genomics 2012; 13:417. [PMID: 22913592 PMCID: PMC3563481 DOI: 10.1186/1471-2164-13-417] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 11/10/2011] [Accepted: 08/10/2012] [Indexed: 11/10/2022] Open
Abstract
Background: Compared to classical genotyping, targeted next-generation sequencing (tNGS) can be custom-designed to interrogate entire genomic regions of interest, in order to detect novel as well as known variants. To bring down the per-sample cost, one approach is to pool barcoded NGS libraries before sample enrichment. Still, we lack a complete understanding of how this multiplexed tNGS approach and the varying performance of the ever-evolving analytical tools can affect the quality of variant discovery. Therefore, we evaluated the impact of different software tools and analytical approaches on the discovery of single nucleotide polymorphisms (SNPs) in multiplexed tNGS data. To generate our own test model, we combined a sequence capture method with NGS in three experimental stages of increasing complexity (E. coli genes, multiplexed E. coli, and multiplexed HapMap BRCA1/2 regions).
Results: We successfully enriched barcoded NGS libraries instead of genomic DNA, achieving reproducible coverage profiles (Pearson correlation coefficients of up to 0.99) across multiplexed samples, with <10% strand bias. However, the SNP calling quality was substantially affected by the choice of tools and mapping strategy. With the aim of reducing computational requirements, we compared conventional whole-genome mapping and SNP-calling with a new faster approach: target-region mapping with subsequent ‘read-backmapping’ to the whole genome to reduce the false detection rate. Consequently, we developed a combined mapping pipeline, which includes standard tools (BWA, SAMtools, etc.), and tested it on public HiSeq2000 exome data from the 1000 Genomes Project. Our pipeline saved 12 hours of run time per HiSeq2000 exome sample and detected ~5% more SNPs than the conventional whole genome approach. This suggests that more potential novel SNPs may be discovered using both approaches than with just the conventional approach.
Conclusions: We recommend applying our general ‘two-step’ mapping approach for more efficient SNP discovery in tNGS. Our study has also shown the benefit of computing inter-sample SNP-concordances and inspecting read alignments in order to attain more confident results.
Affiliation(s)
- Abdou Elsharawy
- Institute of Clinical Molecular Biology, Christian-Albrechts-University, Kiel, Germany
80
Jhanwar S, Priya P, Garg R, Parida SK, Tyagi AK, Jain M. Transcriptome sequencing of wild chickpea as a rich resource for marker development. PLANT BIOTECHNOLOGY JOURNAL 2012; 10:690-702. [PMID: 22672127 DOI: 10.1111/j.1467-7652.2012.00712.x] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.3] [Indexed: 05/06/2023]
Abstract
The transcriptome of cultivated chickpea (Cicer arietinum L.), an important crop legume, has recently been sequenced. Here, we report sequencing of the transcriptome of wild chickpea, C. reticulatum (PI489777), the progenitor of cultivated chickpea, by GS-FLX 454 technology. The optimized assembly of C. reticulatum transcriptome generated 37 265 transcripts in total with an average length of 946 bp. A total of 4072 simple sequence repeats (SSRs) could be identified in these transcript sequences, of which at least 561 SSRs were polymorphic between C. arietinum and C. reticulatum. In addition, a total of 36 446 single-nucleotide polymorphisms (SNPs) were identified after optimization of probability score, quality score, read depth and consensus base ratio. Several of these SSRs and SNPs could be associated with tissue-specific and transcription factor encoding transcripts. A high proportion (92-94%) of polymorphic SSRs and SNPs identified between the two chickpea species were validated successfully. Further, the estimation of synonymous substitution rates of orthologous transcript pairs suggested that the speciation event for divergence of C. arietinum and C. reticulatum may have happened approximately 0.53 million years ago. The results of our study provide a rich resource for exploiting genetic variations in chickpea for breeding programmes.
Affiliation(s)
- Shalu Jhanwar
- National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, India
81
High-throughput multilocus sequence typing: bringing molecular typing to the next level. PLoS One 2012; 7:e39630. [PMID: 22815712 PMCID: PMC3399827 DOI: 10.1371/journal.pone.0039630] [Citation(s) in RCA: 75] [Impact Index Per Article: 6.3] [Received: 03/08/2012] [Accepted: 05/23/2012] [Indexed: 11/19/2022] Open
Abstract
Multilocus sequence typing (MLST) is a widely used system for typing microorganisms by sequence analysis of housekeeping genes. The main advantage of MLST in comparison to other typing techniques is the unambiguity and transferability of sequence data. However, a main disadvantage is the high cost of DNA sequencing. Here we introduce a high-throughput MLST (HiMLST) method that employs next-generation sequencing (NGS) technology (Roche 454), to generate large quantities of high-quality MLST data at low costs. The HiMLST protocol consists of two steps. In the first step MLST target genes are amplified by PCR in multi-well plates. During this PCR the amplicons of each bacterial isolate are provided with a unique DNA barcode, the multiplex identifier (MID). In the second step all amplicons are pooled and sequenced in a single NGS-run. The MLST profile of each individual isolate can be retrieved easily using its unique MID. With HiMLST we have profiled 575 isolates of Legionella pneumophila, Staphylococcus aureus, Pseudomonas aeruginosa and Streptococcus pneumoniae in mixed species HiMLST experiments. In conclusion, the introduction of HiMLST paves the way for a broad employment of the MLST as a high-quality and cost-effective method for typing microbial species.
82
Besaratinia A, Li H, Yoon JI, Zheng A, Gao H, Tommasi S. A high-throughput next-generation sequencing-based method for detecting the mutational fingerprint of carcinogens. Nucleic Acids Res 2012; 40:e116. [PMID: 22735701 PMCID: PMC3424585 DOI: 10.1093/nar/gks610] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Indexed: 11/17/2022] Open
Abstract
Many carcinogens leave a unique mutational fingerprint in the human genome. These mutational fingerprints manifest as specific types of mutations often clustering at certain genomic loci in tumor genomes from carcinogen-exposed individuals. To develop a high-throughput method for detecting the mutational fingerprint of carcinogens, we have devised a cost-, time- and labor-effective strategy, in which the widely used transgenic Big Blue® mouse mutation detection assay is made compatible with the Roche/454 Genome Sequencer FLX Titanium next-generation sequencing technology. As proof of principle, we have used this novel method to establish the mutational fingerprints of three prominent carcinogens with varying mutagenic potencies, including sunlight ultraviolet radiation, 4-aminobiphenyl and secondhand smoke that are known to be strong, moderate and weak mutagens, respectively. For verification purposes, we have compared the mutational fingerprints of these carcinogens obtained by our newly developed method with those obtained by parallel analyses using the conventional low-throughput approach, that is, standard mutation detection assay followed by direct DNA sequencing using a capillary DNA sequencer. We demonstrate that this high-throughput next-generation sequencing-based method is highly specific and sensitive to detect the mutational fingerprints of the tested carcinogens. The method is reproducible, and its accuracy is comparable with that of the currently available low-throughput method. In conclusion, this novel method has the potential to move the field of carcinogenesis forward by allowing high-throughput analysis of mutations induced by endogenous and/or exogenous genotoxic agents.
Affiliation(s)
- Ahmad Besaratinia
- Department of Cancer Biology, Beckman Research Institute of City of Hope, 1500 East Duarte Road, Duarte, CA 91010, USA.
83
Emerging technologies for improved stratification of cancer patients: a review of opportunities, challenges, and tools. Cancer J 2012; 17:451-64. [PMID: 22157289 DOI: 10.1097/ppo.0b013e31823bd1f8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 12/15/2022]
Abstract
Cancer is a heterogeneous collection of diseases with wild variation in etiology, pathogenesis, response to therapy, and prognosis. Sources of variation are frequently obscure. Current practice attempts to classify tumors by tissue of origin and extent of disease through staging such that more risky tumors can be managed with more aggressive treatments. Modest inroads have been made with biomarkers to further characterize groups of tumors with important characteristics such as response to selected drugs. However, biomarker-driven decisions are relatively few when examining the maze of clinical decisions in the care of cancer patients. Against this backdrop, waves of researchers have unleashed a vast array of new technologies, with the goal of better characterization of the inherent diversity of tumors. This review outlines the use of cancer biomarkers and emerging technologies to stratify patients with a focus on the challenges and opportunities of next-generation nucleic acid sequencing approaches in oncology.
84
Vidaurre D, Bonetta D. Accelerating forward genetics for cell wall deconstruction. FRONTIERS IN PLANT SCIENCE 2012; 3:119. [PMID: 22685448 PMCID: PMC3368152 DOI: 10.3389/fpls.2012.00119] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 03/14/2012] [Accepted: 05/17/2012] [Indexed: 05/29/2023]
Abstract
The elucidation of the genes involved in cell wall synthesis and assembly remains one of the biggest challenges of cell wall biology. Although traditional genetic approaches, using simple yet elegant screens, have identified components of the cell wall, many unknowns remain. Exhausting the genetic toolbox by performing sensitized screens, adopting chemical genetics or combining these with improved cell wall imaging, hold the promise of new gene discovery and function. With the recent introduction of next-generation sequencing technologies, it is now possible to quickly and efficiently map and clone genes of interest in record time. The combination of a classical genetics approach and cutting edge technology will propel cell wall biology in plants forward into the future.
Affiliation(s)
- Danielle Vidaurre
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON, Canada
- Dario Bonetta
- Faculty of Science, University of Ontario Institute of Technology, Oshawa, ON, Canada
85
Solieri L, Dakal TC, Giudici P. Next-generation sequencing and its potential impact on food microbial genomics. ANN MICROBIOL 2012. [DOI: 10.1007/s13213-012-0478-8] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Indexed: 10/28/2022] Open
86
SNP-Ratio Mapping (SRM): identifying lethal alleles and mutations in complex genetic backgrounds by next-generation sequencing. Genetics 2012; 191:1381-6. [PMID: 22649081 DOI: 10.1534/genetics.112.141341] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.8] [Indexed: 11/18/2022] Open
Abstract
We present a generally applicable method allowing rapid identification of causal alleles in mutagenized genomes by next-generation sequencing. Currently used approaches rely on recovering homozygotes or extensive backcrossing. In contrast, SNP-ratio mapping allows rapid cloning of lethal and/or poorly transmitted mutations and second-site modifiers, which are often in complex genetic/transgenic backgrounds.
87
Gupta P, Swanberg JC, Lee KH. A single nucleotide polymorphism in ycdC alters tRNA synthetase expression and results in hypersecretion in Escherichia coli. Biotechnol Prog 2012; 28:646-53. [PMID: 22505047 DOI: 10.1002/btpr.1550] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2011] [Revised: 04/06/2012] [Indexed: 11/08/2022]
Abstract
The most important approach to the development of platform organisms for recombinant protein production relies on random mutagenesis and phenotypic selection. Complex phenotypes, including those associated with significantly elevated expression and secretion of heterologous proteins, are the result of multiple genomic mutations. Using next generation sequencing, a parent and derivative hypersecreter strain (B41) of Escherichia coli were sequenced with an average coverage of 52.8X and 55X, respectively. A new base-pair calling program revealed a single nucleotide polymorphism in the B41 genome at position 1,074,787, resulting in translation termination near the N-terminus of a transcriptional regulator protein, RutR, coded by the ycdC gene. We verified the hypersecretion phenotype in a ycdC::Tn5 mutant and observed a 3.4-fold increase in active hemolysin secretion, consistent with the increase observed in B41 strain. mRNA expression profiling showed decreased expression of tRNA-synthetases and some amino acid transporters in the ycdC::Tn5 mutant. This study demonstrates the power of next generation sequencing to characterize mutants leading to successful metabolic engineering strategies for strain improvement.
Affiliation(s)
- Prateek Gupta
- School of Chemical and Biomolecular Engineering, Cornell University, Ithaca, NY 14853, USA
88
Dark MJ, Lundgren AM, Barbet AF. Determining the repertoire of immunodominant proteins via whole-genome amplification of intracellular pathogens. PLoS One 2012; 7:e36456. [PMID: 22558468 PMCID: PMC3340345 DOI: 10.1371/journal.pone.0036456] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 03/01/2012] [Accepted: 04/07/2012] [Indexed: 11/18/2022] Open
Abstract
Culturing many obligate intracellular bacteria is difficult or impossible. However, these organisms have numerous adaptations allowing for infection persistence and immune system evasion, making them some of the most interesting to study. Recent advancements in genome sequencing, pyrosequencing and Phi29 amplification, have allowed for examination of whole-genome sequences of intracellular bacteria without culture. We have applied both techniques to the model obligate intracellular pathogen Anaplasma marginale and the human pathogen Anaplasma phagocytophilum, in order to examine the ability of phi29 amplification to determine the sequence of genes allowing for immune system evasion and long-term persistence in the host. When compared to traditional pyrosequencing, phi29-mediated genome amplification had similar genome coverage, with no additional gaps in coverage. Additionally, all msp2 functional pseudogenes from two strains of A. marginale were detected and extracted from the phi29-amplified genomes, highlighting its utility in determining the full complement of genes involved in immune evasion.
Affiliation(s)
- Michael J Dark
- Department of Infectious Diseases and Pathology, College of Veterinary Medicine, University of Florida, Gainesville, Florida, USA.
89
Beal MA, Glenn TC, Somers CM. Whole genome sequencing for quantifying germline mutation frequency in humans and model species: cautious optimism. Mutat Res 2012; 750:96-106. [PMID: 22178956 DOI: 10.1016/j.mrrev.2011.11.002] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 10/31/2011] [Revised: 11/29/2011] [Accepted: 11/30/2011] [Indexed: 05/31/2023]
Abstract
Factors affecting the type and frequency of germline mutations in animals are of significant interest from health and toxicology perspectives. However, studies in this field have been limited by the use of markers with low detection power or uncertain relevance to phenotype. Whole genome sequencing (WGS) is now a potential option to directly determine germline mutation type and frequency in family groups at all loci simultaneously. Medical studies have already capitalized on WGS to identify novel mutations in human families for clinical purposes, such as identifying candidate genes contributing to inherited conditions. However, WGS has not yet been used in any studies of vertebrates that aim to quantify changes in germline mutation frequency as a result of environmental factors. WGS is a promising tool for detecting mutation induction, but it is currently limited by several technical challenges. Perhaps the most pressing issue is sequencing error rates that are currently high in comparison to the intergenerational mutation frequency. Different platforms and depths of coverage currently result in a range of 10 to 10³ false positives for every true mutation. In addition, the cost of WGS is still relatively high, particularly when comparing mutation frequencies among treatment groups with even moderate sample sizes. Despite these challenges, WGS offers the potential for unprecedented insight into germline mutation processes. Refinement of available tools and emergence of new technologies may be able to provide the improved accuracy and reduced costs necessary to make WGS viable in germline mutation studies in the very near future. To streamline studies, researchers may use multiple family triads per treatment group and sequence a targeted (reduced) portion of each genome with high (20-40×) depth of coverage. We are optimistic about the application of WGS for quantifying germline mutations, but caution researchers regarding the resource-intensive nature of the work using existing technology.
Affiliation(s)
- Marc A Beal
- University of Regina, Department of Biology, 3737 Wascana Parkway, Regina, Saskatchewan, Canada S4S 0A2
- Travis C Glenn
- University of Georgia, Environmental Health Science, College of Public Health, Athens, GA 30602, USA
- Christopher M Somers
- University of Regina, Department of Biology, 3737 Wascana Parkway, Regina, Saskatchewan, Canada S4S 0A2.
90
Rodríguez-Santiago B, Armengol L. Tecnologías de secuenciación de nueva generación en diagnóstico genético pre- y postnatal [Next-generation sequencing technologies in pre- and postnatal genetic diagnosis]. ACTA ACUST UNITED AC 2012. [DOI: 10.1016/j.diapre.2012.02.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 12/21/2022]
91
Dettman JR, Rodrigue N, Melnyk AH, Wong A, Bailey SF, Kassen R. Evolutionary insight from whole-genome sequencing of experimentally evolved microbes. Mol Ecol 2012; 21:2058-77. [PMID: 22332770 DOI: 10.1111/j.1365-294x.2012.05484.x] [Citation(s) in RCA: 114] [Impact Index Per Article: 9.5] [Indexed: 12/24/2022]
Abstract
Experimental evolution (EE) combined with whole-genome sequencing (WGS) has become a compelling approach to study the fundamental mechanisms and processes that drive evolution. Most EE-WGS studies published to date have used microbes, owing to their ease of propagation and manipulation in the laboratory and relatively small genome sizes. These experiments are particularly suited to answer long-standing questions such as: How many mutations underlie adaptive evolution, and how are they distributed across the genome and through time? Are there general rules or principles governing which genes contribute to adaptation, and are certain kinds of genes more likely to be targets than others? How common is epistasis among adaptive mutations, and what does this reveal about the variety of genetic routes to adaptation? How common is parallel evolution, where the same mutations evolve repeatedly and independently in response to similar selective pressures? Here, we summarize the significant findings of this body of work, identify important emerging trends and propose promising directions for future research. We also outline an example of a computational pipeline for use in EE-WGS studies, based on freely available bioinformatics tools.
Affiliation(s)
- Jeremy R Dettman
- Department of Biology and Centre for Advanced Research in Environmental Genomics, University of Ottawa, Ottawa, ON K1N 6N5, Canada.
92
Pair-barcode high-throughput sequencing for large-scale multiplexed sample analysis. BMC Genomics 2012; 13:43. [PMID: 22276739 PMCID: PMC3284879 DOI: 10.1186/1471-2164-13-43] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 04/26/2011] [Accepted: 01/25/2012] [Indexed: 01/29/2023] Open
Abstract
Background: Multiplexing has become the major limitation of next-generation sequencing (NGS) when applied to low-complexity samples. Physical space segregation allows limited multiplexing, while the existing barcode approach only permits simultaneous analysis of up to several dozen samples.
Results: Here we introduce pair-barcode sequencing (PBS), an economical and flexible barcoding technique that permits parallel analysis of large-scale multiplexed samples. In two pilot runs using a SOLiD sequencer (Applied Biosystems Inc.), 32 independent pair-barcoded miRNA libraries were simultaneously discovered by the combination of 4 unique forward barcodes and 8 unique reverse barcodes. Over 174,000,000 reads were generated, and about 64% of them were assigned to both barcodes. After mapping all reads to pre-miRNAs in miRBase, different miRNA expression patterns were captured from the two clinical groups. The strong correlation across different barcode pairs and the high consistency of miRNA expression in two independent runs demonstrate that the PBS approach is valid.
Conclusions: By employing the PBS approach in NGS, large-scale multiplexed pooled samples can be practically analyzed in parallel, so that high-throughput sequencing economically meets the requirements of samples with low sequencing throughput demand.
93
Farrell A, Thirugnanam S, Lorestani A, Dvorin JD, Eidell KP, Ferguson DJ, Anderson-White BR, Duraisingh MT, Marth GT, Gubbels MJ. A DOC2 protein identified by mutational profiling is essential for apicomplexan parasite exocytosis. Science 2012; 335:218-21. [PMID: 22246776 PMCID: PMC3354045 DOI: 10.1126/science.1210829] [Citation(s) in RCA: 100] [Impact Index Per Article: 8.3] [Indexed: 11/02/2022]
Abstract
Exocytosis is essential to the lytic cycle of apicomplexan parasites and required for the pathogenesis of toxoplasmosis and malaria. DOC2 proteins recruit the membrane fusion machinery required for exocytosis in a Ca²⁺-dependent fashion. Here, the phenotype of a Toxoplasma gondii conditional mutant impaired in host cell invasion and egress was pinpointed to a defect in secretion of the micronemes, an apicomplexan-specific organelle that contains adhesion proteins. Whole-genome sequencing identified the etiological point mutation in TgDOC2.1. A conditional allele of the orthologous gene engineered into Plasmodium falciparum was also defective in microneme secretion. However, the major effect was on invasion, suggesting that microneme secretion is dispensable for Plasmodium egress.
Affiliation(s)
- Andrew Farrell
- Department of Biology, Boston College, Chestnut Hill, MA 02467, USA
- Jeffrey D. Dvorin
- Department of Immunology and Infectious Diseases, Harvard School of Public Health, Boston, MA 02115, USA
- Division of Infectious Diseases, Children’s Hospital Boston, Boston, MA 02115, USA
- Keith P. Eidell
- Department of Biology, Boston College, Chestnut Hill, MA 02467, USA
- David J.P. Ferguson
- Nuffield Department of Clinical Laboratory Science, University of Oxford, John Radcliffe Hospital, Oxford OX3 9DU, UK
- Manoj T. Duraisingh
- Department of Immunology and Infectious Diseases, Harvard School of Public Health, Boston, MA 02115, USA
- Gabor T. Marth
- Department of Biology, Boston College, Chestnut Hill, MA 02467, USA
- Marc-Jan Gubbels
- Department of Biology, Boston College, Chestnut Hill, MA 02467, USA
94
Almiñana C, Fazeli A. Exploring the application of high-throughput genomics technologies in the field of maternal-embryo communication. Theriogenology 2012; 77:717-37. [PMID: 22217573 DOI: 10.1016/j.theriogenology.2011.11.009] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 05/31/2011] [Revised: 08/30/2011] [Accepted: 09/02/2011] [Indexed: 01/23/2023]
Abstract
Deciphering the complex molecular dialogue between the maternal tract and embryo is crucial to increasing our understanding of pregnancy failure, infertility problems and in the modulation of embryo development, which has consequences through adulthood. High-throughput genomic technologies have been applied to look for a holistic view of the molecular interactions occurring during this dialogue. Among these technologies, microarrays have been widely used, being one of the most popular tools in maternal-embryo communication. Today, next generation sequencing technologies are dwarfing the capabilities of microarrays. The application of these new technologies has broadened to almost all areas of genomics research, because of their massive sequencing capacity. We review the current status of high-throughput genomic technologies and their application to maternal-embryo communication research. We also survey next generation technologies and their huge potential in many research areas. Given the diversity of unanswered questions in the field of maternal-embryo communication and the wide range of possibilities that these technologies offer, here we discuss future perspectives on the use of these technologies to enhance maternal-embryo research.
Affiliation(s)
- Carmen Almiñana
- Academic Unit of Reproductive and Development Medicine, University of Sheffield, Sheffield, UK.
95
Ethanol-tolerant gene identification in Clostridium thermocellum using pyro-resequencing for metabolic engineering. Methods Mol Biol 2012; 834:111-36. [PMID: 22144357 DOI: 10.1007/978-1-61779-483-4_9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/06/2023]
Abstract
Classic strain development that combines random mutagenesis and selection has a long history of success in generation of biocatalysts with industrially designed traits. However, the genetic loci contributing to the phenotypic strain changes are difficult to identify prior to genome sequencing technology advancement. In this chapter, we present the approach using Roche 454 next-generation pyro-resequencing to identify the genotypic changes such as single nucleotide polymorphisms (SNP) associated with an ethanol-tolerant strain of Clostridium thermocellum. The parameters used to filter the pyro-resequencing output for SNP identification are also discussed. These can help researchers to identify the genotypic change of other biocatalysts for strain improvement through metabolic engineering.
96
Barbieri CE, Demichelis F, Rubin MA. Molecular genetics of prostate cancer: emerging appreciation of genetic complexity. Histopathology 2011; 60:187-98. [DOI: 10.1111/j.1365-2559.2011.04041.x] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.8] [Indexed: 12/21/2022]
|
97
|
Larkum AWD, Ross IL, Kruse O, Hankamer B. Selection, breeding and engineering of microalgae for bioenergy and biofuel production. Trends Biotechnol 2011; 30:198-205. [PMID: 22178650 DOI: 10.1016/j.tibtech.2011.11.003] [Citation(s) in RCA: 145] [Impact Index Per Article: 11.2] [Received: 07/06/2011] [Revised: 11/07/2011] [Accepted: 11/07/2011] [Indexed: 01/08/2023]
Abstract
Microalgal production technologies are seen as increasingly attractive for bioenergy production to improve fuel security and reduce CO2 emissions. Photosynthetically derived fuels are a renewable, potentially carbon-neutral and scalable alternative reserve. Microalgae have particular promise because they can be produced on non-arable land and utilize saline and wastewater streams. Furthermore, emerging microalgal technologies can be used to produce a range of products such as biofuels, protein-rich animal feeds, chemical feedstocks (e.g. bioplastic precursors) and higher-value products. This review focuses on the selection, breeding and engineering of microalgae for improved biomass and biofuel conversion efficiencies.
Affiliation(s)
- Anthony W D Larkum
- Climate Change Cluster, University of Technology (Sydney), Broadway, NSW 2007, Australia.
|
98
|
RNA-Seq and its applications: a new technology for transcriptomics. YI CHUAN = HEREDITAS 2011; 33:1191-202. [DOI: 10.3724/sp.j.1005.2011.01191] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Indexed: 11/25/2022]
|
99
|
Mertes F, Elsharawy A, Sauer S, van Helvoort JMLM, van der Zaag PJ, Franke A, Nilsson M, Lehrach H, Brookes AJ. Targeted enrichment of genomic DNA regions for next-generation sequencing. Brief Funct Genomics 2011; 10:374-86. [PMID: 22121152 PMCID: PMC3245553 DOI: 10.1093/bfgp/elr033] [Citation(s) in RCA: 164] [Impact Index Per Article: 12.6] [Indexed: 01/03/2023] Open
Abstract
In this review, we discuss the latest targeted enrichment methods and aspects of their use alongside second-generation sequencing for complex genome analysis. In doing so, we provide an overview of issues involved in detecting genetic variation, for which targeted enrichment has become a powerful tool. We explain how targeted enrichment for next-generation sequencing has made great progress in terms of methodology, ease of use and applicability, but emphasize the remaining challenges, such as the lack of even coverage across targeted regions. Costs are also compared with those of the alternative, whole-genome sequencing, which is becoming ever more affordable. We conclude that targeted enrichment is likely to be the most economical option for many years to come in a range of settings.
Affiliation(s)
- Florian Mertes
- Max Planck Institute for Molecular Genetics, Berlin, Germany.
|
100
|
Construction and evaluation of a whole genome microarray of Chlamydomonas reinhardtii. BMC Genomics 2011; 12:579. [PMID: 22118351 PMCID: PMC3235179 DOI: 10.1186/1471-2164-12-579] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Received: 06/30/2011] [Accepted: 11/25/2011] [Indexed: 02/03/2023] Open
Abstract
Background: Chlamydomonas reinhardtii is widely accepted as a model organism regarding photosynthesis, circadian rhythm, cell mobility, phototaxis, and biotechnology. The complete annotation of the genome allows transcriptomic studies; however, a new microarray platform was needed. Based on the completed annotation of Chlamydomonas reinhardtii, a new microarray on an Agilent platform was designed using an extended JGI 3.1 genome data set which included 15000 transcript models. Results: In total, 44000 probes were determined (3 independent probes per transcript model), covering 93% of the transcriptome. Alignment studies with the recently published AUGUSTUS 10.2 annotation confirmed 11000 transcript models, resulting in a very good coverage of 70% of the transcriptome (17000). Following the estimation of 10000 predicted genes in Chlamydomonas reinhardtii, our new microarray nevertheless covers the expected genome by 90-95%. Conclusions: To demonstrate the capabilities of the new microarray, we analyzed transcript levels for cultures grown under nitrogen as well as sulfate limitation, and compared the results with recently published microarray and RNA-seq data. We could thereby confirm previous results derived from data on nutrient-starvation-induced gene expression of a group of genes related to protein transport and adaptation of the metabolism, as well as genes related to efficient light harvesting, light energy distribution and photosynthetic electron transport.
|