1
Deng W, Mou T, Kalari KR, Niu N, Wang L, Pawitan Y, Vu TN. Alternating EM algorithm for a bilinear model in isoform quantification from RNA-seq data. Bioinformatics 2019; 36:805-812. [PMID: 31400221 PMCID: PMC9883676 DOI: 10.1093/bioinformatics/btz640] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 08/15/2018] [Revised: 06/13/2019] [Accepted: 08/09/2019] [Indexed: 02/02/2023] [Open access]
Abstract
MOTIVATION Estimation of isoform-level gene expression from RNA-seq data depends on simplifying assumptions, such as uniform read distribution, that are easily violated in real data. Such violations typically lead to biased estimates. Most existing methods provide bias-correction steps that are based on biological considerations, such as GC content, and are applied to each sample separately. The main problem is that not all biases are known. RESULTS We have developed a novel method called XAEM based on a more flexible and robust statistical model. Existing methods are essentially based on a linear model Xβ, where the design matrix X is known and is computed based on the simplifying assumptions. In contrast, XAEM considers Xβ as a bilinear model with both X and β unknown. Joint estimation of X and β is made possible by a simultaneous analysis of multi-sample RNA-seq data. Compared to existing methods, XAEM automatically performs empirical correction of potentially unknown biases. We use an alternating expectation-maximization (AEM) algorithm, alternating between estimation of X and β. For speed, XAEM utilizes quasi-mapping for read alignment, thus leading to a fast algorithm. Overall, XAEM performs favorably compared to recent advanced methods. For simulated datasets, XAEM obtains higher accuracy for multiple-isoform genes. In a differential-expression analysis of a real single-cell RNA-seq dataset, XAEM achieves substantially better rediscovery rates in independent validation sets. AVAILABILITY AND IMPLEMENTATION The method and pipeline are implemented as a tool and freely available for use at http://fafner.meb.ki.se/biostatwiki/xaem/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
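Illustrative sketch (not part of the cited abstract): the alternating idea described above can be shown on a toy problem. The Python below fits a bilinear model Y ≈ Xβ to a small count matrix by alternating multiplicative, EM-type Poisson updates of β (with X fixed) and of X (with β fixed). This is not the XAEM implementation. The function names, the toy data, the column normalisation of X, and the specific update rule are all assumptions made for illustration, and the sketch assumes strictly positive counts and starting values.

```python
import math

def matmul(X, B):
    """Plain-Python product of X (classes x isoforms) and B (isoforms x samples)."""
    return [[sum(X[i][j] * B[j][t] for j in range(len(B)))
             for t in range(len(B[0]))] for i in range(len(X))]

def poisson_divergence(Y, M):
    """Generalised Kullback-Leibler divergence between counts Y and fitted means M."""
    total = 0.0
    for y_row, m_row in zip(Y, M):
        for y, m in zip(y_row, m_row):
            total += m - y + (y * math.log(y / m) if y > 0 else 0.0)
    return total

def update_beta(Y, X, B):
    """One multiplicative update of B with X held fixed."""
    M = matmul(X, B)
    n, k, s = len(X), len(B), len(B[0])
    return [[B[j][t] * sum(X[i][j] * Y[i][t] / M[i][t] for i in range(n))
             / sum(X[i][j] for i in range(n))
             for t in range(s)] for j in range(k)]

def update_x(Y, X, B):
    """One multiplicative update of X with B held fixed."""
    M = matmul(X, B)
    n, k, s = len(X), len(B), len(B[0])
    return [[X[i][j] * sum(B[j][t] * Y[i][t] / M[i][t] for t in range(s))
             / sum(B[j][t] for t in range(s))
             for j in range(k)] for i in range(n)]

def normalise(X, B):
    """Rescale so each column of X sums to 1; the product X B is unchanged."""
    k = len(B)
    col = [sum(row[j] for row in X) for j in range(k)]
    X = [[row[j] / col[j] for j in range(k)] for row in X]
    B = [[b * col[j] for b in B[j]] for j in range(k)]
    return X, B

def alternating_fit(Y, X0, B0, n_iter=100):
    """Alternate between updating B and X, starting from positive guesses."""
    X, B = normalise(X0, B0)
    for _ in range(n_iter):
        B = update_beta(Y, X, B)
        X = update_x(Y, X, B)
        X, B = normalise(X, B)
    return X, B

# Hypothetical toy data: 3 read classes, 4 samples, 2 isoforms.
Y = [[30.0, 10.0, 22.0, 5.0],
     [20.0, 25.0, 18.0, 30.0],
     [10.0, 40.0, 12.0, 45.0]]
X0 = [[0.5, 0.2], [0.3, 0.3], [0.2, 0.5]]   # initial guess for the design matrix
B0 = [[20.0] * 4, [20.0] * 4]               # initial guess for isoform abundances

X_hat, B_hat = alternating_fit(Y, X0, B0)
print("divergence before:", poisson_divergence(Y, matmul(X0, B0)))
print("divergence after: ", poisson_divergence(Y, matmul(X_hat, B_hat)))
```

Pooling several samples is what makes the design matrix estimable at all in this sketch: with a single sample, X and β could not be separated.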
Affiliation(s)
- Wenjiang Deng
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm 17177, Sweden
- Tian Mou
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm 17177, Sweden
- Nifang Niu
  - Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, MN 55905, USA
- Liewei Wang
  - Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, MN 55905, USA
2
Shridhar S, Klanert G, Auer N, Hernandez-Lopez I, Kańduła MM, Hackl M, Grillari J, Stralis-Pavese N, Kreil DP, Borth N. Transcriptomic changes in CHO cells after adaptation to suspension growth in protein-free medium analysed by a species-specific microarray. J Biotechnol 2017; 257:13-21. [DOI: 10.1016/j.jbiotec.2017.03.012] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 01/27/2017] [Revised: 03/07/2017] [Accepted: 03/11/2017] [Indexed: 11/26/2022]
3
Liebe S, Christ DS, Ehricht R, Varrelmann M. Development of a DNA Microarray-Based Assay for the Detection of Sugar Beet Root Rot Pathogens. Phytopathology 2016; 106:76-86. [PMID: 26524545 DOI: 10.1094/phyto-07-15-0171-r] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 06/05/2023]
Abstract
Sugar beet root rot diseases that occur during the cropping season or in storage are accompanied by high yield losses and a severe reduction of processing quality. The vast diversity of microorganism species involved in rot development requires molecular tools allowing simultaneous identification of many different targets. Therefore, a new microarray technology (ArrayTube) was applied in this study to improve diagnosis of sugar beet root rot diseases. Based on three marker genes (internal transcribed spacer, translation elongation factor 1 alpha, and 16S ribosomal DNA), 42 well-performing probes enabled the identification of prevalent field pathogens (e.g., Aphanomyces cochlioides), storage pathogens (e.g., Botrytis cinerea), and ubiquitous spoilage fungi (e.g., Penicillium expansum). All probes were tested for specificity with pure cultures from 73 microorganism species as well as for in planta detection of their target species using inoculated sugar beet tissue. Microarray-based identification of root rot pathogens in diseased field beets was successfully confirmed by classical detection methods. The high discriminatory potential was demonstrated by Fusarium species differentiation based on a single nucleotide polymorphism. The results demonstrate that the ArrayTube constitutes an innovative tool allowing rapid and reliable detection of plant pathogens, particularly when multiple microorganism species are present.
Affiliation(s)
- Sebastian Liebe
  - First, second, and fourth authors: Institute of Sugar Beet Research, Holtenser Landstr. 77, 37079 Göttingen, Germany; and third author: Alere Technologies GmbH, Löbstedter Str. 105, 07743 Jena, Germany, and InfectoGnostics Research Campus Jena, Germany
- Daniela S Christ
  - First, second, and fourth authors: Institute of Sugar Beet Research, Holtenser Landstr. 77, 37079 Göttingen, Germany; and third author: Alere Technologies GmbH, Löbstedter Str. 105, 07743 Jena, Germany, and InfectoGnostics Research Campus Jena, Germany
- Ralf Ehricht
  - First, second, and fourth authors: Institute of Sugar Beet Research, Holtenser Landstr. 77, 37079 Göttingen, Germany; and third author: Alere Technologies GmbH, Löbstedter Str. 105, 07743 Jena, Germany, and InfectoGnostics Research Campus Jena, Germany
- Mark Varrelmann
  - First, second, and fourth authors: Institute of Sugar Beet Research, Holtenser Landstr. 77, 37079 Göttingen, Germany; and third author: Alere Technologies GmbH, Löbstedter Str. 105, 07743 Jena, Germany, and InfectoGnostics Research Campus Jena, Germany
4
Spjuth O, Bongcam-Rudloff E, Hernández GC, Forer L, Giovacchini M, Guimera RV, Kallio A, Korpelainen E, Kańduła MM, Krachunov M, Kreil DP, Kulev O, Łabaj PP, Lampa S, Pireddu L, Schönherr S, Siretskiy A, Vassilev D. Experiences with workflows for automating data-intensive bioinformatics. Biol Direct 2015; 10:43. [PMID: 26282399 PMCID: PMC4539931 DOI: 10.1186/s13062-015-0071-8] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.3] [Received: 02/26/2015] [Accepted: 08/03/2015] [Indexed: 11/16/2022] [Open access]
Abstract
High-throughput technologies, such as next-generation sequencing, have turned molecular biology into a data-intensive discipline, requiring bioinformaticians to use high-performance computing resources and carry out data management and analysis tasks on a large scale. Workflow systems can be useful to simplify construction of analysis pipelines that automate tasks, support reproducibility and provide measures for fault-tolerance. However, workflow systems can incur significant development and administration overhead, so bioinformatics pipelines are often still built without them. We present the experiences with workflows and workflow systems within the bioinformatics community participating in a series of hackathons and workshops of the EU COST action SeqAhead. The organizations are working on similar problems, but we have addressed them with different strategies and solutions. This fragmentation of efforts is inefficient and leads to redundant and incompatible solutions. Based on our experiences, we define a set of recommendations for future systems to enable efficient yet simple bioinformatics workflow construction and execution. Reviewers This article was reviewed by Dr Andrew Clark.
Affiliation(s)
- Ola Spjuth
  - Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, SE-75124, Uppsala, P.O. Box 591, Sweden.
- Erik Bongcam-Rudloff
  - SLU-Global Bioinformatics Centre, Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Uppsala, Sweden.
- Lukas Forer
  - Division of Genetic Epidemiology, Medical University of Innsbruck, Innsbruck, 6020, Austria.
- Mario Giovacchini
  - Science for Life Laboratory, Karolinska Institutet, SE-17121, Stockholm, P.O. Box 1031, Sweden.
- Roman Valls Guimera
  - Science for Life Laboratory, Karolinska Institutet, SE-17121, Stockholm, P.O. Box 1031, Sweden.
- Aleksi Kallio
  - CSC - IT Center for Science Ltd., FI-02101, Espoo, P.O. Box 405, Finland.
- Eija Korpelainen
  - CSC - IT Center for Science Ltd., FI-02101, Espoo, P.O. Box 405, Finland.
- Maciej M Kańduła
  - Chair of Bioinformatics Research Group, Boku University, Vienna, Austria.
- Milko Krachunov
  - Faculty of Mathematics and Informatics, Sofia University, Sofia, Bulgaria.
- David P Kreil
  - Chair of Bioinformatics Research Group, Boku University, Vienna, Austria.
- Ognyan Kulev
  - Faculty of Mathematics and Informatics, Sofia University, Sofia, Bulgaria.
- Paweł P Łabaj
  - Chair of Bioinformatics Research Group, Boku University, Vienna, Austria.
- Samuel Lampa
  - Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, SE-75124, Uppsala, P.O. Box 591, Sweden.
- Sebastian Schönherr
  - Division of Genetic Epidemiology, Medical University of Innsbruck, Innsbruck, 6020, Austria.
- Alexey Siretskiy
  - Department of Information Technology, Uppsala University, SE-75105, Uppsala, P.O. Box 337, Sweden.
5
Banu M, Simion M, Ratiu AC, Popescu M, Romanitan C, Danila M, Radoi A, Ecovoiu AA, Kusko M. Enhanced nucleotide mismatch detection based on a 3D silicon nanowire microarray. RSC Adv 2015. [DOI: 10.1039/c5ra14442f] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/21/2022] [Open access]
6
Assessing technical performance in differential gene expression experiments with external spike-in RNA control ratio mixtures. Nat Commun 2014; 5:5125. [DOI: 10.1038/ncomms6125] [Citation(s) in RCA: 103] [Impact Index Per Article: 9.4] [Received: 08/11/2014] [Accepted: 09/01/2014] [Indexed: 02/05/2023] [Open access]
7
A comprehensive assessment of RNA-seq accuracy, reproducibility and information content by the Sequencing Quality Control Consortium. Nat Biotechnol 2014; 32:903-14. [PMID: 25150838 PMCID: PMC4321899 DOI: 10.1038/nbt.2957] [Citation(s) in RCA: 654] [Impact Index Per Article: 59.5] [Received: 06/13/2013] [Accepted: 05/11/2014] [Indexed: 02/07/2023]
Abstract
We present primary results from the Sequencing Quality Control (SEQC) project, coordinated by the US Food and Drug Administration. Examining Illumina HiSeq, Life Technologies SOLiD and Roche 454 platforms at multiple laboratory sites using reference RNA samples with built-in controls, we assess RNA sequencing (RNA-seq) performance for junction discovery and differential expression profiling and compare it to microarray and quantitative PCR (qPCR) data using complementary metrics. At all sequencing depths, we discover unannotated exon-exon junctions, with >80% validated by qPCR. We find that measurements of relative expression are accurate and reproducible across sites and platforms if specific filters are used. In contrast, RNA-seq and microarrays do not provide accurate absolute measurements, and gene-specific biases are observed for all examined platforms, including qPCR. Measurement performance depends on the platform and data analysis pipeline, and variation is large for transcript-level profiling. The complete SEQC data sets, comprising >100 billion reads (10Tb), provide unique resources for evaluating RNA-seq analyses for clinical and regulatory settings.
8
Pozhitkov AE, Noble PA, Bryk J, Tautz D. A revised design for microarray experiments to account for experimental noise and uncertainty of probe response. PLoS One 2014; 9:e91295. [PMID: 24618910 PMCID: PMC3949741 DOI: 10.1371/journal.pone.0091295] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 09/18/2013] [Accepted: 02/11/2014] [Indexed: 11/18/2022] [Open access]
Abstract
Background Although microarrays are analysis tools in biomedical research, they are known to yield noisy output that usually requires experimental confirmation. To tackle this problem, many studies have developed rules for optimizing probe design and devised complex statistical tools to analyze the output. However, less emphasis has been placed on systematically identifying the noise component as part of the experimental procedure. One source of noise is the variance in probe binding, which can be assessed by replicating array probes. The second source is poor probe performance, which can be assessed by calibrating the array based on a dilution series of target molecules. Using model experiments for copy number variation and gene expression measurements, we investigate here a revised design for microarray experiments that addresses both of these sources of variance. Results Two custom arrays were used to evaluate the revised design: one based on 25 mer probes from an Affymetrix design and the other based on 60 mer probes from an Agilent design. To assess experimental variance in probe binding, all probes were replicated ten times. To assess probe performance, the probes were calibrated using a dilution series of target molecules and the signal response was fitted to an adsorption model. We found that significant variance of the signal could be controlled by averaging across probes and removing probes that are nonresponsive or poorly responsive in the calibration experiment. Taking this into account, one can obtain a more reliable signal with the added option of obtaining absolute rather than relative measurements. Conclusion The assessment of technical variance within the experiments, combined with the calibration of probes, allows poorly responding probes to be removed and yields more reliable signals for the remaining ones. Once an array is properly calibrated, absolute quantification of signals becomes straightforward, alleviating the need for normalization and reference hybridizations.
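Illustrative sketch (not part of the cited abstract): calibrating one probe against a dilution series can be pictured as fitting a saturating response curve and then inverting it. The Python below assumes a Langmuir-type form, signal = s_max · c / (k + c), which is one common adsorption model. The abstract does not state which adsorption model the authors used, so the functional form, the function names, the grid-search fit, and the synthetic data are all assumptions.

```python
def fit_langmuir(conc, signal, k_grid):
    """Fit signal = s_max * c / (k + c) by grid search over k.

    For each candidate k the model is linear in s_max, so the
    least-squares s_max has a closed form.
    """
    best = None
    for k in k_grid:
        f = [c / (k + c) for c in conc]
        s_max = sum(y * fi for y, fi in zip(signal, f)) / sum(fi * fi for fi in f)
        sse = sum((y - s_max * fi) ** 2 for y, fi in zip(signal, f))
        if best is None or sse < best[0]:
            best = (sse, k, s_max)
    return best[1], best[2]

def signal_to_concentration(y, k, s_max):
    """Invert the fitted curve to turn a signal into an absolute concentration."""
    if not 0 <= y < s_max:
        raise ValueError("signal outside the invertible range of the fitted curve")
    return k * y / (s_max - y)

# Synthetic, noise-free dilution series generated with k = 50 and s_max = 1000
# (arbitrary units chosen only for this demonstration).
conc = [1.0, 3.0, 10.0, 30.0, 100.0, 300.0, 1000.0]
signal = [1000.0 * c / (50.0 + c) for c in conc]

# Log-spaced candidate values of k covering six orders of magnitude.
k_grid = [10 ** (-2 + 6 * i / 4000) for i in range(4001)]

k_hat, s_max_hat = fit_langmuir(conc, signal, k_grid)
print("fitted k:", k_hat, "fitted s_max:", s_max_hat)
print("concentration for signal 500:", signal_to_concentration(500.0, k_hat, s_max_hat))
```

A probe whose fitted curve is nearly flat over the dilution series would be a candidate for the "nonresponsive or poorly responsive" category mentioned in the abstract, although the criterion the authors applied is not given here.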
Affiliation(s)
- Alex E. Pozhitkov
  - Max-Planck-Institut für Evolutionsbiologie, Plön, Germany
  - Department of Periodontics, School of Dentistry, University of Washington, Seattle, Washington, United States of America
- Peter A. Noble
  - Department of Periodontics, School of Dentistry, University of Washington, Seattle, Washington, United States of America
  - Ph.D Microbiology Program, Department of Biological Sciences, Alabama State University, Montgomery, Alabama, United States of America
- Jarosław Bryk
  - Max-Planck-Institut für Evolutionsbiologie, Plön, Germany
  - National Centre for Biotechnology Education, University of Reading, Reading, United Kingdom
- Diethard Tautz
  - Max-Planck-Institut für Evolutionsbiologie, Plön, Germany
9
Fenart S, Chabi M, Gallina S, Huis R, Neutelings G, Riviere N, Thomasset B, Hawkins S, Lucau-Danila A. Intra-platform comparison of 25-mer and 60-mer oligonucleotide Nimblegen DNA microarrays. BMC Res Notes 2013; 6:43. [PMID: 23375116 PMCID: PMC3608165 DOI: 10.1186/1756-0500-6-43] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/19/2012] [Accepted: 01/30/2013] [Indexed: 01/02/2023] [Open access]
Abstract
Background We performed a Nimblegen intra-platform microarray comparison by assessing two categories of flax target probes (short 25-mers oligonucleotides and long 60-mers oligonucleotides) in identical conditions of target production, design, labelling, hybridization, image analyses, and data filtering. We compared technical parameters of array hybridizations, precision and accuracy as well as specific gene expression profiles. Results Comparison of the hybridization quality, precision and accuracy of expression measurements, as well as an interpretation of differential gene expression in flax tissues were performed. Both array types yielded reproducible, accurate and comparable data that are coherent for expression measurements and identification of differentially expressed genes. 60-mers arrays gave higher hybridization efficiencies and therefore were more sensitive allowing the detection of a higher number of unigenes involved in the same biological process and/or belonging to the same multigene family. Conclusion The two flax arrays provide a good resolution of expressed functions; however the 60-mers arrays are more sensitive and provide a more in-depth coverage of candidate genes potentially involved in different biological processes.
Affiliation(s)
- Stephane Fenart
  - Université Lille Nord de France, Lille 1, UMR INRA 1281, SADV, F- 59650 Villeneuve d'Ascq cedex, France
10
Harrison A, Binder H, Buhot A, Burden CJ, Carlon E, Gibas C, Gamble LJ, Halperin A, Hooyberghs J, Kreil DP, Levicky R, Noble PA, Ott A, Pettitt BM, Tautz D, Pozhitkov AE. Physico-chemical foundations underpinning microarray and next-generation sequencing experiments. Nucleic Acids Res 2013; 41:2779-96. [PMID: 23307556 PMCID: PMC3597649 DOI: 10.1093/nar/gks1358] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.8] [Indexed: 11/13/2022] [Open access]
Abstract
Hybridization of nucleic acids on solid surfaces is a key process involved in high-throughput technologies such as microarrays and, in some cases, next-generation sequencing (NGS). A physical understanding of the hybridization process helps to determine the accuracy of these technologies. The goal of a widespread research program is to develop reliable transformations between the raw signals reported by the technologies and individual molecular concentrations from an ensemble of nucleic acids. This research has inputs from many areas, from bioinformatics and biostatistics, to theoretical and experimental biochemistry and biophysics, to computer simulations. A group of leading researchers met in Ploen Germany in 2011 to discuss present knowledge and limitations of our physico-chemical understanding of high-throughput nucleic acid technologies. This meeting inspired us to write this summary, which provides an overview of the state-of-the-art approaches based on physico-chemical foundation to modeling of the nucleic acids hybridization process on solid surfaces. In addition, practical application of current knowledge is emphasized.
Affiliation(s)
- Andrew Harrison
  - University of Essex-Mathematical Sciences, Colchester CO4 3SQ, Essex, United Kingdom
11
Jakubek YA, Cutler DJ. A model of binding on DNA microarrays: understanding the combined effect of probe synthesis failure, cross-hybridization, DNA fragmentation and other experimental details of affymetrix arrays. BMC Genomics 2012; 13:737. [PMID: 23270536 PMCID: PMC3548757 DOI: 10.1186/1471-2164-13-737] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 04/24/2012] [Accepted: 12/16/2012] [Indexed: 11/10/2022] [Open access]
Abstract
BACKGROUND DNA microarrays are used both for research and for diagnostics. In research, Affymetrix arrays are commonly used for genome wide association studies, resequencing, and for gene expression analysis. These arrays provide large amounts of data. This data is analyzed using statistical methods that quite often discard a large portion of the information. Most of the information that is lost comes from probes that systematically fail across chips and from batch effects. The aim of this study was to develop a comprehensive model for hybridization that predicts probe intensities for Affymetrix arrays and that could provide a basis for improved microarray analysis and probe development. The first part of the model calculates probe binding affinities to all the possible targets in the hybridization solution using the Langmuir isotherm. In the second part of the model we integrate details that are specific to each experiment and contribute to the differences between hybridization in solution and on the microarray. These details include fragmentation, wash stringency, temperature, salt concentration, and scanner settings. Furthermore, the model fits probe synthesis efficiency and target concentration parameters directly to the data. All the parameters used in the model have a well-established physical origin. RESULTS For the 302 chips that were analyzed the mean correlation between expected and observed probe intensities was 0.701 with a range of 0.88 to 0.55. All available chips were included in the analysis regardless of the data quality. Our results show that batch effects arise from differences in probe synthesis, scanner settings, wash strength, and target fragmentation. We also show that probe synthesis efficiencies for different nucleotides are not uniform. CONCLUSIONS To date this is the most complete model for binding on microarrays. This is the first model that includes both probe synthesis efficiency and hybridization kinetics/cross-hybridization. These two factors are sequence dependent and have a large impact on probe intensity. The results presented here provide novel insight into the effect of probe synthesis errors on Affymetrix microarrays; furthermore, the algorithms developed in this work provide useful tools for the analysis of cross-hybridization, probe synthesis efficiency, fragmentation, wash stringency, temperature, and salt concentration on microarray intensities.
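Illustrative sketch (not part of the cited abstract): the first modelling step mentioned above, binding of one probe to several possible targets under a Langmuir isotherm, can be written in its textbook competitive form. Target j occupies a fraction K_j c_j / (1 + Σ_i K_i c_i) of the probe sites, with K = exp(−ΔG / RT). The Python below is a minimal sketch of that generic form only. The free energies, concentrations, temperature, and function names are hypothetical, ΔG is treated as temperature-independent for simplicity, and none of the experiment-specific terms of the published model (fragmentation, washing, scanner settings, synthesis efficiency) are included.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal / (mol K)

def binding_constant(delta_g, temp_k):
    """Equilibrium constant K = exp(-dG / (R T)) for a free energy in kcal/mol."""
    return math.exp(-delta_g / (R_KCAL * temp_k))

def occupancy(conc, delta_g, temp_k):
    """Competitive Langmuir isotherm.

    Returns, for each target, the fraction of probe sites it occupies when
    all targets compete for the same probe.
    """
    kc = [binding_constant(g, temp_k) * c for g, c in zip(delta_g, conc)]
    denom = 1.0 + sum(kc)
    return [x / denom for x in kc]

# Hypothetical example: a perfect-match target and a cross-hybridizing target,
# both at 1 nM, hybridized at 45 degrees Celsius.
conc = [1e-9, 1e-9]            # mol/L
delta_g = [-14.0, -10.0]       # kcal/mol, assumed values
theta = occupancy(conc, delta_g, temp_k=318.15)
print("perfect match occupancy:", theta[0])
print("cross-hybridization occupancy:", theta[1])
```

In this form, cross-hybridization enters through the shared denominator: any additional target that binds the probe reduces the occupancy available to the intended one.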
Affiliation(s)
- Yasminka A Jakubek
  - Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, USA
12
Loh WY, Piper ME, Schlam TR, Fiore MC, Smith SS, Jorenby DE, Cook JW, Bolt DM, Baker TB. Should all smokers use combination smoking cessation pharmacotherapy? Using novel analytic methods to detect differential treatment effects over 8 weeks of pharmacotherapy. Nicotine Tob Res 2012; 14:131-41. [PMID: 22180577 PMCID: PMC3265742 DOI: 10.1093/ntr/ntr147] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Received: 01/24/2011] [Accepted: 06/09/2011] [Indexed: 12/18/2022]
Abstract
INTRODUCTION Combination pharmacotherapy for smoking cessation has been shown to be more effective than monotherapy in meta-analyses. We address the question of whether combination pharmacotherapy should be used routinely with smokers or if some types of smokers show little or no benefit from combination pharmacotherapy versus monotherapy. METHODS Two smoking cessation trials were conducted using the same assessments and medications (bupropion, nicotine lozenge, nicotine patch, bupropion + lozenge, and patch + lozenge). Participants were smokers presenting either to primary care clinics in southeastern Wisconsin for medical treatment (Effectiveness trial, N = 1,346) or volunteering for smoking cessation treatment at smoking cessation clinics in Madison and Milwaukee, WI (Efficacy trial, N = 1,504). For each trial, decision tree analyses identified variables predicting outcome from combination pharmacotherapy versus monotherapy at the end of treatment (smoking 8 weeks after the target quit day). RESULTS All smokers tended to benefit from combination pharmacotherapy except those low in nicotine dependence (longer latency to smoke in the morning as per item 1 of the Fagerström Test of Nicotine Dependence) who also lived with a spouse or partner who smoked. CONCLUSIONS Combination pharmacotherapy was generally more effective than monotherapy among smokers, but one group of smokers, those who were low in nicotine dependence and who lived with a smoking spouse, did not show greater benefit from using combination pharmacotherapy. Use of monotherapy with these smokers might be justified considering the expense and side effects of combination pharmacotherapy.
Affiliation(s)
- Wei-Yin Loh
  - Department of Statistics, University of Wisconsin, Madison, WI
- Megan E. Piper
  - Center for Tobacco Research and Intervention, Department of Medicine, University of Wisconsin School of Medicine and Public Health, Madison, WI
- Tanya R. Schlam
  - Center for Tobacco Research and Intervention, Department of Medicine, University of Wisconsin School of Medicine and Public Health, Madison, WI
- Michael C. Fiore
  - Center for Tobacco Research and Intervention, Department of Medicine, University of Wisconsin School of Medicine and Public Health, Madison, WI
  - Department of Medicine, University of Wisconsin School of Medicine and Public Health, Madison, WI
- Stevens S. Smith
  - Center for Tobacco Research and Intervention, Department of Medicine, University of Wisconsin School of Medicine and Public Health, Madison, WI
  - Department of Medicine, University of Wisconsin School of Medicine and Public Health, Madison, WI
- Douglas E. Jorenby
  - Center for Tobacco Research and Intervention, Department of Medicine, University of Wisconsin School of Medicine and Public Health, Madison, WI
- Jessica W. Cook
  - Center for Tobacco Research and Intervention, Department of Medicine, University of Wisconsin School of Medicine and Public Health, Madison, WI
- Daniel M. Bolt
  - Center for Tobacco Research and Intervention, Department of Medicine, University of Wisconsin School of Medicine and Public Health, Madison, WI
  - Department of Educational Psychology, University of Wisconsin School of Medicine and Public Health, Madison, WI
- Timothy B. Baker
  - Center for Tobacco Research and Intervention, Department of Medicine, University of Wisconsin School of Medicine and Public Health, Madison, WI
  - Department of Medicine, University of Wisconsin School of Medicine and Public Health, Madison, WI
13
Ophir R, Sherman A. Self-custom-made SFP arrays for nonmodel organisms. Methods Mol Biol 2012; 815:39-47. [PMID: 22130982 DOI: 10.1007/978-1-61779-424-7_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]
Abstract
Successful genetic mapping is dependent upon a high-density set of markers. Therefore, tools for high-throughput discovery of genetic variation are essential. The most abundant genetic marker is the single-nucleotide polymorphism (SNP). However, except for model organisms, genomic information is still limited. Although high-throughput genomic sequencing technologies are becoming relatively inexpensive, only low-throughput genetic markers are accessible (e.g., simple sequence repeats). The use of sequencing for the discovery and screening of high-density genetic variation in whole populations is still expensive. Alternatively, hybridization of genomic DNA (gDNA) on a reference (either genome or transcriptome) is an efficient approach for genetic screening without knowing the alleles in advance (Borevitz et al. Proc Natl Acad Sci USA 104:12057-12062). We describe a protocol for the design of probes for a high-throughput genetic-marker discovery microarray, termed single feature polymorphism (SFP) array. Starting with consensus cDNA sequences (UniGenes), we use OligoWiz to design Tm-optimized 50-bp long oligonucleotide probes (Ophir et al. BMC Genomics 11:269, 2010). This design is similar to expression arrays and we point out the differences.
Affiliation(s)
- Ron Ophir
  - Institute of Plant Sciences, Agricultural Research Organization, Volcani Research Center, Bet Dagan 50250, Israel.
14
Sykacek P, Kreil DP, Meadows LA, Auburn RP, Fischer B, Russell S, Micklem G. The impact of quantitative optimization of hybridization conditions on gene expression analysis. BMC Bioinformatics 2011; 12:73. [PMID: 21401920 PMCID: PMC3065421 DOI: 10.1186/1471-2105-12-73] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 08/04/2010] [Accepted: 03/14/2011] [Indexed: 12/05/2022] [Open access]
Abstract
Background With the growing availability of entire genome sequences, an increasing number of scientists can exploit oligonucleotide microarrays for genome-scale expression studies. While probe-design is a major research area, relatively little work has been reported on the optimization of microarray protocols. Results As shown in this study, suboptimal conditions can have considerable impact on biologically relevant observations. For example, deviation from the optimal temperature by one degree Celsius led to a loss of up to 44% of differentially expressed genes identified. While genes from thousands of Gene Ontology categories were affected, transcription factors and other low-copy-number regulators were disproportionately lost. Calibrated protocols are thus required in order to take full advantage of the large dynamic range of microarrays. For an objective optimization of protocols we introduce an approach that maximizes the amount of information obtained per experiment. A comparison of two typical samples is sufficient for this calibration. We can ensure, however, that optimization results are independent of the samples and the specific measures used for calibration. Both simulations and spike-in experiments confirmed an unbiased determination of generally optimal experimental conditions. Conclusions Well calibrated hybridization conditions are thus easily achieved and necessary for the efficient detection of differential expression. They are essential for the sensitive profiling of low-copy-number molecules. This is particularly critical for studies of transcription factor expression, or the inference and study of regulatory networks.
Affiliation(s)
- Peter Sykacek
  - Department of Biotechnology, Boku University, Vienna, A-1190 Muthgasse 18, Austria.
15
Terrat S, Peyretaillade E, Gonçalves O, Dugat-Bony E, Gravelat F, Moné A, Biderre-Petit C, Boucher D, Troquet J, Peyret P. Detecting variants with Metabolic Design, a new software tool to design probes for explorative functional DNA microarray development. BMC Bioinformatics 2010; 11:478. [PMID: 20860850 PMCID: PMC2955052 DOI: 10.1186/1471-2105-11-478] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 05/18/2010] [Accepted: 09/23/2010] [Indexed: 12/15/2022] [Open access]
Abstract
Background Microorganisms display vast diversity, and each one has its own set of genes, cell components and metabolic reactions. To assess their huge unexploited metabolic potential in different ecosystems, we need high throughput tools, such as functional microarrays, that allow the simultaneous analysis of thousands of genes. However, most classical functional microarrays use specific probes that monitor only known sequences, and so fail to cover the full microbial gene diversity present in complex environments. We have thus developed an algorithm, implemented in the user-friendly program Metabolic Design, to design efficient explorative probes. Results First we have validated our approach by studying eight enzymes involved in the degradation of polycyclic aromatic hydrocarbons from the model strain Sphingomonas paucimobilis sp. EPA505 using a designed microarray of 8,048 probes. As expected, microarray assays identified the targeted set of genes induced during biodegradation kinetics experiments with various pollutants. We have then confirmed the identity of these new genes by sequencing, and corroborated the quantitative discrimination of our microarray by quantitative real-time PCR. Finally, we have assessed metabolic capacities of microbial communities in soil contaminated with aromatic hydrocarbons. Results show that our probe design (sensitivity and explorative quality) can be used to study a complex environment efficiently. Conclusions We successfully use our microarray to detect gene expression encoding enzymes involved in polycyclic aromatic hydrocarbon degradation for the model strain. In addition, DNA microarray experiments performed on soil polluted by organic pollutants without prior sequence assumptions demonstrate high specificity and sensitivity for gene detection. Metabolic Design is thus a powerful, efficient tool that can be used to design explorative probes and monitor metabolic pathways in complex environments, and it may also be used to study any group of genes. The Metabolic Design software is freely available from the authors and can be downloaded and modified under general public license.
Affiliation(s)
- Sébastien Terrat
  - Clermont Université, Université d'Auvergne, Laboratoire: Microorganismes Génome et Environnement, BP 10448, F-63000 Clermont-Ferrand, France