1. Nimse SB, Song KS, Warkad SD, Kim T. A Novel Method That Allows SNP Discrimination with 160:1 Ratio for Biosensors Based on DNA-DNA Hybridization. Biosensors (Basel) 2021; 11:bios11080265. [PMID: 34436067 PMCID: PMC8391390 DOI: 10.3390/bios11080265] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/02/2021] [Revised: 08/02/2021] [Accepted: 08/03/2021] [Indexed: 12/19/2022]
Abstract
Highly sensitive (high SBR) and highly specific (high SNP discrimination ratio) DNA hybridization is essential for a biosensor with clinical application. Herein, we propose a method that allows detecting multiple pathogens on a single platform with the SNP discrimination ratios over 160:1 in the dynamic range of 10^1 to 10^4 copies per test. The newly developed SWAT method allows achieving highly sensitive and highly specific DNA hybridizations. The detection and discrimination of the MTB and NTM strain in the clinical samples with the SBR and SNP discrimination ratios higher than 160:1 indicate the high clinical applicability of the SWAT.
Affiliation(s)
- Satish Balasaheb Nimse
  - Department of Chemistry, Institute for Applied Chemistry, Hallym University, Chuncheon 200-702, Korea
- Keum-Soo Song
  - Biometrix Technology, Inc. 202 BioVenture Plaza, Chuncheon 200-161, Korea (K.-S.S., S.D.W.)
- Taisun Kim
  - Department of Chemistry, Institute for Applied Chemistry, Hallym University, Chuncheon 200-702, Korea
2. Stirmanov YV, Matveeva OV, Nechipurenko YD. Two-dimensional Ising model for microarray hybridization: cooperative interactions between bound target molecules. J Biomol Struct Dyn 2018; 37:3103-3108. [PMID: 30081753 DOI: 10.1080/07391102.2018.1508370] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/28/2022]
Abstract
The Langmuir adsorption model is widely used for description and quantification of microarray oligo-target hybridization. According to the model, the binding centers for adsorption of target molecules from solution are represented by oligo-probes. However, the Langmuir model does not consider the interactions between the targets adsorbed at the neighboring binding centers, which are possible due to high-density of array-bound probes. We have shown that the two-dimensional Ising model, which takes into account the nearest neighboring target molecules interactions, better describes the experimental data of oligo-target hybridization in comparison with the Langmuir model. Thus, we found an evidence for existence of positive cooperative interactions between adsorbed target molecules: so, binding of the first target molecules facilitates the binding of subsequent ones to the neighboring probes. Communicated by Ramaswamy H. Sarma.
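To illustrate the contrast this abstract draws, the sketch below compares the Langmuir coverage with a coverage that includes an attractive neighbor interaction. It is only a mean-field (Fowler-Guggenheim style) approximation, not the authors' two-dimensional Ising treatment; the function names and the dimensionless attraction strength `alpha` are assumptions introduced here.

```python
import math


def langmuir_coverage(k_c: float) -> float:
    """Fraction of occupied probes without target-target interaction.

    k_c is the binding constant multiplied by the target concentration.
    """
    return k_c / (1.0 + k_c)


def mean_field_coverage(k_c: float, alpha: float, tol: float = 1e-12) -> float:
    """Coverage with an attractive neighbor interaction in mean field.

    Solves theta / (1 - theta) * exp(-alpha * theta) = k_c by bisection.
    alpha is a hypothetical dimensionless attraction strength; for
    0 <= alpha < 4 the left-hand side is monotonic, so the root is unique.
    """
    if k_c < 0.0 or not 0.0 <= alpha < 4.0:
        raise ValueError("need k_c >= 0 and 0 <= alpha < 4")
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lhs = mid / (1.0 - mid) * math.exp(-alpha * mid)
        if lhs < k_c:
            lo = mid  # coverage too small to balance k_c
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With `alpha = 0` the second function reduces to the Langmuir result; with `alpha > 0` the coverage at the same `k_c` is higher, which is the qualitative signature of positive cooperativity described in the abstract.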
Affiliation(s)
- Y V Stirmanov
  - Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
- O V Matveeva
  - Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
- Y D Nechipurenko
  - Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia
3. Pozhitkov AE, Noble PA. Gene Meter: Accurate abundance calculations of gene expression. Commun Integr Biol 2017; 10:e1329785. [PMID: 28919937 PMCID: PMC5595416 DOI: 10.1080/19420889.2017.1329785] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 03/10/2017] [Revised: 05/05/2017] [Accepted: 05/08/2017] [Indexed: 12/28/2022]
Abstract
We previously reported that thousands of transcripts in the mouse and zebrafish significantly increased in abundance in a time series spanning from life to several days after death. Transcript abundances were determined by: calibrating each microarray probe using a dilution series of pooled RNAs, fitting the probe-responses to adsorption models, and back-calculating abundances using the probe signal intensity of a sample and the best fitting model. The accuracy of the abundance measurements was not assessed in our previous study because individual transcript concentrations in the calibration pool were not known. Accurate transcript abundances are highly desired for modeling the dynamics of biological systems and investigating how systems respond to perturbations. In this study, we show that accurate transcript abundances can be determined by calibrating the probes using a calibration pool of transcripts with known concentrations. Instructions for determining accurate transcript abundances using the Gene Meter approach are provided.
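The calibrate, fit, and back-calculate steps named in this abstract can be sketched as follows. This is a simplified illustration, not the Gene Meter code: it assumes a single Langmuir response I = I_max * c / (K_d + c) with zero background and fits it by a double-reciprocal straight line, whereas the authors compare several adsorption models and keep the best one. All function and parameter names are hypothetical.

```python
def fit_langmuir(concs, signals):
    """Fit I = I_max * c / (K_d + c) to a dilution series.

    Uses the linear form 1/I = 1/I_max + (K_d / I_max) * (1/c) and an
    ordinary least-squares line through the points (1/c, 1/I).
    Returns (i_max, k_d).
    """
    xs = [1.0 / c for c in concs]
    ys = [1.0 / s for s in signals]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    i_max = 1.0 / intercept
    k_d = slope * i_max
    return i_max, k_d


def back_calculate(signal, i_max, k_d):
    """Invert the fitted curve to get the concentration for one signal."""
    if not 0.0 < signal < i_max:
        raise ValueError("signal outside the invertible range of the fit")
    return k_d * signal / (i_max - signal)


# Synthetic, noise-free calibration series (illustrative values only).
concs = [1.0, 2.0, 4.0, 8.0, 16.0]
signals = [1000.0 * c / (5.0 + c) for c in concs]
i_max, k_d = fit_langmuir(concs, signals)
```

The double-reciprocal fit is chosen here only because it needs no external libraries; with noisy data a direct nonlinear fit would usually be preferable, since the reciprocal transform overweights low-signal points.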
Affiliation(s)
- Alexander E Pozhitkov
  - City of Hope, Information Sciences-Beckman Research Institute, Irwindale, CA
  - Max-Planck-Institute for Evolutionary Biology, Ploen, Germany
- Peter A Noble
  - Department of Periodontics, University of Washington, Seattle, WA, USA
4. Erburu M, Cajaleon L, Guruceaga E, Venzala E, Muñoz-Cobo I, Beltrán E, Puerta E, Tordera R. Chronic mild stress and imipramine treatment elicit opposite changes in behavior and in gene expression in the mouse prefrontal cortex. Pharmacol Biochem Behav 2015; 135:227-36. [DOI: 10.1016/j.pbb.2015.06.001] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 04/14/2015] [Revised: 05/28/2015] [Accepted: 06/01/2015] [Indexed: 01/22/2023]
5. Dally S, Rupp S, Lemuth K, Hartmann SC, Hiller E, Bailer SM, Knabbe C, Weile J. Single-stranded DNA catalyzes hybridization of PCR-products to microarray capture probes. PLoS One 2014; 9:e102338. [PMID: 25025686 PMCID: PMC4099319 DOI: 10.1371/journal.pone.0102338] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/11/2014] [Accepted: 06/18/2014] [Indexed: 11/18/2022]
Abstract
Since its development, microarray technology has evolved to a standard method in the biotechnological and medical field with a broad range of applications. Nevertheless, the underlying mechanism of the hybridization process of PCR-products to microarray capture probes is still not completely understood, and several observed phenomena cannot be explained with current models. We investigated the influence of several parameters on the hybridization reaction and identified ssDNA to play a major role in the process. An increase of the ssDNA content in a hybridization reaction strongly enhanced resulting signal intensities. A strong influence could also be observed when unlabeled ssDNA was added to the hybridization reaction. A reduction of the ssDNA content resulted in a massive decrease of the hybridization efficiency. According to these data, we developed a novel model for the hybridization mechanism. This model is based on the assumption that single stranded DNA is necessary as catalyst to induce the hybridization of dsDNA. The developed hybridization model is capable of giving explanations for several yet unresolved questions regarding the functionality of microarrays. Our findings not only deepen the understanding of the hybridization process, but also have immediate practical use in data interpretation and the development of new microarrays.
Affiliation(s)
- Simon Dally
  - Institute for Laboratory and Transfusion Medicine, Heart and Diabetes Center North Rhine-Westphalia, Bad Oeynhausen, Germany
  - Department of Molecular Biotechnology, Fraunhofer Institute for Interfacial Engineering and Biotechnology, Stuttgart, Germany
- Steffen Rupp
  - Department of Molecular Biotechnology, Fraunhofer Institute for Interfacial Engineering and Biotechnology, Stuttgart, Germany
- Karin Lemuth
  - Department of Molecular Biotechnology, Fraunhofer Institute for Interfacial Engineering and Biotechnology, Stuttgart, Germany
- Stefan C. Hartmann
  - Department of Molecular Biotechnology, Fraunhofer Institute for Interfacial Engineering and Biotechnology, Stuttgart, Germany
- Ekkehard Hiller
  - Department of Molecular Biotechnology, Fraunhofer Institute for Interfacial Engineering and Biotechnology, Stuttgart, Germany
- Susanne M. Bailer
  - Department of Molecular Biotechnology, Fraunhofer Institute for Interfacial Engineering and Biotechnology, Stuttgart, Germany
- Cornelius Knabbe
  - Institute for Laboratory and Transfusion Medicine, Heart and Diabetes Center North Rhine-Westphalia, Bad Oeynhausen, Germany
- Jan Weile
  - Institute for Laboratory and Transfusion Medicine, Heart and Diabetes Center North Rhine-Westphalia, Bad Oeynhausen, Germany
6. Pozhitkov AE, Noble PA, Bryk J, Tautz D. A revised design for microarray experiments to account for experimental noise and uncertainty of probe response. PLoS One 2014; 9:e91295. [PMID: 24618910 PMCID: PMC3949741 DOI: 10.1371/journal.pone.0091295] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 09/18/2013] [Accepted: 02/11/2014] [Indexed: 11/18/2022]
Abstract
Background: Although microarrays are analysis tools in biomedical research, they are known to yield noisy output that usually requires experimental confirmation. To tackle this problem, many studies have developed rules for optimizing probe design and devised complex statistical tools to analyze the output. However, less emphasis has been placed on systematically identifying the noise component as part of the experimental procedure. One source of noise is the variance in probe binding, which can be assessed by replicating array probes. The second source is poor probe performance, which can be assessed by calibrating the array based on a dilution series of target molecules. Using model experiments for copy number variation and gene expression measurements, we investigate here a revised design for microarray experiments that addresses both of these sources of variance. Results: Two custom arrays were used to evaluate the revised design: one based on 25 mer probes from an Affymetrix design and the other based on 60 mer probes from an Agilent design. To assess experimental variance in probe binding, all probes were replicated ten times. To assess probe performance, the probes were calibrated using a dilution series of target molecules and the signal response was fitted to an adsorption model. We found that significant variance of the signal could be controlled by averaging across probes and removing probes that are nonresponsive or poorly responsive in the calibration experiment. Taking this into account, one can obtain a more reliable signal with the added option of obtaining absolute rather than relative measurements. Conclusion: The assessment of technical variance within the experiments, combined with the calibration of probes allows to remove poorly responding probes and yields more reliable signals for the remaining ones. Once an array is properly calibrated, absolute quantification of signals becomes straight forward, alleviating the need for normalization and reference hybridizations.
Affiliation(s)
- Alex E. Pozhitkov
  - Max-Planck-Institut für Evolutionsbiologie, Plön, Germany
  - Department of Periodontics, School of Dentistry, University of Washington, Seattle, Washington, United States of America
- Peter A. Noble
  - Department of Periodontics, School of Dentistry, University of Washington, Seattle, Washington, United States of America
  - Ph.D Microbiology Program, Department of Biological Sciences, Alabama State University, Montgomery, Alabama, United States of America
- Jarosław Bryk
  - Max-Planck-Institut für Evolutionsbiologie, Plön, Germany
  - National Centre for Biotechnology Education, University of Reading, Reading, United Kingdom
- Diethard Tautz
  - Max-Planck-Institut für Evolutionsbiologie, Plön, Germany
7. Hitzemann R, Darakjian P, Walter N, Iancu OD, Searles R, McWeeney S. Introduction to sequencing the brain transcriptome. Int Rev Neurobiol 2014; 116:1-19. [PMID: 25172469 DOI: 10.1016/b978-0-12-801105-8.00001-1] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 12/21/2022]
Abstract
High-throughput next-generation sequencing is now entering its second decade. However, it was not until 2008 that the first report of sequencing the brain transcriptome appeared (Mortazavi, Williams, Mccue, Schaeffer, & Wold, 2008). These authors compared short-read RNA-Seq data for mouse whole brain with microarray results for the same sample and noted both the advantages and disadvantages of the RNA-Seq approach. While RNA-Seq provided exon level resolution, the majority of the reads were provided by a small proportion of highly expressed genes and the data analysis was exceedingly complex. Over the past 6 years, there have been substantial improvements in both RNA-Seq technology and data analysis. This volume contains 11 chapters that detail various aspects of sequencing the brain transcriptome. Some of the chapters are very methods driven, while others focus on the use of RNA-Seq to study such diverse areas as development, schizophrenia, and drug abuse. This chapter briefly reviews the transition from microarrays to RNA-Seq as the preferred method for analyzing the brain transcriptome. Compared with microarrays, RNA-Seq has a greater dynamic range, detects both coding and noncoding RNAs, is superior for gene network construction, detects alternative spliced transcripts, and can be used to extract genotype information, e.g., nonsynonymous coding single nucleotide polymorphisms. RNA-Seq embraces the complexity of the brain transcriptome and provides a mechanism to understand the underlying regulatory code; the potential to inform the brain-behavior-disease relationships is substantial.
Affiliation(s)
- Robert Hitzemann
  - Department of Behavioral Neuroscience, Oregon Health & Science University, Portland, Oregon, USA
  - Research Service, Veterans Affairs Medical Center, Portland, Oregon, USA
- Priscila Darakjian
  - Department of Behavioral Neuroscience, Oregon Health & Science University, Portland, Oregon, USA
- Nikki Walter
  - Department of Behavioral Neuroscience, Oregon Health & Science University, Portland, Oregon, USA
  - Research Service, Veterans Affairs Medical Center, Portland, Oregon, USA
- Ovidiu Dan Iancu
  - Department of Behavioral Neuroscience, Oregon Health & Science University, Portland, Oregon, USA
- Robert Searles
  - Integrative Genomics Laboratory, Oregon Health & Science University, Portland, Oregon, USA
- Shannon McWeeney
  - Oregon Clinical and Translational Research Institute, Oregon Health & Science University, Portland, Oregon, USA
  - Division of Biostatistics, Public Health & Preventative Medicine, Oregon Health & Science University, Portland, Oregon, USA
8. Wen Y, Li M, Fu WJ. Catching the genomic wave in oligonucleotide single-nucleotide polymorphism arrays by modeling sequence binding. J Comput Biol 2013; 20:514-23. [PMID: 23763671 DOI: 10.1089/cmb.2011.0102] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/13/2022]
Abstract
The genomic wave has been identified as a major artifact in genome data and is highly correlated with the sequence GC content. Although statistical methods have been developed to filter this artifact, the mechanism underlying the genomic wave has not been studied yet. Understanding of the artifact, specifically the sources of the artifact, may lead to successful separation of biological signals from the artifact and improve array design, modeling, and association studies. We develop an approach to catching the genomic wave in the oligonucleotide single-nucleotide polymorphism (SNP) arrays by separating biological signals from the array baseline background through modeling sequence binding with a newly developed probe intensity composite representation (PICR) model. The PICR model decomposes the probe intensity of each SNP probe set into the target sequence concentrations, SNP-specific background (nonsignal) and measurement error, and identifies the biological signals through the target concentration for each allele. We demonstrate with the Affymetrix GeneChip 500K HapMap data and the Wellcome Trust Case-Control Study data that the genomic wave is captured through the SNP-specific background term of the PICR model, and is separated successfully from the allelic target concentrations, i.e., the biological signals. We further identify two important sources of the genomic waves, the GC content and the fragment length (FL) of the sequence, and conclude that (1) the genomic wave artifact can be removed from the genome data with the PICR model, and (2) in addition to the GC content, the genomic wave also has a component of nonlinear effect of the FL.
Affiliation(s)
- Yalu Wen
  - The Computational Genomics Lab, Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, Michigan 48824, USA
9. Hitzemann R, Bottomly D, Darakjian P, Walter N, Iancu O, Searles R, Wilmot B, McWeeney S. Genes, behavior and next-generation RNA sequencing. Genes Brain Behav 2013; 12:1-12. [PMID: 23194347 PMCID: PMC6050050 DOI: 10.1111/gbb.12007] [Citation(s) in RCA: 69] [Impact Index Per Article: 6.3] [Received: 04/19/2012] [Revised: 10/31/2012] [Accepted: 11/21/2012] [Indexed: 12/30/2022]
Abstract
Advances in next-generation sequencing suggest that RNA-Seq is poised to supplant microarray-based approaches for transcriptome analysis. This article briefly reviews the use of microarrays in the brain-behavior context and then illustrates why RNA-Seq is a superior strategy. Compared with microarrays, RNA-Seq has a greater dynamic range, detects both coding and noncoding RNAs, is superior for gene network construction, detects alternative spliced transcripts, detects allele specific expression and can be used to extract genotype information, e.g. nonsynonymous coding single nucleotide polymorphisms. Examples of where RNA-Seq has been used to assess brain gene expression are provided. Despite the advantages of RNA-Seq, some disadvantages remain. These include the high cost of RNA-Seq and the computational complexities associated with data analysis. RNA-Seq embraces the complexity of the transcriptome and provides a mechanism to understand the underlying regulatory code; the potential to inform the brain-behavior relationship is substantial.
Affiliation(s)
- R Hitzemann
  - Department of Behavioral Neuroscience, Oregon Health & Science University, Portland, OR 97239-3098, USA
10. Harrison A, Binder H, Buhot A, Burden CJ, Carlon E, Gibas C, Gamble LJ, Halperin A, Hooyberghs J, Kreil DP, Levicky R, Noble PA, Ott A, Pettitt BM, Tautz D, Pozhitkov AE. Physico-chemical foundations underpinning microarray and next-generation sequencing experiments. Nucleic Acids Res 2013; 41:2779-96. [PMID: 23307556 PMCID: PMC3597649 DOI: 10.1093/nar/gks1358] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.2] [Indexed: 11/13/2022]
Abstract
Hybridization of nucleic acids on solid surfaces is a key process involved in high-throughput technologies such as microarrays and, in some cases, next-generation sequencing (NGS). A physical understanding of the hybridization process helps to determine the accuracy of these technologies. The goal of a widespread research program is to develop reliable transformations between the raw signals reported by the technologies and individual molecular concentrations from an ensemble of nucleic acids. This research has inputs from many areas, from bioinformatics and biostatistics, to theoretical and experimental biochemistry and biophysics, to computer simulations. A group of leading researchers met in Ploen Germany in 2011 to discuss present knowledge and limitations of our physico-chemical understanding of high-throughput nucleic acid technologies. This meeting inspired us to write this summary, which provides an overview of the state-of-the-art approaches based on physico-chemical foundation to modeling of the nucleic acids hybridization process on solid surfaces. In addition, practical application of current knowledge is emphasized.
Affiliation(s)
- Andrew Harrison
  - University of Essex-Mathematical Sciences, Colchester CO4 3SQ, Essex, United Kingdom
11. Crabbe JC, Kendler KS, Hitzemann RJ. Modeling the diagnostic criteria for alcohol dependence with genetic animal models. Curr Top Behav Neurosci 2013; 13:187-221. [PMID: 21910077 PMCID: PMC3371181 DOI: 10.1007/7854_2011_162] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Indexed: 05/31/2023]
Abstract
A diagnosis of alcohol dependence (AD) using the DSM-IV-R is categorical, based on an individual's manifestation of three or more symptoms from a list of seven. AD risk can be traced to both genetic and environmental sources. Most genetic studies of AD risk implicitly assume that an AD diagnosis represents a single underlying genetic factor. We recently found that the criteria for an AD diagnosis represent three somewhat distinct genetic paths to individual risk. Specifically, heavy use and tolerance versus withdrawal and continued use despite problems reflected separate genetic factors. However, some data suggest that genetic risk for AD is adequately described with a single underlying genetic risk factor. Rodent animal models for alcohol-related phenotypes typically target discrete aspects of the complex human AD diagnosis. Here, we review the literature derived from genetic animal models in an attempt to determine whether they support a single-factor or multiple-factor genetic structure. We conclude that there is modest support in the animal literature that alcohol tolerance and withdrawal reflect distinct genetic risk factors, in agreement with our human data. We suggest areas where more research could clarify this attempt to align the rodent and human data.
Affiliation(s)
- John C Crabbe
  - Portland Alcohol Research Center, Department of Behavioral Neuroscience, Oregon Health & Science University, Portland, OR 97239, USA
12. Hadiwikarta WW, Walter JC, Hooyberghs J, Carlon E. Probing hybridization parameters from microarray experiments: nearest-neighbor model and beyond. Nucleic Acids Res 2012; 40:e138. [PMID: 22661582 PMCID: PMC3467032 DOI: 10.1093/nar/gks475] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 11/14/2022]
Abstract
In this article, it is shown how optimized and dedicated microarray experiments can be used to study the thermodynamics of DNA hybridization for a large number of different conformations in a highly parallel fashion. In particular, free energy penalties for mismatches are obtained in two independent ways and are shown to be correlated with values from melting experiments in solution reported in the literature. The additivity principle, which is at the basis of the nearest-neighbor model, and according to which the penalty for two isolated mismatches is equal to the sum of the independent penalties, is thoroughly tested. Additivity is shown to break down for a mismatch distance below 5 nt. The behavior of mismatches in the vicinity of the helix edges, and the behavior of tandem mismatches are also investigated. Finally, some thermodynamic outlying sequences are observed and highlighted. These sequences contain combinations of GA mismatches. The analysis of the microarray data reported in this article provides new insights on the DNA hybridization parameters and can help to increase the accuracy of hybridization-based technologies.
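The additivity principle tested in this abstract can be written compactly. The notation below is an assumption introduced for illustration (the abstract gives no symbols): $\Delta\Delta G(m)$ is the free energy penalty of an isolated mismatch $m$ relative to the perfect match, and $d(m_1, m_2)$ is the distance between two mismatches in nucleotides.

```latex
\Delta\Delta G(m_1, m_2) \;\approx\; \Delta\Delta G(m_1) + \Delta\Delta G(m_2)
\qquad \text{for } d(m_1, m_2) \ge 5\ \text{nt}
```

According to the abstract, this relation stops holding when the two mismatches are closer than 5 nt; the abstract does not state the form of the deviation, so none is written here.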
Affiliation(s)
- W W Hadiwikarta
  - Flemish Institute for Technological Research, VITO, Boeretang 200, B-2400 Mol, Belgium
13. Identification of non-specific hybridization using an empirical equation fitted to non-equilibrium dissociation curves. J Microbiol Methods 2012; 90:29-35. [PMID: 22537822 DOI: 10.1016/j.mimet.2012.04.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2011] [Revised: 04/04/2012] [Accepted: 04/06/2012] [Indexed: 11/22/2022]
Abstract
Non-equilibrium dissociation curves (NEDCs) have the potential to identify non-specific hybridizations on high throughput, diagnostic microarrays. We report a simple method for the identification of non-specific signals by using a new parameter that does not rely on comparison of perfect match and mismatch dissociations. The parameter is the ratio of specific dissociation temperature (T(d-w)) to theoretical melting temperature (T(m)) and can be obtained by automated fitting of a four-parameter, sigmoid, empirical equation to the thousands of curves generated in a typical experiment. The curves fit perfect match NEDCs from an initial experiment with an R(2) of 0.998±0.006 and root mean square of 108±91 fluorescent units. Receiver operating characteristic curve analysis showed low temperature hybridization signals (20-48°C) to be as effective as area under the curve as primary data filters. Evaluation of three datasets that target 16S rRNA and functional genes with varying degrees of target sequence similarity showed that filtering out hybridizations with T(d-w)/T(m)<0.78 greatly reduced false positive results. In conclusion, T(d-w)/T(m) successfully screened many non-specific hybridizations that could not be identified using single temperature signal intensities alone, while the empirical modeling allowed a simplified approach to the high throughput analysis of thousands of NEDCs.
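The filtering rule quoted in this abstract (discard hybridizations with T(d-w)/T(m) < 0.78) is simple enough to sketch. The record layout, field names, and example temperatures below are hypothetical, and the abstract does not say which temperature scale enters the ratio, so the sketch only assumes that both values use the same scale as in the paper. The four-parameter sigmoid that yields T(d-w) is not given in the abstract and is not reproduced here.

```python
THRESHOLD = 0.78  # cut-off quoted in the abstract


def passes_filter(t_dw: float, t_m: float, threshold: float = THRESHOLD) -> bool:
    """Return True when the T(d-w)/T(m) ratio reaches the threshold."""
    return (t_dw / t_m) >= threshold


def filter_hybridizations(records, threshold: float = THRESHOLD):
    """Keep only records whose ratio passes; the rest are treated as non-specific."""
    return [r for r in records if passes_filter(r["t_dw"], r["t_m"], threshold)]


# Made-up example values, for illustration only.
records = [
    {"probe": "A", "t_dw": 50.0, "t_m": 60.0},  # ratio about 0.83, kept
    {"probe": "B", "t_dw": 40.0, "t_m": 60.0},  # ratio about 0.67, dropped
]
kept = filter_hybridizations(records)
```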
14. Loakes D. Nucleotides and nucleic acids; oligo- and polynucleotides. Organophosphorus Chemistry 2012. [DOI: 10.1039/9781849734875-00169] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 01/29/2023]
Affiliation(s)
- David Loakes
  - Medical Research Council Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, UK
15. Berger F, Carlon E. From hybridization theory to microarray data analysis: performance evaluation. BMC Bioinformatics 2011; 12:464. [PMID: 22136743 PMCID: PMC3267830 DOI: 10.1186/1471-2105-12-464] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 08/11/2011] [Accepted: 12/02/2011] [Indexed: 02/05/2023]
Abstract
Background: Several preprocessing methods are available for the analysis of Affymetrix Genechips arrays. The most popular algorithms analyze the measured fluorescence intensities with statistical methods. Here we focus on a novel algorithm, AffyILM, available from Bioconductor, which relies on inputs from hybridization thermodynamics and uses an extended Langmuir isotherm model to compute transcript concentrations. These concentrations are then employed in the statistical analysis. We compared the performance of AffyILM and other traditional methods both in the old and in the newest generation of GeneChips. Results: Tissue mixture and Latin Square datasets (provided by Affymetrix) were used to assess the performances of the differential expression analysis depending on the preprocessing strategy. A correlation analysis conducted on the tissue mixture data reveals that the median-polish algorithm allows to best summarize AffyILM concentrations computed at the probe-level. Those correlation results are equivalent to the best correlations observed using popular preprocessing methods relying on intensity values. The performances of each tested preprocessing algorithm were quantified using the Latin Square HG-U133A dataset, thanks to the comparison of differential analysis results with the list of spiked genes. The figures of merit generated illustrates that the performances associated to AffyILM(medianpolish), inferred from the present statistical analysis, are comparable to the best performing strategies previously reported. Conclusions: Converting probe intensities to estimates of target concentrations prior to the statistical analysis, AffyILM(medianpolish) is one of the best performing strategy currently available. Using hybridization theory, probe-level estimates of target concentrations should be identically distributed. In the future, a probe-level multivariate analysis of the concentrations should be compared to the univariate analysis of probe-set summarized expression data.
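Since this abstract singles out median polish as the best way to summarize probe-level concentrations, a minimal pure-Python sketch of Tukey's median polish is given below. It is not the AffyILM or Bioconductor implementation; the function name, the fixed iteration count, and the probes-by-arrays table layout are assumptions made for illustration.

```python
from statistics import median


def median_polish(table, n_iter: int = 10):
    """Decompose table[i][j] into overall + row_eff[i] + col_eff[j] + resid[i][j].

    Rows are taken to be probes and columns arrays (an assumed layout).
    Alternately sweeps row medians and column medians out of the residuals.
    """
    n_rows, n_cols = len(table), len(table[0])
    resid = [list(map(float, row)) for row in table]
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(n_iter):
        # Row sweep: move each row's median into the row effect.
        for i in range(n_rows):
            m = median(resid[i])
            row_eff[i] += m
            resid[i] = [v - m for v in resid[i]]
        m = median(col_eff)
        overall += m
        col_eff = [c - m for c in col_eff]
        # Column sweep: move each column's median into the column effect.
        for j in range(n_cols):
            m = median([resid[i][j] for i in range(n_rows)])
            col_eff[j] += m
            for i in range(n_rows):
                resid[i][j] -= m
        m = median(row_eff)
        overall += m
        row_eff = [r - m for r in row_eff]
    return overall, row_eff, col_eff, resid
```

In a probe-set summary of this kind, `overall + col_eff[j]` would serve as the summarized value for array `j`, with the row effects absorbing probe-specific offsets.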
Affiliation(s)
- Fabrice Berger
  - Institute for Theoretical Physics, KULeuven, Celestijnenlaan 200D, B-3001 Leuven, Belgium
16. Overall CC, Carr DA, Tabari ES, Thompson KJ, Weller JW. ArrayInitiative - a tool that simplifies creating custom Affymetrix CDFs. BMC Bioinformatics 2011; 12:136. [PMID: 21548938 PMCID: PMC3113937 DOI: 10.1186/1471-2105-12-136] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/11/2010] [Accepted: 05/06/2011] [Indexed: 01/05/2023]
Abstract
Background: Probes on a microarray represent a frozen view of a genome and are quickly outdated when new sequencing studies extend our knowledge, resulting in significant measurement error when analyzing any microarray experiment. There are several bioinformatics approaches to improve probe assignments, but without in-house programming expertise, standardizing these custom array specifications as a usable file (e.g. as Affymetrix CDFs) is difficult, owing mostly to the complexity of the specification file format. However, without correctly standardized files there is a significant barrier for testing competing analysis approaches since this file is one of the required inputs for many commonly used algorithms. The need to test combinations of probe assignments and analysis algorithms led us to develop ArrayInitiative, a tool for creating and managing custom array specifications. Results: ArrayInitiative is a standalone, cross-platform, rich client desktop application for creating correctly formatted, custom versions of manufacturer-provided (default) array specifications, requiring only minimal knowledge of the array specification rules and file formats. Users can import default array specifications, import probe sequences for a default array specification, design and import a custom array specification, export any array specification to multiple output formats, export the probe sequences for any array specification and browse high-level information about the microarray, such as version and number of probes. The initial release of ArrayInitiative supports the Affymetrix 3' IVT expression arrays we currently analyze, but as an open source application, we hope that others will contribute modules for other platforms. Conclusions: ArrayInitiative allows researchers to create new array specifications, in a standard format, based upon their own requirements. This makes it easier to test competing design and analysis strategies that depend on probe definitions. Since the custom array specifications are easily exported to the manufacturer's standard format, researchers can analyze these customized microarray experiments using established software tools, such as those available in Bioconductor.
Affiliation(s)
- Christopher C Overall
- Department of Bioinformatics and Genomics, University of North Carolina - Charlotte, Charlotte, NC 28223-0001, USA
|
17
|
Bottomly D, Walter NAR, Hunter JE, Darakjian P, Kawane S, Buck KJ, Searles RP, Mooney M, McWeeney SK, Hitzemann R. Evaluating gene expression in C57BL/6J and DBA/2J mouse striatum using RNA-Seq and microarrays. PLoS One 2011; 6:e17820. [PMID: 21455293 PMCID: PMC3063777 DOI: 10.1371/journal.pone.0017820] [Citation(s) in RCA: 177] [Impact Index Per Article: 13.6] [Received: 12/13/2010] [Accepted: 02/10/2011] [Indexed: 12/14/2022] Open
Abstract
C57BL/6J (B6) and DBA/2J (D2) are two of the most commonly used inbred mouse strains in neuroscience research. However, the only currently available mouse genome is based entirely on the B6 strain sequence. Subsequently, oligonucleotide microarray probes are based solely on this B6 reference sequence, making their application for gene expression profiling comparisons across mouse strains dubious due to their allelic sequence differences, including single nucleotide polymorphisms (SNPs). The emergence of next-generation sequencing (NGS) and the RNA-Seq application provides a clear alternative to oligonucleotide arrays for detecting differential gene expression without the problems inherent to hybridization-based technologies. Using RNA-Seq, an average of 22 million short sequencing reads were generated per sample for 21 samples (10 B6 and 11 D2), and these reads were aligned to the mouse reference genome, allowing 16,183 Ensembl genes to be queried in striatum for both strains. To determine differential expression, 'digital mRNA counting' is applied based on reads that map to exons. The current study compares RNA-Seq (Illumina GA IIx) with two microarray platforms (Illumina MouseRef-8 v2.0 and Affymetrix MOE 430 2.0) to detect differential striatal gene expression between the B6 and D2 inbred mouse strains. We show that by using stringent data processing requirements differential expression as determined by RNA-Seq is concordant with both the Affymetrix and Illumina platforms in more instances than it is concordant with only a single platform, and that instances of discordance with respect to direction of fold change were rare. Finally, we show that additional information is gained from RNA-Seq compared to hybridization-based techniques as RNA-Seq detects more genes than either microarray platform. The majority of genes differentially expressed in RNA-Seq were only detected as present in RNA-Seq, which is important for studies with smaller effect sizes where the sensitivity of hybridization-based techniques could bias interpretation.
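The 'digital mRNA counting' idea mentioned in this abstract (tallying reads that map to exons) can be sketched roughly as follows. This is only an illustration under simplifying assumptions: a single chromosome, half-open `(start, end)` coordinates, and hypothetical names (`count_exonic_reads`, `exons_by_gene`); it is not the pipeline used in the study, whose details are not given here.

```python
def count_exonic_reads(reads, exons_by_gene):
    """Count, per gene, the aligned reads that overlap at least one exon.

    reads: list of (start, end) half-open intervals on one chromosome.
    exons_by_gene: dict mapping a gene name to a list of (start, end) exons.
    Both structures are hypothetical and chosen only for illustration.
    """
    counts = {gene: 0 for gene in exons_by_gene}
    for read_start, read_end in reads:
        for gene, exons in exons_by_gene.items():
            # Two half-open intervals overlap when each starts before the other ends.
            if any(read_start < ex_end and ex_start < read_end
                   for ex_start, ex_end in exons):
                counts[gene] += 1
    return counts


# Toy data: two genes and four 36-base reads (assumed values).
exons = {"geneA": [(100, 200), (300, 400)], "geneB": [(1000, 1100)]}
reads = [(150, 186), (190, 226), (250, 286), (1050, 1086)]
print(count_exonic_reads(reads, exons))
```

The read at `(250, 286)` falls in the intron between the two `geneA` exons, so it contributes to neither gene's count.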
Affiliation(s)
- Daniel Bottomly
- Oregon Clinical and Translational Research Institute, Oregon Health & Science University, Portland, Oregon, United States of America.
|
18
|
Gharaibeh RZ, Fodor AA, Gibas CJ. Accurate estimates of microarray target concentration from a simple sequence-independent Langmuir model. PLoS One 2010; 5:e14464. [PMID: 21209932 PMCID: PMC3012684 DOI: 10.1371/journal.pone.0014464] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 07/07/2010] [Accepted: 12/01/2010] [Indexed: 11/18/2022] Open
Abstract
Background: Microarray technology is a commonly used tool for assessing global gene expression. Many models for estimation of target concentration based on observed microarray signal have been proposed, but, in general, these models have been complex and platform-dependent. Principal Findings: We introduce a universal Langmuir model for estimation of absolute target concentration from microarray experiments. We find that this sequence-independent model, characterized by only three free parameters, yields excellent predictions for four microarray platforms, including Affymetrix, Agilent, Illumina and a custom-printed microarray. The model also accurately predicts concentration for the MAQC data sets. This approach significantly reduces the computational complexity of quantitative target concentration estimates. Conclusions: Using a simple form of the Langmuir isotherm model, with a minimum of parameters and assumptions, and without explicit modeling of individual probe properties, we were able to recover absolute transcript concentrations with high R² on four different array platforms. The results obtained here suggest that with a "spiked-in" concentration series targeting as few as 5–10 genes, reliable estimation of target concentration can be achieved for the entire microarray.
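To make the idea of a three-parameter Langmuir model concrete, the sketch below uses one common textbook form of the isotherm, intensity = background + saturation · c / (c + K), and inverts it to estimate concentration from an observed intensity. The parameter names and the numeric values are assumptions for illustration; the abstract does not state the authors' exact parameterization or fitted values.

```python
def langmuir_intensity(conc, background, saturation, k_half):
    """Predicted signal for target concentration `conc` under a generic
    three-parameter Langmuir isotherm (assumed form, not taken from the paper)."""
    return background + saturation * conc / (conc + k_half)


def langmuir_concentration(intensity, background, saturation, k_half):
    """Invert the isotherm: estimate concentration from an observed intensity.

    From s = saturation * c / (c + k_half), with s = intensity - background,
    it follows that c = k_half * s / (saturation - s).
    """
    signal = intensity - background
    if not 0 <= signal < saturation:
        raise ValueError("intensity outside the invertible range of the model")
    return k_half * signal / (saturation - signal)


# Illustrative (made-up) parameters, e.g. as might be fitted to a spike-in series.
params = dict(background=100.0, saturation=60000.0, k_half=200.0)
observed = langmuir_intensity(50.0, **params)
print(observed, langmuir_concentration(observed, **params))
```

In practice the three parameters would be fitted to a spiked-in concentration series, as the abstract suggests, and the inverse function then applied to every probe on the array; intensities at or above saturation cannot be inverted, which is why the sketch raises an error in that case.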
Affiliation(s)
- Raad Z. Gharaibeh
- Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, Charlotte, North Carolina, United States of America
- Anthony A. Fodor
- Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, Charlotte, North Carolina, United States of America
- Cynthia J. Gibas
- Department of Bioinformatics and Genomics, The University of North Carolina at Charlotte, Charlotte, North Carolina, United States of America
- * E-mail:
|
19
|
|
20
|
Li S, Pozhitkov A, Brouwer M. Linking probe thermodynamics to microarray quantification. Phys Biol 2010; 7:048001; discussion 048002. [DOI: 10.1088/1478-3975/7/4/048001] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 02/06/2023]
|
21
|
Binder H, Krohn K, Burden CJ. Washing scaling of GeneChip microarray expression. BMC Bioinformatics 2010; 11:291. [PMID: 20509934 PMCID: PMC2901370 DOI: 10.1186/1471-2105-11-291] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Received: 10/07/2009] [Accepted: 05/28/2010] [Indexed: 11/17/2022] Open
Abstract
BACKGROUND: Post-hybridization washing is an essential part of microarray experiments. Both the quality of the experimental washing protocol and adequate consideration of washing in intensity calibration ultimately affect the quality of the expression estimates extracted from the microarray intensities. RESULTS: We conducted experiments on GeneChip microarrays with altered protocols for washing, scanning and staining to study the probe-level intensity changes as a function of the number of washing cycles. For calibration and analysis of the intensity data we make use of the 'hook' method which allows intensity contributions due to non-specific and specific hybridization of perfect match (PM) and mismatch (MM) probes to be disentangled in a sequence specific manner. On average, washing according to the standard protocol removes about 90% of the non-specific background and about 30-50% and less than 10% of the specific targets from the MM and PM, respectively. Analysis of the washing kinetics shows that the signal-to-noise ratio doubles roughly every ten stringent washing cycles. Washing can be characterized by time-dependent rate constants which reflect the heterogeneous character of target binding to microarray probes. We propose an empirical washing function which estimates the survival of probe bound targets. It depends on the intensity contribution due to specific and non-specific hybridization per probe which can be estimated for each probe using existing methods. The washing function allows probe intensities to be calibrated for the effect of washing. On a relative scale, proper calibration for washing markedly increases expression measures, especially in the limit of small and large values. CONCLUSIONS: Washing is among the factors which potentially distort expression measures. The proposed first-order correction method allows direct implementation in existing calibration algorithms for microarray data. We provide an experimental 'washing data set' which might be used by the community for developing amendments of the washing correction.
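The reported rule of thumb, that the signal-to-noise ratio doubles roughly every ten stringent washing cycles, corresponds to simple exponential growth, SNR(n) ≈ SNR(0) · 2^(n/10). The sketch below only encodes that arithmetic; it is not the empirical washing function proposed by the authors, whose form is not given in this abstract, and the function name and starting value are assumptions.

```python
def snr_after_washing(snr_initial, cycles, doubling_cycles=10.0):
    """Approximate signal-to-noise ratio after `cycles` stringent washes,
    assuming it doubles every `doubling_cycles` cycles (rule of thumb only)."""
    return snr_initial * 2.0 ** (cycles / doubling_cycles)


# Starting from an assumed SNR of 5, tabulate the projected improvement.
for n in (0, 10, 20, 30):
    print(n, snr_after_washing(5.0, n))
```

Because the abstract also notes that washing removes some specifically bound target (30-50% from MM probes, under 10% from PM probes in the standard protocol), such an extrapolation should not be read as implying that more washing is always better for absolute signal.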
Affiliation(s)
- Hans Binder
- Interdisciplinary Centre for Bioinformatics; Universität Leipzig, D-4107 Leipzig, Haertelstr. 16-18, Germany
- LIFE Center; Universität Leipzig, D-4103 Leipzig, Philipp-Rosenthalstr. 27, Germany
- Knut Krohn
- Interdisciplinary Center for Clinical Research, Medical Faculty; Universität Leipzig, D-04107 Leipzig, Inselstr. 22, Germany
- Conrad J Burden
- Mathematical Sciences Institute, Australian National University, Canberra, A.C.T.0200, Australia
|