1
Detection of breast cancer-related point-mutations using screen-printed and gold-plated electrochemical sensor arrays suitable for point-of-care applications. TALANTA OPEN 2022. [DOI: 10.1016/j.talo.2022.100150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022] Open
2
Comparison of Reproducibility, Accuracy, Sensitivity, and Specificity of miRNA Quantification Platforms. Cell Rep 2020; 29:4212-4222.e5. [PMID: 31851944 PMCID: PMC7499898 DOI: 10.1016/j.celrep.2019.11.078] [Citation(s) in RCA: 59] [Impact Index Per Article: 14.8] [Received: 06/05/2019] [Revised: 10/17/2019] [Accepted: 11/19/2019] [Indexed: 01/08/2023] Open
Abstract
Given the increasing interest in their use as disease biomarkers, the establishment of reproducible, accurate, sensitive, and specific platforms for microRNA (miRNA) quantification in biofluids is of high priority. We compare four platforms for these characteristics: small RNA sequencing (RNA-seq), FirePlex, EdgeSeq, and nCounter. For a pool of synthetic miRNAs, coefficients of variation for technical replicates are lower for EdgeSeq (6.9%) and RNA-seq (8.2%) than for FirePlex (22.4%); nCounter replicates are not performed. Receiver operating characteristic analysis for distinguishing present versus absent miRNAs shows small RNA-seq (area under curve 0.99) is superior to EdgeSeq (0.97), nCounter (0.94), and FirePlex (0.81). Expected differences in expression of placenta-associated miRNAs in plasma from pregnant and non-pregnant women are observed with RNA-seq and EdgeSeq, but not FirePlex or nCounter. These results indicate that differences in performance among miRNA profiling platforms impact ability to detect biological differences among samples and thus their relative utility for research and clinical use. Using pools of synthetic RNA oligonucleotides and standardized extracellular RNA samples, Godoy et al. compare small RNA sequencing to three targeted miRNA quantification platforms to evaluate reproducibility, bias, specificity and sensitivity, and accuracy. Each platform has strengths and limitations important to consider for biomarker discovery, clinical validation, and broad clinical use.
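As a rough illustration of the two metrics reported in this abstract, the sketch below computes a percent coefficient of variation across technical replicates and a rank-based area under the ROC curve for separating present from absent miRNAs. It is not the authors' pipeline: the function names and the toy read counts are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Percent CV: sample standard deviation divided by the mean."""
    return 100.0 * stdev(values) / mean(values)


def roc_auc(present_scores, absent_scores):
    """AUC as the fraction of (present, absent) pairs ranked correctly.

    Ties count as half a win (Mann-Whitney U formulation).
    """
    wins = 0.0
    for p in present_scores:
        for a in absent_scores:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(present_scores) * len(absent_scores))


# Hypothetical data: three technical replicates of one synthetic miRNA,
# and signals for miRNAs known to be present or absent in the pool.
replicates = [100.0, 110.0, 90.0]
present = [50, 40, 30, 5]
absent = [0, 2, 6, 1]

cv_percent = coefficient_of_variation(replicates)
auc = roc_auc(present, absent)
```

A lower CV indicates tighter technical replicates, and an AUC closer to 1 indicates cleaner separation of present from absent species.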
3
Matveeva OV, Ogurtsov AY, Nazipova NN, Shabalina SA. Sequence characteristics define trade-offs between on-target and genome-wide off-target hybridization of oligoprobes. PLoS One 2018; 13:e0199162. [PMID: 29928000 PMCID: PMC6013149 DOI: 10.1371/journal.pone.0199162] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 03/12/2018] [Accepted: 06/02/2018] [Indexed: 12/20/2022] Open
Abstract
Off-target interaction of oligoprobes with partially complementary nucleotide sequences is a problem for many bio-techniques. The goal of the study was to identify oligoprobe sequence characteristics that control the ratio between on-target and off-target hybridization. To understand the complex interplay between specific and genome-wide off-target (cross-hybridization) signals, we analyzed a database derived from genomic comparison hybridization experiments performed with an Affymetrix tiling array. The database included two types of probes with signals derived from (i) a combination of specific signal and cross-hybridization and (ii) genomic cross-hybridization only. All probes from the database were grouped into bins according to their sequence characteristics, and the two hybridization signals were averaged separately within each bin. For selection of specific probes, we analyzed the following sequence characteristics: vulnerability to self-folding, nucleotide composition bias, numbers of G nucleotides and GGG-blocks, and occurrence of the probe's k-mers in the human genome. Increases in bin ranges for these characteristics are accompanied by a decrease in hybridization specificity, defined as the ratio between specific and cross-hybridization signals. However, both averaged hybridization signals grow as probe binding energy increases, with the specific hybridization signal increasing significantly faster than the cross-hybridization signal. The same trend is evident for the S function, which serves as a combined evaluation of probe binding energy and occurrence of the probe's k-mers in the genome. Application of S allows extracting a larger number of specific probes than using binding energy alone. Thus, we showed that high values of specific and cross-hybridization signals are not mutually exclusive for probes with high values of binding energy and S. In this study, the application of a new set of sequence characteristics allows detection of probes that are highly specific to their targets for array design and other bio-techniques that require selection of specific probes.
Affiliation(s)
- Olga V. Matveeva
  - Biopolymer Design LLC, Acton, Massachusetts, United States of America
  - * E-mail: (OVM); (SAS)
- Aleksey Y. Ogurtsov
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Nafisa N. Nazipova
  - Institute of Mathematical Problems of Biology, RAS – the Branch of Keldysh Institute of Applied Mathematics of Russian Academy of Sciences, Pushchino, Moscow Region, Russia
- Svetlana A. Shabalina
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
  - * E-mail: (OVM); (SAS)
4
Masaki Y, Cayer D, McBride R, Ghadiri MR. A kinetically controlled, isothermal method for the detection of single nucleotide mismatches. Bioorg Med Chem Lett 2018; 28:2754-2758. [PMID: 29500066 DOI: 10.1016/j.bmcl.2018.02.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2018] [Accepted: 02/13/2018] [Indexed: 11/28/2022]
Abstract
We describe an isothermal, enzyme-free method to detect single nucleotide differences between oligonucleotides of close homology. The approach exploits kinetic differences in toe-hold-mediated, nucleic acid strand-displacement reactions to detect single nucleotide polymorphisms (SNPs) with essentially "digital" precision. The theoretical underpinning, experimental analyses, predictability, and accuracy of this new method are reported. We demonstrate detection of biologically relevant SNPs and single nucleotide differences in the let-7 family of microRNAs. The method is adaptable to microarray formats, as demonstrated with on-chip detection of SNP variants involved in susceptibility to the therapeutic agents abacavir, Herceptin, and simvastatin.
Affiliation(s)
- Yoshiaki Masaki
  - Department of Chemistry and The Skaggs Institute for Chemical Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, United States
- Devon Cayer
  - Department of Chemistry and The Skaggs Institute for Chemical Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, United States
- Ryan McBride
  - Department of Chemistry and The Skaggs Institute for Chemical Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, United States
- M Reza Ghadiri
  - Department of Chemistry and The Skaggs Institute for Chemical Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, United States.
5
Li R, Fristensky B, Wang G. Sequence data analysis and preprocessing for oligo probe design in microbial genomes. AIMS BIOENGINEERING 2017. [DOI: 10.3934/bioeng.2017.1.28] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open
6
Wan YW, Mach CM, Allen GI, Anderson ML, Liu Z. On the reproducibility of TCGA ovarian cancer microRNA profiles. PLoS One 2014; 9:e87782. [PMID: 24489963 PMCID: PMC3906208 DOI: 10.1371/journal.pone.0087782] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 11/06/2013] [Accepted: 01/01/2014] [Indexed: 01/11/2023] Open
Abstract
Dysregulated microRNA (miRNA) expression is a well-established feature of human cancer. However, the role of specific miRNAs in determining cancer outcomes remains unclear. Using Level 3 expression data from the Cancer Genome Atlas (TCGA), we identified 61 miRNAs that are associated with overall survival in 469 ovarian cancers profiled by microarray (p<0.01). We also identified 12 miRNAs that are associated with survival when miRNAs were profiled in the same specimens using Next Generation Sequencing (miRNA-Seq) (p<0.01). Surprisingly, only 1 miRNA transcript is associated with ovarian cancer survival in both datasets. Our analyses indicate that this discrepancy is due to the fact that miRNA levels reported by the two platforms correlate poorly, even after correcting for potential issues inherent to signal detection algorithms. Corrections for false discovery and microRNA abundance had minimal impact on this discrepancy. Further investigation is warranted.
Affiliation(s)
- Ying-Wooi Wan
  - Department of Pediatrics, Baylor College of Medicine, Houston, Texas, United States of America
  - Department of Obstetrics and Gynecology, Baylor College of Medicine, Houston, Texas, United States of America
  - Department of Pathology and Immunology, Baylor College of Medicine, Houston, Texas, United States of America
- Claire M. Mach
  - Department of Obstetrics and Gynecology, Baylor College of Medicine, Houston, Texas, United States of America
  - College of Pharmacy, University of Houston, Houston, Texas, United States of America
- Genevera I. Allen
  - Department of Pediatrics, Baylor College of Medicine, Houston, Texas, United States of America
  - Neurological Research Institute, Texas Children's Hospital, Houston, Texas, United States of America
  - Department of Statistics and Electrical & Computer Engineering, Rice University, Houston, Texas, United States of America
- Matthew L. Anderson
  - Department of Obstetrics and Gynecology, Baylor College of Medicine, Houston, Texas, United States of America
  - Department of Pathology and Immunology, Baylor College of Medicine, Houston, Texas, United States of America
  - Dan L. Duncan Cancer Center, Baylor College of Medicine, Houston, Texas, United States of America
  - * E-mail: (MLA); (ZL)
- Zhandong Liu
  - Department of Pediatrics, Baylor College of Medicine, Houston, Texas, United States of America
  - Dan L. Duncan Cancer Center, Baylor College of Medicine, Houston, Texas, United States of America
  - Computational and Integrative Biomedical Research (CIBR) Center, Baylor College of Medicine, Houston, Texas, United States of America
  - Neurological Research Institute, Texas Children's Hospital, Houston, Texas, United States of America
  - * E-mail: (MLA); (ZL)
7
Correction of spatial bias in oligonucleotide array data. Adv Bioinformatics 2013; 2013:167915. [PMID: 23573083 PMCID: PMC3610395 DOI: 10.1155/2013/167915] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 07/06/2012] [Accepted: 02/02/2013] [Indexed: 01/17/2023] Open
Abstract
Background. Oligonucleotide microarrays allow for high-throughput gene expression profiling assays. The technology relies on the fundamental assumption that observed hybridization signal intensities (HSIs) for each intended target, on average, correlate with their target's true concentration in the sample. However, systematic, nonbiological variation from several sources undermines this hypothesis. Background hybridization signal has been previously identified as one such important source, one manifestation of which appears in the form of spatial autocorrelation. Results. We propose an algorithm, pyn, for the elimination of spatial autocorrelation in HSIs, exploiting the duality of desirable mutual information shared by probes in a common probe set and undesirable mutual information shared by spatially proximate probes. We show that this correction procedure reduces spatial autocorrelation in HSIs; increases HSI reproducibility across replicate arrays; increases differentially expressed gene detection power; and performs better than previously published methods. Conclusions. The proposed algorithm increases both precision and accuracy, while requiring virtually no changes to users' current analysis pipelines: the correction consists merely of a transformation of raw HSIs (e.g., CEL files for Affymetrix arrays). A free, open-source implementation is provided as an R package, compatible with standard Bioconductor tools. The approach may also be tailored to other platform types and other sources of bias.
8
Ono N, Suzuki S, Furusawa C, Shimizu H, Yomo T. Development of a physical model-based algorithm for the detection of single-nucleotide substitutions by using tiling microarrays. PLoS One 2013; 8:e54571. [PMID: 23382915 PMCID: PMC3557292 DOI: 10.1371/journal.pone.0054571] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 10/04/2012] [Accepted: 12/12/2012] [Indexed: 11/18/2022] Open
Abstract
High-density DNA microarrays are useful tools for analyzing sequence changes in DNA samples. Although microarray analysis provides informative signals from a large number of probes, the analysis and interpretation of these signals have certain inherent limitations, namely, complex dependency of signals on the probe sequences and the existence of false signals arising from non-specific binding between probe and target. In this study, we have developed a novel algorithm to detect the single-base substitutions by using microarray data based on a thermodynamic model of hybridization. We modified the thermodynamic model by introducing a penalty for mismatches that represent the effects of substitutions on hybridization affinity. This penalty results in significantly higher detection accuracy than other methods, indicating that the incorporation of hybridization free energy can improve the analysis of sequence variants by using microarray data.
Affiliation(s)
- Naoaki Ono
  - Graduate School of Information Science, Nara Institute of Science and Technology, Ikoma, Nara, Japan
- Shingo Suzuki
  - Quantitative Biology Center, RIKEN, Suita, Osaka, Japan
- Chikara Furusawa
  - Quantitative Biology Center, RIKEN, Suita, Osaka, Japan
  - * E-mail:
- Hiroshi Shimizu
  - Graduate School of Information Science and Technology, Osaka University, Suita, Osaka, Japan
- Tetsuya Yomo
  - Graduate School of Information Science and Technology, Osaka University, Suita, Osaka, Japan
  - Graduate School of Frontier Biosciences, Osaka University, Suita, Osaka, Japan
  - ERATO, JST, Suita, Osaka, Japan
9
Jakubek YA, Cutler DJ. A model of binding on DNA microarrays: understanding the combined effect of probe synthesis failure, cross-hybridization, DNA fragmentation and other experimental details of affymetrix arrays. BMC Genomics 2012; 13:737. [PMID: 23270536 PMCID: PMC3548757 DOI: 10.1186/1471-2164-13-737] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 04/24/2012] [Accepted: 12/16/2012] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: DNA microarrays are used both for research and for diagnostics. In research, Affymetrix arrays are commonly used for genome-wide association studies, resequencing, and gene expression analysis. These arrays provide large amounts of data, which are analyzed using statistical methods that quite often discard a large portion of the information. Most of the information that is lost comes from probes that systematically fail across chips and from batch effects. The aim of this study was to develop a comprehensive model for hybridization that predicts probe intensities for Affymetrix arrays and that could provide a basis for improved microarray analysis and probe development. The first part of the model calculates probe binding affinities to all the possible targets in the hybridization solution using the Langmuir isotherm. In the second part of the model we integrate details that are specific to each experiment and contribute to the differences between hybridization in solution and on the microarray. These details include fragmentation, wash stringency, temperature, salt concentration, and scanner settings. Furthermore, the model fits probe synthesis efficiency and target concentration parameters directly to the data. All the parameters used in the model have a well-established physical origin. RESULTS: For the 302 chips that were analyzed, the mean correlation between expected and observed probe intensities was 0.701 (range 0.55 to 0.88). All available chips were included in the analysis regardless of the data quality. Our results show that batch effects arise from differences in probe synthesis, scanner settings, wash strength, and target fragmentation. We also show that probe synthesis efficiencies for different nucleotides are not uniform. CONCLUSIONS: To date this is the most complete model for binding on microarrays. This is the first model that includes both probe synthesis efficiency and hybridization kinetics/cross-hybridization. These two factors are sequence dependent and have a large impact on probe intensity. The results presented here provide novel insight into the effect of probe synthesis errors on Affymetrix microarrays; furthermore, the algorithms developed in this work provide useful tools for the analysis of the effects of cross-hybridization, probe synthesis efficiency, fragmentation, wash stringency, temperature, and salt concentration on microarray intensities.
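The Langmuir isotherm mentioned in this abstract can be sketched in a few lines. The code below is a generic textbook form, not the authors' model: it converts a binding free energy to an association constant and computes the fraction of probe sites occupied, including a competitive variant for several targets sharing one probe. All function names and numeric inputs are hypothetical, and the competitive form is an assumption about how "all the possible targets" might be combined.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def association_constant(delta_g_kcal, temp_kelvin):
    """K = exp(-dG / (R*T)), with dG in kcal/mol (negative = favorable)."""
    return math.exp(-delta_g_kcal / (R_KCAL * temp_kelvin))


def langmuir_occupancy(conc_molar, k_assoc):
    """Single-target Langmuir isotherm: theta = K*c / (1 + K*c)."""
    x = k_assoc * conc_molar
    return x / (1.0 + x)


def competitive_occupancy(concs_molar, k_assocs):
    """Occupancy of one probe by each of several competing targets.

    theta_i = K_i*c_i / (1 + sum_j K_j*c_j)
    """
    terms = [k * c for k, c in zip(k_assocs, concs_molar)]
    denom = 1.0 + sum(terms)
    return [t / denom for t in terms]


# Hypothetical example: one intended target and one weaker cross-hybridizer.
k_specific = association_constant(-14.0, 318.15)
k_cross = association_constant(-10.0, 318.15)
occupancies = competitive_occupancy([1e-10, 1e-9], [k_specific, k_cross])
```

In this form the signal saturates at high target concentration, and a cross-hybridizing species reduces the share of sites available to the intended target.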
Affiliation(s)
- Yasminka A Jakubek
  - Department of Human Genetics, Emory University School of Medicine, Atlanta, GA 30322, USA
10
Obana E, Hada T, Yamamoto T, Kakuhata R, Saze T, Miyoshi H, Hori T, Shinohara Y. Properties of signal intensities observed with individual probes of GeneChip Rat Gene 1.0 ST Array, an affymetric microarray system. Biotechnol Lett 2012; 34:213-9. [DOI: 10.1007/s10529-011-0776-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2011] [Accepted: 10/05/2011] [Indexed: 11/30/2022]
11
Pozhitkov AE, Beikler T, Flemmig T, Noble PA. High-throughput methods for analysis of the human oral microbiome. Periodontol 2000 2011; 55:70-86. [PMID: 21134229 DOI: 10.1111/j.1600-0757.2010.00380.x] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Indexed: 12/12/2022]
12
Painter MW, Davis S, Hardy RR, Mathis D, Benoist C. Transcriptomes of the B and T lineages compared by multiplatform microarray profiling. THE JOURNAL OF IMMUNOLOGY 2011; 186:3047-57. [PMID: 21307297 DOI: 10.4049/jimmunol.1002695] [Citation(s) in RCA: 79] [Impact Index Per Article: 6.1] [Indexed: 01/13/2023]
Abstract
T and B lymphocytes are developmentally and functionally related cells of the immune system, representing the two major branches of adaptive immunity. Although originating from a common precursor, they play very different roles: T cells contribute to and drive cell-mediated immunity, whereas B cells secrete Abs. Because of their functional importance and well-characterized differentiation pathways, T and B lymphocytes are ideal cell types with which to understand how functional differences are encoded at the transcriptional level. Although there has been a great deal of interest in defining regulatory factors that distinguish T and B cells, a truly genomewide view of the transcriptional differences between these two cell types has not yet been taken. To obtain a more global perspective of the transcriptional differences underlying T and B cells, we exploited the statistical power of combinatorial profiling on different microarray platforms, and the breadth of the Immunological Genome Project gene expression database, to generate robust differential signatures. We find that differential expression in T and B cells is pervasive, with the majority of transcripts showing statistically significant differences. These distinguishing characteristics are acquired gradually, through all stages of B and T differentiation. In contrast, very few T versus B signature genes are uniquely expressed in these lineages, but are shared throughout immune cells.
Affiliation(s)
- Michio W Painter
  - Department of Pathology, Harvard Medical School, Boston, MA 02215, USA
13
Lahti L, Elo LL, Aittokallio T, Kaski S. Probabilistic analysis of probe reliability in differential gene expression studies with short oligonucleotide arrays. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2011; 8:217-225. [PMID: 21071809 DOI: 10.1109/tcbb.2009.38] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Indexed: 05/30/2023]
Abstract
Probe defects are a major source of noise in gene expression studies. While existing approaches detect noisy probes based on external information such as genomic alignments, we introduce and validate a targeted probabilistic method for analyzing probe reliability directly from expression data and independently of the noise source. This provides insights into the various sources of probe-level noise and gives tools to guide probe design.
Affiliation(s)
- Leo Lahti
  - Helsinki Institute for Information Technology, Department of Information and Computer Science, Aalto University School of Science and Technology, PO Box 15400, FI-00076 Aalto, Finland.
14
Ohtaki M, Otani K, Hiyama K, Kamei N, Satoh K, Hiyama E. A robust method for estimating gene expression states using Affymetrix microarray probe level data. BMC Bioinformatics 2010; 11:183. [PMID: 20380745 PMCID: PMC2873532 DOI: 10.1186/1471-2105-11-183] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Received: 02/17/2009] [Accepted: 04/12/2010] [Indexed: 12/04/2022] Open
Abstract
Background: Microarray technology is a high-throughput method for measuring the expression levels of thousands of genes simultaneously. The observed intensities include a contribution from non-specific binding, which is a major disadvantage of microarray data. The Affymetrix GeneChip pairs each perfect match (PM) probe with a mismatch (MM) probe intended to measure non-specific binding, but opinions differ on the usefulness of MM measures. It should be noted that not all observed intensities are associated with expressed genes; many are associated with unexpressed genes, whose measured values reflect mere noise due to non-specific binding, cross-hybridization, or stray signals. The implicit assumption that all genes are expressed leads to poor performance of microarray data analyses. We assume two functional states of a gene, expressed or unexpressed, and propose a robust method to estimate gene expression states using an order relationship between PM and MM measures. Results: An indicator, 'probability of a gene being expressed', was obtained using the number of probe pairs within a probe set where the PM measure exceeds the MM measure. We examined the validity of the proposed indicator using Human Genome U95 data sets provided by Affymetrix. The usefulness of 'probability of a gene being expressed' is illustrated through an exploration of candidate genes involved in neuroblastoma prognosis. We identified the candidate genes for which expression states differed (unexpressed or expressed) when compared between two outcomes. The validity of this result was subsequently confirmed by quantitative RT-PCR. Conclusion: The proposed qualitative evaluation, 'probability of a gene being expressed', is a useful indicator for improving microarray data analysis and helps reduce the number of false discoveries. Expression states, expressed or unexpressed, correspond to the most fundamental gene function, 'On' and 'Off', which can lead to biologically meaningful results.
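The order relationship between PM and MM measures described above can be illustrated with a simple sign-test style calculation. This is only a sketch under stated assumptions, not the authors' indicator: it counts probe pairs where PM exceeds MM and asks how surprising that count would be if each pair were a fair coin flip (the assumed behavior for an unexpressed gene). Function names and intensities are hypothetical.

```python
from math import comb


def count_pm_exceeds_mm(pm, mm):
    """Number of probe pairs in a probe set where PM intensity > MM intensity."""
    return sum(1 for p, m in zip(pm, mm) if p > m)


def binomial_upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical probe set with four probe pairs.
pm_values = [5, 6, 7, 1]
mm_values = [1, 2, 3, 4]

k = count_pm_exceeds_mm(pm_values, mm_values)
# Assumption: under "unexpressed", PM > MM occurs with probability 0.5.
tail_probability = binomial_upper_tail(k, len(pm_values))
```

A small tail probability suggests the PM probes are systematically brighter than their MM partners, which is consistent with the gene being expressed; because only the order of PM and MM is used, the count is insensitive to outlier intensities.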
Affiliation(s)
- Megu Ohtaki
  - Department of Environmetrics and Biometrics, Research Institute for Radiation Biology and Medicine, Hiroshima University, 1-2-3 Kasumi, Minami-ku, Hiroshima, 734-8551, Japan.
15
Eklund AC, Friis P, Wernersson R, Szallasi Z. Optimization of the BLASTN substitution matrix for prediction of non-specific DNA microarray hybridization. Nucleic Acids Res 2009; 38:e27. [PMID: 19969549 PMCID: PMC2831327 DOI: 10.1093/nar/gkp1116] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 12/27/2022] Open
Abstract
DNA microarray measurements are susceptible to error caused by non-specific hybridization between a probe and a target (cross-hybridization), or between two targets (bulk-hybridization). Search algorithms such as BLASTN can quickly identify potentially hybridizing sequences. We set out to improve BLASTN accuracy by modifying the substitution matrix and gap penalties. We generated gene expression microarray data for samples in which 1 or 10% of the target mass was an exogenous spike of known sequence. We found that the 10% spike induced 2-fold intensity changes in 3% of the probes, two-thirds of which were decreases in intensity likely caused by bulk-hybridization. These changes were correlated with similarity between the spike and probe sequences. Interestingly, even very weak similarities tended to induce a change in probe intensity with the 10% spike. Using these data, we optimized the BLASTN substitution matrix to more accurately identify probes susceptible to non-specific hybridization with the spike. Relative to the default substitution matrix, the optimized matrix features a decreased score for A–T base pairs relative to G–C base pairs, resulting in a 5–15% increase in area under the ROC curve for identifying affected probes. This optimized matrix may be useful in the design of microarray probes, and in other BLASTN-based searches for hybridization partners.
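To make the idea of a composition-aware substitution matrix concrete, the sketch below scores an ungapped alignment with two matrices: one that rewards every match equally, and one that rewards A and T matches less than G and C matches. The score values are hypothetical and are not the optimized values reported by the authors, nor BLASTN defaults; an identical A or T position in the alignment stands in for an A–T base pair in the hybridized duplex.

```python
def make_matrix(at_match, gc_match, mismatch):
    """Build a 4x4 nucleotide substitution matrix as a dict keyed by base pairs."""
    bases = "ACGT"
    matrix = {}
    for x in bases:
        for y in bases:
            if x != y:
                matrix[(x, y)] = mismatch
            elif x in "AT":
                matrix[(x, y)] = at_match
            else:
                matrix[(x, y)] = gc_match
    return matrix


def ungapped_score(seq_a, seq_b, matrix):
    """Sum matrix scores over aligned positions of two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return sum(matrix[(a, b)] for a, b in zip(seq_a, seq_b))


# Hypothetical score values, chosen only for illustration.
uniform = make_matrix(at_match=2, gc_match=2, mismatch=-3)
weighted = make_matrix(at_match=1, gc_match=3, mismatch=-3)

at_rich_uniform = ungapped_score("AATT", "AATT", uniform)
at_rich_weighted = ungapped_score("AATT", "AATT", weighted)
gc_rich_weighted = ungapped_score("GGCC", "GGCC", weighted)
```

With the weighted matrix, a GC-rich match outscores an AT-rich match of the same length, which mirrors the stronger binding of G–C pairs that the abstract's optimized matrix reflects.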
Affiliation(s)
- Aron C Eklund
  - Center for Biological Sequence Analysis, Technical University of Denmark, DK-2800 Lyngby, Denmark.
16
EvoOligo: Oligonucleotide Probe Design With Multiobjective Evolutionary Algorithms. IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 2009; 39:1606-16. [DOI: 10.1109/tsmcb.2009.2023078] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 11/06/2022]
17
Nurtdinov RN, Vasiliev MO, Ershova AS, Lossev IS, Karyagina AS. PLANdbAffy: probe-level annotation database for Affymetrix expression microarrays. Nucleic Acids Res 2009; 38:D726-30. [PMID: 19906711 PMCID: PMC2808952 DOI: 10.1093/nar/gkp969] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Indexed: 11/29/2022] Open
Abstract
Standard Affymetrix technology evaluates gene expression by measuring the intensity of mRNA hybridization with a panel of the 25-mer oligonucleotide probes, and summarizing the probe signal intensities by a robust average method. However, in many cases, signal intensity of the probe does not correlate with gene expression. This could be due to the hybridization of the probe to a transcript of another gene, mapping of the probe to an intron, alternative splicing, single nucleotide polymorphisms and other reasons. We have developed a database, PLANdbAffy (available at http://affymetrix2.bioinf.fbb.msu.ru), that contains the results of the alignment of probe sequences from five Affymetrix expression microarrays to the human genome. We have determined the probes matching the transcript-coding regions in the correct orientation. For each such probe alignment region, we determined the mRNA and EST sequences that contain the probe sequence. In the textual part of the database interface we summarize the data on the sequences that cover the probe alignment region and SNPs that are located inside it. The graphical part of our database interface is implemented as custom tracks to the UCSC genome browser that allows one to utilize all the data that are offered by UCSC browser.
Affiliation(s)
- Ramil N Nurtdinov
  - Department of Bioengineering and Bioinformatics, MV Lomonosov Moscow State University, Vorbyevy Gory 1-73, Moscow 119992, Russia.
18
She Y, Hubbell E, Wang H. Resolving deconvolution ambiguity in gene alternative splicing. BMC Bioinformatics 2009; 10:237. [PMID: 19653895 PMCID: PMC2739860 DOI: 10.1186/1471-2105-10-237] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 01/05/2009] [Accepted: 08/04/2009] [Indexed: 11/16/2022] Open
Abstract
Background: For many gene structures it is impossible to resolve intensity data uniquely to establish abundances of splice variants. This was empirically noted by Wang et al., who called it a "degeneracy problem". The ambiguity results from an ill-posed problem in which additional information is needed to obtain a unique answer in splice variant deconvolution. Results: In this paper, we analyze the situations under which the problem occurs and perform a rigorous mathematical study which gives necessary and sufficient conditions on how many and what type of constraints are needed to resolve all ambiguity. This analysis is generally applicable to matrix models of splice variants. We explore the proposal that probe sequence information may provide sufficient additional constraints to resolve real-world instances. However, probe behavior cannot be predicted with sufficient accuracy by any existing probe sequence model, and so we present a Bayesian framework for estimating variant abundances by incorporating the prediction uncertainty from the micro-model of probe responsiveness into the macro-model of probe intensities. Conclusion: The matrix analysis of constraints provides a tool for detecting real-world instances in which additional constraints may be necessary to resolve splice variants. While purely mathematical constraints can be stated without error, real-world constraints may themselves be poorly resolved. Our Bayesian framework provides a generic solution to the problem of uniquely estimating transcript abundances given additional constraints that themselves may be uncertain, such as regression fit to probe sequence models. We demonstrate its efficacy through extensive simulations as well as various biological data.
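One way to see the degeneracy problem in a matrix model is through rank. In the sketch below, each row of a hypothetical design matrix is a probe, each column is a splice variant, and an entry of 1 means the probe's target region is present in that variant. If the rank is smaller than the number of variants, the abundances cannot be determined uniquely without extra constraints. This is a minimal illustration under those assumptions, not the paper's full analysis.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank by Gaussian elimination over exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a nonzero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            if factor != 0:
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank


def is_degenerate(design):
    """True when variant abundances are not uniquely determined by the probes."""
    return matrix_rank(design) < len(design[0])


# Hypothetical gene with three variants.
two_probes = [[1, 1, 0],
              [0, 1, 1]]
# Adding a probe specific to variant 1 supplies the missing constraint.
three_probes = two_probes + [[1, 0, 0]]

ambiguous = is_degenerate(two_probes)
resolved = not is_degenerate(three_probes)
```

Here the third probe plays the role of an additional constraint: it raises the rank from 2 to 3, so the three abundances become identifiable.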
Affiliation(s)
- Yiyuan She
- Affymetrix Inc, Santa Clara, CA 95051, USA.
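The matrix view mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' method: it assumes a hypothetical 0/1 design matrix whose rows are probes and whose columns are splice variants, with probe intensities modelled as the matrix times the abundance vector. Under that assumption, abundances are unique only when the matrix has full column rank. The function name `ambiguity_dimension` and both example matrices are invented for illustration.

```python
import numpy as np

def ambiguity_dimension(design: np.ndarray) -> int:
    """Number of free directions left in the variant abundances.

    0 means the probe-by-variant matrix determines abundances uniquely;
    k > 0 means a k-dimensional family of abundance vectors fits the
    same intensities, so extra constraints are needed.
    """
    n_variants = design.shape[1]
    return n_variants - int(np.linalg.matrix_rank(design))

# Hypothetical gene with three probes (rows) and three variants (columns).
# Column 0 equals column 1 + column 2, so the variants cannot be separated.
degenerate = np.array([[1, 1, 0],
                       [1, 1, 0],
                       [1, 0, 1]])

# Hypothetical gene where each variant has a distinguishing probe pattern.
resolved = np.array([[1, 1, 1],
                     [1, 1, 0],
                     [1, 0, 0]])

print(ambiguity_dimension(degenerate))
print(ambiguity_dimension(resolved))
```

The sketch ignores non-negativity of abundances and measurement noise, both of which matter in practice.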
|
19
|
Thomassen GOS, Rowe AD, Lagesen K, Lindvall JM, Rognes T. Custom design and analysis of high-density oligonucleotide bacterial tiling microarrays. PLoS One 2009; 4:e5943. [PMID: 19536279 PMCID: PMC2691959 DOI: 10.1371/journal.pone.0005943] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/12/2009] [Accepted: 05/18/2009] [Indexed: 11/21/2022] Open
Abstract
Background: High-density tiling microarrays are a powerful tool for the characterization of complete genomes. The two major computational challenges associated with custom-made arrays are design and analysis. Firstly, several genome dependent variables, such as the genome's complexity and sequence composition, need to be considered in the design to ensure a high quality microarray. Secondly, since tiling projects today very often exceed the limits of conventional array-experiments, researchers cannot use established computer tools designed for commercial arrays, and instead have to redesign previous methods or create novel tools. Principal Findings: Here we describe the multiple aspects involved in the design of tiling arrays for transcriptome analysis and detail the normalisation and analysis procedures for such microarrays. We introduce a novel design method to make two 280,000-feature microarrays covering the entire genome of the bacterial species Escherichia coli and Neisseria meningitidis, respectively, as well as the use of multiple copies of control probe-sets on tiling microarrays. Furthermore, a novel normalisation and background estimation procedure for tiling arrays is presented along with a method for array analysis focused on detection of short transcripts. The design, normalisation and analysis methods have been applied in various experiments and several of the detected novel short transcripts have been biologically confirmed by Northern blot tests. Conclusions: Tiling-arrays are becoming increasingly applicable in genomic research, but researchers still lack both the tools for custom design of arrays and the systems and procedures for analysis of the vast amount of data resulting from such experiments. We believe that the methods described herein will be a useful contribution and resource for researchers designing and analysing custom tiling arrays for both bacteria and higher organisms.
Affiliation(s)
- Gard O. S. Thomassen
- Centre for Molecular Biology and Neuroscience (CMBN), Institute of Medical Microbiology, University of Oslo, Oslo, Norway
- Centre for Molecular Biology and Neuroscience (CMBN), Institute of Medical Microbiology, Oslo University Hospital, Rikshospitalet, Oslo, Norway
- Alexander D. Rowe
- Centre for Molecular Biology and Neuroscience (CMBN), Institute of Medical Microbiology, Oslo University Hospital, Rikshospitalet, Oslo, Norway
- Karin Lagesen
- Centre for Molecular Biology and Neuroscience (CMBN), Institute of Medical Microbiology, Oslo University Hospital, Rikshospitalet, Oslo, Norway
- Torbjørn Rognes
- Centre for Molecular Biology and Neuroscience (CMBN), Institute of Medical Microbiology, Oslo University Hospital, Rikshospitalet, Oslo, Norway
- Department of Informatics, University of Oslo, Oslo, Norway
|
20
|
Langdon WB, Upton GJG, Harrison AP. Probes containing runs of guanines provide insights into the biophysics and bioinformatics of Affymetrix GeneChips. Brief Bioinform 2009; 10:259-77. [PMID: 19359259 DOI: 10.1093/bib/bbp018] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
The reliable interpretation of Affymetrix GeneChip data is a multi-faceted problem. The interplay between biophysics, bioinformatics and mining of GeneChip surveys is leading to new insights into how best to analyse the data. Many of the molecular processes occurring on the surfaces of GeneChips result from the high surface density of probes. Interactions between neighbouring probes affect their rate and strength of hybridization to targets. Competing targets may hybridize to the same probe, and targets may partially bind to more than one probe. The formation of these partial hybrids results in a number of probes not reaching thermodynamic equilibrium during hybridization. Moreover, some targets fold up, or cross-hybridize to other targets. Furthermore, probes may fold and can undergo chemical saturation. There are also sequence-dependent differences in the rates of target desorption during the washing stage. Improvements in the mappings between probe sequence and biological databases are leading to more accurate gene expression profiles. Moreover, algorithms that combine the intensities of multiple probes into single measures of expression are increasingly dependent upon models of the hybridization processes occurring on GeneChips. The large repositories of GeneChip data can be searched for systematic effects across many experiments. This data mining has led to the discovery of a family of thousands of probes, which show correlated expression across thousands of GeneChip experiments. These probes contain runs of guanines, suggesting that G-quadruplexes are able to form on GeneChips. We discuss the impact of these structures on the interpretation of data from GeneChip experiments.
Affiliation(s)
- William B Langdon
- Department of Mathematical Sciences and Department of Biological Sciences, University of Essex, Wivenhoe Park, Colchester, Essex CO4 3SQ, UK
|
21
|
Rangel-López A, Méndez-Tenorio A, Beattie KL, Maldonado R, Mendoza P, Vázquez G, Pérez-Plasencia C, Sánchez M, Navarro G, Salcedo M. Specific mutation screening of TP53 gene by low-density DNA microarray. Nanotechnol Sci Appl 2009; 2:1-12. [PMID: 24198462 PMCID: PMC3781765] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/02/2023] Open
Abstract
TP53 is the most commonly mutated gene in human cancers. Approximately 90% of mutations in this gene are localized between domains encoding exons 5 to 8. The aim of this investigation was to examine the ability of a low-density DNA microarray, assisted by a double tandem hybridization platform, to characterize mutational hotspots in exons 5, 7, and 8 of TP53. Nineteen capture probes, each specific to a potential mutation site, were designed to hybridize to their specific sites. Virtual hybridization was used to predict the stability of hybridization of each capture probe with the target. Thirty-three DNA samples from different sources were analyzed for mutants in these exons. A total of 32 codon substitutions were found by DNA sequencing. Twenty-four of them showed a perfect correlation between the hybridization pattern and DNA sequencing analysis of the regions scanned. Although in this work we directed our attention to some of the most representative mutations of the TP53 gene, the results suggest that this microarray system is a rapid, reliable, and effective method for screening all the mutations in the TP53 gene.
Affiliation(s)
- Angélica Rangel-López
- Laboratorio de Oncología Genómica, Unidad de Investigación Médica en Enfermedades Oncológicas, Hospital de Oncología, CMN Siglo XXI-IMSS, Mexico City, Mexico
- Unidad de Investigación Médica en Enfermedades Nefrológicas, Hospital de Especialidades, CMN Siglo XXI-IMSS, Mexico City, Mexico
- Laboratorio de Biotecnología y Bioinformática Genómica, Escuela Nacional de Ciencias Biológicas, IPN, Mexico City, Mexico
- Alfonso Méndez-Tenorio
- Laboratorio de Biotecnología y Bioinformática Genómica, Escuela Nacional de Ciencias Biológicas, IPN, Mexico City, Mexico
- Rogelio Maldonado
- Laboratorio de Biotecnología y Bioinformática Genómica, Escuela Nacional de Ciencias Biológicas, IPN, Mexico City, Mexico
- Patricia Mendoza
- Laboratorio de Oncología Genómica, Unidad de Investigación Médica en Enfermedades Oncológicas, Hospital de Oncología, CMN Siglo XXI-IMSS, Mexico City, Mexico
- Guelaguetza Vázquez
- Laboratorio de Oncología Genómica, Unidad de Investigación Médica en Enfermedades Oncológicas, Hospital de Oncología, CMN Siglo XXI-IMSS, Mexico City, Mexico
- Carlos Pérez-Plasencia
- Unidad de Investigación Biomédica en Cáncer, Instituto de Investigaciones Biomédicas, Universidad Nacional Autónoma de México UNAM, Instituto Nacional de Cancerología INCAN, Mexico City, Mexico
- Martha Sánchez
- Unidad de Investigación Médica en Enfermedades Nefrológicas, Hospital de Especialidades, CMN Siglo XXI-IMSS, Mexico City, Mexico
- Mauricio Salcedo
- Laboratorio de Oncología Genómica, Unidad de Investigación Médica en Enfermedades Oncológicas, Hospital de Oncología, CMN Siglo XXI-IMSS, Mexico City, Mexico
|
22
|
Seringhaus M, Rozowsky J, Royce T, Nagalakshmi U, Jee J, Snyder M, Gerstein M. Mismatch oligonucleotides in human and yeast: guidelines for probe design on tiling microarrays. BMC Genomics 2008; 9:635. [PMID: 19117516 PMCID: PMC2642824 DOI: 10.1186/1471-2164-9-635] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/27/2008] [Accepted: 12/31/2008] [Indexed: 11/12/2022] Open
Abstract
Background: Mismatched oligonucleotides are widely used on microarrays to differentiate specific from nonspecific hybridization. While many experiments rely on such oligos, the hybridization behavior of various degrees of mismatch (MM) structure has not been extensively studied. Here, we present the results of two large-scale microarray experiments on S. cerevisiae and H. sapiens genomic DNA, to explore MM oligonucleotide behavior with real sample mixtures under tiling-array conditions. Results: We examined all possible nucleotide substitutions at the central position of 36-nucleotide probes, and found that nonspecific binding by MM oligos depends upon the individual nucleotide substitutions they incorporate: C→A, C→G and T→A (yielding purine-purine mispairs) are most disruptive, whereas A→X substitutions were least disruptive. We also quantify a marked GC skew effect: substitutions raising probe GC content exhibit higher intensity (and vice versa). This skew is small in highly-expressed regions (± 0.5% of total intensity range) and large (± 2% or more) elsewhere. Multiple mismatches per oligo are largely additive in effect: each MM added in a distributed fashion causes an additional 21% intensity drop relative to PM, three-fold more disruptive than adding adjacent mispairs (7% drop per MM). Conclusion: We investigate several parameters for oligonucleotide design, including the effects of each central nucleotide substitution on array signal intensity and of multiple MM per oligo. To avoid GC skew, individual substitutions should not alter probe GC content. RNA sample mixture complexity may increase the amount of nonspecific hybridization, magnify GC skew and boost the intensity of MM oligos at all levels.
Affiliation(s)
- Michael Seringhaus
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA.
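The per-mismatch figures quoted in the abstract above lend themselves to a small worked example. The sketch below is only an illustration: it treats the effect as strictly additive (the abstract says "largely additive"), clips the result at zero, and uses invented function names. The GC check encodes the stated design guideline that a substitution should not change probe GC content.

```python
def expected_relative_intensity(n_mismatches: int, adjacent: bool = False) -> float:
    """Predicted MM/PM intensity ratio under a purely additive model.

    Per-mismatch drops are the figures quoted in the abstract:
    21% for distributed mismatches, 7% for adjacent ones.
    Clipping at zero is an assumption of this sketch.
    """
    drop_per_mm = 0.07 if adjacent else 0.21
    return max(0.0, 1.0 - drop_per_mm * n_mismatches)

def preserves_gc(original: str, substitute: str) -> bool:
    """True if replacing one base with another keeps probe GC content unchanged."""
    gc = {"G", "C"}
    return (original.upper() in gc) == (substitute.upper() in gc)

# Two distributed mismatches versus three adjacent ones.
print(expected_relative_intensity(2))
print(expected_relative_intensity(3, adjacent=True))

# C->G keeps GC content; C->A lowers it.
print(preserves_gc("C", "G"), preserves_gc("C", "A"))
```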
|
23
|
A short-oligonucleotide microarray that allows improved detection of gastrointestinal tract microbial communities. BMC Microbiol 2008; 8:195. [PMID: 19014434 PMCID: PMC2628385 DOI: 10.1186/1471-2180-8-195] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/16/2008] [Accepted: 11/11/2008] [Indexed: 01/01/2023] Open
Abstract
Background: The human gastrointestinal (GI) tract contains a diverse collection of bacteria, most of which are unculturable by conventional microbiological methods. Increasingly, molecular profiling techniques are being employed to examine this complex microbial community. The purpose of this study was to develop a microarray technique based on 16S ribosomal gene sequences for rapidly monitoring the microbial population of the GI tract. Results: We have developed a culture-independent, semi-quantitative, rapid method for detection of gut bacterial populations based on 16S rDNA probes using a DNA microarray. We compared the performance of microarrays based on long (40- and 50-mer) and short (16–21-mer) oligonucleotides. Short oligonucleotides consistently gave higher specificity. Optimal DNA amplification and labelling, hybridisation and washing conditions were determined using a probe with an increasing number of nucleotide mismatches, identifying the minimum number of nucleotides needed to distinguish between perfect and mismatch probes. An independent PCR-based control was used to normalise different hybridisation results, and to make comparisons between different samples, greatly improving the detection of changes in the gut bacterial population. The sensitivity of the microarray was determined to be 8.8 × 10⁴ bacterial cells g⁻¹ faecal sample, which is more sensitive than a number of existing profiling methods. The short oligonucleotide microarray was used to compare the faecal flora from healthy individuals and a patient suffering from Ulcerative Colitis (UC) during the active and remission states. Differences were identified in the bacterial profiles between healthy individuals and a UC patient. These variations were verified by Denaturing Gradient Gel Electrophoresis (DGGE) and DNA sequencing. Conclusion: In this study we demonstrate the design, testing and application of a highly sensitive, short oligonucleotide community microarray.
Our approach allows the rapid discrimination of bacteria inhabiting the human GI tract, at taxonomic levels ranging from species to the superkingdom bacteria. The optimised protocol is available at: . It offers a high throughput method for studying the dynamics of the bacterial population over time and between individuals.
|
24
|
Abstract
Motivation: Microarray designs have become increasingly probe-rich, enabling targeting of specific features, such as individual exons or single nucleotide polymorphisms. These arrays have the potential to achieve quantitative high-throughput estimates of transcript abundances, but currently these estimates are affected by biases due to cross-hybridization, in which probes hybridize to off-target transcripts. Results: To study cross-hybridization, we map Affymetrix exon array probes to a set of annotated mRNA transcripts, allowing a small number of mismatches or insertion/deletions between the two sequences. Based on a systematic study of the degree to which probes with a given match type to a transcript are affected by cross-hybridization, we developed a strategy to correct for cross-hybridization biases of gene-level expression estimates. Comparison with Solexa ultra high-throughput sequencing data demonstrates that correction for cross-hybridization leads to a significant improvement of gene expression estimates. Availability: We provide mappings between human and mouse exon array probes and off-target transcripts and provide software extending the GeneBASE program for generating gene-level expression estimates including the cross-hybridization correction http://biogibbs.stanford.edu/~kkapur/GeneBase/. Contact: whwong@stanford.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Karen Kapur
- Department of Statistics, Institute for Computational and Mathematical Engineering, Stanford University, Stanford, CA, USA
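The probe-to-transcript mapping step described in the abstract above can be pictured with a deliberately simplified sketch. It is not the GeneBASE implementation: it counts substitutions only (no insertions or deletions), assumes probe and transcript are written in the same orientation, and uses a brute-force scan. The function name, the toy sequences and the default mismatch threshold are all invented for illustration.

```python
def off_target_hits(probe: str, transcript: str, max_mismatches: int = 2):
    """List (start, mismatches) for every window of the transcript that
    matches the probe with at most max_mismatches substitutions."""
    hits = []
    for start in range(len(transcript) - len(probe) + 1):
        window = transcript[start:start + len(probe)]
        # Hamming distance between the probe and this window.
        mismatches = sum(a != b for a, b in zip(probe, window))
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits

# Toy example: one exact match and one single-mismatch match.
hits = off_target_hits("ACGT", "TTACGTTTACCTAA", max_mismatches=1)
print(hits)
```

A production mapping would use an indexed aligner and handle indels. The scan above only shows what "match type" means for a single probe.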
|
25
|
Furusawa C, Ono N, Suzuki S, Agata T, Shimizu H, Yomo T. Model-based analysis of non-specific binding for background correction of high-density oligonucleotide microarrays. Bioinformatics 2008; 25:36-41. [PMID: 18977779 DOI: 10.1093/bioinformatics/btn570] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
MOTIVATION High-density DNA microarrays provide us with useful tools for analyzing DNA and RNA comprehensively. However, the background signal caused by the non-specific binding (NSB) between probe and target makes it difficult to obtain accurate measurements. To remove the background signal, there is a set of background probes on Affymetrix Exon arrays to represent the amount of non-specific signals, and an accurate estimation of non-specific signals using these background probes is desirable for improvement of microarray analyses. RESULTS We developed a thermodynamic model of NSB on short nucleotide microarrays in which the NSBs are modeled by duplex formation of probes and multiple hypothetical targets. We fitted the observed signal intensities of the background probes with those expected by the model to obtain the model parameters. As a result, we found that the presented model can improve the accuracy of prediction of non-specific signals in comparison with previously proposed methods. This result will provide a useful method to correct for the background signal in oligonucleotide microarray analysis. AVAILABILITY The software is implemented in the R language and can be downloaded from our website (http://www-shimizu.ist.osaka-u.ac.jp/shimizu_lab/MSNS/).
Affiliation(s)
- Chikara Furusawa
- Department of Bioinformatics Engineering, Graduate School of Information Science and Technology, Osaka University, Suita, Osaka 565-0871, Japan.
|
26
|
Potter DP, Yan P, Huang THM, Lin S. Probe signal correction for differential methylation hybridization experiments. BMC Bioinformatics 2008; 9:453. [PMID: 18947421 PMCID: PMC2603337 DOI: 10.1186/1471-2105-9-453] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2008] [Accepted: 10/23/2008] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND Non-biological signal (or noise) has been the bane of microarray analysis. Hybridization effects related to probe-sequence composition and DNA dye-probe interactions have been observed in differential methylation hybridization (DMH) microarray experiments as well as other effects inherent to the DMH protocol. RESULTS We suggest two models to correct for non-biologically relevant probe signal with an overarching focus on probe-sequence composition. The estimated effects are evaluated and the strengths of the models are considered in the context of DMH analyses. CONCLUSION The majority of estimated parameters were statistically significant in all considered models. Model selection for signal correction is based on interpretation of the estimated values and their biological significance.
Affiliation(s)
- Dustin P Potter
- Human Cancer Genetics Program, OSU Comprehensive Cancer Center, The Ohio State University, Columbus, OH, USA.
|
27
|
McGann P, Raengpradub S, Ivanek R, Wiedmann M, Boor KJ. Differential regulation of Listeria monocytogenes internalin and internalin-like genes by sigmaB and PrfA as revealed by subgenomic microarray analyses. Foodborne Pathog Dis 2008; 5:417-35. [PMID: 18713061 DOI: 10.1089/fpd.2008.0085] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/05/2023] Open
Abstract
The Listeria monocytogenes genome contains more than 20 genes that encode cell surface-associated internalins. To determine the contributions of the alternative sigma factor sigma(B) and the virulence gene regulator PrfA to internalin gene expression, a subgenomic microarray was designed to contain two probes for each of 24 internalin-like genes identified in the L. monocytogenes 10403S genome. Competitive microarray hybridization was performed on RNA extracted from (i) the 10403S parent strain and an isogenic Delta sigB strain; (ii) 10403S and an isogenic Delta prfA strain; (iii) a (G155S) 10403S derivative that expresses the constitutively active PrfA (PrfA*) and the Delta prfA strain; and (iv) 10403S and an isogenic Delta sigB Delta prfA strain. Sigma(B)- and PrfA-dependent transcription of selected genes was further confirmed by quantitative reverse-transcriptase polymerase chain reaction. For the 24 internalin-like genes examined, (i) both sigma(B) and PrfA contributed to transcription of inlA and inlB; (ii) only sigma(B) contributed to transcription of inlC2, inlD, lmo0331, and lmo0610; (iii) only PrfA contributed to transcription of inlC and lmo2445; and (iv) neither sigma(B) nor PrfA contributed to transcription of the remaining 16 internalin-like genes under the conditions tested.
Affiliation(s)
- Patrick McGann
- Department of Food Science, Cornell University, Ithaca, New York 14853, USA
|
28
|
Siegmund K, Ahlborn C, Richert C. ChipCheckII - predicting binding curves for multiple analyte strands on small DNA microarrays. NUCLEOSIDES NUCLEOTIDES & NUCLEIC ACIDS 2008; 27:376-88. [PMID: 18404572 DOI: 10.1080/15257770801944147] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/13/2023]
Abstract
Incomplete binding, saturation, and cross-hybridization between partially complementary strands complicate the parallel detection of nucleic acids via DNA microarrays. Treating the competing equilibria governing binding to microarrays requires computational tools. We have developed the web-based program ChipCheckII that calculates total hybridization matrices for target strands interacting with probes on small DNA microarrays. The program can be used to compute the extent of cross-hybridization and other phenomena affecting fidelity of detection based on sequences, quantities of strands, and hybridization conditions as inputs. Enthalpy and entropy of duplex formation are generated locally with UNAfold, including those for complexes that are partially matched. Simulated binding versus temperature curves for portions of a commercial genome chip demonstrate the extent to which cross-hybridization can complicate DNA detection. ChipCheckII is expected to aid nucleic acid chemists in developing high fidelity DNA microarrays.
Affiliation(s)
- Karsten Siegmund
- Institute for Organic Chemistry, University of Karlsruhe, Karlsruhe, Germany
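To make the idea of a binding-versus-temperature curve in the abstract above concrete, here is a minimal two-state sketch. It is not ChipCheckII: it treats a single probe-target pair in isolation with target in excess (no competing equilibria, no depletion), and the enthalpy, entropy and concentration values are hypothetical placeholders rather than UNAfold output. The function name `fraction_bound` is also invented.

```python
import math

R = 8.314  # gas constant in J/(mol*K)

def fraction_bound(delta_h: float, delta_s: float, temp_k: float,
                   target_conc: float) -> float:
    """Fraction of probe sites occupied in a two-state Langmuir model.

    delta_h in J/mol, delta_s in J/(mol*K), temp_k in kelvin,
    target_conc in mol/L (assumed constant, i.e. target in excess).
    """
    delta_g = delta_h - temp_k * delta_s          # Gibbs free energy of duplex formation
    k_assoc = math.exp(-delta_g / (R * temp_k))   # association constant
    return k_assoc * target_conc / (1.0 + k_assoc * target_conc)

# Hypothetical duplex: delta_h = -627 kJ/mol, delta_s = -1674 J/(mol*K), 1 nM target.
curve = [(t, fraction_bound(-627e3, -1674.0, t, 1e-9)) for t in range(293, 374, 10)]
for temp_k, theta in curve:
    print(temp_k, round(theta, 4))
```

With these placeholder values the probe goes from essentially fully bound near room temperature to essentially unbound near 373 K, which is the qualitative melting behaviour such curves are meant to show.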
|
29
|
Pozhitkov AE, Nies G, Kleinhenz B, Tautz D, Noble PA. Simultaneous quantification of multiple nucleic acid targets in complex rRNA mixtures using high density microarrays and nonspecific hybridization as a source of information. J Microbiol Methods 2008; 75:92-102. [PMID: 18579240 DOI: 10.1016/j.mimet.2008.05.013] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2008] [Revised: 05/01/2008] [Accepted: 05/07/2008] [Indexed: 11/26/2022]
Abstract
To date, it has been problematic to accurately quantify multiple nucleic acid sequences, representing microbial targets, in multi-target mixtures using oligonucleotide microarrays, primarily due to nonspecific target binding (i.e., cross-hybridization). While some studies ignore the effects of nonspecific binding, other studies have developed approaches to minimize nonspecific binding, such as physical modeling to design highly specific probes, subtracting nonspecific signal using mismatch probes, and/or removing nonspecific duplexes by scanning through a range of wash stringencies. We have developed an alternative approach that, in contrast to previous approaches, uses nonspecific target binding as a source of information. Specifically, the new approach uses hybridization patterns (fingerprints) to quantify specific nucleic acid targets in complex target mixtures. We evaluated the approach by mixing together in vitro transcribed 28S rRNA targets at varying concentrations (up to 1.0 nM), and hybridizing the 24 mixtures to microarrays (n=3160 probes, in duplicate). Three independent Latin-square-designed experiments revealed accurate quantification of the targets. The actual target concentrations and those determined by the approach were strongly positively correlated, with high R² values (e.g., R²=0.90, n=6 targets; R²=0.84, n=8 targets; R²=0.82, n=10 targets).
Affiliation(s)
- Alex E Pozhitkov
- College of Marine Sciences, P.O. Box 7000, University of Southern Mississippi, Ocean Springs, MS 39566, USA.
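One way to picture quantification from hybridization fingerprints, as described in the abstract above, is linear unmixing. The abstract does not state which estimator the authors used, so the sketch below is an assumption: it supposes that each target has a known per-probe fingerprint at unit concentration, that signals add linearly, and that ordinary least squares recovers the concentrations. All data are synthetic and noise-free, and the function name is hypothetical.

```python
import numpy as np

def estimate_concentrations(fingerprints: np.ndarray, observed: np.ndarray) -> np.ndarray:
    """Least-squares estimate of target concentrations.

    fingerprints: (n_probes, n_targets) signal per unit concentration.
    observed:     (n_probes,) signal measured for the mixture.
    """
    coeffs, *_ = np.linalg.lstsq(fingerprints, observed, rcond=None)
    return coeffs

# Synthetic example: 50 probes, 3 targets, every probe responds to every target.
rng = np.random.default_rng(0)
fingerprints = rng.uniform(0.0, 1.0, size=(50, 3))
true_conc = np.array([0.2, 0.5, 1.0])   # arbitrary units
observed = fingerprints @ true_conc      # noise-free mixture signal

estimated = estimate_concentrations(fingerprints, observed)
print(estimated)
```

Real hybridization signals saturate and carry noise, so a practical version would likely need a non-linear response model or a non-negativity constraint.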
|
30
|
Malanoski AP, Lin B, Stenger DA. A model of base-call resolution on broad-spectrum pathogen detection resequencing DNA microarrays. Nucleic Acids Res 2008; 36:3194-201. [PMID: 18413341 PMCID: PMC2425482 DOI: 10.1093/nar/gkm1156] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
Oligonucleotide microarrays offer the potential to efficiently test for multiple organisms, an excellent feature for surveillance applications. Among these, resequencing microarrays are of particular interest, as they possess additional unique capabilities to track pathogens' genetic variations and perform detailed discrimination of closely related organisms. However, this potential can only be realized if the costs of developing the detection microarray are kept at a manageable level. Selection and verification of the probes are key factors affecting microarray design costs that can be reduced through the development and use of in silico modeling. Models created for other types of microarrays do not meet all the required criteria for this type of microarray. We describe here in silico methods for designing resequencing microarrays targeted for multiple organism detection. The model development presented here has focused on accurate base-call prediction in regions that are applicable to resequencing microarrays designed for multiple organism detection, a variation from other uses of a predictive model in which perfect prediction of all hybridization events is necessary. The model will assist in simplifying the design of resequencing microarrays and in reduction of the time and costs required for their development for new applications.
Affiliation(s)
- Anthony P Malanoski
- Center for Bio/Molecular Science & Engineering, Code 6900, Naval Research Laboratory, Washington DC 20375, USA.
|
31
|
Ono N, Suzuki S, Furusawa C, Agata T, Kashiwagi A, Shimizu H, Yomo T. An improved physico-chemical model of hybridization on high-density oligonucleotide microarrays. Bioinformatics 2008; 24:1278-85. [PMID: 18378525 PMCID: PMC2373920 DOI: 10.1093/bioinformatics/btn109] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022] Open
Abstract
Motivation: High-density DNA microarrays provide useful tools to analyze gene expression comprehensively. However, it is still difficult to obtain accurate expression levels from the observed microarray data because the signal intensity is affected by complicated factors involving probe–target hybridization, such as non-linear behavior of hybridization, non-specific hybridization, and folding of probe and target oligonucleotides. Various methods for microarray data analysis have been proposed to address this problem. In our previous report, we presented a benchmark analysis of probe–target hybridization using artificially synthesized oligonucleotides as targets, in which the effect of non-specific hybridization was negligible. The results showed that the preceding models explained the behavior of probe–target hybridization only within a narrow range of target concentrations. More accurate models are required for quantitative expression analysis. Results: The experiments showed that finiteness of both probe and target molecules should be considered to explain the hybridization behavior. In this article, we present an extension of the Langmuir model that reproduces the experimental results consistently. In this model, we introduced the effects of secondary structure formation, and dissociation of the probe–target duplex during washing after hybridization. The results will provide useful methods for the understanding and analysis of microarray experiments. Availability: The method was implemented for the R software and can be downloaded from our website (http://www-shimizu.ist.osaka-u.ac.jp/shimizu_lab/FHarray/). Contact: furusawa@ist.osaka-u.ac.jp Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Naoaki Ono
- Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka University, 2-1 Yamadaoka, Suita, Osaka 565-0871, Japan
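The point about finite probe and target amounts in the abstract above can be illustrated with a small mass-action sketch. It is not the authors' FHarray model: it covers only depletion of probe and target, omits the secondary-structure and washing terms the abstract mentions, and treats surface-bound probes as if they had a bulk concentration. Function names and numerical values are hypothetical.

```python
import math

def duplex_finite(k_assoc: float, probe_total: float, target_total: float) -> float:
    """Equilibrium duplex concentration when both probe and target are finite.

    Solves k_assoc = D / ((probe_total - D) * (target_total - D)) for D,
    taking the smaller (physically meaningful) root of the quadratic.
    """
    s = probe_total + target_total + 1.0 / k_assoc
    return 0.5 * (s - math.sqrt(s * s - 4.0 * probe_total * target_total))

def duplex_langmuir(k_assoc: float, probe_total: float, target_total: float) -> float:
    """Classical Langmuir isotherm: free target is assumed equal to total target."""
    x = k_assoc * target_total
    return probe_total * x / (1.0 + x)

# Hypothetical values: K = 1e9 per molar, 1 nM probe, 1 nM target.
K, P0, T0 = 1e9, 1e-9, 1e-9
d_finite = duplex_finite(K, P0, T0)
d_langmuir = duplex_langmuir(K, P0, T0)
print(d_finite, d_langmuir)
```

When probe and target are present in comparable amounts, the depletion-aware value falls below the classical Langmuir prediction, which is the kind of deviation the abstract attributes to finiteness.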
|
32
|
|
33
|
Koltai H, Weingarten-Baror C. Specificity of DNA microarray hybridization: characterization, effectors and approaches for data correction. Nucleic Acids Res 2008; 36:2395-405. [PMID: 18299281 PMCID: PMC2367720 DOI: 10.1093/nar/gkn087] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/26/2023] Open
Abstract
Microarray-hybridization specificity is one of the main effectors of microarray result quality. In the present review, we suggest a definition for specificity that spans four hybridization levels, from the single probe to the microarray platform. For increased hybridization specificity, it is important to quantify the extent of the specificity at each of these levels, and correct the data accordingly. We outline possible effects of low hybridization specificity on the obtained results and list possible effectors of hybridization specificity. In addition, we discuss several studies in which theoretical approaches, empirical means or data filtration were used to identify specificity effectors, and increase the specificity of the hybridization results. However, these various approaches may not yet provide an ultimate solution; rather, further tool development is needed to enhance microarray-hybridization specificity.
Affiliation(s)
- Hinanit Koltai
- Department of Ornamental Horticulture, ARO Volcani Center, Bet Dagan, Israel.
|
34
|
Casneuf T, Van de Peer Y, Huber W. In situ analysis of cross-hybridisation on microarrays and the inference of expression correlation. BMC Bioinformatics 2007; 8:461. [PMID: 18039370 PMCID: PMC2213692 DOI: 10.1186/1471-2105-8-461] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/05/2007] [Accepted: 11/26/2007] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Microarray co-expression signatures are an important tool for studying gene function and relations between genes. In addition to genuine biological co-expression, correlated signals can result from technical deficiencies like hybridization of reporters with off-target transcripts. An approach that is able to distinguish these factors permits the detection of more biologically relevant co-expression signatures. RESULTS We demonstrate a positive relation between off-target reporter alignment strength and expression correlation in data from oligonucleotide genechips. Furthermore, we describe a method that allows the identification, from their expression data, of individual probe sets affected by off-target hybridization. CONCLUSION The effects of off-target hybridization on expression correlation coefficients can be substantial, and can be alleviated by more accurate mapping between microarray reporters and the target transcriptome. We recommend attention to the mapping for any microarray analysis of gene expression patterns.
Affiliation(s)
- Tineke Casneuf
- Department of Plant Systems Biology, VIB, B-9052 Ghent, Belgium.
|
35
|
Suzuki S, Ono N, Furusawa C, Kashiwagi A, Yomo T. Experimental optimization of probe length to increase the sequence specificity of high-density oligonucleotide microarrays. BMC Genomics 2007; 8:373. [PMID: 17939865 PMCID: PMC2180184 DOI: 10.1186/1471-2164-8-373] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.8] [Received: 04/26/2007] [Accepted: 10/16/2007] [Indexed: 11/10/2022] Open
Abstract
Background High-density oligonucleotide arrays are widely used for analysis of genome-wide expression and genetic variation. Affymetrix GeneChips – common high-density oligonucleotide arrays – contain perfect match (PM) and mismatch (MM) probes generated by changing a single nucleotide of the PMs, to estimate cross-hybridization. However, a fraction of MM probes exhibit larger signal intensities than PMs, when the difference in the amount of target specific hybridization between PM and MM probes is smaller than the variance in the amount of cross-hybridization. Thus, pairs of PM and MM probes with greater specificity for single nucleotide mismatches are desirable for accurate analysis. Results To investigate the specificity for single nucleotide mismatches, we designed a custom array with probes of different length (14- to 25-mer) tethered to the surface of the array and all possible single nucleotide mismatches, and hybridized artificially synthesized 25-mer oligodeoxyribonucleotides as targets in bulk solution to avoid the effects of cross-hybridization. The results indicated the finite availability of target molecules as the probe length increases. Due to this effect, the sequence specificity of the longer probes decreases, and this was also confirmed even under the usual background conditions for transcriptome analysis. Conclusion Our study suggests that the optimal probe length for specificity is 19–21-mer. This conclusion will assist in improvement of microarray design for both transcriptome analysis and mutation screening.
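The probe design described above (a perfect-match probe plus every possible single-nucleotide mismatch) can be illustrated with a minimal Python sketch. The function name `single_mismatch_probes` and the example sequence are hypothetical and are not taken from the paper; the sketch only enumerates sequences and says nothing about hybridization behavior.

```python
def single_mismatch_probes(probe: str) -> list[str]:
    """Return every sequence that differs from `probe` at exactly one position."""
    bases = "ACGT"
    probe = probe.upper()
    if any(base not in bases for base in probe):
        raise ValueError("probe must contain only A, C, G and T")
    variants = []
    for position, original in enumerate(probe):
        for base in bases:
            if base != original:
                # Substitute one base and keep the rest of the probe unchanged.
                variants.append(probe[:position] + base + probe[position + 1:])
    return variants


if __name__ == "__main__":
    # Hypothetical 25-mer used purely for illustration.
    perfect_match = "ACGT" * 6 + "A"
    mismatches = single_mismatch_probes(perfect_match)
    # Each of the 25 positions has 3 alternative bases, giving 75 variants.
    print(len(mismatches))
```

A probe of length n always yields 3n single-mismatch variants, so the number of mismatch probes grows linearly with probe length across the 14- to 25-mer range studied.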
Affiliation(s)
- Shingo Suzuki
- Department of Bioinformatics Engineering, Graduate School of Information Science and Technology, Osaka University, 2-1 Yamadaoka, Suita, Osaka 565-0871, Japan.
|
36
|
Suzuki S, Furusawa C, Ono N, Kashiwagi A, Urabe I, Yomo T. Insight into the sequence specificity of a probe on an Affymetrix GeneChip by titration experiments using only one oligonucleotide. Biophysics (Nagoya-shi) 2007; 3:47-56. [PMID: 27857566 PMCID: PMC5036658 DOI: 10.2142/biophysics.3.47] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/13/2006] [Accepted: 07/20/2007] [Indexed: 12/01/2022] Open
Abstract
High-density oligonucleotide arrays are powerful tools for the analysis of genome-wide expression of genes and for genome-wide screens of genetic variation in living organisms. One of the critical problems in high-density oligonucleotide arrays is how to identify the actual amounts of a transcript due to noise and cross-hybridization involved in the observed signal intensities. Although mismatch (MM) probes are spotted on Affymetrix GeneChips to evaluate the noise and cross-hybridization embedded in perfect match (PM) probes, the behavior of probe-level signal intensities remains unclear. In the present study, we hybridized only one complement 25-mer oligonucleotide to characterize the behavior of duplex formation between target and probe in the complete absence of cross-hybridization. Titration experiments using only one oligonucleotide demonstrated that a substantial amount of intact target was hybridized not only to the PM but also the MM probe and that duplex formation between intact target and MM probe was efficiently reduced by increasing the stringency of hybridization conditions and shortening probe length. In addition, we discuss the correlation between potential for secondary structure of target oligonucleotide and hybridization intensity. These findings will be useful for the development of genome-wide analysis of gene expression and genetic variations by optimization of hybridization and probe conditions.
Affiliation(s)
- Shingo Suzuki
- Department of Bioinformatics Engineering, Graduate School of Information Science and Technology, Osaka University, 2-1 Yamadaoka, Suita, Osaka 565-0871, Japan
- Chikara Furusawa
- Department of Bioinformatics Engineering, Graduate School of Information Science and Technology, Osaka University, 2-1 Yamadaoka, Suita, Osaka 565-0871, Japan; Complex Systems Biology Project, ERATO, Japan Science and Technology Corporation, 2-1 Yamadaoka, Suita, Osaka 565-0871, Japan
- Naoaki Ono
- Complex Systems Biology Project, ERATO, Japan Science and Technology Corporation, 2-1 Yamadaoka, Suita, Osaka 565-0871, Japan
- Akiko Kashiwagi
- Department of Bioinformatics Engineering, Graduate School of Information Science and Technology, Osaka University, 2-1 Yamadaoka, Suita, Osaka 565-0871, Japan
- Itaru Urabe
- Department of Biotechnology, Graduate School of Engineering, Osaka University, 2-1 Yamadaoka, Suita, Osaka 565-0871, Japan
- Tetsuya Yomo
- Department of Bioinformatics Engineering, Graduate School of Information Science and Technology, Osaka University, 2-1 Yamadaoka, Suita, Osaka 565-0871, Japan; Complex Systems Biology Project, ERATO, Japan Science and Technology Corporation, 2-1 Yamadaoka, Suita, Osaka 565-0871, Japan; Graduate School of Frontier Biosciences, Osaka University, 1-3 Yamadaoka, Suita, Osaka 565-0871, Japan
|
37
|
Analysis of probe level patterns in Affymetrix microarray data. BMC Bioinformatics 2007; 8:146. [PMID: 17480226 PMCID: PMC1884176 DOI: 10.1186/1471-2105-8-146] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 12/21/2006] [Accepted: 05/04/2007] [Indexed: 11/18/2022] Open
Abstract
Background Microarrays have been used extensively to analyze the expression profiles for thousands of genes in parallel. Most of the widely used methods for analyzing Affymetrix Genechip microarray data, including RMA, GCRMA and Model Based Expression Index (MBEI), summarize probe signal intensity data to generate a single measure of expression for each transcript on the array. In contrast, other methods are applied directly to probe intensities, negating the need for a summarization step. Results In this study, we used the Affymetrix rat genome Genechip to explore variability in probe response patterns within transcripts. We considered a number of possible sources of variability in probe sets including probe location within the transcript, middle base pair of the probe sequence, probe overlap, sequence homology and affinity. Although affinity, middle base pair and probe location effects may be seen at the gross array level, these factors only account for a small proportion of the variation observed at the gene level. A BLAST search and the presence of probe by treatment interactions for selected differentially expressed genes showed high sequence homology for many probes to non-target genes. Conclusion We suggest that examination and modeling of probe level intensities can be used to guide researchers in refining their conclusions regarding differentially expressed genes. We discuss implications for probe sequence selection for confirmatory analysis using real time PCR.
|
38
|
Brukner I, El-Ramahi R, Gorska-Flipot I, Krajinovic M, Labuda D. An in vitro selection scheme for oligonucleotide probes to discriminate between closely related DNA sequences. Nucleic Acids Res 2007; 35:e66. [PMID: 17426126 PMCID: PMC1888810 DOI: 10.1093/nar/gkm156] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 01/02/2023] Open
Abstract
Using in vitro selection, we have obtained oligonucleotide probes with high discriminatory power against multiple, similar nucleic acid sequences, which is often required in diagnostic applications for simultaneous testing of such sequences. We have tested this approach, referred to as iterative hybridizations, by selecting probes against six 22-nt-long sequence variants representing human papillomavirus (HPV). We have obtained probes that efficiently discriminate between HPV types that differ by 3–7 nt. The probes were found effective in recognizing HPV sequences of types 6, 11, 16 and 18 and a pair of types 31 and 33, either when immobilized on a solid support or in a reverse configuration, as well as in discriminating HPV types from clinical samples. This methodology can be extended to generate diagnostic kits that rely on nucleic acid hybridization between closely related sequences. In this approach, instead of adjusting hybridization conditions to the intended set of probe–target pairs, we ‘adjust’, through in vitro selection, the probes to the conditions we have chosen. Importantly, these conditions have to be ‘relaxed’, allowing the formation of a variety of not fully complementary complexes from which those that efficiently recognize and discriminate intended from non-intended targets can be readily selected.
Affiliation(s)
- Ivan Brukner
- Centre de Recherche, Hôpital Sainte-Justine, Montréal, QC, Canada, Centre de Recherche, Hôpital Hôtel-Dieu, Montréal, QC, Canada, Département de Pédiatrie, Université de Montréal, Montréal, QC, Canada and Département de pathologie, Université de Montréal, Montréal, PQ, Canada
- *To whom correspondence should be addressed. (514) 345-4931 ext. 3586/3282; (514) 345-4731. Correspondence may also be addressed to Damian Labuda. (514) 345-4931 ext. 3586/3282; (514) 345-4731
- Razan El-Ramahi
- Centre de Recherche, Hôpital Sainte-Justine, Montréal, QC, Canada, Centre de Recherche, Hôpital Hôtel-Dieu, Montréal, QC, Canada, Département de Pédiatrie, Université de Montréal, Montréal, QC, Canada and Département de pathologie, Université de Montréal, Montréal, PQ, Canada
- Izabella Gorska-Flipot
- Centre de Recherche, Hôpital Sainte-Justine, Montréal, QC, Canada, Centre de Recherche, Hôpital Hôtel-Dieu, Montréal, QC, Canada, Département de Pédiatrie, Université de Montréal, Montréal, QC, Canada and Département de pathologie, Université de Montréal, Montréal, PQ, Canada
- Maja Krajinovic
- Centre de Recherche, Hôpital Sainte-Justine, Montréal, QC, Canada, Centre de Recherche, Hôpital Hôtel-Dieu, Montréal, QC, Canada, Département de Pédiatrie, Université de Montréal, Montréal, QC, Canada and Département de pathologie, Université de Montréal, Montréal, PQ, Canada
- Damian Labuda
- Centre de Recherche, Hôpital Sainte-Justine, Montréal, QC, Canada, Centre de Recherche, Hôpital Hôtel-Dieu, Montréal, QC, Canada, Département de Pédiatrie, Université de Montréal, Montréal, QC, Canada and Département de pathologie, Université de Montréal, Montréal, PQ, Canada
- *To whom correspondence should be addressed. (514) 345-4931 ext. 3586/3282; (514) 345-4731. Correspondence may also be addressed to Damian Labuda. (514) 345-4931 ext. 3586/3282; (514) 345-4731
|
39
|
Fodor AA, Tickle TL, Richardson C. Towards the uniform distribution of null P values on Affymetrix microarrays. Genome Biol 2007; 8:R69. [PMID: 17472745 PMCID: PMC1929139 DOI: 10.1186/gb-2007-8-5-r69] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.5] [Received: 09/11/2006] [Revised: 02/08/2007] [Accepted: 05/01/2007] [Indexed: 11/16/2022] Open
Abstract
Methods to control false-positive rates require that P values of genes that are not differentially expressed follow a uniform distribution. Commonly used microarray statistics can generate P values that do not meet this assumption. We show that poorly characterized variance, imperfect normalization, and cross-hybridization are among the many causes of this non-uniform distribution. We demonstrate a simple technique that produces P values that are close to uniform for nondifferentially expressed genes in control datasets.
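The uniformity requirement stated above can be checked numerically. The following Python sketch computes a one-sample Kolmogorov-Smirnov distance between a set of P values and the uniform distribution on [0, 1]. It is a generic diagnostic chosen for illustration, not the technique the authors propose, and the function name `ks_uniform_statistic` and the simulated data are assumptions.

```python
import random


def ks_uniform_statistic(p_values):
    """Largest gap between the empirical CDF of `p_values` and the Uniform(0, 1) CDF."""
    xs = sorted(p_values)
    n = len(xs)
    if n == 0:
        raise ValueError("at least one P value is required")
    if xs[0] < 0.0 or xs[-1] > 1.0:
        raise ValueError("P values must lie in [0, 1]")
    distance = 0.0
    for rank, x in enumerate(xs, start=1):
        # The empirical CDF jumps from (rank - 1) / n to rank / n at x;
        # the uniform CDF at x is x itself.
        distance = max(distance, rank / n - x, x - (rank - 1) / n)
    return distance


if __name__ == "__main__":
    rng = random.Random(0)
    # Simulated null P values that are uniform by construction.
    uniform_like = [rng.random() for _ in range(2000)]
    # Simulated anti-conservative P values (squaring pushes them toward 0).
    skewed = [rng.random() ** 2 for _ in range(2000)]
    print(ks_uniform_statistic(uniform_like))
    print(ks_uniform_statistic(skewed))
```

A small distance is consistent with uniform null P values; a large one suggests that a false-positive control procedure relying on uniformity may be miscalibrated for that statistic.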
Affiliation(s)
- Anthony A Fodor
- Bioinformatics Resource Center, The University of North Carolina at Charlotte, University City Boulevard, Charlotte, North Carolina 28223, USA
- Timothy L Tickle
- Bioinformatics Resource Center, The University of North Carolina at Charlotte, University City Boulevard, Charlotte, North Carolina 28223, USA
- Christine Richardson
- Bioinformatics Resource Center, The University of North Carolina at Charlotte, University City Boulevard, Charlotte, North Carolina 28223, USA
- Department of Biology, The University of North Carolina at Charlotte, University City Boulevard, Charlotte, North Carolina 28223, USA
|
40
|
Eklund AC, Turner LR, Chen P, Jensen RV, deFeo G, Kopf-Sill AR, Szallasi Z. Replacing cRNA targets with cDNA reduces microarray cross-hybridization. Nat Biotechnol 2006; 24:1071-3. [PMID: 16964210 DOI: 10.1038/nbt0906-1071] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.2] [Indexed: 11/08/2022]
|
41
|
Marcelino LA, Backman V, Donaldson A, Steadman C, Thompson JR, Preheim SP, Lien C, Lim E, Veneziano D, Polz MF. Accurately quantifying low-abundant targets amid similar sequences by revealing hidden correlations in oligonucleotide microarray data. Proc Natl Acad Sci U S A 2006; 103:13629-34. [PMID: 16950880 PMCID: PMC1559406 DOI: 10.1073/pnas.0601476103] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.4] [Indexed: 11/18/2022] Open
Abstract
Microarrays have enabled the determination of how thousands of genes are expressed to coordinate function within single organisms. Yet applications to natural or engineered communities where different organisms interact to produce complex properties are hampered by theoretical and technological limitations. Here we describe a general method to accurately identify low-abundant targets in systems containing complex mixtures of homologous targets. We combined an analytical predictor of nonspecific probe-target interactions (cross-hybridization) with an optimization algorithm that iteratively deconvolutes true probe-target signal from raw signal affected by spurious contributions (cross-hybridization, noise, background, and unequal specific hybridization response). The method was capable of quantifying, with unprecedented specificity and accuracy, ribosomal RNA (rRNA) sequences in artificial and natural communities. Controlled experiments with spiked rRNA into artificial and natural communities demonstrated the accuracy of identification and quantitative behavior over different concentration ranges. Finally, we illustrated the power of this methodology for accurate detection of low-abundant targets in natural communities. We accurately identified Vibrio taxa in coastal marine samples at their natural concentrations (<0.05% of total bacteria), despite the high potential for cross-hybridization by hundreds of different coexisting rRNAs, suggesting this methodology should be expandable to any microarray platform and system requiring accurate identification of low-abundant targets amid pools of similar sequences.
Affiliation(s)
- Luisa A. Marcelino
- *Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139
- Vadim Backman
- Biomedical Engineering Department, Northwestern University, Evanston, IL 60208
- Andres Donaldson
- *Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139
- Claudia Steadman
- *Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139
- Janelle R. Thompson
- *Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139
- Sarah Pacocha Preheim
- *Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139
- Cynthia Lien
- *Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139
- Eelin Lim
- Department of Biology, Temple University, Philadelphia, PA 19122
- Daniele Veneziano
- *Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139
- Martin F. Polz
- *Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139
|
42
|
Okoniewski MJ, Miller CJ. Hybridization interactions between probesets in short oligo microarrays lead to spurious correlations. BMC Bioinformatics 2006; 7:276. [PMID: 16749918 PMCID: PMC1513401 DOI: 10.1186/1471-2105-7-276] [Citation(s) in RCA: 133] [Impact Index Per Article: 7.4] [Received: 01/23/2006] [Accepted: 06/02/2006] [Indexed: 11/10/2022] Open
Abstract
Background Microarrays measure the binding of nucleotide sequences to a set of sequence specific probes. This information is combined with annotation specifying the relationship between probes and targets and used to make inferences about transcript- and, ultimately, gene expression. In some situations, a probe is capable of hybridizing to more than one transcript, in others, multiple probes can target a single sequence. These 'multiply targeted' probes can result in non-independence between measured expression levels. Results An analysis of these relationships for Affymetrix arrays considered both the extent and influence of exact matches between probe and transcript sequences. For the popular HGU133A array, approximately half of the probesets were found to interact in this way. Both real and simulated expression datasets were used to examine how these effects influenced the expression signal. It was found not only to lead to increased signal strength for the affected probesets, but the major effect is to significantly increase their correlation, even in situations when only a single probe from a probeset was involved. By building a network of probe-probeset-transcript relationships, it is possible to identify families of interacting probesets. More than 10% of the families contain members annotated to different genes or even different Unigene clusters. Within a family, a mixture of genuine biological and artefactual correlations can occur. Conclusion Multiple targeting is not only prevalent, but also significant. The ability of probesets to hybridize to more than one gene product can lead to false positives when analysing gene expression. Comprehensive annotation describing multiple targeting is required when interpreting array data.
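The idea of grouping probesets into interacting families through a probe-probeset-transcript network can be sketched as a connected-components problem. The Python example below uses a union-find structure over hypothetical (probeset, transcript) match pairs; the identifiers, the input format and the function name `probeset_families` are illustrative assumptions and do not reproduce the authors' pipeline.

```python
from collections import defaultdict


def probeset_families(matches):
    """Group probesets that are linked, directly or indirectly, by shared transcripts.

    `matches` is an iterable of (probeset_id, transcript_id) pairs, one for each
    probeset with at least one probe matching the transcript.
    Returns a sorted list of families, keeping only families with two or more probesets.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            # Path halving keeps the trees shallow.
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for probeset, transcript in matches:
        # Tag nodes by kind so a probeset and a transcript can share a name.
        union(("probeset", probeset), ("transcript", transcript))

    families = defaultdict(set)
    for node in list(parent):
        kind, name = node
        if kind == "probeset":
            families[find(node)].add(name)
    return sorted(sorted(members) for members in families.values() if len(members) > 1)


if __name__ == "__main__":
    # Hypothetical matches: ps1 and ps2 share tA, ps2 and ps3 share tB, ps4 stands alone.
    example = [("ps1", "tA"), ("ps2", "tA"), ("ps2", "tB"), ("ps3", "tB"), ("ps4", "tC")]
    print(probeset_families(example))
```

In this toy input, `ps1` and `ps3` end up in the same family even though they share no transcript directly, which is the kind of indirect relationship that could give rise to artefactual correlations.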
Affiliation(s)
- Michał J Okoniewski
- Paterson Institute For Cancer Research, Christie Hospital site, University of Manchester, Wilmslow Road, Manchester, M20 4BX, UK
- Crispin J Miller
- Paterson Institute For Cancer Research, Christie Hospital site, University of Manchester, Wilmslow Road, Manchester, M20 4BX, UK
|
43
|
Held GA, Grinstein G, Tu Y. Relationship between gene expression and observed intensities in DNA microarrays--a modeling study. Nucleic Acids Res 2006; 34:e70. [PMID: 16723429 PMCID: PMC1472623 DOI: 10.1093/nar/gkl122] [Citation(s) in RCA: 69] [Impact Index Per Article: 3.8] [Indexed: 12/01/2022] Open
Abstract
A theoretical study of the physical properties which determine the variation in signal strength from probe to probe on a microarray is presented. A model which incorporates probe-target hybridization, as well as the subsequent dissociation which occurs during stringent washing of the microarray, is introduced and shown to reasonably describe publicly available spike-in experiments carried out at Affymetrix. In particular, this model suggests that probe-target dissociation during the stringent wash plays a critical role in determining the observed hybridization intensities. In addition, it is demonstrated that non-specific hybridization introduces uncertainties which significantly limit the ability of any model to accurately quantify absolute gene expression levels while, in contrast, target folding appears to have little effect on these results. Finally, for data from target spike-in experiments, our model is shown to compare favorably with an existing statistical model in determining target concentration levels.
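To make the two-stage picture above concrete (binding during hybridization, then loss during the stringent wash), here is a deliberately simplified Python sketch. It assumes equilibrium Langmuir binding followed by first-order dissociation with no rebinding during the wash; these assumptions, the parameter names and the function `washed_signal` are illustrative and are not the model published in the paper.

```python
import math


def washed_signal(concentration, affinity, off_rate, wash_time):
    """Fraction of probe sites still occupied after hybridization and washing.

    Assumptions (illustrative only):
      * hybridization reaches a Langmuir equilibrium, theta = K*c / (1 + K*c);
      * during the wash, bound target dissociates at a first-order rate and
        does not rebind, so occupancy decays as exp(-off_rate * wash_time).
    """
    if min(concentration, affinity, off_rate, wash_time) < 0:
        raise ValueError("all parameters must be non-negative")
    bound_before_wash = affinity * concentration / (1.0 + affinity * concentration)
    surviving_fraction = math.exp(-off_rate * wash_time)
    return bound_before_wash * surviving_fraction


if __name__ == "__main__":
    # Two hypothetical probes with equal binding affinity but different off-rates.
    for off_rate in (0.01, 0.5):
        print(off_rate, washed_signal(concentration=1.0, affinity=2.0,
                                      off_rate=off_rate, wash_time=5.0))
```

Under these assumptions, two probes that bind equally well during hybridization can still report very different intensities if their duplexes dissociate at different rates in the wash, which is the qualitative point the abstract makes.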
Affiliation(s)
- G A Held
- IBM TJ Watson Research Center, PO Box 218, Yorktown Heights, NY 10598, USA.
|
44
|
Tetko IV, Haberer G, Rudd S, Meyers B, Mewes HW, Mayer KFX. Spatiotemporal expression control correlates with intragenic scaffold matrix attachment regions (S/MARs) in Arabidopsis thaliana. PLoS Comput Biol 2006; 2:e21. [PMID: 16604187 PMCID: PMC1420657 DOI: 10.1371/journal.pcbi.0020021] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Received: 11/16/2005] [Accepted: 02/07/2006] [Indexed: 11/18/2022] Open
Abstract
Scaffold/matrix attachment regions (S/MARs) are essential for structural organization of the chromatin within the nucleus and serve as anchors of chromatin loop domains. A significant fraction of genes in Arabidopsis thaliana contains intragenic S/MAR elements and a significant correlation of S/MAR presence and overall expression strength has been demonstrated. In this study, we undertook a genome scale analysis of expression level and spatiotemporal expression differences in correlation with the presence or absence of genic S/MAR elements. We demonstrate that genes containing intragenic S/MARs are prone to pronounced spatiotemporal expression regulation. This characteristic is found to be even more pronounced for transcription factor genes. Our observations illustrate the importance of S/MARs in transcriptional regulation and the role of chromatin structural characteristics for gene regulation. Our findings open new perspectives for the understanding of tissue- and organ-specific regulation of gene expression.
Scaffold/matrix attachment regions (S/MARs) are AT-rich DNA sequences that mediate structural organization of the chromatin within the nucleus. These elements constitute anchor points of the DNA for the chromatin scaffold and serve to organize the chromatin into structural domains. Studies on individual genes led to the conclusion that the dynamic and complex organization of the chromatin mediated by S/MAR elements plays an important role in the regulation of gene expression. In addition to intergenic S/MARs, which likely exert important insulator effects, more than 2,000 intragenic S/MARs have been shown to be present within the Arabidopsis genome. In this study, the authors set out to analyze the effects of these intragenic S/MAR elements on the regulation of the genes affected.
Making use of exhaustive and multidimensional expression datasets available for Arabidopsis, the authors analyzed overall expression differences and correlation of intragenic S/MARs with spatiotemporal expression of genes. On a genome scale, pronounced tissue- and organ-specific and developmental expression patterns of S/MAR-containing genes have been detected. Notably, transcription factor genes contain a significantly higher proportion of S/MARs. The pronounced difference in expression characteristics of S/MAR-containing genes emphasizes their functional importance and the importance of structural chromosomal characteristics for gene regulation in plants as well as within other eukaryotes.
Affiliation(s)
- Igor V Tetko
- GSF National Research Center for Environment and Health, MIPS, Institute for Bioinformatics, Neuherberg, Germany
- Georg Haberer
- GSF National Research Center for Environment and Health, MIPS, Institute for Bioinformatics, Neuherberg, Germany
- Stephen Rudd
- Bioinformatics Group, Turku Centre for Biotechnology, Tykistokatu, Turku, Finland
- Blake Meyers
- Department of Plant and Soil Sciences, Delaware Biotechnology Institute, Newark, New Jersey, United States of America
- Hans-Werner Mewes
- GSF National Research Center for Environment and Health, MIPS, Institute for Bioinformatics, Neuherberg, Germany
- Department of Genome-Oriented Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, Freising, Germany
- Klaus F. X Mayer
- GSF National Research Center for Environment and Health, MIPS, Institute for Bioinformatics, Neuherberg, Germany
|
45
|
Draghici S, Khatri P, Eklund AC, Szallasi Z. Reliability and reproducibility issues in DNA microarray measurements. Trends Genet 2005; 22:101-9. [PMID: 16380191 PMCID: PMC2386979 DOI: 10.1016/j.tig.2005.12.005] [Citation(s) in RCA: 402] [Impact Index Per Article: 21.2] [Received: 10/01/2005] [Revised: 11/16/2005] [Accepted: 12/08/2005] [Indexed: 11/16/2022]
Abstract
DNA microarrays enable researchers to monitor the expression of thousands of genes simultaneously. However, the current technology has several limitations. Here we discuss problems related to the sensitivity, accuracy, specificity and reproducibility of microarray results. The existing data suggest that for relatively abundant transcripts the existence and direction (but not the magnitude) of expression changes can be reliably detected. However, accurate measurements of absolute expression levels and the reliable detection of low abundance genes are difficult to achieve. The main problems seem to be the sub-optimal design or choice of probes and some incorrect probe annotations. Well-designed data-analysis approaches can rectify some of these problems.
Affiliation(s)
- Sorin Draghici
- Department of Computer Science, Wayne State University, 431 State Hall, Detroit, MI 48202, USA.
|
46
|
MacLennan NK, Rahib L, Shin C, Fang Z, Horvath S, Dean J, Liao JC, McCabe ERB, Dipple KM. Targeted disruption of glycerol kinase gene in mice: expression analysis in liver shows alterations in network partners related to glycerol kinase activity. Hum Mol Genet 2005; 15:405-15. [PMID: 16368706 DOI: 10.1093/hmg/ddi457] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.5] [Indexed: 11/14/2022] Open
Abstract
Glycerol kinase deficiency (GKD) is an X-linked inborn error of metabolism with metabolic and neurological crises. Liver shows the highest level of glycerol kinase (GK) activity in humans and mice. Absence of genotype-phenotype correlations in patients with GKD indicates the involvement of modifier genes, including other network partners. To understand the molecular pathogenesis of GKD, we performed microarray analysis on liver mRNA from neonatal glycerol kinase (Gyk) knockout (KO) and wild-type (WT) mice. Unsupervised learning revealed that the overall gene expression profile of the KO mice was different from that of WT. Real-time PCR confirmed the differences for selected genes. Functional gene enrichment analysis was used to find 56 increased and 37 decreased gene functional categories. PathwayAssist analysis identified changes in gene expression levels of genes involved in organic acid metabolism indicating that GK was part of the same metabolic network which correlates well with the patients with GKD having metabolic acidemia during their episodic crises. Network component analysis (NCA) showed that transcription factors sterol regulatory element-binding protein (SREBP)-1c, carbohydrate response element-binding protein (ChREBP), hepatocyte nuclear factor-4 alpha (HNF-4alpha) and peroxisome proliferative-activated receptor-alpha (PPARalpha) had increased activity in the Gyk KO mice compared with WT mice, whereas SREBP-2 was less active in the Gyk KO mice. These studies show that Gyk deletion causes alterations in expression of genes in several regulatory networks and is the first time NCA has been used to expand on microarray data from a mouse KO model of a human disease.
Affiliation(s)
- Nicole K MacLennan
- Department of Pediatrics, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095-7088, USA
|
47
|
Bishop J, Blair S, Chagovetz AM. A competitive kinetic model of nucleic acid surface hybridization in the presence of point mutants. Biophys J 2005; 90:831-40. [PMID: 16284267 PMCID: PMC1367108 DOI: 10.1529/biophysj.105.072314] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.0] [Indexed: 11/18/2022] Open
Abstract
Microarray analysis has become increasingly complex due to the growing size of arrays and the inherent cross-binding of targets. In this work, we explore the effects of matched and mismatched target species concentrations, temperature, and the time of hybridization on sensing specificity in two-component systems. A finite element software is used to simulate the diffusion of DNA through a microfluidic chamber to the sensing surface where hybridization of DNA is modeled using the corresponding kinetic equation. Comparison between a single-component system, where only one target is allowed to bind to a specific zone, and a two-component system, where more than one target can hybridize in a sensing zone, uncovers significant kinetic disparities during the transitory state; however, at thermodynamic equilibrium a modified Langmuir isotherm governs the bound amount of both species. The results presented suggest that it may be more appropriate to consider collective rather than quasi-independent interaction of targets in multicomponent systems.
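The equilibrium behavior mentioned above can be illustrated with the standard competitive Langmuir isotherm, in which each species i occupies a fraction K_i c_i / (1 + Σ_j K_j c_j) of the sensing sites. The abstract does not give the exact form of the authors' "modified" isotherm, so this textbook expression, the function name `competitive_langmuir` and the numerical values are assumptions made for illustration.

```python
def competitive_langmuir(concentrations, affinities):
    """Equilibrium site occupancy of each species competing for the same probe sites.

    theta_i = K_i * c_i / (1 + sum_j K_j * c_j)   (standard competitive Langmuir form)
    """
    if len(concentrations) != len(affinities):
        raise ValueError("one affinity constant is required per species")
    if any(value < 0 for value in list(concentrations) + list(affinities)):
        raise ValueError("concentrations and affinities must be non-negative")
    terms = [k * c for k, c in zip(affinities, concentrations)]
    denominator = 1.0 + sum(terms)
    return [term / denominator for term in terms]


if __name__ == "__main__":
    # Hypothetical matched target (high affinity) and point-mutant target (lower affinity).
    alone = competitive_langmuir([1.0], [10.0])
    together = competitive_langmuir([1.0, 5.0], [10.0, 1.0])
    print("matched target alone:", alone[0])
    print("matched target with excess mismatch:", together[0])
    print("mismatch occupancy:", together[1])
```

Because all species share one denominator, adding a mismatched target lowers the occupancy of the matched target even though the matched target's own concentration is unchanged, which is why treating targets as quasi-independent can mislead.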
Affiliation(s)
- J Bishop
- Department of Electrical and Computer Engineering University of Utah, Salt Lake City, Utah, USA
|
48
|
Imbeaud S, Auffray C. 'The 39 steps' in gene expression profiling: critical issues and proposed best practices for microarray experiments. Drug Discov Today 2005; 10:1175-82. [PMID: 16182210 DOI: 10.1016/s1359-6446(05)03565-8] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.9] [Indexed: 10/25/2022]
Abstract
Gene expression microarrays have been used widely to address increasingly complex biological questions and to produce an unprecedented amount of data, but have yet to realize their full potential. The interpretation of microarray data remains a major challenge because of the complexity of the underlying biological networks. To gather meaningful expression data, it is crucial to develop standardized approaches for vigilant study design, controlled annotation of resources, careful quality control of experiments, robust statistics, and data registration and storage. This article reviews the steps needed in the design and execution of valid microarray experiments so that global gene expression data can play a major role in the pursuit of future biological discoveries that will impact drug development.
Affiliation(s)
- Sandrine Imbeaud
- Array s/IMAGE, Genexpress, Functional Genomics and Systems Biology for Health, LGN UMR 7091, CNRS and Pierre and Marie Curie University, Paris VI, Villejuif, France.
|