1
|
Transcriptome analyses of liver in newly-hatched chicks during the metabolic perturbation of fasting and re-feeding reveals THRSPA as the key lipogenic transcription factor. BMC Genomics 2020; 21:109. [PMID: 32005146 PMCID: PMC6995218 DOI: 10.1186/s12864-020-6525-0] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 05/01/2019] [Accepted: 01/22/2020] [Indexed: 12/17/2022]
Abstract
Background The fasting-refeeding perturbation has been used extensively to reveal specific genes and metabolic pathways that control energy metabolism in the chicken. Most global transcriptional scans of the fasting-refeeding response in liver have focused on juvenile chickens that were 1, 2 or 4 weeks old. The present study was aimed at the immediate post-hatch period, in which newly-hatched chicks were subjected to fasting for 4, 24 or 48 h, then refed for 4, 24 or 48 h, and compared with a fully-fed control group at each age (D1-D4). Results Visual analysis of hepatic gene expression profiles using hierarchical and K-means clustering showed two distinct patterns, genes with higher expression during fasting and depressed expression upon refeeding and those with an opposing pattern of expression, which exhibit very low expression during fasting and more abundant expression with refeeding. Differentially-expressed genes (DEGs), identified from five prominent pair-wise contrasts of fed, fasted and refed conditions, were subjected to Ingenuity Pathway Analysis. This enabled mapping of analysis-ready (AR)-DEGs to canonical and metabolic pathways controlled by distinct gene interaction networks. The largest number of hepatic DEGs was identified by two contrasts: D2FED48h/D2FAST48h (968 genes) and D2FAST48h/D3REFED24h (1198 genes). The major genes acutely depressed by fasting and elevated upon refeeding included ANGTPL, ATPCL, DIO2, FASN, ME1, SCD, PPARG, SREBP2 and THRSPA—a primary lipogenic transcription factor. In contrast, major lipolytic genes were up-regulated by fasting or down-regulated after refeeding, including ALDOB, IL-15, LDHB, LPIN2, NFE2L2, NR3C1, NR0B1, PANK1, PPARA, SERTAD2 and UPP2. Conclusions Transcriptional profiling of liver during fasting/re-feeding of newly-hatched chicks revealed several highly-expressed upstream regulators, which enable the metabolic switch from fasted (lipolytic/gluconeogenic) to fed or refed (lipogenic/thermogenic) states. 
This rapid homeorhetic shift of whole-body metabolism from a catabolic-fasting state to an anabolic-fed state appears precisely orchestrated by a small number of ligand-activated transcription factors that provide either a fasting-lipolytic state (PPARA, NR3C1, NFE2L2, SERTAD2, FOXO1, NR0B1, RXR) or a fully-fed and refed lipogenic/thermogenic state (THRSPA, SREBF2, PPARG, PPARD, JUN, ATF3, CTNNB1). THRSPA has emerged as the key transcriptional regulator that drives lipogenesis and thermogenesis in hatchling chicks, as shown here in fed and re-fed states.
|
2
|
Baciu C, Thompson KJ, Mougeot JL, Brooks BR, Weller JW. The LO-BaFL method and ALS microarray expression analysis. BMC Bioinformatics 2012; 13:244. [PMID: 23006766 PMCID: PMC3526454 DOI: 10.1186/1471-2105-13-244] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 03/06/2012] [Accepted: 09/05/2012] [Indexed: 12/12/2022]
Abstract
BACKGROUND Sporadic Amyotrophic Lateral Sclerosis (sALS) is a devastating, complex disease of unknown etiology. We studied this disease with microarray technology to capture as much biological complexity as possible. The Affymetrix-focused BaFL pipeline takes into account problems with probes that arise from physical and biological properties, so we adapted it to handle the long-oligonucleotide probes on our arrays (hence LO-BaFL). The revised method was tested against a validated array experiment and then used in a meta-analysis of peripheral white blood cells from healthy control samples in two experiments. We predicted differentially expressed (DE) genes in our sALS data, combining the results obtained using the TM4 suite of tools with those from the LO-BaFL method. Those predictions were tested using qRT-PCR assays. RESULTS LO-BaFL filtering and DE testing accurately predicted previously validated DE genes in a published experiment on coronary artery disease (CAD). Filtering healthy control data from the sALS and CAD studies with LO-BaFL resulted in highly correlated expression levels across many genes. After bioinformatics analysis, twelve genes from the sALS DE gene list were selected for independent testing using qRT-PCR assays. High-quality RNA from six healthy Control and six sALS samples yielded the predicted differential expression for 7 genes: TARDBP, SKIV2L2, C12orf35, DYNLT1, ACTG1, B2M, and ILKAP. Four of the seven have been previously described in sALS studies, while ACTG1, B2M and ILKAP appear in the context of this disease for the first time. Supplementary material can be accessed at: http://webpages.uncc.edu/~cbaciu/LO-BaFL/supplementary_data.html. CONCLUSION LO-BaFL predicts DE results that are broadly similar to those of other methods. The small healthy control cohort in the sALS study is a reasonable foundation for predicting DE genes. 
Modifying the BaFL pipeline allowed us to remove noise and systematic errors, improving the power of this study, which had a small sample size. Each bioinformatics approach revealed DE genes not predicted by the other; subsequent PCR assays confirmed seven of twelve candidates, a relatively high success rate.
Affiliation(s)
- Cristina Baciu
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC, 28223, USA
- Kevin J Thompson
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC, 28223, USA
- Jean-Luc Mougeot
- ALS Biomarker Laboratory, Carolinas Neuromuscular/ALS-MDA Center, Department of Neurology, Carolinas Medical Center, Charlotte, NC, 28207, USA
- University of North Carolina School of Medicine, Charlotte Campus, Charlotte, NC, 28203, USA
- Benjamin R Brooks
- ALS Biomarker Laboratory, Carolinas Neuromuscular/ALS-MDA Center, Department of Neurology, Carolinas Medical Center, Charlotte, NC, 28207, USA
- University of North Carolina School of Medicine, Charlotte Campus, Charlotte, NC, 28203, USA
- Jennifer W Weller
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC, 28223, USA
|
3
|
Sporer KRB, Tempelman RJ, Ernst CW, Reed KM, Velleman SG, Strasburg GM. Transcriptional profiling identifies differentially expressed genes in developing turkey skeletal muscle. BMC Genomics 2011; 12:143. [PMID: 21385442 PMCID: PMC3060885 DOI: 10.1186/1471-2164-12-143] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Received: 09/29/2010] [Accepted: 03/08/2011] [Indexed: 11/12/2022]
Abstract
Background Skeletal muscle growth and development from embryo to adult consists of a series of carefully regulated changes in gene expression. Understanding these developmental changes in agriculturally important species is essential to the production of high quality meat products. For example, consumer demand for lean, inexpensive meat products has driven the turkey industry to unprecedented production through intensive genetic selection. However, achievements of increased body weight and muscle mass have been countered by an increased incidence of myopathies and meat quality defects. In a previous study, we developed and validated a turkey skeletal muscle-specific microarray as a tool for functional genomics studies. The goals of the current study were to utilize this microarray to elucidate functional pathways of genes responsible for key events in turkey skeletal muscle development and to compare differences in gene expression between two genetic lines of turkeys. To achieve these goals, skeletal muscle samples were collected at three critical stages in muscle development: 18d embryo (hyperplasia), 1d post-hatch (shift from myoblast-mediated growth to satellite cell-modulated growth by hypertrophy), and 16wk (market age) from two genetic lines: a randombred control line (RBC2) maintained without selection pressure, and a line (F) selected from the RBC2 line for increased 16wk body weight. Array hybridizations were performed in two experiments: Experiment 1 directly compared the developmental stages within genetic line, while Experiment 2 directly compared the two lines within each developmental stage. Results A total of 3474 genes were differentially expressed (false discovery rate; FDR < 0.001) by overall effect of development, while 16 genes were differentially expressed (FDR < 0.10) by overall effect of genetic line. Ingenuity Pathways Analysis was used to group annotated genes into networks, functions, and canonical pathways. 
The expression of 28 genes involved in extracellular matrix regulation, cell death/apoptosis, and calcium signaling/muscle function, as well as genes with miscellaneous function was confirmed by qPCR. Conclusions The current study identified gene pathways and uncovered novel genes important in turkey muscle growth and development. Future experiments will focus further on several of these candidate genes and the expression and mechanism of action of their protein products.
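The abstract reports gene counts at false discovery rate thresholds but does not say which FDR procedure was used. As an illustration only, the following minimal sketch assumes the widely used Benjamini-Hochberg adjustment; the function name and the p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the study's FDR method is not stated in the abstract;
    BH is used here purely as a common example.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene p-values.
example_p = [0.01, 0.04, 0.03, 0.005]
example_q = benjamini_hochberg(example_p)
# Genes whose adjusted value falls below the chosen FDR threshold.
significant = [i for i, q in enumerate(example_q) if q < 0.05]
```

A gene is then called differentially expressed when its adjusted value is below the chosen threshold, for example 0.001 for the development effect or 0.10 for the genetic line effect quoted above.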
Affiliation(s)
- Kelly R B Sporer
- Department of Food Science and Human Nutrition, Michigan State University, East Lansing, Michigan 48824, USA
|
4
|
Uhlmann NK, Beckles DM. Storage products and transcriptional analysis of the endosperm of cultivated wheat and two wild wheat species. J Appl Genet 2011; 51:431-47. [PMID: 21063061 DOI: 10.1007/bf03208873] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/27/2022]
Abstract
The starch and protein in wheat (Triticum aestivum L.) endosperm provide 20 percent of the calories eaten by humans and were heavily selected for during domestication. We examined the main storage products and gene expression patterns that may embody compositional differences between two wild species, Aegilops crassa and Aegilops tauschii, and cultivated bread wheat. The storage product profiles differed significantly, with T. aestivum accumulating twice as much carbon as the wild species, while the latter had 1.5 to 2-fold more total nitrogen per seed. Transcriptional analyses of endosperms of similar fresh weight were compared using a cDNA macroarray. Aegilops tauschii, and especially Ae. crassa, had stronger hybridizations with storage protein sequences, but while there were differences in transcripts for starch biosynthetic genes, they were less dramatic. Of these, we cloned the Starch Branching Enzymes (SBE) IIa promoter region and the genomic clone of the Brittle-1 (Bt1) ADPglucose transporter. While the Ae. crassa SBEIIa sequence was more divergent than that of Ae. tauschii compared to bread wheat, there were no sequence polymorphisms that would explain the observed expression differences in Bt1 between these species. Furthermore, while there were nucleotide differences between Bt1 in Ae. crassa and bread wheat, they were synonymous at the amino acid level. Some of the transcriptional differences identified here, however, deserve further examination as part of a strategy to manipulate wheat starch and protein composition.
Affiliation(s)
- N K Uhlmann
- DuPont-Pioneer, Crop Genetics Research, Experimental Station, Wilmington, USA
|
5
|
Sporer KRB, Chiang W, Tempelman RJ, Ernst CW, Reed KM, Velleman SG, Strasburg GM. Characterization of a 6K oligonucleotide turkey skeletal muscle microarray. Anim Genet 2011; 42:75-82. [DOI: 10.1111/j.1365-2052.2010.02085.x] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 11/28/2022]
|
6
|
Healy TM, Tymchuk WE, Osborne EJ, Schulte PM. Heat shock response of killifish (Fundulus heteroclitus): candidate gene and heterologous microarray approaches. Physiol Genomics 2010; 41:171-84. [PMID: 20103695 DOI: 10.1152/physiolgenomics.00209.2009] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.6] [Indexed: 12/21/2022]
Abstract
Northern and southern subspecies of the Atlantic killifish, Fundulus heteroclitus, differ in maximal thermal tolerance. To determine whether these subspecies also differ in their heat shock response (HSR), we exposed 20°C acclimated killifish to a 2 h heat shock at 34°C and examined gene expression in fish from both subspecies during heat shock and recovery using real-time quantitative PCR and a heterologous cDNA microarray designed for salmonid fishes. The heat shock proteins hsp70-1, hsp27, and hsp30 were upregulated to a greater extent in the high temperature-tolerant southern subspecies than in the less tolerant northern subspecies, whereas hsp70-2 (which showed the largest upregulation of all the heat shock proteins) in both gill and muscle and hsp90α in muscle were upregulated to a greater extent in northern than in southern fish. These data demonstrate that differences in the HSR between subspecies cannot be due to changes in a single global regulator but must occur via gene-specific mechanisms. They also suggest that the role, if any, of hsps in establishing thermal tolerance is complex and varies from gene to gene. Heterologous microarray hybridization provided interpretable gene expression signatures, detecting differential regulation of genes known to be involved in the heat shock response in other species. Under control conditions, a variety of genes were differentially expressed in muscle between subspecies; these differences suggest differences in muscle fiber type and could relate to previously observed differences between subspecies in the thermal sensitivity of swimming performance and metabolism.
Affiliation(s)
- Timothy M. Healy
- Department of Zoology, The University of British Columbia, Vancouver, British Columbia, Canada
- Wendy E. Tymchuk
- Department of Zoology, The University of British Columbia, Vancouver, British Columbia, Canada
- Edward J. Osborne
- Department of Zoology, The University of British Columbia, Vancouver, British Columbia, Canada
- Patricia M. Schulte
- Department of Zoology, The University of British Columbia, Vancouver, British Columbia, Canada
|
7
|
Williams A, Thomson EM. Effects of scanning sensitivity and multiple scan algorithms on microarray data quality. BMC Bioinformatics 2010; 11:127. [PMID: 20226031 PMCID: PMC2846908 DOI: 10.1186/1471-2105-11-127] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 12/21/2009] [Accepted: 03/12/2010] [Indexed: 11/10/2022]
Abstract
Background Maximizing the utility of DNA microarray data requires optimization of data acquisition through selection of an appropriate scanner setting. To increase the amount of useable data, several approaches have been proposed that incorporate multiple scans at different sensitivities to reduce the quantification error and to minimize effects of saturation. However, no direct comparison of their efficacy has been made. In the present study we compared individual scans at low, medium and high sensitivity with three methods for combining data from multiple scans (either 2-scan or 3-scan cases) using an actual dataset comprising 40 technical replicates of a reference RNA standard. Results Of the individual scans, the low scan exhibited the lowest background signal, the highest signal-to-noise ratio, and equivalent reproducibility to the medium and high scans. Most multiple scan approaches increased the range of probe intensities compared to the individual scans, but did not increase the dynamic range (the proportion of useable data). Approaches displayed striking differences in the background signal and signal-to-noise ratio. However, increased probe intensity range and improved signal-to-noise ratios did not necessarily correlate with improved reproducibility. Importantly, for one multiple scan method that combined 3 scans, reproducibility was significantly improved relative to individual scans and all other multiple scan approaches. The same method using 2 scans yielded significantly lower reproducibility, attributable to a lack-of-fit of the statistical model. Conclusions Our data indicate that implementation of a suitable multiple scan approach can improve reproducibility, but that model validation is critical to ensure accurate estimates of probe intensity.
Affiliation(s)
- Andrew Williams
- Population Health Studies Division, Environmental Health Science and Research Bureau, Health Canada, Ottawa, K1A 0K9, Canada.
|
8
|
Oberg AL, Vitek O. Statistical design of quantitative mass spectrometry-based proteomic experiments. J Proteome Res 2009; 8:2144-56. [PMID: 19222236 DOI: 10.1021/pr8010099] [Citation(s) in RCA: 190] [Impact Index Per Article: 12.7] [Indexed: 11/28/2022]
Abstract
We review the fundamental principles of statistical experimental design, and their application to quantitative mass spectrometry-based proteomics. We focus on class comparison using Analysis of Variance (ANOVA), and discuss how randomization, replication and blocking help avoid systematic biases due to the experimental procedure, and help optimize our ability to detect true quantitative changes between groups. We also discuss the issues of pooling multiple biological specimens for a single mass analysis, and calculation of the number of replicates in a future study. When applicable, we emphasize the parallels between designing quantitative proteomic experiments and experiments with gene expression microarrays, and give examples from that area of research. We illustrate the discussion using theoretical considerations, and using real-data examples of profiling of disease.
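The interplay of randomization, replication and blocking described above can be sketched in a few lines. This is a hypothetical illustration, not the authors' procedure: the function name, group labels, sample identifiers and the choice of one sample per group per block are all assumptions.

```python
import random


def randomized_block_design(samples_by_group, seed=0):
    """Build blocks holding one sample per group, in random run order.

    Assumptions for this sketch: a balanced design (equal samples per
    group) and one block per replicate, e.g. one processing batch or day.
    """
    rng = random.Random(seed)  # fixed seed so the layout is reproducible
    sizes = {len(samples) for samples in samples_by_group.values()}
    if len(sizes) != 1:
        raise ValueError("each group needs the same number of samples")
    # Randomize which sample of each group lands in which block.
    shuffled = {
        group: rng.sample(samples, len(samples))
        for group, samples in samples_by_group.items()
    }
    blocks = []
    for i in range(sizes.pop()):
        block = [(group, shuffled[group][i]) for group in shuffled]
        rng.shuffle(block)  # randomize run order within the block
        blocks.append(block)
    return blocks


# Hypothetical specimens: three replicates per group.
design = randomized_block_design(
    {"control": ["c1", "c2", "c3"], "disease": ["d1", "d2", "d3"]}
)
```

Because every block contains each group exactly once, block-to-block differences (for instance, day of acquisition) do not line up with the group comparison, and the within-block shuffle guards against run-order effects.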
Affiliation(s)
- Ann L Oberg
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN 55905, USA
|
9
|
Glatigny A, Delacroix H, Tang T, François N, Aggerbeck L, Mucchielli-Giorgi MH. Characterisation and correction of signal fluctuations in successive acquisitions of microarray images. BMC Bioinformatics 2009; 10:98. [PMID: 19331668 PMCID: PMC2681461 DOI: 10.1186/1471-2105-10-98] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2008] [Accepted: 03/30/2009] [Indexed: 11/25/2022]
Abstract
Background There are many sources of variation in dual labelled microarray experiments, including data acquisition and image processing. The final interpretation of experiments strongly relies on the accuracy of the measurement of the signal intensity. For low intensity spots in particular, accurately estimating gene expression variations remains a challenge as signal measurement is, in this case, highly subject to fluctuations. Results To evaluate the fluctuations in the fluorescence intensities of spots, we used series of successive scans, at the same settings, of whole genome arrays. We measured the decrease in fluorescence and we evaluated the influence of different parameters (PMT gain, resolution and chemistry of the slide) on the signal variability, at the level of the array as a whole and by intensity interval. Moreover, we assessed the effect of averaging scans on the fluctuations. We found that the extent of photo-bleaching was low and we established that 1) the fluorescence fluctuation is linked to the resolution, i.e. it depends on the number of pixels in the spot; 2) the fluorescence fluctuation increases as the scanner voltage increases and, moreover, is higher for the red than for the green fluorescence, which can introduce bias in the analysis; 3) the signal variability is linked to the intensity level and is higher for low intensities; and 4) the heterogeneity of the spots and the variability of the signal and the intensity ratios decrease when two or three scans are averaged. Conclusion Protocols consisting of two scans, one at low and one at high PMT gains, or multiple scans (ten scans) can introduce bias or be difficult to implement. We found that averaging two, or at most three, acquisitions of microarrays scanned at moderate photomultiplier settings (PMT gain) is sufficient to significantly improve the accuracy (quality) of the data, particularly for spots having low intensities, and we propose this as a general approach. 
For averaging and precise image alignment at sub-pixel levels we have made a program freely available on our web-site to facilitate implementation of this approach.
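The averaging step recommended in the abstract can be illustrated with a minimal sketch. It is not the authors' program: it assumes the scans have already been aligned and quantified so that each list holds intensities for the same spots in the same order, and the function name and values are hypothetical.

```python
from statistics import mean


def average_scans(scans):
    """Average per-spot intensities over repeated scans of one array.

    Assumption: every scan lists the same spots in the same order,
    i.e. image alignment and spot quantification happened upstream.
    """
    if not scans:
        raise ValueError("need at least one scan")
    n_spots = len(scans[0])
    if any(len(scan) != n_spots for scan in scans):
        raise ValueError("all scans must cover the same spots")
    # zip(*scans) groups the repeated measurements of each spot.
    return [mean(spot_values) for spot_values in zip(*scans)]


# Hypothetical intensities for three spots over three successive scans.
scan1 = [100.0, 250.0, 40.0]
scan2 = [110.0, 240.0, 50.0]
scan3 = [90.0, 260.0, 45.0]
averaged = average_scans([scan1, scan2, scan3])
```

Per the abstract, two or at most three acquisitions at moderate PMT gain are reported to be sufficient, so the input list would typically hold two or three scans.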
Affiliation(s)
- Annie Glatigny
- Centre de Génétique Moléculaire, CNRS FRE3144, F-91198 Gif-sur-Yvette, France.
|
10
|
Tomaszycki ML, Peabody C, Replogle K, Clayton DF, Tempelman RJ, Wade J. Sexual differentiation of the zebra finch song system: potential roles for sex chromosome genes. BMC Neurosci 2009; 10:24. [PMID: 19309515 PMCID: PMC2664819 DOI: 10.1186/1471-2202-10-24] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.7] [Received: 08/17/2008] [Accepted: 03/23/2009] [Indexed: 11/10/2022]
Abstract
BACKGROUND Recent evidence suggests that some sex differences in brain and behavior might result from direct genetic effects, rather than solely from the organizational effects of steroid hormones. The present study examined the potential role for sex-biased gene expression during development of sexually dimorphic singing behavior and associated song nuclei in juvenile zebra finches. RESULTS A microarray screen revealed more than 2400 putative genes (with a false discovery rate less than 0.05) exhibiting sex differences in the telencephalon of developing zebra finches. Increased expression in males was confirmed in 12 of 20 genes by qPCR using cDNA from the whole telencephalon; all of these appeared to be located on the Z sex chromosome. Six of the genes also showed increased expression in one or more of the song control nuclei of males at post-hatching day 25. Although the function of half of the genes is presently unknown, we have identified three as: 17-beta-hydroxysteroid dehydrogenase type IV, methylcrotonyl-CoA carboxylase, and sorting nexin 2. CONCLUSION The data suggest potential influences of these genes in song learning and/or masculinization of song system morphology, both of which are occurring at this developmental stage.
Affiliation(s)
- Michelle L Tomaszycki
- Department of Psychology & Program in Neuroscience, Michigan State University, East Lansing, MI, USA.
|
11
|
Toxicogenomic analysis of susceptibility to inhaled urban particulate matter in mice with chronic lung inflammation. Part Fibre Toxicol 2009; 6:6. [PMID: 19284582 PMCID: PMC2661044 DOI: 10.1186/1743-8977-6-6] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Received: 12/09/2008] [Accepted: 03/11/2009] [Indexed: 12/03/2022]
Abstract
Background Individuals with chronic lung disease are at increased risk of adverse health effects from airborne particulate matter. Characterization of underlying pollutant-phenotype interactions may require comprehensive strategies. Here, a toxicogenomic approach was used to investigate how inflammation modifies the pulmonary response to urban particulate matter. Results Transgenic mice with constitutive pulmonary overexpression of tumour necrosis factor (TNF)-α under the control of the surfactant protein C promoter and wildtype littermates (C57BL/6 background) were exposed by inhalation for 4 h to particulate matter (0 or 42 mg/m³ EHC-6802) and euthanized 0 or 24 h post-exposure. The low alveolar dose of particles (16 μg) did not provoke an inflammatory response in the lungs of wildtype mice, nor exacerbate the chronic inflammation in TNF animals. Real-time PCR confirmed particle-dependent increases of CYP1A1 (30–100%), endothelin-1 (20–40%), and metallothionein-II (20–40%) mRNA in wildtype and TNF mice (p < 0.05), validating delivery of a biologically-effective dose. Despite detection of striking genotype-related differences, including activation of immune and inflammatory pathways consistent with the TNF-induced pathology, and time-related effects attributable to stress from nose-only exposure, microarray analysis failed to identify effects of the inhaled particles. Remarkably, the presence of chronic inflammation did not measurably amplify the transcriptional response to particulate matter. Conclusion Our data support the hypothesis that health effects of acute exposure to urban particles are dominated by activation of specific physiological response cascades rather than widespread changes in gene expression.
|
12
|
Rotter A, Hren M, Baebler S, Blejec A, Gruden K. Finding differentially expressed genes in two-channel DNA microarray datasets: how to increase reliability of data preprocessing. OMICS: A Journal of Integrative Biology 2008; 12:171-82. [PMID: 18771401 DOI: 10.1089/omi.2008.0032] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/12/2022]
Abstract
Due to the great variety of preprocessing tools in two-channel expression microarray data analysis, it is difficult to choose the most appropriate one for a given experimental setup. In our study, two independent two-channel in-house microarray experiments as well as a publicly available dataset were used to investigate the influence of the selection of preprocessing methods (background correction, normalization, and duplicate spots correlation calculation) on the discovery of differentially expressed genes. Here we show that both the list of differentially expressed genes and the expression values of selected genes depend significantly on the preprocessing approach applied. The choice of normalization method had the highest impact on the results. We propose a simple but efficient approach to increase the reliability of obtained results, in which two normalization methods that are theoretically distinct from one another are used on the same dataset. The intersection of the results, that is, of the lists of differentially expressed genes, is then used to get a more accurate estimate of the genes that were de facto differentially expressed.
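The intersection step proposed in the abstract amounts to a set operation on the gene lists. The sketch below is illustrative only: the abstract does not name the two normalization methods, so the labels "loess" and "quantile", the function name and the gene identifiers are hypothetical.

```python
def consensus_de_genes(de_lists):
    """Return the genes called differentially expressed by every method."""
    gene_sets = [set(genes) for genes in de_lists]
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


# Hypothetical DE gene lists obtained from the same dataset after two
# theoretically distinct normalizations (method names are assumptions).
loess_hits = ["geneA", "geneB", "geneC", "geneD"]
quantile_hits = ["geneB", "geneD", "geneE"]

consensus = consensus_de_genes([loess_hits, quantile_hits])
```

Only genes present in both lists survive, which trades some sensitivity for a lower chance that a call is an artifact of one particular preprocessing choice.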
Affiliation(s)
- Ana Rotter
- Department of Biotechnology and Systems Biology, National Institute of Biology, 1000 Ljubljana, Slovenia.
|
13
|
Abstract
The increasing use of gene expression microarrays, and the deposition of the resulting data into public repositories, means that more investigators are interested in using the technology either directly or through meta-analysis of the publicly available data. The tools available for data analysis have generally been developed for use by experts in the field, making them difficult for the general research community to use. For those interested in entering the field, especially those without a background in statistics, it is difficult to understand why experimental results can be so variable. The purpose of this review is to go through the workflow of a typical microarray experiment and to show that decisions made at each step, from choice of platform through statistical analysis methods to biological interpretation, are all sources of this variability.
|
14
|
Kerr KF. Extended analysis of benchmark datasets for Agilent two-color microarrays. BMC Bioinformatics 2007; 8:371. [PMID: 17915030 PMCID: PMC2174956 DOI: 10.1186/1471-2105-8-371] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 08/14/2007] [Accepted: 10/03/2007] [Indexed: 11/25/2022]
Abstract
Background As part of its broad and ambitious mission, the MicroArray Quality Control (MAQC) project reported the results of experiments using External RNA Controls (ERCs) on five microarray platforms. For most platforms, several different methods of data processing were considered. However, there was no similar consideration of different methods for processing the data from the Agilent two-color platform. While this omission is understandable given the scale of the project, it can create the false impression that there is consensus about the best way to process Agilent two-color data. It is also important to consider whether ERCs are representative of all the probes on a microarray. Results A comparison of different methods of processing Agilent two-color data shows substantial differences among methods for low-intensity genes. The sensitivity and specificity for detecting differentially expressed genes vary substantially among methods. Analysis also reveals that the ERCs in the MAQC data only span the upper half of the intensity range, and therefore cannot be representative of all genes on the microarray. Conclusion Although ERCs demonstrate good agreement between observed and expected log-ratios on the Agilent two-color platform, such an analysis is incomplete. Simple loess normalization outperformed data processing with Agilent's Feature Extraction software for accurate identification of differentially expressed genes. Results from studies using ERCs should not be over-generalized when ERCs are not representative of all probes on a microarray.
Affiliation(s)
- Kathleen F Kerr
- Department of Biostatistics, University of Washington, Seattle, Washington, USA.
|