76
|
Woodcroft BJ, Radloff R, Yeoh LM, Scanlon KL, Doyle MA, van Dooren GG, McFadden GI, Tonkin CJ, Speed TP, Ralph SA. An integrative bioinformatic predictor of protein sub-cellular localisation in malaria. BMC Bioinformatics 2011. [PMCID: PMC3277248 DOI: 10.1186/1471-2105-12-s11-a6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/22/2022] Open
|
77
|
Gagnon-Bartsch JA, Speed TP. Using control genes to correct for unwanted variation in microarray data. Biostatistics 2011; 13:539-52. [PMID: 22101192 DOI: 10.1093/biostatistics/kxr034] [Citation(s) in RCA: 259] [Impact Index Per Article: 19.9] [Indexed: 11/14/2022] Open
Abstract
Microarray expression studies suffer from the problem of batch effects and other unwanted variation. Many methods have been proposed to adjust microarray data to mitigate the problems of unwanted variation. Several of these methods rely on factor analysis to infer the unwanted variation from the data. A central problem with this approach is the difficulty in discerning the unwanted variation from the biological variation that is of interest to the researcher. We present a new method, intended for use in differential expression studies, that attempts to overcome this problem by restricting the factor analysis to negative control genes. Negative control genes are genes known a priori not to be differentially expressed with respect to the biological factor of interest. Variation in the expression levels of these genes can therefore be assumed to be unwanted variation. We name this method "Remove Unwanted Variation, 2-step" (RUV-2). We discuss various techniques for assessing the performance of an adjustment method and compare the performance of RUV-2 with that of other commonly used adjustment methods such as Combat and Surrogate Variable Analysis (SVA). We present several example studies, each concerning genes differentially expressed with respect to gender in the brain, and find that RUV-2 performs as well as or better than other methods. Finally, we discuss the possibility of adapting RUV-2 for use in studies not concerned with differential expression and conclude that there may be promise but substantial challenges remain.
|
78
|
Renfree MB, Papenfuss AT, Deakin JE, Lindsay J, Heider T, Belov K, Rens W, Waters PD, Pharo EA, Shaw G, Wong ESW, Lefèvre CM, Nicholas KR, Kuroki Y, Wakefield MJ, Zenger KR, Wang C, Ferguson-Smith M, Nicholas FW, Hickford D, Yu H, Short KR, Siddle HV, Frankenberg SR, Chew KY, Menzies BR, Stringer JM, Suzuki S, Hore TA, Delbridge ML, Mohammadi A, Schneider NY, Hu Y, O'Hara W, Al Nadaf S, Wu C, Feng ZP, Cocks BG, Wang J, Flicek P, Searle SMJ, Fairley S, Beal K, Herrero J, Carone DM, Suzuki Y, Sugano S, Toyoda A, Sakaki Y, Kondo S, Nishida Y, Tatsumoto S, Mandiou I, Hsu A, McColl KA, Lansdell B, Weinstock G, Kuczek E, McGrath A, Wilson P, Men A, Hazar-Rethinam M, Hall A, Davis J, Wood D, Williams S, Sundaravadanam Y, Muzny DM, Jhangiani SN, Lewis LR, Morgan MB, Okwuonu GO, Ruiz SJ, Santibanez J, Nazareth L, Cree A, Fowler G, Kovar CL, Dinh HH, Joshi V, Jing C, Lara F, Thornton R, Chen L, Deng J, Liu Y, Shen JY, Song XZ, Edson J, Troon C, Thomas D, Stephens A, Yapa L, Levchenko T, Gibbs RA, Cooper DW, Speed TP, Fujiyama A, M Graves JA, O'Neill RJ, Pask AJ, Forrest SM, Worley KC. Genome sequence of an Australian kangaroo, Macropus eugenii, provides insight into the evolution of mammalian reproduction and development. Genome Biol 2011; 12:R81. [PMID: 21854559 PMCID: PMC3277949 DOI: 10.1186/gb-2011-12-8-r81] [Citation(s) in RCA: 147] [Impact Index Per Article: 11.3] [Received: 05/23/2011] [Revised: 07/22/2011] [Accepted: 08/19/2011] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: We present the genome sequence of the tammar wallaby, Macropus eugenii, which is a member of the kangaroo family and the first representative of the iconic hopping mammals that symbolize Australia to be sequenced. The tammar has many unusual biological characteristics, including the longest period of embryonic diapause of any mammal, extremely synchronized seasonal breeding and prolonged and sophisticated lactation within a well-defined pouch. Like other marsupials, it gives birth to highly altricial young, and has a small number of very large chromosomes, making it a valuable model for genomics, reproduction and development. RESULTS: The genome has been sequenced to 2× coverage using Sanger sequencing, enhanced with additional next generation sequencing and the integration of extensive physical and linkage maps to build the genome assembly. We also sequenced the tammar transcriptome across many tissues and developmental time points. Our analyses of these data shed light on mammalian reproduction, development and genome evolution: there is innovation in reproductive and lactational genes, rapid evolution of germ cell genes, and incomplete, locus-specific X inactivation. We also observe novel retrotransposons and a highly rearranged major histocompatibility complex, with many class I genes located outside the complex. Novel microRNAs in the tammar HOX clusters uncover new potential mammalian HOX regulatory elements. CONCLUSIONS: Analyses of these resources enhance our understanding of marsupial gene evolution, identify marsupial-specific conserved non-coding elements and critical genes across a range of biological systems, including reproduction, development and immunity, and provide new insight into marsupial and mammalian biology and genome evolution.
|
79
|
Renfree MB, Papenfuss AT, Deakin JE, Lindsay J, Heider T, Belov K, Rens W, Waters PD, Pharo EA, Shaw G, Wong ESW, Lefèvre CM, Nicholas KR, Kuroki Y, Wakefield MJ, Zenger KR, Wang C, Ferguson-Smith M, Nicholas FW, Hickford D, Yu H, Short KR, Siddle HV, Frankenberg SR, Chew KY, Menzies BR, Stringer JM, Suzuki S, Hore TA, Delbridge ML, Patel H, Mohammadi A, Schneider NY, Hu Y, O'Hara W, Al Nadaf S, Wu C, Feng ZP, Cocks BG, Wang J, Flicek P, Searle SMJ, Fairley S, Beal K, Herrero J, Carone DM, Suzuki Y, Sugano S, Toyoda A, Sakaki Y, Kondo S, Nishida Y, Tatsumoto S, Mandiou I, Hsu A, McColl KA, Lansdell B, Weinstock G, Kuczek E, McGrath A, Wilson P, Men A, Hazar-Rethinam M, Hall A, Davis J, Wood D, Williams S, Sundaravadanam Y, Muzny DM, Jhangiani SN, Lewis LR, Morgan MB, Okwuonu GO, Ruiz SJ, Santibanez J, Nazareth L, Cree A, Fowler G, Kovar CL, Dinh HH, Joshi V, Jing C, Lara F, Thornton R, Chen L, Deng J, Liu Y, Shen JY, Song XZ, Edson J, Troon C, Thomas D, Stephens A, Yapa L, Levchenko T, Gibbs RA, Cooper DW, Speed TP, Fujiyama A, M Graves JA, O'Neill RJ, Pask AJ, Forrest SM, Worley KC. Genome sequence of an Australian kangaroo, Macropus eugenii, provides insight into the evolution of mammalian reproduction and development. Genome Biol 2011. [PMCID: PMC3334613 DOI: 10.1186/gb-2011-12-12-414] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Indexed: 11/30/2022] Open
|
80
|
Yu CY, Mayba O, Lee JV, Tran J, Harris C, Speed TP, Wang JC. Genome-wide analysis of glucocorticoid receptor binding regions in adipocytes reveal gene network involved in triglyceride homeostasis. PLoS One 2010; 5:e15188. [PMID: 21187916 PMCID: PMC3004788 DOI: 10.1371/journal.pone.0015188] [Citation(s) in RCA: 127] [Impact Index Per Article: 9.1] [Received: 07/29/2010] [Accepted: 10/28/2010] [Indexed: 01/19/2023] Open
Abstract
Glucocorticoids play important roles in the regulation of distinct aspects of adipocyte biology. Excess glucocorticoids in adipocytes are associated with metabolic disorders, including central obesity, insulin resistance and dyslipidemia. To understand the mechanisms underlying the glucocorticoid action in adipocytes, we used chromatin immunoprecipitation sequencing to isolate genome-wide glucocorticoid receptor (GR) binding regions (GBRs) in 3T3-L1 adipocytes. Furthermore, gene expression analyses were used to identify genes that were regulated by glucocorticoids. Overall, 274 glucocorticoid-regulated genes contain or are located near a GBR. We found that many GBRs were located in or near genes involved in triglyceride (TG) synthesis (Scd-1, 2, 3, GPAT3, GPAT4, Agpat2, Lpin1), lipolysis (Lipe, Mgll), lipid transport (Cd36, Lrp-1, Vldlr, Slc27a2) and storage (S3-12). Gene expression analysis showed that except for Scd-3, the other 13 genes were induced in mouse inguinal fat upon 4-day glucocorticoid treatment. Reporter gene assays showed that except for Agpat2, the other 12 glucocorticoid-regulated genes contain at least one GBR that can mediate hormone response. In agreement with the fact that glucocorticoids activated genes in both TG biosynthetic and lipolytic pathways, we confirmed that 4-day glucocorticoid treatment increased TG synthesis and lipolysis concomitantly in inguinal fat. Notably, we found that 9 of these 12 genes were induced in transgenic mice that have constant elevated plasma glucocorticoid levels. These results suggested that a similar mechanism was used to regulate TG homeostasis during chronic glucocorticoid treatment. In summary, our studies have identified molecular components in a glucocorticoid-controlled gene network involved in the regulation of TG homeostasis in adipocytes. Understanding the regulation of this gene network should provide important insight for future therapeutic developments for metabolic diseases.
|
81
|
Tan IKL, Mackin L, Wang N, Papenfuss AT, Elso CM, Ashton MP, Quirk F, Phipson B, Bahlo M, Speed TP, Smyth GK, Morahan G, Brodnicki TC. A recombination hotspot leads to sequence variability within a novel gene (AK005651) and contributes to type 1 diabetes susceptibility. Genome Res 2010; 20:1629-38. [PMID: 21051460 DOI: 10.1101/gr.101881.109] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 01/17/2023]
Abstract
More than 25 loci have been linked to type 1 diabetes (T1D) in the nonobese diabetic (NOD) mouse, but identification of the underlying genes remains challenging. We describe here the positional cloning of a T1D susceptibility locus, Idd11, located on mouse chromosome 4. Sequence analysis of a series of congenic NOD mouse strains over a critical 6.9-kb interval in these mice and in 25 inbred strains identified several haplotypes, including a unique NOD haplotype, associated with varying levels of T1D susceptibility. Haplotype diversity within this interval between congenic NOD mouse strains was due to a recombination hotspot that generated four crossover breakpoints, including one with a complex conversion tract. The Idd11 haplotype and recombination hotspot are located within a predicted gene of unknown function, which exhibits decreased expression in relevant tissues of NOD mice. Notably, it was the recombination hotspot that aided our mapping of Idd11 and confirms that recombination hotspots can create genetic variation affecting a common polygenic disease. This finding has implications for human genetic association studies, which may be affected by the approximately 33,000 estimated hotspots in the genome.
|
82
|
Robinson MD, Stirzaker C, Statham AL, Coolen MW, Song JZ, Nair SS, Strbenac D, Speed TP, Clark SJ. Evaluation of affinity-based genome-wide DNA methylation data: effects of CpG density, amplification bias, and copy number variation. Genome Res 2010; 20:1719-29. [PMID: 21045081 DOI: 10.1101/gr.110601.110] [Citation(s) in RCA: 103] [Impact Index Per Article: 7.4] [Indexed: 12/22/2022]
Abstract
DNA methylation is an essential epigenetic modification that plays a key role associated with the regulation of gene expression during differentiation, but in disease states such as cancer, the DNA methylation landscape is often deregulated. There are now numerous technologies available to interrogate the DNA methylation status of CpG sites in a targeted or genome-wide fashion, but each method, due to intrinsic biases, potentially interrogates different fractions of the genome. In this study, we compare the affinity-purification of methylated DNA between two popular genome-wide techniques, methylated DNA immunoprecipitation (MeDIP) and methyl-CpG binding domain-based capture (MBDCap), and show that each technique operates in a different domain of the CpG density landscape. We explored the effect of whole-genome amplification and illustrate that it can reduce sensitivity for detecting DNA methylation in GC-rich regions of the genome. By using MBDCap, we compare and contrast microarray- and sequencing-based readouts and highlight the impact that copy number variation (CNV) can make in differential comparisons of methylomes. These studies reveal that the analysis of DNA methylation data and genome coverage is highly dependent on the method employed, and consideration must be made in light of the GC content, the extent of DNA amplification, and the copy number.
|
83
|
Leitman DC, Paruthiyil S, Vivar OI, Saunier EF, Herber CB, Cohen I, Tagliaferri M, Speed TP. Regulation of specific target genes and biological responses by estrogen receptor subtype agonists. Curr Opin Pharmacol 2010; 10:629-36. [PMID: 20951642 DOI: 10.1016/j.coph.2010.09.009] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.7] [Received: 07/23/2010] [Revised: 09/20/2010] [Accepted: 09/20/2010] [Indexed: 02/07/2023]
Abstract
Estrogenic effects are mediated through two estrogen receptor (ER) subtypes, ERα and ERβ. Estrogens are the most commonly prescribed drugs to treat menopausal conditions, but by non-selectively triggering both ERα and ERβ pathways in different tissues they can cause serious adverse effects. The different sizes of the binding pockets and sequences of their activation function domains indicate that ERα and ERβ should have different specificities for ligands and biological responses that can be exploited for designing safer and more selective estrogens. ERα and ERβ regulate different genes by binding to different regulatory elements and recruiting different transcription and chromatin remodeling factors that are expressed in a cell-specific manner. ERα-selective and ERβ-selective agonists have been identified that demonstrate that the two ERs produce distinct biological effects. ERα and ERβ agonists are a promising new approach for treating specific conditions associated with menopause.
|
84
|
Wang W, Shen P, Thiyagarajan S, Lin S, Palm C, Horvath R, Klopstock T, Cutler D, Pique L, Schrijver I, Davis RW, Mindrinos M, Speed TP, Scharfe C. Identification of rare DNA variants in mitochondrial disorders with improved array-based sequencing. Nucleic Acids Res 2010; 39:44-58. [PMID: 20843780 PMCID: PMC3017602 DOI: 10.1093/nar/gkq750] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Indexed: 12/13/2022] Open
Abstract
A common goal in the discovery of rare functional DNA variants via medical resequencing is to incur a relatively lower proportion of false positive base-calls. We developed a novel statistical method for resequencing arrays (SRMA, sequence robust multi-array analysis) to increase the accuracy of detecting rare variants and reduce the costs in subsequent sequence verifications required in medical applications. SRMA includes single and multi-array analysis and accounts for technical variables as well as the possibility of both low- and high-frequency genomic variation. The confidence of each base-call was ranked using two quality measures. In comparison to Sanger capillary sequencing, we achieved a false discovery rate of 2% (false positive rate 1.2 × 10⁻⁵, false negative rate 5%), which is similar to automated second-generation sequencing technologies. Applied to the analysis of 39 nuclear candidate genes in disorders of mitochondrial DNA (mtDNA) maintenance, we confirmed mutations in the DNA polymerase gamma POLG in positive control cases, and identified novel rare variants in previously undiagnosed cases in the mitochondrial topoisomerase TOP1MT, the mismatch repair enzyme MUTYH, and the apurinic-apyrimidinic endonuclease APEX2. Some patients carried rare heterozygous variants in several functionally interacting genes, which could indicate synergistic genetic effects in these clinically similar disorders.
|
85
|
Speca DJ, Chihara D, Ashique AM, Bowers MS, Pierce-Shimomura JT, Lee J, Rabbee N, Speed TP, Gularte RJ, Chitwood J, Medrano JF, Liao M, Sonner JM, Eger EI, Peterson AS, McIntire SL. Conserved role of unc-79 in ethanol responses in lightweight mutant mice. PLoS Genet 2010; 6. [PMID: 20714347 PMCID: PMC2920847 DOI: 10.1371/journal.pgen.1001057] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Received: 12/29/2009] [Accepted: 07/08/2010] [Indexed: 11/18/2022] Open
Abstract
The mechanisms by which ethanol and inhaled anesthetics influence the nervous system are poorly understood. Here we describe the positional cloning and characterization of a new mouse mutation isolated in an N-ethyl-N-nitrosourea (ENU) forward mutagenesis screen for animals with enhanced locomotor activity. This allele, Lightweight (Lwt), disrupts the homolog of the Caenorhabditis elegans (C. elegans) unc-79 gene. While Lwt/Lwt homozygotes are perinatal lethal, Lightweight heterozygotes are dramatically hypersensitive to acute ethanol exposure. Experiments in C. elegans demonstrate a conserved hypersensitivity to ethanol in unc-79 mutants and extend this observation to the related unc-80 mutant and nca-1;nca-2 double mutants. Lightweight heterozygotes also exhibit an altered response to the anesthetic isoflurane, reminiscent of unc-79 invertebrate mutant phenotypes. Consistent with our initial mapping results, Lightweight heterozygotes are mildly hyperactive when exposed to a novel environment and are smaller than wild-type animals. In addition, Lightweight heterozygotes exhibit increased food consumption yet have a leaner body composition. Interestingly, Lightweight heterozygotes voluntarily consume more ethanol than wild-type littermates. The acute hypersensitivity to and increased voluntary consumption of ethanol observed in Lightweight heterozygous mice in combination with the observed hypersensitivity to ethanol in C. elegans unc-79, unc-80, and nca-1;nca-2 double mutants suggests a novel conserved pathway that might influence alcohol-related behaviors in humans.
|
86
|
Sui Y, Zhao X, Speed TP, Wu Z. Background adjustment for DNA microarrays using a database of microarray experiments. J Comput Biol 2010; 16:1501-15. [PMID: 19958080 DOI: 10.1089/cmb.2009.0063] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/12/2022] Open
Abstract
DNA microarrays have become an indispensable technique in biomedical research. The raw measurements from microarrays undergo a number of preprocessing steps before the data are converted to the genomic level for further analysis. Background adjustment is an important step in preprocessing. Estimating background noise has been challenging because background levels vary a lot from probe to probe, yet there are limited observations on each probe. Most current methods have used the empirical Bayes approach to borrow information across probes on the same array. These approaches shrink the background estimate for either the entire sample or probes sharing similar sequence structures. In this article, we present a solution that is truly probe specific by using a database of large number of microarray experiments. Information is borrowed across samples and background noise is estimated for each probe individually. The ability to obtain probe specific background distributions allows us to extend the dynamic range of gene expression levels. We illustrate the improvement in detecting gene expression variation on two datasets: a Latin Square spike-in experiment from Affymetrix and an Estrogen Receptor experiment with biological replicates. An R package dbRMA implementing our method can be obtained from the authors.
|
87
|
Bengtsson H, Neuvial P, Speed TP. TumorBoost: normalization of allele-specific tumor copy numbers from a single pair of tumor-normal genotyping microarrays. BMC Bioinformatics 2010; 11:245. [PMID: 20462408 PMCID: PMC2894037 DOI: 10.1186/1471-2105-11-245] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.1] [Received: 12/08/2009] [Accepted: 05/12/2010] [Indexed: 12/15/2022] Open
Abstract
Background: High-throughput genotyping microarrays assess both total DNA copy number and allelic composition, which makes them a tool of choice for copy number studies in cancer, including total copy number and loss of heterozygosity (LOH) analyses. Even after state of the art preprocessing methods, allelic signal estimates from genotyping arrays still suffer from systematic effects that make them difficult to use effectively for such downstream analyses. Results: We propose a method, TumorBoost, for normalizing allelic estimates of one tumor sample based on estimates from a single matched normal. The method applies to any paired tumor-normal estimates from any microarray-based technology, combined with any preprocessing method. We demonstrate that it increases the signal-to-noise ratio of allelic signals, making it significantly easier to detect allelic imbalances. Conclusions: TumorBoost increases the power to detect somatic copy-number events (including copy-neutral LOH) in the tumor from allelic signals of Affymetrix or Illumina origin. We also conclude that high-precision allelic estimates can be obtained from a single pair of tumor-normal hybridizations, if TumorBoost is combined with single-array preprocessing methods such as (allele-specific) CRMA v2 for Affymetrix or BeadStudio's (proprietary) XY-normalization method for Illumina. A bounded-memory implementation is available in the open-source and cross-platform R package aroma.cn, which is part of the Aroma Project (http://www.aroma-project.org/).
|
88
|
Carmichael CL, Wilkins EJ, Bengtsson H, Horwitz MS, Speed TP, Vincent PC, Young G, Hahn CN, Escher R, Scott HS. Poor prognosis in familial acute myeloid leukaemia with combined biallelic CEBPA mutations and downstream events affecting the ATM, FLT3 and CDX2 genes. Br J Haematol 2010; 150:382-5. [PMID: 20456351 DOI: 10.1111/j.1365-2141.2010.08204.x] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 11/27/2022]
|
89
|
Vivar OI, Zhao X, Saunier EF, Griffin C, Mayba OS, Tagliaferri M, Cohen I, Speed TP, Leitman DC. Estrogen receptor beta binds to and regulates three distinct classes of target genes. J Biol Chem 2010; 285:22059-66. [PMID: 20404318 DOI: 10.1074/jbc.m110.114116] [Citation(s) in RCA: 87] [Impact Index Per Article: 6.2] [Indexed: 12/16/2022] Open
Abstract
Estrogen receptor beta (ERβ) has potent antiproliferative and anti-inflammatory properties, suggesting that ERβ-selective agonists might be a new class of therapeutic and chemopreventive agents. To understand how ERβ regulates genes, we identified genes regulated by the unliganded and liganded forms of ERα and ERβ in U2OS cells. Microarray data demonstrated that virtually no gene regulation occurred with unliganded ERα, whereas many genes were regulated by estradiol (E2). These results demonstrated that ERα requires a ligand to regulate a single class of genes. In contrast, ERβ regulated three classes of genes. Class I genes were regulated primarily by unliganded ERβ. Class II genes were regulated only with E2, whereas class III genes were regulated by both unliganded ERβ and E2. There were 453 class I genes, 258 class II genes, and 83 class III genes. To explore the mechanism whereby ERβ regulates different classes of genes, chromatin immunoprecipitation-sequencing was performed to identify ERβ binding sites and adjacent transcription factor motifs in regulated genes. AP1 binding sites were more enriched in class I genes, whereas ERE, NFκB1, and SP1 sites were more enriched in class II genes. ERβ bound to all three classes of genes, demonstrating that ERβ binding is not responsible for differential regulation of genes by unliganded and liganded ERβ. The coactivator NCOA2 was differentially recruited to several target genes. Our findings indicate that the unliganded and liganded forms of ERβ regulate three classes of genes by interacting with different transcription factors and coactivators.
|
90
|
Ge Y, Sealfon SC, Speed TP. Multiple testing and its applications to microarrays. Stat Methods Med Res 2010; 18:543-63. [PMID: 20048384 DOI: 10.1177/0962280209351899] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Indexed: 11/17/2022]
Abstract
The large-scale multiple testing problems resulting from the measurement of thousands of genes in microarray experiments have received increasing interest during the past several years. This article describes some commonly used criteria for controlling false positive errors, including familywise error rates, false discovery rates and false discovery proportion rates. Various statistical methods controlling these error rates are described. The advantages and disadvantages of these methods are discussed. These methods are applied to gene expression data from two microarray studies and the properties of these multiple testing procedures are compared.
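As an illustration of two of the error-rate criteria named in this abstract, the following is a minimal sketch, not taken from the article itself. It assumes the Bonferroni adjustment as an example of familywise error rate control and the Benjamini-Hochberg step-up adjustment as an example of false discovery rate control; the function names and the example p-values are hypothetical.

```python
def bonferroni(pvalues):
    """Bonferroni-adjusted p-values (controls the familywise error rate)."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (controls the false discovery rate)."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical p-values for five genes, for illustration only.
    pvals = [0.001, 0.008, 0.039, 0.041, 0.20]
    print("Bonferroni:", bonferroni(pvals))
    print("Benjamini-Hochberg:", benjamini_hochberg(pvals))
```

A gene would be called significant at level alpha when its adjusted p-value is at most alpha. The Bonferroni values are never smaller than the Benjamini-Hochberg values, which reflects the stricter familywise criterion.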
|
91
|
Irizarry RA, Wang C, Zhou Y, Speed TP. Gene set enrichment analysis made simple. Stat Methods Med Res 2010; 18:565-75. [PMID: 20048385 DOI: 10.1177/0962280209351908] [Citation(s) in RCA: 134] [Impact Index Per Article: 9.6] [Indexed: 11/16/2022]
Abstract
Among the many applications of microarray technology, one of the most popular is the identification of genes that are differentially expressed in two conditions. A common statistical approach is to quantify the interest of each gene with a p-value, adjust these p-values for multiple comparisons, choose an appropriate cut-off, and create a list of candidate genes. This approach has been criticised for ignoring biological knowledge regarding how genes work together. Recently a series of methods, that do incorporate biological knowledge, have been proposed. However, the most popular method, gene set enrichment analysis (GSEA), seems overly complicated. Furthermore, GSEA is based on a statistical test known for its lack of sensitivity. In this article we compare the performance of a simple alternative to GSEA. We find that this simple solution clearly outperforms GSEA. We demonstrate this with eight different microarray datasets.
|
92
|
Levy N, Paruthiyil S, Zhao X, Vivar OI, Saunier EF, Griffin C, Tagliaferri M, Cohen I, Speed TP, Leitman DC. Unliganded estrogen receptor-beta regulation of genes is inhibited by tamoxifen. Mol Cell Endocrinol 2010; 315:201-7. [PMID: 19744542 DOI: 10.1016/j.mce.2009.08.030] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Received: 06/12/2009] [Revised: 08/19/2009] [Accepted: 08/31/2009] [Indexed: 12/29/2022]
Abstract
Tamoxifen can stimulate the growth of some breast tumors and others can become resistant to tamoxifen. We previously showed that unliganded ERβ inhibits ERα-mediated proliferation of MCF-7 cells. We investigated if tamoxifen might have a potential negative effect on some breast cancer cells by blocking the effects of unliganded ERβ on gene regulation. Gene expression profiles demonstrated that unliganded ERβ upregulated 196 genes in MCF-7 cells. Tamoxifen significantly inhibited 73 of these genes by greater than 30%, including several growth-inhibitory genes. To explore the mechanism whereby unliganded ERβ activates genes and how tamoxifen blocks this effect, we used doxycycline-inducible U2OS-ERβ cells to produce unliganded ERβ. Doxycycline produced a dose-dependent activation of the NKG2E, MSMB and TUB3A genes, which was abolished by tamoxifen. Unliganded ERβ recruitment of SRC-2 to the NKG2E gene was blocked by tamoxifen. Our findings suggest that tamoxifen might exert a negative effect on ERβ expressing tumors due to its antagonistic action on unliganded ERβ.
|
93
|
Verhaak RGW, Hoadley KA, Purdom E, Wang V, Qi Y, Wilkerson MD, Miller CR, Ding L, Golub T, Mesirov JP, Alexe G, Lawrence M, O'Kelly M, Tamayo P, Weir BA, Gabriel S, Winckler W, Gupta S, Jakkula L, Feiler HS, Hodgson JG, James CD, Sarkaria JN, Brennan C, Kahn A, Spellman PT, Wilson RK, Speed TP, Gray JW, Meyerson M, Getz G, Perou CM, Hayes DN. Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1. Cancer Cell 2010; 17:98-110. [PMID: 20129251 PMCID: PMC2818769 DOI: 10.1016/j.ccr.2009.12.020] [Citation(s) in RCA: 5247] [Impact Index Per Article: 374.8] [Received: 02/02/2009] [Revised: 09/03/2009] [Accepted: 12/04/2009] [Indexed: 12/11/2022]
Abstract
The Cancer Genome Atlas Network recently cataloged recurrent genomic abnormalities in glioblastoma multiforme (GBM). We describe a robust gene expression-based molecular classification of GBM into Proneural, Neural, Classical, and Mesenchymal subtypes and integrate multidimensional genomic data to establish patterns of somatic mutations and DNA copy number. Aberrations and gene expression of EGFR, NF1, and PDGFRA/IDH1 each define the Classical, Mesenchymal, and Proneural subtypes, respectively. Gene signatures of normal brain cell types show a strong relationship between subtypes and different neural lineages. Additionally, response to aggressive therapy differs by subtype, with the greatest benefit in the Classical subtype and no benefit in the Proneural subtype. We provide a framework that unifies transcriptomic and genomic dimensions for GBM molecular stratification with important implications for future studies.
|
94
|
Lipson D, Speed TP, Taub M. Methods for Allocating Ambiguous Short-reads. Communications in Information and Systems 2010. [DOI: 10.4310/cis.2010.v10.n2.a1] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 11/22/2022]
|
95
|
Paruthiyil S, Cvoro A, Zhao X, Wu Z, Sui Y, Staub RE, Baggett S, Herber CB, Griffin C, Tagliaferri M, Harris HA, Cohen I, Bjeldanes LF, Speed TP, Schaufele F, Leitman DC. Drug and cell type-specific regulation of genes with different classes of estrogen receptor beta-selective agonists. PLoS One 2009; 4:e6271. [PMID: 19609440 PMCID: PMC2707612 DOI: 10.1371/journal.pone.0006271] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.7] [Received: 10/11/2008] [Accepted: 06/08/2009] [Indexed: 12/02/2022]
Abstract
Estrogens produce biological effects by interacting with two estrogen receptors, ERα and ERβ. Drugs that selectively target ERα or ERβ might be safer for conditions that have been traditionally treated with non-selective estrogens. Several synthetic and natural ERβ-selective compounds have been identified. One class of ERβ-selective agonists is represented by ERB-041 (WAY-202041), which binds to ERβ much more strongly than to ERα. A second class of ERβ-selective agonists, derived from plants, includes MF101, nyasol and liquiritigenin, which bind similarly to both ERs but only activate transcription with ERβ. Diarylpropionitrile represents a third class of ERβ-selective compounds because its selectivity is due to a combination of greater binding to ERβ and transcriptional activity. However, it is unclear if these three classes of ERβ-selective compounds produce similar biological activities. The goals of these studies were to determine the relative ERβ selectivity and pattern of gene expression of these three classes of ERβ-selective compounds compared to estradiol (E2), which is a non-selective ER agonist. U2OS cells stably transfected with ERα or ERβ were treated with E2 or the ERβ-selective compounds for 6 h. Microarray data demonstrated that ERB-041, MF101 and liquiritigenin were the most ERβ-selective agonists compared to estradiol, followed by nyasol and then diarylpropionitrile. FRET analysis showed that all compounds induced a similar conformation of ERβ, which is consistent with the finding that most genes regulated by the ERβ-selective compounds were similar to each other and E2. However, there were some classes of genes differentially regulated by the ERβ agonists and E2. Two ERβ-selective compounds, MF101 and liquiritigenin, had cell type-specific effects as they regulated different genes in HeLa, Caco-2 and Ishikawa cell lines expressing ERβ.
Our gene profiling studies demonstrate that while most of the genes were commonly regulated by ERβ-selective agonists and E2, there were some genes regulated that were distinct from each other and E2, suggesting that different ERβ-selective agonists might produce distinct biological and clinical effects.
|
96
|
Tonkin CJ, Carret CK, Duraisingh MT, Voss TS, Ralph SA, Hommel M, Duffy MF, da Silva LM, Scherf A, Ivens A, Speed TP, Beeson JG, Cowman AF. Sir2 paralogues cooperate to regulate virulence genes and antigenic variation in Plasmodium falciparum. PLoS Biol 2009; 7:e84. [PMID: 19402747 PMCID: PMC2672602 DOI: 10.1371/journal.pbio.1000084] [Citation(s) in RCA: 184] [Impact Index Per Article: 12.3] [Received: 12/08/2008] [Accepted: 03/02/2009] [Indexed: 11/19/2022]
Abstract
Cytoadherence of Plasmodium falciparum-infected erythrocytes in the brain, organs and peripheral microvasculature is linked to morbidity and mortality associated with severe malaria. Parasite-derived P. falciparum Erythrocyte Membrane Protein 1 (PfEMP1) molecules displayed on the erythrocyte surface are responsible for cytoadherence and undergo antigenic variation in the course of an infection. Antigenic variation of PfEMP1 is achieved by in situ switching and mutually exclusive transcription of the var gene family, a process that is controlled by epigenetic mechanisms. Here we report characterisation of the P. falciparum silent information regulators A and B (PfSir2A and PfSir2B) and their involvement in mutual exclusion and silencing of the var gene repertoire. Analysis of P. falciparum parasites lacking either PfSir2A or PfSir2B shows that these NAD+-dependent histone deacetylases are required for silencing of different var gene subsets classified by their conserved promoter type. We also demonstrate that in the absence of either of these molecules mutually exclusive expression of var genes breaks down. We show that var gene silencing originates within the promoter and PfSir2 paralogues are involved in cis spreading of silenced chromatin into adjacent regions. Furthermore, parasites lacking PfSir2A but not PfSir2B have considerably longer telomeric repeats, demonstrating a role for this molecule in telomeric end protection. This work highlights the pivotal but distinct role for both PfSir2 paralogues in epigenetic silencing of P. falciparum virulence genes and the control of pathogenicity of malaria infection. The unicellular parasite Plasmodium falciparum is the cause of the most severe form of malaria and is responsible for 300 million infections and ∼2 million deaths a year. Infected erythrocytes clump and block capillaries in the peripheral circulation, the brain, and placenta and are a major contributor to the pathology of malaria.
A parasite-derived protein displayed on the surface of the infected erythrocyte is responsible for erythrocyte clumping in capillaries. Although 60 subtelomeric var genes can encode different versions of this “sticky” capillary-binding protein, only one protein is expressed at a time, and switches in expression between these genes cause variation of this pathogenic molecule, enabling the parasite to evade the immune system. Here we identify two chromatin-modifying proteins that cooperate to mediate silencing and mutually exclusive expression of var genes. These proteins are thus important virulence factors of the malaria-causing parasite. Investigation into two Sir2 histone deacetylases in the malaria-causing parasite reveals that trans-acting epigenetic factors control mutually exclusive expression of a major subtelomeric virulence gene family.
|
97
|
Loi S, Sotiriou C, Haibe-Kains B, Lallemand F, Conus NM, Piccart MJ, Speed TP, McArthur GA. Gene expression profiling identifies activated growth factor signaling in poor prognosis (Luminal-B) estrogen receptor positive breast cancer. BMC Med Genomics 2009; 2:37. [PMID: 19552798 PMCID: PMC2706265 DOI: 10.1186/1755-8794-2-37] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.9] [Received: 03/02/2009] [Accepted: 06/24/2009] [Indexed: 12/21/2022]
Abstract
BACKGROUND Within estrogen receptor-positive breast cancer (ER+ BC), the expression levels of proliferation-related genes can define two clinically distinct molecular subtypes. When treated with adjuvant tamoxifen, those ER+ BCs that are lowly proliferative have a good prognosis (luminal-A subtype); however, the clinical outcome of those that are highly proliferative is poor (luminal-B subtype). METHODS To investigate the biological basis for these observations, gene set enrichment analysis (GSEA) was performed using microarray data from 246 ER+ BC samples from women treated with adjuvant tamoxifen monotherapy. To create an in vitro model of growth factor (GF) signaling activation, MCF-7 cells were treated with heregulin (HRG), an HER3 ligand. RESULTS We found that a gene set linked to GF signaling was significantly enriched in the luminal-B tumors, despite only 10% of samples over-expressing HER2 by immunohistochemistry. To determine the biological significance of this observation, MCF-7 cells were treated with HRG. These cells displayed phosphorylation of HER2/3 and downstream ERK and S6. Treatment with HRG overcame tamoxifen-induced cell cycle arrest with higher S-phase fraction and increased anchorage independent colony formation. Gene expression profiles of MCF-7 cells treated with HRG confirmed enrichment of the GF signaling gene set and a similar proliferative signature observed in human ER+ BCs resistant to tamoxifen. CONCLUSION These data demonstrate that activation of GF signaling pathways, independent of HER2 over-expression, could be contributing to the poor prognosis of the luminal-B ER+ BC subtype.
|
98
|
Bengtsson H, Wirapati P, Speed TP. A single-array preprocessing method for estimating full-resolution raw copy numbers from all Affymetrix genotyping arrays including GenomeWideSNP 5 & 6. Bioinformatics 2009; 25:2149-56. [PMID: 19535535 PMCID: PMC2734319 DOI: 10.1093/bioinformatics/btp371] [Citation(s) in RCA: 125] [Impact Index Per Article: 8.3] [Indexed: 11/23/2022]
Abstract
Motivation: High-resolution copy-number (CN) analysis has in recent years gained much attention, not only for the purpose of identifying CN aberrations associated with a certain phenotype, but also for identifying CN polymorphisms. In order for such studies to be successful and cost effective, the statistical methods have to be optimized. We propose a single-array preprocessing method for estimating full-resolution total CNs. It is applicable to all Affymetrix genotyping arrays, including the recent ones that also contain non-polymorphic probes. A reference signal is only needed at the last step when calculating relative CNs. Results: As with our method for earlier generations of arrays, this one controls for allelic crosstalk, probe affinities and PCR fragment-length effects. Additionally, it also corrects for probe sequence effects and co-hybridization of fragments digested by multiple enzymes that takes place on the latest chips. We compare our method with Affymetrix's CN5 method and the dChip method by assessing how well they differentiate between various CN states at the full resolution and various amounts of smoothing. Although CRMA v2 is a single-array method, we observe that it performs as well as or better than alternative methods that use data from all arrays for their preprocessing. This shows that it is possible to do online analysis in large-scale projects where additional arrays are introduced over time. Availability: A bounded-memory implementation that can process any number of arrays is available in the open source R package aroma.affymetrix. Contact: hb@stat.berkeley.edu Supplementary information: Supplementary data are available at Bioinformatics online.
|
99
|
Robinson MD, Speed TP. Differential splicing using whole-transcript microarrays. BMC Bioinformatics 2009; 10:156. [PMID: 19463149 PMCID: PMC2703633 DOI: 10.1186/1471-2105-10-156] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Received: 12/07/2008] [Accepted: 05/22/2009] [Indexed: 12/22/2022]
Abstract
BACKGROUND The latest generation of Affymetrix microarrays is designed to interrogate expression over the entire length of every locus, thus giving the opportunity to study alternative splicing genome-wide. The Exon 1.0 ST (sense target) platform, with versions for Human, Mouse and Rat, is designed primarily to probe every known or predicted exon. The smaller Gene 1.0 ST array is designed as an expression microarray but still interrogates expression with probes along the full length of each well-characterized transcript. We explore the possibility of using the Gene 1.0 ST platform to identify differential splicing events. RESULTS We propose a strategy to score differential splicing by using the auxiliary information from fitting the statistical model, RMA (robust multichip analysis). RMA partitions the probe-level data into probe effects and expression levels, operating robustly so that if a small number of probes behave differently from the rest, they are downweighted in the fitting step. We argue that adjacent poorly fitting probes for a given sample can be evidence of differential splicing and have designed a statistic to search for this behaviour. Using a public tissue panel dataset, we show many examples of tissue-specific alternative splicing. Furthermore, we show that evidence for putative alternative splicing has a strong correspondence between the Gene 1.0 ST and Exon 1.0 ST platforms. CONCLUSION We propose a new approach, FIRMAGene, to search for differentially spliced genes using the Gene 1.0 ST platform. Such an analysis complements the search for differential expression. We validate the method by illustrating several known examples and we note some of the challenges in interpreting the probe-level data. Software implementing our methods is freely available as an R package.
|
100
|
Chandran D, Tai YC, Hather G, Dewdney J, Denoux C, Burgess DG, Ausubel FM, Speed TP, Wildermuth MC. Temporal global expression data reveal known and novel salicylate-impacted processes and regulators mediating powdery mildew growth and reproduction on Arabidopsis. Plant Physiology 2009; 149:1435-51. [PMID: 19176722 PMCID: PMC2649394 DOI: 10.1104/pp.108.132985] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.1] [Received: 11/25/2008] [Accepted: 01/23/2009] [Indexed: 05/20/2023]
Abstract
Salicylic acid (SA) is a critical mediator of plant innate immunity. It plays an important role in limiting the growth and reproduction of the virulent powdery mildew (PM) Golovinomyces orontii on Arabidopsis (Arabidopsis thaliana). To investigate this later phase of the PM interaction and the role played by SA, we performed replicated global expression profiling for wild-type and SA biosynthetic mutant isochorismate synthase1 (ics1) Arabidopsis from 0 to 7 d after infection. We found that ICS1-impacted genes constitute 3.8% of profiled genes, with known molecular markers of Arabidopsis defense ranked very highly by the multivariate empirical Bayes statistic (T² statistic). Functional analyses of T²-selected genes identified statistically significant PM-impacted processes, including photosynthesis, cell wall modification, and alkaloid metabolism, that are ICS1 independent. ICS1-impacted processes include redox, vacuolar transport/secretion, and signaling. Our data also support a role for ICS1 (SA) in iron and calcium homeostasis and identify components of SA cross talk with other phytohormones. Through our analysis, 39 novel PM-impacted transcriptional regulators were identified. Insertion mutants in one of these regulators, PUX2 (for plant ubiquitin regulatory X domain-containing protein 2), result in significantly reduced reproduction of the PM in a cell death-independent manner. Although little is known about PUX2, PUX1 acts as a negative regulator of Arabidopsis CDC48, an essential AAA-ATPase chaperone that mediates diverse cellular activities, including homotypic fusion of endoplasmic reticulum and Golgi membranes, endoplasmic reticulum-associated protein degradation, cell cycle progression, and apoptosis. Future work will elucidate the functional role of the novel regulator PUX2 in PM resistance.
|