1
Brash JT, Diez-Pinel G, Colletto C, Castellan RF, Fantin A, Ruhrberg C. The BulkECexplorer compiles endothelial bulk transcriptomes to predict functional versus leaky transcription. Nature Cardiovascular Research 2024; 3:460-473. [PMID: 38708406 PMCID: PMC7615926 DOI: 10.1038/s44161-024-00436-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2022] [Accepted: 01/26/2024] [Indexed: 05/07/2024]
Abstract
Transcriptomic data can be mined to understand the molecular activity of cell types. Yet, functional genes may remain undetected in RNA sequencing (RNA-seq) experiments for technical reasons, such as insufficient read depth or gene dropout. Conversely, RNA-seq experiments may detect lowly expressed mRNAs thought to be biologically irrelevant products of leaky transcription. To represent a cell type's functional transcriptome more accurately, we propose compiling many bulk RNA-seq datasets into a compendium and applying established classification models to predict whether detected transcripts are likely products of active or leaky transcription. Here, we present the BulkECexplorer (bulk RNA-seq endothelial cell explorer) compendium of 240 bulk RNA-seq datasets from five vascular endothelial cell subtypes. This resource reports transcript counts for genes of interest and predicts whether detected transcripts are likely the products of active or leaky gene expression. Beyond its usefulness for vascular biology research, this resource provides a blueprint for developing analogous tools for other cell types.
Affiliation(s)
- James T. Brash
  - UCL Institute of Ophthalmology, University College London, London, UK
- Chiara Colletto
  - Department of Biosciences, University of Milan, Milan, Italy
- Alessandro Fantin
  - UCL Institute of Ophthalmology, University College London, London, UK
  - Department of Biosciences, University of Milan, Milan, Italy
2
Verbeeck N, Caprioli RM, Van de Plas R. Unsupervised machine learning for exploratory data analysis in imaging mass spectrometry. Mass Spectrometry Reviews 2020; 39:245-291. [PMID: 31602691 PMCID: PMC7187435 DOI: 10.1002/mas.21602] [Citation(s) in RCA: 124] [Impact Index Per Article: 31.0] [Received: 03/01/2017] [Accepted: 08/27/2018] [Indexed: 05/20/2023]
Abstract
Imaging mass spectrometry (IMS) is a rapidly advancing molecular imaging modality that can map the spatial distribution of molecules with high chemical specificity. IMS does not require prior tagging of molecular targets and is able to measure a large number of ions concurrently in a single experiment. While this makes it particularly suited for exploratory analysis, the large amount and high-dimensional nature of data generated by IMS techniques make automated computational analysis indispensable. Research into computational methods for IMS data has touched upon different aspects, including spectral preprocessing, data formats, dimensionality reduction, spatial registration, sample classification, differential analysis between IMS experiments, and data-driven fusion methods to extract patterns corroborated by both IMS and other imaging modalities. In this work, we review unsupervised machine learning methods for exploratory analysis of IMS data, with particular focus on (a) factorization, (b) clustering, and (c) manifold learning. To provide a view across the various IMS modalities, we have attempted to include examples from a range of approaches including matrix assisted laser desorption/ionization, desorption electrospray ionization, and secondary ion mass spectrometry-based IMS. This review aims to be an entry point for both (i) analytical chemists and mass spectrometry experts who want to explore computational techniques; and (ii) computer scientists and data mining specialists who want to enter the IMS field. © 2019 The Authors. Mass Spectrometry Reviews published by Wiley Periodicals, Inc. Mass SpecRev 00:1-47, 2019.
Affiliation(s)
- Nico Verbeeck
  - Delft Center for Systems and Control, Delft University of Technology ‐ TU Delft, Delft, The Netherlands
  - Aspect Analytics NV, Genk, Belgium
  - STADIUS Center for Dynamical Systems, Signal Processing, and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, Leuven, Belgium
- Richard M. Caprioli
  - Mass Spectrometry Research Center, Vanderbilt University, Nashville, TN
  - Department of Biochemistry, Vanderbilt University, Nashville, TN
  - Department of Chemistry, Vanderbilt University, Nashville, TN
  - Department of Pharmacology, Vanderbilt University, Nashville, TN
  - Department of Medicine, Vanderbilt University, Nashville, TN
- Raf Van de Plas
  - Delft Center for Systems and Control, Delft University of Technology ‐ TU Delft, Delft, The Netherlands
  - Mass Spectrometry Research Center, Vanderbilt University, Nashville, TN
  - Department of Biochemistry, Vanderbilt University, Nashville, TN
3
Kolisnyk B, Al-Onaizi M, Soreq L, Barbash S, Bekenstein U, Haberman N, Hanin G, Kish MT, Souza da Silva J, Fahnestock M, Ule J, Soreq H, Prado VF, Prado MAM. Cholinergic Surveillance over Hippocampal RNA Metabolism and Alzheimer's-Like Pathology. Cereb Cortex 2018; 27:3553-3567. [PMID: 27312991 DOI: 10.1093/cercor/bhw177] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Indexed: 12/25/2022]
Abstract
The relationship between long-term cholinergic dysfunction and risk of developing dementia is poorly understood. Here we used mice with deletion of the vesicular acetylcholine transporter (VAChT) in the forebrain to model cholinergic abnormalities observed in dementia. Whole-genome RNA sequencing of hippocampal samples revealed that cholinergic failure causes changes in RNA metabolism. Remarkably, key transcripts related to Alzheimer's disease are affected. BACE1, for instance, shows abnormal splicing caused by decreased expression of the splicing regulator hnRNPA2/B1. Resulting BACE1 overexpression leads to increased APP processing and accumulation of soluble Aβ1-42. This is accompanied by age-related increases in GSK3 activation, tau hyperphosphorylation, caspase-3 activation, decreased synaptic markers, increased neuronal death, and deteriorating cognition. Pharmacological inhibition of GSK3 hyperactivation reversed deficits in synaptic markers and tau hyperphosphorylation induced by cholinergic dysfunction, indicating a key role for GSK3 in some of these pathological changes. Interestingly, in human brains there was a high correlation between decreased levels of VAChT and hnRNPA2/B1 levels with increased tau hyperphosphorylation. These results suggest that changes in RNA processing caused by cholinergic loss can facilitate Alzheimer's-like pathology in mice, providing a mechanism by which decreased cholinergic tone may increase risk of dementia.
Affiliation(s)
- Mohammed Al-Onaizi
  - Robarts Research Institute; Department of Anatomy and Cell Biology, Schulich School of Medicine and Dentistry, University of Western Ontario, London, Ontario, Canada N6A5K8
- Lilach Soreq
  - Department of Molecular Neuroscience, UCL Institute of Neurology, Queen Square, London WC1N 3BG, UK
- Shahar Barbash
  - The Edmond and Lily Safra Center for Brain Science and The Silberman Institute of Life Sciences, The Edmond J Safra Campus, The Hebrew University of Jerusalem, Jerusalem 91904, Israel
- Uriya Bekenstein
  - The Edmond and Lily Safra Center for Brain Science and The Silberman Institute of Life Sciences, The Edmond J Safra Campus, The Hebrew University of Jerusalem, Jerusalem 91904, Israel
- Nejc Haberman
  - Department of Molecular Neuroscience, UCL Institute of Neurology, Queen Square, London WC1N 3BG, UK
- Geula Hanin
  - The Edmond and Lily Safra Center for Brain Science and The Silberman Institute of Life Sciences, The Edmond J Safra Campus, The Hebrew University of Jerusalem, Jerusalem 91904, Israel
- Maxine T Kish
  - Robarts Research Institute; Department of Physiology and Pharmacology
- Margaret Fahnestock
  - Department of Psychiatry and Behavioural Neurosciences, McMaster University, Hamilton, Ontario, Canada L8S 4K1
- Jernej Ule
  - Department of Molecular Neuroscience, UCL Institute of Neurology, Queen Square, London WC1N 3BG, UK
- Hermona Soreq
  - The Edmond and Lily Safra Center for Brain Science and The Silberman Institute of Life Sciences, The Edmond J Safra Campus, The Hebrew University of Jerusalem, Jerusalem 91904, Israel
- Vania F Prado
  - Robarts Research Institute; Graduate Program in Neuroscience; Department of Physiology and Pharmacology; Department of Anatomy and Cell Biology, Schulich School of Medicine and Dentistry, University of Western Ontario, London, Ontario, Canada N6A5K8
- Marco A M Prado
  - Robarts Research Institute; Graduate Program in Neuroscience; Department of Physiology and Pharmacology; Department of Anatomy and Cell Biology, Schulich School of Medicine and Dentistry, University of Western Ontario, London, Ontario, Canada N6A5K8
4
Miragaia RJ, Zhang X, Gomes T, Svensson V, Ilicic T, Henriksson J, Kar G, Lönnberg T. Single-cell RNA-sequencing resolves self-antigen expression during mTEC development. Sci Rep 2018; 8:685. [PMID: 29330484 PMCID: PMC5766627 DOI: 10.1038/s41598-017-19100-4] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 11/30/2015] [Accepted: 12/14/2017] [Indexed: 01/03/2023]
Abstract
The crucial capability of T cells for discrimination between self and non-self peptides is based on negative selection of developing thymocytes by medullary thymic epithelial cells (mTECs). The mTECs purge autoreactive T cells by expression of cell-type specific genes referred to as tissue-restricted antigens (TRAs). Although the autoimmune regulator (AIRE) protein is known to promote the expression of a subset of TRAs, its mechanism of action is still not fully understood. The expression of TRAs that are not under the control of AIRE also needs further characterization. Furthermore, expression patterns of TRA genes have been suggested to change over the course of mTEC development. Herein we have used single-cell RNA-sequencing to resolve patterns of TRA expression during mTEC development. Our data indicated that mTEC development consists of three distinct stages, correlating with previously described jTEC, mTEChi and mTEClo phenotypes. For each subpopulation, we have identified marker genes useful in future studies. Aire-induced TRAs were switched on during jTEC-mTEC transition and were expressed in genomic clusters, while otherwise the subsets expressed largely overlapping sets of TRAs. Moreover, population-level analysis of TRA expression frequencies suggested that such differences might not be necessary to achieve efficient thymocyte selection.
Affiliation(s)
- Ricardo J Miragaia
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
  - Centre of Biological Engineering, University of Minho, Campus de Gualtar, 4710-057, Braga, Portugal
- Xiuwei Zhang
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
  - University of California, Berkeley, USA
- Tomás Gomes
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
- Valentine Svensson
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- Tomislav Ilicic
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom
- Johan Henriksson
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- Gozde Kar
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- Tapio Lönnberg
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom.
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, United Kingdom.
  - Turku Centre for Biotechnology, University of Turku and Åbo Akademi University, Turku, Finland.
5
McCarthy DJ, Campbell KR, Lun ATL, Wills QF. Scater: pre-processing, quality control, normalization and visualization of single-cell RNA-seq data in R. Bioinformatics 2017; 33:1179-1186. [PMID: 28088763 PMCID: PMC5408845 DOI: 10.1093/bioinformatics/btw777] [Citation(s) in RCA: 802] [Impact Index Per Article: 114.6] [Received: 08/15/2016] [Accepted: 12/07/2016] [Indexed: 12/26/2022]
Abstract
Motivation: Single-cell RNA sequencing (scRNA-seq) is increasingly used to study gene expression at the level of individual cells. However, preparing raw sequence data for further analysis is not a straightforward process. Biases, artifacts and other sources of unwanted variation are present in the data, requiring substantial time and effort to be spent on pre-processing, quality control (QC) and normalization. Results: We have developed the R/Bioconductor package scater to facilitate rigorous pre-processing, quality control, normalization and visualization of scRNA-seq data. The package provides a convenient, flexible workflow to process raw sequencing reads into a high-quality expression dataset ready for downstream analysis. scater provides a rich suite of plotting tools for single-cell data and a flexible data structure that is compatible with existing tools and can be used as infrastructure for future software development. Availability and Implementation: The open-source code, along with installation instructions, vignettes and case studies, is available through Bioconductor at http://bioconductor.org/packages/scater. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Davis J McCarthy
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, CB10 1SD Hinxton, Cambridge, UK; Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK; St Vincent's Institute of Medical Research, Fitzroy, Victoria 3065, Australia
- Kieran R Campbell
  - Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK; Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford OX1 3QX, UK
- Aaron T L Lun
  - CRUK Cambridge Institute, University of Cambridge, Cambridge CB2 0RE, UK
- Quin F Wills
  - Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK; Weatherall Institute for Molecular Medicine, University of Oxford, John Radcliffe Hospital, Oxford OX3 9DS, UK
6
Stubbington MJ, Mahata B, Svensson V, Deonarine A, Nissen JK, Betz AG, Teichmann SA. An atlas of mouse CD4(+) T cell transcriptomes. Biol Direct 2015; 10:14. [PMID: 25886751 PMCID: PMC4384382 DOI: 10.1186/s13062-015-0045-x] [Citation(s) in RCA: 63] [Impact Index Per Article: 7.0] [Received: 12/10/2014] [Accepted: 02/23/2015] [Indexed: 12/24/2022]
Abstract
BACKGROUND: CD4(+) T cells are key regulators of the adaptive immune system and can be divided into T helper (Th) cells and regulatory T (Treg) cells. During an immune response Th cells mature from a naive state into one of several effector subtypes that exhibit distinct functions. The transcriptional mechanisms that underlie the specific functional identity of CD4(+) T cells are not fully understood. RESULTS: To assist investigations into the transcriptional identity and regulatory processes of these cells we performed mRNA-sequencing on three murine T helper subtypes (Th1, Th2 and Th17) as well as on splenic Treg cells and induced Treg (iTreg) cells. Our integrated analysis of this dataset revealed the gene expression changes associated with these related but distinct cellular identities. Each cell subtype differentially expresses a wealth of 'subtype upregulated' genes, some of which are well known whilst others promise new insights into signalling processes and transcriptional regulation. We show that hundreds of genes are regulated purely by alternative splicing to extend our knowledge of the role of post-transcriptional regulation in cell differentiation. CONCLUSIONS: This CD4(+) transcriptome atlas provides a valuable resource for the study of CD4(+) T cell populations. To facilitate its use by others, we have made the data available in an easily accessible online resource at www.th-express.org.
Affiliation(s)
- Michael Jt Stubbington
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
- Bidesh Mahata
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.
- Valentine Svensson
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
- Jesper K Nissen
  - MRC Laboratory of Molecular Biology, Cambridge, CB2 0QH, UK.
- Sarah A Teichmann
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK.
7
George NI, Chang CW. DAFS: a data-adaptive flag method for RNA-sequencing data to differentiate genes with low and high expression. BMC Bioinformatics 2014; 15:92. [PMID: 24685233 PMCID: PMC4098771 DOI: 10.1186/1471-2105-15-92] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 08/30/2013] [Accepted: 03/25/2014] [Indexed: 02/02/2023]
Abstract
BACKGROUND: Next-generation sequencing (NGS) has advanced the application of high-throughput sequencing technologies in genetic and genomic variation analysis. Due to the large dynamic range of expression levels, RNA-seq is more prone to detect transcripts with low expression. It is clear that genes with no mapped reads are not expressed; however, there is ongoing debate about the level of abundance that constitutes biologically meaningful expression. To date, there is no consensus on the definition of low expression. Since random variation is high in regions with low expression and distributions of transcript expression are affected by numerous experimental factors, methods to differentiate low and high expressed data in a sample are critical to interpreting classes of abundance levels in RNA-seq data. RESULTS: A data-adaptive approach was developed to estimate the lower bound of high expression for RNA-seq data. The Kolmogorov-Smirnov statistic and multivariate adaptive regression splines were used to determine the optimal cutoff value for separating transcripts with high and low expression. Results from the proposed method were compared to results obtained by estimating the theoretical cutoff of a fitted two-component mixture distribution. The robustness of the proposed method was demonstrated by analyzing different RNA-seq datasets that varied by sequencing depth, species, scale of measurement, and empirical density shape. CONCLUSIONS: The analysis of real and simulated data presented here illustrates the need to employ data-adaptive methodology in lieu of arbitrary cutoffs to distinguish low expressed RNA-seq data from high expression. Our results also present the drawbacks of characterizing the data by a two-component mixture distribution when classes of gene expression are not well separated.
The ability to ascertain stably expressed RNA-seq data is essential in the filtering process of data analysis, and methodologies that consider the underlying data structure demonstrate superior performance in preserving most of the interpretable and meaningful data. The proposed algorithm for classifying low and high regions of transcript abundance promises wide-range application in the continuing development of RNA-seq analysis.
Affiliation(s)
- Nysia I George
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, FDA, Jefferson, AR 72079, USA
- Ching-Wei Chang
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, FDA, Jefferson, AR 72079, USA
8
Anavy L, Levin M, Khair S, Nakanishi N, Fernandez-Valverde SL, Degnan BM, Yanai I. BLIND ordering of large-scale transcriptomic developmental timecourses. Development 2014; 141:1161-6. [PMID: 24504336 DOI: 10.1242/dev.105288] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Indexed: 12/22/2022]
Abstract
RNA-Seq enables the efficient transcriptome sequencing of many samples from small amounts of material, but the analysis of these data remains challenging. In particular, in developmental studies, RNA-Seq is challenged by the morphological staging of samples, such as embryos, since these often lack clear markers at any particular stage. In such cases, the automatic identification of the stage of a sample would enable previously infeasible experimental designs. Here we present the 'basic linear index determination of transcriptomes' (BLIND) method for ordering samples comprising different developmental stages. The method is an implementation of a traveling salesman algorithm to order the transcriptomes according to their inter-relationships as defined by principal components analysis. To establish the direction of the ordered samples, we show that an appropriate indicator is the entropy of transcriptomic gene expression levels, which increases over developmental time. Using BLIND, we correctly recover the annotated order of previously published embryonic transcriptomic timecourses for frog, mosquito, fly and zebrafish. We further demonstrate the efficacy of BLIND by collecting 59 embryos of the sponge Amphimedon queenslandica and ordering their transcriptomes according to developmental stage. BLIND is thus useful in establishing the temporal order of samples within large datasets and is of particular relevance to the study of organisms with asynchronous development and when morphological staging is difficult.
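As a rough illustration of the entropy indicator mentioned in this abstract, the sketch below computes the Shannon entropy of one sample's expression levels and uses it to choose the direction of an already ordered series. The function names and the rule of comparing only the first and last samples are assumptions made for illustration, not the BLIND implementation; the traveling salesman ordering over principal components is not shown.

```python
import math


def expression_entropy(levels):
    """Shannon entropy (in bits) of one sample's expression levels.

    Levels are normalised to proportions; zero entries contribute nothing.
    """
    total = sum(levels)
    if total <= 0:
        raise ValueError("expression levels must sum to a positive value")
    entropy = 0.0
    for level in levels:
        if level > 0:
            p = level / total
            entropy -= p * math.log2(p)
    return entropy


def orient_by_entropy(ordered_samples):
    """Return the ordering oriented so that entropy rises from start to end.

    Assumption for this sketch: only the two end samples are compared.
    """
    first = expression_entropy(ordered_samples[0])
    last = expression_entropy(ordered_samples[-1])
    if first <= last:
        return list(ordered_samples)
    return list(reversed(ordered_samples))
```

A sample dominated by a single transcript has entropy near zero, whereas a sample with evenly spread expression has the maximum entropy, so under the abstract's premise the former would be placed at the early end of the timecourse.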
Affiliation(s)
- Leon Anavy
  - Department of Biology, Technion - Israel Institute of Technology, Haifa 32000, Israel
9
Zheng CL, Kawane S, Bottomly D, Wilmot B. Analysis considerations for utilizing RNA-Seq to characterize the brain transcriptome. International Review of Neurobiology 2014; 116:21-54. [PMID: 25172470 DOI: 10.1016/b978-0-12-801105-8.00002-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 01/01/2023]
Abstract
RNA-Seq allows one to examine not only gene expression but also expression of noncoding RNAs, alternative splicing, and allele-specific expression. With this increased sensitivity and dynamic range, there are computational and statistical considerations that need to be contemplated, which are highly dependent on the biological question being asked. We highlight these to provide an overview of their importance and the impact they can have on downstream interpretation of the brain transcriptome.
Affiliation(s)
- Christina L Zheng
  - Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, Oregon, USA; Knight Cancer Institute, Oregon Health, Oregon Health and Science University, Portland, Oregon, USA.
- Sunita Kawane
  - Clinical & Translational Research Institute, Oregon Health and Science University, Portland, Oregon, USA
- Daniel Bottomly
  - Clinical & Translational Research Institute, Oregon Health and Science University, Portland, Oregon, USA
- Beth Wilmot
  - Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, Oregon, USA; Clinical & Translational Research Institute, Oregon Health and Science University, Portland, Oregon, USA
10
Nowrousian M. Fungal gene expression levels do not display a common mode of distribution. BMC Res Notes 2013; 6:559. [PMID: 24373411 PMCID: PMC3877863 DOI: 10.1186/1756-0500-6-559] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 09/20/2013] [Accepted: 12/23/2013] [Indexed: 12/01/2022]
Abstract
Background: RNA-seq studies in metazoa have revealed a distinct, double-peaked (bimodal) distribution of gene expression independent of species and cell type. However, two studies in filamentous fungi yielded conflicting results, with a bimodal distribution in Pyronema confluens and varying distributions in Sordaria macrospora. To obtain a broader overview of global gene expression distributions in fungi, an additional 60 publicly available RNA-seq data sets from six ascomycetes and one basidiomycete were analyzed with respect to gene expression distributions. Results: Clustering of normalized, log2-transformed gene expression levels for each RNA-seq data set yielded distributions with one to five peaks. When only major peaks comprising at least 15% of all analyzed genes were considered, distributions ranged from one to three major peaks, suggesting that fungal gene expression is not generally bimodal. The number of peaks was not correlated with the phylogenetic position of a species; however, higher filamentous asco- and basidiomycetes showed up to three major peaks, whereas gene expression levels in the yeasts Saccharomyces cerevisiae and Schizosaccharomyces pombe had only one to two major peaks, with one predominant peak containing at least 70% of all expressed genes. In several species, the number of peaks varied even within a single species, e.g. depending on the growth conditions as evidenced in the one to three major peaks in different samples from Neurospora crassa. Earlier studies based on microarray and SAGE data revealed distributions of gene expression level that followed Zipf’s law, i.e. log-transformed gene expression levels were inversely proportional to the log-transformed expression rank of a gene. However, analyses of the fungal RNA-seq data sets could not identify any that conformed to Zipf’s law. Conclusions: Fungal gene expression patterns cannot generally be described by a single type of distribution (bimodal or Zipf’s law).
One hypothesis to explain this finding might be that gene expression in fungi is highly dynamic, and fine-tuned at the level of transcription not only for individual genes, but also at a global level.
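One simple way to check whether a data set follows the Zipf-type relationship described in this abstract is to regress log expression on log rank and inspect the slope and goodness of fit. The sketch below is a minimal illustration under that assumption; the function name `zipf_fit` and the use of ordinary least squares are choices made here, not the procedure used in the cited study.

```python
import math


def zipf_fit(expression):
    """Least-squares fit of log(expression) against log(rank).

    Returns (slope, r_squared). Under an idealised Zipf relationship the
    points lie on a straight line with a negative slope (close to -1 in the
    classic form), so r_squared is close to 1.
    """
    # Rank genes from highest to lowest expression; drop non-positive values
    # because their logarithm is undefined.
    values = sorted((x for x in expression if x > 0), reverse=True)
    if len(values) < 2:
        raise ValueError("need at least two positive expression values")
    xs = [math.log(rank) for rank in range(1, len(values) + 1)]
    ys = [math.log(value) for value in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    # With no variance in y the line is flat; report r_squared as 0.0.
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, r_squared
```

A low `r_squared`, or visible curvature in the log-log plot, would indicate that a single power law is a poor description of the data, which is the kind of outcome the abstract reports for the fungal RNA-seq data sets.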
Affiliation(s)
- Minou Nowrousian
  - Lehrstuhl für Allgemeine und Molekulare Botanik, Ruhr-Universität Bochum, 44780 Bochum, Germany.
11
Kim K, Punj V, Choi J, Heo K, Kim JM, Laird PW, An W. Gene dysregulation by histone variant H2A.Z in bladder cancer. Epigenetics Chromatin 2013; 6:34. [PMID: 24279307 PMCID: PMC3853418 DOI: 10.1186/1756-8935-6-34] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.9] [Received: 06/06/2013] [Accepted: 09/27/2013] [Indexed: 12/20/2022]
Abstract
Background: The incorporation of histone variants into nucleosomes is one of the main strategies that the cell uses to regulate the structure and function of chromatin. Histone H2A.Z is an evolutionarily conserved histone H2A variant that is preferentially localized within nucleosomes at the transcriptional start site (TSS). H2A.Z reorganizes the local chromatin structure and recruits the transcriptional machinery for gene activation. High expression of H2A.Z has been reported in several types of cancers and is causally linked to genomic instability and tumorigenesis. However, it is not entirely clear how H2A.Z overexpression in cancer cells establishes aberrant chromatin states and promotes gene expression. Results: Through integration of genome-wide H2A.Z ChIP-seq data with microarray data, we demonstrate that H2A.Z is enriched around the TSS of cell cycle regulatory genes in bladder cancer cells, and this enrichment is correlated with the elevated expression of cancer-promoting genes. RNAi-mediated knockdown of H2A.Z in the cancer cells causes transcriptional suppression of multiple cell cycle regulatory genes with a distinct decrease in cell proliferation. H2A.Z nucleosomes around the TSS have higher levels of H3K4me2/me3, which coincides with the recruitment of two chromatin factors, WDR5 and BPTF. The observed recruitment is functional, as the active states of H2A.Z target genes are largely erased by suppressing the expression of WDR5 or BPTF, effects resembling H2A.Z knockdown. Conclusions: We conclude that H2A.Z is overexpressed in bladder cancer cells and contributes to cancer-related transcription pathways. We also provide evidence in support of the engagement of H3K4me2/me3 and WDR5/BPTF in H2A.Z-induced cancer pathogenesis. Further studies are warranted to understand how H2A.Z overexpression contributes to the recruitment of the full repertoire of transcription machinery to target genes in bladder cancer cells.
12
13
Hebenstreit D, Fang M, Gu M, Charoensawan V, van Oudenaarden A, Teichmann SA. RNA sequencing reveals two major classes of gene expression levels in metazoan cells. Mol Syst Biol 2011; 7:497. [PMID: 21654674 PMCID: PMC3159973 DOI: 10.1038/msb.2011.28] [Citation(s) in RCA: 227] [Impact Index Per Article: 17.5] [Received: 03/01/2011] [Accepted: 04/19/2011] [Indexed: 12/22/2022]
Abstract
The expression level of a gene is often used as a proxy for determining whether the protein or RNA product is functional in a cell or tissue. Therefore, it is of fundamental importance to understand the global distribution of gene expression levels, and to be able to interpret it mechanistically and functionally. Here we use RNA sequencing (RNA-seq) of mouse Th2 cells, coupled with a range of other techniques, to show that all genes can be separated, based on their expression abundance, into two distinct groups: one group comprised of lowly expressed and putatively non-functional mRNAs, and the other of highly expressed mRNAs with active chromatin marks at their promoters. These observations are confirmed in many other microarray and RNA-seq data sets of metazoan cell types.
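The separation into a low and a high expression group described in this abstract can be illustrated with a two-component Gaussian mixture fitted to log-transformed expression values. The sketch below is a minimal expectation-maximisation fit written for illustration only; the Gaussian form, the quartile-based initialisation and the names `fit_two_component_mixture` and `classify_expression` are assumptions, not the authors' method.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_component_mixture(values, iterations=200):
    """Fit a 1D two-component Gaussian mixture with plain EM.

    `values` are assumed to be log-transformed expression levels.
    Returns (weights, means, variances), each a list of length two.
    """
    ordered = sorted(values)
    n = len(ordered)
    # Initialise the component means at the lower and upper quartiles.
    means = [ordered[n // 4], ordered[(3 * n) // 4]]
    overall_mean = sum(values) / n
    overall_var = sum((v - overall_mean) ** 2 for v in values) / n
    variances = [max(overall_var, 1e-6)] * 2
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        resp = []
        for v in values:
            p = [weights[k] * normal_pdf(v, means[k], variances[k]) for k in (0, 1)]
            total = p[0] + p[1]
            resp.append((p[0] / total, p[1] / total) if total > 0 else (0.5, 0.5))
        # M-step: update weights, means and variances.
        for k in (0, 1):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            weights[k] = nk / n
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            variances[k] = max(var, 1e-6)  # floor avoids a collapsed component
    return weights, means, variances


def classify_expression(value, weights, means, variances):
    """Label a log expression value 'high' or 'low' by the likelier component."""
    high = 0 if means[0] > means[1] else 1
    low = 1 - high
    p_high = weights[high] * normal_pdf(value, means[high], variances[high])
    p_low = weights[low] * normal_pdf(value, means[low], variances[low])
    return "high" if p_high > p_low else "low"
```

The point at which the two weighted densities cross acts as a data-derived expression cutoff. When the two groups overlap heavily, a fit of this kind becomes unreliable, a drawback that the DAFS abstract above also notes for two-component mixtures.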
Affiliation(s)
- Daniel Hebenstreit
  - Structural Studies Division, MRC Laboratory of Molecular Biology, Cambridge, UK.