251
|
Rahmatallah Y, Emmert-Streib F, Glazko G. Comparative evaluation of gene set analysis approaches for RNA-Seq data. BMC Bioinformatics 2014; 15:397. [PMID: 25475910 PMCID: PMC4265362 DOI: 10.1186/s12859-014-0397-8] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 08/11/2014] [Accepted: 11/24/2014] [Indexed: 11/18/2022] Open
Abstract
Background Over the last few years transcriptome sequencing (RNA-Seq) has almost completely taken over from microarrays for high-throughput studies of gene expression. Currently, the most popular use of RNA-Seq is to identify genes which are differentially expressed between two or more conditions. Despite the importance of Gene Set Analysis (GSA) in the interpretation of the results from RNA-Seq experiments, the limitations of GSA methods developed for microarrays in the context of RNA-Seq data are not well understood. Results We provide a thorough evaluation of popular multivariate and gene-level self-contained GSA approaches on simulated and real RNA-Seq data. The multivariate approach employs multivariate non-parametric tests combined with popular normalizations for RNA-Seq data. The gene-level approach utilizes univariate tests designed for the analysis of RNA-Seq data to find gene-specific P-values and combines them into a pathway P-value using classical statistical techniques. Our results demonstrate that the Type I error rate and the power of multivariate tests depend only on the test statistics and are insensitive to the different normalizations. In general, standard multivariate GSA tests detect pathways that do not have any bias in terms of pathway size, percentage of differentially expressed genes, or average gene length in a pathway. In contrast, the Type I error rate and the power of gene-level GSA tests are heavily affected by the methods for combining P-values, and all aforementioned biases are present in detected pathways. Conclusions Our results emphasize the importance of using self-contained non-parametric multivariate tests for detecting differentially expressed pathways for RNA-Seq data and warn against applying gene-level GSA tests, especially because of their high Type I error rates for both simulated and real data.
Electronic supplementary material The online version of this article (doi:10.1186/s12859-014-0397-8) contains supplementary material, which is available to authorized users.
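As an illustration of the gene-level approach summarized in the abstract above, the following sketch combines gene-specific P-values into a single pathway P-value using Fisher's method. Fisher's method is used here only as one familiar example of a classical combining technique; the abstract does not state which techniques the authors evaluated, and the function name `fisher_combined_pvalue` is hypothetical. The method assumes independent P-values, an assumption that may not hold for correlated genes within a pathway.

```python
import math


def fisher_combined_pvalue(gene_pvalues):
    """Combine per-gene P-values into one pathway P-value (Fisher's method).

    Illustrative sketch only: assumes the P-values are independent.
    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null, where k is the gene count.
    """
    k = len(gene_pvalues)
    if k == 0:
        raise ValueError("at least one P-value is required")
    if any(not (0.0 < p <= 1.0) for p in gene_pvalues):
        raise ValueError("P-values must lie in the interval (0, 1]")

    # half_stat is X / 2, which is what the closed-form tail needs.
    half_stat = -sum(math.log(p) for p in gene_pvalues)

    # For an even number of degrees of freedom (2k), the chi-square
    # upper tail has the closed form
    #   exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)
```

With a single gene the combined value equals the input P-value, which is a quick sanity check on the formula.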
Affiliation(s)
- Yasir Rahmatallah
- Division of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, 72205, USA.
| | - Frank Emmert-Streib
- Computational Biology and Machine Learning Laboratory, Center for Cancer Research and Cell Biology, School of Medicine, Dentistry and Biomedical Sciences, Queen's University Belfast, 97 Lisburn Road, Belfast, BT9 7BL, UK.
| | - Galina Glazko
- Division of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, 72205, USA.
| |
|
252
|
|
253
|
Ji H, Zhang X, Oh S, Mayhew CN, Ulm A, Somineni HK, Ericksen M, Wells JM, Khurana Hershey GK. Dynamic transcriptional and epigenomic reprogramming from pediatric nasal epithelial cells to induced pluripotent stem cells. J Allergy Clin Immunol 2014; 135:236-44. [PMID: 25441642 DOI: 10.1016/j.jaci.2014.08.038] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 05/15/2014] [Revised: 07/24/2014] [Accepted: 08/27/2014] [Indexed: 11/25/2022]
Abstract
BACKGROUND Induced pluripotent stem cells (iPSCs) hold tremendous potential, both as a biological tool to uncover the pathophysiology of disease by creating relevant human cell models and as a source of cells for cell-based therapeutic applications. Studying the reprogramming process will also provide significant insight into tissue development. OBJECTIVE We sought to characterize the derivation of iPSC lines from nasal epithelial cells (NECs) isolated from nasal mucosa samples of children, a highly relevant and easily accessible tissue for pediatric populations. METHODS We performed detailed comparative analysis on the transcriptomes and methylomes of NECs, iPSCs derived from NECs (NEC-iPSCs), and embryonic stem cells (ESCs). RESULTS NEC-iPSCs express pluripotent cell markers, can differentiate into all 3 germ layers in vivo and in vitro, and have a transcriptome and methylome remarkably similar to those of ESCs. However, residual DNA methylation marks exist, which are differentially methylated between NEC-iPSCs and ESCs. A subset of these methylation markers related to epithelium development and asthma and specific to NEC-iPSCs persisted after several passages in vitro, suggesting the retention of an epigenetic memory of their tissue of origin. Our analysis also identified novel candidate genes with dynamic gene expression and DNA methylation changes during reprogramming, which are indicative of possible roles in airway epithelium development. CONCLUSION NECs are an excellent tissue source to generate iPSCs in pediatric asthmatic patients, and detailed characterization of the resulting iPSC lines would help us better understand the reprogramming process and retention of epigenetic memory.
Affiliation(s)
- Hong Ji
- Division of Asthma Research, Cincinnati Children's Hospital Medical Center, and the Department of Pediatrics, University of Cincinnati, Cincinnati, Ohio.
| | - Xue Zhang
- Division of Human Genetics, Cincinnati Children's Hospital Medical Center, and the Department of Pediatrics, University of Cincinnati, Cincinnati, Ohio
| | - Sunghee Oh
- Division of Human Genetics, Kim Sook Za Children's Hospital Medical Center Research Foundation, Cheongju, South Korea
| | - Christopher N Mayhew
- Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, and the Department of Pediatrics, University of Cincinnati, Cincinnati, Ohio
| | - Ashley Ulm
- Division of Asthma Research, Cincinnati Children's Hospital Medical Center, and the Department of Pediatrics, University of Cincinnati, Cincinnati, Ohio
| | - Hari K Somineni
- Division of Asthma Research, Cincinnati Children's Hospital Medical Center, and the Department of Pediatrics, University of Cincinnati, Cincinnati, Ohio
| | - Mark Ericksen
- Division of Asthma Research, Cincinnati Children's Hospital Medical Center, and the Department of Pediatrics, University of Cincinnati, Cincinnati, Ohio
| | - James M Wells
- Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, and the Department of Pediatrics, University of Cincinnati, Cincinnati, Ohio; Division of Endocrinology, Cincinnati Children's Hospital Medical Center, and the Department of Pediatrics, University of Cincinnati, Cincinnati, Ohio
| | - Gurjit K Khurana Hershey
- Division of Asthma Research, Cincinnati Children's Hospital Medical Center, and the Department of Pediatrics, University of Cincinnati, Cincinnati, Ohio
| |
|
254
|
Xu M, Rhee SY. Becoming data-savvy in a big-data world. TRENDS IN PLANT SCIENCE 2014; 19:619-622. [PMID: 25213119 DOI: 10.1016/j.tplants.2014.08.003] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 06/03/2014] [Revised: 08/15/2014] [Accepted: 08/20/2014] [Indexed: 06/03/2023]
Abstract
Plant biology is becoming a data-driven science. High-throughput technologies generate data quickly from molecular to ecosystem levels. Statistical and computational approaches enable describing and interpreting data quantitatively. We highlight the purpose, common problems, and general principles in data analysis. We use RNA sequencing (RNAseq) analysis to illustrate the rationale behind some of the choices made in statistical data analysis. Finally, we provide a list of free online resources that emphasize intuition behind quantitative data analysis.
Affiliation(s)
- Meng Xu
- Carnegie Institution for Science, Department of Plant Biology, 260 Panama Street, Stanford, CA 94305, USA
| | - Seung Yon Rhee
- Carnegie Institution for Science, Department of Plant Biology, 260 Panama Street, Stanford, CA 94305, USA.
| |
|
255
|
Wu PY, Chandramohan R, Phan JH, Mahle WT, Gaynor JW, Maher KO, Wang MD. Cardiovascular transcriptomics and epigenomics using next-generation sequencing: challenges, progress, and opportunities. CIRCULATION. CARDIOVASCULAR GENETICS 2014; 7:701-10. [PMID: 25518043 PMCID: PMC4983435 DOI: 10.1161/circgenetics.113.000129] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 02/03/2023]
Affiliation(s)
- Po-Yen Wu
- From the School of Electrical and Computer Engineering (P.-Y.W.), School of Biology (R.C.), Georgia Institute of Technology, Atlanta, GA; The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA (J.H.P.); The Wallace H. Coulter Department of Biomedical Engineering, School of Electrical and Computer Engineering, Winship Cancer Institute, Parker H. Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology and Emory University, Atlanta, GA (M.D.W.); Children's Healthcare of Atlanta, Department of Pediatrics, Emory University School of Medicine, Atlanta, GA (W.T.M., K.O.M.); The Children's Hospital of Philadelphia, Department of Surgery, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA (J.W.G.)
| | - Raghu Chandramohan
- From the School of Electrical and Computer Engineering (P.-Y.W.), School of Biology (R.C.), Georgia Institute of Technology, Atlanta, GA; The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA (J.H.P.); The Wallace H. Coulter Department of Biomedical Engineering, School of Electrical and Computer Engineering, Winship Cancer Institute, Parker H. Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology and Emory University, Atlanta, GA (M.D.W.); Children's Healthcare of Atlanta, Department of Pediatrics, Emory University School of Medicine, Atlanta, GA (W.T.M., K.O.M.); The Children's Hospital of Philadelphia, Department of Surgery, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA (J.W.G.)
| | - John H Phan
- From the School of Electrical and Computer Engineering (P.-Y.W.), School of Biology (R.C.), Georgia Institute of Technology, Atlanta, GA; The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA (J.H.P.); The Wallace H. Coulter Department of Biomedical Engineering, School of Electrical and Computer Engineering, Winship Cancer Institute, Parker H. Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology and Emory University, Atlanta, GA (M.D.W.); Children's Healthcare of Atlanta, Department of Pediatrics, Emory University School of Medicine, Atlanta, GA (W.T.M., K.O.M.); The Children's Hospital of Philadelphia, Department of Surgery, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA (J.W.G.)
| | - William T Mahle
- From the School of Electrical and Computer Engineering (P.-Y.W.), School of Biology (R.C.), Georgia Institute of Technology, Atlanta, GA; The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA (J.H.P.); The Wallace H. Coulter Department of Biomedical Engineering, School of Electrical and Computer Engineering, Winship Cancer Institute, Parker H. Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology and Emory University, Atlanta, GA (M.D.W.); Children's Healthcare of Atlanta, Department of Pediatrics, Emory University School of Medicine, Atlanta, GA (W.T.M., K.O.M.); The Children's Hospital of Philadelphia, Department of Surgery, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA (J.W.G.)
| | - J William Gaynor
- From the School of Electrical and Computer Engineering (P.-Y.W.), School of Biology (R.C.), Georgia Institute of Technology, Atlanta, GA; The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA (J.H.P.); The Wallace H. Coulter Department of Biomedical Engineering, School of Electrical and Computer Engineering, Winship Cancer Institute, Parker H. Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology and Emory University, Atlanta, GA (M.D.W.); Children's Healthcare of Atlanta, Department of Pediatrics, Emory University School of Medicine, Atlanta, GA (W.T.M., K.O.M.); The Children's Hospital of Philadelphia, Department of Surgery, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA (J.W.G.)
| | - Kevin O Maher
- From the School of Electrical and Computer Engineering (P.-Y.W.), School of Biology (R.C.), Georgia Institute of Technology, Atlanta, GA; The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA (J.H.P.); The Wallace H. Coulter Department of Biomedical Engineering, School of Electrical and Computer Engineering, Winship Cancer Institute, Parker H. Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology and Emory University, Atlanta, GA (M.D.W.); Children's Healthcare of Atlanta, Department of Pediatrics, Emory University School of Medicine, Atlanta, GA (W.T.M., K.O.M.); The Children's Hospital of Philadelphia, Department of Surgery, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA (J.W.G.)
| | - May D Wang
- From the School of Electrical and Computer Engineering (P.-Y.W.), School of Biology (R.C.), Georgia Institute of Technology, Atlanta, GA; The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA (J.H.P.); The Wallace H. Coulter Department of Biomedical Engineering, School of Electrical and Computer Engineering, Winship Cancer Institute, Parker H. Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology and Emory University, Atlanta, GA (M.D.W.); Children's Healthcare of Atlanta, Department of Pediatrics, Emory University School of Medicine, Atlanta, GA (W.T.M., K.O.M.); The Children's Hospital of Philadelphia, Department of Surgery, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA (J.W.G.).
| |
|
256
|
Taher L, Pfeiffer MJ, Fuellen G. Bioinformatics approaches to single-blastomere transcriptomics. Mol Hum Reprod 2014; 21:115-25. [DOI: 10.1093/molehr/gau083] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 12/17/2022]
|
257
|
Finotello F, Di Camillo B. Measuring differential gene expression with RNA-seq: challenges and strategies for data analysis. Brief Funct Genomics 2014; 14:130-42. [PMID: 25240000 DOI: 10.1093/bfgp/elu035] [Citation(s) in RCA: 128] [Impact Index Per Article: 12.8] [Indexed: 12/20/2022] Open
Abstract
RNA-seq is a methodology for RNA profiling based on next-generation sequencing that enables the measurement and comparison of gene expression patterns at unprecedented resolution. Although the appealing features of this technique have promoted its application to a wide panel of transcriptomics studies, the fast-evolving nature of experimental protocols and computational tools challenges the definition of a unified RNA-seq analysis pipeline. In this review, focused on the study of differential gene expression with RNA-seq, we go through the main steps of data processing and discuss open challenges and possible solutions.
|
258
|
Ofek-Lalzar M, Sela N, Goldman-Voronov M, Green SJ, Hadar Y, Minz D. Niche and host-associated functional signatures of the root surface microbiome. Nat Commun 2014; 5:4950. [PMID: 25232638 DOI: 10.1038/ncomms5950] [Citation(s) in RCA: 204] [Impact Index Per Article: 20.4] [Received: 06/02/2014] [Accepted: 08/08/2014] [Indexed: 01/20/2023] Open
Abstract
Plant microbiomes are critical to host adaptation and impact plant productivity and health. Root-associated microbiomes vary by soil and host genotype, but the contribution of these factors to community structure and metabolic potential has not been fully addressed. Here we characterize root microbial communities of two disparate agricultural crops grown in the same natural soil in a controlled and replicated experimental system. Metagenomic (genetic potential) analysis identifies a core set of functional genes associated with root colonization in both plant hosts, and metatranscriptomic (functional expression) analysis revealed that most genes enriched in the root zones are expressed. Root colonization requires multiple functional capabilities, and these capabilities are enriched at the community level. Differences between the root-associated microbial communities from different plants are observed at the genus or species level, and are related to root-zone environmental factors.
Affiliation(s)
- Maya Ofek-Lalzar
- 1] Department of Soil, Water and Environmental Sciences, Agricultural Research Organization of Israel, Bet Dagan 50250, Israel [2] Department of Plant Pathology and Microbiology, The Robert H. Smith Faculty of Agriculture, Food and Environment, The Hebrew University of Jerusalem, Rehovot 76100, Israel
| | - Noa Sela
- Department of Plant Pathology and Weed Research, Agricultural Research Organization of Israel, Bet Dagan 50250, Israel
| | - Milana Goldman-Voronov
- Department of Soil, Water and Environmental Sciences, Agricultural Research Organization of Israel, Bet Dagan 50250, Israel
| | - Stefan J Green
- 1] DNA Services Facility, Research Resources Center, University of Illinois at Chicago, Chicago, Illinois 60612, USA [2] Department of Biological Sciences, University of Illinois at Chicago, Chicago, Illinois 60612, USA
| | - Yitzhak Hadar
- Department of Plant Pathology and Microbiology, The Robert H. Smith Faculty of Agriculture, Food and Environment, The Hebrew University of Jerusalem, Rehovot 76100, Israel
| | - Dror Minz
- Department of Soil, Water and Environmental Sciences, Agricultural Research Organization of Israel, Bet Dagan 50250, Israel
| |
|
259
|
Mazuc E, Guglielmi L, Bec N, Parez V, Hahn CS, Mollevi C, Parrinello H, Desvignes JP, Larroque C, Jupp R, Dariavach P, Martineau P. In-cell intrabody selection from a diverse human library identifies C12orf4 protein as a new player in rodent mast cell degranulation. PLoS One 2014; 9:e104998. [PMID: 25122211 PMCID: PMC4133367 DOI: 10.1371/journal.pone.0104998] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 05/24/2014] [Accepted: 07/14/2014] [Indexed: 01/04/2023] Open
Abstract
The high specificity of antibodies for their antigen allows a fine discrimination of target conformations and post-translational modifications, making antibodies the first-choice tool to interrogate the proteome. We describe here an approach based on large-scale intracellular expression and selection of antibody fragments in eukaryotic cells, so-called intrabodies, and the subsequent identification of their natural target within living cells. Starting from a phenotypic trait, this integrated system allows the identification of new therapeutic targets together with their companion inhibitory intrabody. We applied this system in a model of allergy and inflammation. We first cloned a large and highly diverse intrabody library both in a plasmid and a retroviral eukaryotic expression vector. After transfection in the RBL-2H3 rat basophilic leukemia cell line, we performed seven rounds of selection to isolate cells displaying a defect in FcεRI-induced degranulation. We used high throughput sequencing to identify intrabody sequences enriched during the course of selection. Only one intrabody was common to both plasmid and retroviral selections, and was used to capture and identify its target from cell extracts. Mass spectrometry analysis identified protein RGD1311164 (C12orf4), with no previously described function. Our data demonstrate that RGD1311164 is a cytoplasmic protein implicated in the early signaling events following FcεRI-induced cell activation. This work illustrates the strength of the intrabody-based in-cell selection, which allowed the identification of a new player in mast cell activation together with its specific inhibitor intrabody.
Affiliation(s)
- Elsa Mazuc
- IRCM, Institut de Recherche en Cancérologie de Montpellier, Montpellier, France
- INSERM, U896, Montpellier, France
- Université Montpellier1, Montpellier, France
- ICM, Institut régional du Cancer Montpellier, Montpellier, France
| | - Laurence Guglielmi
- IRCM, Institut de Recherche en Cancérologie de Montpellier, Montpellier, France
- INSERM, U896, Montpellier, France
- Université Montpellier1, Montpellier, France
- ICM, Institut régional du Cancer Montpellier, Montpellier, France
| | - Nicole Bec
- IRCM, Institut de Recherche en Cancérologie de Montpellier, Montpellier, France
- INSERM, U896, Montpellier, France
- Université Montpellier1, Montpellier, France
- ICM, Institut régional du Cancer Montpellier, Montpellier, France
| | - Vincent Parez
- IRCM, Institut de Recherche en Cancérologie de Montpellier, Montpellier, France
- INSERM, U896, Montpellier, France
- Université Montpellier1, Montpellier, France
- ICM, Institut régional du Cancer Montpellier, Montpellier, France
| | - Chang S. Hahn
- Sanofi-Aventis, Bridgewater, New Jersey, United States of America
| | - Caroline Mollevi
- ICM, Institut régional du Cancer Montpellier, Montpellier, France
| | - Hugues Parrinello
- MGX-Montpellier GenomiX, Institut de Génomique Fonctionnelle, Montpellier, France
| | | | - Christian Larroque
- IRCM, Institut de Recherche en Cancérologie de Montpellier, Montpellier, France
- INSERM, U896, Montpellier, France
- Université Montpellier1, Montpellier, France
- ICM, Institut régional du Cancer Montpellier, Montpellier, France
| | - Ray Jupp
- Sanofi-Aventis, Bridgewater, New Jersey, United States of America
| | - Piona Dariavach
- IRCM, Institut de Recherche en Cancérologie de Montpellier, Montpellier, France
- INSERM, U896, Montpellier, France
- Université Montpellier1, Montpellier, France
- ICM, Institut régional du Cancer Montpellier, Montpellier, France
- Université Montpellier2, Montpellier, France
| | - Pierre Martineau
- IRCM, Institut de Recherche en Cancérologie de Montpellier, Montpellier, France
- INSERM, U896, Montpellier, France
- Université Montpellier1, Montpellier, France
- ICM, Institut régional du Cancer Montpellier, Montpellier, France
| |
|
260
|
Zhang ZH, Jhaveri DJ, Marshall VM, Bauer DC, Edson J, Narayanan RK, Robinson GJ, Lundberg AE, Bartlett PF, Wray NR, Zhao QY. A comparative study of techniques for differential expression analysis on RNA-Seq data. PLoS One 2014; 9:e103207. [PMID: 25119138 PMCID: PMC4132098 DOI: 10.1371/journal.pone.0103207] [Citation(s) in RCA: 151] [Impact Index Per Article: 15.1] [Received: 03/04/2014] [Accepted: 06/30/2014] [Indexed: 01/23/2023] Open
Abstract
Recent advances in next-generation sequencing technology allow high-throughput cDNA sequencing (RNA-Seq) to be widely applied in transcriptomic studies, in particular for detecting differentially expressed genes between groups. Many software packages have been developed for the identification of differentially expressed genes (DEGs) between treatment groups based on RNA-Seq data. However, there is a lack of consensus on how to approach an optimal study design and choice of suitable software for the analysis. In this comparative study we evaluate the performance of three of the most frequently used software tools: Cufflinks-Cuffdiff2, DESeq and edgeR. A number of important parameters of RNA-Seq technology were taken into consideration, including the number of replicates, sequencing depth, and balanced vs. unbalanced sequencing depth within and between groups. We benchmarked results relative to sets of DEGs identified through either quantitative RT-PCR or microarray. We observed that edgeR performs slightly better than DESeq and Cuffdiff2 in terms of the ability to uncover true positives. Overall, DESeq or taking the intersection of DEGs from two or more tools is recommended if the number of false positives is a major concern in the study. In other circumstances, edgeR is slightly preferable for differential expression analysis at the expense of potentially introducing more false positives.
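The recommendation in the abstract above to take the intersection of DEGs from two or more tools can be sketched as a plain set operation. This is a minimal illustration, not the authors' pipeline: the function name `intersect_deg_calls` and the gene identifiers are hypothetical, and the per-tool gene lists are assumed to have been exported already from each tool's own output.

```python
def intersect_deg_calls(calls_by_tool):
    """Return the genes called differentially expressed by every tool.

    calls_by_tool maps a tool name to an iterable of gene identifiers
    that the tool reported as differentially expressed.
    """
    if not calls_by_tool:
        raise ValueError("at least one tool's gene list is required")
    gene_sets = [set(genes) for genes in calls_by_tool.values()]
    return set.intersection(*gene_sets)


# Hypothetical gene identifiers, for illustration only.
calls = {
    "DESeq": ["geneA", "geneB", "geneC"],
    "edgeR": ["geneB", "geneC", "geneD"],
    "Cuffdiff2": ["geneC", "geneB", "geneE"],
}
consensus = intersect_deg_calls(calls)
```

Requiring agreement across tools trades sensitivity for fewer false positives, which matches the situation the abstract describes, where false positives are the major concern.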
Affiliation(s)
- Zong Hong Zhang
- The University of Queensland, Queensland Brain Institute, Brisbane, Queensland, Australia
| | - Dhanisha J. Jhaveri
- The University of Queensland, Queensland Brain Institute, Brisbane, Queensland, Australia
| | - Vikki M. Marshall
- The University of Queensland, Queensland Brain Institute, Brisbane, Queensland, Australia
| | - Denis C. Bauer
- The University of Queensland, Queensland Brain Institute, Brisbane, Queensland, Australia
- CSIRO Preventative Health Flagship and CSIRO Computational Informatics, Sydney, New South Wales, Australia
| | - Janette Edson
- The University of Queensland, Queensland Brain Institute, Brisbane, Queensland, Australia
- The University of Queensland, Diamantina Institute, Brisbane, Queensland, Australia
| | - Ramesh K. Narayanan
- The University of Queensland, Queensland Brain Institute, Brisbane, Queensland, Australia
| | - Gregory J. Robinson
- The University of Queensland, Queensland Brain Institute, Brisbane, Queensland, Australia
| | - Andreas E. Lundberg
- Swedish University of Agricultural Sciences, Department of Clinical Sciences, Uppsala, Sweden
| | - Perry F. Bartlett
- The University of Queensland, Queensland Brain Institute, Brisbane, Queensland, Australia
| | - Naomi R. Wray
- The University of Queensland, Queensland Brain Institute, Brisbane, Queensland, Australia
| | - Qiong-Yi Zhao
- The University of Queensland, Queensland Brain Institute, Brisbane, Queensland, Australia
| |
|
261
|
Aguilar-Pontes MV, de Vries RP, Zhou M. (Post-)genomics approaches in fungal research. Brief Funct Genomics 2014; 13:424-39. [PMID: 25037051 DOI: 10.1093/bfgp/elu028] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 01/08/2023] Open
Abstract
To date, hundreds of fungal genomes have been sequenced and many more are in progress. This wealth of genomic information has provided new directions to study fungal biodiversity. However, to further dissect and understand the complicated biological mechanisms involved in fungal life styles, functional studies beyond genomes are required. Thanks to the developments of current -omics techniques, it is possible to produce large amounts of fungal functional data in a high-throughput fashion (e.g. transcriptome, proteome, etc.). The increasing ease of creating -omics data has also created a major challenge for downstream data handling and analysis. Numerous databases, tools and software have been created to meet this challenge. Facing such a richness of techniques and information, hereby we provide a brief roadmap on current wet-lab and bioinformatics approaches to study functional genomics in fungi.
|
262
|
Hou J, Groothuismink ZMA, Koning L, Roomer R, van IJcken WFJ, Kreefft K, Liu BS, Janssen HLA, de Knegt RJ, Boonstra A. Analysis of the transcriptome and immune function of monocytes during IFNα-based therapy in chronic HCV revealed induction of TLR7 responsiveness. Antiviral Res 2014; 109:116-24. [PMID: 25014880 DOI: 10.1016/j.antiviral.2014.06.020] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 03/12/2014] [Revised: 06/10/2014] [Accepted: 06/30/2014] [Indexed: 01/17/2023]
Abstract
Although in vitro studies have been performed to dissect the mechanism of action of IFNα, detailed in vivo studies on the long-term effects of IFNα on monocytes have not been performed. Here we examined peripheral blood from 14 chronic HCV patients at baseline and 12 weeks after start of IFNα-based therapy. Monocytes were phenotyped by flow-cytometry and their function evaluated upon TLR stimulation and assessed by multiplex cytokine assays. During therapy of HCV patients, monocytes displayed a hyperactive state as evidenced by increased TLR-induced pro-inflammatory cytokine levels, as well as enhanced CD69 and CD83 mRNA and protein expression. Moreover, monocytes from 8 patients at baseline and 12 weeks after start of IFNα-based therapy were transcriptomically profiled by high throughput RNA-sequencing. Detailed RNA-seq analysis of monocytes showed significant ISG mRNA induction during therapy. Importantly, IFNα-based therapy activated TLR7 signaling pathways, as demonstrated by up-regulated expression of TLR7, MyD88, and IRF7 mRNA, whereas other TLR family members as well as CD1c, CLEC4C, and CLEC9A were not induced. The induction of TLR7 responsiveness of monocytes by IFNα in vivo in HCV patients is relevant to TLR7 agonists that are currently under development as promising immunotherapeutic compounds to treat chronic viral hepatitis.
Affiliation(s)
- Jun Hou
- Department of Gastroenterology and Hepatology, Erasmus MC, University Medical Center Rotterdam, The Netherlands
| | - Zwier M A Groothuismink
- Department of Gastroenterology and Hepatology, Erasmus MC, University Medical Center Rotterdam, The Netherlands
| | - Ludi Koning
- Department of Gastroenterology and Hepatology, Erasmus MC, University Medical Center Rotterdam, The Netherlands
| | - Robert Roomer
- Department of Gastroenterology and Hepatology, Erasmus MC, University Medical Center Rotterdam, The Netherlands
| | | | - Kim Kreefft
- Department of Gastroenterology and Hepatology, Erasmus MC, University Medical Center Rotterdam, The Netherlands
| | - Bi-Sheng Liu
- Department of Gastroenterology and Hepatology, Erasmus MC, University Medical Center Rotterdam, The Netherlands
| | - Harry L A Janssen
- Department of Gastroenterology and Hepatology, Erasmus MC, University Medical Center Rotterdam, The Netherlands; Liver Clinic University Health Network, Division of Gastroenterology, University of Toronto, Canada
| | - Robert J de Knegt
- Department of Gastroenterology and Hepatology, Erasmus MC, University Medical Center Rotterdam, The Netherlands
| | - Andre Boonstra
- Department of Gastroenterology and Hepatology, Erasmus MC, University Medical Center Rotterdam, The Netherlands.
| |
|
263
|
An J, Kim K, Chae H, Kim S. DegPack: a web package using a non-parametric and information theoretic algorithm to identify differentially expressed genes in multiclass RNA-seq samples. Methods 2014; 69:306-14. [PMID: 24981074 DOI: 10.1016/j.ymeth.2014.06.004] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 04/02/2014] [Revised: 06/02/2014] [Accepted: 06/04/2014] [Indexed: 11/29/2022] Open
Abstract
Gene expression in the whole cell can be routinely measured by microarray technologies or, more recently, by sequencing technologies. With these technologies, identifying differentially expressed genes (DEGs) among multiple phenotypes is the very first step toward understanding differences between phenotypes. Thus many methods for detecting DEGs between two groups have been developed. For example, the t-test and relative entropy are used for detecting differences between two probability distributions. When more than two phenotypes are considered, these methods are not applicable, and other methods such as the ANOVA F-test and the Kruskal-Wallis test are used for finding DEGs in multiclass data. However, the ANOVA F-test assumes a normal distribution and is not designed to identify DEGs where genes are expressed distinctively in each of the phenotypes. The Kruskal-Wallis test, a non-parametric method, is more robust but remains sensitive to outliers. In this paper, we propose a non-parametric and information-theoretic approach for identifying DEGs. Our method identified DEGs effectively and was shown to be less sensitive to outliers in two data sets: a three-class drought-resistant rice data set and a three-class breast cancer data set. In extensive experiments with simulated and real data, our method was shown to outperform existing tools in terms of accuracy of characterizing phenotypes using DEGs. A web service for the analysis of multiclass data is available at http://biohealth.snu.ac.kr/software/degpack; it includes the SAMseq and PoissonSeq methods in addition to the method described in this paper.
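Note: the Kruskal-Wallis baseline mentioned in this abstract can be sketched in a few lines. The following is a minimal, standard-library illustration of the H statistic for one gene measured in three classes; it is not DegPack's own algorithm, the function names are made up for this note, no tie correction is applied, and the p-value helper assumes exactly three groups (chi-square with 2 degrees of freedom).

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # average of ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def kruskal_wallis_h(groups: Sequence[Sequence[float]]) -> float:
    """Kruskal-Wallis H statistic (hypothetical helper, no tie correction)."""
    pooled = [x for group in groups for x in group]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)


def p_value_three_groups(h: float) -> float:
    """Chi-square upper tail for 2 degrees of freedom, valid for three groups."""
    return math.exp(-h / 2.0)


# Invented expression values for one gene in three phenotype classes.
expression = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
h_stat = kruskal_wallis_h(expression)
print(h_stat, p_value_three_groups(h_stat))
```

Because the statistic depends only on ranks, rescaling the values monotonically leaves H unchanged, which is the sense in which the test is non-parametric.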
Affiliation(s)
- Jaehyun An
  - Department of Computer Science and Engineering, Seoul National University, Seoul, Republic of Korea
- Kwangsoo Kim
  - Bioinformatics Institute, Seoul National University, Seoul, Republic of Korea
- Heejoon Chae
  - Department of Computer Science, School of Informatics and Computing, Indiana University, Bloomington, IN, USA
- Sun Kim
  - Department of Computer Science and Engineering, Seoul National University, Seoul, Republic of Korea; Bioinformatics Institute, Seoul National University, Seoul, Republic of Korea; Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
|
264
|
Stratification of gene coexpression patterns and GO function mining for a RNA-Seq data series. BIOMED RESEARCH INTERNATIONAL 2014; 2014:969768. [PMID: 24955372 PMCID: PMC4052503 DOI: 10.1155/2014/969768] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 02/16/2014] [Revised: 04/05/2014] [Accepted: 04/06/2014] [Indexed: 11/17/2022]
Abstract
RNA-Seq is emerging as an increasingly important tool in biological research, and it provides the most direct evidence of the relationship between the physiological state and molecular changes in cells. A large amount of RNA-Seq data across diverse experimental conditions have been generated and deposited in public databases. However, most developed approaches for coexpression analyses focus on the coexpression pattern mining of the transcriptome, thereby ignoring the magnitude of gene differences in one pattern. Furthermore, the functional relationships of genes in one pattern, and notably among patterns, were not always recognized. In this study, we developed an integrated strategy to identify differential coexpression patterns of genes and probed the functional mechanisms of the modules. Two real datasets were used to validate the method and allow comparisons with other methods. One of the datasets was selected to illustrate the flow of a typical analysis. In summary, we present an approach to robustly detect coexpression patterns in transcriptomes and to stratify patterns according to their relative differences. Furthermore, a global relationship between patterns and biological functions was constructed. In addition, a freely accessible web toolkit “coexpression pattern mining and GO functional analysis” (COGO) was developed.
|
265
|
Dannemiller KC, Mendell MJ, Macher JM, Kumagai K, Bradman A, Holland N, Harley K, Eskenazi B, Peccia J. Next-generation DNA sequencing reveals that low fungal diversity in house dust is associated with childhood asthma development. INDOOR AIR 2014; 24:236-47. [PMID: 24883433 PMCID: PMC4048861 DOI: 10.1111/ina.12072] [Citation(s) in RCA: 117] [Impact Index Per Article: 11.7] [Indexed: 05/04/2023]
Abstract
Dampness and visible mold in homes are associated with asthma development, but causal mechanisms remain unclear. The goal of this research was to explore associations among measured dampness, fungal exposure, and childhood asthma development without the bias of culture-based microbial analysis. In the low-income, Latino CHAMACOS birth cohort, house dust was collected at age 12 months, and asthma status was determined at age 7 years. The current analysis included 13 asthma cases and 28 controls. Next-generation DNA sequencing methods quantified fungal taxa and diversity. Lower fungal diversity (number of fungal operational taxonomic units) was significantly associated with increased risk of asthma development: unadjusted odds ratio (OR) 4.80 (95% confidence interval (CI) 1.04–22.1). Control for potential confounders strengthened this relationship. Decreased diversity within the genus Cryptococcus was significantly associated with increased asthma risk (OR 21.0, 95% CI 2.16–204). No fungal taxon (species, genus, class) was significantly positively associated with asthma development, and one was significantly negatively associated. Elevated moisture was associated with increased fungal diversity, and moisture/mold indicators were associated with four fungal taxa. Next-generation DNA sequencing provided comprehensive estimates of fungal identity and diversity, demonstrating significant associations between low fungal diversity and childhood asthma development in this community. PRACTICAL IMPLICATIONS Early life exposure to low fungal diversity in house dust was associated with increased risk for later asthma development in this low-income, immigrant community. No individual fungal taxon (species, genus, or class) was associated with asthma development, although exposure to low diversity within the genus Cryptococcus was associated with asthma development.
Future asthma development studies should incorporate fungal diversity measurements, in addition to measuring individual fungal taxa. These results represent a step toward identifying the aspect(s) of indoor microbial populations that are associated with asthma development and suggest that understanding the factors that control diversity in the indoor environment may lead to public health recommendations for asthma prevention in the future.
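Note: for readers unfamiliar with the odds ratios quoted in this abstract, the sketch below shows how an unadjusted odds ratio and a Wald 95% confidence interval are typically computed from a 2x2 table. The abstract does not report the underlying table or the exact method used, so the cell counts here are invented (chosen only so that they add up to 13 cases and 28 controls), the function name is hypothetical, and the resulting interval is not expected to match the published one.

```python
import math
from typing import Tuple


def odds_ratio_wald_ci(
    exposed_cases: int,
    unexposed_cases: int,
    exposed_controls: int,
    unexposed_controls: int,
    z: float = 1.96,
) -> Tuple[float, float, float]:
    """Return (odds ratio, lower bound, upper bound) using the Wald method."""
    cells = (exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls
    )
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Invented counts: "exposed" stands for low fungal diversity.
# 8 + 5 = 13 cases, 7 + 21 = 28 controls.
estimate, low, high = odds_ratio_wald_ci(8, 5, 7, 21)
print(f"OR {estimate:.2f} (95% CI {low:.2f} to {high:.2f})")
```

The wide interval that results from such small cell counts is typical of a study with 13 cases, which is why the abstract reports the interval alongside the point estimate.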
Affiliation(s)
- Karen C. Dannemiller
  - Department of Chemical and Environmental Engineering, Yale University, 9 Hillhouse Ave, PO Box 208286, New Haven, CT 06520, USA
- Mark J. Mendell
  - Indoor Air Quality Section, Environmental Health Laboratory Branch, 850 Marina Bay Parkway, MS G365/EHLB, California Department of Public Health, Richmond, CA 94804, USA
- Janet M. Macher
  - Indoor Air Quality Section, Environmental Health Laboratory Branch, 850 Marina Bay Parkway, MS G365/EHLB, California Department of Public Health, Richmond, CA 94804, USA
- Kazukiyo Kumagai
  - Indoor Air Quality Section, Environmental Health Laboratory Branch, 850 Marina Bay Parkway, MS G365/EHLB, California Department of Public Health, Richmond, CA 94804, USA
- Asa Bradman
  - Center for Environmental Research and Children’s Health (CERCH), School of Public Health, UC Berkeley, 1995 University Ave., Suite 265, Berkeley, CA 94720, USA
- Nina Holland
  - Center for Environmental Research and Children’s Health (CERCH), School of Public Health, UC Berkeley, 1995 University Ave., Suite 265, Berkeley, CA 94720, USA
- Kim Harley
  - Center for Environmental Research and Children’s Health (CERCH), School of Public Health, UC Berkeley, 1995 University Ave., Suite 265, Berkeley, CA 94720, USA
- Brenda Eskenazi
  - Center for Environmental Research and Children’s Health (CERCH), School of Public Health, UC Berkeley, 1995 University Ave., Suite 265, Berkeley, CA 94720, USA
- Jordan Peccia
  - Department of Chemical and Environmental Engineering, Yale University, 9 Hillhouse Ave, PO Box 208286, New Haven, CT 06520, USA
|
266
|
MultiRankSeq: multiperspective approach for RNAseq differential expression analysis and quality control. BIOMED RESEARCH INTERNATIONAL 2014; 2014:248090. [PMID: 24977143 PMCID: PMC4058234 DOI: 10.1155/2014/248090] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Received: 11/23/2013] [Revised: 02/01/2014] [Accepted: 03/15/2014] [Indexed: 11/17/2022]
Abstract
Background. After a decade of microarray technology dominating the field of high-throughput gene expression profiling, the introduction of RNAseq has revolutionized gene expression research. While RNAseq provides more abundant information than microarrays, its analysis has proved considerably more complicated. To date, no consensus has been reached on the best approach for RNAseq-based differential expression analysis. Not surprisingly, different studies have drawn different conclusions as to the best approach for identifying differentially expressed genes, each based on its own criteria and the scenarios considered. Furthermore, the lack of effective quality control may lead to misleading interpretation of results and erroneous conclusions. To address these problems, we propose a simple yet safe and practical rank-sum approach for RNAseq-based differential gene expression analysis named MultiRankSeq. MultiRankSeq first performs quality control assessment. For data meeting the quality control criteria, MultiRankSeq compares the study groups using several of the most commonly applied analytical methods and combines their results to generate a new rank-sum interpretation. MultiRankSeq provides a unique analysis approach to RNAseq differential expression analysis. MultiRankSeq is written in R and is easy to apply. Detailed graphical and tabular analysis reports can be generated with a single command line.
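Note: the rank-sum idea described in this abstract can be illustrated as follows. This is only a sketch of the general technique in Python (the tool itself is written in R), not MultiRankSeq's implementation: the method names, gene names, and p-values are invented, and the choice to rank by p-value and to break ties by input order is an assumption of this note.

```python
from typing import Dict, List, Tuple


def rank_by_p_value(p_values: Dict[str, float]) -> Dict[str, int]:
    """Rank genes 1..n by ascending p-value (ties keep input order)."""
    ordered = sorted(p_values, key=lambda gene: p_values[gene])
    return {gene: rank for rank, gene in enumerate(ordered, start=1)}


def rank_sum(per_method: Dict[str, Dict[str, float]]) -> List[Tuple[str, int]]:
    """Sum each gene's rank across methods; smaller totals mean stronger support."""
    totals: Dict[str, int] = {}
    for p_values in per_method.values():
        for gene, rank in rank_by_p_value(p_values).items():
            totals[gene] = totals.get(gene, 0) + rank
    # Sort by total rank, then by gene name for a deterministic order.
    return sorted(totals.items(), key=lambda item: (item[1], item[0]))


# Invented p-values from three hypothetical differential expression methods.
results = {
    "method_a": {"gene1": 0.001, "gene2": 0.04, "gene3": 0.5},
    "method_b": {"gene1": 0.01, "gene2": 0.2, "gene3": 0.03},
    "method_c": {"gene1": 0.02, "gene2": 0.03, "gene3": 0.9},
}
for gene, total in rank_sum(results):
    print(gene, total)
```

A gene that ranks highly under every method ends up with the smallest total, so the combined ordering rewards agreement between methods rather than an extreme result from a single one.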
|
267
|
Brunner AL, Li J, Guo X, Sweeney RT, Varma S, Zhu SX, Li R, Tibshirani R, West RB. A shared transcriptional program in early breast neoplasias despite genetic and clinical distinctions. Genome Biol 2014; 15:R71. [PMID: 24887547 PMCID: PMC4072957 DOI: 10.1186/gb-2014-15-5-r71] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 02/07/2014] [Accepted: 05/23/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The earliest recognizable stages of breast neoplasia are lesions that represent a heterogeneous collection of epithelial proliferations currently classified based on morphology. Their role in the development of breast cancer is not well understood, but insight into the critical events at this early stage will improve efforts in breast cancer detection and prevention. These microscopic lesions are technically difficult to study, so very little is known about their molecular alterations. RESULTS To characterize the transcriptional changes of early breast neoplasia, we sequenced 3'-end enriched RNAseq libraries from formalin-fixed paraffin-embedded tissue of early neoplasia samples and matched normal breast and carcinoma samples from 25 patients. We find that gene expression patterns within early neoplasias are distinct from both normal and breast cancer patterns and identify a pattern of pro-oncogenic changes, including elevated transcription of ERBB2, FOXA1, and GATA3 at this early stage. We validate these findings on a second independent gene expression profile data set generated by whole transcriptome sequencing. Measurements of protein expression by immunohistochemistry on an independent set of early neoplasias confirm that ER pathway regulators FOXA1 and GATA3, as well as ER itself, are consistently upregulated at this early stage. The early neoplasia samples also demonstrate coordinated changes in long non-coding RNA expression and microenvironment stromal gene expression patterns. CONCLUSIONS This study is the first examination of global gene expression in early breast neoplasia, and the genes identified here represent candidate participants in the earliest molecular events in the development of breast cancer.
|
268
|
Croner RS, Stürzl M, Rau TT, Metodieva G, Geppert CI, Naschberger E, Lausen B, Metodiev MV. Quantitative proteome profiling of lymph node-positive vs. -negative colorectal carcinomas pinpoints MX1 as a marker for lymph node metastasis. Int J Cancer 2014; 135:2878-86. [PMID: 24771638 DOI: 10.1002/ijc.28929] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 12/19/2013] [Accepted: 04/09/2014] [Indexed: 01/26/2023]
Abstract
We used high-resolution mass spectrometry to measure the abundance of more than 9,000 proteins in 19 individually dissected colorectal tumors representing lymph node metastatic (n = 10) and nonmetastatic (n = 9) phenotypes. Statistical analysis identified MX1 and several other proteins as overexpressed in lymph node-positive tumors. MX1, IGF1-R and IRF2BP1 showed significantly different expression on immunohistochemistry in the validation cohort (Wilcoxon test p = 0.007 for IGF1-R, p = 0.04 for IRF2BP1 and p = 0.02 for MX1 at the invasion front). Knockdown of MX1 by siRNA in cell cultures and wound healing assays provided additional evidence for the involvement of this protein in tumor invasion. To our knowledge, the collection of identified and quantified proteins is the largest tumor proteome dataset available at present. The identified proteins can give insights into the mechanisms of lymphatic metastasis in colorectal carcinoma and may act as prognostic markers and therapeutic targets after further prospective validation.
Affiliation(s)
- Roland S Croner
  - Department of Surgery, University Hospital Erlangen, Erlangen, Germany
|
269
|
Zhou X, Lindsay H, Robinson MD. Robustly detecting differential expression in RNA sequencing data using observation weights. Nucleic Acids Res 2014; 42:e91. [PMID: 24753412 PMCID: PMC4066750 DOI: 10.1093/nar/gku310] [Citation(s) in RCA: 295] [Impact Index Per Article: 29.5] [Indexed: 01/13/2023] Open
Abstract
A popular approach for comparing gene expression levels between (replicated) conditions of RNA sequencing data relies on counting reads that map to features of interest. Within such count-based methods, many flexible and advanced statistical approaches now exist and offer the ability to adjust for covariates (e.g. batch effects). Often, these methods include some sort of ‘sharing of information’ across features to improve inferences in small samples. It is important to achieve an appropriate tradeoff between statistical power and protection against outliers. Here, we study the robustness of existing approaches for count-based differential expression analysis and propose a new strategy based on observation weights that can be used within existing frameworks. The results suggest that outliers can have a global effect on differential analyses. We demonstrate the effectiveness of our new approach with real data and simulated data that reflects properties of real datasets (e.g. dispersion-mean trend) and develop an extensible framework for comprehensive testing of current and future methods. In addition, we explore the origin of such outliers, in some cases highlighting additional biological or technical factors within the experiment. Further details can be downloaded from the project website: http://imlspenticton.uzh.ch/robinson_lab/edgeR_robust/.
Affiliation(s)
- Xiaobei Zhou
  - Institute of Molecular Life Sciences, University of Zurich, CH-8057 Zurich, Switzerland; SIB Swiss Institute of Bioinformatics, University of Zurich, CH-8057 Zurich, Switzerland
- Helen Lindsay
  - Institute of Molecular Life Sciences, University of Zurich, CH-8057 Zurich, Switzerland; SIB Swiss Institute of Bioinformatics, University of Zurich, CH-8057 Zurich, Switzerland
- Mark D Robinson
  - Institute of Molecular Life Sciences, University of Zurich, CH-8057 Zurich, Switzerland; SIB Swiss Institute of Bioinformatics, University of Zurich, CH-8057 Zurich, Switzerland
|
270
|
Poole A, Urbanek C, Eng C, Schageman J, Jacobson S, O'Connor BP, Galanter JM, Gignoux CR, Roth LA, Kumar R, Lutz S, Liu AH, Fingerlin TE, Setterquist RA, Burchard EG, Rodriguez-Santana J, Seibold MA. Dissecting childhood asthma with nasal transcriptomics distinguishes subphenotypes of disease. J Allergy Clin Immunol 2014; 133:670-8.e12. [PMID: 24495433 PMCID: PMC4043390 DOI: 10.1016/j.jaci.2013.11.025] [Citation(s) in RCA: 192] [Impact Index Per Article: 19.2] [Received: 08/21/2013] [Revised: 11/07/2013] [Accepted: 11/20/2013] [Indexed: 10/25/2022]
Abstract
BACKGROUND Bronchial airway expression profiling has identified inflammatory subphenotypes of asthma, but the invasiveness of this technique has limited its application to childhood asthma. OBJECTIVES We sought to determine whether the nasal transcriptome can proxy expression changes in the lung airway transcriptome in asthmatic patients. We also sought to determine whether the nasal transcriptome can distinguish subphenotypes of asthma. METHODS Whole-transcriptome RNA sequencing was performed on nasal airway brushings from 10 control subjects and 10 asthmatic subjects, which were compared with established bronchial and small-airway transcriptomes. Targeted RNA sequencing nasal expression analysis was used to profile 105 genes in 50 asthmatic subjects and 50 control subjects for differential expression and clustering analyses. RESULTS We found 90.2% overlap in expressed genes and strong correlation in gene expression (ρ = 0.87) between the nasal and bronchial transcriptomes. Previously observed asthmatic bronchial differential expression was strongly correlated with asthmatic nasal differential expression (ρ = 0.77, P = 5.6 × 10⁻⁹). Clustering analysis identified TH2-high and TH2-low subjects differentiated by expression of 70 genes, including IL13, IL5, periostin (POSTN), calcium-activated chloride channel regulator 1 (CLCA1), and serpin peptidase inhibitor, clade B (SERPINB2). TH2-high subjects were more likely to have atopy (odds ratio, 10.3; P = 3.5 × 10⁻⁶), atopic asthma (odds ratio, 32.6; P = 6.9 × 10⁻⁷), high blood eosinophil counts (odds ratio, 9.1; P = 2.6 × 10⁻⁶), and rhinitis (odds ratio, 8.3; P = 4.1 × 10⁻⁶) compared with TH2-low subjects. Nasal IL13 expression levels were 3.9-fold higher in asthmatic participants who experienced an asthma exacerbation in the past year (P = 0.01). Several differentially expressed nasal genes were specific to asthma and independent of atopic status.
CONCLUSION Nasal airway gene expression profiles largely recapitulate expression profiles in the lung airways. Nasal expression profiling can be used to identify subjects with IL13-driven asthma and a TH2-skewed systemic immune response.
Affiliation(s)
- Alex Poole
  - Integrated Center for Genes, Environment, and Health, National Jewish Health, Denver, Colo
- Cydney Urbanek
  - Integrated Center for Genes, Environment, and Health, National Jewish Health, Denver, Colo
- Celeste Eng
  - Department of Medicine, University of California-San Francisco, San Francisco, Calif
- Sean Jacobson
  - Departments of Epidemiology and Biostatistics, Colorado School of Public Health, Aurora, Colo
- Brian P O'Connor
  - Integrated Center for Genes, Environment, and Health, National Jewish Health, Denver, Colo; Integrated Department of Immunology, National Jewish Health and the University of Colorado-Denver, Denver, Colo
- Joshua M Galanter
  - Department of Medicine, University of California-San Francisco, San Francisco, Calif; Department of Bioengineering and Therapeutic Sciences, University of California-San Francisco, San Francisco, Calif
- Christopher R Gignoux
  - Department of Medicine, University of California-San Francisco, San Francisco, Calif; Department of Bioengineering and Therapeutic Sciences, University of California-San Francisco, San Francisco, Calif
- Lindsey A Roth
  - Department of Medicine, University of California-San Francisco, San Francisco, Calif
- Rajesh Kumar
  - Ann and Robert H. Lurie Children's Hospital of Chicago and the Feinberg School of Medicine, Northwestern University, Chicago, Ill
- Sharon Lutz
  - Departments of Epidemiology and Biostatistics, Colorado School of Public Health, Aurora, Colo
- Andrew H Liu
  - Department of Pediatrics, National Jewish Health, Denver, Colo
- Tasha E Fingerlin
  - Departments of Epidemiology and Biostatistics, Colorado School of Public Health, Aurora, Colo
- Esteban G Burchard
  - Department of Medicine, University of California-San Francisco, San Francisco, Calif; Department of Bioengineering and Therapeutic Sciences, University of California-San Francisco, San Francisco, Calif
- Max A Seibold
  - Integrated Center for Genes, Environment, and Health, National Jewish Health, Denver, Colo; Department of Pediatrics, National Jewish Health, Denver, Colo; Division of Pulmonary Sciences and Critical Care Medicine, Department of Medicine, University of Colorado-Denver, Denver, Colo
|
271
|
Oh S, Song S, Dasgupta N, Grabowski G. The analytical landscape of static and temporal dynamics in transcriptome data. Front Genet 2014; 5:35. [PMID: 24600473 PMCID: PMC3929947 DOI: 10.3389/fgene.2014.00035] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 09/11/2013] [Accepted: 01/30/2014] [Indexed: 12/16/2022] Open
Abstract
Interpreting gene expression profiles often involves statistical analysis of large numbers of differentially expressed genes, isoforms, and alternative splicing events in either static or dynamic settings. Reduced sequencing costs have made dense time-series analysis of gene expression using RNA-seq feasible; however, statistical methods for temporal RNA-seq data are poorly developed. Here we review current methods for identifying temporal changes in gene expression using RNA-seq, which are limited to static pairwise comparisons of time points and fail to account for temporal dependencies in gene expression patterns. We also review the few recently developed methods specific to temporal dynamic RNA-seq data. Such RNA-seq-specific temporal dynamic methods are under continuous development, yet the field is still in its infancy. We fully cover microarray-specific temporal methods and transcriptome studies based on early digital technologies (e.g., SAGE) that came between traditional microarrays and RNA-seq.
Affiliation(s)
- Sunghee Oh
  - Division of Human Genetics, Department of Pediatrics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Seongho Song
  - Department of Mathematical Sciences, McMicken College of Arts and Sciences, University of Cincinnati, Cincinnati, OH, USA
- Nupur Dasgupta
  - Division of Human Genetics, Department of Pediatrics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- Gregory Grabowski
  - Division of Human Genetics, Department of Pediatrics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
|
272
|
Rosa BA, Jasmer DP, Mitreva M. Genome-wide tissue-specific gene expression, co-expression and regulation of co-expressed genes in adult nematode Ascaris suum. PLoS Negl Trop Dis 2014; 8:e2678. [PMID: 24516681 PMCID: PMC3916258 DOI: 10.1371/journal.pntd.0002678] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.9] [Received: 06/21/2013] [Accepted: 12/18/2013] [Indexed: 01/10/2023] Open
Abstract
BACKGROUND Caenorhabditis elegans has traditionally been used as a model for studying nematode biology, but its small size limits the ability of researchers to perform some experiments, such as high-throughput tissue-specific gene expression studies. However, the dissection of individual tissues is possible in the parasitic nematode Ascaris suum due to its relatively large size. Here, we take advantage of the recent genome sequencing of Ascaris suum and the ability to physically dissect its separate tissues to produce wide-scale, tissue-specific nematode RNA-seq datasets, including data on three non-reproductive tissues (head, pharynx, and intestine) in both male and female worms, as well as four reproductive tissues (testis, seminal vesicle, ovary, and uterus). We obtained fundamental information about the biology of diverse cell types and potential interactions among tissues within this multicellular organism. METHODOLOGY/PRINCIPAL FINDINGS Overexpression and functional enrichment analyses identified many putative biological functions enriched in each tissue studied, including functions which have not been previously studied in detail in nematodes. Putative tissue-specific transcription factors and corresponding binding motifs that regulate expression in each tissue were identified, including the intestine-enriched ELT-2 motif/transcription factor previously described in nematode intestines. Constitutively expressed and novel genes were also characterized, with the largest number of novel genes found to be overexpressed in the testis. Finally, a putative acetylcholine-mediated transcriptional network connecting biological activity in the head to the male reproductive system is described using co-expression networks, along with a similar ecdysone-mediated system in the female.
CONCLUSIONS/SIGNIFICANCE The expression profiles, co-expression networks and co-expression regulation of the 10 tissues studied and the tissue-specific analysis presented here are a valuable resource for studying tissue-specific biological functions in nematodes.
Affiliation(s)
- Bruce A. Rosa
  - The Genome Institute, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Douglas P. Jasmer
  - Department of Veterinary Microbiology and Pathology, Washington State University, Pullman, Washington, United States of America
- Makedonka Mitreva
  - The Genome Institute, Washington University School of Medicine, St. Louis, Missouri, United States of America
  - Department of Medicine, Division of Infectious Diseases, Washington University School of Medicine, St. Louis, Missouri, United States of America
  - Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
|
273
|
Rodríguez Cubillos AE, Perlaza-Jiménez L, Bernal Giraldo AJ. RNA-Seq Data Analysis in Prokaryotes: A Review for Non-experts. ACTA BIOLÓGICA COLOMBIANA 2014. [DOI: 10.15446/abc.v19n2.41010] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/23/2022] Open
|
274
|
Transforming RNA-Seq data to improve the performance of prognostic gene signatures. PLoS One 2014; 9:e85150. [PMID: 24416353 PMCID: PMC3885686 DOI: 10.1371/journal.pone.0085150] [Citation(s) in RCA: 90] [Impact Index Per Article: 9.0] [Received: 07/26/2013] [Accepted: 11/24/2013] [Indexed: 12/29/2022] Open
Abstract
Gene expression measurements have successfully been used for building prognostic signatures, i.e., for identifying a short list of important genes that can predict patient outcome. Mostly microarray measurements have been considered, and there is little advice available for building multivariable risk prediction models from RNA-Seq data. We specifically consider penalized regression techniques, such as the lasso and componentwise boosting, which can simultaneously consider all measurements and provide both multivariable regression models for prediction and automated variable selection. However, they might be affected by the typical skewness, mean-variance dependency, or extreme values of RNA-Seq covariates and therefore could benefit from transformations of the latter. In an analytical part, we highlight preferential selection of covariates with large variances, which is problematic due to the mean-variance dependency of RNA-Seq data. In a simulation study, we compare different transformations of RNA-Seq data for potentially improving detection of important genes. Specifically, we consider standardization, the log transformation, a variance-stabilizing transformation, the Box-Cox transformation, and rank-based transformations. In addition, the prediction performance for real data from patients with kidney cancer and acute myeloid leukemia is considered. We show that signature size, identification performance, and prediction performance critically depend on the choice of a suitable transformation. Rank-based transformations perform well in all scenarios and can even outperform complex variance-stabilizing approaches. Generally, the results illustrate that the distribution and potential transformations of RNA-Seq data need to be considered as a critical step when building risk prediction models by penalized regression techniques.
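Note: two of the transformations listed in this abstract, the log transformation and a rank-based transformation, can be sketched as below. The exact definitions used in the paper are not given in the abstract, so the pseudocount of 1, the base-2 logarithm, and the scaling of ranks to the open interval (0, 1) are assumptions of this note, and the counts are invented.

```python
import math
from typing import List, Sequence


def log_transform(counts: Sequence[float], pseudocount: float = 1.0) -> List[float]:
    """log2(count + pseudocount); the pseudocount keeps zero counts finite."""
    return [math.log2(c + pseudocount) for c in counts]


def rank_transform(counts: Sequence[float]) -> List[float]:
    """Replace each count by its average rank divided by (n + 1)."""
    n = len(counts)
    order = sorted(range(n), key=lambda i: counts[i])
    scaled = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        # Group tied counts so they receive the same average rank.
        while end + 1 < n and counts[order[end + 1]] == counts[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            scaled[order[k]] = mean_rank / (n + 1)
        pos = end + 1
    return scaled


# Invented counts for one gene across four samples, with one extreme value.
gene_counts = [0, 10, 10, 250000]
print(log_transform(gene_counts))
print(rank_transform(gene_counts))
```

After the rank-based transformation the extreme count occupies the top rank and nothing more, which is one plausible reason such transformations limit the influence of skewed or outlying RNA-Seq covariates in penalized regression.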
|
275
|
Zheng CL, Kawane S, Bottomly D, Wilmot B. Analysis considerations for utilizing RNA-Seq to characterize the brain transcriptome. INTERNATIONAL REVIEW OF NEUROBIOLOGY 2014; 116:21-54. [PMID: 25172470 DOI: 10.1016/b978-0-12-801105-8.00002-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 01/01/2023]
Abstract
RNA-Seq allows one to examine not only gene expression but also the expression of noncoding RNAs, alternative splicing, and allele-specific expression. With this increased sensitivity and dynamic range come computational and statistical considerations, which are highly dependent on the biological question being asked. We highlight these to provide an overview of their importance and the impact they can have on downstream interpretation of the brain transcriptome.
Affiliation(s)
- Christina L Zheng
  - Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, Oregon, USA; Knight Cancer Institute, Oregon Health, Oregon Health and Science University, Portland, Oregon, USA
- Sunita Kawane
  - Clinical & Translational Research Institute, Oregon Health and Science University, Portland, Oregon, USA
- Daniel Bottomly
  - Clinical & Translational Research Institute, Oregon Health and Science University, Portland, Oregon, USA
- Beth Wilmot
  - Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, Oregon, USA; Clinical & Translational Research Institute, Oregon Health and Science University, Portland, Oregon, USA
|
276
|
Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 2014. [PMID: 25516281 DOI: 10.1101/002832] [Citation(s) in RCA: 109] [Impact Index Per Article: 10.9] [Indexed: 05/15/2023]
Abstract
In comparative high-throughput sequencing assays, a fundamental task is the analysis of count data, such as read counts per gene in RNA-seq, for evidence of systematic changes across experimental conditions. Small replicate numbers, discreteness, large dynamic range and the presence of outliers require a suitable statistical approach. We present DESeq2, a method for differential analysis of count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates. This enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression. The DESeq2 package is available at http://www.bioconductor.org/packages/release/bioc/html/DESeq2.html.
|
277
|
Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 2014; 15:550. [PMID: 25516281 DOI: 10.1186/s13059-014-0550-8] [Citation(s) in RCA: 125] [Impact Index Per Article: 12.5] [Indexed: 05/20/2023]
|
278
|
Fernandes AD, Reid JNS, Macklaim JM, McMurrough TA, Edgell DR, Gloor GB. Unifying the analysis of high-throughput sequencing datasets: characterizing RNA-seq, 16S rRNA gene sequencing and selective growth experiments by compositional data analysis. Microbiome 2014; 2:15. [PMID: 24910773 PMCID: PMC4030730 DOI: 10.1186/2049-2618-2-15] [Citation(s) in RCA: 684] [Impact Index Per Article: 68.4] [Received: 02/06/2014] [Accepted: 03/25/2014] [Indexed: 05/09/2023]
Abstract
BACKGROUND Experimental designs that take advantage of high-throughput sequencing to generate datasets include RNA sequencing (RNA-seq), chromatin immunoprecipitation sequencing (ChIP-seq), sequencing of 16S rRNA gene fragments, metagenomic analysis and selective growth experiments. In each case the underlying data are similar and are composed of counts of sequencing reads mapped to a large number of features in each sample. Despite this underlying similarity, the data analysis methods used for these experimental designs are all different, and do not translate across experiments. Alternative methods have been developed in the physical and geological sciences that treat similar data as compositions. Compositional data analysis methods transform the data to relative abundances with the result that the analyses are more robust and reproducible. RESULTS Data from an in vitro selective growth experiment, an RNA-seq experiment and the Human Microbiome Project 16S rRNA gene abundance dataset were examined by ALDEx2, a compositional data analysis tool that uses Bayesian methods to infer technical and statistical error. The ALDEx2 approach is shown to be suitable for all three types of data: it correctly identifies both the direction and differential abundance of features in the differential growth experiment, it identifies a substantially similar set of differentially expressed genes in the RNA-seq dataset as the leading tools and it identifies as differential the taxa that distinguish the tongue dorsum and buccal mucosa in the Human Microbiome Project dataset. The design of ALDEx2 reduces the number of false positive identifications that result from datasets composed of many features in few samples. 
CONCLUSION Statistical analysis of high-throughput sequencing datasets composed of per feature counts showed that the ALDEx2 R package is a simple and robust tool, which can be applied to RNA-seq, 16S rRNA gene sequencing and differential growth datasets, and by extension to other techniques that use a similar approach.
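To make the idea of treating read counts as compositions concrete, the sketch below computes a centred log-ratio (CLR) transform, a standard transform in compositional data analysis. The abstract does not say which transform is applied, so the choice of CLR, the function name and the default pseudocount are assumptions for illustration, not a description of the ALDEx2 implementation.

```python
import math

def clr(counts, pseudocount=0.5):
    """Centred log-ratio: log of each value minus the mean log across features.

    The pseudocount is an assumption used here to avoid log(0) for zero counts;
    with pseudocount=0.0 every count must be strictly positive.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    center = sum(logs) / len(logs)
    return [v - center for v in logs]

# The same sample sequenced at two depths: counts differ tenfold,
# but the relative abundances are identical.
shallow = clr([1, 2, 4], pseudocount=0.0)
deep = clr([10, 20, 40], pseudocount=0.0)
```

Without a pseudocount, `shallow` and `deep` agree up to floating-point error, which shows that the transform depends only on relative abundances and not on the total number of reads.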
Affiliation(s)
- Jennifer NS Reid
- Department of Biochemistry, Medical Science Building, University of Western Ontario, 1151 Richmond St, N6A 5C1, London, ON, Canada
- Jean M Macklaim
- Department of Biochemistry, Medical Science Building, University of Western Ontario, 1151 Richmond St, N6A 5C1, London, ON, Canada
- Thomas A McMurrough
- Department of Biochemistry, Medical Science Building, University of Western Ontario, 1151 Richmond St, N6A 5C1, London, ON, Canada
- David R Edgell
- Department of Biochemistry, Medical Science Building, University of Western Ontario, 1151 Richmond St, N6A 5C1, London, ON, Canada
- Gregory B Gloor
- Department of Biochemistry, Medical Science Building, University of Western Ontario, 1151 Richmond St, N6A 5C1, London, ON, Canada
|
279
|
Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol 2014. [PMID: 25516281 DOI: 10.1186/s13059-014-0550-8] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Indexed: 04/25/2023]
|
280
|
Seyednasrollah F, Laiho A, Elo LL. Comparison of software packages for detecting differential expression in RNA-seq studies. Brief Bioinform 2013; 16:59-70. [PMID: 24300110 PMCID: PMC4293378 DOI: 10.1093/bib/bbt086] [Citation(s) in RCA: 254] [Impact Index Per Article: 23.1] [Indexed: 12/23/2022]
Abstract
RNA-sequencing (RNA-seq) has rapidly become a popular tool to characterize transcriptomes. A fundamental research problem in many RNA-seq studies is the identification of reliable molecular markers that show differential expression between distinct sample groups. Together with the growing popularity of RNA-seq, a number of data analysis methods and pipelines have already been developed for this task. Currently, however, there is no clear consensus about the best practices yet, which makes the choice of an appropriate method a daunting task especially for a basic user without a strong statistical or computational background. To assist the choice, we perform here a systematic comparison of eight widely used software packages and pipelines for detecting differential expression between sample groups in a practical research setting and provide general guidelines for choosing a robust pipeline. In general, our results demonstrate how the data analysis tool utilized can markedly affect the outcome of the data analysis, highlighting the importance of this choice.
|
281
|
Yang TY, Jeong S. Grouped False-Discovery Rate for Removing the Gene-Set-Level Bias of RNA-seq. Evol Bioinform Online 2013; 9:467-78. [PMID: 24277981 PMCID: PMC3836564 DOI: 10.4137/ebo.s13099] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/07/2022]
Abstract
In recent years, RNA-seq has become a very competitive alternative to microarrays. In RNA-seq experiments, the expected read count for a gene is proportional to its expression level multiplied by its transcript length. Even when two genes are expressed at the same level, differences in length will yield differing numbers of total reads. The characteristics of these RNA-seq experiments create a gene-level bias such that the proportion of significantly differentially expressed genes increases with the transcript length, whereas such bias is not present in microarray data. Gene-set analysis seeks to identify the gene sets that are enriched in the list of the identified significant genes. In the gene-set analysis of RNA-seq, the gene-level bias subsequently yields the gene-set-level bias that a gene set with genes of long length will be more likely to show up as enriched than will a gene set with genes of shorter length. Because gene expression is not related to its transcript length, any gene set containing long genes is not of biologically greater interest than gene sets with shorter genes. Accordingly the gene-set-level bias should be removed to accurately calculate the statistical significance of each gene-set enrichment in the RNA-seq. We present a new gene set analysis method of RNA-seq, called FDRseq, which can accurately calculate the statistical significance of a gene-set enrichment score by the grouped false-discovery rate. Numerical examples indicated that FDRseq is appropriate for controlling the transcript length bias in the gene-set analysis of RNA-seq data. To implement FDRseq, we developed the R program, which can be downloaded at no cost from http://home.mju.ac.kr/home/index.action?siteId=tyang.
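The premise stated in this abstract, that the expected read count is proportional to expression level multiplied by transcript length, can be sketched in a few lines. The function names and the `depth_factor` parameter are hypothetical, and the example only illustrates the proportionality, not the FDRseq method or its grouped false-discovery rate.

```python
def expected_reads(expression, length_bp, depth_factor=1.0):
    """Expected count under the stated model: proportional to expression * length.

    depth_factor is a hypothetical constant standing in for sequencing depth.
    """
    return depth_factor * expression * length_bp

def reads_per_kb(count, length_bp):
    """Divide a count by transcript length in kilobases."""
    return count / (length_bp / 1000)

# Two genes expressed at the same level but differing fourfold in length.
short_gene = expected_reads(expression=2.0, length_bp=1000)
long_gene = expected_reads(expression=2.0, length_bp=4000)
```

The longer gene collects four times as many reads at equal expression. Dividing by length puts the two values back on a common scale, but it does not change the fact that the longer gene was measured with more reads, which is the source of the detection bias the abstract describes.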
Affiliation(s)
- Tae Young Yang
- Department of Mathematics, Myongji University, Yongin, Kyonggi, Korea 449-728
|
282
|
Klambauer G, Unterthiner T, Hochreiter S. DEXUS: identifying differential expression in RNA-Seq studies with unknown conditions. Nucleic Acids Res 2013; 41:e198. [PMID: 24049071 PMCID: PMC3834838 DOI: 10.1093/nar/gkt834] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Indexed: 02/07/2023]
Abstract
Detection of differential expression in RNA-Seq data is currently limited to studies in which two or more sample conditions are known a priori. However, these biological conditions are typically unknown in cohort, cross-sectional and nonrandomized controlled studies such as the HapMap, the ENCODE or the 1000 Genomes project. We present DEXUS for detecting differential expression in RNA-Seq data for which the sample conditions are unknown. DEXUS models read counts as a finite mixture of negative binomial distributions in which each mixture component corresponds to a condition. A transcript is considered differentially expressed if modeling of its read counts requires more than one condition. DEXUS decomposes read count variation into variation due to noise and variation due to differential expression. Evidence of differential expression is measured by the informative/noninformative (I/NI) value, which allows differentially expressed transcripts to be extracted at a desired specificity (significance level) or sensitivity (power). DEXUS performed excellently in identifying differentially expressed transcripts in data with unknown conditions. On 2400 simulated data sets, I/NI value thresholds of 0.025, 0.05 and 0.1 yielded average specificities of 92, 97 and 99% at sensitivities of 76, 61 and 38%, respectively. On real-world data sets, DEXUS was able to detect differentially expressed transcripts related to sex, species, tissue, structural variants or quantitative trait loci. The DEXUS R package is publicly available from Bioconductor and the scripts for all experiments are available at http://www.bioinf.jku.at/software/dexus/.
Affiliation(s)
- Günter Klambauer
- Institute of Bioinformatics, Johannes Kepler University, A-4040 Linz, Austria
|
283
|
Kozubek J, Ma Z, Fleming E, Duggan T, Wu R, Shin DG, Dadras SS. In-depth characterization of microRNA transcriptome in melanoma. PLoS One 2013; 8:e72699. [PMID: 24023765 PMCID: PMC3762816 DOI: 10.1371/journal.pone.0072699] [Citation(s) in RCA: 95] [Impact Index Per Article: 8.6] [Received: 05/20/2013] [Accepted: 07/10/2013] [Indexed: 01/09/2023]
Abstract
The full repertoire of human microRNAs (miRNAs) that could distinguish common (benign) nevi from cutaneous (malignant) melanomas remains to be established. In an effort to gain further insight into the role of miRNAs in melanoma, we applied the Illumina next-generation sequencing (NGS) platform to carry out an in-depth analysis of the miRNA transcriptome in biopsies of nevi, thick primary (>4.0 mm) and metastatic melanomas with matched normal skin, in parallel with melanocytes and melanoma cell lines (both primary and metastatic) (n = 28). From these data, representing 698 known miRNAs, we defined a top-40 list that properly classified normal from cancer, confirming 23 (58%) previously discovered miRNAs while introducing an additional 17 (42%) known and top-15 putative novel candidate miRNAs deregulated during melanoma progression. Surprisingly, the miRNA signature distinguishing specimens of melanoma from nevus was significantly different from that of melanoma cell lines from melanocytes. Among the top list, miR-203, miR-204-5p, miR-205-5p, miR-211-5p, miR-23b-3p, miR-26a-5p and miR-26b-5p were decreased in melanomas vs. nevi. In a validation cohort (n = 101), we verified the NGS results by qRT-PCR and showed that receiver-operating characteristic curves for miR-211-5p expression accurately discriminated invasive melanoma (AUC = 0.933), melanoma in situ (AUC = 0.933) and dysplastic (atypical) nevi (AUC = 0.951) from common nevi. Target prediction analysis of co-transcribed miRNAs showed a cooperative regulation of key elements in the MAPK signaling pathway. Furthermore, we found extensive sequence variations (isomiRs) and other non-coding small RNAs revealing a complex melanoma transcriptome. Deep-sequencing small RNAs directly from clinically defined specimens provides a robust strategy to improve melanoma diagnostics.
Affiliation(s)
- James Kozubek
- Department of Genetics and Developmental Biology, University of Connecticut Health Center, Farmington, Connecticut, United States of America
- Department of Computer Science and Engineering, University of Connecticut, Storrs, Connecticut, United States of America
- Zhihai Ma
- Department of Genetics, Stanford University School of Medicine, Stanford, California, United States of America
- Elizabeth Fleming
- Department of Genetics and Developmental Biology, University of Connecticut Health Center, Farmington, Connecticut, United States of America
- Tatiana Duggan
- Department of Genetics and Developmental Biology, University of Connecticut Health Center, Farmington, Connecticut, United States of America
- Rong Wu
- Connecticut Institute for Clinical and Translational Science Biostatistics Center, University of Connecticut Health Center, Farmington, Connecticut, United States of America
- Dong-Guk Shin
- Department of Computer Science and Engineering, University of Connecticut, Storrs, Connecticut, United States of America
- Soheil S. Dadras
- Department of Genetics and Developmental Biology, University of Connecticut Health Center, Farmington, Connecticut, United States of America
- Department of Dermatology, University of Connecticut Health Center, Farmington, Connecticut, United States of America
|
284
|
Katayama S, Töhönen V, Linnarsson S, Kere J. SAMstrt: statistical test for differential expression in single-cell transcriptome with spike-in normalization. Bioinformatics 2013; 29:2943-5. [PMID: 23995393 PMCID: PMC3810855 DOI: 10.1093/bioinformatics/btt511] [Citation(s) in RCA: 79] [Impact Index Per Article: 7.2] [Indexed: 11/17/2022]
Abstract
Motivation: Recent transcriptome studies have revealed that total transcript numbers vary by cell type and condition; therefore, the statistical assumptions for single-cell transcriptome studies must be revisited. SAMstrt is an extension of SAMseq, a statistical method for differential expression, that enables spike-in normalization and statistical testing based on the estimated absolute number of transcripts per cell for single-cell RNA-seq methods. Availability and Implementation: SAMstrt is implemented in R and available on GitHub (https://github.com/shka/R-SAMstrt). Contact: shintaro.katayama@ki.se Supplementary Information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Shintaro Katayama
- Department of Biosciences and Nutrition, Karolinska Institutet, 141 83 Huddinge, Sweden, Science for Life Laboratory, Karolinska Institutet Science Park, 171 21 Solna, Sweden and Department of Medical Biochemistry and Biophysics, Karolinska Institutet, 171 77 Stockholm, Sweden
|
285
|
Guo Y, Sheng Q, Li J, Ye F, Samuels DC, Shyr Y. Large scale comparison of gene expression levels by microarrays and RNAseq using TCGA data. PLoS One 2013; 8:e71462. [PMID: 23977046 PMCID: PMC3748065 DOI: 10.1371/journal.pone.0071462] [Citation(s) in RCA: 140] [Impact Index Per Article: 12.7] [Received: 02/04/2013] [Accepted: 07/02/2013] [Indexed: 01/26/2023]
Abstract
RNAseq and microarray methods are frequently used to measure gene expression level. While similar in purpose, there are fundamental differences between the two technologies. Here, we present the largest comparative study between microarray and RNAseq methods to date using The Cancer Genome Atlas (TCGA) data. We found high correlations between expression data obtained from the Affymetrix one-channel microarray and RNAseq (Spearman correlation coefficients of ∼0.8). We also observed that the low abundance genes had poorer correlations between microarray and RNAseq data than high abundance genes. As expected, due to measurement and normalization differences, Agilent two-channel microarray and RNAseq data were poorly correlated (Spearman correlation coefficients of only ∼0.2). By examining the differentially expressed genes between tumor and normal samples we observed reasonable concordance in directionality between Agilent two-channel microarray and RNAseq data, although a small group of genes were found to have expression changes reported in opposite directions using these two technologies. Overall, RNAseq produces comparable results to microarray technologies in terms of expression profiling. The RNAseq normalization methods RPKM and RSEM produce similar results on the gene level and reasonably concordant results on the exon level. Longer exons tended to have better concordance between the two normalization methods than shorter exons.
Affiliation(s)
- Yan Guo
- Center for Quantitative Sciences, Vanderbilt University, Nashville, Tennessee, United States of America
- Quanhu Sheng
- Center for Quantitative Sciences, Vanderbilt University, Nashville, Tennessee, United States of America
- Jiang Li
- Center for Quantitative Sciences, Vanderbilt University, Nashville, Tennessee, United States of America
- Fei Ye
- Center for Quantitative Sciences, Vanderbilt University, Nashville, Tennessee, United States of America
- David C. Samuels
- Center for Human Genetics Research, Dept. of Molecular Physiology and Biophysics, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Yu Shyr
- Center for Quantitative Sciences, Vanderbilt University, Nashville, Tennessee, United States of America
|
286
|
Rapicavoli NA, Qu K, Zhang J, Mikhail M, Laberge RM, Chang HY. A mammalian pseudogene lncRNA at the interface of inflammation and anti-inflammatory therapeutics. eLife 2013; 2:e00762. [PMID: 23898399 PMCID: PMC3721235 DOI: 10.7554/elife.00762] [Citation(s) in RCA: 364] [Impact Index Per Article: 33.1] [Received: 03/18/2013] [Accepted: 06/18/2013] [Indexed: 12/30/2022]
Abstract
Pseudogenes are thought to be inactive gene sequences, but recent evidence of extensive pseudogene transcription raised the question of potential function. Here we discover and characterize the sets of mouse lncRNAs induced by inflammatory signaling via TNFα. TNFα regulates hundreds of lncRNAs, including 54 pseudogene lncRNAs, several of which show exquisitely selective expression in response to specific cytokines and microbial components in a NF-κB-dependent manner. Lethe, a pseudogene lncRNA, is selectively induced by proinflammatory cytokines via NF-κB or glucocorticoid receptor agonist, and functions in negative feedback signaling to NF-κB. Lethe interacts with NF-κB subunit RelA to inhibit RelA DNA binding and target gene activation. Lethe level decreases with organismal age, a physiological state associated with increased NF-κB activity. These findings suggest that expression of pseudogene lncRNAs is actively regulated and that these lncRNAs constitute functional regulators of inflammatory signaling. DOI: http://dx.doi.org/10.7554/eLife.00762.001
The simplest account of gene expression is that DNA is transcribed into messenger RNA, which is then translated into a protein. However, not all genes encode proteins; for some it is the RNA molecule itself that is the end product. Many of these ‘non-coding RNAs’ are thought to be involved in regulating the expression of other genes, but their exact functions are unknown. Pseudogenes are genes that have lost their protein-coding abilities as a result of mutations they have accumulated over the course of evolution. They were previously referred to as ‘junk DNA’ or ‘dead genes’ because they were thought to be completely non-functional, lacking even the ability to encode RNA. However, recent work has shown that pseudogenes are in fact transcribed into long non-coding RNAs, and these are now the focus of much research. Here, Rapicavoli et al. report that certain pseudogenes and long non-coding RNAs are involved in regulating the immune response. Specific and distinct pseudogene-derived long RNAs are made when cells are exposed to different kinds of infections. Immune cells such as macrophages and lymphocytes produce a protein called tumor necrosis factor alpha (TNFα), which is involved in triggering fever and inflammation. TNFα exerts these effects by binding to and activating a transcription factor called NF-κB, which then moves to the nucleus and binds to DNA, regulating the expression of genes that encode immune proteins. Rapicavoli et al. found that the production of a long non-coding RNA called Lethe (after the ‘river of forgetfulness’ in Greek mythology) increases when TNFα activates NF-κB. Surprisingly, however, Lethe then binds to NF-κB and prevents it from interacting with DNA, thereby reducing the production of various inflammatory proteins. This is the first time that a pseudogene has been shown to have an active role in regulating signaling pathways involved in inflammation, and raises the possibility that other pseudogenes may also influence distinct feedback loops and signaling networks. It suggests that many novel functions for pseudogenes and long non-coding RNAs remain to be discovered. DOI: http://dx.doi.org/10.7554/eLife.00762.002
Affiliation(s)
- Nicole A Rapicavoli
- Program in Epithelial Biology, Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, United States
|
287
|
Parallel comparison of Illumina RNA-Seq and Affymetrix microarray platforms on transcriptomic profiles generated from 5-aza-deoxy-cytidine treated HT-29 colon cancer cells and simulated datasets. BMC Bioinformatics 2013; 14 Suppl 9:S1. [PMID: 23902433 PMCID: PMC3697991 DOI: 10.1186/1471-2105-14-s9-s1] [Citation(s) in RCA: 73] [Impact Index Per Article: 6.6] [Indexed: 12/19/2022]
Abstract
BACKGROUND High throughput parallel sequencing, RNA-Seq, has recently emerged as an appealing alternative to microarray in identifying differentially expressed genes (DEG) between biological groups. However, there still exists considerable discrepancy on gene expression measurements and DEG results between the two platforms. The objective of this study was to compare parallel paired-end RNA-Seq and microarray data generated on 5-azadeoxy-cytidine (5-Aza) treated HT-29 colon cancer cells with an additional simulation study. METHODS We first performed general correlation analysis comparing gene expression profiles on both platforms. An Errors-In-Variables (EIV) regression model was subsequently applied to assess proportional and fixed biases between the two technologies. Then several existing algorithms, designed for DEG identification in RNA-Seq and microarray data, were applied to compare the cross-platform overlaps with respect to DEG lists, which were further validated using qRT-PCR assays on selected genes. Functional analyses were subsequently conducted using Ingenuity Pathway Analysis (IPA). RESULTS Pearson and Spearman correlation coefficients between the RNA-Seq and microarray data each exceeded 0.80, with 66%~68% overlap of genes on both platforms. The EIV regression model indicated the existence of both fixed and proportional biases between the two platforms. The DESeq and baySeq algorithms (RNA-Seq) and the SAM and eBayes algorithms (microarray) achieved the highest cross-platform overlap rate in DEG results from both experimental and simulated datasets. DESeq method exhibited a better control on the false discovery rate than baySeq on the simulated dataset although it performed slightly inferior to baySeq in the sensitivity test. RNA-Seq and qRT-PCR, but not microarray data, confirmed the expected reversal of SPARC gene suppression after treating HT-29 cells with 5-Aza. 
Thirty-three IPA canonical pathways were identified by both microarray and RNA-Seq data, 152 pathways by RNA-Seq data only, and none by microarray data only. CONCLUSIONS These results suggest that RNA-Seq has advantages over microarray in identification of DEGs with the most consistent results generated from DESeq and SAM methods. The EIV regression model reveals both fixed and proportional biases between RNA-Seq and microarray. This may explain in part the lower cross-platform overlap in DEG lists compared to those in detectable genes.
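The distinction this abstract draws between fixed and proportional bias can be illustrated with Deming regression, one common errors-in-variables model that allows measurement error in both platforms. The abstract does not state which EIV formulation was used, so the choice of Deming regression, the function name and the default `delta` are assumptions for this sketch.

```python
import math
from statistics import mean

def deming_fit(x, y, delta=1.0):
    """Deming regression of y on x, with measurement error in both variables.

    delta is the assumed ratio of error variances (y-error / x-error);
    delta=1.0 gives orthogonal regression. Requires a non-zero covariance.
    Returns (slope, intercept).
    """
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    diff = syy - delta * sxx
    slope = (diff + math.sqrt(diff ** 2 + 4 * delta * sxy ** 2)) / (2 * sxy)
    return slope, my - slope * mx

# Hypothetical log-expression values for four genes on two platforms.
platform_a = [1.0, 2.0, 3.0, 4.0]
platform_b = [3.0, 5.0, 7.0, 9.0]
slope, intercept = deming_fit(platform_a, platform_b)
```

In this reading, a slope different from 1 indicates proportional bias between the platforms and an intercept different from 0 indicates fixed bias; the toy values above were constructed so that both are present.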
|
288
|
Abstract
DNA microarrays are a relatively new technology that can simultaneously measure the expression level of thousands of genes. They have become an important tool for a wide variety of biological experiments. One of the most common goals of DNA microarray experiments is to identify genes associated with biological processes of interest. Conventional statistical tests often produce poor results when applied to microarray data owing to small sample sizes, noisy data, and correlation among the expression levels of the genes. Thus, novel statistical methods are needed to identify significant genes in DNA microarray experiments. This article discusses the challenges inherent in DNA microarray analysis and describes a series of statistical techniques that can be used to overcome these challenges. The problem of multiple hypothesis testing and its relation to microarray studies are also considered, along with several possible solutions.
Affiliation(s)
- Eric Bair
- Department of Endodontics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA; Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
|
289
|
Yu D, Huber W, Vitek O. Shrinkage estimation of dispersion in Negative Binomial models for RNA-seq experiments with small sample size. Bioinformatics 2013; 29:1275-82. [PMID: 23589650 PMCID: PMC3654711 DOI: 10.1093/bioinformatics/btt143] [Citation(s) in RCA: 82] [Impact Index Per Article: 7.5] [Indexed: 01/07/2023]
Abstract
Motivation: RNA-seq experiments produce digital counts of reads that are affected by both biological and technical variation. To distinguish the systematic changes in expression between conditions from noise, the counts are frequently modeled by the Negative Binomial distribution. However, in experiments with small sample size, the per-gene estimates of the dispersion parameter are unreliable. Method: We propose a simple and effective approach for estimating the dispersions. First, we obtain the initial estimates for each gene using the method of moments. Second, the estimates are regularized, i.e. shrunk towards a common value that minimizes the average squared difference between the initial estimates and the shrinkage estimates. The approach does not require extra modeling assumptions, is easy to compute and is compatible with the exact test of differential expression. Results: We evaluated the proposed approach using 10 simulated and experimental datasets and compared its performance with that of currently popular packages edgeR, DESeq, baySeq, BBSeq and SAMseq. For these datasets, sSeq performed favorably for experiments with small sample size in sensitivity, specificity and computational time. Availability: http://www.stat.purdue.edu/~ovitek/Software.html and Bioconductor. Contact: ovitek@purdue.edu Supplementary information: Supplementary data are available at Bioinformatics online.
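The two steps named in the Method section can be sketched as follows, using the Negative Binomial mean-variance relation var = mu + phi * mu^2 for the method-of-moments step. This is a minimal illustration and not the sSeq implementation: the function names are hypothetical, and the shrinkage target (a plain mean) and the fixed weight of 0.5 are assumptions, whereas the paper chooses the common value by minimizing an average squared difference.

```python
from statistics import mean, variance

def mom_dispersion(counts):
    """Method-of-moments dispersion from the NB relation var = mu + phi * mu**2."""
    mu = mean(counts)
    if mu == 0:
        return 0.0
    s2 = variance(counts)  # sample variance (n - 1 denominator)
    # Clip at zero: with few replicates s2 can fall below mu.
    return max((s2 - mu) / mu ** 2, 0.0)

def shrink(estimate, target, weight):
    """Move an estimate toward target; weight=0 keeps it, weight=1 replaces it."""
    return (1 - weight) * estimate + weight * target

# Hypothetical counts for three genes with three replicates each.
counts_by_gene = {
    "geneA": [10, 12, 9],
    "geneB": [100, 180, 50],
    "geneC": [5, 5, 5],
}
raw = {gene: mom_dispersion(c) for gene, c in counts_by_gene.items()}
target = mean(list(raw.values()))  # assumption: plain mean as the common value
shrunk = {gene: shrink(phi, target, weight=0.5) for gene, phi in raw.items()}
```

With only three replicates, two of the three raw estimates are clipped to zero while the third is large; shrinkage pulls all three toward a shared value, which is the stabilizing effect the abstract describes for small sample sizes.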
Affiliation(s)
- Danni Yu
- Genome Biology Unit, European Molecular Biology Laboratory, Meyerhofstraße 1, Heidelberg 69117, Germany
|
290
|
Chung LM, Ferguson JP, Zheng W, Qian F, Bruno V, Montgomery RR, Zhao H. Differential expression analysis for paired RNA-Seq data. BMC Bioinformatics 2013; 14:110. [PMID: 23530607 PMCID: PMC3663822 DOI: 10.1186/1471-2105-14-110] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 10/16/2012] [Accepted: 03/01/2013] [Indexed: 11/18/2022]
Abstract
Background RNA-Seq technology measures transcript abundance by generating sequence reads and counting their frequencies across different biological conditions. To identify differentially expressed genes between two conditions, it is important to consider the experimental design as well as the distributional properties of the data. In many RNA-Seq studies, the expression data are obtained as multiple pairs, e.g., pre- vs. post-treatment samples from the same individual. We seek to incorporate the paired structure into the analysis. Results We present a Bayesian hierarchical mixture model for RNA-Seq data to separately account for the variability within and between individuals from a paired data structure. The method assumes a Poisson distribution for the data mixed with a gamma distribution to account for variability between pairs. The effect of differential expression is modeled by a two-component mixture model. The performance of this approach is examined using simulated and real data. Conclusions In this setting, our proposed model provides higher sensitivity than existing methods to detect differential expression. Application to real RNA-Seq data demonstrates the usefulness of this method for detecting expression alteration for genes with low average expression levels or shorter transcript length.
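The Poisson-gamma structure mentioned in this abstract can be illustrated with a small simulation. The sketch below is not the authors' hierarchical model: it only assumes that each pair has its own gamma-distributed baseline rate (between-individual variability) and that the pre- and post-treatment counts are Poisson around that rate (within-pair noise). The function name and the parameters `mean_rate`, `shape` and `fold_change` are hypothetical.

```python
import numpy as np

def simulate_paired_counts(n_pairs, mean_rate, shape, fold_change, seed=0):
    """Simulate pre/post counts for one gene across paired samples."""
    rng = np.random.default_rng(seed)
    # Between-individual variability: a pair-specific baseline rate
    # with mean `mean_rate` and variance mean_rate^2 / shape.
    baseline = rng.gamma(shape=shape, scale=mean_rate / shape, size=n_pairs)
    # Within-pair sampling noise: Poisson counts around the pair's rate.
    pre = rng.poisson(baseline)
    post = rng.poisson(baseline * fold_change)
    return pre, post

pre, post = simulate_paired_counts(
    n_pairs=100_000, mean_rate=20.0, shape=4.0, fold_change=2.0
)
# Marginally, the gamma-mixed Poisson is overdispersed (Negative Binomial):
# variance = mean + mean^2 / shape, which exceeds the Poisson variance.
print("mean of pre counts:", pre.mean())
print("variance of pre counts:", pre.var(ddof=1))
print("pooled post/pre ratio:", post.sum() / pre.sum())
```

Because both counts in a pair share the same baseline rate, the between-individual variability inflates the marginal variance but cancels in the within-pair comparison, which is the motivation for modeling the paired structure explicitly.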
Affiliation(s)
- Lisa M Chung
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA.
|
291
|
Soneson C, Delorenzi M. A comparison of methods for differential expression analysis of RNA-seq data. BMC Bioinformatics 2013; 14:91. [PMID: 23497356 PMCID: PMC3608160 DOI: 10.1186/1471-2105-14-91] [Citation(s) in RCA: 542] [Impact Index Per Article: 49.3] [Received: 10/02/2012] [Accepted: 03/01/2013] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND Finding genes that are differentially expressed between conditions is an integral part of understanding the molecular basis of phenotypic variation. In the past decades, DNA microarrays have been used extensively to quantify the abundance of mRNA corresponding to different genes, and more recently high-throughput sequencing of cDNA (RNA-seq) has emerged as a powerful competitor. As the cost of sequencing decreases, it is conceivable that the use of RNA-seq for differential expression analysis will increase rapidly. To exploit the possibilities and address the challenges posed by this relatively new type of data, a number of software packages have been developed especially for differential expression analysis of RNA-seq data. RESULTS We conducted an extensive comparison of eleven methods for differential expression analysis of RNA-seq data. All methods are freely available within the R framework and take as input a matrix of counts, i.e. the number of reads mapping to each genomic feature of interest in each of a number of samples. We evaluate the methods based on both simulated data and real RNA-seq data. CONCLUSIONS Very small sample sizes, which are still common in RNA-seq experiments, pose problems for all evaluated methods, and any results obtained under such conditions should be interpreted with caution. For larger sample sizes, the methods combining a variance-stabilizing transformation with the 'limma' method for differential expression analysis perform well under many different conditions, as does the nonparametric SAMseq method.
Affiliation(s)
- Charlotte Soneson
- Bioinformatics Core Facility, SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland.
|
292
|
Identification and functional annotation of genome-wide ER-regulated genes in breast cancer based on ChIP-Seq data. Computational and Mathematical Methods in Medicine 2012; 2012:568950. [PMID: 23346221 PMCID: PMC3546463 DOI: 10.1155/2012/568950] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 11/01/2012] [Accepted: 12/18/2012] [Indexed: 11/18/2022]
Abstract
Estrogen receptor (ER) is a crucial molecular marker of breast cancer. Molecular interactions between ER complexes and DNA regulate the expression of genes responsible for cancer cell phenotypes. However, the positions and mechanisms of ER binding to downstream gene targets are far from being fully understood. ChIP-Seq is an important assay for the genome-wide study of protein-DNA interactions. In this paper, we explored the genome-wide chromatin localization of ER-DNA binding regions by analyzing ChIP-Seq data from the MCF-7 breast cancer cell line. By integrating three peak detection algorithms and two datasets, we localized 933 ER binding sites, 92% of which were located far away from promoters, suggesting long-range control by ER. Moreover, 489 genes in the vicinity of ER binding sites were identified as estrogen response elements by comparison with expression data. In addition, 836 single nucleotide polymorphisms (SNPs) in or near 157 ER-regulated genes were found in the vicinity of ER binding sites. Furthermore, we annotated the function of the nearest-neighbor genes of these binding sites using the Gene Ontology (GO), KEGG, and GeneGo pathway databases. The results revealed novel ER-regulated gene pathways for further experimental validation. ER was found to affect every developmental stage of breast cancer by regulating genes related to development, progression, and metastasis. This study provides a deeper understanding of the regulatory mechanisms of ER and its associated genes.
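The statement that most binding sites lie far from promoters corresponds to a simple distance computation, sketched below in Python. This is only an illustration under stated assumptions, not the procedure used in the paper: peaks and transcription start sites (TSS) are given as integer coordinates on a single chromosome, and the `promoter_window` of 5000 bp is a hypothetical cutoff. The function names are illustrative.

```python
from bisect import bisect_left

def distance_to_nearest_tss(peak_center, sorted_tss):
    """Distance from a peak center to the closest TSS (same chromosome)."""
    i = bisect_left(sorted_tss, peak_center)
    # Only the TSS just before and just after the peak can be the nearest.
    candidates = sorted_tss[max(i - 1, 0): i + 1]
    return min(abs(peak_center - t) for t in candidates)

def fraction_distal(peak_centers, tss_positions, promoter_window=5000):
    """Fraction of peaks farther than `promoter_window` bp from any TSS."""
    sorted_tss = sorted(tss_positions)
    distal = sum(
        distance_to_nearest_tss(p, sorted_tss) > promoter_window
        for p in peak_centers
    )
    return distal / len(peak_centers)

tss = [1000, 50000]
peaks = [1200, 20000, 49000, 90000]
print(fraction_distal(peaks, tss))
```

For genome-wide data the same computation would be repeated per chromosome, with the sorted TSS list built once for each chromosome.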
|
293
|
Farazi TA, Brown M, Morozov P, ten Hoeve JJ, Ben-Dov IZ, Hovestadt V, Hafner M, Renwick N, Mihailović A, Wessels LF, Tuschl T. Bioinformatic analysis of barcoded cDNA libraries for small RNA profiling by next-generation sequencing. Methods 2012; 58:171-87. [PMID: 22836126 PMCID: PMC3597438 DOI: 10.1016/j.ymeth.2012.07.020] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Received: 05/21/2012] [Revised: 07/12/2012] [Accepted: 07/16/2012] [Indexed: 11/17/2022] Open
Abstract
The characterization of post-transcriptional gene regulation by small regulatory RNAs 20-30 nt in length, particularly miRNAs and piRNAs, has become a major focus of research in recent years. A prerequisite for the characterization of small RNAs is their identification and quantification across different developmental stages, normal and diseased tissues, as well as model cell lines. Here we present a step-by-step protocol for the bioinformatic analysis of barcoded cDNA libraries for small RNA profiling generated by Illumina sequencing, thereby facilitating miRNA and other small RNA profiling of large sample collections.
Affiliation(s)
- Thalia A. Farazi
- Howard Hughes Medical Institute, Laboratory of RNA Molecular Biology, The Rockefeller University, 1230 York Avenue, Box 186, New York, NY 10065, USA
- Miguel Brown
- Howard Hughes Medical Institute, Laboratory of RNA Molecular Biology, The Rockefeller University, 1230 York Avenue, Box 186, New York, NY 10065, USA
- Pavel Morozov
- Howard Hughes Medical Institute, Laboratory of RNA Molecular Biology, The Rockefeller University, 1230 York Avenue, Box 186, New York, NY 10065, USA
- Jelle J. ten Hoeve
- Department of Molecular Carcinogenesis, Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, The Netherlands
- Iddo Z. Ben-Dov
- Howard Hughes Medical Institute, Laboratory of RNA Molecular Biology, The Rockefeller University, 1230 York Avenue, Box 186, New York, NY 10065, USA
- Volker Hovestadt
- Howard Hughes Medical Institute, Laboratory of RNA Molecular Biology, The Rockefeller University, 1230 York Avenue, Box 186, New York, NY 10065, USA
- Markus Hafner
- Howard Hughes Medical Institute, Laboratory of RNA Molecular Biology, The Rockefeller University, 1230 York Avenue, Box 186, New York, NY 10065, USA
- Neil Renwick
- Howard Hughes Medical Institute, Laboratory of RNA Molecular Biology, The Rockefeller University, 1230 York Avenue, Box 186, New York, NY 10065, USA
- Aleksandra Mihailović
- Howard Hughes Medical Institute, Laboratory of RNA Molecular Biology, The Rockefeller University, 1230 York Avenue, Box 186, New York, NY 10065, USA
- Lodewyk F.A. Wessels
- Department of Molecular Carcinogenesis, Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, The Netherlands
- Thomas Tuschl
- Howard Hughes Medical Institute, Laboratory of RNA Molecular Biology, The Rockefeller University, 1230 York Avenue, Box 186, New York, NY 10065, USA
|
294
|
Mutz KO, Heilkenbrinker A, Lönne M, Walter JG, Stahl F. Transcriptome analysis using next-generation sequencing. Curr Opin Biotechnol 2012; 24:22-30. [PMID: 23020966 DOI: 10.1016/j.copbio.2012.09.004] [Citation(s) in RCA: 303] [Impact Index Per Article: 25.3] [Received: 07/04/2012] [Revised: 09/03/2012] [Accepted: 09/04/2012] [Indexed: 12/16/2022]
Abstract
Up-to-date research in biology, biotechnology, and medicine requires fast genome and transcriptome analysis technologies for the investigation of cellular state, physiology, and activity. Here, microarray technology and next-generation sequencing of transcripts (RNA-Seq) are state of the art. Because microarray technology is limited with respect to the amount of RNA, the quantification of transcript levels, and the sequence information, RNA-Seq provides nearly unlimited possibilities in modern bioanalysis. This chapter presents a detailed description of next-generation sequencing (NGS), describes the impact of this technology on transcriptome analysis, and explains its potential for exploring the modern RNA world.
Affiliation(s)
- Kai-Oliver Mutz
- Leibniz Universität Hannover, Institute for Technical Chemistry, Callinstrasse 5, 30167 Hannover, Germany
|