For: Nicolae M, Mangul S, Măndoiu II, Zelikovsky A. Estimation of alternative splicing isoform frequencies from RNA-Seq data. Algorithms Mol Biol 2011;6:9. [PMID: 21504602 PMCID: PMC3107792 DOI: 10.1186/1748-7188-6-9] [Citation(s) in RCA: 131] [Impact Index Per Article: 10.1] [Received: 10/18/2010] [Accepted: 04/19/2011] [Indexed: 11/21/2022] Open

Cited by Other Article(s)

Jousheghani ZZ, Patro R. Oarfish: Enhanced probabilistic modeling leads to improved accuracy in long read transcriptome quantification. bioRxiv 2024:2024.02.28.582591. [PMID: 38464200 PMCID: PMC10925290 DOI: 10.1101/2024.02.28.582591] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]

Deshpande D, Chhugani K, Chang Y, Karlsberg A, Loeffler C, Zhang J, Muszyńska A, Munteanu V, Yang H, Rotman J, Tao L, Balliu B, Tseng E, Eskin E, Zhao F, Mohammadi P, Łabaj PP, Mangul S. RNA-seq data science: From raw data to effective interpretation. Front Genet 2023;14:997383. [PMID: 36999049 PMCID: PMC10043755 DOI: 10.3389/fgene.2023.997383] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Received: 07/18/2022] [Accepted: 02/24/2023] [Indexed: 03/14/2023] Open

Affiliation(s)

Dhrithi Deshpande Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
Karishma Chhugani Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
Yutong Chang Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
Aaron Karlsberg Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
Caitlin Loeffler Department of Computer Science, University of California, Los Angeles, CA, United States
Jinyang Zhang Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China
Agata Muszyńska Małopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland Institute of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Gliwice, Poland
Viorel Munteanu Department of Computers, Informatics and Microelectronics, Technical University of Moldova, Chisinau, Moldova
Harry Yang Department of Microbiology, Immunology and Molecular Genetics, University of California Los Angeles, Los Angeles, CA, United States
Jeremy Rotman Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
Laura Tao Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
Brunilda Balliu Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
Elizabeth Tseng Pacific Biosciences, Menlo Park, CA, United States
Eleazar Eskin Department of Computer Science, University of California, Los Angeles, CA, United States Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, United States
Fangqing Zhao Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China Key Laboratory of Systems Biology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, China
Pejman Mohammadi Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, United States
Paweł P. Łabaj Małopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland Department of Biotechnology, Boku University Vienna, Vienna, Austria
Serghei Mangul Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States Department of Quantitative and Computational Biology, USC Dornsife College of Letters, Arts and Sciences, Los Angeles, CA, United States *Correspondence: Serghei Mangul,

Sibbesen JA, Eizenga JM, Novak AM, Sirén J, Chang X, Garrison E, Paten B. Haplotype-aware pantranscriptome analyses using spliced pangenome graphs. Nat Methods 2023;20:239-247. [PMID: 36646895 DOI: 10.1101/2021.03.26.437240] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 06/18/2021] [Accepted: 11/28/2022] [Indexed: 05/24/2023]

Terrón-Camero LC, Gordillo-González F, Salas-Espejo E, Andrés-León E. Comparison of Metagenomics and Metatranscriptomics Tools: A Guide to Making the Right Choice. Genes (Basel) 2022;13:2280. [PMID: 36553546 PMCID: PMC9777648 DOI: 10.3390/genes13122280] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 10/29/2022] [Revised: 11/28/2022] [Accepted: 12/01/2022] [Indexed: 12/09/2022] Open

Abstract

The study of microorganisms is a field of great interest due to their environmental (e.g., soil contamination) and biomedical (e.g., parasitic diseases, autism) importance. The advent of revolutionary next-generation sequencing techniques, and their application to the hypervariable regions of the 16S, 18S or 23S ribosomal subunits, have allowed a large variety of organisms, including bacteria, archaea, eukaryotes and fungi, to be studied in greater depth. Additionally, together with the development of analysis software, the creation of specific databases (e.g., SILVA or RDP) has boosted the enormous growth of these studies. As the cost of sequencing per sample has continuously decreased, new protocols have also emerged, such as shotgun sequencing, which allows the profiling of all taxonomic domains in a sample. The sequencing of hypervariable regions and shotgun sequencing are technologies that enable the taxonomic classification of microorganisms from the DNA present in microbial communities. However, they are not capable of measuring what is actively expressed. Conversely, we advocate that metatranscriptomics is a "new" technology that makes the identification of the mRNAs of a microbial community possible, quantifying gene expression levels and active biological pathways. Furthermore, it can also be used to characterise symbiotic interactions between the host and its microbiome. In this manuscript, we examine the three technologies above, and discuss the implementation of different software and databases, which greatly affects the reliability of the results obtained. Finally, we have developed two easy-to-use pipelines leveraging Nextflow technology. These aim to provide everything required for an average user to perform a metagenomic analysis of marker genes with QIIME2 and a metatranscriptomic study using Kraken2/Bracken.

Baaijens JA, Zulli A, Ott IM, Nika I, van der Lugt MJ, Petrone ME, Alpert T, Fauver JR, Kalinich CC, Vogels CBF, Breban MI, Duvallet C, McElroy KA, Ghaeli N, Imakaev M, Mckenzie-Bennett MF, Robison K, Plocik A, Schilling R, Pierson M, Littlefield R, Spencer ML, Simen BB, Hanage WP, Grubaugh ND, Peccia J, Baym M. Lineage abundance estimation for SARS-CoV-2 in wastewater using transcriptome quantification techniques. Genome Biol 2022;23:236. [PMID: 36348471 PMCID: PMC9643916 DOI: 10.1186/s13059-022-02805-9] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 10/03/2021] [Accepted: 10/25/2022] [Indexed: 11/09/2022] Open

Affiliation(s)

Jasmijn A Baaijens Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA. Department of Intelligent Systems, Delft University of Technology, Delft, Netherlands.
Alessandro Zulli Department of Chemical and Environmental Engineering, Yale University, New Haven, CT, USA
Isabel M Ott Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, CT, USA
Ioanna Nika Department of Intelligent Systems, Delft University of Technology, Delft, Netherlands
Mart J van der Lugt Department of Intelligent Systems, Delft University of Technology, Delft, Netherlands
Mary E Petrone Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, CT, USA
Tara Alpert Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, CT, USA
Joseph R Fauver Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, CT, USA Department of Epidemiology, University of Nebraska Medical Center, Omaha, NE, USA
Chaney C Kalinich Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, CT, USA
Chantal B F Vogels Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, CT, USA
Mallery I Breban Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, CT, USA
Claire Duvallet Biobot Analytics, Inc., Cambridge, MA, USA
Kyle A McElroy Biobot Analytics, Inc., Cambridge, MA, USA
Newsha Ghaeli Biobot Analytics, Inc., Cambridge, MA, USA
Maxim Imakaev Biobot Analytics, Inc., Cambridge, MA, USA
Malaika F Mckenzie-Bennett Ginkgo Bioworks, Inc., Boston, MA, USA
Keith Robison Ginkgo Bioworks, Inc., Boston, MA, USA
Alex Plocik Ginkgo Bioworks, Inc., Boston, MA, USA
Rebecca Schilling Ginkgo Bioworks, Inc., Boston, MA, USA
Martha Pierson Ginkgo Bioworks, Inc., Boston, MA, USA
Rebecca Littlefield Ginkgo Bioworks, Inc., Boston, MA, USA
Michelle L Spencer Ginkgo Bioworks, Inc., Boston, MA, USA
Birgitte B Simen Ginkgo Bioworks, Inc., Boston, MA, USA
William P Hanage Center for Communicable Disease Dynamics and Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
Nathan D Grubaugh Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, CT, USA Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA
Jordan Peccia Department of Chemical and Environmental Engineering, Yale University, New Haven, CT, USA
Michael Baym Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA

Qi W, Fu H, Luo X, Ren Y, Liu X, Dai H, Zheng Q, Liang F. Electroacupuncture at PC6 (Neiguan) Attenuates Angina Pectoris in Rats with Myocardial Ischemia-Reperfusion Injury Through Regulating the Alternative Splicing of the Major Inhibitory Neurotransmitter Receptor GABRG2. J Cardiovasc Transl Res 2022;15:1176-1191. [PMID: 35377129 DOI: 10.1007/s12265-022-10245-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2021] [Accepted: 03/25/2022] [Indexed: 11/27/2022]

Schaap-Johansen AL, Vujović M, Borch A, Hadrup SR, Marcatili P. T Cell Epitope Prediction and Its Application to Immunotherapy. Front Immunol 2021;12:712488. [PMID: 34603286 PMCID: PMC8479193 DOI: 10.3389/fimmu.2021.712488] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 05/20/2021] [Accepted: 07/12/2021] [Indexed: 12/13/2022] Open

Alser M, Rotman J, Deshpande D, Taraszka K, Shi H, Baykal PI, Yang HT, Xue V, Knyazev S, Singer BD, Balliu B, Koslicki D, Skums P, Zelikovsky A, Alkan C, Mutlu O, Mangul S. Technology dictates algorithms: recent developments in read alignment. Genome Biol 2021;22:249. [PMID: 34446078 PMCID: PMC8390189 DOI: 10.1186/s13059-021-02443-7] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 10/15/2020] [Accepted: 07/28/2021] [Indexed: 01/08/2023] Open

Affiliation(s)

Mohammed Alser Computer Science Department, ETH Zürich, 8092, Zürich, Switzerland Computer Engineering Department, Bilkent University, 06800 Bilkent, Ankara, Turkey Information Technology and Electrical Engineering Department, ETH Zürich, Zürich, 8092, Switzerland
Jeremy Rotman Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
Dhrithi Deshpande Department of Clinical Pharmacy, School of Pharmacy, University of Southern California, Los Angeles, CA, 90089, USA
Kodi Taraszka Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
Huwenbo Shi Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
Pelin Icer Baykal Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA
Harry Taegyun Yang Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA Bioinformatics Interdepartmental Ph.D. Program, University of California Los Angeles, Los Angeles, CA, 90095, USA
Victor Xue Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
Sergey Knyazev Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA
Benjamin D Singer Division of Pulmonary and Critical Care Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA Department of Biochemistry & Molecular Genetics, Northwestern University Feinberg School of Medicine, Chicago, USA Simpson Querrey Institute for Epigenetics, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
Brunilda Balliu Department of Computational Medicine, University of California Los Angeles, Los Angeles, CA, 90095, USA
David Koslicki Computer Science and Engineering, Pennsylvania State University, University Park, PA, 16801, USA Biology Department, Pennsylvania State University, University Park, PA, 16801, USA The Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, 16801, USA
Pavel Skums Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA
Alex Zelikovsky Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA The Laboratory of Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, 119991, Russia
Can Alkan Computer Engineering Department, Bilkent University, 06800 Bilkent, Ankara, Turkey Bilkent-Hacettepe Health Sciences and Technologies Program, Ankara, Turkey
Onur Mutlu Computer Science Department, ETH Zürich, 8092, Zürich, Switzerland Computer Engineering Department, Bilkent University, 06800 Bilkent, Ankara, Turkey Information Technology and Electrical Engineering Department, ETH Zürich, Zürich, 8092, Switzerland
Serghei Mangul Department of Clinical Pharmacy, School of Pharmacy, University of Southern California, Los Angeles, CA, 90089, USA.

Knyazev S, Tsyvina V, Shankar A, Melnyk A, Artyomenko A, Malygina T, Porozov YB, Campbell EM, Switzer WM, Skums P, Mangul S, Zelikovsky A. Accurate assembly of minority viral haplotypes from next-generation sequencing through efficient noise reduction. Nucleic Acids Res 2021;49:e102. [PMID: 34214168 PMCID: PMC8464054 DOI: 10.1093/nar/gkab576] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 10/09/2020] [Revised: 05/25/2021] [Accepted: 06/18/2021] [Indexed: 12/21/2022] Open

Hu Y, Fang L, Chen X, Zhong JF, Li M, Wang K. LIQA: long-read isoform quantification and analysis. Genome Biol 2021;22:182. [PMID: 34140043 PMCID: PMC8212471 DOI: 10.1186/s13059-021-02399-8] [Citation(s) in RCA: 45] [Impact Index Per Article: 15.0] [Received: 09/18/2020] [Accepted: 06/04/2021] [Indexed: 11/10/2022] Open

Simoneau J, Gosselin R, Scott MS. Factorial study of the RNA-seq computational workflow identifies biases as technical gene signatures. NAR Genom Bioinform 2021;2:lqaa043. [PMID: 33575596 PMCID: PMC7671328 DOI: 10.1093/nargab/lqaa043] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/30/2020] [Revised: 05/15/2020] [Accepted: 06/05/2020] [Indexed: 12/12/2022] Open

Hounkpe BW, Chenou F, de Lima F, De Paula E. HRT Atlas v1.0 database: redefining human and mouse housekeeping genes and candidate reference transcripts by mining massive RNA-seq datasets. Nucleic Acids Res 2021;49:D947-D955. [PMID: 32663312 PMCID: PMC7778946 DOI: 10.1093/nar/gkaa609] [Citation(s) in RCA: 111] [Impact Index Per Article: 37.0] [Received: 06/19/2020] [Accepted: 07/08/2020] [Indexed: 12/18/2022] Open

Melsted P, Ntranos V, Pachter L. The barcode, UMI, set format and BUStools. Bioinformatics 2020;35:4472-4473. [PMID: 31073610 DOI: 10.1093/bioinformatics/btz279] [Citation(s) in RCA: 76] [Impact Index Per Article: 19.0] [Received: 11/26/2018] [Revised: 02/15/2019] [Accepted: 04/13/2019] [Indexed: 11/13/2022] Open

Deschamps-Francoeur G, Simoneau J, Scott MS. Handling multi-mapped reads in RNA-seq. Comput Struct Biotechnol J 2020;18:1569-1576. [PMID: 32637053 PMCID: PMC7330433 DOI: 10.1016/j.csbj.2020.06.014] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 02/29/2020] [Revised: 06/06/2020] [Accepted: 06/07/2020] [Indexed: 11/07/2022] Open

Lachmann A, Clarke DJB, Torre D, Xie Z, Ma'ayan A. Interoperable RNA-Seq analysis in the cloud. Biochim Biophys Acta Gene Regul Mech 2020;1863:194521. [PMID: 32156561 DOI: 10.1016/j.bbagrm.2020.194521] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 07/02/2019] [Revised: 03/01/2020] [Accepted: 03/01/2020] [Indexed: 12/25/2022]

Zheng H, Brennan K, Hernaez M, Gevaert O. Benchmark of long non-coding RNA quantification for RNA sequencing of cancer samples. Gigascience 2019;8:giz145. [PMID: 31808800 PMCID: PMC6897288 DOI: 10.1093/gigascience/giz145] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 04/08/2019] [Revised: 09/30/2019] [Accepted: 11/15/2019] [Indexed: 12/14/2022] Open

Malik L, Almodaresi F, Patro R. Grouper: graph-based clustering and annotation for improved de novo transcriptome analysis. Bioinformatics 2019;34:3265-3272. [PMID: 29746620 DOI: 10.1093/bioinformatics/bty378] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 11/20/2017] [Accepted: 05/03/2018] [Indexed: 11/14/2022] Open

Nomoto Y, Kubota Y, Ohnishi Y, Kasahara K, Tomita A, Oshime T, Yamashita H, Fahmi M, Ito M. Gene Cascade Finder: A tool for identification of gene cascades and its application in Caenorhabditis elegans. PLoS One 2019;14:e0215187. [PMID: 31504044 PMCID: PMC6736238 DOI: 10.1371/journal.pone.0215187] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 03/23/2019] [Accepted: 08/06/2019] [Indexed: 11/24/2022] Open

Arefeen A, Liu J, Xiao X, Jiang T. TAPAS: tool for alternative polyadenylation site analysis. Bioinformatics 2019;34:2521-2529. [PMID: 30052912 DOI: 10.1093/bioinformatics/bty110] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Received: 04/17/2017] [Accepted: 02/22/2018] [Indexed: 01/08/2023] Open

Abstract

Motivation

The length of the 3' untranslated region (3' UTR) of an mRNA is essential for many biological activities such as mRNA stability, sub-cellular localization, protein translation, protein binding and translation efficiency. Moreover, correlation between diseases and the shortening (or lengthening) of 3' UTRs has been reported in the literature. This length is largely determined by the polyadenylation cleavage site in the mRNA. As alternative polyadenylation (APA) sites are common in mammalian genes, several tools have been published recently for detecting APA sites from RNA-Seq data or performing shortening/lengthening analysis. These tools consider either up to only two APA sites in a gene or only APA sites that occur in the last exon of a gene, although a gene may generally have more than two APA sites and an APA site may sometimes occur before the last exon. Furthermore, the tools are unable to integrate the analysis of shortening/lengthening events with APA site detection.

Results

We propose a new tool, called TAPAS, for detecting novel APA sites from RNA-Seq data. It can deal with more than two APA sites in a gene as well as APA sites that occur before the last exon. The tool is based on an existing method for finding change points in time series data, but some filtration techniques are also adopted to remove change points that are likely false APA sites. It is then extended to identify APA sites that are expressed differently between two biological samples and genes that contain 3' UTRs with shortening/lengthening events. Our extensive experiments on simulated and real RNA-Seq data demonstrate that TAPAS outperforms the existing tools for APA site detection or shortening/lengthening analysis significantly.

Availability and implementation

https://github.com/arefeen/TAPAS.

Supplementary information

Supplementary data are available at Bioinformatics online.

Raghupathy N, Choi K, Vincent MJ, Beane GL, Sheppard KS, Munger SC, Korstanje R, Pardo-Manuel de Villena F, Churchill GA. Hierarchical analysis of RNA-seq reads improves the accuracy of allele-specific expression. Bioinformatics 2019;34:2177-2184. [PMID: 29444201 DOI: 10.1093/bioinformatics/bty078] [Citation(s) in RCA: 47] [Impact Index Per Article: 9.4] [Received: 09/01/2017] [Accepted: 02/09/2018] [Indexed: 02/06/2023] Open

Abstract

Motivation

Allele-specific expression (ASE) refers to the differential abundance of the allelic copies of a transcript. RNA sequencing (RNA-seq) can provide quantitative estimates of ASE for genes with transcribed polymorphisms. When short-read sequences are aligned to a diploid transcriptome, read-mapping ambiguities confound our ability to directly count reads. Multi-mapping reads aligning equally well to multiple genomic locations, isoforms or alleles can comprise the majority (>85%) of reads. Discarding them can result in biases and substantial loss of information. Methods have been developed that use weighted allocation of read counts but these methods treat the different types of multi-reads equivalently. We propose a hierarchical approach to allocation of read counts that first resolves ambiguities among genes, then among isoforms, and lastly between alleles. We have implemented our model in EMASE software (Expectation-Maximization for Allele Specific Expression) to estimate total gene expression, isoform usage and ASE based on this hierarchical allocation.

Results

Methods that align RNA-seq reads to a diploid transcriptome incorporating known genetic variants improve estimates of ASE and total gene expression compared to methods that use reference genome alignments. Weighted allocation methods outperform methods that discard multi-reads. Hierarchical allocation of reads improves estimation of ASE even when data are simulated from a non-hierarchical model. Analysis of RNA-seq data from F1 hybrid mice using EMASE reveals widespread ASE associated with cis-acting polymorphisms and a small number of parent-of-origin effects.

Availability and implementation

EMASE software is available at https://github.com/churchill-lab/emase.

Supplementary information

Supplementary data are available at Bioinformatics online.
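
The hierarchical allocation described in the Motivation above (genes first, then isoforms, then alleles) can be illustrated with a minimal sketch. This is not the EMASE implementation: the function name `allocate_read`, its arguments, and the abundance dictionaries are hypothetical, and all abundance estimates are assumed to be strictly positive. The sketch only shows how one multi-read's unit weight might be split level by level.

```python
from collections import defaultdict


def allocate_read(compatible, gene_w, isoform_w, allele_w):
    """Split one read's unit weight across compatible (gene, isoform, allele) triples.

    compatible: iterable of (gene, isoform, allele) triples the read aligns to.
    gene_w, isoform_w, allele_w: current (hypothetical) abundance estimates keyed
    by gene, (gene, isoform), and (gene, isoform, allele) respectively.
    """
    # Group the compatible triples into a gene -> isoform -> [alleles] tree.
    tree = defaultdict(lambda: defaultdict(list))
    for gene, isoform, allele in compatible:
        tree[gene][isoform].append(allele)

    shares = {}
    # Level 1: resolve ambiguity among genes.
    gene_total = sum(gene_w[g] for g in tree)
    for gene, isoforms in tree.items():
        p_gene = gene_w[gene] / gene_total
        # Level 2: resolve ambiguity among isoforms within this gene.
        iso_total = sum(isoform_w[(gene, i)] for i in isoforms)
        for isoform, alleles in isoforms.items():
            p_iso = isoform_w[(gene, isoform)] / iso_total
            # Level 3: resolve ambiguity between alleles of this isoform.
            allele_total = sum(allele_w[(gene, isoform, a)] for a in alleles)
            for allele in alleles:
                p_allele = allele_w[(gene, isoform, allele)] / allele_total
                shares[(gene, isoform, allele)] = p_gene * p_iso * p_allele
    return shares


# Toy example: a read compatible with two alleles of G1/T1 and one allele of G2/T2.
example = allocate_read(
    [("G1", "T1", "A"), ("G1", "T1", "B"), ("G2", "T2", "A")],
    gene_w={"G1": 3.0, "G2": 1.0},
    isoform_w={("G1", "T1"): 1.0, ("G2", "T2"): 1.0},
    allele_w={("G1", "T1", "A"): 1.0, ("G1", "T1", "B"): 3.0, ("G2", "T2", "A"): 2.0},
)
```

Because each level is normalized only over the candidates the read is actually compatible with, the shares for one read always sum to 1. In an expectation-maximization setting, a step like this would be repeated for every read and the abundances re-estimated from the accumulated shares.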

Van den Berge K, Hembach KM, Soneson C, Tiberi S, Clement L, Love MI, Patro R, Robinson MD. RNA Sequencing Data: Hitchhiker's Guide to Expression Analysis. Annu Rev Biomed Data Sci 2019. [DOI: 10.1146/annurev-biodatasci-072018-021255] [Citation(s) in RCA: 71] [Impact Index Per Article: 14.2] [Indexed: 11/09/2022]

Karimzadeh M, Ernst C, Kundaje A, Hoffman MM. Umap and Bismap: quantifying genome and methylome mappability. Nucleic Acids Res 2019;46:e120. [PMID: 30169659 PMCID: PMC6237805 DOI: 10.1093/nar/gky677] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Received: 06/14/2017] [Accepted: 07/22/2018] [Indexed: 11/14/2022] Open

Duan JE, Jiang ZC, Alqahtani F, Mandoiu I, Dong H, Zheng X, Marjani SL, Chen J, Tian XC. Methylome Dynamics of Bovine Gametes and in vivo Early Embryos. Front Genet 2019;10:512. [PMID: 31191619 PMCID: PMC6546829 DOI: 10.3389/fgene.2019.00512] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Received: 03/20/2019] [Accepted: 05/10/2019] [Indexed: 01/12/2023] Open

Mangul S, Martin LS, Hill BL, Lam AKM, Distler MG, Zelikovsky A, Eskin E, Flint J. Systematic benchmarking of omics computational tools. Nat Commun 2019;10:1393. [PMID: 30918265 PMCID: PMC6437167 DOI: 10.1038/s41467-019-09406-4] [Citation(s) in RCA: 86] [Impact Index Per Article: 17.2] [Received: 07/23/2018] [Accepted: 03/06/2019] [Indexed: 01/11/2023] Open

Moussa M, Măndoiu II. Locality Sensitive Imputation for Single Cell RNA-Seq Data. J Comput Biol 2019;26:822-835. [PMID: 30785309 DOI: 10.1089/cmb.2018.0236] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 11/12/2022] Open

McCurdy SR, Ntranos V, Pachter L. Deterministic column subset selection for single-cell RNA-Seq. PLoS One 2019;14:e0210571. [PMID: 30682053 PMCID: PMC6347249 DOI: 10.1371/journal.pone.0210571] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2018] [Accepted: 12/26/2018] [Indexed: 12/02/2022] Open

A discriminative learning approach to differential expression analysis for single-cell RNA-seq. Nat Methods 2019;16:163-166. [PMID: 30664774 DOI: 10.1038/s41592-018-0303-9] [Citation(s) in RCA: 89] [Impact Index Per Article: 17.8] [Received: 07/06/2018] [Accepted: 12/13/2018] [Indexed: 12/16/2022]

Duan JE, Flock K, Jue N, Zhang M, Jones A, Seesi SA, Mandoiu I, Pillai S, Hoffman M, O'Neill R, Zinn S, Govoni K, Reed S, Jiang H, Jiang ZC, Tian XC. Dosage Compensation and Gene Expression of the X Chromosome in Sheep. G3 (Bethesda) 2019;9:305-314. [PMID: 30482800 PMCID: PMC6325915 DOI: 10.1534/g3.118.200815] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 10/17/2018] [Accepted: 11/26/2018] [Indexed: 12/20/2022]

Duan JE, Shi W, Jue NK, Jiang Z, Kuo L, O'Neill R, Wolf E, Dong H, Zheng X, Chen J, Tian XC. Dosage Compensation of the X Chromosomes in Bovine Germline, Early Embryos, and Somatic Tissues. Genome Biol Evol 2019;11:242-252. [PMID: 30566637 PMCID: PMC6354180 DOI: 10.1093/gbe/evy270] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Accepted: 12/12/2018] [Indexed: 12/15/2022] Open

Duan JE, Zhang M, Flock K, Seesi SA, Mandoiu I, Jones A, Johnson E, Pillai S, Hoffman M, McFadden K, Jiang H, Reed S, Govoni K, Zinn S, Jiang Z, Tian XC. Effects of maternal nutrition on the expression of genomic imprinted genes in ovine fetuses. Epigenetics 2018;13:793-807. [PMID: 30051747 PMCID: PMC6224220 DOI: 10.1080/15592294.2018.1503489] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 03/20/2018] [Revised: 07/04/2018] [Accepted: 07/15/2018] [Indexed: 12/27/2022] Open

Event Analysis: Using Transcript Events To Improve Estimates of Abundance in RNA-seq Data. G3 (Bethesda) 2018;8:2923-2940. [PMID: 30021829 PMCID: PMC6118309 DOI: 10.1534/g3.118.200373] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 02/07/2023]

Abstract

Alternative splicing leverages genomic content by allowing the synthesis of multiple transcripts and, by implication, protein isoforms, from a single gene. However, estimating the abundance of transcripts produced in a given tissue from short sequencing reads is difficult and can result in both the construction of transcripts that do not exist, and the failure to identify true transcripts. An alternative approach is to catalog the events that make up isoforms (splice junctions and exons). We present here the Event Analysis (EA) approach, where we project transcripts onto the genome and identify overlapping/unique regions and junctions. In addition, all possible logical junctions are assembled into a catalog. Transcripts are filtered before quantitation based on simple measures: the proportion of the events detected, and the coverage. We find that mapping to a junction catalog is more efficient at detecting novel junctions than mapping in a splice-aware manner. We identify 99.8% of true transcripts while iReckon identifies 82% of the true transcripts and creates more transcripts not included in the simulation than were initially used in the simulation. Using PacBio Iso-seq data from a mouse neural progenitor cell model, EA detects 60% of the novel junctions that are combinations of existing exons while only 43% are detected by STAR. EA further detects ∼5,000 annotated junctions missed by STAR. Filtering transcripts based on the proportion of the transcript detected and the number of reads on average supporting that transcript captures 95% of the PacBio transcriptome. Filtering the reference transcriptome before quantitation results in a more stable estimate of isoform abundance, with improved correlation between replicates.
This was particularly evident when EA was applied to an RNA-seq study of type 1 diabetes (T1D), where the coefficient of variation among subjects (n = 81) in the transcript abundance estimates was substantially reduced compared to the estimation using the full reference. EA focuses on individual transcriptional events. These events can be quantitated and analyzed directly or used to identify the probable set of expressed transcripts. Simple rules based on detected events and coverage used in filtering result in a dramatic improvement in isoform estimation without the use of ancillary data (e.g., ChIP, long reads) that may not be available for many studies.
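
The filtering rule described in this abstract (keep a transcript only if enough of its events are detected and its average coverage is high enough) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Transcript` class, the `filter_transcripts` function, the event identifiers, and both threshold values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    name: str
    events: list[str]  # IDs of the exonic regions and junctions that make up the transcript


def filter_transcripts(transcripts, event_depth, min_prop=0.75, min_mean_depth=5.0):
    """Return names of transcripts passing two simple filters.

    event_depth maps an event ID to its average read depth; events that are
    missing from the mapping are treated as undetected (depth 0).
    The thresholds are illustrative assumptions, not values from the paper.
    """
    kept = []
    for t in transcripts:
        depths = [event_depth.get(e, 0.0) for e in t.events]
        if not depths:
            continue  # a transcript with no events cannot be evaluated
        prop_detected = sum(d > 0 for d in depths) / len(depths)
        mean_depth = sum(depths) / len(depths)
        if prop_detected >= min_prop and mean_depth >= min_mean_depth:
            kept.append(t.name)
    return kept


# Toy data: transcript A has all events covered, B has only one of three.
toy_transcripts = [
    Transcript("A", ["e1", "e2", "j1"]),
    Transcript("B", ["e1", "e3", "j2"]),
]
toy_depth = {"e1": 10.0, "e2": 8.0, "j1": 6.0}
kept_names = filter_transcripts(toy_transcripts, toy_depth)
```

In this toy case only transcript A would be passed on to quantitation, which mirrors the idea of shrinking the reference before estimating isoform abundance.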

Papastamoulis P, Rattray M. Bayesian estimation of differential transcript usage from RNA-seq data. Stat Appl Genet Mol Biol 2018;16:367-386. [PMID: 29091583 DOI: 10.1515/sagmb-2017-0005] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]

Schaeffer L, Pimentel H, Bray N, Melsted P, Pachter L. Pseudoalignment for metagenomic read assignment. Bioinformatics 2018;33:2082-2088. [PMID: 28334086 DOI: 10.1093/bioinformatics/btx106] [Citation(s) in RCA: 44] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2016] [Accepted: 02/17/2017] [Indexed: 12/13/2022] Open

Mangul S, Yang HT, Strauli N, Gruhl F, Porath HT, Hsieh K, Chen L, Daley T, Christenson S, Wesolowska-Andersen A, Spreafico R, Rios C, Eng C, Smith AD, Hernandez RD, Ophoff RA, Santana JR, Levanon EY, Woodruff PG, Burchard E, Seibold MA, Shifman S, Eskin E, Zaitlen N. ROP: dumpster diving in RNA-sequencing to find the source of 1 trillion reads across diverse adult human tissues. Genome Biol 2018;19:36. [PMID: 29548336 PMCID: PMC5857127 DOI: 10.1186/s13059-018-1403-7] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2017] [Accepted: 02/02/2018] [Indexed: 11/22/2022] Open

Affiliation(s)

Serghei Mangul Department of Computer Science, University of California, Los Angeles, CA, USA.,Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, CA, USA
Harry Taegyun Yang Department of Computer Science, University of California, Los Angeles, CA, USA
Nicolas Strauli Biomedical Sciences Graduate Program, University of California, San Francisco, CA, USA
Franziska Gruhl Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland.,SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
Hagit T Porath The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, Israel
Kevin Hsieh Department of Computer Science, University of California, Los Angeles, CA, USA
Linus Chen Department of Bioengineering, University of California, Los Angeles, CA, USA
Timothy Daley Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA
Stephanie Christenson Division of Pulmonary, Critical Care, Sleep and Allergy, Department of Medicine, and Cardiovascular Research Institute, University of California, San Francisco, CA, USA
Agata Wesolowska-Andersen Center for Genes, Environment, and Health, National Jewish Health, Denver, CO, USA
Roberto Spreafico Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, CA, USA
Cydney Rios Center for Genes, Environment, and Health, National Jewish Health, Denver, CO, USA
Celeste Eng Department of Medicine, University of California, San Francisco, CA, USA
Andrew D Smith Molecular and Computational Biology, Department of Biological Sciences, University of Southern California, Los Angeles, CA, USA
Ryan D Hernandez Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, USA.,Institute for Quantitative Biosciences, University of California, San Francisco, CA, USA.,Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA
Roel A Ophoff Center for Neurobehavioral Genetics, Semel Institute for Neuroscience and Human Behavior, University of California, Los Angeles, CA, USA.,Department of Human Genetics, University of California, Los Angeles, CA, USA.,Department of Psychiatry, Brain Center Rudolf Magnus, University Medical Center Utrecht, Utrecht, The Netherlands
Jose Rodriguez Santana Centro de Neumología Pediátrica, San Juan, Puerto Rico
Erez Y Levanon The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, Israel
Prescott G Woodruff Division of Pulmonary, Critical Care, Sleep and Allergy, Department of Medicine, and Cardiovascular Research Institute, University of California, San Francisco, CA, USA
Esteban Burchard Schools of Pharmacy and Medicine, Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA, USA
Max A Seibold Department of Pediatrics, National Jewish Health, Denver, CO, USA.,University of Colorado School of Medicine, Denver, CO, USA
Sagiv Shifman Department of Genetics, The Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
Eleazar Eskin Department of Computer Science, University of California, Los Angeles, CA, USA.,Department of Human Genetics, University of California, Los Angeles, CA, USA
Noah Zaitlen Division of Pulmonary, Critical Care, Sleep and Allergy, Department of Medicine, and Cardiovascular Research Institute, University of California, San Francisco, CA, USA.

Li HL, Lin HR, Xia JH. Differential Gene Expression Profiles and Alternative Isoform Regulations in Gill of Nile Tilapia in Response to Acute Hypoxia. MARINE BIOTECHNOLOGY (NEW YORK, N.Y.) 2017;19:551-562. [PMID: 28920148 DOI: 10.1007/s10126-017-9774-4] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/19/2017] [Accepted: 07/27/2017] [Indexed: 06/07/2023]

Zhang P, He D, Xu Y, Hou J, Pan BF, Wang Y, Liu T, Davis CM, Ehli EA, Tan L, Zhou F, Hu J, Yu Y, Chen X, Nguyen TM, Rosen JM, Hawke DH, Ji Z, Chen Y. Genome-wide identification and differential analysis of translational initiation. Nat Commun 2017;8:1749. [PMID: 29170441 PMCID: PMC5701008 DOI: 10.1038/s41467-017-01981-8] [Citation(s) in RCA: 83] [Impact Index Per Article: 11.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2017] [Accepted: 10/31/2017] [Indexed: 01/28/2023] Open

Affiliation(s)

Peng Zhang Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
Dandan He Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
Yi Xu Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
Jiakai Hou Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
Bih-Fang Pan Proteomics and Metabolomics Facility, and Department of Systems Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
Yunfei Wang Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
Tao Liu Department of Biochemistry, State University of New York at Buffalo, Buffalo, NY, 14203, USA
Christel M Davis Avera Institute for Human Genetics, Sioux Falls, SD, 57108, USA
Erik A Ehli Avera Institute for Human Genetics, Sioux Falls, SD, 57108, USA
Lin Tan Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
Feng Zhou Liver Cancer Institute, Zhongshan Hospital, Key Laboratory of Carcinogenesis and Cancer Invasion, Ministry of Education, and Institutes of Biomedical Sciences, Fudan University, Shanghai, 200032, China
Jian Hu Department of Cancer Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, 77054, USA
Yonghao Yu Department of Biochemistry, The University of Texas Southwestern Medical Center, Dallas, TX, 75390, USA
Xi Chen Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX, 77030, USA
Tuan M Nguyen Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX, 77030, USA.,Program in Translational Biology and Molecular Medicine, Baylor College of Medicine, Houston, TX, 77030, USA
Jeffrey M Rosen Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX, 77030, USA
David H Hawke Proteomics and Metabolomics Facility, and Department of Systems Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
Zhe Ji Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA, 02115, USA.,Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
Yiwen Chen Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA.

Srivastava A, Sarkar H, Gupta N, Patro R. RapMap: a rapid, sensitive and accurate tool for mapping RNA-seq reads to transcriptomes. Bioinformatics 2017;32:i192-i200. [PMID: 27307617 PMCID: PMC4908361 DOI: 10.1093/bioinformatics/btw277] [Citation(s) in RCA: 66] [Impact Index Per Article: 9.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022] Open

Zakeri M, Srivastava A, Almodaresi F, Patro R. Improved data-driven likelihood factorizations for transcript abundance estimation. Bioinformatics 2017;33:i142-i151. [PMID: 28881996 PMCID: PMC5870700 DOI: 10.1093/bioinformatics/btx262] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/18/2022] Open

Abstract

MOTIVATION

Many methods for transcript-level abundance estimation reduce the computational burden associated with the iterative algorithms they use by adopting an approximate factorization of the likelihood function they optimize. This leads to considerably faster convergence of the optimization procedure, since each round of, e.g., the EM algorithm can execute much more quickly. However, these approximate factorizations of the likelihood function simplify calculations at the expense of discarding certain information that can be useful for accurate transcript abundance estimation.

RESULTS

We demonstrate that model simplifications (i.e. factorizations of the likelihood function) adopted by certain abundance estimation methods can lead to a diminished ability to accurately estimate the abundances of highly related transcripts. In particular, considering factorizations based on transcript-fragment compatibility alone can result in a loss of accuracy compared to the per-fragment, unsimplified model. However, we show that such shortcomings are not an inherent limitation of approximately factorizing the underlying likelihood function. By considering the appropriate conditional fragment probabilities, and adopting improved, data-driven factorizations of this likelihood, we demonstrate that such approaches can achieve accuracy nearly indistinguishable from methods that consider the complete (i.e. per-fragment) likelihood, while retaining the computational efficiency of the compatibility-based factorizations.

AVAILABILITY AND IMPLEMENTATION

Our data-driven factorizations are incorporated into a branch of the Salmon transcript quantification tool: https://github.com/COMBINE-lab/salmon/tree/factorizations .

CONTACT

rob.patro@cs.stonybrook.edu.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
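
To make the compatibility-based factorization discussed in this abstract concrete, the sketch below runs EM over equivalence classes (groups of fragments compatible with the same set of transcripts) rather than over individual fragments. It is a simplified illustration and not Salmon's code: the function name `em_equivalence_classes` and the toy counts are hypothetical, and the sketch deliberately ignores transcript effective lengths and the conditional fragment probabilities that the abstract identifies as the information lost by this simplification.

```python
def em_equivalence_classes(eq_classes, n_transcripts, n_iter=200):
    """Estimate the fraction of fragments originating from each transcript.

    eq_classes maps a tuple of transcript indices (the transcripts a group
    of fragments is compatible with) to the number of fragments in the group.
    Every fragment in a class is treated identically, which is exactly the
    compatibility-only approximation.
    """
    total = sum(eq_classes.values())
    theta = [1.0 / n_transcripts] * n_transcripts  # uniform starting point
    for _ in range(n_iter):
        expected = [0.0] * n_transcripts
        # E-step: split each class count among its transcripts in proportion to theta.
        for members, count in eq_classes.items():
            denom = sum(theta[t] for t in members)
            for t in members:
                expected[t] += count * theta[t] / denom
        # M-step: renormalize the expected counts.
        theta = [e / total for e in expected]
    return theta


# Toy data: 60 fragments unique to transcript 0, 20 unique to transcript 1,
# and 20 compatible with both.
toy_classes = {(0,): 60, (1,): 20, (0, 1): 20}
theta_hat = em_equivalence_classes(toy_classes, n_transcripts=2)
```

Each iteration costs time proportional to the number of equivalence classes instead of the number of fragments, which is the source of the speed-up the abstract refers to.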

Endocannabinoid system acts as a regulator of immune homeostasis in the gut. Proc Natl Acad Sci U S A 2017;114:5005-5010. [PMID: 28439004 DOI: 10.1073/pnas.1612177114] [Citation(s) in RCA: 99] [Impact Index Per Article: 14.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/02/2023] Open

Patro R, Duggal G, Love MI, Irizarry RA, Kingsford C. Salmon provides fast and bias-aware quantification of transcript expression. Nat Methods 2017;14:417-419. [PMID: 28263959 PMCID: PMC5600148 DOI: 10.1038/nmeth.4197] [Citation(s) in RCA: 5930] [Impact Index Per Article: 847.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/29/2016] [Accepted: 01/22/2017] [Indexed: 12/12/2022]

Salmon provides fast and bias-aware quantification of transcript expression. Nat Methods 2017. [PMID: 28263959 DOI: 10.1038/nmeth.4197] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]

Papastamoulis P, Rattray M. A Bayesian model selection approach for identifying differentially expressed transcripts from RNA sequencing data. J R Stat Soc Ser C Appl Stat 2017;67:3-23. [PMID: 29353941 PMCID: PMC5763373 DOI: 10.1111/rssc.12213] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/10/2023]

Williams CR, Baccarella A, Parrish JZ, Kim CC. Empirical assessment of analysis workflows for differential expression analysis of human samples using RNA-Seq. BMC Bioinformatics 2017;18:38. [PMID: 28095772 PMCID: PMC5240434 DOI: 10.1186/s12859-016-1457-z] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/26/2016] [Accepted: 12/31/2016] [Indexed: 02/07/2023] Open

Abstract

BACKGROUND

RNA-Seq has supplanted microarrays as the preferred method of transcriptome-wide identification of differentially expressed genes. However, RNA-Seq analysis is still rapidly evolving, with a large number of tools available for each of the three major processing steps: read alignment, expression modeling, and identification of differentially expressed genes. Although some studies have benchmarked these tools against gold standard gene expression sets, few have evaluated their performance in concert with one another. Additionally, there is a general lack of testing of such tools on real-world, physiologically relevant datasets, which often possess qualities not reflected in tightly controlled reference RNA samples or synthetic datasets.

RESULTS

Here, we evaluate 219 combinatorial implementations of the most commonly used analysis tools for their impact on differential gene expression analysis by RNA-Seq. A test dataset was generated using highly purified human classical and nonclassical monocyte subsets from a clinical cohort, allowing us to evaluate the performance of 495 unique workflows, when accounting for differences in expression units and gene- versus transcript-level estimation. We find that the choice of methodologies leads to wide variation in the number of genes called significant, as well as in performance as gauged by precision and recall, calculated by comparing our RNA-Seq results to those from four previously published microarray and BeadChip analyses of the same cell populations. The method of differential gene expression identification exhibited the strongest impact on performance, with smaller impacts from the choice of read aligner and expression modeler. Many workflows were found to exhibit similar overall performance, but with differences in their calibration, with some biased toward higher precision and others toward higher recall.

CONCLUSIONS

There is significant heterogeneity in the performance of RNA-Seq workflows to identify differentially expressed genes. Among the higher performing workflows, different workflows exhibit a precision/recall tradeoff, and the ultimate choice of workflow should take into consideration how the results will be used in subsequent applications. Our analyses highlight the performance characteristics of these workflows, and the data generated in this study could also serve as a useful resource for future development of software for RNA-Seq analysis.
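
The precision and recall figures mentioned in this abstract compare the genes a workflow calls significant against a reference set. A minimal sketch of that calculation follows; the function name `precision_recall` and the gene labels are hypothetical placeholders, not data from the study.

```python
def precision_recall(called, reference):
    """Compute precision and recall of a called gene set against a reference set.

    precision = true positives / genes called
    recall    = true positives / genes in the reference
    Empty inputs yield 0.0 rather than raising a division error.
    """
    called, reference = set(called), set(reference)
    true_positives = len(called & reference)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Toy example: 4 genes called by a workflow, 3 genes in the reference set.
toy_called = ["geneA", "geneB", "geneC", "geneD"]
toy_reference = ["geneA", "geneB", "geneE"]
toy_precision, toy_recall = precision_recall(toy_called, toy_reference)
```

A workflow that calls fewer genes tends to raise precision at the cost of recall, which is the tradeoff the conclusions describe.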

Love MI, Hogenesch JB, Irizarry RA. Modeling of RNA-seq fragment sequence bias reduces systematic errors in transcript abundance estimation. Nat Biotechnol 2016. [PMID: 27669167 DOI: 10.1101/025767] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/02/2023]

Modeling of RNA-seq fragment sequence bias reduces systematic errors in transcript abundance estimation. Nat Biotechnol 2016;34:1287-1291. [PMID: 27669167 PMCID: PMC5143225 DOI: 10.1038/nbt.3682] [Citation(s) in RCA: 100] [Impact Index Per Article: 12.5] [Reference Citation Analysis] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/01/2015] [Accepted: 08/22/2016] [Indexed: 11/17/2022]

Karunakaran DKP, Al Seesi S, Banday AR, Baumgartner M, Olthof A, Lemoine C, Măndoiu II, Kanadia RN. Network-based bioinformatics analysis of spatio-temporal RNA-Seq data reveals transcriptional programs underpinning normal and aberrant retinal development. BMC Genomics 2016;17 Suppl 5:495. [PMID: 27586787 PMCID: PMC5009874 DOI: 10.1186/s12864-016-2822-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022] Open

Abstract

Background

The retina, as a model system with extensive information on genes involved in development/maintenance, is of great value for investigations employing deep sequencing to capture transcriptome changes over time. This in turn could enable us to find patterns in gene expression across time that reveal transitions in biological processes.

Methods

We developed a bioinformatics pipeline to categorize genes based on their differential expression and their alternative splicing status across time by binning genes based on their transcriptional kinetics. Genes within the same bins were then leveraged to query gene annotation databases to discover molecular programs employed by the developing retina.

Results

Using our pipeline on RNA-Seq data obtained from fractionated (nucleus/cytoplasm) developing retina at embryonic day (E) 16 and postnatal day (P) 0, we captured high-resolution differences, such as those between the cytoplasm and the nucleus at the same developmental time. We found de novo transcription of genes whose transcripts were exclusively found in the nuclear transcriptome at P0. Further analysis showed that these genes were enriched for functions that are known to be executed during postnatal development, thus showing that the P0 nuclear transcriptome is temporally ahead of that of its cytoplasm. We extended our strategy to perform temporal analysis comparing P0 data to either P21-Nrl-wildtype (WT) or P21-Nrl-knockout (KO) retinae, which predicted that the KO retina would have compromised vasculature. Indeed, histological manifestation of vasodilation has been reported at a later time point (P60).

Conclusions

Thus, our approach was predictive of a phenotype before it presented histologically. Our strategy can be extended to investigating the development and/or disease progression of other tissue types.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-016-2822-z) contains supplementary material, which is available to authorized users.

Huang Y, Sanguinetti G. Statistical modeling of isoform splicing dynamics from RNA-seq time series data. Bioinformatics 2016;32:2965-72. [PMID: 27318208 DOI: 10.1093/bioinformatics/btw364] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/19/2016] [Accepted: 06/05/2016] [Indexed: 01/08/2023] Open

Yuan Y, Xu H, Leung RKK. An optimized protocol for generation and analysis of Ion Proton sequencing reads for RNA-Seq. BMC Genomics 2016;17:403. [PMID: 27229683 PMCID: PMC4880854 DOI: 10.1186/s12864-016-2745-8] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/12/2015] [Accepted: 05/14/2016] [Indexed: 11/28/2022] Open

Ntranos V, Kamath GM, Zhang JM, Pachter L, Tse DN. Fast and accurate single-cell RNA-seq analysis by clustering of transcript-compatibility counts. Genome Biol 2016;17:112. [PMID: 27230763 PMCID: PMC4881296 DOI: 10.1186/s13059-016-0970-8] [Citation(s) in RCA: 72] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/24/2016] [Accepted: 04/29/2016] [Indexed: 12/17/2022] Open

Lin Z, Li M, Sestan N, Zhao H. A Markov random field-based approach for joint estimation of differentially expressed genes in mouse transcriptome data. Stat Appl Genet Mol Biol 2016;15:139-50. [PMID: 26926866 PMCID: PMC5587217 DOI: 10.1515/sagmb-2015-0070] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]