Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Oberg AL, Bot BM, Grill DE, Poland GA, Therneau TM. Technical and biological variance structure in mRNA-Seq data: life in the real world. BMC Genomics 2012;13:304. [PMID: 22769017 PMCID: PMC3505161 DOI: 10.1186/1471-2164-13-304] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Received: 01/20/2012] [Accepted: 07/07/2012] [Indexed: 01/14/2023] Open



Cited by Other Article(s)

Williams EC, Chazarra-Gil R, Shahsavari A, Mohorianu I. The Sum of Two Halves May Be Different from the Whole-Effects of Splitting Sequencing Samples Across Lanes. Genes (Basel) 2022;13:2265. [PMID: 36553532 PMCID: PMC9777937 DOI: 10.3390/genes13122265] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Revised: 11/23/2022] [Accepted: 11/25/2022] [Indexed: 12/03/2022] Open

Lozoya OA, McClelland KS, Papas BN, Li JL, Yao HHC. Patterns, Profiles, and Parsimony: Dissecting Transcriptional Signatures From Minimal Single-Cell RNA-Seq Output With SALSA. Front Genet 2020;11:511286. [PMID: 33193599 PMCID: PMC7586319 DOI: 10.3389/fgene.2020.511286] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/11/2019] [Accepted: 09/18/2020] [Indexed: 11/23/2022] Open

Abstract

Single-cell RNA sequencing (scRNA-seq) technologies have precipitated the development of bioinformatic tools to reconstruct cell lineage specification and differentiation processes with single-cell precision. However, current start-up costs and recommended data volumes for statistical analysis remain prohibitively expensive, preventing scRNA-seq technologies from becoming mainstream. Here, we introduce single-cell amalgamation by latent semantic analysis (SALSA), a versatile workflow that combines measurement reliability metrics with latent variable extraction to infer robust expression profiles from ultra-sparse sc-RNAseq data. SALSA uses a matrix focusing approach that starts by identifying facultative genes with expression levels greater than experimental measurement precision and ends with cell clustering based on a minimal set of Profiler genes, each one a putative biomarker of cluster-specific expression profiles. To benchmark how SALSA performs in experimental settings, we used the publicly available 10X Genomics PBMC 3K dataset, a pre-curated silver standard from human frozen peripheral blood comprising 2,700 single-cell barcodes, and identified 7 major cell groups matching transcriptional profiles of peripheral blood cell types and driven agnostically by < 500 Profiler genes. Finally, we demonstrate successful implementation of SALSA in a replicative scRNA-seq scenario by using previously published DropSeq data from a multi-batch mouse retina experimental design, thereby identifying 10 transcriptionally distinct cell types from > 64,000 single cells across 7 independent biological replicates based on < 630 Profiler genes. With these results, SALSA demonstrates that robust pattern detection from scRNA-seq expression matrices only requires a fraction of the accrued data, suggesting that single-cell sequencing technologies can become affordable and widespread if meant as hypothesis-generation tools to extract large-scale differential expression effects.


Kuzmin DA, Feranchuk SI, Sharov VV, Cybin AN, Makolov SV, Putintseva YA, Oreshkova NV, Krutovsky KV. Stepwise large genome assembly approach: a case of Siberian larch (Larix sibirica Ledeb). BMC Bioinformatics 2019;20:37. [PMID: 30717661 PMCID: PMC6362582 DOI: 10.1186/s12859-018-2570-y] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Indexed: 12/20/2022] Open

Affiliation(s)

Dmitry A Kuzmin: Laboratory of Forest Genomics, Genome Research and Education Center, Siberian Federal University, 660036, Krasnoyarsk, Russia; Department of High Performance Computing, Institute of Space and Information Technologies, Siberian Federal University, 660074, Krasnoyarsk, Russia.
Sergey I Feranchuk: Laboratory of Forest Genomics, Genome Research and Education Center, Siberian Federal University, 660036, Krasnoyarsk, Russia; Department of Informatics, National Research Technical University, 664074, Irkutsk, Russia; Limnological Institute, Siberian Branch of Russian Academy of Sciences, 664033, Irkutsk, Russia.
Vadim V Sharov: Laboratory of Forest Genomics, Genome Research and Education Center, Siberian Federal University, 660036, Krasnoyarsk, Russia; Department of High Performance Computing, Institute of Space and Information Technologies, Siberian Federal University, 660074, Krasnoyarsk, Russia.
Alexander N Cybin: Laboratory of Forest Genomics, Genome Research and Education Center, Siberian Federal University, 660036, Krasnoyarsk, Russia; Department of High Performance Computing, Institute of Space and Information Technologies, Siberian Federal University, 660074, Krasnoyarsk, Russia.
Stepan V Makolov: Laboratory of Forest Genomics, Genome Research and Education Center, Siberian Federal University, 660036, Krasnoyarsk, Russia; Department of High Performance Computing, Institute of Space and Information Technologies, Siberian Federal University, 660074, Krasnoyarsk, Russia.
Yuliya A Putintseva: Laboratory of Forest Genomics, Genome Research and Education Center, Siberian Federal University, 660036, Krasnoyarsk, Russia.
Natalya V Oreshkova: Laboratory of Forest Genomics, Genome Research and Education Center, Siberian Federal University, 660036, Krasnoyarsk, Russia; Laboratory of Forest Genetics and Selection, V. N. Sukachev Institute of Forest, Siberian Branch of Russian Academy of Sciences, 660036, Krasnoyarsk, Russia.
Konstantin V Krutovsky: Laboratory of Forest Genomics, Genome Research and Education Center, Siberian Federal University, 660036, Krasnoyarsk, Russia; Department of Forest Genetics and Forest Tree Breeding, Georg-August University of Göttingen, 37077, Göttingen, Germany; Laboratory of Population Genetics, N. I. Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, 119333, Russia; Department of Ecosystem Science and Management, Texas A&M University, College Station, TX, 77843-2138, USA.


Expression analysis of RNA sequencing data from human neural and glial cell lines depends on technical replication and normalization methods. BMC Bioinformatics 2018;19:412. [PMID: 30453873 PMCID: PMC6245503 DOI: 10.1186/s12859-018-2382-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/21/2022] Open

Abstract

Background

The potential for astrocyte participation in central nervous system recovery is highlighted by in vitro experiments demonstrating their capacity to transdifferentiate into neurons. Understanding astrocyte plasticity could be advanced by comparing astrocytes with stem cells. RNA sequencing (RNA-seq) is ideal for comparing differences across cell types. However, this novel multi-stage process has the potential to introduce unwanted technical variation at several points in the experimental workflow. Quantitative understanding of the contribution of experimental parameters to technical variation would facilitate the design of robust RNA-Seq experiments.

Results

RNA-Seq was used to achieve biological and technical objectives. The biological aspect compared gene expression between normal human fetal-derived astrocytes and human neural stem cells cultured in identical conditions. When differential expression threshold criteria of |log₂fold change| > 2 were applied to the data, no significant differences were observed. The technical component quantified variation arising from particular steps in the research pathway, and compared the ability of different normalization methods to reduce unwanted variance. To facilitate this objective, a liberal false discovery rate of 10% and a |log₂fold change| > 0.5 were implemented for the differential expression threshold. Data were normalized with RPKM, TMM, and UQS methods using JMP Genomics. The contributions of key replicable experimental parameters (cell lot; library preparation; flow cell) to variance in the data were evaluated using principal variance component analysis. Our analysis showed that, although the variance for every parameter is strongly influenced by the normalization method, the largest contributor to technical variance was library preparation. The ability to detect differentially expressed genes was also affected by normalization; differences were only detected in non-normalized and TMM-normalized data.

Conclusions

The similarity in gene expression between astrocytes and neural stem cells supports the potential for astrocytic transdifferentiation into neurons, and emphasizes the need to evaluate the therapeutic potential of astrocytes for central nervous system damage. The choice of normalization method influences the contributions to experimental variance as well as the outcomes of differential expression analysis. However, irrespective of the normalization method, our findings illustrate that library preparation contributed the largest component of technical variance.

Electronic supplementary material

The online version of this article (10.1186/s12859-018-2382-0) contains supplementary material, which is available to authorized users.


Veras PST, Ramos PIP, de Menezes JPB. In Search of Biomarkers for Pathogenesis and Control of Leishmaniasis by Global Analyses of Leishmania-Infected Macrophages. Front Cell Infect Microbiol 2018;8:326. [PMID: 30283744 PMCID: PMC6157484 DOI: 10.3389/fcimb.2018.00326] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 06/18/2018] [Accepted: 08/27/2018] [Indexed: 12/12/2022] Open

Abstract

Leishmaniasis is a vector-borne, neglected tropical disease with a worldwide distribution that can present in a variety of clinical forms, depending on the parasite species and host genetic background. The pathogenesis of this disease remains far from being elucidated because the involvement of a complex immune response orchestrated by host cells significantly affects the clinical outcome. Among these cells, macrophages are the main host cells and produce cytokines and chemokines, thereby triggering events that contribute to the mediation of the host immune response and, subsequently, to the establishment of infection or, alternatively, disease control. There has been relatively limited commercial interest in developing new pharmaceutical compounds to treat leishmaniasis. Moreover, advances in the understanding of the underlying biology of Leishmania spp. have not translated into the development of effective new chemotherapeutic compounds. As a result, biomarkers as surrogate disease endpoints present several potential advantages to be used in the identification of targets capable of facilitating therapeutic interventions considered to ameliorate disease outcome. More recently, large-scale genomic and proteomic analyses have allowed the identification and characterization of the pathways involved in the infection process in both parasites and the host, and these analyses have been shown to be more effective than studying individual molecules to elucidate disease pathogenesis. RNA-seq and proteomics are large-scale approaches that characterize genes or proteins in a given cell line, tissue, or organism to provide a global and more integrated view of the myriad biological processes that occur within a cell than focusing on an individual gene or protein. Bioinformatics provides us with the means to computationally analyze and integrate the large volumes of data generated by high-throughput sequencing approaches.
The integration of genomic expression and proteomic data offers a rich multi-dimensional analysis, despite the inherent technical and statistical challenges. We propose that these types of global analyses facilitate the identification, among a large number of genes and proteins, of those that hold potential as biomarkers. The present review focuses on large-scale studies that have identified and evaluated relevant biomarkers in macrophages in response to Leishmania infection.


Pannala VR, Wall ML, Estes SK, Trenary I, O'Brien TP, Printz RL, Vinnakota KC, Reifman J, Shiota M, Young JD, Wallqvist A. Metabolic network-based predictions of toxicant-induced metabolite changes in the laboratory rat. Sci Rep 2018;8:11678. [PMID: 30076366 PMCID: PMC6076258 DOI: 10.1038/s41598-018-30149-7] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 03/09/2018] [Accepted: 07/23/2018] [Indexed: 12/11/2022] Open

Affiliation(s)

Venkat R Pannala: Department of Defense Biotechnology High Performance Computing Software Applications Institute, Telemedicine and Advanced Technology Research Center, U.S. Army Medical Research and Materiel Command, Fort Detrick, MD, 21702, USA.
Martha L Wall: Department of Chemical and Biomolecular Engineering, Vanderbilt University School of Engineering, Nashville, TN, 37232, USA.
Shanea K Estes: Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, TN, 37232, USA.
Irina Trenary: Department of Chemical and Biomolecular Engineering, Vanderbilt University School of Engineering, Nashville, TN, 37232, USA.
Tracy P O'Brien: Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, TN, 37232, USA.
Richard L Printz: Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, TN, 37232, USA.
Kalyan C Vinnakota: Department of Defense Biotechnology High Performance Computing Software Applications Institute, Telemedicine and Advanced Technology Research Center, U.S. Army Medical Research and Materiel Command, Fort Detrick, MD, 21702, USA.
Jaques Reifman: Department of Defense Biotechnology High Performance Computing Software Applications Institute, Telemedicine and Advanced Technology Research Center, U.S. Army Medical Research and Materiel Command, Fort Detrick, MD, 21702, USA.
Masakazu Shiota: Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, TN, 37232, USA.
Jamey D Young: Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, TN, 37232, USA; Department of Chemical and Biomolecular Engineering, Vanderbilt University School of Engineering, Nashville, TN, 37232, USA.
Anders Wallqvist: Department of Defense Biotechnology High Performance Computing Software Applications Institute, Telemedicine and Advanced Technology Research Center, U.S. Army Medical Research and Materiel Command, Fort Detrick, MD, 21702, USA.


Lozoya OA, Santos JH, Woychik RP. A Leveraged Signal-to-Noise Ratio (LSTNR) Method to Extract Differentially Expressed Genes and Multivariate Patterns of Expression From Noisy and Low-Replication RNAseq Data. Front Genet 2018;9:176. [PMID: 29868123 PMCID: PMC5964166 DOI: 10.3389/fgene.2018.00176] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 11/22/2017] [Accepted: 04/27/2018] [Indexed: 12/11/2022] Open

Abstract

To life scientists, one important feature offered by RNAseq, a next-generation sequencing tool used to estimate changes in gene expression levels, lies in its unprecedented resolution. It can score countable differences in transcript numbers among thousands of genes and between experimental groups, all at once. However, its high cost limits experimental designs to very small sample sizes, usually N = 3, which often results in statistically underpowered analysis and poor reproducibility. All these issues are compounded by the presence of experimental noise, which is harder to distinguish from instrumental error when sample sizes are limiting (e.g., small-budget pilot tests), experimental populations exhibit biologically heterogeneous or diffuse expression phenotypes (e.g., patient samples), or when discriminating among transcriptional signatures of closely related experimental conditions (e.g., toxicological modes of action, or MOAs). Here, we present a leveraged signal-to-noise ratio (LSTNR) thresholding method, founded on generalized linear modeling (GLM) of aligned read detection limits to extract differentially expressed genes (DEGs) from noisy low-replication RNAseq data. The LSTNR method uses an agnostic independent filtering strategy to define the dynamic range of detected aggregate read counts per gene, and assigns statistical weights that prioritize genes with better sequencing resolution in differential expression analyses. 
To assess its performance, we implemented the LSTNR method to analyze three separate datasets: first, using a systematically noisy in silico dataset, we demonstrated that LSTNR can extract pre-designed patterns of expression and discriminate between "noise" and "true" differentially expressed pseudogenes at a 100% success rate; then, we illustrated how the LSTNR method can assign patient-derived breast cancer specimens correctly to one out of their four reported molecular subtypes (luminal A, luminal B, Her2-enriched and basal-like); and last, we showed the ability to retrieve five different modes of action (MOA) elicited in livers of rats exposed to three toxicants under three nutritional routes by using the LSTNR method. By combining differential measurements with resolving power to detect DEGs, the LSTNR method offers an alternative approach to interrogate noisy and low-replication RNAseq datasets, which handles multiple biological conditions at once, and defines benchmarks to validate RNAseq experiments with standard benchtop assays.


Genome-wide gene expression changes associated with exposure of rat liver, heart, and kidney cells to endosulfan. Toxicol In Vitro 2018;48:244-254. [DOI: 10.1016/j.tiv.2018.01.022] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 06/01/2017] [Revised: 01/25/2018] [Accepted: 01/27/2018] [Indexed: 02/06/2023]

Wang M, Uebbing S, Ellegren H. Bayesian Inference of Allele-Specific Gene Expression Indicates Abundant Cis-Regulatory Variation in Natural Flycatcher Populations. Genome Biol Evol 2017;9:1266-1279. [PMID: 28453623 PMCID: PMC5434935 DOI: 10.1093/gbe/evx080] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Accepted: 04/25/2017] [Indexed: 12/13/2022] Open

Huang WC, Ferris E, Cheng T, Hörndli CS, Gleason K, Tamminga C, Wagner JD, Boucher KM, Christian JL, Gregg C. Diverse Non-genetic, Allele-Specific Expression Effects Shape Genetic Architecture at the Cellular Level in the Mammalian Brain. Neuron 2017;93:1094-1109.e7. [PMID: 28238550 PMCID: PMC5774018 DOI: 10.1016/j.neuron.2017.01.033] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 05/29/2016] [Revised: 11/27/2016] [Accepted: 01/30/2017] [Indexed: 01/19/2023]

Gene signatures associated with adaptive humoral immunity following seasonal influenza A/H1N1 vaccination. Genes Immun 2016;17:371-379. [PMID: 27534615 PMCID: PMC5133148 DOI: 10.1038/gene.2016.34] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 12/15/2015] [Revised: 06/07/2016] [Accepted: 06/09/2016] [Indexed: 12/27/2022]

Haralambieva IH, Zimmermann MT, Ovsyannikova IG, Grill DE, Oberg AL, Kennedy RB, Poland GA. Whole Transcriptome Profiling Identifies CD93 and Other Plasma Cell Survival Factor Genes Associated with Measles-Specific Antibody Response after Vaccination. PLoS One 2016;11:e0160970. [PMID: 27529750 PMCID: PMC4987012 DOI: 10.1371/journal.pone.0160970] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 05/19/2016] [Accepted: 07/27/2016] [Indexed: 11/29/2022] Open

Yang EW, Jiang T. SDEAP: a splice graph based differential transcript expression analysis tool for population data. Bioinformatics 2016;32:3593-3602. [PMID: 27522083 DOI: 10.1093/bioinformatics/btw513] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 01/29/2016] [Revised: 07/21/2016] [Accepted: 07/28/2016] [Indexed: 12/26/2022] Open

Abstract

MOTIVATION

Differential transcript expression (DTE) analysis without predefined conditions is critical to biological studies. For example, it can be used to discover biomarkers to classify cancer samples into previously unknown subtypes such that better diagnosis and therapy methods can be developed for the subtypes. Although several DTE tools for population data, i.e. data without known biological conditions, have been published, these tools either assume binary conditions in the input population or require the number of conditions as a part of the input. Fixing the number of conditions to binary is unrealistic and may distort the results of a DTE analysis. Estimating the correct number of conditions in a population could also be challenging for a routine user. Moreover, the existing tools only provide differential usages of exons, which may be insufficient to interpret the patterns of alternative splicing across samples and restrains the applications of the tools from many biology studies.

RESULTS

We propose a novel DTE analysis algorithm, called SDEAP, that estimates the number of conditions directly from the input samples using a Dirichlet mixture model and discovers alternative splicing events using a new graph modular decomposition algorithm. By taking advantage of the above technical improvement, SDEAP was able to outperform the other DTE analysis methods in our extensive experiments on simulated data and real data with qPCR validation. The prediction of SDEAP also allowed us to classify the samples of cancer subtypes and cell-cycle phases more accurately.

AVAILABILITY AND IMPLEMENTATION

SDEAP is publicly available for free at https://github.com/ewyang089/SDEAP/wiki

CONTACT

yyang027@cs.ucr.edu; jiang@cs.ucr.edu

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.


Buschmann D, Haberberger A, Kirchner B, Spornraft M, Riedmaier I, Schelling G, Pfaffl MW. Toward reliable biomarker signatures in the age of liquid biopsies - how to standardize the small RNA-Seq workflow. Nucleic Acids Res 2016;44:5995-6018. [PMID: 27317696 PMCID: PMC5291277 DOI: 10.1093/nar/gkw545] [Citation(s) in RCA: 78] [Impact Index Per Article: 9.8] [Received: 04/12/2016] [Accepted: 06/03/2016] [Indexed: 12/21/2022] Open

Vincent AT, Derome N, Boyle B, Culley AI, Charette SJ. Next-generation sequencing (NGS) in the microbiological world: How to make the most of your money. J Microbiol Methods 2016;138:60-71. [PMID: 26995332 DOI: 10.1016/j.mimet.2016.02.016] [Citation(s) in RCA: 71] [Impact Index Per Article: 8.9] [Received: 09/30/2015] [Revised: 01/26/2016] [Accepted: 02/24/2016] [Indexed: 12/16/2022]

Zhen H, Krumins V, Fennell DE, Mainelis G. Development of a dual-internal-reference technique to improve accuracy when determining bacterial 16S rRNA:16S rRNA gene ratio with application to Escherichia coli liquid and aerosol samples. J Microbiol Methods 2015;117:113-21. [PMID: 26241659 DOI: 10.1016/j.mimet.2015.07.023] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 06/19/2015] [Revised: 07/27/2015] [Accepted: 07/27/2015] [Indexed: 01/04/2023]

Abstract

Accurate enumeration of rRNA content in microbial cells, e.g. by using the 16S rRNA:16S rRNA gene ratio, is critical to properly understand its relationship to microbial activities. However, few studies have considered possible methodological artifacts that may contribute to the variability of rRNA analysis results. In this study, a technique utilizing genomic DNA and 16S rRNA from an exogenous species (Pseudomonas fluorescens) as dual internal references was developed to improve accuracy when determining the 16S rRNA:16S rRNA gene ratio of a target organism, Escherichia coli. This technique was able to adequately control the variability in sample processing and analysis procedures due to nucleic acid (DNA and RNA) losses, inefficient reverse transcription of RNA, and inefficient PCR amplification. The measured 16S rRNA:16S rRNA gene ratio of E. coli increased by 2-3 fold when E. coli 16S rRNA gene and 16S rRNA quantities were normalized to the sample-specific fractional recoveries of reference (P. fluorescens) 16S rRNA gene and 16S rRNA, respectively. In addition, the intra-sample variation of this ratio, represented by coefficients of variation from replicate samples, decreased significantly after normalization. This technique was applied to investigate the temporal variation of 16S rRNA:16S rRNA gene ratio of E. coli during its non-steady-state growth in a complex liquid medium, and to E. coli aerosols when exposed to particle-free air after their collection on a filter. The 16S rRNA:16S rRNA gene ratio of E. coli increased significantly during its early exponential phase of growth; when E. coli aerosols were exposed to extended filtration stress after sample collection, the ratio also increased. In contrast, no significant temporal trend in E. coli 16S rRNA:16S rRNA gene ratio was observed when the determined ratios were not normalized based on the recoveries of dual references. 
The developed technique could be widely applied in studies of relationship between cellular rRNA abundance and bacterial activity.


High Intensity Interval Training Favourably Affects Angiotensinogen mRNA Expression and Markers of Cardiorenal Health in a Rat Model of Early-Stage Chronic Kidney Disease. Biomed Res Int 2015;2015:156584. [PMID: 26090382 PMCID: PMC4458272 DOI: 10.1155/2015/156584] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 01/29/2015] [Accepted: 04/03/2015] [Indexed: 12/19/2022]

Oberg AL, McKinney BA, Schaid DJ, Pankratz VS, Kennedy RB, Poland GA. Lessons learned in the analysis of high-dimensional data in vaccinomics. Vaccine 2015;33:5262-70. [PMID: 25957070 DOI: 10.1016/j.vaccine.2015.04.088] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 12/29/2014] [Revised: 04/16/2015] [Accepted: 04/23/2015] [Indexed: 12/17/2022]

Tsementzi D, Poretsky R, Rodriguez-R LM, Luo C, Konstantinidis KT. Evaluation of metatranscriptomic protocols and application to the study of freshwater microbial communities. Environ Microbiol Rep 2014;6:640-655. [PMID: 25756118 DOI: 10.1111/1758-2229.12180] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Indexed: 06/04/2023]

Finotello F, Di Camillo B. Measuring differential gene expression with RNA-seq: challenges and strategies for data analysis. Brief Funct Genomics 2014;14:130-42. [PMID: 25240000 DOI: 10.1093/bfgp/elu035] [Citation(s) in RCA: 137] [Impact Index Per Article: 13.7] [Indexed: 12/20/2022] Open

Gatto A, Torroja-Fungairiño C, Mazzarotto F, Cook SA, Barton PJR, Sánchez-Cabo F, Lara-Pezzi E. FineSplice, enhanced splice junction detection and quantification: a novel pipeline based on the assessment of diverse RNA-Seq alignment solutions. Nucleic Acids Res 2014;42:e71. [PMID: 24574529 PMCID: PMC4005686 DOI: 10.1093/nar/gku166] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Indexed: 01/26/2023] Open

Wang X, Cairns MJ. SeqGSEA: a Bioconductor package for gene set enrichment analysis of RNA-Seq data integrating differential expression and splicing. Bioinformatics 2014;30:1777-9. [PMID: 24535097 DOI: 10.1093/bioinformatics/btu090] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Indexed: 11/12/2022]

McKinney BA, White BC, Grill DE, Li PW, Kennedy RB, Poland GA, Oberg AL. ReliefSeq: a gene-wise adaptive-K nearest-neighbor feature selection tool for finding gene-gene interactions and main effects in mRNA-Seq gene expression data. PLoS One 2013;8:e81527. [PMID: 24339943 PMCID: PMC3858248 DOI: 10.1371/journal.pone.0081527] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 05/17/2013] [Accepted: 10/14/2013] [Indexed: 11/29/2022] Open

Abstract

Relief-F is a nonparametric, nearest-neighbor machine learning method that has been successfully used to identify relevant variables that may interact in complex multivariate models to explain phenotypic variation. While several tools have been developed for assessing differential expression in sequence-based transcriptomics, the detection of statistical interactions between transcripts has received less attention in the area of RNA-seq analysis. We describe a new extension and assessment of Relief-F for feature selection in RNA-seq data. The ReliefSeq implementation adapts the number of nearest neighbors (k) for each gene to optimize the Relief-F test statistics (importance scores) for finding both main effects and interactions. We compare this gene-wise adaptive-k (gwak) Relief-F method with standard RNA-seq feature selection tools, such as DESeq and edgeR, and with the popular machine learning method Random Forests. We demonstrate performance on a panel of simulated data that have a range of distributional properties reflected in real mRNA-seq data including multiple transcripts with varying sizes of main effects and interaction effects. For simulated main effects, gwak-Relief-F feature selection performs comparably to standard tools DESeq and edgeR for ranking relevant transcripts. For gene-gene interactions, gwak-Relief-F outperforms all comparison methods at ranking relevant genes in all but the highest fold change/highest signal situations where it performs similarly. The gwak-Relief-F algorithm outperforms Random Forests for detecting relevant genes in all simulation experiments. In addition, Relief-F is comparable to the other methods based on computational time. We also apply ReliefSeq to an RNA-Seq study of smallpox vaccine to identify gene expression changes between vaccinia virus-stimulated and unstimulated samples. 
ReliefSeq is an attractive tool for inclusion in the suite of tools used for analysis of mRNA-Seq data; it has power to detect both main effects and interaction effects. Software Availability: http://insilico.utulsa.edu/ReliefSeq.php.
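
The gene-wise adaptive-k idea described in this abstract can be illustrated with a minimal Python sketch. It is not the ReliefSeq implementation: the function names `relieff_scores` and `gwak_scores`, the range-scaled Manhattan distance, and the rule of keeping each gene's maximum score over the candidate k values are all assumptions made for illustration, and the sketch handles only a two-class outcome.

```python
import numpy as np

def relieff_scores(X, y, k):
    """Simplified two-class Relief-F importance score per gene (column of X)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, p = X.shape
    # Scale each gene by its range so per-gene differences lie in [0, 1].
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Z = X / span
    # Pairwise Manhattan distances between samples (assumed metric).
    dist = np.abs(Z[:, None, :] - Z[None, :, :]).sum(axis=2)
    scores = np.zeros(p)
    for i in range(n):
        same = np.flatnonzero(y == y[i])
        same = same[same != i]            # exclude the sample itself
        other = np.flatnonzero(y != y[i])
        hits = same[np.argsort(dist[i, same])[:k]]      # k nearest same-class samples
        misses = other[np.argsort(dist[i, other])[:k]]  # k nearest other-class samples
        # Reward genes that differ across classes, penalize genes that differ within a class.
        scores += np.abs(Z[misses] - Z[i]).mean(axis=0) - np.abs(Z[hits] - Z[i]).mean(axis=0)
    return scores / n

def gwak_scores(X, y, k_values):
    """For each gene, keep the best score over the candidate k values (assumed rule)."""
    all_scores = np.vstack([relieff_scores(X, y, k) for k in k_values])
    best = all_scores.argmax(axis=0)
    return all_scores.max(axis=0), np.asarray(k_values)[best]
```

Calling `gwak_scores(X, y, [1, 3, 5])` on a samples-by-genes matrix `X` with binary labels `y` returns one score and one selected k per gene. Genes would then be ranked by score, which is the role the importance scores play in the comparison with DESeq, edgeR and Random Forests.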


Hart SN, Therneau TM, Zhang Y, Poland GA, Kocher JP. Calculating sample size estimates for RNA sequencing data. J Comput Biol 2013;20:970-8. [PMID: 23961961 DOI: 10.1089/cmb.2012.0283] [Citation(s) in RCA: 199] [Impact Index Per Article: 18.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/03/2023] Open

Kennedy RB, Oberg AL, Ovsyannikova IG, Haralambieva IH, Grill D, Poland GA. Transcriptomic profiles of high and low antibody responders to smallpox vaccine. Genes Immun 2013;14:277-85. [PMID: 23594957 PMCID: PMC3723701 DOI: 10.1038/gene.2013.14] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2012] [Revised: 03/13/2013] [Accepted: 03/15/2013] [Indexed: 12/21/2022]

Ferté C, Trister AD, Huang E, Bot BM, Guinney J, Commo F, Sieberts S, André F, Besse B, Soria JC, Friend SH. Impact of bioinformatic procedures in the development and translation of high-throughput molecular classifiers in oncology. Clin Cancer Res 2013;19:4315-25. [PMID: 23780890 DOI: 10.1158/1078-0432.ccr-12-3937] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/05/2023]

Genome-wide characterization of transcriptional patterns in high and low antibody responders to rubella vaccination. PLoS One 2013;8:e62149. [PMID: 23658707 PMCID: PMC3641062 DOI: 10.1371/journal.pone.0062149] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/26/2012] [Accepted: 03/18/2013] [Indexed: 12/16/2022] Open

Abstract

Immune responses to current rubella vaccines demonstrate significant inter-individual variability. We performed mRNA-Seq profiling on PBMCs from high and low antibody responders to rubella vaccination to delineate transcriptional differences upon viral stimulation. Generalized linear models were used to assess the per gene fold change (FC) for stimulated versus unstimulated samples or the interaction between outcome and stimulation. Model results were evaluated by both FC and p-value. Pathway analysis and self-contained gene set tests were performed for assessment of gene group effects.

Of 17,566 detected genes, we identified 1,080 highly significant differentially expressed genes upon viral stimulation (p < 1.00E-15, FDR < 1.00E-14), including various immune function and inflammation-related genes, genes involved in cell signaling, cell regulation and transcription, and genes with unknown function. Analysis by immune outcome and stimulation status identified 27 genes (p ≤ 0.0006 and FDR ≤ 0.30) that responded differently to viral stimulation in high vs. low antibody responders, including major histocompatibility complex (MHC) class I genes (HLA-A, HLA-B and B2M with p = 0.0001, p = 0.0005 and p = 0.0002, respectively), and two genes related to innate immunity and inflammation (EMR3 and MEFV with p = 1.46E-08 and p = 0.0004, respectively). Pathway and gene set analysis also revealed transcriptional differences in antigen presentation and innate/inflammatory gene sets and pathways between high and low responders. Using mRNA-Seq genome-wide transcriptional profiling, we identified antigen presentation and innate/inflammatory genes that may assist in explaining rubella vaccine-induced immune response variations. Such information may provide new scientific insights into vaccine-induced immunity useful in rational vaccine development and immune response monitoring.


Wang X, Cairns MJ. Gene set enrichment analysis of RNA-Seq data: integrating differential expression and splicing. BMC Bioinformatics 2013;14 Suppl 5:S16. [PMID: 23734663 PMCID: PMC3622641 DOI: 10.1186/1471-2105-14-s5-s16] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022] Open

Kloster MB, Bilgrau AE, Rodrigo-Domingo M, Bergkvist KS, Schmitz A, Sønderkær M, Bødker JS, Falgreen S, Nyegaard M, Johnsen HE, Nielsen KL, Dybkaer K, Bøgsted M. A model system for assessing and comparing the ability of exon microarray and tag sequencing to detect genes specific for malignant B-cells. BMC Genomics 2012;13:596. [PMID: 23127183 PMCID: PMC3505742 DOI: 10.1186/1471-2164-13-596] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/25/2012] [Accepted: 10/11/2012] [Indexed: 11/30/2022] Open

Abstract

Background

Malignant cells in tumours of B-cell origin account for 0.1% to 98% of the total cell content, depending on disease entity. Recently, gene expression profiles (GEPs) of B-cell lymphomas based on microarray technologies have contributed significantly to improved sub-classification and diagnostics. However, the varying degrees of malignant B-cell frequencies in analysed samples influence the interpretation of the GEPs. Emerging next-generation sequencing (NGS) technologies for GEP, such as tag sequencing (tag-seq), are expected to supplement the detection of mRNA transcripts from malignant B-cells. This study provides a quantitative assessment and comparison of the ability of microarrays and tag-seq to detect mRNA transcripts from malignant B-cells. A model system was established using eight serial dilutions of the malignant B-cell lymphoma cell line, OCI-Ly8, into the embryonic kidney cell line, HEK293, prior to parallel analysis by exon microarrays and tag-seq.

Results

We identified 123 and 117 differentially expressed genes between pure OCI-Ly8 and HEK293 cells by exon microarray and tag-seq, respectively. There were thirty genes in common, and of those, most were B-cell specific. Hierarchical clustering of all dilutions based on the differentially expressed genes showed that neither technology could distinguish samples with less than 1% malignant B-cells from non-B-cells. A novel statistical concept was developed to assess the ability to detect single genes for both technologies, and used to demonstrate an inversely proportional relationship with the sample purity. Of the 30 common genes, the detection capability of a representative set of three B-cell specific genes (CD74, HLA-DRA, and BCL6) was analysed. At least 5%, 13% and 22% sample purity, respectively, was required for detection of the three genes by exon microarray, whereas at least 2%, 4% and 51% sample purity of malignant B-cells was required for tag-seq detection.

Conclusion

A sample purity-dependent loss of the ability to detect genes for both technologies was demonstrated. Tag-seq, in comparison to exon microarray, required slightly fewer malignant B-cells in the samples analysed in order to detect the two most abundantly expressed of the selected genes. The results show that malignant cell frequency is an important variable, with fundamental impact when interpreting GEPs from both technologies.
