1
Zhang J, Li Y, Zhu F, Guo X, Huang Y. Time-/dose- series transcriptome data analysis and traditional Chinese medicine treatment of pneumoconiosis. Int J Biol Macromol 2024; 267:131515. [PMID: 38614165 DOI: 10.1016/j.ijbiomac.2024.131515] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2024] [Revised: 04/07/2024] [Accepted: 04/09/2024] [Indexed: 04/15/2024]
Abstract
Pneumoconiosis' pathogenesis is still unclear and specific drugs for its treatment are lacking. Analysis of series transcriptome data often uses a single comparison method, and there are few reports on using such data to predict the treatment of pneumoconiosis with traditional Chinese medicine (TCM). Here, we proposed a new method for analyzing series transcriptomic data, series difference analysis (SDA), and applied it to pneumoconiosis. By comparison with 5 gene sets including existing pneumoconiosis-related genes and gene set functional enrichment analysis, we demonstrated that the new method was not inferior to two existing traditional analysis methods. Furthermore, based on the TCM-drug target interaction network, we predicted the TCM corresponding to the common pneumoconiosis-related genes obtained by multiple methods, and combined them with the high-frequency TCM for its treatment obtained through literature mining to form a new TCM formula for it. After feeding it to pneumoconiosis modeling mice for two months, compared with the untreated group, the coat color, mental state and tissue sections of the mice in the treated group were markedly improved, indicating that the new TCM formula has a certain efficacy. Our study provides new insights into method development for series transcriptomic data analysis and treatment of pneumoconiosis.
Affiliation(s)
- Jifeng Zhang
  - Key Laboratory of Industrial Dust Prevention and Control & Occupational Health and Safety, Ministry of Education, Anhui University of Science and Technology, Huainan, Anhui 232001, China
  - School of Biological Engineering & Institute of Digital Ecology and Health, Huainan Normal University, Huainan, China
- Yaobin Li
  - Key Laboratory of Industrial Dust Prevention and Control & Occupational Health and Safety, Ministry of Education, Anhui University of Science and Technology, Huainan, Anhui 232001, China
- Fenglin Zhu
  - Key Laboratory of Industrial Dust Prevention and Control & Occupational Health and Safety, Ministry of Education, Anhui University of Science and Technology, Huainan, Anhui 232001, China
- Xiaodi Guo
  - School of Biological Engineering & Institute of Digital Ecology and Health, Huainan Normal University, Huainan, China
- Yuqing Huang
  - School of Biological Engineering & Institute of Digital Ecology and Health, Huainan Normal University, Huainan, China
2
Ajmal HB, Madden MG. Dynamic Bayesian Network Learning to Infer Sparse Models From Time Series Gene Expression Data. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2794-2805. [PMID: 34181549 DOI: 10.1109/tcbb.2021.3092879] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/13/2023]
Abstract
One of the key challenges in systems biology is to derive gene regulatory networks (GRNs) from complex high-dimensional sparse data. Bayesian networks (BNs) and dynamic Bayesian networks (DBNs) have been widely applied to infer GRNs from gene expression data. GRNs are typically sparse but traditional approaches of BN structure learning to elucidate GRNs often produce many spurious (false positive) edges. We present two new BN scoring functions, which are extensions to the Bayesian Information Criterion (BIC) score, with additional penalty terms and use them in conjunction with DBN structure search methods to find a graph structure that maximises the proposed scores. Our BN scoring functions offer better solutions for inferring networks with fewer spurious edges compared to the BIC score. The proposed methods are evaluated extensively on auto regressive and DREAM4 benchmarks. We found that they significantly improve the precision of the learned graphs, relative to the BIC score. The proposed methods are also evaluated on three real time series gene expression datasets. The results demonstrate that our algorithms are able to learn sparse graphs from high-dimensional time series data. The implementation of these algorithms is open source and is available in form of an R package on GitHub at https://github.com/HamdaBinteAjmal/DBN4GRN, along with the documentation and tutorials.
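For orientation, the standard BIC score that the abstract's scoring functions extend can be sketched as below. This is only the textbook BIC in its "higher is better" form under a Gaussian model; the additional penalty terms proposed by the authors are not described in the abstract and are not reproduced here. The function names, the residual values, and the parameter counts are hypothetical and chosen purely for illustration.

```python
import math


def gaussian_log_likelihood(residuals):
    """Maximised Gaussian log-likelihood of a node given its parents,
    computed from regression residuals (variance set to its MLE, RSS / n)."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    sigma2 = rss / n
    return -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)


def bic_score(log_likelihood, n_params, n_samples):
    """Standard BIC, written so that larger values indicate a better model."""
    return log_likelihood - 0.5 * n_params * math.log(n_samples)


# Hypothetical residuals for one target gene under two candidate parent sets.
residuals_one_parent = [0.5, -0.4, 0.3, -0.6, 0.2, -0.1, 0.4, -0.3]
residuals_two_parents = [0.45, -0.4, 0.3, -0.55, 0.2, -0.1, 0.4, -0.3]

n = len(residuals_one_parent)
# Parameter counts assume intercept + one coefficient per parent + variance.
score_small = bic_score(gaussian_log_likelihood(residuals_one_parent), 3, n)
score_large = bic_score(gaussian_log_likelihood(residuals_two_parents), 4, n)
```

In this toy case the extra parent reduces the residuals only slightly, so the complexity penalty outweighs the gain in likelihood and the sparser parent set receives the higher score.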
3
Guo W, Tzioutziou NA, Stephen G, Milne I, Calixto CPG, Waugh R, Brown JWS, Zhang R. 3D RNA-seq: a powerful and flexible tool for rapid and accurate differential expression and alternative splicing analysis of RNA-seq data for biologists. RNA Biol 2021; 18:1574-1587. [PMID: 33345702 PMCID: PMC8594885 DOI: 10.1080/15476286.2020.1858253] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Received: 09/08/2020] [Revised: 11/26/2020] [Accepted: 11/27/2020] [Indexed: 12/19/2022] Open
Abstract
RNA-sequencing (RNA-seq) analysis of gene expression and alternative splicing should be routine and robust but is often a bottleneck for biologists because of different and complex analysis programs and reliance on specialized bioinformatics skills. We have developed the '3D RNA-seq' App, an R shiny App and web-based pipeline for the comprehensive analysis of RNA-seq data from any organism. It represents an easy-to-use, flexible and powerful tool for analysis of both gene and transcript-level gene expression to identify differential gene/transcript expression, differential alternative splicing and differential transcript usage (3D) as well as isoform switching from RNA-seq data. 3D RNA-seq integrates state-of-the-art differential expression analysis tools and adopts best practice for RNA-seq analysis. The program is designed to be run by biologists with minimal bioinformatics experience (or by bioinformaticians) allowing lab scientists to analyse their RNA-seq data. It achieves this by operating through a user-friendly graphical interface which automates the data flow through the programs in the pipeline. The comprehensive analysis performed by 3D RNA-seq is extremely rapid and accurate, can handle complex experimental designs, allows user setting of statistical parameters, visualizes the results through graphics and tables, and generates publication quality figures such as heat-maps, expression profiles and GO enrichment plots. The utility of 3D RNA-seq is illustrated by analysis of data from a time-series of cold-treated Arabidopsis plants and from dexamethasone-treated male and female mouse cortex and hypothalamus data identifying dexamethasone-induced sex- and brain region-specific differential gene expression and alternative splicing.
Affiliation(s)
- Wenbin Guo
  - Division of Plant Sciences, University of Dundee at the James Hutton Institute, Dundee, UK
  - Information and Computational Sciences, The James Hutton Institute, Dundee, UK
- Nikoleta A Tzioutziou
  - Division of Plant Sciences, University of Dundee at the James Hutton Institute, Dundee, UK
- Gordon Stephen
  - Information and Computational Sciences, The James Hutton Institute, Dundee, UK
- Iain Milne
  - Information and Computational Sciences, The James Hutton Institute, Dundee, UK
- Cristiane PG Calixto
  - Division of Plant Sciences, University of Dundee at the James Hutton Institute, Dundee, UK
- Robbie Waugh
  - Division of Plant Sciences, University of Dundee at the James Hutton Institute, Dundee, UK
  - Cell and Molecular Sciences, The James Hutton Institute, Dundee, UK
- John W. S. Brown
  - Division of Plant Sciences, University of Dundee at the James Hutton Institute, Dundee, UK
  - Cell and Molecular Sciences, The James Hutton Institute, Dundee, UK
- Runxuan Zhang
  - Information and Computational Sciences, The James Hutton Institute, Dundee, UK
4
Guo W, Tzioutziou NA, Stephen G, Milne I, Calixto CP, Waugh R, Brown JWS, Zhang R. 3D RNA-seq: a powerful and flexible tool for rapid and accurate differential expression and alternative splicing analysis of RNA-seq data for biologists. RNA Biol 2021. [PMID: 33345702 DOI: 10.1101/656686] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 05/10/2023] Open
5
McLoughlin KE, Correia CN, Browne JA, Magee DA, Nalpas NC, Rue-Albrecht K, Whelan AO, Villarreal-Ramos B, Vordermeier HM, Gormley E, Gordon SV, MacHugh DE. RNA-Seq Transcriptome Analysis of Peripheral Blood From Cattle Infected With Mycobacterium bovis Across an Experimental Time Course. Front Vet Sci 2021; 8:662002. [PMID: 34124223 PMCID: PMC8193354 DOI: 10.3389/fvets.2021.662002] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/31/2021] [Accepted: 04/06/2021] [Indexed: 12/14/2022] Open
Abstract
Bovine tuberculosis, caused by infection with members of the Mycobacterium tuberculosis complex, particularly Mycobacterium bovis, is a major endemic disease affecting cattle populations worldwide, despite the implementation of stringent surveillance and control programs in many countries. The development of high-throughput functional genomics technologies, including RNA sequencing, has enabled detailed analysis of the host transcriptome to M. bovis infection, particularly at the macrophage and peripheral blood level. In the present study, we have analysed the transcriptome of bovine whole peripheral blood samples collected at −1 week pre-infection and +1, +2, +6, +10, and +12 weeks post-infection time points. Differentially expressed genes were catalogued and evaluated at each post-infection time point relative to the −1 week pre-infection time point and used for the identification of putative candidate host transcriptional biomarkers for M. bovis infection. Differentially expressed gene sets were also used for examination of cellular pathways associated with the host response to M. bovis infection, construction of de novo gene interaction networks enriched for host differentially expressed genes, and time-series analyses to identify functionally important groups of genes displaying similar patterns of expression across the infection time course. A notable outcome of these analyses was identification of a 19-gene transcriptional biosignature of infection consisting of genes increased in expression across the time course from +1 week to +12 weeks post-infection.
Affiliation(s)
- Kirsten E McLoughlin
  - Animal Genomics Laboratory, UCD School of Agriculture and Food Science, UCD College of Health and Agricultural Sciences, University College Dublin, Dublin, Ireland
- Carolina N Correia
  - Animal Genomics Laboratory, UCD School of Agriculture and Food Science, UCD College of Health and Agricultural Sciences, University College Dublin, Dublin, Ireland
- John A Browne
  - Animal Genomics Laboratory, UCD School of Agriculture and Food Science, UCD College of Health and Agricultural Sciences, University College Dublin, Dublin, Ireland
- David A Magee
  - Animal Genomics Laboratory, UCD School of Agriculture and Food Science, UCD College of Health and Agricultural Sciences, University College Dublin, Dublin, Ireland
- Nicolas C Nalpas
  - Animal Genomics Laboratory, UCD School of Agriculture and Food Science, UCD College of Health and Agricultural Sciences, University College Dublin, Dublin, Ireland
- Kevin Rue-Albrecht
  - Animal Genomics Laboratory, UCD School of Agriculture and Food Science, UCD College of Health and Agricultural Sciences, University College Dublin, Dublin, Ireland
- Adam O Whelan
  - TB Immunology and Vaccinology Team, Department of Bacteriology, Animal and Plant Health Agency, Weybridge, United Kingdom
- Bernardo Villarreal-Ramos
  - TB Immunology and Vaccinology Team, Department of Bacteriology, Animal and Plant Health Agency, Weybridge, United Kingdom
- H Martin Vordermeier
  - TB Immunology and Vaccinology Team, Department of Bacteriology, Animal and Plant Health Agency, Weybridge, United Kingdom
- Eamonn Gormley
  - UCD School of Veterinary Medicine, UCD College of Health and Agricultural Sciences, University College Dublin, Dublin, Ireland
- Stephen V Gordon
  - UCD School of Veterinary Medicine, UCD College of Health and Agricultural Sciences, University College Dublin, Dublin, Ireland
  - UCD Conway Institute of Biomolecular and Biomedical Research, University College Dublin, Dublin, Ireland
- David E MacHugh
  - Animal Genomics Laboratory, UCD School of Agriculture and Food Science, UCD College of Health and Agricultural Sciences, University College Dublin, Dublin, Ireland
  - UCD Conway Institute of Biomolecular and Biomedical Research, University College Dublin, Dublin, Ireland
6
Wong PS, Tamano K, Aburatani S. Improvement of Free Fatty Acid Secretory Productivity in Aspergillus oryzae by Comprehensive Analysis on Time-Series Gene Expression. Front Microbiol 2021; 12:605095. [PMID: 33897630 PMCID: PMC8062725 DOI: 10.3389/fmicb.2021.605095] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/11/2020] [Accepted: 03/11/2021] [Indexed: 12/13/2022] Open
Abstract
Aspergillus oryzae is a filamentous fungus that has historically been utilized in the fermentation of food products. In recent times, it has also been introduced as a component in the industrial biosynthesis of consumable compounds, including free fatty acids (FFAs), which are valuable and versatile products that can be utilized as feedstocks in the production of other commodities, such as pharmaceuticals and dietary supplements. To improve the FFA secretory productivity of A. oryzae in the presence of Triton X-100, we analyzed the gene expression of a wild-type control strain and a disruptant strain of an acyl-CoA synthetase gene, faaA, in a time-series experiment. We employed a comprehensive analysis strategy using the baySeq, DESeq2, and edgeR algorithms to clarify the vital pathways for FFA secretory productivity and select genes for gene modification. We found that the transport and metabolism of inorganic ions are crucial in the initial stages of FFA production and revealed 16 candidate genes to be modified in conjunction with the faaA disruption. These genes were verified through the construction of overexpression strains, which showed that the manipulation of reactions closer to the FFA biosynthesis step led to a higher increase in FFA secretory productivity. The most successful overexpression strains had an FFA secretory productivity more than twofold higher than that of the original faaA disruptant. Our study provides guidance for further gene modification for FFA biosynthesis in A. oryzae and for enhancing the productivity of other metabolites in other microorganisms through metabolic engineering.
Affiliation(s)
- Pui Shan Wong
  - Cellular and Molecular Biotechnology Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan
- Koichi Tamano
  - Bioproduction Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Sapporo, Japan
  - Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan
- Sachiyo Aburatani
  - Cellular and Molecular Biotechnology Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan
  - Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan
7
Oh VKS, Li RW. Temporal Dynamic Methods for Bulk RNA-Seq Time Series Data. Genes (Basel) 2021; 12:352. [PMID: 33673721 PMCID: PMC7997275 DOI: 10.3390/genes12030352] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/26/2021] [Revised: 02/19/2021] [Accepted: 02/22/2021] [Indexed: 02/06/2023] Open
Abstract
Dynamic studies in time course experimental designs and clinical approaches have been widely used by the biomedical community. These applications are particularly relevant in stimuli-response models under environmental conditions, characterization of gradient biological processes in developmental biology, identification of therapeutic effects in clinical trials, disease progressive models, cell-cycle, and circadian periodicity. Despite their feasibility and popularity, sophisticated dynamic methods have been benchmarked less thoroughly in large-scale comparative studies, in terms of statistical and computational rigor, than their static counterparts. To date, a number of novel methods for bulk RNA-Seq data have been developed for the various time-dependent stimuli, circadian rhythms, cell-lineage in differentiation, and disease progression. Here, we comprehensively review a key set of representative dynamic strategies and discuss current issues associated with the detection of dynamically changing genes. We also provide recommendations for future directions for studying non-periodic and periodic time course data, and meta-dynamic datasets.
Affiliation(s)
- Vera-Khlara S. Oh
  - Animal Genomics and Improvement Laboratory, United States Department of Agriculture, Agricultural Research Service, Beltsville, MD 20705, USA
  - Department of Computer Science and Statistics, College of Natural Sciences, Jeju National University, Jeju City 63243, Korea
- Robert W. Li
  - Animal Genomics and Improvement Laboratory, United States Department of Agriculture, Agricultural Research Service, Beltsville, MD 20705, USA
8
Park HW, Weiss ST. Understanding the Molecular Mechanisms of Asthma through Transcriptomics. Allergy, Asthma & Immunology Research 2020; 12:399-411. [PMID: 32141255 PMCID: PMC7061151 DOI: 10.4168/aair.2020.12.3.399] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 11/08/2019] [Revised: 01/01/2020] [Accepted: 01/11/2020] [Indexed: 12/18/2022]
Abstract
The transcriptome represents the complete set of RNA transcripts that are produced by the genome under a specific circumstance or in a specific cell. High-throughput methods, including microarray and bulk RNA sequencing, as well as recent advances in biostatistics based on machine learning approaches provide a quick and effective way of identifying novel genes and pathways related to asthma, which is a heterogeneous disease with diverse pathophysiological mechanisms. In this manuscript, we briefly review how to analyze transcriptome data and then provide a summary of recent transcriptome studies focusing on asthma pathogenesis and asthma drug responses. Studies reviewed here are classified into 2 classes based on the tissues utilized: blood and airway cells.
Affiliation(s)
- Heung Woo Park
  - The Channing Division of Network Medicine, Department of Medicine, Brigham & Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Department of Internal Medicine, Seoul National University College of Medicine, Seoul, Korea
- Scott T Weiss
  - The Channing Division of Network Medicine, Department of Medicine, Brigham & Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Partners Center for Personalized Medicine, Partners Health Care, Boston, MA, USA
9
Palu CC, Ribeiro-Alves M, Wu Y, Lawlor B, Baranov PV, Kelly B, Walsh P. Simplicity DiffExpress: A Bespoke Cloud-Based Interface for RNA-seq Differential Expression Modeling and Analysis. Front Genet 2019; 10:356. [PMID: 31139204 PMCID: PMC6527599 DOI: 10.3389/fgene.2019.00356] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 12/04/2018] [Accepted: 04/02/2019] [Indexed: 11/23/2022] Open
Abstract
One of the key challenges for transcriptomics-based research is not only the processing of large data but also modeling the complexity of features that are sources of variation across samples, which is required for an accurate statistical analysis. Therefore, our goal is to foster access for wet lab researchers to bioinformatics tools, in order to enhance their ability to explore biological aspects and validate hypotheses with robust analysis. In this context, user-friendly interfaces can enable researchers to apply computational biology methods without requiring bioinformatics expertise. Such bespoke platforms can improve the quality of the findings by allowing the researcher to freely explore the data and test a new hypothesis with independence. Simplicity DiffExpress is a data-driven software platform dedicated to enabling non-bioinformaticians to take ownership of the differential expression analysis (DEA) step in a transcriptomics experiment while presenting the results in a comprehensible layout, which supports an efficient results exploration, information storage, and reproducibility. Simplicity DiffExpress’ key component is the bespoke statistical model validation that guides the user through any necessary alteration in the dataset or model, tackling the challenges behind complex data analysis. The software utilizes edgeR, and it is implemented as part of the Simplicity™ platform, providing a dynamic interface, with well-organized results that are easy to navigate and are shareable. Computational biologists and bioinformaticians can also benefit from its use since the data validation is more informative than the usual DEA resources. Wet-lab collaborators can benefit from receiving their results in an organized interface. Simplicity DiffExpress is freely available for academic use, and it is cloud-based (https://simplicity.nsilico.com/dea).
Affiliation(s)
- Cintia C Palu
  - School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
  - NSilico Life Science Ltd., Cork, Ireland
- Marcelo Ribeiro-Alves
  - Laboratory of Clinical Research on STD/AIDS, National Institute of Infectology Evandro Chagas (INI) - Oswaldo Cruz Foundation (FIOCRUZ), Rio de Janeiro, Brazil
- Yanxin Wu
  - NSilico Life Science Ltd., Cork, Ireland
  - Cork Institute of Technology, Cork, Ireland
- Brendan Lawlor
  - NSilico Life Science Ltd., Cork, Ireland
  - Cork Institute of Technology, Cork, Ireland
- Pavel V Baranov
  - School of Biochemistry and Cell Biology, University College Cork, Cork, Ireland
  - Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry, Moscow, Russia
- Paul Walsh
  - NSilico Life Science Ltd., Cork, Ireland
  - Cork Institute of Technology, Cork, Ireland
10
Abstract
Identification of differentially expressed genes has been a high priority task of downstream analyses to further advances in biomedical research. Investigators have been faced with an array of issues in dealing with more complicated experiments and metadata, including batch effects, normalization, temporal dynamics (temporally differential expression), and isoform diversity (isoform-level quantification and differential splicing events). To date, there are currently no standard approaches to precisely and efficiently analyze these moderate or large-scale experimental designs, especially with combined metadata. In this report, we propose comprehensive analytical pipelines to precisely characterize temporal dynamics in differential expression of genes and other genomic features, i.e., the variability of transcripts, isoforms and exons, by controlling batch effects and other nuisance factors that could have significant confounding effects on the main effects of interest in comparative models and may result in misleading interpretations.
11
Bellazzi R, Engel F, Ferrazzi F. Gene network analysis: from heart development to cardiac therapy. Thromb Haemost 2017; 113:522-31. [DOI: 10.1160/th14-06-0483] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 06/01/2014] [Accepted: 08/14/2014] [Indexed: 12/31/2022]
Abstract
Summary: Networks offer a flexible framework to represent and analyse the complex interactions between components of cellular systems. In particular, gene networks inferred from expression data can support the identification of novel hypotheses on regulatory processes. In this review we focus on the use of gene network analysis in the study of heart development. Understanding heart development will promote the elucidation of the aetiology of congenital heart disease and thus possibly improve diagnostics. Moreover, it will help to establish cardiac therapies. For example, understanding cardiac differentiation during development will help to guide stem cell differentiation required for cardiac tissue engineering or to enhance endogenous repair mechanisms. We introduce different methodological frameworks to infer networks from expression data such as Boolean and Bayesian networks. Then we present currently available temporal expression data in heart development and discuss the use of network-based approaches in published studies. Collectively, our literature-based analysis indicates that gene network analysis constitutes a promising opportunity to infer therapy-relevant regulatory processes in heart development. However, the use of network-based approaches has so far been limited by the small number of samples in available datasets. Thus, we propose to acquire high-resolution temporal expression data to improve the mathematical descriptions of regulatory processes obtained with gene network inference methodologies. Especially probabilistic methods that accommodate the intrinsic variability of biological systems have the potential to contribute to a deeper understanding of heart development.
12
Orellana R, Chaput G, Markillie LM, Mitchell H, Gaffrey M, Orr G, DeAngelis KM. Multi-time series RNA-seq analysis of Enterobacter lignolyticus SCF1 during growth in lignin-amended medium. PLoS One 2017; 12:e0186440. [PMID: 29049419 PMCID: PMC5648182 DOI: 10.1371/journal.pone.0186440] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 06/01/2017] [Accepted: 10/02/2017] [Indexed: 12/11/2022] Open
Abstract
The production of lignocellulosic-derived biofuels is a highly promising source of alternative energy, but it has been constrained by the lack of a microbial platform capable of efficiently degrading this recalcitrant material and coping with by-products that can be toxic to cells. Species that naturally grow in environments where carbon is mainly available as lignin are promising for finding new ways of removing the lignin that protects cellulose for improved conversion of lignin to fuel precursors. Enterobacter lignolyticus SCF1 is a facultative anaerobic Gammaproteobacteria isolated from tropical rain forest soil collected in El Yunque forest, Puerto Rico under anoxic growth conditions with lignin as sole carbon source. Whole transcriptome analysis of E. lignolyticus SCF1 during lignin degradation was conducted on cells grown in the presence (0.1%, w/w) and the absence of lignin, where samples were taken at three different times during growth: beginning of exponential phase, mid-exponential phase, and beginning of stationary phase. Lignin-amended cultures achieved twice the cell biomass of unamended cultures over three days, and in this time degraded 60% of lignin. Transcripts in early exponential phase reflected this accelerated growth. A complement of laccases, aryl-alcohol dehydrogenases, and peroxidases were most up-regulated in lignin-amended conditions in mid-exponential and early stationary phases compared to unamended growth. The association of hydrogen production by way of the formate hydrogenlyase complex with lignin degradation suggests a possible value added to lignin degradation in the future.
Affiliation(s)
- Roberto Orellana
- Centro de Biotecnología Daniel Alkalay Lowitt, Universidad Técnica Federico Santa María, Valparaíso, Chile
- Gina Chaput
- Microbiology Department, University of Massachusetts Amherst, Amherst, United States of America
- Lye Meng Markillie
- Pacific Northwest National Laboratory, Richland, Washington, United States of America
- Hugh Mitchell
- Pacific Northwest National Laboratory, Richland, Washington, United States of America
- Matt Gaffrey
- Pacific Northwest National Laboratory, Richland, Washington, United States of America
- Galya Orr
- Pacific Northwest National Laboratory, Richland, Washington, United States of America
- Kristen M. DeAngelis
- Microbiology Department, University of Massachusetts Amherst, Amherst, United States of America
|
13
|
Nascimento M, Silva FFE, Sáfadi T, Nascimento ACC, Ferreira TEM, Barroso LMA, Ferreira Azevedo C, Guimarães SEF, Serão NVL. Independent Component Analysis (ICA) based-clustering of temporal RNA-seq data. PLoS One 2017; 12:e0181195. [PMID: 28715507 PMCID: PMC5513449 DOI: 10.1371/journal.pone.0181195] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 01/31/2017] [Accepted: 06/27/2017] [Indexed: 11/19/2022]
Abstract
Gene expression time series (GETS) analysis aims to characterize sets of genes according to their longitudinal patterns of expression. Due to the large number of genes evaluated in GETS analysis, a useful strategy to summarize biological functional processes and regulatory mechanisms is to cluster genes that present similar expression patterns over time. Traditional clustering methods usually ignore the challenges in GETS, such as the lack of data normality and the small number of temporal observations. Independent Component Analysis (ICA) is a statistical procedure that uses a transformation to convert raw time series data into sets of values of independent variables, which can be used for cluster analysis to identify sets of genes with similar temporal expression patterns. ICA allows clustering small series of distribution-free data while accounting for the dependence between subsequent time-points. Using simulated and real temporal RNA-seq data sets (the real data comprising four libraries of two pig breeds at 21, 40, 70 and 90 days of gestation), we present a methodology (ICAclust) that jointly considers independent component analysis (ICA) and a hierarchical method for clustering GETS. We compare ICAclust results with those obtained for K-means clustering. ICAclust presented, on average, an absolute gain of 5.15% over the best K-means scenario. Considering the worst scenario for K-means, the gain was 84.85% when compared with the best ICAclust result. For the real data set, genes were grouped into six distinct clusters with 89, 51, 153, 67, 40, and 58 genes, respectively. In general, the six clusters presented very distinct expression patterns. Overall, the proposed two-step clustering method (ICAclust) performed well compared to K-means, a traditional method used for cluster analysis of temporal gene expression data. In ICAclust, genes with similar expression patterns over time were clustered together.
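The two-step idea described in this abstract (ICA first, then hierarchical clustering on the resulting components) can be sketched as follows. This is a minimal illustration and not the authors' ICAclust implementation: the simulated data, the choice of two components, average linkage, and the use of scikit-learn's `FastICA` and SciPy's `linkage`/`fcluster` are all assumptions made here for demonstration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)

# Hypothetical simulated expression over 4 time points (the abstract's real
# data also has 4 time points): 30 rising genes and 30 falling genes.
rising = np.linspace(1.0, 4.0, 4) + rng.normal(0.0, 0.3, size=(30, 4))
falling = np.linspace(4.0, 1.0, 4) + rng.normal(0.0, 0.3, size=(30, 4))
expression = np.vstack([rising, falling])  # shape: (genes, time points)

# Step 1 (assumed setup): ICA maps each gene's short series to scores on
# a small number of independent components.
ica = FastICA(n_components=2, random_state=0, max_iter=1000)
components = ica.fit_transform(expression)  # shape: (genes, 2)

# Step 2: hierarchical clustering of genes in component space,
# cut so that at most two clusters are returned.
tree = linkage(components, method="average")
labels = fcluster(tree, t=2, criterion="maxclust")

print(components.shape, len(labels))
```

In practice the number of components and the cut of the dendrogram would be chosen from the data; the values above are placeholders.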
Affiliation(s)
- Moysés Nascimento
- Department of Statistics, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil
- Thelma Sáfadi
- Department of Exact Sciences, Federal University of Lavras, Lavras, Minas Gerais, Brazil
|
14
|
Do DN, Li R, Dudemaine PL, Ibeagha-Awemu EM. MicroRNA roles in signalling during lactation: an insight from differential expression, time course and pathway analyses of deep sequence data. Sci Rep 2017; 7:44605. [PMID: 28317898 PMCID: PMC5357959 DOI: 10.1038/srep44605] [Citation(s) in RCA: 59] [Impact Index Per Article: 8.4] [Received: 10/05/2016] [Accepted: 02/09/2017] [Indexed: 01/30/2023]
Abstract
The study examined microRNA (miRNA) expression and regulatory patterns during an entire bovine lactation cycle. Total RNA from milk fat samples collected at the lactogenesis (LAC, day 1 [D1] and D7), galactopoiesis (GAL, D30, D70, D130, D170 and D230) and involution (INV, D290 and when milk production dropped to 5 kg/day) stages from 9 cows was used for miRNA sequencing. A total of 475 known and 238 novel miRNAs were identified. Fifteen abundantly expressed miRNAs across lactation stages play regulatory roles in basic metabolic, cellular and immunological functions. About 344, 366 and 209 miRNAs were significantly differentially expressed (DE) between GAL and LAC, INV and GAL, and INV and LAC stages, respectively. MiR-29b/miR-363 and miR-874/miR-6254 are important mediators for transition signals from LAC to GAL and from GAL to INV, respectively. Moreover, 58 miRNAs were dynamically DE in all lactation stages and 19 miRNAs were significantly time-dependently DE throughout lactation. Relevant signalling pathways for transition between lactation stages are involved in apoptosis (PTEN and SAPK/JNK), intracellular signalling (protein kinase A, TGF-β and ERK5), cell cycle regulation (STAT3), cytokines, hormones and growth factors (prolactin, growth hormone and glucocorticoid receptor). Overall, our data suggest diverse, temporal and physiological signal-dependent regulatory and mediator functions for miRNAs during lactation.
Affiliation(s)
- Duy N Do
- Agriculture and Agri-Food Canada, Sherbrooke Research and Development Centre, 2000 College Street, Sherbrooke, Quebec, J1M 0C8, Canada; Department of Animal Science, McGill University, 21111, Lakeshore Road, Ste-Anne-de Bellevue, Quebec, J1M 0C8, Canada
- Ran Li
- Agriculture and Agri-Food Canada, Sherbrooke Research and Development Centre, 2000 College Street, Sherbrooke, Quebec, J1M 0C8, Canada; College of Animal Science and Technology, Northwest A&F University, Xinong road 22, Shaanxi, 712100, China
- Pier-Luc Dudemaine
- Agriculture and Agri-Food Canada, Sherbrooke Research and Development Centre, 2000 College Street, Sherbrooke, Quebec, J1M 0C8, Canada
- Eveline M Ibeagha-Awemu
- Agriculture and Agri-Food Canada, Sherbrooke Research and Development Centre, 2000 College Street, Sherbrooke, Quebec, J1M 0C8, Canada
|
15
|
Systematic identification of an integrative network module during senescence from time-series gene expression. BMC Systems Biology 2017; 11:36. [PMID: 28298218 PMCID: PMC5353876 DOI: 10.1186/s12918-017-0417-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 03/29/2016] [Accepted: 03/02/2017] [Indexed: 01/05/2023]
Abstract
BACKGROUND Cellular senescence irreversibly arrests growth of human diploid cells. In addition, recent studies have indicated that senescence is a multi-step evolving process related to important complex biological processes. Most studies have analyzed only the genes and their functions representing each senescence phase, without considering gene-level interactions and continuously perturbed genes. It is necessary to reveal the genotypic mechanism inferred from the affected genes and their interactions underlying the senescence process. RESULTS We suggested a novel computational approach to identify an integrative network that profiles an underlying genotypic signature from time-series gene expression data. The relatively perturbed genes were selected for each time point based on a proposed scoring measure, termed the perturbation score. Then, the selected genes were integrated with protein-protein interactions to construct a time-point-specific network. From these constructed networks, the edges conserved across time points were extracted to form the common network, and a statistical test was performed to demonstrate that the network could explain the phenotypic alteration. As a result, it was confirmed that the difference in the average perturbation scores of the common network between two time points could explain the phenotypic alteration. We also performed functional enrichment on the common network and identified a high association with phenotypic alteration. Remarkably, we observed that the identified cell-cycle-specific common network played an important role in replicative senescence as a key regulator. CONCLUSIONS Heretofore, network analysis of time-series gene expression data has focused on which topological structures change across time points. Conversely, we focused on the structure that is conserved while its context changes over time, and showed that it can explain the phenotypic changes. We expect that the proposed method will help to elucidate biological mechanisms not revealed by existing approaches. Electronic supplementary material The online version of this article (doi:10.1186/s12918-017-0417-1) contains supplementary material, which is available to authorized users.
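The "common network" step in this abstract, extracting edges conserved across time-point-specific networks and comparing average perturbation scores between two time points, can be illustrated with a short sketch. The abstract does not define the perturbation score, so the gene labels, edge lists, and score values below are entirely hypothetical, and the helper names `conserved_edges` and `mean_score` are introduced here only for illustration.

```python
def conserved_edges(networks):
    """Return the undirected edges present in every time-point network."""
    edge_sets = [{frozenset(edge) for edge in net} for net in networks]
    return set.intersection(*edge_sets) if edge_sets else set()


def mean_score(edges, scores):
    """Average a per-gene score over all genes touched by the given edges."""
    genes = {gene for edge in edges for gene in edge}
    return sum(scores[gene] for gene in genes) / len(genes)


# Hypothetical time-point-specific networks (edges as gene pairs).
net_t1 = [("g1", "g2"), ("g2", "g3"), ("g3", "g4")]
net_t2 = [("g2", "g1"), ("g2", "g3"), ("g4", "g5")]
net_t3 = [("g1", "g2"), ("g3", "g2")]

common = conserved_edges([net_t1, net_t2, net_t3])

# Hypothetical per-gene perturbation scores at the first and last time point.
scores_t1 = {"g1": 0.2, "g2": 0.4, "g3": 0.6}
scores_t3 = {"g1": 1.0, "g2": 1.2, "g3": 1.4}

# Difference of average scores on the common network between two time points.
difference = mean_score(common, scores_t3) - mean_score(common, scores_t1)
print(sorted(tuple(sorted(edge)) for edge in common), round(difference, 3))
```

Edges are stored as `frozenset` pairs so that `("g1", "g2")` and `("g2", "g1")` count as the same undirected interaction.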
|
16
|
Ayyar VS, Almon RR, DuBois DC, Sukumaran S, Qu J, Jusko WJ. Functional proteomic analysis of corticosteroid pharmacodynamics in rat liver: Relationship to hepatic stress, signaling, energy regulation, and drug metabolism. J Proteomics 2017; 160:84-105. [PMID: 28315483 DOI: 10.1016/j.jprot.2017.03.007] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 01/18/2017] [Revised: 02/15/2017] [Accepted: 03/10/2017] [Indexed: 02/07/2023]
Abstract
Corticosteroids (CS) are anti-inflammatory agents that cause extensive pharmacogenomic and proteomic changes in multiple tissues. An understanding of the proteome-wide effects of CS in liver and its relationships to altered hepatic and systemic physiology remains incomplete. Here, we report the application of a functional pharmacoproteomic approach to gain integrated insight into the complex nature of CS responses in liver in vivo. An in-depth functional analysis was performed using rich pharmacodynamic (temporal-based) proteomic data measured over 66 h in rat liver following a single dose of methylprednisolone (MPL). Data mining identified 451 differentially regulated proteins. These proteins were analyzed on the basis of temporal regulation, cellular localization, and literature-mined functional information. Of the 451 proteins, 378 were clustered into six functional groups based on major clinically-relevant effects of CS in liver. MPL-responsive proteins were highly localized in the mitochondria (20%) and cytosol (24%). Interestingly, several proteins were related to hepatic stress and signaling processes, which appear to be involved in secondary signaling cascades and in protecting the liver from CS-induced oxidative damage. Consistent with known adverse metabolic effects of CS, several rate-controlling enzymes involved in amino acid metabolism, gluconeogenesis, and fatty-acid metabolism were altered by MPL. In addition, proteins involved in the metabolism of endogenous compounds, xenobiotics, and therapeutic drugs including cytochrome P450 and Phase-II enzymes were differentially regulated. Proteins related to the inflammatory acute-phase response were up-regulated in response to MPL. Functionally-similar proteins showed large diversity in their temporal profiles, indicating complex mechanisms of regulation by CS. SIGNIFICANCE Clinical use of corticosteroid (CS) therapy is frequent and chronic. However, current knowledge on the proteome-level effects of CS in liver and other tissues is sparse. While transcriptomic regulation following methylprednisolone (MPL) dosing has been temporally examined in rat liver, proteomic assessments are needed to better characterize the tissue-specific functional aspects of MPL actions. This study describes a functional pharmacoproteomic analysis of dynamic changes in MPL-regulated proteins in liver and provides biological insight into how steroid-induced perturbations on a molecular level may relate to both adverse and therapeutic responses presented clinically.
Affiliation(s)
- Vivaswath S Ayyar
- Department of Pharmaceutical Sciences, State University of New York at Buffalo, NY, United States
- Richard R Almon
- Department of Pharmaceutical Sciences, State University of New York at Buffalo, NY, United States; Department of Biological Sciences, State University of New York at Buffalo, Buffalo, NY, United States
- Debra C DuBois
- Department of Pharmaceutical Sciences, State University of New York at Buffalo, NY, United States; Department of Biological Sciences, State University of New York at Buffalo, Buffalo, NY, United States
- Siddharth Sukumaran
- Department of Pharmaceutical Sciences, State University of New York at Buffalo, NY, United States
- Jun Qu
- Department of Pharmaceutical Sciences, State University of New York at Buffalo, NY, United States
- William J Jusko
- Department of Pharmaceutical Sciences, State University of New York at Buffalo, NY, United States
|
17
|
Limit theorems for empirical Rényi entropy and divergence with applications to molecular diversity analysis. TEST-SPAIN 2016. [DOI: 10.1007/s11749-016-0489-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
|
18
|
Kosztolányi G. It is time to take timing seriously in clinical genetics. Eur J Hum Genet 2015; 23:1435-7. [PMID: 25537357 PMCID: PMC4613471 DOI: 10.1038/ejhg.2014.271] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 07/08/2014] [Revised: 10/14/2014] [Accepted: 11/10/2014] [Indexed: 11/08/2022]
Abstract
Observations made with molecular techniques on the genome over an individual's lifetime indicate that the genome in somatic cells displays changes at molecular, cellular, and organismal levels. The timing of genetic events leading to somatic mosaicism and to dynamic gene expression is therefore a highly important variable for comprehending the role of genetics in health and disease. Consideration of time in clinical genetics should be actively incorporated into research strategy, interpretation of results, diagnostic routine, and particularly ethical discussions.
|
19
|
Spies D, Ciaudo C. Dynamics in Transcriptomics: Advancements in RNA-seq Time Course and Downstream Analysis. Comput Struct Biotechnol J 2015; 13:469-77. [PMID: 26430493 PMCID: PMC4564389 DOI: 10.1016/j.csbj.2015.08.004] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Received: 06/01/2015] [Revised: 08/05/2015] [Accepted: 08/07/2015] [Indexed: 12/17/2022]
Abstract
Analysis of gene expression has contributed to a plethora of biological and medical research studies. Microarrays have been intensively used for the profiling of gene expression during diverse developmental processes, treatments and diseases. New massively parallel sequencing methods, often referred to as RNA-sequencing (RNA-seq), are greatly improving our understanding of gene regulation and signaling networks. Computational methods developed originally for microarray analysis can now be optimized and applied to genome-wide studies to gain a better understanding of the whole transcriptome. This review addresses current challenges in RNA-seq analysis and specifically focuses on new bioinformatics tools developed for time series experiments. Furthermore, possible improvements in analysis and data integration, as well as future applications of differential expression analysis, are discussed.
Affiliation(s)
- Daniel Spies
- Swiss Federal Institute of Technology Zurich, Department of Biology, Institute of Molecular Health Sciences, Zurich, Otto-Stern Weg 7, 8093 Zurich, Switzerland
- Life Science Zurich Graduate School, Molecular Life Science Program, University of Zurich, Institute of Molecular Life Sciences, Winterthurerstrasse 190, 8057 Zurich, Switzerland
- Constance Ciaudo
- Swiss Federal Institute of Technology Zurich, Department of Biology, Institute of Molecular Health Sciences, Zurich, Otto-Stern Weg 7, 8093 Zurich, Switzerland
|
20
|
Abstract
BACKGROUND Dynamic expression data, nowadays obtained using high-throughput RNA sequencing, are essential to monitor transient gene expression changes and to study the dynamics of their transcriptional activity in the cell or response to stimuli. Several methods for data selection, clustering and functional analysis are available; however, these steps are usually performed independently, without exploiting and integrating the information derived from each step of the analysis. METHODS Here we present FunPat, an R package for time series RNA sequencing data that integrates gene selection, clustering and functional annotation into a single framework. FunPat exploits functional annotations by performing, for each functional term (e.g. a Gene Ontology term), an integrated selection-clustering analysis to select differentially expressed genes that share, besides annotation, a common dynamic expression profile. RESULTS FunPat performance was assessed on both simulated and real data. Compared with a stand-alone selection step, integrating the clustering step improves the recall without altering the false discovery rate. FunPat also shows high precision and recall in detecting the correct temporal expression patterns; in particular, the recall is significantly higher than that of hierarchical clustering, k-means, and a model-based clustering approach specifically designed for RNA sequencing data. Moreover, when biological replicates are missing, FunPat is able to provide reproducible lists of significant genes. The application to real time series expression data shows the ability of FunPat to select differentially expressed genes with high reproducibility, indirectly confirming high precision and recall in gene selection. Moreover, the expression patterns obtained as output allow an easy interpretation of the results. CONCLUSIONS A novel analysis pipeline was developed to search for the main temporal patterns in classes of similarly annotated genes, improving the sensitivity of gene selection by integrating the statistical evidence of differential expression with the information on temporal profiles and the functional annotations. Significant genes are associated with both the most informative functional terms, avoiding redundancy of information, and the most representative temporal patterns, thus improving the readability of the results. The FunPat package is provided in R/Bioconductor at http://sysbiobig.dei.unipd.it/?q=node/79.
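The claim that integrating clustering "improves the recall without altering the false discovery rate" relies on the standard definitions of recall, precision, and FDR for a selected gene list. The sketch below computes them for two made-up selections; it does not use FunPat, and the gene labels, set sizes, and the helper name `selection_metrics` are hypothetical, chosen only so that recall rises while FDR stays the same.

```python
def selection_metrics(selected, truly_de):
    """Compute precision, recall and FDR of a selected gene set."""
    selected, truly_de = set(selected), set(truly_de)
    tp = len(selected & truly_de)   # correctly selected genes
    fp = len(selected - truly_de)   # selected but not truly DE
    fn = len(truly_de - selected)   # truly DE but missed
    precision = tp / (tp + fp) if selected else 0.0
    recall = tp / (tp + fn) if truly_de else 0.0
    fdr = fp / (tp + fp) if selected else 0.0
    return {"precision": precision, "recall": recall, "fdr": fdr}


# Hypothetical ground truth: ten truly differentially expressed genes.
truth = {f"g{i}" for i in range(1, 11)}

# Hypothetical stand-alone selection: 4 true positives, 1 false positive.
selection_only = {"g1", "g2", "g3", "g4", "x1"}

# Hypothetical integrated selection: 8 true positives, 2 false positives.
integrated = {f"g{i}" for i in range(1, 9)} | {"x1", "x2"}

before = selection_metrics(selection_only, truth)
after = selection_metrics(integrated, truth)
print(before)
print(after)
```

In this toy case recall doubles because more true genes are recovered, while the proportion of false discoveries among the selected genes is unchanged.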
|
21
|
Zheng CL, Kawane S, Bottomly D, Wilmot B. Analysis considerations for utilizing RNA-Seq to characterize the brain transcriptome. International Review of Neurobiology 2014; 116:21-54. [PMID: 25172470 DOI: 10.1016/b978-0-12-801105-8.00002-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 01/01/2023]
Abstract
RNA-Seq allows one to examine not only gene expression but also the expression of noncoding RNAs, alternative splicing, and allele-specific expression. With this increased sensitivity and dynamic range come computational and statistical considerations, which are highly dependent on the biological question being asked. We highlight these to provide an overview of their importance and the impact they can have on downstream interpretation of the brain transcriptome.
Affiliation(s)
- Christina L Zheng
- Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, Oregon, USA; Knight Cancer Institute, Oregon Health and Science University, Portland, Oregon, USA
- Sunita Kawane
- Clinical & Translational Research Institute, Oregon Health and Science University, Portland, Oregon, USA
- Daniel Bottomly
- Clinical & Translational Research Institute, Oregon Health and Science University, Portland, Oregon, USA
- Beth Wilmot
- Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, Oregon, USA; Clinical & Translational Research Institute, Oregon Health and Science University, Portland, Oregon, USA
|