1
Wu Y, Zhou D, Hu J. Reconstruction of gene regulatory networks for Caenorhabditis elegans using tree-shaped gene expression data. Brief Bioinform 2024; 25:bbae396. [PMID: 39133097 PMCID: PMC11318059 DOI: 10.1093/bib/bbae396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2024] [Revised: 06/11/2024] [Accepted: 08/07/2024] [Indexed: 08/13/2024]
Abstract
Constructing gene regulatory networks is a widely adopted approach for investigating gene regulation, offering diverse applications in biology and medicine. A great deal of research focuses on using time series data or single-cell RNA-sequencing data to infer gene regulatory networks. However, such gene expression data lack either cellular or temporal information. Fortunately, the advent of time-lapse confocal laser microscopy enables biologists to obtain tree-shaped gene expression data of Caenorhabditis elegans, achieving both cellular and temporal resolution. Although such tree-shaped data provide abundant knowledge, they pose challenges such as non-pairwise time series, which can undermine the accuracy of downstream analysis. To address this issue, a comprehensive framework for data integration and a novel Bayesian approach based on a Boolean network with time delay are proposed. A pre-screening process and a Markov chain Monte Carlo algorithm are applied to obtain the parameter estimates. Simulation studies show that our method outperforms existing Boolean network inference algorithms. Leveraging the proposed approach, gene regulatory networks for five subtrees are reconstructed from the real tree-shaped datasets of Caenorhabditis elegans, and some gene regulatory relationships confirmed in previous genetic studies are recovered. Heterogeneity of regulatory relationships across different cell lineage subtrees is also detected. Furthermore, potential gene regulatory relationships that bear importance in human diseases are explored. All source code is available at the GitHub repository https://github.com/edawu11/BBTD.git.
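To make the phrase "Boolean network with time delay" concrete, the following is a minimal Python sketch of how such a network can be simulated forward in time. It is not the BBTD method from the abstract (there is no Bayesian inference, pre-screening, or MCMC here); the gene names `A`, `B`, `C`, the rules, and the `simulate` helper are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

# A rule is (list of (regulator, delay) pairs, Boolean function of those inputs).
Rule = Tuple[List[Tuple[str, int]], Callable[..., bool]]

def simulate(rules: Dict[str, Rule],
             history: Dict[str, List[bool]],
             steps: int) -> Dict[str, List[bool]]:
    """Extend every gene's history by `steps` synchronous updates.

    `history` must hold at least as many past states as the largest delay.
    """
    traj = {gene: list(states) for gene, states in history.items()}
    for _ in range(steps):
        t = len(next(iter(traj.values())))  # index of the state being computed
        new_state = {}
        for gene, (inputs, fn) in rules.items():
            # Read each regulator at its own delayed time point.
            args = [traj[reg][t - delay] for reg, delay in inputs]
            new_state[gene] = fn(*args)
        for gene in traj:
            # Genes without a rule simply keep their last value.
            traj[gene].append(new_state.get(gene, traj[gene][-1]))
    return traj

# Hypothetical toy network: A toggles, B copies A with delay 2, C is repressed by B with delay 1.
rules = {
    "A": ([("A", 1)], lambda a: not a),
    "B": ([("A", 2)], lambda a: a),
    "C": ([("B", 1)], lambda b: not b),
}
history = {"A": [False, True], "B": [False, False], "C": [True, True]}
traj = simulate(rules, history, steps=4)
```

In this toy run the state of `B` at time t equals the state of `A` at time t-2, which is the kind of delayed dependence an inference method for such models would have to recover from data.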
Affiliation(s)
- Yida Wu
  - School of Mathematical Sciences, Xiamen University, Zengcuo'an West Road, Siming District, Xiamen 361000, China
- Da Zhou
  - School of Mathematical Sciences, Xiamen University, Zengcuo'an West Road, Siming District, Xiamen 361000, China
- Jie Hu
  - School of Mathematical Sciences, Xiamen University, Zengcuo'an West Road, Siming District, Xiamen 361000, China
2
Yang G, Hu W, He L, Dou L. Nonlinear causal network learning via Granger causality based on extreme support vector regression. CHAOS (WOODBURY, N.Y.) 2024; 34:023127. [PMID: 38377295 DOI: 10.1063/5.0183537] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Accepted: 01/22/2024] [Indexed: 02/22/2024]
Abstract
For complex networked systems, based on the consideration of nonlinearity and causality, a novel general method of nonlinear causal network learning, termed extreme support vector regression Granger causality (ESVRGC), is proposed. The nonuniform time-delayed influence of the driving nodes on the target node is particularly considered. Then, the restricted model and the unrestricted model of Granger causality are, respectively, formulated based on extreme support vector regression, which uses the selected time-delayed components of system variables as the inputs of kernel functions. The nonlinear conditional Granger causality index is finally calculated to confirm the strength of a causal interaction. Based on the simulation of a nonlinear vector autoregressive model and nonlinear discrete time-delayed dynamic systems, ESVRGC demonstrates better performance than other popular methods. The validity and robustness of ESVRGC are also verified across different network types, sample sizes, noise intensities, and coupling strengths. Finally, the superiority of ESVRGC is successfully verified by an experimental study on real benchmark datasets.
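The restricted/unrestricted comparison behind a Granger causality index can be sketched in a few lines of Python. This is a simplified illustration only: it uses ordinary linear least squares on a bivariate system, not the extreme support vector regression or the conditional index of ESVRGC, and the function names `lagged` and `granger_index`, the lag order `p`, and the synthetic data are assumptions made for the example.

```python
import numpy as np

def lagged(x, p):
    """Columns x[t-1], ..., x[t-p] for t = p .. len(x)-1."""
    return np.column_stack([x[p - k:len(x) - k] for k in range(1, p + 1)])

def granger_index(x, y, p=2):
    """log(restricted residual variance / unrestricted residual variance) for x -> y."""
    target = y[p:]
    ones = np.ones((len(target), 1))
    restricted = np.hstack([ones, lagged(y, p)])          # y's own past only
    unrestricted = np.hstack([restricted, lagged(x, p)])  # plus x's past

    def resid_var(design):
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        return np.var(target - design @ coef)

    return float(np.log(resid_var(restricted) / resid_var(unrestricted)))

# Synthetic example (assumed data): x drives y with a one-step delay.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = np.zeros(500)
y[1:] = 0.8 * x[:-1] + 0.1 * rng.normal(size=499)

gc_xy = granger_index(x, y)  # large: the past of x improves prediction of y
gc_yx = granger_index(y, x)  # near zero: the past of y does not help predict x
```

A kernel-based variant would replace the two least-squares fits with nonlinear regressors while keeping the same variance-ratio logic.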
Affiliation(s)
- Guanxue Yang
  - School of Electrical and Information Engineering, Jiangsu University, Zhenjiang 212013, China
- Weiwei Hu
  - School of Electrical and Information Engineering, Jiangsu University, Zhenjiang 212013, China
- Lidong He
  - School of Automation, Nanjing University of Science and Technology, Nanjing 210094, China
- Liya Dou
  - Department of Automation, Beijing University of Chemical Technology, Beijing 100029, China
3
Zhao N, Liu H, Yan F. Oscillation dynamic mechanism driven by time delays in the competent gene regulatory circuit of B. subtilis. INT J BIOMATH 2021. [DOI: 10.1142/s1793524522500176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Bacillus subtilis in a competent state absorbs DNA and may improve the growth of bacteria by integrating new genetic material. Therefore, it is important to clarify how the genes interact in the circuit so that cells enter into a competent state or return to a vegetative state. The gene regulatory circuit consists of two positive feedback loops and one negative feedback loop. In this paper, a mathematical model is developed by considering transcription time delays to further study the dynamic behavior of the B. subtilis competent gene regulatory network. Combining theoretical calculation and numerical simulation, it is verified that the time delay in indirect transcription inhibition indeed induces periodic oscillation of the B. subtilis competent system. In addition, some important chemical reaction rates can also regulate the system's dynamic behavior. However, under the control of time delay, the effects of these reaction rates change significantly. In particular, the time delay can advance the critical value of the reaction rates at which oscillation occurs and can also weaken or even eliminate their effect. These results will help us to analyze the competent state of B. subtilis.
Affiliation(s)
- Na Zhao
  - Department of Mathematics, Yunnan Normal University, Kunming 650500, P. R. China
  - Key Laboratory of Complex System Modeling and Application, for Universities in Yunnan, Kunming 650500, P. R. China
- Haihong Liu
  - Department of Mathematics, Yunnan Normal University, Kunming 650500, P. R. China
  - Key Laboratory of Complex System Modeling and Application, for Universities in Yunnan, Kunming 650500, P. R. China
- Fang Yan
  - Department of Mathematics, Yunnan Normal University, Kunming 650500, P. R. China
  - Key Laboratory of Complex System Modeling and Application, for Universities in Yunnan, Kunming 650500, P. R. China
4
Hu J, Qin H, Fan X. Can ODE gene regulatory models neglect time lag or measurement scaling? Bioinformatics 2020; 36:4058-4064. [PMID: 32324854 DOI: 10.1093/bioinformatics/btaa268] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2019] [Revised: 04/14/2020] [Accepted: 04/16/2020] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Many ordinary differential equation (ODE) models have been introduced to replace linear regression models for inferring gene regulatory relationships from time-course gene expression data. But, since the observed data are usually not direct measurements of the gene products or there is an unknown time lag in gene regulation, it is problematic to directly apply traditional ODE models or linear regression models. RESULTS: We introduce a lagged ODE model to infer lagged gene regulatory relationships from time-course measurements, which are modeled as linear transformation of the gene products. A time-course microarray dataset from a yeast cell-cycle study is used for simulation assessment of the methods and real data analysis. The results show that our method, by considering both time lag and measurement scaling, performs much better than other linear and ODE models. It indicates the necessity of explicitly modeling the time lag and measurement scaling in ODE gene regulatory models. AVAILABILITY AND IMPLEMENTATION: R code is available at https://www.sta.cuhk.edu.hk/xfan/share/lagODE.zip.
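As a rough illustration of why an unknown time lag matters when fitting an ODE-type regulatory model, the Python sketch below scans candidate lags for a single regulator-target pair and keeps the one with the smallest residual. It is not the authors' lagged ODE method (which is distributed as R code and also models measurement scaling); the model form dx/dt = a*z(t - lag) - b*x(t), the helper `fit_lag`, and all parameter values are assumptions made for this example.

```python
import numpy as np

def fit_lag(z, x, dt, max_lag):
    """Return (best_lag, a, b) for dx/dt ~ a * z(t - lag*dt) - b * x(t)."""
    dxdt = (x[1:] - x[:-1]) / dt                # forward difference, aligned with x[:-1]
    rows = np.arange(max_lag, len(x) - 1)       # same rows for every candidate lag
    best = None
    for lag in range(max_lag + 1):
        design = np.column_stack([z[rows - lag], -x[rows]])
        coef, *_ = np.linalg.lstsq(design, dxdt[rows], rcond=None)
        sse = float(np.sum((dxdt[rows] - design @ coef) ** 2))
        if best is None or sse < best[0]:
            best = (sse, lag, coef[0], coef[1])
    return best[1], float(best[2]), float(best[3])

# Synthetic data (assumed): a noisy regulator z drives x with a lag of 3 samples.
rng = np.random.default_rng(1)
n, dt, a_true, b_true, lag_true = 300, 0.1, 2.0, 0.5, 3
z = rng.normal(size=n)
x = np.zeros(n)
for t in range(n - 1):
    drive = z[t - lag_true] if t >= lag_true else 0.0
    x[t + 1] = x[t] + dt * (a_true * drive - b_true * x[t])

lag_hat, a_hat, b_hat = fit_lag(z, x, dt, max_lag=6)
```

Because the toy data are generated with the same Euler step used in the fit, the correct lag reproduces the derivative essentially exactly, while a model forced to use lag 0 would misattribute the regulation.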
Affiliation(s)
- Jie Hu
  - Department of Probability and Statistics, School of Mathematical Science, Xiamen University, Xiamen, Fujian, China
- Huihui Qin
  - Department of Applied Mathematics, Hong Kong Polytechnic University, Hong Kong SAR, China
- Xiaodan Fan
  - Department of Statistics, The Chinese University of Hong Kong, Hong Kong SAR, China
5
Dai CY, Liu HH, Liu HH. The role of time delays in P53 gene regulatory network stimulated by growth factor. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2020; 17:3794-3835. [PMID: 32987556 DOI: 10.3934/mbe.2020213] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/11/2023]
Abstract
In this paper, a delayed mathematical model for the P53-Mdm2 network is developed. The P53-Mdm2 network we study is triggered by growth factor instead of DNA damage, and the amount of DNA damage is regarded as zero. We study the influences of time delays, growth factor and other important chemical reaction rates on the dynamic behaviors of the system. It is shown that the time delay is a critical factor and its length determines the period, amplitude and stability of the P53 oscillation. Furthermore, for some important chemical reaction rates, we also obtain some interesting results through numerical simulation. In particular, S (growth factor), k3 (rate constant for Mdm2p dephosphorylation), k10 (basal expression of PTEN) and k14 (rate constant for PTEN-induced Akt dephosphorylation) could undermine the dynamic behavior of the system to different degrees. These findings are expected to help elucidate the mechanisms of action of several carcinogenic and tumor suppressor factors in humans under normal conditions.
Affiliation(s)
- Chang Yong Dai
  - Department of Mathematics, Yunnan Normal University, Kunming, 650500, China
- Hai Hong Liu
  - Department of Mathematics, Yunnan Normal University, Kunming, 650500, China
- Hai Hong Liu
  - Department of Mathematics, Yunnan Normal University, Kunming, 650500, China
  - Department of Dynamics and Control, Beihang University, Beijing 100191, China
6
Haque S, Ahmad JS, Clark NM, Williams CM, Sozzani R. Computational prediction of gene regulatory networks in plant growth and development. CURRENT OPINION IN PLANT BIOLOGY 2019; 47:96-105. [PMID: 30445315 DOI: 10.1016/j.pbi.2018.10.005] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Received: 08/15/2018] [Revised: 10/05/2018] [Accepted: 10/18/2018] [Indexed: 05/22/2023]
Abstract
Plants integrate a wide range of cellular, developmental, and environmental signals to regulate complex patterns of gene expression. Recent advances in genomic technologies enable differential gene expression analysis at a systems level, allowing for improved inference of the network of regulatory interactions between genes. These gene regulatory networks, or GRNs, are used to visualize the causal regulatory relationships between regulators and their downstream target genes. Accordingly, these GRNs can represent spatial, temporal, and/or environmental regulations and can identify functional genes. This review summarizes recent computational approaches applied to different types of gene expression data to infer GRNs in the context of plant growth and development. Three stages of GRN inference are described: first, data collection and analysis based on the dataset type; second, network inference application based on data availability and proposed hypotheses; and third, validation based on in silico, in vivo, and in planta methods. In addition, this review relates data collection strategies to biological questions, organizes inference algorithms based on statistical methods and data types, discusses experimental design considerations, and provides guidelines for GRN inference with an emphasis on the benefits of integrative approaches, especially when a priori information is limited. Finally, this review concludes that computational frameworks integrating large-scale heterogeneous datasets are needed for a more accurate (e.g. fewer false interactions), detailed (e.g. discrimination between direct versus indirect interactions), and comprehensive (e.g. genetic regulation under various conditions and spatial locations) inference of GRNs.
Affiliation(s)
- Samiul Haque
  - Electrical and Computer Engineering, North Carolina State University, Raleigh, USA
- Jabeen S Ahmad
  - Plant and Microbial Biology, North Carolina State University, Raleigh, USA
- Natalie M Clark
  - Plant and Microbial Biology, North Carolina State University, Raleigh, USA
- Cranos M Williams
  - Electrical and Computer Engineering, North Carolina State University, Raleigh, USA
- Rosangela Sozzani
  - Plant and Microbial Biology, North Carolina State University, Raleigh, USA
7
Wang C, Yan F, Liu H, Zhang Y. Theoretical study on the oscillation mechanism of p53-Mdm2 network. INT J BIOMATH 2019. [DOI: 10.1142/s1793524518501127] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/18/2022]
Abstract
In this paper, a delayed mathematical model was developed based on experimental data to understand how the time delays required for transcription and translation in Mdm2 gene expression affect the kinetic behavior of the p53-Mdm2 network. Taking the time delays as the main research parameters, the stability of the system at the positive equilibrium was studied by using the theoretical method of delay differential equation. We found that such delays can induce oscillations by undergoing a supercritical Hopf bifurcation. Then, we used the normal form theory and the center manifold reduction to study the direction and stability of the bifurcation in detail. Furthermore, we also studied the effects of the length of time delays and the model parameters by numerical simulations. We found that time delays in Mdm2 synthesis are required for p53 oscillations and the length of such delays can determine the amplitude and period of the oscillations. In addition, the model parameters can also change the stability of the system. These results illustrate that the repair process after DNA damage can be regulated by varying time delays and the model parameters.
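The claim that a sufficiently long delay can turn a stable equilibrium into an oscillating one can be seen in the simplest delayed negative-feedback equation, x'(t) = -a x(t - tau), whose zero solution is stable only when a*tau < pi/2 (a standard textbook result). The Python sketch below integrates this linear test equation with a fixed-step Euler scheme; it is not the p53-Mdm2 model of the abstract, and the function names, step size, and parameter values are assumptions made for illustration.

```python
import numpy as np

def simulate_delay(a, tau, t_end=40.0, dt=0.01):
    """Euler integration of x'(t) = -a * x(t - tau) with x(t) = 1 for t <= 0."""
    lag = int(round(tau / dt))
    n = int(round(t_end / dt))
    x = np.ones(lag + n + 1)          # the first lag+1 entries hold the constant history
    for i in range(lag, lag + n):
        x[i + 1] = x[i] - dt * a * x[i - lag]
    return x[lag:]                    # trajectory on [0, t_end]

def late_amplitude(x):
    """Largest |x| over the final quarter of the trajectory."""
    return float(np.max(np.abs(x[3 * len(x) // 4:])))

# Below the threshold (a*tau = 1.0 < pi/2) oscillations die out;
# above it (a*tau = 2.0 > pi/2) they grow.
decaying = late_amplitude(simulate_delay(a=1.0, tau=1.0))
growing = late_amplitude(simulate_delay(a=2.0, tau=1.0))
```

In a nonlinear model the growth would saturate into a sustained limit cycle, which is the behavior a supercritical Hopf bifurcation describes.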
Affiliation(s)
- Conghua Wang
  - Department of Mathematics, Yunnan Normal University, Kunming, Yunnan 650500, P. R. China
- Fang Yan
  - Department of Mathematics, Yunnan Normal University, Kunming, Yunnan 650500, P. R. China
- Haihong Liu
  - Department of Mathematics, Yunnan Normal University, Kunming, Yunnan 650500, P. R. China
- Yuan Zhang
  - Shanghai Institute of Applied Mathematics and Mechanics, Shanghai University, Shanghai 200072, P. R. China
8
Pirgazi J, Khanteymoori AR. A robust gene regulatory network inference method base on Kalman filter and linear regression. PLoS One 2018; 13:e0200094. [PMID: 30001352 PMCID: PMC6044105 DOI: 10.1371/journal.pone.0200094] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 01/15/2018] [Accepted: 06/19/2018] [Indexed: 11/24/2022]
Abstract
The reconstruction of the topology of gene regulatory networks (GRNs) using high-throughput genomic data such as microarray gene expression data is an important problem in systems biology. The main challenge in gene expression data is the high number of genes and low number of samples; also the data are often impregnated with noise. In this paper, to deal with noisy data, a Kalman filter based method that can use prior knowledge when learning the network was used. In the proposed method, named KFLR, the first phase uses mutual information to remove noisy regulations with low correlations. The proposed method utilized a new closed form solution to compute the posterior probabilities of the edges from regulators to the target gene within a hybrid framework of Bayesian model averaging and linear regression methods. To show its efficiency, the proposed method was compared with several well-known methods. The results of the evaluation indicate that the proposed method improved inference accuracy and also recovered better regulatory relations from noisy data.
Affiliation(s)
- Jamshid Pirgazi
  - Department of Computer Engineering, Engineering Faculty, University of Zanjan, Zanjan, Iran
- Ali Reza Khanteymoori
  - Department of Computer Engineering, Engineering Faculty, University of Zanjan, Zanjan, Iran
9
BTNET: boosted tree based gene regulatory network inference algorithm using time-course measurement data. BMC SYSTEMS BIOLOGY 2018; 12:20. [PMID: 29560827 PMCID: PMC5861501 DOI: 10.1186/s12918-018-0547-0] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Indexed: 12/17/2022]
Abstract
Background: Identifying gene regulatory networks is an important task for understanding biological systems. Time-course measurement data became a valuable resource for inferring gene regulatory networks. Various methods have been presented for reconstructing the networks from time-course measurement data. However, existing methods have been validated on only a limited number of benchmark datasets, and rarely verified on real biological systems. Results: We first integrated benchmark time-course gene expression datasets from previous studies and reassessed the baseline methods. We observed that GENIE3-time, a tree-based ensemble method, achieved the best performance among the baselines. In this study, we introduce BTNET, a boosted tree based gene regulatory network inference algorithm which improves the state-of-the-art. We quantitatively validated BTNET on the integrated benchmark dataset. The AUROC and AUPR scores of BTNET were higher than those of the baselines. We also qualitatively validated the results of BTNET through an experiment on neuroblastoma cells treated with an antidepressant. The inferred regulatory network from BTNET showed that brachyury, a transcription factor, was regulated by fluoxetine, an antidepressant, which was verified by the expression of its downstream genes. Conclusions: We present BTNET, which infers a GRN from time-course measurement data using boosting algorithms. Our model achieved the highest AUROC and AUPR scores on the integrated benchmark dataset. We further validated BTNET qualitatively through a wet-lab experiment and showed that BTNET can produce biologically meaningful results.
10
Yu B, Xu JM, Li S, Chen C, Chen RX, Wang L, Zhang Y, Wang MH. Inference of time-delayed gene regulatory networks based on dynamic Bayesian network hybrid learning method. Oncotarget 2017; 8:80373-80392. [PMID: 29113310 PMCID: PMC5655205 DOI: 10.18632/oncotarget.21268] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 07/20/2017] [Accepted: 08/27/2017] [Indexed: 01/31/2023]
Abstract
Gene regulatory network (GRN) research reveals complex life phenomena from the perspective of gene interaction and is an important research field in systems biology. Traditional Bayesian networks have a high computational complexity, and the network structure scoring model has a single feature. Information-based approaches cannot identify the direction of regulation. To make up for the shortcomings of the above methods, this paper presents a novel hybrid learning method (DBNCS) based on the dynamic Bayesian network (DBN) to construct multiple time-delayed GRNs for the first time, combining the comprehensive score (CS) with the DBN model. The DBNCS algorithm first uses the CMI2NI (conditional mutual inclusive information-based network inference) algorithm for network structure profile learning, namely the construction of the search space. Then the redundant regulations are removed by using the recursive optimization algorithm (RO), thereby reducing the false positive rate. Next, the network structure profiles are decomposed into a set of cliques without loss, which can significantly reduce the computational complexity. Finally, the DBN model is used to identify the direction of gene regulation within the cliques and search for the optimal network structure. The performance of the DBNCS algorithm is evaluated on the benchmark GRN datasets from the DREAM challenge as well as the SOS DNA repair network in Escherichia coli, and compared with other state-of-the-art methods. The experimental results show the rationality of the algorithm design and its outstanding performance in inferring GRNs.
Affiliation(s)
- Bin Yu
  - College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China
  - CAS Key Laboratory of Geospace Environment, Department of Geophysics and Planetary Science, University of Science and Technology of China, Hefei 230026, China
  - Bioinformatics and Systems Biology Research Center, Qingdao University of Science and Technology, Qingdao 266061, China
- Jia-Meng Xu
  - College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China
  - Bioinformatics and Systems Biology Research Center, Qingdao University of Science and Technology, Qingdao 266061, China
- Shan Li
  - College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China
  - Bioinformatics and Systems Biology Research Center, Qingdao University of Science and Technology, Qingdao 266061, China
- Cheng Chen
  - College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China
  - Bioinformatics and Systems Biology Research Center, Qingdao University of Science and Technology, Qingdao 266061, China
- Rui-Xin Chen
  - College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China
  - Bioinformatics and Systems Biology Research Center, Qingdao University of Science and Technology, Qingdao 266061, China
- Lei Wang
  - Key Laboratory of Eco-chemical Engineering, Ministry of Education, Laboratory of Inorganic Synthesis and Applied Chemistry, College of Chemistry and Molecular Engineering, Qingdao University of Science and Technology, Qingdao 266042, China
- Yan Zhang
  - Bioinformatics and Systems Biology Research Center, Qingdao University of Science and Technology, Qingdao 266061, China
  - College of Electromechanical Engineering, Qingdao University of Science and Technology, Qingdao 266061, China
- Ming-Hui Wang
  - College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China
  - Bioinformatics and Systems Biology Research Center, Qingdao University of Science and Technology, Qingdao 266061, China
11
Zhang Y, Liu H, Yan F, Zhou J. Oscillatory dynamics of p38 activity with transcriptional and translational time delays. Sci Rep 2017; 7:11495. [PMID: 28904347 PMCID: PMC5597677 DOI: 10.1038/s41598-017-11149-5] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 05/22/2017] [Accepted: 07/31/2017] [Indexed: 01/30/2023]
Abstract
Recent experimental evidence reports that oscillations of p38 MAPK (p38) activity would efficiently induce pro-inflammatory gene expression, which might be deleterious to immune systems and may even cause cellular damage and apoptosis. It is widely accepted now that transcriptional and translational delays are ubiquitous in gene expression, which can typically result in oscillatory responses of gene regulations. Consequently, delay-driven sustained oscillations in p38 activity (p38*) could in principle be commonplace. Nevertheless, so far the studies of the impact of such delays on p38* have been lacking both experimentally and theoretically. Here, we use experimental data to develop a delayed mathematical model, with the aim of understanding how such delays affect the oscillatory behaviour of p38*. We analyze the stability and oscillation of the model with and without explicit time delays. We show that a sufficient input stimulation strength is a prerequisite for generating p38* oscillations, and that an optimal rate of model parameters is also essential to these oscillations. Moreover, we find that the time delays required for transcription and translation in mitogen-activated protein kinase phosphatase-1 (MKP-1) gene expression can drive p38* to be oscillatory even when the p38* concentration is at a stable state. Furthermore, the length of these delays can determine the amplitude and period of the oscillations and can enormously extend the oscillatory ranges of model parameters. These results indicate that time delays in MKP-1 synthesis are required, albeit not sufficient, for p38* oscillations, which may lead to new insights related to p38 oscillations.
Affiliation(s)
- Yuan Zhang
  - Shanghai Institute of Applied Mathematics and Mechanics, Shanghai University, Shanghai, 200072, China
- Haihong Liu
  - Department of Mathematics, Yunnan Normal University, Kunming, 650092, China
- Fang Yan
  - Department of Mathematics, Yunnan Normal University, Kunming, 650092, China
- Jin Zhou
  - Shanghai Institute of Applied Mathematics and Mechanics, Shanghai University, Shanghai, 200072, China
12
Yalamanchili HK, Wan YW, Liu Z. Data Analysis Pipeline for RNA-seq Experiments: From Differential Expression to Cryptic Splicing. Curr Protoc Bioinformatics 2017; 59:11.15.1-11.15.21. [PMID: 28902396 DOI: 10.1002/cpbi.33] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Indexed: 01/26/2023]
Abstract
RNA sequencing (RNA-seq) is a high-throughput technology that provides unique insights into the transcriptome. It has a wide variety of applications in quantifying genes/isoforms and in detecting non-coding RNA, alternative splicing, and splice junctions. It is extremely important to comprehend the entire transcriptome for a thorough understanding of the cellular system. Several RNA-seq analysis pipelines have been proposed to date. However, no single analysis pipeline can capture the dynamics of the entire transcriptome. Here, we compile and present a robust and commonly used analytical pipeline covering the entire spectrum of transcriptome analysis, including quality checks, alignment of reads, differential gene/transcript expression analysis, discovery of cryptic splicing events, and visualization. Challenges, critical parameters, and possible downstream functional analysis pipelines associated with each step are highlighted and discussed. This unit provides a comprehensive understanding of a state-of-the-art RNA-seq analysis pipeline and a greater understanding of the transcriptome.
Affiliation(s)
- Hari Krishna Yalamanchili
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
  - Bioinformatics Core, Jan and Dan Duncan Neurological Research Institute at Texas Children's Hospital, Houston, Texas
- Ying-Wooi Wan
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
  - Bioinformatics Core, Jan and Dan Duncan Neurological Research Institute at Texas Children's Hospital, Houston, Texas
- Zhandong Liu
  - Bioinformatics Core, Jan and Dan Duncan Neurological Research Institute at Texas Children's Hospital, Houston, Texas
  - Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, Texas
  - Department of Pediatrics-Neurology, Baylor College of Medicine, Houston, Texas
13
Wang Z, Fang H, Tang NLS, Deng M. VCNet: vector-based gene co-expression network construction and its application to RNA-seq data. Bioinformatics 2017; 33:2173-2181. [DOI: 10.1093/bioinformatics/btx131] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 08/23/2016] [Accepted: 03/07/2017] [Indexed: 11/12/2022]
Affiliation(s)
- Zengmiao Wang
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Huaying Fang
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
  - LMAM, School of Mathematical Sciences, Peking University, Beijing, China
- Nelson Leung-Sang Tang
  - Department of Chemical Pathology and Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Hong Kong, China
- Minghua Deng
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
  - LMAM, School of Mathematical Sciences, Peking University, Beijing, China
  - Center for Statistical Science, Peking University, Beijing, China
14
Liu F, Zhang SW, Guo WF, Wei ZG, Chen L. Inference of Gene Regulatory Network Based on Local Bayesian Networks. PLoS Comput Biol 2016; 12:e1005024. [PMID: 27479082 PMCID: PMC4968793 DOI: 10.1371/journal.pcbi.1005024] [Citation(s) in RCA: 77] [Impact Index Per Article: 9.6] [Received: 10/27/2015] [Accepted: 06/20/2016] [Indexed: 11/18/2022]
Abstract
The inference of gene regulatory networks (GRNs) from expression data can mine the direct regulations among genes and gain deep insights into biological processes at a network level. During past decades, numerous computational approaches have been introduced for inferring the GRNs. However, many of them still suffer from various problems, e.g., Bayesian network (BN) methods cannot handle large-scale networks due to their high computational complexity, while information theory-based methods cannot identify the directions of regulatory interactions and also suffer from false positive/negative problems. To overcome the limitations, in this work we present a novel algorithm, namely local Bayesian network (LBN), to infer GRNs from gene expression data by using the network decomposition strategy and false-positive edge elimination scheme. Specifically, LBN algorithm first uses conditional mutual information (CMI) to construct an initial network or GRN, which is decomposed into a number of local networks or GRNs. Then, BN method is employed to generate a series of local BNs by selecting the k-nearest neighbors of each gene as its candidate regulatory genes, which significantly reduces the exponential search space from all possible GRN structures. Integrating these local BNs forms a tentative network or GRN by performing CMI, which reduces redundant regulations in the GRN and thus alleviates the false positive problem. The final network or GRN can be obtained by iteratively performing CMI and local BN on the tentative network. In the iterative process, the false or redundant regulations are gradually removed. When tested on the benchmark GRN datasets from DREAM challenge as well as the SOS DNA repair network in E. coli, our results suggest that LBN outperforms other state-of-the-art methods (ARACNE, GENIE3 and NARROMI) significantly, with more accurate and robust performance.
In particular, the decomposition strategy with local Bayesian networks not only effectively reduces the computational cost of BN due to much smaller sizes of local GRNs, but also identifies the directions of the regulations.
Affiliation(s)
- Fei Liu
  - Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi’an, China
  - Institute of Physics and Optoelectronics Technology, Baoji University of Arts and Science, Baoji, China
- Shao-Wu Zhang
  - Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi’an, China
- Wei-Feng Guo
  - Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi’an, China
- Ze-Gang Wei
  - Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi’an, China
- Luonan Chen
  - Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi’an, China
  - Key Laboratory of Systems Biology, Innovation Center for Cell Signaling Network, Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
  - School of Life Science and Technology, ShanghaiTech University, Shanghai, China
|
15
|
Analysis of spatial-temporal gene expression patterns reveals dynamics and regionalization in developing mouse brain. Sci Rep 2016; 6:19274. [PMID: 26786896 PMCID: PMC4726224 DOI: 10.1038/srep19274] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 09/15/2015] [Accepted: 12/10/2015] [Indexed: 01/14/2023] Open
Abstract
Allen Brain Atlas (ABA) provides a valuable resource of spatial/temporal gene expressions in mammalian brains. Despite rich information extracted from this database, current analyses suffer from several limitations. First, most studies are either gene-centric or region-centric, thus are inadequate to capture the superposition of multiple spatial-temporal patterns. Second, standard tools of expression analysis such as matrix factorization can capture those patterns but do not explicitly incorporate spatial dependency. To overcome those limitations, we proposed a computational method to detect recurrent patterns in the spatial-temporal gene expression data of developing mouse brains. We demonstrated that regional distinction in brain development could be revealed by localized gene expression patterns. The patterns expressed in the forebrain, medullary and pontomedullary, and basal ganglia are enriched with genes involved in forebrain development, locomotory behavior, and dopamine metabolism respectively. In addition, the timing of global gene expression patterns reflects the general trends of molecular events in mouse brain development. Furthermore, we validated functional implications of the inferred patterns by showing genes sharing similar spatial-temporal expression patterns with Lhx2 exhibited differential expression in the embryonic forebrains of Lhx2 mutant mice. These analysis outcomes confirm the utility of recurrent expression patterns in studying brain development.
|
16
|
Hu J, Zhao Z, Yalamanchili HK, Wang J, Ye K, Fan X. Bayesian detection of embryonic gene expression onset in C. elegans. Ann Appl Stat 2015. [DOI: 10.1214/15-aoas820] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/19/2022]
|
17
|
Yalamanchili HK, Li Z, Wang P, Wong MP, Yao J, Wang J. SpliceNet: recovering splicing isoform-specific differential gene networks from RNA-Seq data of normal and diseased samples. Nucleic Acids Res 2014; 42:e121. [PMID: 25034693 PMCID: PMC4150760 DOI: 10.1093/nar/gku577] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Indexed: 12/18/2022] Open
Abstract
Conventionally, overall gene expressions from microarrays are used to infer gene networks, but it is challenging to account for splicing isoforms. High-throughput RNA Sequencing has made splice variant profiling practical. However, its true merit in quantifying splicing isoforms and isoform-specific exon expressions is not well explored in inferring gene networks. This study demonstrates SpliceNet, a method to infer isoform-specific co-expression networks from exon-level RNA-Seq data, using large dimensional trace. It goes beyond differentially expressed genes and infers splicing isoform network changes between normal and diseased samples. It eases the sample size bottleneck; evaluations on simulated data and lung cancer-specific ERBB2 and MAPK signaling pathways, with varying number of samples, evince the merit in handling high exon to sample size ratio datasets. Inferred network rewiring of well established Bcl-x and EGFR centered networks from lung adenocarcinoma expression data is in good agreement with literature. Gene level evaluations demonstrate a substantial performance of SpliceNet over canonical correlation analysis, a method that is currently applied to exon level RNA-Seq data. SpliceNet can also be applied to exon array data. SpliceNet is distributed as an R package available at http://www.jjwanglab.org/SpliceNet.
Affiliation(s)
- Hari Krishna Yalamanchili
  - Department of Biochemistry, The University of Hong Kong, Hong Kong (SAR), China
  - Department of Pathology, The University of Hong Kong, Hong Kong (SAR), China
- Zhaoyuan Li
  - Centre for Genomic Sciences, L.K.S. Faculty of Medicine, The University of Hong Kong, Hong Kong (SAR), China
- Panwen Wang
  - Department of Biochemistry, The University of Hong Kong, Hong Kong (SAR), China
  - Department of Pathology, The University of Hong Kong, Hong Kong (SAR), China
- Maria P Wong
  - Department of Pathology, The University of Hong Kong, Hong Kong (SAR), China
  - Shenzhen Institute of Research and Innovation, The University of Hong Kong, Shenzhen, China
- Jianfeng Yao
  - Centre for Genomic Sciences, L.K.S. Faculty of Medicine, The University of Hong Kong, Hong Kong (SAR), China
- Junwen Wang
  - Department of Biochemistry, The University of Hong Kong, Hong Kong (SAR), China
  - Department of Pathology, The University of Hong Kong, Hong Kong (SAR), China
  - Department of Statistics & Actuarial Science, Faculty of Science, The University of Hong Kong, Hong Kong (SAR), China
|
18
|
Inferring gene regulatory networks by integrating ChIP-seq/chip and transcriptome data via LASSO-type regularization methods. Methods 2014; 67:294-303. [DOI: 10.1016/j.ymeth.2014.03.006] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.9] [Received: 11/21/2013] [Revised: 03/04/2014] [Accepted: 03/05/2014] [Indexed: 01/14/2023] Open
|