1
Evans KV, Ransom E, Nayakoti S, Wilding B, Mohd Salleh F, Gržina I, Erber L, Tse C, Hill C, Polanski K, Holland A, Bukhat S, Herbert RJ, de Graaf BHJ, Denby K, Buchanan-Wollaston V, Rogers HJ. Expression of the Arabidopsis redox-related LEA protein, SAG21 is regulated by ERF, NAC and WRKY transcription factors. Sci Rep 2024; 14:7756. [PMID: 38565965 PMCID: PMC10987515 DOI: 10.1038/s41598-024-58161-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Accepted: 03/26/2024] [Indexed: 04/04/2024] Open
Abstract
SAG21/LEA5 is an unusual late embryogenesis abundant protein in Arabidopsis thaliana, that is primarily mitochondrially located and may be important in regulating translation in both chloroplasts and mitochondria. SAG21 expression is regulated by a plethora of abiotic and biotic stresses and plant growth regulators indicating a complex regulatory network. To identify key transcription factors regulating SAG21 expression, yeast-1-hybrid screens were used to identify transcription factors that bind the 1685 bp upstream of the SAG21 translational start site. Thirty-three transcription factors from nine different families bound to the SAG21 promoter, including members of the ERF, WRKY and NAC families. Key binding sites for both NAC and WRKY transcription factors were tested through site directed mutagenesis indicating the presence of cryptic binding sites for both these transcription factor families. Co-expression in protoplasts confirmed the activation of SAG21 by WRKY63/ABO3, and SAG21 upregulation elicited by oligogalacturonide elicitors was partially dependent on WRKY63, indicating its role in SAG21 pathogen responses. SAG21 upregulation by ethylene was abolished in the erf1 mutant, while wound-induced SAG21 expression was abolished in anac71 mutants, indicating SAG21 expression can be regulated by several distinct transcription factors depending on the stress condition.
Affiliation(s)
- Kelly V Evans
- School of Biosciences, Cardiff University, Sir Martin Evans Building, Museum Avenue, Cardiff, CF10 3AT, UK
- Elspeth Ransom
- School of Biosciences, Cardiff University, Sir Martin Evans Building, Museum Avenue, Cardiff, CF10 3AT, UK
- Swapna Nayakoti
- School of Biosciences, Cardiff University, Sir Martin Evans Building, Museum Avenue, Cardiff, CF10 3AT, UK
- Ben Wilding
- School of Biosciences, Cardiff University, Sir Martin Evans Building, Museum Avenue, Cardiff, CF10 3AT, UK
- Faezah Mohd Salleh
- School of Biosciences, Cardiff University, Sir Martin Evans Building, Museum Avenue, Cardiff, CF10 3AT, UK
- Investigative and Forensic Sciences Research Group, Universiti Teknologi Malaysia, 81310, Johor Bahru, Johor, Malaysia
- Irena Gržina
- School of Biosciences, Cardiff University, Sir Martin Evans Building, Museum Avenue, Cardiff, CF10 3AT, UK
- Lieselotte Erber
- School of Biosciences, Cardiff University, Sir Martin Evans Building, Museum Avenue, Cardiff, CF10 3AT, UK
- Carmen Tse
- School of Biosciences, Cardiff University, Sir Martin Evans Building, Museum Avenue, Cardiff, CF10 3AT, UK
- Claire Hill
- School of Life Sciences, University of Warwick, Coventry, CV4 7AL, UK
- Alistair Holland
- School of Biosciences, Cardiff University, Sir Martin Evans Building, Museum Avenue, Cardiff, CF10 3AT, UK
- Sherien Bukhat
- School of Biosciences, Cardiff University, Sir Martin Evans Building, Museum Avenue, Cardiff, CF10 3AT, UK
- Robert J Herbert
- School of Science and the Environment, University of Worcester, Henwick Grove, Worcester, WR2 6AJ, UK
- Barend H J de Graaf
- School of Biosciences, Cardiff University, Sir Martin Evans Building, Museum Avenue, Cardiff, CF10 3AT, UK
- Katherine Denby
- Department of Biology, Centre for Novel Agricultural Products (CNAP), University of York, Heslington, York, YO10 5DD, UK
- Hilary J Rogers
- School of Biosciences, Cardiff University, Sir Martin Evans Building, Museum Avenue, Cardiff, CF10 3AT, UK
2
Tomkins M, Hoerbst F, Gupta S, Apelt F, Kehr J, Kragler F, Morris RJ. Exact Bayesian inference for the detection of graft-mobile transcripts from sequencing data. J R Soc Interface 2022; 19:20220644. [PMID: 36514890 PMCID: PMC9748499 DOI: 10.1098/rsif.2022.0644] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/15/2022] Open
Abstract
The long-distance transport of messenger RNAs (mRNAs) has been shown to be important for several developmental processes in plants. A popular method for identifying travelling mRNAs is to perform RNA-Seq on grafted plants. This approach depends on the ability to correctly assign sequenced mRNAs to the genetic background from which they originated. The assignment is often based on the identification of single-nucleotide polymorphisms (SNPs) between otherwise identical sequences. A major challenge is therefore to distinguish SNPs from sequencing errors. Here, we show how Bayes factors can be computed analytically using RNA-Seq data over all the SNPs in an mRNA. We used simulations to evaluate the performance of the proposed framework and demonstrate how Bayes factors accurately identify graft-mobile transcripts. The comparison with other detection methods using simulated data shows how not taking the variability in read depth, error rates and multiple SNPs per transcript into account can lead to incorrect classification. Our results suggest experimental design criteria for successful graft-mobile mRNA detection and show the pitfalls of filtering for sequencing errors or focusing on single SNPs within an mRNA.
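To make the idea of an analytically computed Bayes factor over several SNPs concrete, the following is a minimal sketch only. It assumes a generic Beta-binomial model, independent SNPs, and two illustrative priors (an "error-only" prior concentrated near a 1% alternate-allele rate and a flat "mobile" prior); none of these choices, nor the function names, are taken from the paper, and they should not be read as the authors' model.

```python
import math

def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def log_marginal(alt_reads, depth, a, b):
    """Log marginal likelihood of alt_reads out of depth reads under a
    Beta(a, b) prior on the alternate-allele rate. The binomial coefficient
    is omitted because it cancels in the Bayes factor."""
    return log_beta(alt_reads + a, depth - alt_reads + b) - log_beta(a, b)

def log_bayes_factor(snps, error_prior=(1.0, 99.0), mobile_prior=(1.0, 1.0)):
    """Sum per-SNP log Bayes factors (mobile vs. error-only) over a transcript.
    snps is a list of (alt_reads, depth) pairs; both priors are assumptions."""
    return sum(
        log_marginal(n, depth, *mobile_prior) - log_marginal(n, depth, *error_prior)
        for n, depth in snps
    )

# Hypothetical read counts for two transcripts with two SNPs each.
print(log_bayes_factor([(0, 100), (1, 120)]))   # few alternate reads
print(log_bayes_factor([(30, 100), (25, 80)]))  # many alternate reads
```

Under these assumed priors, the first transcript yields a negative log Bayes factor (evidence for sequencing error) and the second a positive one. Summing over SNPs is what lets read depth and multiple SNPs per transcript contribute jointly rather than one SNP at a time.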
Affiliation(s)
- Melissa Tomkins
- Computational and Systems Biology, John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
- Franziska Hoerbst
- Computational and Systems Biology, John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
- Saurabh Gupta
- Max Planck Institute of Molecular Plant Physiology, Max Planck Institute, Am Mühlenberg 1, Potsdam-Golm 14476, Germany
- Federico Apelt
- Max Planck Institute of Molecular Plant Physiology, Max Planck Institute, Am Mühlenberg 1, Potsdam-Golm 14476, Germany
- Julia Kehr
- Institute of Plant Science and Microbiology, Universität Hamburg, Ohnhorststrasse 18, Hamburg 22609, Germany
- Friedrich Kragler
- Max Planck Institute of Molecular Plant Physiology, Max Planck Institute, Am Mühlenberg 1, Potsdam-Golm 14476, Germany
- Richard J. Morris
- Computational and Systems Biology, John Innes Centre, Norwich Research Park, Norwich NR4 7UH, UK
3
Costa-Silva J, Domingues DS, Menotti D, Hungria M, Lopes FM. Temporal progress of gene expression analysis with RNA-Seq data: A review on the relationship between computational methods. Comput Struct Biotechnol J 2022; 21:86-98. [PMID: 36514333 PMCID: PMC9730150 DOI: 10.1016/j.csbj.2022.11.051] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/15/2022] [Revised: 11/25/2022] [Accepted: 11/25/2022] [Indexed: 12/03/2022] Open
Abstract
Analysis of differential gene expression from RNA-seq data has become a standard for several research areas. The steps for the computational analysis include many data types and file formats, and a wide variety of computational tools that can be applied alone or together as pipelines. This paper presents a review of the differential expression analysis pipeline, addressing its steps and the respective objectives, the principal methods available in each step, and their properties, therefore introducing an organized overview to this context. This review aims to address mainly the aspects involved in the differentially expressed gene (DEG) analysis from RNA sequencing data (RNA-seq), considering the computational methods. In addition, a timeline of the computational methods for DEG is shown and discussed, and the relationships existing between the most important computational tools are presented by an interaction network. A discussion on the challenges and gaps in DEG analysis is also highlighted in this review. This paper will serve as a tutorial for new entrants into the field and help established users update their analysis pipelines.
Affiliation(s)
- Juliana Costa-Silva
- Department of Informatics – Federal University of Paraná, Rua Coronel Francisco Heráclito dos Santos, 100, 81531-990 Curitiba, Paraná, Brazil
- Douglas S. Domingues
- Department of Genetics, “Luiz de Queiroz” College of Agriculture, University of São Paulo, Av. Pádua Dias, 11, 13418-900 Piracicaba, São Paulo, Brazil
- David Menotti
- Department of Informatics – Federal University of Paraná, Rua Coronel Francisco Heráclito dos Santos, 100, 81531-990 Curitiba, Paraná, Brazil
- Mariangela Hungria
- Department of Soil Biotechnology - Embrapa Soybean, Cx. Postal 231, 86000-970 Londrina, Paraná, Brazil
- Fabrício Martins Lopes
- Department of Computer Science, Universidade Tecnológica Federal do Paraná – UTFPR, Av. Alberto Carazzai, 1640, 86300-000, Cornélio Procópio, Paraná, Brazil
4
Van den Broeck L, Gordon M, Inzé D, Williams C, Sozzani R. Gene Regulatory Network Inference: Connecting Plant Biology and Mathematical Modeling. Front Genet 2020; 11:457. [PMID: 32547596 PMCID: PMC7270862 DOI: 10.3389/fgene.2020.00457] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 11/05/2019] [Accepted: 04/14/2020] [Indexed: 12/26/2022] Open
Abstract
Plant responses to environmental and intrinsic signals are tightly controlled by multiple transcription factors (TFs). These TFs and their regulatory connections form gene regulatory networks (GRNs), which provide a blueprint of the transcriptional regulations underlying plant development and environmental responses. This review provides examples of experimental methodologies commonly used to identify regulatory interactions and generate GRNs. Additionally, this review describes network inference techniques that leverage gene expression data to predict regulatory interactions. These computational and experimental methodologies yield complex networks that can identify new regulatory interactions, driving novel hypotheses. Biological properties that contribute to the complexity of GRNs are also described in this review. These include network topology, network size, transient binding of TFs to DNA, and competition between multiple upstream regulators. Finally, this review highlights the potential of machine learning approaches to leverage gene expression data to predict phenotypic outputs.
Affiliation(s)
- Lisa Van den Broeck
- Department of Plant and Microbial Biology, North Carolina State University, Raleigh, NC, United States
- Max Gordon
- Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC, United States
- Dirk Inzé
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium; VIB Center for Plant Systems Biology, Ghent, Belgium
- Cranos Williams
- Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC, United States
- Rosangela Sozzani
- Department of Plant and Microbial Biology, North Carolina State University, Raleigh, NC, United States
5
Strauß ME, Reid JE, Wernisch L. GPseudoRank: a permutation sampler for single cell orderings. Bioinformatics 2019; 35:611-618. [PMID: 30052778 PMCID: PMC6230469 DOI: 10.1093/bioinformatics/bty664] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 02/15/2018] [Revised: 06/13/2018] [Accepted: 07/24/2018] [Indexed: 11/30/2022] Open
Abstract
Motivation: A number of pseudotime methods have provided point estimates of the ordering of cells for scRNA-seq data. A still limited number of methods also model the uncertainty of the pseudotime estimate. However, there is still a need for a method to sample from complicated and multi-modal distributions of orders, and to estimate changes in the amount of the uncertainty of the order during the course of a biological development, as this can support the selection of suitable cells for the clustering of genes or for network inference. Results: In applications to scRNA-seq data we demonstrate the potential of GPseudoRank to sample from complex and multi-modal posterior distributions and to identify phases of lower and higher pseudotime uncertainty during a biological process. GPseudoRank also correctly identifies cells precocious in their antiviral response and links uncertainty in the ordering to metastable states. A variant of the method extends the advantages of Bayesian modelling and MCMC to large droplet-based scRNA-seq datasets. Availability and implementation: Our method is available on github: https://github.com/magStra/GPseudoRank. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Magdalena E Strauß
- MRC Biostatistics Unit, School of Clinical Medicine, University of Cambridge, Cambridge, UK
- John E Reid
- MRC Biostatistics Unit, School of Clinical Medicine, University of Cambridge, Cambridge, UK; Alan Turing Institute, London, UK
- Lorenz Wernisch
- MRC Biostatistics Unit, School of Clinical Medicine, University of Cambridge, Cambridge, UK
6
Abstract
Gaussian process dynamical systems (GPDS) represent Bayesian nonparametric approaches to inference of nonlinear dynamical systems, and provide a principled framework for the learning of biological networks from multiple perturbed time series measurements of gene or protein expression. Such approaches are able to capture the full richness of complex ODE models, and can be scaled for inference in moderately large systems containing hundreds of genes. Related hierarchical approaches allow for inference from multiple datasets in which the underlying generative networks are assumed to have been rewired, either by context-dependent changes in network structure, evolutionary processes, or synthetic manipulation. These approaches can also be used to leverage experimentally determined network structures from one species into another where the network structure is unknown. Collectively, these methods provide a comprehensive and flexible platform for inference from a diverse range of data, with applications in systems and synthetic biology, as well as spatiotemporal modelling of embryo development. In this chapter we provide an overview of GPDS approaches and highlight their applications in the biological sciences, with accompanying tutorials available as a Jupyter notebook from https://github.com/cap76/GPDS .
Affiliation(s)
- Iulia Gherman
- Warwick Integrative Synthetic Biology Centre, School of Engineering, University of Warwick, Coventry, UK
- Anastasiya Sybirna
- Wellcome/CRUK Gurdon Institute, University of Cambridge, Cambridge, UK
- Wellcome/MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge, UK
- Physiology, Development and Neuroscience Department, University of Cambridge, Cambridge, UK
- David L Wild
- Department of Statistics and Systems Biology Centre, University of Warwick, Coventry, UK
7
Dondelinger F, Mukherjee S. Statistical Network Inference for Time-Varying Molecular Data with Dynamic Bayesian Networks. Methods Mol Biol 2019; 1883:25-48. [PMID: 30547395 DOI: 10.1007/978-1-4939-8882-2_2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 03/24/2023]
Abstract
In this chapter, we review the problem of network inference from time-course data, focusing on a class of graphical models known as dynamic Bayesian networks (DBNs). We discuss the relationship of DBNs to models based on ordinary differential equations, and consider extensions to nonlinear time dynamics. We provide an introduction to time-varying DBN models, which allow for changes to the network structure and parameters over time. We also discuss causal perspectives on network inference, including issues around model semantics that can arise due to missing variables. We present a case study of applying time-varying DBNs to gene expression measurements over the life cycle of Drosophila melanogaster. We finish with a discussion of future perspectives, including possible applications of time-varying network inference to single-cell gene expression data.
Affiliation(s)
- Sach Mukherjee
- German Center for Neurodegenerative Diseases (DZNE), Bonn, Germany
8
Abstract
Gene regulatory networks are powerful abstractions of biological systems. Since the advent of high-throughput measurement technologies in biology in the late 1990s, reconstructing the structure of such networks has been a central computational problem in systems biology. While the problem is certainly not solved in its entirety, considerable progress has been made in the last two decades, with mature tools now available. This chapter aims to provide an introduction to the basic concepts underpinning network inference tools, attempting a categorization which highlights commonalities and relative strengths. While the chapter is meant to be self-contained, the material presented should provide a useful background to the later, more specialized chapters of this book.
Affiliation(s)
- Vân Anh Huynh-Thu
- Department of Electrical Engineering and Computer Science, University of Liège, Liège, Belgium
9
Penfold CA, Sybirna A, Reid JE, Huang Y, Wernisch L, Ghahramani Z, Grant M, Surani MA. Branch-recombinant Gaussian processes for analysis of perturbations in biological time series. Bioinformatics 2018; 34:i1005-i1013. [PMID: 30423108 PMCID: PMC6129282 DOI: 10.1093/bioinformatics/bty603] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/26/2022] Open
Abstract
Motivation: A common class of behaviour encountered in the biological sciences involves branching and recombination. During branching, a statistical process bifurcates resulting in two or more potentially correlated processes that may undergo further branching; the contrary is true during recombination, where two or more statistical processes converge. A key objective is to identify the time of this bifurcation (branch or recombination time) from time series measurements, e.g. by comparing a control time series with perturbed time series. Gaussian processes (GPs) represent an ideal framework for such analysis, allowing for nonlinear regression that includes a rigorous treatment of uncertainty. Currently, however, GP models only exist for two-branch systems. Here, we highlight how arbitrarily complex branching processes can be built using the correct composition of covariance functions within a GP framework, thus outlining a general framework for the treatment of branching and recombination in the form of branch-recombinant Gaussian processes (B-RGPs). Results: We first benchmark the performance of B-RGPs compared to a variety of existing regression approaches, and demonstrate robustness to model misspecification. B-RGPs are then used to investigate the branching patterns of Arabidopsis thaliana gene expression following inoculation with the hemibiotrophic bacteria, Pseudomonas syringae DC3000, and a disarmed mutant strain, hrpA. By grouping genes according to the number of branches, we could naturally separate out genes involved in basal immune response from those subverted by the virulent strain, and show enrichment for targets of pathogen protein effectors. Finally, we identify two early branching genes WRKY11 and WRKY17, and show that genes that branched at similar times to WRKY11/17 were enriched for W-box binding motifs, and overrepresented for genes differentially expressed in WRKY11/17 knockouts, suggesting that branch time could be used for identifying direct and indirect binding targets of key transcription factors. Availability and implementation: https://github.com/cap76/BranchingGPs. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Christopher A Penfold
- Wellcome Trust/Cancer Research UK Gurdon Institute, Henry Wellcome Building of Cancer and Developmental Biology, Cambridge, UK; Department of Statistics, University of Warwick, Coventry, UK
- Anastasiya Sybirna
- Wellcome Trust/Cancer Research UK Gurdon Institute, Henry Wellcome Building of Cancer and Developmental Biology, Cambridge, UK; Wellcome/MRC Stem Cell Institute, University of Cambridge, UK; Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge
- John E Reid
- MRC Biostatistics Unit, University of Cambridge, Cambridge Institute of Public Health, Cambridge Biomedical Campus, Cambridge, UK; The Alan Turing Institute, London, UK
- Yun Huang
- Wellcome Trust/Cancer Research UK Gurdon Institute, Henry Wellcome Building of Cancer and Developmental Biology, Cambridge, UK; Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge
- Lorenz Wernisch
- MRC Biostatistics Unit, University of Cambridge, Cambridge Institute of Public Health, Cambridge Biomedical Campus, Cambridge, UK
- Murray Grant
- School of Life Sciences, Gibbet Hill Campus, The University of Warwick, Coventry, UK
- M Azim Surani
- Wellcome Trust/Cancer Research UK Gurdon Institute, Henry Wellcome Building of Cancer and Developmental Biology, Cambridge, UK; Department of Statistics, University of Warwick, Coventry, UK; Wellcome/MRC Stem Cell Institute, University of Cambridge, UK
10
Temporal transcriptional logic of dynamic regulatory networks underlying nitrogen signaling and use in plants. Proc Natl Acad Sci U S A 2018; 115:6494-6499. [PMID: 29769331 PMCID: PMC6016767 DOI: 10.1073/pnas.1721487115] [Citation(s) in RCA: 111] [Impact Index Per Article: 18.5] [Indexed: 01/19/2023] Open
Abstract
Our study exploits time—the relatively unexplored fourth dimension of gene regulatory networks (GRNs)—to learn the temporal transcriptional logic underlying dynamic nitrogen (N) signaling in plants. We introduce several conceptual innovations to the analysis of time-series data in the area of predictive GRNs. Our resulting network now provides the “transcriptional logic” for transcription factor perturbations aimed at improving N-use efficiency, an important issue for global food production in marginal soils and for sustainable agriculture. More broadly, the combination of the time-based approaches we develop and deploy can be applied to uncover the temporal “transcriptional logic” for any response system in biology, agriculture, or medicine. This study exploits time, the relatively unexplored fourth dimension of gene regulatory networks (GRNs), to learn the temporal transcriptional logic underlying dynamic nitrogen (N) signaling in plants. Our “just-in-time” analysis of time-series transcriptome data uncovered a temporal cascade of cis elements underlying dynamic N signaling. To infer transcription factor (TF)-target edges in a GRN, we applied a time-based machine learning method to 2,174 dynamic N-responsive genes. We experimentally determined a network precision cutoff, using TF-regulated genome-wide targets of three TF hubs (CRF4, SNZ, and CDF1), used to “prune” the network to 155 TFs and 608 targets. This network precision was reconfirmed using genome-wide TF-target regulation data for four additional TFs (TGA1, HHO5/6, and PHL1) not used in network pruning. These higher-confidence edges in the GRN were further filtered by independent TF-target binding data, used to calculate a TF “N-specificity” index. This refined GRN identifies the temporal relationship of known/validated regulators of N signaling (NLP7/8, TGA1/4, NAC4, HRS1, and LBD37/38/39) and 146 additional regulators. 
Six TFs—CRF4, SNZ, CDF1, HHO5/6, and PHL1—validated herein regulate a significant number of genes in the dynamic N response, targeting 54% of N-uptake/assimilation pathway genes. Phenotypically, inducible overexpression of CRF4 in planta regulates genes resulting in altered biomass, root development, and 15NO3− uptake, specifically under low-N conditions. This dynamic N-signaling GRN now provides the temporal “transcriptional logic” for 155 candidate TFs to improve nitrogen use efficiency with potential agricultural applications. Broadly, these time-based approaches can uncover the temporal transcriptional logic for any biological response system in biology, agriculture, or medicine.
11
Scofield S, Murison A, Jones A, Fozard J, Aida M, Band LR, Bennett M, Murray JAH. Coordination of meristem and boundary functions by transcription factors in the SHOOT MERISTEMLESS regulatory network. Development 2018; 145:dev157081. [PMID: 29650590 PMCID: PMC5992597 DOI: 10.1242/dev.157081] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 07/10/2017] [Accepted: 03/21/2018] [Indexed: 01/29/2023]
Abstract
The Arabidopsis homeodomain transcription factor SHOOT MERISTEMLESS (STM) is crucial for shoot apical meristem (SAM) function, yet the components and structure of the STM gene regulatory network (GRN) are largely unknown. Here, we show that transcriptional regulators are overrepresented among STM-regulated genes and, using these as GRN components in Bayesian network analysis, we infer STM GRN associations and reveal regulatory relationships between STM and factors involved in multiple aspects of SAM function. These include hormone regulation, TCP-mediated control of cell differentiation, AIL/PLT-mediated regulation of pluripotency and phyllotaxis, and specification of meristem-organ boundary zones via CUC1. We demonstrate a direct positive transcriptional feedback loop between STM and CUC1, despite their distinct expression patterns in the meristem and organ boundary, respectively. Our further finding that STM activates expression of the CUC1-targeting microRNA miR164c combined with mathematical modelling provides a potential solution for this apparent contradiction, demonstrating that these proposed regulatory interactions coupled with STM mobility could be sufficient to provide a mechanism for CUC1 localisation at the meristem-organ boundary. Our findings highlight the central role for the STM GRN in coordinating SAM functions.
Affiliation(s)
- Simon Scofield
- School of Biosciences, Cardiff University, Museum Avenue, Cardiff CF10 3AX, UK
- Alexander Murison
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario M5G 2M9, Canada
- Angharad Jones
- School of Biosciences, Cardiff University, Museum Avenue, Cardiff CF10 3AX, UK
- John Fozard
- Department of Computational and Systems Biology, John Innes Centre, Norwich NR4 7UH, UK
- Mitsuhiro Aida
- International Research Organization for Advanced Science and Technology (IROAST) Kumamoto University, 2-39-1 Kurokami, Chuo-ku, Kumamoto 860-8555, Japan
- Leah R Band
- Centre for Plant Integrative Biology, Division of Plant and Crop Sciences, School of Biosciences, University of Nottingham, Loughborough LE12 5RD, UK
- Centre for Mathematical Medicine and Biology, School of Mathematical Sciences, University of Nottingham, Nottingham NG7 2RD, UK
- Malcolm Bennett
- Centre for Plant Integrative Biology, Division of Plant and Crop Sciences, School of Biosciences, University of Nottingham, Loughborough LE12 5RD, UK
- James A H Murray
- School of Biosciences, Cardiff University, Museum Avenue, Cardiff CF10 3AX, UK
12
Thorne T. Approximate inference of gene regulatory network models from RNA-Seq time series data. BMC Bioinformatics 2018; 19:127. [PMID: 29642837 PMCID: PMC5896118 DOI: 10.1186/s12859-018-2125-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 07/04/2017] [Accepted: 03/22/2018] [Indexed: 01/08/2023] Open
Abstract
Background: Inference of gene regulatory network structures from RNA-Seq data is challenging due to the nature of the data, as measurements take the form of counts of reads mapped to a given gene. Here we present a model for RNA-Seq time series data that applies a negative binomial distribution for the observations, and uses sparse regression with a horseshoe prior to learn a dynamic Bayesian network of interactions between genes. We use a variational inference scheme to learn approximate posterior distributions for the model parameters. Results: The methodology is benchmarked on synthetic data designed to replicate the distribution of real world RNA-Seq data. We compare our method to other sparse regression approaches and find improved performance in learning directed networks. We demonstrate an application of our method to a publicly available human neuronal stem cell differentiation RNA-Seq time series data set to infer the underlying network structure. Conclusions: Our method is able to improve performance on synthetic data by explicitly modelling the statistical distribution of the data when learning networks from RNA-Seq time series. Applying approximate inference techniques we can learn network structures quickly with only moderate computing resources.
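As a small illustration of why a negative binomial observation model suits read counts, the sketch below evaluates its log probability in the common mean/dispersion form, where the variance is mu + phi * mu^2 and so exceeds the mean. The parameterisation, the function name `nb_log_pmf`, and the example values are assumptions chosen for illustration; the paper's own parameterisation and inference scheme may differ.

```python
import math

def nb_log_pmf(k, mu, phi):
    """Log probability of count k under a negative binomial with mean mu and
    dispersion phi, so that variance = mu + phi * mu**2."""
    r = 1.0 / phi  # "size" parameter of the standard (r, p) form
    return (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(r / (r + mu))
        + k * math.log(mu / (r + mu))
    )

# Hypothetical gene with a mean of 5 reads and dispersion 0.5.
probs = [math.exp(nb_log_pmf(k, mu=5.0, phi=0.5)) for k in range(2000)]
mean = sum(k * p for k, p in enumerate(probs))
var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
print(sum(probs), mean, var)
```

The printed variance is well above the mean, which is the overdispersion that a Poisson model (variance equal to the mean) cannot represent.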
Affiliation(s)
- Thomas Thorne
- Department of Computer Science, University of Reading, Reading, UK.
13
Polanski K, Gao B, Mason SA, Brown P, Ott S, Denby KJ, Wild DL. Bringing numerous methods for expression and promoter analysis to a public cloud computing service. Bioinformatics 2018; 34:884-886. [PMID: 29126246 PMCID: PMC6030968 DOI: 10.1093/bioinformatics/btx692] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 07/17/2017] [Accepted: 11/03/2017] [Indexed: 12/24/2022] Open
Abstract
Summary: Every year, a large number of novel algorithms are introduced to the scientific community for a myriad of applications, but using these across different research groups is often troublesome, due to suboptimal implementations and specific dependency requirements. This does not have to be the case, as public cloud computing services can easily house tractable implementations within self-contained dependency environments, making the methods easily accessible to a wider public. We have taken 14 popular methods, the majority related to expression data or promoter analysis, developed these up to a good implementation standard and housed the tools in isolated Docker containers which we integrated into the CyVerse Discovery Environment, making these easily usable for a wide community as part of the CyVerse UK project. Availability and implementation: The integrated apps can be found at http://www.cyverse.org/discovery-environment, while the raw code is available at https://github.com/cyversewarwick and the corresponding Docker images are housed at https://hub.docker.com/r/cyversewarwick/. Contact: info@cyverse.warwick.ac.uk or D.L.Wild@warwick.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Paul Brown
  - Department of Mathematics
  - Systems Biology Centre
- Sascha Ott
  - Systems Biology Centre
  - Department of Computer Science, University of Warwick, Coventry CV4 7AL, UK
|
14
|
Improving Gene Regulatory Network Inference by Incorporating Rates of Transcriptional Changes. Sci Rep 2017; 7:17244. [PMID: 29222512 PMCID: PMC5722905 DOI: 10.1038/s41598-017-17143-1] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 08/31/2017] [Accepted: 11/22/2017] [Indexed: 11/18/2022]
Abstract
Organisms respond to changes in their environment through transcriptional regulatory networks (TRNs). The regulatory hierarchy of these networks can be inferred from expression data. Computational approaches to identify TRNs can be applied in any species where quality RNA can be acquired. However, ChIP-Seq and similar validation methods are challenging to employ in non-model species. Improving the accuracy of computational inference methods can significantly reduce the cost and time of subsequent validation experiments. We have developed ExRANGES, an approach that improves the ability to computationally infer TRNs from time series expression data. ExRANGES utilizes both the rate of change in expression and the absolute expression level to identify TRN connections. We evaluated ExRANGES in five data sets from different model systems. ExRANGES improved the identification of experimentally validated transcription factor targets for all species tested, even in unevenly spaced and sparse data sets. This improved ability to predict known regulator-target relationships enhances the utility of network inference approaches in non-model species where experimental validation is challenging. We integrated ExRANGES with two different network construction approaches and it has been implemented as an R package available here: http://github.com/DohertyLab/ExRANGES. To install the package type: devtools::install_github("DohertyLab/ExRANGES").
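The abstract above refers to using the rate of change in expression alongside the absolute level, including for unevenly spaced time points. As a rough illustration only, the sketch below computes a finite-difference rate of change per gene between consecutive samples. It is not the ExRANGES algorithm or its R API; the function name `rate_of_change` and the toy data are assumptions made for this example.

```python
import numpy as np


def rate_of_change(times, expr):
    """Finite-difference slope between consecutive time points.

    times: 1-D sequence of T sampling times (may be unevenly spaced).
    expr:  array of shape (genes, T) with expression levels.
    Returns an array of shape (genes, T - 1).
    Hypothetical helper for illustration, not part of ExRANGES.
    """
    times = np.asarray(times, dtype=float)
    expr = np.asarray(expr, dtype=float)
    dt = np.diff(times)                   # interval lengths, shape (T - 1,)
    return np.diff(expr, axis=1) / dt     # broadcast over genes


# Toy data: two genes sampled at unevenly spaced times (hours).
times = [0, 1, 2, 4, 8]
expr = np.array([
    [1.0, 2.0, 4.0, 4.0, 0.0],
    [5.0, 5.0, 5.0, 7.0, 7.0],
])
rates = rate_of_change(times, expr)
```

Dividing by the interval length keeps slopes comparable when sampling is uneven, which is the situation the abstract highlights.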
|
15
|
Minas G, Jenkins DJ, Rand DA, Finkenstädt B. Inferring transcriptional logic from multiple dynamic experiments. Bioinformatics 2017; 33:3437-3444. [PMID: 28666320 PMCID: PMC5860162 DOI: 10.1093/bioinformatics/btx407] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/30/2016] [Revised: 06/02/2017] [Accepted: 06/27/2017] [Indexed: 01/10/2023]
Abstract
MOTIVATION The availability of more data of dynamic gene expression under multiple experimental conditions provides new information that makes the key goal of identifying not only the transcriptional regulators of a gene but also the underlying logical structure attainable. RESULTS We propose a novel method for inferring transcriptional regulation using a simple, yet biologically interpretable, model to find the logic by which a set of candidate genes and their associated transcription factors (TFs) regulate the transcriptional process of a gene of interest. Our dynamic model links the mRNA transcription rate of the target gene to the activation states of the TFs assuming that these interactions are consistent across multiple experiments and over time. A trans-dimensional Markov Chain Monte Carlo (MCMC) algorithm is used to efficiently sample the regulatory logic under different combinations of parents and rank the estimated models by their posterior probabilities. We demonstrate and compare our methodology with other methods using simulation examples and apply it to a study of transcriptional regulation of selected target genes of Arabidopsis thaliana from microarray time series data obtained under multiple biotic stresses. We show that our method is able to detect complex regulatory interactions that are consistent under multiple experimental conditions. AVAILABILITY AND IMPLEMENTATION Programs are written in MATLAB and Statistics Toolbox Release 2016b, The MathWorks, Inc., Natick, Massachusetts, United States and are available on GitHub https://github.com/giorgosminas/TRS and at http://www2.warwick.ac.uk/fac/sci/systemsbiology/research/software. CONTACT giorgos.minas@warwick.ac.uk or b.f.finkenstadt@warwick.ac.uk. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Giorgos Minas
  - Mathematics Institute, University of Warwick, Coventry, CV4 7AL, UK
  - Zeeman Institute, Systems Biology and Infectious Disease Epidemiology Research, University of Warwick, Coventry, CV4 7AL, UK
- Dafyd J Jenkins
  - Zeeman Institute, Systems Biology and Infectious Disease Epidemiology Research, University of Warwick, Coventry, CV4 7AL, UK
- David A Rand
  - Mathematics Institute, University of Warwick, Coventry, CV4 7AL, UK
  - Zeeman Institute, Systems Biology and Infectious Disease Epidemiology Research, University of Warwick, Coventry, CV4 7AL, UK
|
16
|
Yu B, Xu JM, Li S, Chen C, Chen RX, Wang L, Zhang Y, Wang MH. Inference of time-delayed gene regulatory networks based on dynamic Bayesian network hybrid learning method. Oncotarget 2017; 8:80373-80392. [PMID: 29113310 PMCID: PMC5655205 DOI: 10.18632/oncotarget.21268] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 07/20/2017] [Accepted: 08/27/2017] [Indexed: 01/31/2023]
Abstract
Research on gene regulatory networks (GRNs), an important field in systems biology, reveals complex life phenomena from the perspective of gene interaction. Traditional Bayesian networks have a high computational complexity, and the network structure scoring model has a single feature. Information-based approaches cannot identify the direction of regulation. To make up for the shortcomings of these methods, this paper presents a novel hybrid learning method (DBNCS) based on the dynamic Bayesian network (DBN) to construct multiple time-delayed GRNs for the first time, combining the comprehensive score (CS) with the DBN model. The DBNCS algorithm first uses the CMI2NI (conditional mutual inclusive information-based network inference) algorithm to learn network structure profiles, namely to construct the search space. Redundant regulations are then removed using the recursive optimization algorithm (RO), thereby reducing the false positive rate. Next, the network structure profiles are decomposed into a set of cliques without loss, which can significantly reduce the computational complexity. Finally, the DBN model is used to identify the direction of gene regulation within the cliques and search for the optimal network structure. The performance of the DBNCS algorithm is evaluated on the benchmark GRN datasets from the DREAM challenge as well as the SOS DNA repair network in Escherichia coli, and compared with other state-of-the-art methods. The experimental results show the rationality of the algorithm design and its outstanding performance in inferring GRNs.
Affiliation(s)
- Bin Yu
  - College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China
  - CAS Key Laboratory of Geospace Environment, Department of Geophysics and Planetary Science, University of Science and Technology of China, Hefei 230026, China
  - Bioinformatics and Systems Biology Research Center, Qingdao University of Science and Technology, Qingdao 266061, China
- Jia-Meng Xu
  - College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China
  - Bioinformatics and Systems Biology Research Center, Qingdao University of Science and Technology, Qingdao 266061, China
- Shan Li
  - College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China
  - Bioinformatics and Systems Biology Research Center, Qingdao University of Science and Technology, Qingdao 266061, China
- Cheng Chen
  - College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China
  - Bioinformatics and Systems Biology Research Center, Qingdao University of Science and Technology, Qingdao 266061, China
- Rui-Xin Chen
  - College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China
  - Bioinformatics and Systems Biology Research Center, Qingdao University of Science and Technology, Qingdao 266061, China
- Lei Wang
  - Key Laboratory of Eco-chemical Engineering, Ministry of Education, Laboratory of Inorganic Synthesis and Applied Chemistry, College of Chemistry and Molecular Engineering, Qingdao University of Science and Technology, Qingdao 266042, China
- Yan Zhang
  - Bioinformatics and Systems Biology Research Center, Qingdao University of Science and Technology, Qingdao 266061, China
  - College of Electromechanical Engineering, Qingdao University of Science and Technology, Qingdao 266061, China
- Ming-Hui Wang
  - College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China
  - Bioinformatics and Systems Biology Research Center, Qingdao University of Science and Technology, Qingdao 266061, China
|
17
|
McGoff KA, Guo X, Deckard A, Kelliher CM, Leman AR, Francey LJ, Hogenesch JB, Haase SB, Harer JL. The Local Edge Machine: inference of dynamic models of gene regulation. Genome Biol 2016; 17:214. [PMID: 27760556 PMCID: PMC5072315 DOI: 10.1186/s13059-016-1076-z] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 06/13/2016] [Accepted: 10/03/2016] [Indexed: 12/31/2022]
Abstract
We present a novel approach, the Local Edge Machine, for the inference of regulatory interactions directly from time-series gene expression data. We demonstrate its performance, robustness, and scalability on in silico datasets with varying behaviors, sizes, and degrees of complexity. Moreover, we demonstrate its ability to incorporate biological prior information and make informative predictions on a well-characterized in vivo system using data from budding yeast that have been synchronized in the cell cycle. Finally, we use an atlas of transcription data in a mammalian circadian system to illustrate how the method can be used for discovery in the context of large complex networks.
Affiliation(s)
- Kevin A McGoff
  - Department of Mathematics and Statistics, UNC Charlotte, 9201 University City Blvd., Charlotte, 28269, NC, USA.
- Xin Guo
  - Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong, China
- Adam R Leman
  - Department of Biology, Duke University, Durham, NC, USA
- Lauren J Francey
  - Department of Molecular and Cellular Physiology, University of Cincinnati, Cincinnati, OH, USA
- John B Hogenesch
  - Department of Molecular and Cellular Physiology, University of Cincinnati, Cincinnati, OH, USA
- John L Harer
  - Department of Mathematics, Duke University, Durham, NC, USA
|
18
|
Banf M, Rhee SY. Computational inference of gene regulatory networks: Approaches, limitations and opportunities. BIOCHIMICA ET BIOPHYSICA ACTA-GENE REGULATORY MECHANISMS 2016; 1860:41-52. [PMID: 27641093 DOI: 10.1016/j.bbagrm.2016.09.003] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Received: 04/13/2016] [Revised: 09/08/2016] [Accepted: 09/08/2016] [Indexed: 10/21/2022]
Abstract
Gene regulatory networks lie at the core of cell function control. In E. coli and S. cerevisiae, the study of gene regulatory networks has led to the discovery of regulatory mechanisms responsible for the control of cell growth, differentiation and responses to environmental stimuli. In plants, computational rendering of gene regulatory networks is gaining momentum, thanks to the recent availability of high-quality genomes and transcriptomes and development of computational network inference approaches. Here, we review current techniques, challenges and trends in gene regulatory network inference and highlight challenges and opportunities for plant science. We provide plant-specific application examples to guide researchers in selecting methodologies that suit their particular research questions. Given the interdisciplinary nature of gene regulatory network inference, we tried to cater to both biologists and computer scientists to help them engage in a dialogue about concepts and caveats in network inference. Specifically, we discuss problems and opportunities in heterogeneous data integration for eukaryotic organisms and common caveats to be considered during network model evaluation. This article is part of a Special Issue entitled: Plant Gene Regulatory Mechanisms and Networks, edited by Dr. Erich Grotewold and Dr. Nathan Springer.
Affiliation(s)
- Michael Banf
  - Department of Plant Biology, Carnegie Institution for Science, 260 Panama Street, Stanford 93405, United States.
- Seung Y Rhee
  - Department of Plant Biology, Carnegie Institution for Science, 260 Panama Street, Stanford 93405, United States.
|
19
|
Penfold CA, Millar JBA, Wild DL. Inferring orthologous gene regulatory networks using interspecies data fusion. Bioinformatics 2015; 31:i97-105. [PMID: 26072515 PMCID: PMC4765882 DOI: 10.1093/bioinformatics/btv267] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Indexed: 12/31/2022]
Abstract
Motivation: The ability to jointly learn gene regulatory networks (GRNs) in, or leverage GRNs between, related species would allow the vast amount of legacy data obtained in model organisms to inform the GRNs of more complex, or economically or medically relevant counterparts. Examples include transferring information from Arabidopsis thaliana into related crop species for food security purposes, or from mice into humans for medical applications. Here we develop two related Bayesian approaches to network inference that allow GRNs to be jointly inferred in, or leveraged between, several related species: in one framework, network information is directly propagated between species; in the second hierarchical approach, network information is propagated via an unobserved ‘hypernetwork’. In both frameworks, information about network similarity is captured via graph kernels, with the networks additionally informed by species-specific time series gene expression data, when available, using Gaussian processes to model the dynamics of gene expression. Results: Results on in silico benchmarks demonstrate that joint inference, and leveraging of known networks between species, offers better accuracy than standalone inference. The direct propagation of network information via the non-hierarchical framework is more appropriate when there are relatively few species, while the hierarchical approach is better suited when there are many species. Both methods are robust to small amounts of mislabelling of orthologues. Finally, the use of Saccharomyces cerevisiae data and networks to inform inference of networks in the fission yeast Schizosaccharomyces pombe predicts a novel role in cell cycle regulation for Gas1 (SPAC19B12.02c), a 1,3-beta-glucanosyltransferase. Availability and implementation: MATLAB code is available from http://go.warwick.ac.uk/systemsbiology/software/. Contact: d.l.wild@warwick.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Christopher A Penfold
  - Warwick Systems Biology Centre and Biomedical Cell Biology, Warwick Medical School, University of Warwick, Coventry CV4 7AL, UK
- Jonathan B A Millar
  - Warwick Systems Biology Centre and Biomedical Cell Biology, Warwick Medical School, University of Warwick, Coventry CV4 7AL, UK
- David L Wild
  - Warwick Systems Biology Centre and Biomedical Cell Biology, Warwick Medical School, University of Warwick, Coventry CV4 7AL, UK
|
20
|
Reconstructing the hidden states in time course data of stochastic models. Math Biosci 2015; 269:117-29. [PMID: 26363082 DOI: 10.1016/j.mbs.2015.08.015] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 11/06/2014] [Revised: 08/05/2015] [Accepted: 08/28/2015] [Indexed: 11/23/2022]
Abstract
Parameter estimation is central for analyzing models in Systems Biology. The relevance of stochastic modeling in the field is increasing, and with it the need for tailored parameter estimation techniques. Challenges for parameter estimation are partial observability, measurement noise, and the computational complexity arising from the dimension of the parameter space. This article extends the 'multiple shooting for stochastic systems' method, developed for inference in intrinsic stochastic systems. The treatment of extrinsic noise and the estimation of the unobserved states are improved by taking into account the correlation between unobserved and observed species. This article demonstrates the power of the method on different scenarios of a Lotka-Volterra model, including cases in which the prey population dies out or explodes, and a calcium oscillation system. Besides showing how the new extension improves the accuracy of the parameter estimates, this article analyzes the accuracy of the state estimates. In contrast to previous approaches, the new approach is able to estimate states and parameters for all the scenarios. As it does not need stochastic simulations, its computational time is of the same order as that of conventional least squares parameter estimation methods.
|
21
|
Zhang W, Zhou T. A Sparse Reconstruction Approach for Identifying Gene Regulatory Networks Using Steady-State Experiment Data. PLoS One 2015. [PMID: 26207991 PMCID: PMC4514654 DOI: 10.1371/journal.pone.0130979] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/18/2022]
Abstract
Motivation Identifying gene regulatory networks (GRNs), which consist of a large number of interacting units, has become a problem of paramount importance in systems biology. In many situations, the causal relationships among these units must be reconstructed from measured expression data and other a priori information. Though numerous classical methods have been developed to unravel the interactions of GRNs, these methods have either high computational complexity or low estimation accuracy. Note that identifying the genes that directly regulate a specific gene closely resembles sparse vector reconstruction, which often involves determining the number, location and magnitude of the nonzero entries of an unknown vector by solving an underdetermined system of linear equations y = Φx. Based on these similarities, we propose a novel sparse reconstruction framework to identify the structure of a GRN, so as to increase the accuracy of causal regulation estimates and to reduce their computational complexity. Results In this paper, a sparse reconstruction framework is proposed to identify GRN structure on the basis of steady-state experiment data. Unlike traditional methods, the adopted approach is well suited to large-scale underdetermined problems in which a sparse vector is inferred. We investigate how to combine noisy steady-state experiment data with a sparse reconstruction algorithm to identify causal relationships. The efficiency of this method is tested on an artificial linear network, a mitogen-activated protein kinase (MAPK) pathway network and the in silico networks of the DREAM challenges. The performance of the suggested approach is compared with two state-of-the-art algorithms, the widely adopted total least-squares (TLS) method and the available results from the DREAM project. The results show that, at a lower computational cost, the proposed method can significantly enhance estimation accuracy and greatly reduce false positive and false negative errors. Furthermore, numerical calculations demonstrate that the proposed algorithm may converge faster and fluctuate less than other methods when either estimation error or estimation bias is considered.
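To make the sparse-vector view in the abstract above concrete, the sketch below recovers a sparse x from an underdetermined system y = Φx using iterative soft-thresholding (ISTA) on an L1-penalised least-squares objective. This is a generic textbook technique chosen for illustration, not the algorithm of Zhang and Zhou; the names `ista` and `soft_threshold`, the penalty `lam`, and the simulated data are all assumptions.

```python
import numpy as np


def soft_threshold(v, tau):
    """Shrink each entry of v toward zero by tau (proximal map of the L1 norm)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def ista(Phi, y, lam=1.0, n_iter=5000):
    """Minimise 0.5 * ||y - Phi x||^2 + lam * ||x||_1 by iterative soft-thresholding."""
    # Step size 1/L, where L is the squared largest singular value of Phi.
    step = 1.0 / np.linalg.norm(Phi, 2) ** 2
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ x - y)                  # gradient of the quadratic term
        x = soft_threshold(x - step * grad, step * lam)
    return x


# Simulated example: 40 measurements, 60 candidate regulators, 3 true ones.
rng = np.random.default_rng(0)
n_obs, n_genes = 40, 60
Phi = rng.standard_normal((n_obs, n_genes))
x_true = np.zeros(n_genes)
x_true[[3, 17, 42]] = [1.5, -2.0, 1.0]
y = Phi @ x_true + 0.01 * rng.standard_normal(n_obs)

x_hat = ista(Phi, y)
support = np.flatnonzero(np.abs(x_hat) > 0.1)         # indices of inferred regulators
```

In this reading, each nonzero entry of `x_hat` corresponds to a candidate regulator of the target gene, and its sign and size suggest the direction and strength of the effect. The L1 penalty slightly shrinks the recovered magnitudes, which is the usual trade-off for sparsity.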
Affiliation(s)
- Wanhong Zhang
  - School of Chemical Machinery, Qinghai University, Qinghai, China
  - Department of Automation, Tsinghua University, Beijing, China
  - * E-mail:
- Tong Zhou
  - School of Chemical Machinery, Qinghai University, Qinghai, China
  - Tsinghua National Laboratory for Information Science and Technology (TNList), Tsinghua University, Beijing, China
|
22
|
Abstract
Mathematical models of natural systems are abstractions of much more complicated processes. Developing informative and realistic models of such systems typically involves suitable statistical inference methods, domain expertise, and a modicum of luck. Except for cases where physical principles provide sufficient guidance, it will also be generally possible to come up with a large number of potential models that are compatible with a given natural system and any finite amount of data generated from experiments on that system. Here we develop a computational framework to systematically evaluate potentially vast sets of candidate differential equation models in light of experimental and prior knowledge about biological systems. This topological sensitivity analysis enables us to evaluate quantitatively the dependence of model inferences and predictions on the assumed model structures. Failure to consider the impact of structural uncertainty introduces biases into the analysis and potentially gives rise to misleading conclusions.
|
23
|
Piquerez SJM, Harvey SE, Beynon JL, Ntoukakis V. Improving crop disease resistance: lessons from research on Arabidopsis and tomato. FRONTIERS IN PLANT SCIENCE 2014; 5:671. [PMID: 25520730 PMCID: PMC4253662 DOI: 10.3389/fpls.2014.00671] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 09/15/2014] [Accepted: 11/10/2014] [Indexed: 05/04/2023]
Abstract
One of the great challenges for food security in the 21st century is to improve yield stability through the development of disease-resistant crops. Crop research is often hindered by the lack of molecular tools, growth logistics, generation time and detailed genetic annotations, hence the power of model plant species. Our knowledge of plant immunity today has been largely shaped by the use of models, specifically through the use of mutants. We examine the importance of Arabidopsis and tomato as models in the study of plant immunity and how they help us in revealing a detailed and deep understanding of the various layers contributing to the immune system. Here we describe examples of how knowledge from models can be transferred to economically important crops resulting in new tools to enable and accelerate classical plant breeding. We will also discuss how models, and specifically transcriptomics and effectoromics approaches, have contributed to the identification of core components of the defense response which will be key to future engineering of durable and sustainable disease resistance in plants.
Affiliation(s)
- Jim L. Beynon
  - School of Life Sciences, University of Warwick, Coventry, UK
|
24
|
Marjoram P, Thomas DC. Next-Generation Sequencing Studies: Optimal Design and Analysis, Missing Heritability and Rare Variants. CURR EPIDEMIOL REP 2014. [DOI: 10.1007/s40471-014-0022-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/05/2023]
|
25
|
Oates CJ, Korkola J, Gray JW, Mukherjee S. Joint estimation of multiple related biological networks. Ann Appl Stat 2014. [DOI: 10.1214/14-aoas761] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 11/19/2022]
|
26
|
Penfold CA, Buchanan-Wollaston V. Modelling transcriptional networks in leaf senescence. JOURNAL OF EXPERIMENTAL BOTANY 2014; 65:3859-73. [PMID: 24600015 DOI: 10.1093/jxb/eru054] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Indexed: 05/18/2023]
Abstract
The process of leaf senescence is induced by an extensive range of developmental and environmental signals and controlled by multiple, cross-linking pathways, many of which overlap with plant stress-response signals. Elucidation of this complex regulation requires a step beyond a traditional one-gene-at-a-time analysis. Application of a more global analysis using statistical and mathematical tools of systems biology is an approach that is being applied to address this problem. A variety of modelling methods applicable to the analysis of current and future senescence data are reviewed and discussed using some senescence-specific examples. Network modelling with a senescence transcriptome time course followed by testing predictions with gene-expression data illustrates the application of systems biology tools.
Affiliation(s)
- Vicky Buchanan-Wollaston
  - Warwick Systems Biology Centre, University of Warwick, Coventry CV4 7AL, UK
  - School of Life Sciences, University of Warwick, Coventry CV4 7AL, UK
|
27
|
von der Heyde S, Bender C, Henjes F, Sonntag J, Korf U, Beißbarth T. Boolean ErbB network reconstructions and perturbation simulations reveal individual drug response in different breast cancer cell lines. BMC SYSTEMS BIOLOGY 2014; 8:75. [PMID: 24970389 PMCID: PMC4087127 DOI: 10.1186/1752-0509-8-75] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 12/20/2013] [Accepted: 06/10/2014] [Indexed: 12/19/2022]
Abstract
Background Despite promising progress in targeted breast cancer therapy, drug resistance remains challenging. The monoclonal antibody drugs trastuzumab and pertuzumab as well as the small molecule inhibitor erlotinib were designed to prevent the deregulated protein signalling induced by ErbB-2 and ErbB-1 receptors, which contributes to tumour progression. The oncogenic potential of ErbB receptors unfolds in the case of overexpression or mutations. Dimerisation with other receptors allows pathway blockades to be bypassed. Our intention is to reconstruct the ErbB network to reveal resistance mechanisms. We used longitudinal proteomic data of ErbB receptors and downstream targets in the ErbB-2 amplified breast cancer cell lines BT474, SKBR3 and HCC1954 treated with erlotinib, trastuzumab or pertuzumab, alone or combined, up to 60 minutes and 30 hours, respectively. In a Boolean modelling approach, signalling networks were reconstructed based on these data in a cell line and time course specific manner, including prior literature knowledge. Finally, we simulated network response to inhibitor combinations to detect signalling nodes reflecting growth inhibition. Results The networks pointed to cell line specific activation patterns of the MAPK and PI3K pathway. In BT474, the PI3K signal route was favoured, while in SKBR3, novel edges highlighted MAPK signalling. In HCC1954, the inferred edges stimulated both pathways. For example, we uncovered feedback loops amplifying PI3K signalling, in line with the known trastuzumab resistance of this cell line. In the perturbation simulations on the short-term networks, we analysed ERK1/2, AKT and p70S6K. The results indicated a pathway specific drug response, driven by the type of growth factor stimulus. HCC1954 revealed an edgetic type of PIK3CA-mutation, contributing to trastuzumab inefficacy. Drug impact on the AKT and ERK1/2 signalling axes is mirrored by effects on RB and RPS6, relating to phenotypic events like cell growth or proliferation. Therefore, we additionally analysed RB and RPS6 in the long-term networks. Conclusions We derived protein interaction models for three breast cancer cell lines. Changes compared to the common reference network hint towards individual characteristics and potential drug resistance mechanisms. Simulations of perturbations were consistent with the experimental data, confirming our combined reverse and forward engineering approach as valuable for drug discovery and personalised medicine.
Affiliation(s)
- Tim Beißbarth
  - Statistical Bioinformatics, Department of Medical Statistics, University Medical Center Göttingen, Humboldtallee 32, 37073 Göttingen, Germany.
|
28
|
Abstract
Deciphering the networks that underpin complex biological processes using experimental data remains a significant, but promising, challenge, a task made all the harder by the added complexity of host-pathogen interactions. The aim of this article is to review the progress in understanding plant immunity made so far by applying network modeling algorithms and to show how this computational/mathematical strategy is facilitating a systems view of plant defense. We review the different types of network modeling that have been used, the data required, and the type of insight that such modeling can provide. We discuss the current challenges in modeling the regulatory networks that underlie plant defense and the future developments that may help address these challenges.
Affiliation(s)
- Oliver Windram
  - Department of Life Sciences, Imperial College London, SL5 7PY, United Kingdom
|
29
|
Hickman R, Hill C, Penfold CA, Breeze E, Bowden L, Moore JD, Zhang P, Jackson A, Cooke E, Bewicke-Copley F, Mead A, Beynon J, Wild DL, Denby KJ, Ott S, Buchanan-Wollaston V. A local regulatory network around three NAC transcription factors in stress responses and senescence in Arabidopsis leaves. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2013; 75:26-39. [PMID: 23578292 PMCID: PMC3781708 DOI: 10.1111/tpj.12194] [Citation(s) in RCA: 133] [Impact Index Per Article: 12.1] [Received: 02/14/2013] [Revised: 03/26/2013] [Accepted: 03/28/2013] [Indexed: 05/18/2023]
Abstract
A model is presented describing the gene regulatory network surrounding three similar NAC transcription factors that have roles in Arabidopsis leaf senescence and stress responses. ANAC019, ANAC055 and ANAC072 belong to the same clade of NAC domain genes and have overlapping expression patterns. A combination of promoter DNA/protein interactions identified using yeast 1-hybrid analysis and modelling using gene expression time course data has been applied to predict the regulatory network upstream of these genes. Similarities and divergence in regulation during a variety of stress responses are predicted by different combinations of upstream transcription factors binding and also by the modelling. Mutant analysis with potential upstream genes was used to test and confirm some of the predicted interactions. Gene expression analysis in mutants of ANAC019 and ANAC055 at different times during leaf senescence has revealed a distinctly different role for each of these genes. Yeast 1-hybrid analysis is shown to be a valuable tool that can distinguish clades of binding proteins and be used to test and quantify protein binding to predicted promoter motifs.
Affiliation(s)
- Richard Hickman
- Warwick Systems Biology Centre, University of Warwick, Coventry CV4 7AL, UK
- These authors contributed equally
- Claire Hill
- School of Life Sciences, University of Warwick, Coventry CV4 7AL, UK
- These authors contributed equally
- Emily Breeze
- Warwick Systems Biology Centre, University of Warwick, Coventry CV4 7AL, UK
- School of Life Sciences, University of Warwick, Coventry CV4 7AL, UK
- Laura Bowden
- School of Life Sciences, University of Warwick, Coventry CV4 7AL, UK
- Present address: Department of Biology, Faculty of Science, Utrecht University, PO Box 800.56, 3508 TB, Utrecht, The Netherlands
- Jonathan D Moore
- Warwick Systems Biology Centre, University of Warwick, Coventry CV4 7AL, UK
- Peijun Zhang
- School of Life Sciences, University of Warwick, Coventry CV4 7AL, UK
- Alison Jackson
- School of Life Sciences, University of Warwick, Coventry CV4 7AL, UK
- Emma Cooke
- Molecular Organisation and Assembly of Cells Doctoral Training Centre, University of Warwick, Coventry CV4 7AL, UK
- Andrew Mead
- School of Life Sciences, University of Warwick, Coventry CV4 7AL, UK
- Jim Beynon
- Warwick Systems Biology Centre, University of Warwick, Coventry CV4 7AL, UK
- School of Life Sciences, University of Warwick, Coventry CV4 7AL, UK
- David L Wild
- Warwick Systems Biology Centre, University of Warwick, Coventry CV4 7AL, UK
- Katherine J Denby
- Warwick Systems Biology Centre, University of Warwick, Coventry CV4 7AL, UK
- School of Life Sciences, University of Warwick, Coventry CV4 7AL, UK
- Sascha Ott
- Warwick Systems Biology Centre, University of Warwick, Coventry CV4 7AL, UK
- Vicky Buchanan-Wollaston
- Warwick Systems Biology Centre, University of Warwick, Coventry CV4 7AL, UK
- School of Life Sciences, University of Warwick, Coventry CV4 7AL, UK
- For correspondence (e-mail)
30
Thorne T, Fratta P, Hanna MG, Cortese A, Plagnol V, Fisher EM, Stumpf MPH. Graphical modelling of molecular networks underlying sporadic inclusion body myositis. Molecular BioSystems 2013; 9:1736-42. [PMID: 23595110 DOI: 10.1039/c3mb25497f] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 11/21/2022]
Abstract
Here we present a novel statistical methodology that allows us to analyze gene expression data that have been collected from a number of different cases or conditions in a unified framework. Using a Bayesian nonparametric framework we develop a hierarchical model wherein genes can maintain a shared set of interactions between different cases, whilst also exhibiting behaviour that is unique to specific cases, sets of conditions, or groups of data points. By doing so we are able to not only combine data from different cases but also to discern the unique regulatory interactions that differentiate the cases. We apply our method to clinical data collected from patients suffering from sporadic Inclusion Body Myositis (sIBM), as well as control samples, and demonstrate the ability of our method to infer regulatory interactions that are unique to the disease cases of interest. The method thus balances the statistical need to include as many patients and controls as possible, and the clinical need to maintain potentially cryptic differences among patients and between patients and controls at the regulatory level.
Affiliation(s)
- Thomas Thorne
- Centre for Bioinformatics and Systems Biology, Imperial College London, London, UK.
31
Äijö T, Granberg K, Lähdesmäki H. Sorad: a systems biology approach to predict and modulate dynamic signaling pathway response from phosphoproteome time-course measurements. Bioinformatics 2013; 29:1283-91. [PMID: 23505293 DOI: 10.1093/bioinformatics/btt130] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Signaling networks mediate responses to different stimuli using a multitude of feed-forward, feedback and cross-talk mechanisms, and malfunctions in these mechanisms have an important role in various diseases. To understand a disease and to help discover novel therapeutic approaches, we have to reveal the molecular mechanisms underlying signal transduction and use that information to design targeted perturbations. RESULTS: We have pursued this direction by developing an efficient computational approach, Sorad, which can estimate the structure of signal transduction networks and the associated continuous signaling dynamics from phosphoprotein time-course measurements. Further, Sorad can identify experimental conditions that modulate the signaling toward a desired response. We have analyzed comprehensive phosphoprotein time-course data from a human hepatocellular liver carcinoma cell line and demonstrate here that Sorad provides more accurate predictions of phosphoprotein responses to given stimuli than previously presented methods and, importantly, that Sorad can estimate experimental conditions to achieve a desired signaling response. Because Sorad is data driven, it has a high potential to generate novel hypotheses for further research. Our analysis of the hepatocellular liver carcinoma data predicts a regulatory connection where AKT activity is dependent on IKK in TGFα-stimulated cells, which is supported by the original data but not included in the original model. AVAILABILITY: An implementation of the proposed computational methods will be available at http://research.ics.aalto.fi/csb/software/. CONTACT: tarmo.aijo@aalto.fi or harri.lahdesmaki@aalto.fi SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Tarmo Äijö
- Department of Information and Computer Science, Aalto University, FI-00076 AALTO, Finland.
32
Muraro D, Voß U, Wilson M, Bennett M, Byrne H, De Smet I, Hodgman C, King J. Inference of the genetic network regulating lateral root initiation in Arabidopsis thaliana. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2013; 10:50-60. [PMID: 23702543 DOI: 10.1109/tcbb.2013.3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 06/02/2023]
Abstract
Regulation of gene expression is crucial for organism growth, and it is one of the challenges in systems biology to reconstruct the underlying regulatory biological networks from transcriptomic data. The formation of lateral roots in Arabidopsis thaliana is stimulated by a cascade of regulators of which only the interactions of its initial elements have been identified. Using simulated gene expression data with known network topology, we compare the performance of inference algorithms, based on different approaches, for which ready-to-use software is available. We show that their performance improves with the network size and the inclusion of mutants. We then analyze two sets of genes, whose activity is likely to be relevant to lateral root initiation in Arabidopsis, and assess causality of their regulatory interactions by integrating sequence analysis with the intersection of the results of the best performing methods on time series and mutants. The methods applied capture known interactions between genes that are candidate regulators at early stages of development. The network inferred from genes significantly expressed during lateral root formation exhibits distinct scale free, small world and hierarchical properties and the nodes with a high out-degree may warrant further investigation.
Affiliation(s)
- Daniele Muraro
- Centre for Plant Integrative Biology, School of Biosciences, University of Nottingham, Loughborough, United Kingdom.
33
Thorne T, Stumpf MPH. Inference of temporally varying Bayesian networks. Bioinformatics 2012; 28:3298-305. [PMID: 23074260 PMCID: PMC3519458 DOI: 10.1093/bioinformatics/bts614] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Received: 03/05/2012] [Revised: 10/04/2012] [Accepted: 10/11/2012] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: When analysing gene expression time series data, an often overlooked but crucial aspect of the model is that the regulatory network structure may change over time. Although some approaches have addressed this problem previously in the literature, many are not well suited to the sequential nature of the data. RESULTS: Here, we present a method that allows us to infer regulatory network structures that may vary between time points, using a set of hidden states that describe the network structure at a given time point. To model the distribution of the hidden states, we have applied the Hierarchical Dirichlet Process Hidden Markov Model, a non-parametric extension of the traditional Hidden Markov Model, which does not require us to fix the number of hidden states in advance. We apply our method to existing microarray expression data as well as demonstrating its efficacy on simulated test data.
Affiliation(s)
- Thomas Thorne
- Centre of Integrative Systems Biology and Bioinformatics, Division of Molecular Biosciences, Imperial College London, London SW7 2AZ, UK.