1
Pomiès L, Brouard C, Duruflé H, Maigné É, Carré C, Gody L, Trösser F, Katsirelos G, Mangin B, Langlade NB, de Givry S. Gene regulatory network inference methodology for genomic and transcriptomic data acquired in genetically related heterozygote individuals. Bioinformatics 2022; 38:4127-4134. [PMID: 35792837 DOI: 10.1093/bioinformatics/btac445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2021] [Revised: 06/17/2022] [Accepted: 07/05/2022] [Indexed: 12/24/2022]
Abstract
MOTIVATION: Inferring gene regulatory networks in non-independent genetically related panels is a methodological challenge. This hampers evolutionary and biological studies using heterozygote individuals such as in wild sunflower populations or cultivated hybrids. RESULTS: First, we simulated 100 datasets of gene expressions and polymorphisms, displaying the same gene expression distributions, heterozygosities and heritabilities as in our dataset including 173 genes and 353 genotypes measured in sunflower hybrids. Secondly, we performed a meta-analysis based on six inference methods [least absolute shrinkage and selection operator (Lasso), Random Forests, Bayesian Networks, Markov Random Fields, Ordinary Least Squares and fast inference of networks from directed regulation (Findr)] and selected the minimal-density networks for better accuracy, with on average 64 edges connecting 79 genes and an area under the precision-recall curve (AUPR) score of 0.35. We identified that triangles and mutual edges are prone to errors in the inferred networks. Applied to classical datasets without heterozygotes, our strategy produced a 0.65 AUPR score for one dataset of the DREAM5 Systems Genetics Challenge. Finally, we applied our method to an experimental dataset from sunflower hybrids. We successfully inferred a network composed of 105 genes connected by 106 putative regulations with a major connected component. AVAILABILITY AND IMPLEMENTATION: Our inference methodology dedicated to genomic and transcriptomic data is available at https://forgemia.inra.fr/sunrise/inference_methods. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
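For readers unfamiliar with the AUPR score quoted in this abstract, the sketch below shows one common way to estimate it: average precision over a ranked list of predicted edges. This is only an illustration. The abstract does not say which estimator the authors used, and the function name `average_precision`, the gene labels and the scores are all hypothetical.

```python
def average_precision(scored_edges, true_edges):
    """Step-wise estimate of the area under the precision-recall curve.

    scored_edges: list of ((regulator, target), score) pairs (hypothetical format).
    true_edges:   set of (regulator, target) pairs taken as the reference network.
    """
    # Rank predicted edges from most to least confident.
    ranked = sorted(scored_edges, key=lambda item: item[1], reverse=True)
    hits = 0
    total = 0.0
    for rank, (edge, _score) in enumerate(ranked, start=1):
        if edge in true_edges:
            hits += 1
            # Precision at this rank, accumulated only when recall increases.
            total += hits / rank
    # True edges that were never predicted count as missed.
    return total / len(true_edges) if true_edges else 0.0


# Made-up toy data: two true regulations among four scored candidates.
truth = {("g1", "g2"), ("g2", "g3")}
predictions = [
    (("g1", "g2"), 0.9),
    (("g1", "g3"), 0.8),
    (("g2", "g3"), 0.4),
    (("g3", "g1"), 0.1),
]
print(round(average_precision(predictions, truth), 3))
```

On this toy input the true edges sit at ranks 1 and 3, so the score is (1/1 + 2/3) / 2. Other AUPR estimators, such as trapezoidal interpolation of the curve, can give slightly different values on the same ranking.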
Affiliation(s)
- Lise Pomiès
  - MIAT, Université Fédérale de Toulouse, INRAE, Castanet-Tolosan 31326, France
- Céline Brouard
  - MIAT, Université Fédérale de Toulouse, INRAE, Castanet-Tolosan 31326, France
- Harold Duruflé
  - LIPME, Université de Toulouse, INRAE, CNRS, Castanet-Tolosan 31326, France
- Élise Maigné
  - MIAT, Université Fédérale de Toulouse, INRAE, Castanet-Tolosan 31326, France
- Clément Carré
  - MIAT, Université Fédérale de Toulouse, INRAE, Castanet-Tolosan 31326, France
- Louise Gody
  - LIPME, Université de Toulouse, INRAE, CNRS, Castanet-Tolosan 31326, France
- Fulya Trösser
  - MIAT, Université Fédérale de Toulouse, INRAE, Castanet-Tolosan 31326, France
- George Katsirelos
  - MIA-Paris, AgroParisTech, Université Paris-Saclay, INRAE, Paris 75231, France
- Brigitte Mangin
  - LIPME, Université de Toulouse, INRAE, CNRS, Castanet-Tolosan 31326, France
- Nicolas B Langlade
  - LIPME, Université de Toulouse, INRAE, CNRS, Castanet-Tolosan 31326, France
- Simon de Givry
  - MIAT, Université Fédérale de Toulouse, INRAE, Castanet-Tolosan 31326, France
2
Alvarez JM, Brooks MD, Swift J, Coruzzi GM. Time-Based Systems Biology Approaches to Capture and Model Dynamic Gene Regulatory Networks. Annual Review of Plant Biology 2021; 72:105-131. [PMID: 33667112 PMCID: PMC9312366 DOI: 10.1146/annurev-arplant-081320-090914] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 05/13/2023]
Abstract
All aspects of transcription and its regulation involve dynamic events. However, capturing these dynamic events in gene regulatory networks (GRNs) offers both a promise and a challenge. The promise is that capturing and modeling the dynamic changes in GRNs will allow us to understand how organisms adapt to a changing environment. The ability to mount a rapid transcriptional response to environmental changes is especially important in nonmotile organisms such as plants. The challenge is to capture these dynamic, genome-wide events and model them in GRNs. In this review, we cover recent progress in capturing dynamic interactions of transcription factors with their targets, at both the local and genome-wide levels, and how they are used to learn how GRNs operate as a function of time. We also discuss recent advances that employ time-based machine learning approaches to forecast gene expression at future time points, a key goal of systems biology.
Affiliation(s)
- Jose M Alvarez
  - Centro de Genómica y Bioinformática, Facultad de Ciencias, Universidad Mayor, Santiago, Chile
  - ANID-Millennium Science Initiative Program-Millennium Institute for Integrative Biology (iBio), Santiago, Chile
- Matthew D Brooks
  - Global Change and Photosynthesis Research Unit, US Department of Agriculture Agricultural Research Service, Urbana, Illinois 61801, USA
- Joseph Swift
  - Salk Institute for Biological Studies, La Jolla, California 92037, USA
- Gloria M Coruzzi
  - Center for Genomics and Systems Biology, Department of Biology, New York University, New York, NY 10003, USA
3
Harrington SA, Backhaus AE, Singh A, Hassani-Pak K, Uauy C. The Wheat GENIE3 Network Provides Biologically-Relevant Information in Polyploid Wheat. G3 (Bethesda, Md.) 2020; 10:3675-3686. [PMID: 32747342 PMCID: PMC7534433 DOI: 10.1534/g3.120.401436] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 03/07/2020] [Accepted: 08/01/2020] [Indexed: 11/18/2022]
Abstract
Gene regulatory networks are powerful tools which facilitate hypothesis generation and candidate gene discovery. However, the extent to which the network predictions are biologically relevant is often unclear. Recently a GENIE3 network which predicted targets of wheat transcription factors was produced. Here we used an independent RNA-Seq dataset to test the predictions of the wheat GENIE3 network for the senescence-regulating transcription factor NAM-A1 (TraesCS6A02G108300). We re-analyzed the RNA-Seq data against the RefSeqv1.0 genome and identified a set of differentially expressed genes (DEGs) between the wild-type and nam-a1 mutant which recapitulated the known role of NAM-A1 in senescence and nutrient remobilisation. We found that the GENIE3-predicted target genes of NAM-A1 overlap significantly with the DEGs, more than would be expected by chance. Based on high levels of overlap between GENIE3-predicted target genes and the DEGs, we identified candidate senescence regulators. We then explored genome-wide trends in the network related to polyploidy and found that only homeologous transcription factors are likely to share predicted targets in common. However, homeologs which vary in expression levels across tissues are less likely to share predicted targets than those that do not, suggesting that they may be more likely to act in distinct pathways. This work demonstrates that the wheat GENIE3 network can provide biologically-relevant predictions of transcription factor targets, which can be used for candidate gene prediction and for global analyses of transcription factor function. The GENIE3 network has now been integrated into the KnetMiner web application, facilitating its use in future studies.
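The claim that predicted targets overlap with the DEGs "more than would be expected by chance" can be made concrete with an over-representation test. The sketch below uses a one-sided hypergeometric test, which is a common choice for this kind of comparison. The abstract does not state which test the authors applied, and the function name `overlap_p_value` and all gene counts are hypothetical.

```python
from math import comb


def overlap_p_value(universe_size, n_predicted, n_deg, n_overlap):
    """One-sided hypergeometric p-value: P(overlap >= n_overlap) by chance.

    universe_size: number of genes that could have been predicted or called DEG.
    n_predicted:   number of network-predicted targets of the regulator.
    n_deg:         number of differentially expressed genes.
    n_overlap:     number of genes found in both sets.
    """
    upper = min(n_predicted, n_deg)
    total = comb(universe_size, n_deg)
    # Sum the probabilities of every overlap at least as large as the observed one.
    tail = sum(
        comb(n_predicted, k) * comb(universe_size - n_predicted, n_deg - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Made-up counts: 10 genes in total, 4 predicted targets, 3 DEGs, all 3 DEGs predicted.
print(overlap_p_value(10, 4, 3, 3))
```

With these toy counts only 4 of the 120 possible DEG sets fall entirely inside the predicted targets, so the p-value is 4/120. The result depends on the choice of gene universe, which is why that value is an explicit argument here.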
Affiliation(s)
- Sophie A Harrington
  - Department of Crop Genetics, John Innes Centre, Norwich Research Park, Norwich, NR4 7UH, United Kingdom
- Anna E Backhaus
  - Department of Crop Genetics, John Innes Centre, Norwich Research Park, Norwich, NR4 7UH, United Kingdom
- Ajit Singh
  - Computational and Analytical Sciences, Rothamsted Research, Harpenden, Hertfordshire, AL5 2JQ, United Kingdom
- Keywan Hassani-Pak
  - Computational and Analytical Sciences, Rothamsted Research, Harpenden, Hertfordshire, AL5 2JQ, United Kingdom
- Cristobal Uauy
  - Department of Crop Genetics, John Innes Centre, Norwich Research Park, Norwich, NR4 7UH, United Kingdom
4
Huynh-Thu VA, Geurts P. Unsupervised Gene Network Inference with Decision Trees and Random Forests. Methods Mol Biol 2019; 1883:195-215. [PMID: 30547401 DOI: 10.1007/978-1-4939-8882-2_8] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 12/11/2022]
Abstract
In this chapter, we introduce the reader to a popular family of machine learning algorithms, called decision trees. We then review several approaches based on decision trees that have been developed for the inference of gene regulatory networks (GRNs). Decision trees indeed have several useful properties that make them well-suited for tackling this problem: they are able to detect multivariate interacting effects between variables, are non-parametric, have good scalability, and have very few parameters. In particular, we describe in detail the GENIE3 algorithm, a state-of-the-art method for GRN inference.
Affiliation(s)
- Vân Anh Huynh-Thu
  - Department of Electrical Engineering and Computer Science, University of Liège, Liège, Belgium
- Pierre Geurts
  - Department of Electrical Engineering and Computer Science, University of Liège, Liège, Belgium
5
Causal Queries from Observational Data in Biological Systems via Bayesian Networks: An Empirical Study in Small Networks. Methods Mol Biol 2018. [PMID: 30547398 DOI: 10.1007/978-1-4939-8882-2_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2023]
Abstract
Biological networks are a very convenient modeling and visualization tool to discover knowledge from modern high-throughput genomics and post-genomics data sets. Indeed, biological entities are not isolated but are components of complex multilevel systems. We go one step further and advocate for the consideration of causal representations of the interactions in living systems. We present the causal formalism and place it in the context of biological networks when the data are observational. We also discuss its ability to decipher the causal information flow as observed in gene expression. We illustrate our exploration with experiments on small simulated networks as well as on a real biological data set.
6
Baedke J, Mc Manus SF. From seconds to eons: Time scales, hierarchies, and processes in evo-devo. Studies in History and Philosophy of Biological and Biomedical Sciences 2018; 72:38-48. [PMID: 30391127 DOI: 10.1016/j.shpsc.2018.10.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 08/09/2017] [Revised: 07/13/2018] [Accepted: 10/01/2018] [Indexed: 06/08/2023]
Abstract
This paper addresses the role of time scales in conceptualizing biological hierarchies. So far, the concept of hierarchies in philosophy of science has been dominated by the ideas of composition and parthood. However, this view does not exhaust the diversity of hierarchical descriptions in the biosciences. Therefore, we highlight a type of hierarchy usually overlooked by philosophers of science. It distinguishes processes based on the different time scales (i.e. rates, frequencies, and rhythms) on which they occur. These time scale hierarchies are often connected with assumptions defended in process ontology. Due to their ability to describe interlevel dynamics of various kinds, we call these hierarchies 'dynamic hierarchies.' In order to highlight and discuss their organization, explanatory roles, and epistemic virtues, we focus on dynamic hierarchies in developmental biology and evolutionary developmental biology (evo-devo). In these fields, dynamic hierarchies offer crucial complementary information to descriptions of compositional hierarchies.
Affiliation(s)
- Jan Baedke
  - Department of Philosophy I, Ruhr University Bochum, Universitätsstr. 150, 44801, Bochum, Germany
- Siobhan F Mc Manus
  - Center for Interdisciplinary Research in the Sciences and Humanities (CEIICH), UNAM, Av. Universidad Nacional 3000, C. P. 04510, Mexico City, Mexico
7
dynGENIE3: dynamical GENIE3 for the inference of gene networks from time series expression data. Sci Rep 2018; 8:3384. [PMID: 29467401 PMCID: PMC5821733 DOI: 10.1038/s41598-018-21715-0] [Citation(s) in RCA: 95] [Impact Index Per Article: 15.8] [Received: 05/05/2017] [Accepted: 02/06/2018] [Indexed: 11/22/2022]
Abstract
The elucidation of gene regulatory networks is one of the major challenges of systems biology. The gene measurements exploited by network inference methods are typically available either as steady-state expression vectors or as time series expression data. In our previous work, we proposed the GENIE3 method, which exploits variable importance scores derived from Random Forests to identify the regulators of each target gene. This method provided state-of-the-art performance on several benchmark datasets, but it could not be applied specifically to time series expression data. We propose here an adaptation of the GENIE3 method, called dynamical GENIE3 (dynGENIE3), for handling both time series and steady-state expression data. The proposed method is evaluated extensively on the artificial DREAM4 benchmarks and on three real time series expression datasets. Although dynGENIE3 does not systematically yield the best performance on each and every network, it is competitive with diverse methods from the literature, while preserving the main advantages of GENIE3 in terms of scalability.
8
Mochida K, Koda S, Inoue K, Nishii R. Statistical and Machine Learning Approaches to Predict Gene Regulatory Networks From Transcriptome Datasets. Frontiers in Plant Science 2018; 9:1770. [PMID: 30555503 PMCID: PMC6281826 DOI: 10.3389/fpls.2018.01770] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 08/25/2018] [Accepted: 11/14/2018] [Indexed: 05/20/2023]
Abstract
Statistical and machine learning (ML)-based methods have recently advanced the construction of gene regulatory networks (GRNs) based on high-throughput biological datasets. GRNs underlie almost all cellular phenomena; hence, comprehensive GRN maps are essential tools to elucidate gene function, thereby facilitating the identification and prioritization of candidate genes for functional analysis. High-throughput gene expression datasets have yielded various statistical and ML-based algorithms to infer causal relationships between genes and decipher GRNs. This review summarizes the recent advancements in the computational inference of GRNs, based on large-scale transcriptome sequencing datasets of model plants and crops. We highlight strategies to select contextual genes for GRN inference, and statistical and ML-based methods for inferring GRNs based on transcriptome datasets from plants. Furthermore, we discuss the challenges and opportunities for the elucidation of GRNs based on large-scale datasets obtained from emerging transcriptomic applications, such as population-scale, single-cell level, and life-course transcriptome analyses.
Affiliation(s)
- Keiichi Mochida
  - Bioproductivity Informatics Research Team, RIKEN Center for Sustainable Resource Science, Yokohama, Japan
  - Microalgae Production Control Technology Laboratory, RIKEN Baton Zone Program, RIKEN Cluster for Science, Technology and Innovation Hub, Yokohama, Japan
  - Institute of Plant Science and Resources, Okayama University, Kurashiki, Japan
  - Kihara Institute for Biological Research, Yokohama City University, Yokohama, Japan
  - *Correspondence: Keiichi Mochida, Ryuei Nishii
- Satoru Koda
  - Graduate School of Mathematics, Kyushu University, Fukuoka, Japan
- Komaki Inoue
  - Bioproductivity Informatics Research Team, RIKEN Center for Sustainable Resource Science, Yokohama, Japan
- Ryuei Nishii
  - Institute of Mathematics for Industry, Kyushu University, Fukuoka, Japan
  - *Correspondence: Keiichi Mochida, Ryuei Nishii
9
Jia J, Zhou J, Shi W, Cao X, Luo J, Polle A, Luo ZB. Comparative transcriptomic analysis reveals the roles of overlapping heat-/drought-responsive genes in poplars exposed to high temperature and drought. Sci Rep 2017; 7:43215. [PMID: 28233854 PMCID: PMC5324098 DOI: 10.1038/srep43215] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Received: 07/28/2016] [Accepted: 01/20/2017] [Indexed: 02/03/2023]
Abstract
High temperature (HT) and drought are both critical factors that constrain tree growth and survival under global climate change, but it is surprising that the transcriptomic reprogramming and physiological relays involved in the response to HT and/or drought remain unknown in woody plants. Thus, Populus simonii saplings were exposed to either ambient temperature or HT combined with sufficient watering or drought. RNA-sequencing analysis showed that a large number of genes were differentially expressed in poplar roots and leaves in response to HT and/or desiccation, but only a small number of these genes were identified as overlapping heat-/drought-responsive genes that are mainly involved in RNA regulation, transport, hormone metabolism, and stress. Furthermore, the overlapping heat-/drought-responsive genes were co-expressed and formed hierarchical genetic regulatory networks under each condition compared. HT-/drought-induced transcriptomic reprogramming is linked to physiological relays in poplar roots and leaves. For instance, HT- and/or drought-induced abscisic acid accumulation and decreases in auxin and other phytohormones corresponded well with the differential expression of a few genes involved in hormone metabolism. These results suggest that overlapping heat-/drought-responsive genes will play key roles in the transcriptional and physiological reconfiguration of poplars to HT and/or drought under future climatic scenarios.
Affiliation(s)
- Jingbo Jia
  - State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Silviculture of the State Forestry Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091, China
  - College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, 712100, P. R. China
- Jing Zhou
  - State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Silviculture of the State Forestry Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091, China
- Wenguang Shi
  - State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Silviculture of the State Forestry Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091, China
- Xu Cao
  - College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, 712100, P. R. China
- Jie Luo
  - College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, 712100, P. R. China
- Andrea Polle
  - Büsgen-Institute, Department of Forest Botany and Tree Physiology, Georg-August University, Büsgenweg 2, 37077 Göttingen, Germany
- Zhi-Bin Luo
  - State Key Laboratory of Tree Genetics and Breeding, Key Laboratory of Silviculture of the State Forestry Administration, Research Institute of Forestry, Chinese Academy of Forestry, Beijing 100091, China
10
Turner KG, Nurkowski KA, Rieseberg LH. Gene expression and drought response in an invasive thistle. Biol Invasions 2016. [DOI: 10.1007/s10530-016-1308-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 11/29/2022]
11
Gallagher JP, Grover CE, Hu G, Wendel JF. Insights into the Ecology and Evolution of Polyploid Plants through Network Analysis. Mol Ecol 2016; 25:2644-60. [PMID: 27027619 DOI: 10.1111/mec.13626] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 11/24/2015] [Revised: 03/09/2016] [Accepted: 03/22/2016] [Indexed: 12/18/2022]
Abstract
Polyploidy is a widespread phenomenon throughout eukaryotes, with important ecological and evolutionary consequences. Although genes operate as components of complex pathways and networks, polyploid changes in genes and gene expression have typically been evaluated as either individual genes or as a part of broad-scale analyses. Network analysis has been fruitful in associating genomic and other 'omic'-based changes with phenotype for many systems. In polyploid species, network analysis has the potential not only to facilitate a better understanding of the complex 'omic' underpinnings of phenotypic and ecological traits common to polyploidy, but also to provide novel insight into the interaction among duplicated genes and genomes. This adds perspective to the global patterns of expression (and other 'omic') change that accompany polyploidy and to the patterns of recruitment and/or loss of genes following polyploidization. While network analysis in polyploid species faces challenges common to other analyses of duplicated genomes, present technologies combined with thoughtful experimental design provide a powerful system to explore polyploid evolution. Here, we demonstrate the utility and potential of network analysis to questions pertaining to polyploidy with an example involving evolution of the transgressively superior cotton fibres found in polyploid Gossypium hirsutum. By combining network analysis with prior knowledge, we provide further insights into the role of profilins in fibre domestication and exemplify the potential for network analysis in polyploid species.
Affiliation(s)
- Joseph P Gallagher
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, 50011, USA
- Corrinne E Grover
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, 50011, USA
- Guanjing Hu
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, 50011, USA
- Jonathan F Wendel
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, 50011, USA
12
Des Marais DL, Juenger TE. Brachypodium and the Abiotic Environment. Genetics and Genomics of Brachypodium 2015. [DOI: 10.1007/7397_2015_13] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/25/2022]