76
|
Chevin LM, Leung C, Le Rouzic A, Uller T. Using phenotypic plasticity to understand the structure and evolution of the genotype-phenotype map. Genetica 2021; 150:209-221. [PMID: 34617196 DOI: 10.1007/s10709-021-00135-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2021] [Accepted: 09/22/2021] [Indexed: 10/20/2022]
Abstract
Deciphering the genotype-phenotype map necessitates relating variation at the genetic level to variation at the phenotypic level. This endeavour is inherently limited by the availability of standing genetic variation, the rate of spontaneous mutation to de novo genetic variants, and possible biases associated with induced mutagenesis. An interesting alternative is to instead rely on the environment as a source of variation. Many phenotypic traits change plastically in response to the environment, and these changes are generally underlain by changes in gene expression. Relating gene expression plasticity to the phenotypic plasticity of more integrated organismal traits thus provides useful information about which genes influence the development and expression of which traits, even in the absence of genetic variation. We here appraise the prospects and limits of such an environment-for-gene substitution for investigating the genotype-phenotype map. We review models of gene regulatory networks, and discuss the different ways in which they can incorporate the environment to mechanistically model phenotypic plasticity and its evolution. We suggest that substantial progress can be made in deciphering this genotype-environment-phenotype map, by connecting theory on gene regulatory networks to empirical patterns of gene co-expression, and by more explicitly relating gene expression to the expression and development of phenotypes, both theoretically and empirically.
|
77
|
Massri AJ, Greenstreet L, Afanassiev A, Berrio A, Wray GA, Schiebinger G, McClay DR. Developmental single-cell transcriptomics in the Lytechinus variegatus sea urchin embryo. Development 2021; 148:271986. [PMID: 34463740 DOI: 10.1242/dev.198614] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 11/14/2020] [Accepted: 08/20/2021] [Indexed: 12/30/2022]
Abstract
Using scRNA-seq coupled with computational approaches, we studied transcriptional changes in cell states of sea urchin embryos during development to the larval stage. Eighteen closely spaced time points were taken during the first 24 h of development of Lytechinus variegatus (Lv). Developmental trajectories were constructed using Waddington-OT, a computational approach to 'stitch' together developmental time points. Skeletogenic and primordial germ cell trajectories diverged early in cleavage. Ectodermal progenitors were distinct from other lineages by the 6th cleavage, although a small percentage of ectoderm cells briefly co-expressed endoderm markers that indicated an early ecto-endoderm cell state, likely in cells originating from the equatorial region of the egg. Endomesoderm cells also originated at the 6th cleavage and this state persisted for more than two cleavages, then diverged into distinct endoderm and mesoderm fates asynchronously, with some cells retaining an intermediate specification status until gastrulation. Seventy-nine out of 80 genes (99%) examined, and included in published developmental gene regulatory networks (dGRNs), are present in the Lv-scRNA-seq dataset and are expressed in the correct lineages in which the dGRN circuits operate.
|
78
|
Saint-André V. Computational biology approaches for mapping transcriptional regulatory networks. Comput Struct Biotechnol J 2021; 19:4884-4895. [PMID: 34522292 PMCID: PMC8426465 DOI: 10.1016/j.csbj.2021.08.028] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/24/2021] [Revised: 08/16/2021] [Accepted: 08/16/2021] [Indexed: 12/13/2022]
Abstract
Transcriptional Regulatory Networks (TRNs) are mainly responsible for the cell-type- or cell-state-specific expression of gene sets from the same DNA sequence. However, so far there are no precise maps of TRNs available for each cell-type or cell-state, and no ideal tool to map those networks clearly and in full from biological samples. In this review, major approaches and tools to map TRNs from high-throughput data are presented, depending on the type of methods or data used to infer them, and their advantages and limitations are discussed. After summarizing the main principles defining the topology and structure–function relationships in TRNs, an overview of the extensive work done to map TRNs from bulk transcriptomic data will be presented by type of methodological approach. Most recent modellings of TRNs using other types of molecular data or integrating different data types, including single-cell RNA-sequencing and chromatin information, will then be discussed, before briefly concluding with improvements expected to come in the field.
|
79
|
Yu H, Wang X, Cao H. Construction and investigation of a circRNA-associated ceRNA regulatory network in Tetralogy of Fallot. BMC Cardiovasc Disord 2021; 21:437. [PMID: 34521346 PMCID: PMC8442392 DOI: 10.1186/s12872-021-02217-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/30/2020] [Accepted: 08/20/2021] [Indexed: 12/14/2022]
Abstract
Background: As the most frequent type of cyanotic congenital heart disease (CHD), tetralogy of Fallot (TOF) has a relatively poor prognosis without corrective surgery. Circular RNAs (circRNAs) represent a novel class of endogenous noncoding RNAs that regulate target gene expression posttranscriptionally in heart development. Here, we investigated the potential role of the ceRNA network in the pathogenesis of TOF. Methods: To identify circRNA expression profiles in TOF, microarrays were used to screen the differentially expressed circRNAs between 3 TOF and 3 control human myocardial tissue samples. Then, a dysregulated circRNA-associated ceRNA network was constructed using the established multistep screening strategy. Results: In summary, a total of 276 differentially expressed circRNAs were identified, including 214 upregulated and 62 downregulated circRNAs in TOF samples. By constructing the circRNA-associated ceRNA network based on bioinformatics data, a total of 19 circRNAs, 9 miRNAs, and 34 mRNAs were further screened. Moreover, by enlarging the sample size, the qPCR results validated the positive correlations between hsa_circ_0007798 and HIF1A. Conclusions: The findings in this study provide a comprehensive understanding of the ceRNA network involved in TOF biology, such as the hsa_circ_0007798/miR-199b-5p/HIF1A signalling axis, and may offer candidate diagnostic biomarkers or potential therapeutic targets for TOF. In addition, we propose that the ceRNA network regulates TOF progression. Supplementary Information: The online version contains supplementary material available at 10.1186/s12872-021-02217-w.
|
80
|
Liu W, Jiang Y, Peng L, Sun X, Gan W, Zhao Q, Tang H. Inferring Gene Regulatory Networks Using the Improved Markov Blanket Discovery Algorithm. Interdiscip Sci 2021; 14:168-181. [PMID: 34495484 DOI: 10.1007/s12539-021-00478-9] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 05/07/2021] [Revised: 08/22/2021] [Accepted: 08/24/2021] [Indexed: 11/26/2022]
Abstract
Inferring gene regulatory networks (GRNs) from microarray data can help us understand the mechanisms of life and eventually develop effective therapies. Currently, many computational methods have been used in inferring GRNs. However, owing to high-dimensional data and small samples, these methods often tend to introduce redundant regulatory relationships. Therefore, a novel network inference method based on the improved Markov blanket discovery algorithm, IMBDANET, is proposed to infer GRNs. Specifically, for each target gene, data processing inequality was applied to the Markov blanket discovery algorithm for the accurate differentiation of direct regulatory genes from indirect regulatory genes. Finally, direct regulatory genes were used in constructing GRNs, and the network structure was optimized according to the importance degree score. Experimental results on six public network datasets show that the proposed method can be effectively used to infer GRNs.
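The data processing inequality mentioned in this abstract can be illustrated with a minimal sketch. It assumes an ARACNE-style pruning rule applied to a precomputed mutual-information matrix: in every gene triplet, the weakest edge is treated as indirect and dropped. This is not the IMBDANET algorithm itself; the function name `dpi_prune` and the toy matrix are hypothetical.

```python
import numpy as np
from itertools import combinations

def dpi_prune(mi):
    """Return a boolean adjacency matrix after data-processing-inequality pruning.

    mi is a symmetric matrix of pairwise mutual information (assumed given).
    For each triplet, the edge with the strictly lowest MI is marked indirect.
    """
    n = mi.shape[0]
    keep = mi > 0
    np.fill_diagonal(keep, False)
    for i, j, k in combinations(range(n), 3):
        triple = [((i, j), mi[i, j]), ((i, k), mi[i, k]), ((j, k), mi[j, k])]
        (a, b), weakest = min(triple, key=lambda t: t[1])
        others = [v for e, v in triple if e != (a, b)]
        if weakest < min(others):
            keep[a, b] = keep[b, a] = False
    return keep

# Toy chain 0 -> 1 -> 2: the 0-2 dependence is weaker than both direct links.
mi = np.array([[0.0, 0.9, 0.3],
               [0.9, 0.0, 0.8],
               [0.3, 0.8, 0.0]])
adjacency = dpi_prune(mi)
```

In this toy case the 0-2 edge is removed while 0-1 and 1-2 are kept, which is the sense in which the inequality separates direct from indirect regulators.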
|
81
|
Bernal V, Bischoff R, Horvatovich P, Guryev V, Grzegorczyk M. The 'un-shrunk' partial correlation in Gaussian graphical models. BMC Bioinformatics 2021; 22:424. [PMID: 34493207 PMCID: PMC8424921 DOI: 10.1186/s12859-021-04313-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2020] [Accepted: 08/02/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND: In systems biology, it is important to reconstruct regulatory networks from quantitative molecular profiles. Gaussian graphical models (GGMs) are one of the most popular methods to this end. A GGM consists of nodes (representing the transcripts, metabolites or proteins) inter-connected by edges (reflecting their partial correlations). Learning the edges from quantitative molecular profiles is statistically challenging, as there are usually fewer samples than nodes ('high dimensional problem'). Shrinkage methods address this issue by learning a regularized GGM. However, how the shrinkage affects the final result and its interpretation remains an open question. RESULTS: We show that the shrinkage biases the partial correlation in a non-linear way. This bias not only changes the magnitudes of the partial correlations but also affects their order. Furthermore, it makes networks obtained from different experiments incomparable and hinders their biological interpretation. We propose a method, referred to as 'un-shrinking' the partial correlation, which corrects for this non-linear bias. Unlike traditional methods, which use a fixed shrinkage value, the new approach provides partial correlations that are closer to the actual (population) values and that are easier to interpret. This is demonstrated on two gene expression datasets from Escherichia coli and Mus musculus. CONCLUSIONS: GGMs are popular undirected graphical models based on partial correlations. The application of GGMs to reconstruct regulatory networks is commonly performed using shrinkage to overcome the 'high-dimensional problem'. Besides its advantages, we have identified that the shrinkage introduces a non-linear bias in the partial correlations. Ignoring this type of effect caused by the shrinkage can obscure the interpretation of the network, and impede the validation of earlier reported results.
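To make the effect of shrinkage on partial correlations concrete, here is a minimal sketch. It assumes a simple linear shrinkage of the sample correlation matrix toward the identity, R_lam = (1 - lam) * R + lam * I, and computes partial correlations from the inverse of the shrunk matrix. The function names, the simulated data and the lam values are illustrative assumptions; this is not the authors' 'un-shrinking' procedure.

```python
import numpy as np

def partial_correlations(corr):
    """Partial correlations from a correlation matrix via its inverse (precision matrix)."""
    precision = np.linalg.inv(corr)
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

def shrunk_partial_correlations(data, lam):
    """Shrink the sample correlation toward the identity, then invert."""
    corr = np.corrcoef(data, rowvar=False)
    shrunk = (1.0 - lam) * corr + lam * np.eye(corr.shape[0])
    return partial_correlations(shrunk)

# Simulated profiles (rows = samples, columns = genes) with a chain 0 -> 1 -> 2.
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 5))
data[:, 1] += 0.8 * data[:, 0]
data[:, 2] += 0.8 * data[:, 1]

pcor_light = shrunk_partial_correlations(data, lam=0.05)
pcor_heavy = shrunk_partial_correlations(data, lam=0.80)
```

Comparing `pcor_light` and `pcor_heavy` entry by entry shows how the estimated edge strengths depend on the chosen shrinkage value, which is the dependence the abstract describes as a bias.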
|
82
|
The early Drosophila embryo as a model system for quantitative biology. Cells Dev 2021; 168:203722. [PMID: 34298230 DOI: 10.1016/j.cdev.2021.203722] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2021] [Revised: 06/03/2021] [Accepted: 07/13/2021] [Indexed: 11/20/2022]
Abstract
With the rise of new tools, from controlled genetic manipulations and optogenetics to improved microscopy, it is now possible to make clear, quantitative and reproducible measurements of biological processes. The humble fruit fly Drosophila melanogaster, with its ease of genetic manipulation combined with excellent imaging accessibility, has become a major model system for performing quantitative in vivo measurements. Such measurements are driving a new wave of interest from physicists and engineers, who are developing a range of testable dynamic models of active systems to understand fundamental biological processes. The reproducibility of the early Drosophila embryo has been crucial for understanding how biological systems are robust to unavoidable noise during development. Insights from quantitative in vivo experiments in the Drosophila embryo are having an impact on our understanding of critical biological processes, such as how cells make decisions and how complex tissue shape emerges. Here, to highlight the power of using Drosophila embryogenesis for quantitative biology, I focus on three main areas: (1) formation and robustness of morphogen gradients; (2) how gene regulatory networks ensure precise boundary formation; and (3) how mechanical interactions drive packing and tissue folding. I further discuss how such data has driven advances in modelling.
|
83
|
Wittmann MT, Katada S, Sock E, Kirchner P, Ekici AB, Wegner M, Nakashima K, Lie DC, Reis A. scRNA sequencing uncovers a TCF4-dependent transcription factor network regulating commissure development in mouse. Development 2021; 148:269257. [PMID: 34184026 PMCID: PMC8327186 DOI: 10.1242/dev.196022] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/19/2020] [Accepted: 06/15/2021] [Indexed: 01/21/2023]
Abstract
Transcription factor 4 (TCF4) is a crucial regulator of neurodevelopment and has been linked to the pathogenesis of autism, intellectual disability and schizophrenia. As a class I bHLH transcription factor (TF), it is assumed that TCF4 exerts its neurodevelopmental functions through dimerization with proneural class II bHLH TFs. Here, we aim to identify TF partners of TCF4 in the control of interhemispheric connectivity formation. Using a new bioinformatic strategy integrating TF expression levels and regulon activities from single cell RNA-sequencing data, we find evidence that TCF4 interacts with non-bHLH TFs and modulates their transcriptional activity in Satb2+ intercortical projection neurons. Notably, this network comprises regulators linked to the pathogenesis of neurodevelopmental disorders, e.g. FOXG1, SOX11 and BRG1. In support of the functional interaction of TCF4 with non-bHLH TFs, we find that TCF4 and SOX11 biochemically interact and cooperatively control commissure formation in vivo, and regulate the transcription of genes implicated in this process. In addition to identifying new candidate interactors of TCF4 in neurodevelopment, this study illustrates how scRNA-Seq data can be leveraged to predict TF networks in neurodevelopmental processes. Summary: Single-cell RNA sequencing identifies interactions of TCF4 with non-bHLH transcription factors linked to neurodevelopmental and neuropsychiatric disease in the regulation of interhemispheric projection neuron development.
|
84
|
Oriola D, Spagnoli FM. Engineering life in synthetic systems. Development 2021; 148:270849. [PMID: 34251450 DOI: 10.1242/dev.199497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2021] [Accepted: 06/01/2021] [Indexed: 11/20/2022]
Abstract
The second EMBO-EMBL Symposium 'Synthetic Morphogenesis: From Gene Circuits to Tissue Architecture' was held virtually in March 2021, with participants from all over the world joining from the comfort of their sofas to discuss synthetic morphogenesis at large. Leading scientists from a range of disciplines, including developmental biology, physics, chemistry and computer science, covered a gamut of topics from the principles of cell and tissue organization, patterning and gene regulatory networks, to synthetic approaches for exploring evolutionary and developmental biology principles. Here, we describe some of the high points.
|
85
|
Grimes T, Datta S. SeqNet: An R Package for Generating Gene-Gene Networks and Simulating RNA-Seq Data. J Stat Softw 2021; 98:10.18637/jss.v098.i12. [PMID: 34321962 PMCID: PMC8315007 DOI: 10.18637/jss.v098.i12] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/12/2022]
Abstract
Gene expression data provide an abundant resource for inferring connections in gene regulatory networks. While methodologies developed for this task have shown success, a challenge remains in comparing the performance among methods. Gold-standard datasets are scarce and limited in use. And while tools for simulating expression data are available, they are not designed to resemble the data obtained from RNA-seq experiments. SeqNet is an R package that provides tools for generating a rich variety of gene network structures and simulating RNA-seq data from them. This produces in silico RNA-seq data for benchmarking and assessing gene network inference methods. The package is available on CRAN and on GitHub at https://github.com/tgrimes/SeqNet.
|
86
|
Melkus G, Rucevskis P, Celms E, Čerāns K, Freivalds K, Kikusts P, Lace L, Opmanis M, Rituma D, Viksna J. Network motif-based analysis of regulatory patterns in paralogous gene pairs. J Bioinform Comput Biol 2021; 18:2040008. [PMID: 32698721 DOI: 10.1142/s0219720020400089] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/02/2023]
Abstract
Current high-throughput experimental techniques make it feasible to infer gene regulatory interactions at the whole-genome level with reasonably good accuracy. Such experimentally inferred regulatory networks have become available for a number of simpler model organisms, such as S. cerevisiae and others. The availability of such networks provides an opportunity to compare gene regulatory processes at the whole-genome level, and in particular, to assess the similarity of regulatory interactions for homologous gene pairs either from the same or from different species. We present here a new technique for analyzing the regulatory interaction neighborhoods of paralogous gene pairs. Our central focus is the analysis of S. cerevisiae gene interaction graphs, which are of particular interest due to the ancestral whole-genome duplication (WGD) that allows one to distinguish between paralogous transcription factors that are traceable to this duplication event and other paralogues. Similar analysis is also applied to E. coli and C. elegans networks. We compare paralogous gene pairs according to the presence and size of bi-fan arrays, classically associated in the literature with gene duplication, within other network motifs. We further extend this framework beyond transcription factor comparison to obtain topology-based similarity metrics based on the overlap of interaction neighborhoods applicable to most genes in a given organism. We observe that our network divergence metrics show considerably larger similarity between paralogues, especially those traceable to WGD. This is the case for both yeast and C. elegans, but not for the E. coli regulatory network. While there is no obvious cross-species link between metrics, different classes of paralogues show notable differences in interaction overlap, with traceable duplications tending toward higher overlap compared to genes with shared protein families.
Our findings indicate that divergence in paralogous interaction networks reflects a shared genetic origin, and that our approach may be useful for investigating structural similarity in the interaction networks of paralogous genes.
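As an illustration of what an overlap-based similarity between interaction neighborhoods can look like, the sketch below assumes a Jaccard index over the sets of regulators of two genes. The abstract does not specify the metric, so the Jaccard choice, the function names and the toy edge list (TF1, TF2, TF3, geneA, geneB) are all hypothetical.

```python
def regulators(edges, gene):
    """Set of source nodes with a directed edge into the given gene."""
    return {src for src, dst in edges if dst == gene}

def neighborhood_overlap(edges, gene_a, gene_b):
    """Jaccard overlap of the regulator sets of two genes (0.0 if both are empty)."""
    reg_a, reg_b = regulators(edges, gene_a), regulators(edges, gene_b)
    union = reg_a | reg_b
    return len(reg_a & reg_b) / len(union) if union else 0.0

# Toy directed network: (regulator, target) pairs.
edges = [("TF1", "geneA"), ("TF2", "geneA"), ("TF2", "geneB"), ("TF3", "geneB")]
overlap = neighborhood_overlap(edges, "geneA", "geneB")
```

Here the two genes share one of three distinct regulators, so the overlap is one third; a pair of recently duplicated genes that still shared all regulators would score 1.0.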
|
87
|
Bartlett T. Fusion of single-cell transcriptome and DNA-binding data, for genomic network inference in cortical development. BMC Bioinformatics 2021; 22:301. [PMID: 34088262 PMCID: PMC8176738 DOI: 10.1186/s12859-021-04201-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2020] [Accepted: 05/12/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Network models are well-established as very useful computational-statistical tools in cell biology. However, a genomic network model based only on gene expression data can, by definition, only infer gene co-expression networks. Hence, in order to infer gene regulatory patterns, it is necessary to also include data related to binding of regulatory factors to DNA. RESULTS: We propose a new dynamic genomic network model, for inferring patterns of genomic regulatory influence in dynamic processes such as development. Our model fuses experiment-specific gene expression data with publicly available DNA-binding data. The method we propose is computationally efficient, and can be applied to genome-wide data with tens of thousands of transcripts. Thus, our method is well suited for use as an exploratory tool for genome-wide data. We apply our method to data from human fetal cortical development, and our findings confirm genomic regulatory patterns which are recognised as being fundamental to neuronal development. CONCLUSIONS: Our method provides a mathematical/computational toolbox which, when coupled with targeted experiments, will reveal and confirm important new functional genomic regulatory processes in mammalian development.
|
88
|
Gao X, Guo P, Wang Z, Chen C, Ren Z. Transcriptome profiling reveals response genes for downy mildew resistance in cucumber. PLANTA 2021; 253:112. [PMID: 33914134 DOI: 10.1007/s00425-021-03603-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/12/2020] [Accepted: 03/22/2021] [Indexed: 06/12/2023]
Abstract
We discovered a potential defense pathway of cucumber to downy mildew. The signaling that activates the pathways of ROS and lignin accumulation may play an important role in the defense response. Many resistance genes were identified by transcriptome analysis. Downy mildew (DM), caused by Pseudoperonospora cubensis, is one of the most destructive diseases and causes severe yield losses of cucumber. However, the genes and pathways involved in regulating DM resistance were still poorly understood. In our study, we observed that the highly sensitive inbred line 53 (IL53) exhibited more severe disease symptoms than the highly resistant inbred line 51 (IL51) under P. cubensis infection. Furthermore, lignin, limiting the germination and extension of P. cubensis, and H2O2, as a signaling molecule during the resistant process, were both shown to increase, indicating that the signaling that activates these pathways might be responsible for the resistance divergence between IL51 and IL53. Transcriptome analysis, using the resistant and susceptible pools in F2 populations with IL51 and IL53 as parents, showed that a series of differentially expressed genes was involved in multiple functions of defense response: pathogen-associated molecular pattern recognition, signal transduction, reactive oxygen species and lignin accumulation, and transcription regulators. Combining physiological data with transcriptomes, we predicted a potential molecular mechanism of cucumber resistance to DM. Our research provided a foundation for further studies on the mechanism of cucumber resistance to DM.
|
89
|
Shafiee Kamalabad M, Grzegorczyk M. A new Bayesian piecewise linear regression model for dynamic network reconstruction. BMC Bioinformatics 2021; 22:196. [PMID: 33902443 PMCID: PMC8074473 DOI: 10.1186/s12859-021-03998-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/14/2021] [Accepted: 02/05/2021] [Indexed: 11/10/2022]
Abstract
Background: Linear regression models are important tools for learning regulatory networks from gene expression time series. A conventional assumption for non-homogeneous regulatory processes on a short time scale is that the network structure stays constant across time, while the network parameters are time-dependent. The objective is then to learn the network structure along with changepoints that divide the time series into time segments. An uncoupled model learns the parameters separately for each segment, while a coupled model enforces the parameters of any segment to stay similar to those of the previous segment. In this paper, we propose a new consensus model that infers for each individual time segment whether it is coupled to (or uncoupled from) the previous segment. Results: The results show that the new consensus model is superior to the uncoupled and the coupled model, as well as superior to a recently proposed generalized coupled model. Conclusions: The newly proposed model has the uncoupled and the coupled model as limiting cases, and it is able to infer the best trade-off between them from the data. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-021-03998-9.
|
90
|
Feng S, Gao Y, Yin D, Lv L, Wen Y, Li Z, Wang B, Wu M, Liu B. Identification of Lumican and Fibromodulin as Hub Genes Associated with Accumulation of Extracellular Matrix in Diabetic Nephropathy. Kidney Blood Press Res 2021; 46:275-285. [PMID: 33887734 DOI: 10.1159/000514013] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/13/2020] [Accepted: 12/22/2020] [Indexed: 01/27/2023]
Abstract
INTRODUCTION: Diabetic nephropathy (DN) remains a major cause of end-stage renal disease. The development of novel biomarkers and early diagnosis of DN are of great clinical importance. The goal of this study was to identify hub genes with diagnostic potential for DN by weighted gene co-expression network analysis (WGCNA). METHODS: The Gene Expression Omnibus database was searched for microarray data including distinct types of CKD. A gene co-expression network was constructed, and modules specific for DN were identified by WGCNA. Gene ontology (GO) analysis was performed, and the hub genes were screened out within the selected gene modules. In addition, cross-validation was performed in an independent dataset and in samples of renal biopsies with DN and other types of glomerular diseases. RESULTS: Dataset GSE99339 was selected, and a total of 179 microdissected glomeruli samples were analyzed, including DN, normal control, and 7 groups of other glomerular diseases. Twenty-three modules of the total 10,947 genes were grouped by WGCNA, and a module was specifically correlated with DN (r = 0.54, p = 9e-15). GO analysis showed that module genes were mainly enriched in the accumulation of extracellular matrix (ECM). LUM, ELN, FBLN1, MMP2, FBLN5, and FMOD were identified as hub genes. Cross-validation showed that LUM and FMOD were higher in the DN group and were negatively correlated with estimated glomerular filtration rate (eGFR). In renal biopsies, expression levels of LUM and FMOD were higher in DN than in IgA nephropathy, membranous nephropathy, and normal controls. CONCLUSION: Using the WGCNA approach, we identified LUM and FMOD as genes that are related to ECM accumulation and specific for DN. These 2 genes may represent potential candidate diagnostic biomarkers of DN.
|
91
|
Layous M, Khalaily L, Gildor T, Ben-Tabou de-Leon S. The tolerance to hypoxia is defined by a time-sensitive response of the gene regulatory network in sea urchin embryos. Development 2021; 148:dev.195859. [PMID: 33795230 PMCID: PMC8077511 DOI: 10.1242/dev.195859] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/11/2020] [Accepted: 03/22/2021] [Indexed: 12/12/2022]
Abstract
Deoxygenation, the reduction of oxygen level in the oceans induced by global warming and anthropogenic disturbances, is a major threat to marine life. This change in oxygen level could be especially harmful to marine embryos that use endogenous hypoxia and redox gradients as morphogens during normal development. Here, we show that the tolerance to hypoxic conditions changes between different developmental stages of the sea urchin embryo, possibly due to the structure of the gene regulatory networks (GRNs). We demonstrate that during normal development, the bone morphogenetic protein (BMP) pathway restricts the activity of the vascular endothelial growth factor (VEGF) pathway to two lateral domains and this restriction controls proper skeletal patterning. Hypoxia applied during early development strongly perturbs the activity of Nodal and BMP pathways that affect the VEGF pathway, dorsal-ventral (DV) and skeletogenic patterning. These pathways are largely unaffected by hypoxia applied after DV-axis formation. We propose that the use of redox and hypoxia as morphogens makes the sea urchin embryo highly sensitive to environmental hypoxia during early development, but the GRN structure provides higher tolerance to hypoxia at later stages. Summary: The use of hypoxia and redox gradients as morphogens makes sea urchin early development sensitive to environmental hypoxia. This sensitivity decreases later, possibly due to the gene regulatory network structure.
|
92
|
Qualitative Modeling, Analysis and Control of Synthetic Regulatory Circuits. Methods Mol Biol 2021; 2229:1-40. [PMID: 33405215 DOI: 10.1007/978-1-0716-1032-9_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
Abstract
Qualitative modeling approaches are promising and still underexploited tools for the analysis and design of synthetic circuits. They can make predictions of circuit behavior in the absence of precise, quantitative information. Moreover, they provide direct insight into the relation between the feedback structure and the dynamical properties of a network. We review qualitative modeling approaches by focusing on two specific formalisms, Boolean networks and piecewise-linear differential equations, and illustrate their application by means of three well-known synthetic circuits. We describe various methods for the analysis of state transition graphs, discrete representations of the network dynamics that are generated in both modeling frameworks. We also briefly present the problem of controlling synthetic circuits, an emerging topic that could profit from the capacity of qualitative modeling approaches to rapidly scan a space of design alternatives.
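A state transition graph of the kind described here can be sketched for a Boolean network in a few lines. The example assumes a hypothetical two-gene mutual-repression circuit with synchronous updates, chosen only for illustration; it is not claimed to be one of the three circuits analysed in the chapter, and the function names are made up for this sketch.

```python
from itertools import product

def step(state):
    """Synchronous update: each gene is ON exactly when its repressor is OFF."""
    a, b = state
    return (int(not b), int(not a))

def state_transition_graph(update, n_genes):
    """Map every Boolean state to its successor under the update rule."""
    return {s: update(s) for s in product((0, 1), repeat=n_genes)}

def fixed_points(graph):
    """States that map to themselves, i.e. steady states of the network."""
    return sorted(s for s, t in graph.items() if s == t)

graph = state_transition_graph(step, 2)
steady_states = fixed_points(graph)
```

The two fixed points, (0, 1) and (1, 0), correspond to the two stable expression states of such a switch, while (0, 0) and (1, 1) map onto each other under synchronous updating.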
|
93
|
Nandi M. Role of integrated noise in pathway-specific signal propagation in feed-forward loops. Theory Biosci 2021; 140:139-155. [PMID: 33751398 DOI: 10.1007/s12064-021-00338-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2020] [Accepted: 02/15/2021] [Indexed: 11/25/2022]
Abstract
Cells employ optimal noise control mechanisms in diverse situations to cope with distinct environmental cues. Sometimes it is desirable for the cell to utilize fluctuations for noise-driven processes; in other cases, noise can be harmful and prevent the cell from achieving optimal fitness. It is therefore important to unravel the noise propagation mechanism inside the cell. Such noise control is accomplished by gene transcription regulatory networks. One such gene regulatory network is the feed-forward loop (FFL), which has three regulatory nodes S, X and Y. Here, we consider the most abundant type 1 coherent and incoherent feed-forward loops with both OR and AND logic functions, forming four different architectures. In the OR logic function, the functions representing S and X act additively for the regulation of Y, while in the AND logic function, the same functions (S and X) act multiplicatively for the regulation of Y. The susceptibility of the signal at output Y is measured using the elasticity of each regulation in the FFLs. Using susceptibility, we demonstrate the nature of pathway integration by which the one-step and two-step pathways overlap. The integration type is competitive for motifs having an OR gate, while it is noncompetitive for those with an AND gate. The pathway integration property explains the output noise behavior of the motifs properly but cannot reveal the mechanism by which upstream noise propagates to the output. To account for this, the total output noise is decomposed, which yields integrated noise as an additional noise source along with pathway-specific noise components. The integrated noise is found to appear as a consequence of integration between the pathways and has different functional characteristics, explaining the noise amplification and noise attenuation properties of coherent and incoherent feed-forward loops, respectively. The noise decomposition also quantifies the contribution of different noise sources toward total noise. Finally, noise propagation is tuned as a function of input signal noise and its time scale of fluctuations, which shows that considerable intrinsic noise strength and a relatively slow relaxation time scale cause a higher degree of noise propagation in FFLs.
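The distinction between additive (OR) and multiplicative (AND) integration of S and X can be sketched as follows. The Hill-type regulation functions, the threshold K, the Hill coefficient n, and the function names are assumptions chosen for illustration; the abstract does not specify the functional forms used in the paper.

```python
def hill_activation(x, K=1.0, n=2):
    """Assumed activating regulation function, rising from 0 to 1."""
    return x**n / (K**n + x**n)

def hill_repression(x, K=1.0, n=2):
    """Assumed repressing regulation function, falling from 1 to 0."""
    return K**n / (K**n + x**n)

def y_regulation(s, x, logic="OR", coherent=True):
    """Combined regulation of Y by S (one-step path) and X (two-step path).

    In a type 1 coherent loop X activates Y; in a type 1 incoherent loop
    X represses Y. S activates Y in both cases.
    """
    f_s = hill_activation(s)
    f_x = hill_activation(x) if coherent else hill_repression(x)
    if logic == "OR":
        return f_s + f_x   # the two functions act additively
    if logic == "AND":
        return f_s * f_x   # the two functions act multiplicatively
    raise ValueError("logic must be 'OR' or 'AND'")

if __name__ == "__main__":
    for logic in ("OR", "AND"):
        for coherent in (True, False):
            print(logic, "coherent" if coherent else "incoherent",
                  y_regulation(1.0, 1.0, logic, coherent))
```

Looping over both logic gates and both loop types reproduces the four architectures named in the abstract; only the way the two regulation functions are combined differs between OR and AND.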
|
94
|
Network-driven discovery yields new insight into Shox2-dependent cardiac rhythm control. Biochim Biophys Acta Gene Regul Mech 2021; 1864:194702. [PMID: 33706013 DOI: 10.1016/j.bbagrm.2021.194702] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/02/2020] [Revised: 02/22/2021] [Accepted: 02/23/2021] [Indexed: 11/23/2022]
Abstract
The homeodomain transcription factor SHOX2 is involved in the development and function of the heart's primary pacemaker, the sinoatrial node (SAN), and has been associated with cardiac conduction-related diseases such as atrial fibrillation and sinus node dysfunction. To shed light on Shox2-dependent genetic processes involved in these diseases, we established a murine embryonic stem cell (ESC) cardiac differentiation model to investigate Shox2 pathways in SAN-like cardiomyocytes. Differential RNA-seq-based expression profiling of Shox2+/+ and Shox2-/- ESCs revealed 94 dysregulated transcripts in Shox2-/- ESC-derived SAN-like cells. Of these, 15 putative Shox2 target genes were selected for further validation based on comparative expression analysis with SAN- and right atria-enriched genes. Network-based analyses, integrating data from the Mouse Organogenesis Cell Atlas and the Ingenuity pathways, as well as validation in mouse and zebrafish models, confirmed a regulatory role for the newly identified Shox2 target genes, including Cav1, Fkbp10, Igfbp5, Mcf2l and Nr2f2. Our results indicate that genetic networks involving SHOX2 may contribute to conduction traits through the regulation of these genes.
|
95
|
Yang Y, Yao S, Ding JM, Chen W, Guo Y. Enhancer-Gene Interaction Analyses Identified the Epidermal Growth Factor Receptor as a Susceptibility Gene for Type 2 Diabetes Mellitus. Diabetes Metab J 2021; 45:241-250. [PMID: 32602275 PMCID: PMC8024152 DOI: 10.4093/dmj.2019.0204] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/01/2019] [Accepted: 01/03/2020] [Indexed: 01/06/2023] Open
Abstract
Background: Genetic interactions are known to play an important role in the missing heritability problem for type 2 diabetes mellitus (T2DM). Interactions between enhancers and their target genes play important roles in gene regulation and disease pathogenesis. In the present study, we aimed to identify genetic interactions between enhancers and their target genes associated with T2DM. Methods: We performed genetic interaction analyses of enhancers and protein-coding genes for T2DM in 2,696 T2DM patients and 3,548 controls of European ancestry. A linear regression model was used to identify single nucleotide polymorphism (SNP) pairs that could affect the expression of the protein-coding genes. Differential expression analyses were used to identify differentially expressed susceptibility genes in diabetic and nondiabetic subjects. Results: We identified one SNP pair, rs4947941×rs7785013, significantly associated with T2DM (combined P=4.84×10⁻¹⁰). The SNP rs4947941 was annotated as an enhancer, and rs7785013 was located in the epidermal growth factor receptor (EGFR) gene. This SNP pair was significantly associated with EGFR expression in the pancreas (P=0.033), and the minor allele "A" of rs7785013 decreased EGFR gene expression and the risk of T2DM with an increase in the dosage of "T" of rs4947941. EGFR expression was significantly upregulated in T2DM patients, which was consistent with the effect of rs4947941×rs7785013 on T2DM and EGFR expression. A functional validation study using the Mouse Genome Informatics (MGI) database showed that EGFR was associated with diabetes-relevant phenotypes. Conclusion: Genetic interaction analyses of enhancers and protein-coding genes suggested that EGFR may be a novel susceptibility gene for T2DM.
|
96
|
Åkesson J, Lubovac-Pilav Z, Magnusson R, Gustafsson M. ComHub: Community predictions of hubs in gene regulatory networks. BMC Bioinformatics 2021; 22:58. [PMID: 33563211 PMCID: PMC7871572 DOI: 10.1186/s12859-021-03987-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/26/2020] [Accepted: 01/29/2021] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND: Hub transcription factors, regulating many target genes in gene regulatory networks (GRNs), play important roles as disease regulators and potential drug targets. However, while numerous methods have been developed to predict individual regulator-gene interactions from gene expression data, few methods focus on inferring these hubs. RESULTS: We have developed ComHub, a tool to predict hubs in GRNs. ComHub makes a community prediction of hubs by averaging over predictions by a compendium of network inference methods. Benchmarking ComHub against the DREAM5 challenge data and two independent gene expression datasets showed a robust performance of ComHub over all datasets. CONCLUSIONS: In contrast to other evaluated methods, ComHub consistently scored among the top performing methods on data from different sources. Lastly, we implemented ComHub to work with both predefined networks and to perform stand-alone network inference, which will make the method generally applicable.
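The averaging step described above can be illustrated with a small sketch. This is one plausible reading of "community prediction", not ComHub's actual implementation: the edge-list format, the top_n cutoff, the use of outdegree as the hub score, and all function and method names are assumptions.

```python
from collections import Counter

def outdegree(edges, top_n):
    """Count targets per regulator among the top_n highest-scoring edges.

    edges: list of (regulator, target, score) tuples from one inference method.
    """
    top = sorted(edges, key=lambda e: e[2], reverse=True)[:top_n]
    return Counter(tf for tf, _, _ in top)

def community_hub_scores(predictions, top_n):
    """Average each regulator's outdegree over all inference methods.

    predictions: dict mapping a method name to that method's edge list.
    """
    regulators = {tf for edges in predictions.values() for tf, _, _ in edges}
    degrees = [outdegree(edges, top_n) for edges in predictions.values()]
    return {tf: sum(d[tf] for d in degrees) / len(degrees) for tf in regulators}

if __name__ == "__main__":
    # Made-up toy predictions from two hypothetical methods.
    predictions = {
        "method_a": [("TF1", "g1", 0.9), ("TF1", "g2", 0.8), ("TF2", "g1", 0.1)],
        "method_b": [("TF1", "g1", 0.7), ("TF2", "g2", 0.6), ("TF2", "g3", 0.2)],
    }
    scores = community_hub_scores(predictions, top_n=2)
    for tf, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(tf, score)
```

Averaging over methods dampens the effect of any single method that over- or under-calls edges for a particular regulator, which is the intuition behind a community prediction.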
|
97
|
Turki T, Taguchi YH. Discriminating the single-cell gene regulatory networks of human pancreatic islets: A novel deep learning application. Comput Biol Med 2021; 132:104257. [PMID: 33740535 DOI: 10.1016/j.compbiomed.2021.104257] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/01/2020] [Revised: 02/01/2021] [Accepted: 02/03/2021] [Indexed: 12/24/2022]
Abstract
Analysis of single-cell pancreatic data can play an important role in understanding various metabolic diseases and health conditions. Due to the sparsity and noise present in such single-cell gene expression data, inference of single-cell gene regulatory networks remains a challenge. Since recent studies have reported the reliable inference of single-cell gene regulatory networks (SCGRNs), the current study focused on discriminating the SCGRNs of type 2 diabetes (T2D) patients from those of healthy controls. By accurately distinguishing SCGRNs of healthy pancreas from those of T2D pancreas, it would be possible to annotate, organize, visualize, and identify common patterns of SCGRNs in metabolic diseases. Such annotated SCGRNs could play an important role in accelerating the process of building large data repositories. This study aimed to contribute to the development of a novel deep learning (DL) application. First, we generated a dataset consisting of 224 SCGRNs belonging to both T2D and healthy pancreas and made it freely available. Next, we chose seven DL architectures (VGG16, VGG19, Xception, ResNet50, ResNet101, DenseNet121, and DenseNet169), trained each of them on the dataset, and checked their predictions on a test set. Of note, we evaluated the DL architectures on a single NVIDIA GeForce RTX 2080Ti GPU. Experimental results on the whole dataset, using several performance measures, demonstrated the superiority of the VGG19 DL model in the automatic classification of SCGRNs derived from the single-cell pancreatic data.
|
98
|
Wani N, Raza K. MKL-GRNI: A parallel multiple kernel learning approach for supervised inference of large-scale gene regulatory networks. PeerJ Comput Sci 2021; 7:e363. [PMID: 33817013 PMCID: PMC7924726 DOI: 10.7717/peerj-cs.363] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/19/2020] [Accepted: 12/29/2020] [Indexed: 06/12/2023]
Abstract
High-throughput multi-omics data generation coupled with heterogeneous genomic data fusion is defining new ways to build computational inference models. These models are scalable and can support very large genome sizes with the added advantage of exploiting additional biological knowledge from the integration framework. However, the limitation of such an arrangement is the huge computational cost involved when learning from very large datasets in a sequential execution environment. To overcome this issue, we present a multiple kernel learning (MKL) based gene regulatory network (GRN) inference approach wherein multiple heterogeneous datasets are fused using the MKL paradigm. We formulate the GRN learning problem as a supervised classification problem, whereby genes regulated by a specific transcription factor are separated from other non-regulated genes. A parallel execution architecture is devised to learn a large-scale GRN by decomposing the initial classification problem into a number of subproblems that run as multiple processes on a multi-processor machine. We evaluate the approach in terms of increased speedup and inference potential using genomic data from Escherichia coli, Saccharomyces cerevisiae and Homo sapiens. The results demonstrate that the proposed method exhibits better classification accuracy and enhanced speedup compared to other state-of-the-art methods while learning large-scale GRNs from multiple and heterogeneous datasets.
|
99
|
Muley VY. Mathematical Programming for Modeling Expression of a Gene Using Gurobi Optimizer to Identify Its Transcriptional Regulators. Methods Mol Biol 2021; 2328:99-113. [PMID: 34251621 DOI: 10.1007/978-1-0716-1534-8_6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/13/2023]
Abstract
The cell expresses various genes in specific contexts with respect to internal and external perturbations to invoke appropriate responses. Transcription factors (TFs) orchestrate and define the expression level of genes by binding to their regulatory regions. Dysregulated expression of TFs often leads to aberrant expression changes of their target genes and is responsible for several diseases including cancers. In the last two decades, several studies experimentally identified target genes of several TFs. However, these studies are limited to a small fraction of the total TFs encoded by an organism, and only to those amenable to experimental settings. Owing to these experimental limitations, many computational techniques have been proposed to predict target genes of TFs. Linear modeling of gene expression is one of the most promising computational approaches, readily applicable to the thousands of expression datasets available in the public domain across diverse phenotypes. Linear models assume that the expression of a gene is the sum of the expression of the TFs regulating it. In this chapter, I introduce mathematical programming for the linear modeling of gene expression, which has certain advantages over conventional statistical modeling approaches. It is fast, scalable to genome level and, most importantly, allows mixed integer programming to tune the model outcome with prior knowledge on gene regulation.
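The linear-model assumption can be illustrated with a minimal sketch in which a gene's expression in each sample is modeled as a weighted sum of TF expression values. Note that this sketch fits the weights by ordinary least squares via the normal equations, not by the mathematical programming formulation or the Gurobi optimizer the chapter describes; the function name and the toy data are assumptions.

```python
def fit_linear_model(tf_expr, gene_expr):
    """Least-squares weights w with gene_expr[s] ~ sum_j w[j] * tf_expr[s][j].

    tf_expr: one row per sample, one column per candidate TF.
    gene_expr: expression of the target gene, one value per sample.
    """
    n = len(tf_expr[0])
    # Normal equations: (X^T X) w = X^T y
    a = [[sum(row[i] * row[j] for row in tf_expr) for j in range(n)]
         for i in range(n)]
    b = [sum(row[i] * y for row, y in zip(tf_expr, gene_expr)) for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("TF profiles are collinear; weights are not unique")
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
                b[r] -= factor * b[col]
    return [b[i] / a[i][i] for i in range(n)]

if __name__ == "__main__":
    # Made-up data: four samples, two TFs; the gene follows 2*TF1 - 1*TF2.
    tf_expr = [[1, 0], [0, 1], [1, 1], [2, 1]]
    gene_expr = [2, -1, 1, 3]
    print(fit_linear_model(tf_expr, gene_expr))
```

A positive fitted weight suggests an activating relationship and a negative weight a repressing one. A mixed integer formulation, as mentioned in the abstract, would additionally allow constraints such as limiting the number of TFs with nonzero weight, which plain least squares cannot express.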
|
100
|
Abstract
Diverse cellular phenotypes are determined by groups of transcription factors (TFs) and other regulators that influence each other's gene expression, forming transcriptional gene regulatory networks (GRNs). In many biological contexts, especially in development and associated diseases, the expression of the genes in GRNs is not static but evolves in time. Modeling the dynamics of GRN state is an important approach for understanding diverse cellular phenomena such as cell-fate specification, pluripotency and cell-fate reprogramming, oncogenesis, and tissue regeneration. In this protocol, we describe how to model GRNs using a data-driven dynamic modeling methodology, gene circuits. Gene circuits do not require knowledge of the GRN topology and connectivity but instead learn them from training data, making them very general and applicable to diverse biological contexts. We utilize the MATLAB-based gene circuit modeling software Fast Inference of Gene Regulation (FIGR) for training the model on quantitative gene expression data and simulating the GRN. We describe all the steps in the modeling life cycle, from formulating the model, training the model using FIGR, and simulating the GRN, to analyzing and interpreting the model output. This protocol highlights these steps with the example of a dynamical model of the gap gene GRN involved in Drosophila segmentation and includes example MATLAB statements for each step.
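As a rough illustration of what simulating a gene circuit involves, the sketch below integrates a commonly used gene circuit form, dx_g/dt = R_g * S(sum_f T[g][f] * x_f + h_g) - lam_g * x_g, with forward Euler steps. This is written in Python rather than MATLAB and does not use FIGR or reproduce its interface; the equation form (without diffusion), the sigmoid, the parameter names, and the toy parameter values are all assumptions for illustration.

```python
import math

def sigmoid(u):
    """Assumed regulation-expression function, rising smoothly from 0 to 1."""
    return 0.5 * (u / math.sqrt(u * u + 1.0) + 1.0)

def simulate(T, h, R, lam, x0, dt=0.01, steps=1000):
    """Forward-Euler integration of the assumed gene circuit equations.

    T[g][f]: regulatory effect of gene f on gene g (negative = repression).
    h[g]: threshold term; R[g]: maximum synthesis rate; lam[g]: decay rate.
    Returns the expression vector after `steps` steps of size `dt`.
    """
    x = list(x0)
    genes = range(len(x))
    for _ in range(steps):
        u = [sum(T[g][f] * x[f] for f in genes) + h[g] for g in genes]
        x = [x[g] + dt * (R[g] * sigmoid(u[g]) - lam[g] * x[g]) for g in genes]
    return x

if __name__ == "__main__":
    # Hypothetical two-gene circuit with mutual repression.
    T = [[0.0, -5.0],
         [-5.0, 0.0]]
    h = [1.0, 1.0]
    final = simulate(T, h, R=[1.0, 1.0], lam=[1.0, 1.0], x0=[0.6, 0.1])
    print(final)
```

In a data-driven workflow the entries of T, h, R and lam would be inferred from expression time series rather than set by hand; the simulation step itself stays the same once parameters are available.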
|