1
|
Zhou M, Li H, Gao B, Zhao Y. The prognostic impact of pathogenic stromal cell-associated genes in lung adenocarcinoma. Comput Biol Med 2024; 178:108692. [PMID: 38879932 DOI: 10.1016/j.compbiomed.2024.108692] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2024] [Revised: 04/22/2024] [Accepted: 06/01/2024] [Indexed: 06/18/2024]
Abstract
BACKGROUND Lung adenocarcinoma (LUAD) stands as the most prevalent subtype among lung cancers. Interactions between stromal and cancer cells influence tumor growth, invasion, and metastasis. However, the regulatory mechanisms of stromal cells in the lung adenocarcinoma tumor microenvironment remain unclear. This study seeks to elucidate the regulatory connections among critical pathogenic genes and their associated expression variations within distinct stromal cell subtypes. METHOD Analysis and investigation were conducted on a total of 114,019 single-cell RNA data and 346 The Cancer Genome Atlas (TCGA) LUAD-related samples using bioinformatics and statistical algorithms. Differential gene expression analysis was performed for tumor samples and controls, followed by Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. Differential genes between stromal cells and other cell clusters were identified and intersected with the differential genes from TCGA. We employed a combination of LASSO regression and multivariable Cox regression to identify the final set of pathogenic genes. Survival models were trained to predict the relationship between patient survival and these pathogenic genes. Analysis of transcription factor (TF) cell specificity and pseudotime trajectories within stromal cell subpopulations revealed that vascular endothelial cells (ECs) and matrix cancer-associated fibroblasts (CAFs) are key in the regulation of the prognosis-associated genes CAV2, COL1A1, TIMP1, ETS2, AKAP12, ID1 and COL1A2. RESULTS Seven pathogenic genes associated with LUAD in stromal cells were identified and used to develop a survival model. High expression of these genes is linked to a greater risk of poor survival. Stromal cells were categorized into eight subtypes and one unannotated cluster. Mesothelial cells, vascular endothelial cells (ECs), and matrix cancer-associated fibroblasts (CAFs) showed cell-specific regulation of the pathogenic genes.
CONCLUSIONS The seven disease-causing genes in vascular ECs and matrix CAFs can be used to detect the survival status of LUAD patients, providing new directions for future targeted drug design.
|
2
|
Ripley DM, Garner T, Stevens A. Developing the 'omic toolkit of comparative physiologists. COMPARATIVE BIOCHEMISTRY AND PHYSIOLOGY. PART D, GENOMICS & PROTEOMICS 2024; 52:101287. [PMID: 38972179 DOI: 10.1016/j.cbd.2024.101287] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2024] [Revised: 06/22/2024] [Accepted: 07/01/2024] [Indexed: 07/09/2024]
Abstract
Typical 'omic analyses reduce complex biological systems to simple lists of supposedly independent variables, failing to account for changes in the wider transcriptional landscape. In this commentary, we discuss the utility of network approaches for incorporating this wider context into the study of physiological phenomena. We highlight opportunities to build on traditional network tools by utilising cutting-edge techniques to account for higher order interactions (i.e. beyond pairwise associations) within datasets, allowing for more accurate models of complex 'omic systems. Finally, we show examples of previous works utilising network approaches to gain additional insight into their organisms of interest. As 'omics grow in both their popularity and breadth of application, so does the requirement for flexible analytical tools capable of interpreting and synthesising complex datasets.
|
3
|
Zhang M, Zhang X, Niu J, Hua C, Liu P, Zhong G. Integrated analysis of single-cell RNA sequencing and bulk RNA data reveals gene regulatory networks and targets in dilated cardiomyopathy. Sci Rep 2024; 14:13942. [PMID: 38886541 PMCID: PMC11183045 DOI: 10.1038/s41598-024-64693-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2023] [Accepted: 06/12/2024] [Indexed: 06/20/2024] Open
Abstract
Dilated cardiomyopathy (DCM) is a common cause of heart failure, thromboembolism, arrhythmias, and sudden cardiac death. The quality of life and long-term survival rates of patients with DCM have greatly improved in recent decades. Nevertheless, the clinical prognosis for DCM patients remains unfavorable. The primary driving factors underlying the pathogenesis of DCM remain incompletely understood. The present study aimed to identify driving factors underlying the pathogenesis of DCM from the perspective of gene regulatory networks. Single-cell RNA sequencing data and bulk RNA data were obtained from the Gene Expression Omnibus (GEO) database. Differential gene analysis, single-cell genomics analysis, and functional enrichment analysis were conducted using R software. The construction of gene regulatory networks was performed using Python. We used the pySCENIC method to analyze the single-cell data and identified 401 regulons. Through variance decomposition, we selected 19 regulons that showed significant responsiveness to DCM. Next, we employed the ssGSEA method to assess regulons in two bulk RNA datasets. Statistically significant differences were observed in 9 and 13 regulons in the two datasets, respectively. By intersecting these differentiated regulons and identifying shared targets that appeared at least twice, we pinpointed three differentially expressed targets across both datasets. In this study, we assessed and identified 19 gene regulatory networks that were responsive to the disease. Furthermore, we validated these networks using two bulk RNA datasets of DCM. The elucidation of dysregulated regulons and targets (CDKN1A, SAT1, ZFP36) enhances the molecular understanding of DCM, aiding in the development of tailored therapies for patients.
|
4
|
Xu C. The Oryza sativa transcriptome responds spatiotemporally to polystyrene nanoplastic stress. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 928:172449. [PMID: 38615784 DOI: 10.1016/j.scitotenv.2024.172449] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2024] [Revised: 03/20/2024] [Accepted: 04/10/2024] [Indexed: 04/16/2024]
Abstract
Nanoplastic represents an emerging abiotic stress facing modern agriculture, impacting global crop production. However, the molecular response of crop plants to this stress remains poorly understood at a spatiotemporal resolution. We therefore used RNA sequencing to profile the transcriptome expressed in rice (Oryza sativa) root and leaf organs at 1, 2, 4, and 8 d post exposure to nanoplastic. We revealed a striking similarity between the rice biomass dynamics in aboveground parts and those in belowground parts during nanoplastic stress, whereas the transcriptomes did not show such similarity. At the global transcriptomic level, a total of 2332 differentially expressed genes were identified, with the majority being spatiotemporally specific, reflecting that nanoplastics predominantly regulate three processes in rice seedlings: (1) down-regulation of chlorophyll biosynthesis, photosynthesis, and starch, sucrose and nitrogen metabolism, (2) activation of defense responses such as brassinosteroid biosynthesis and phenylpropanoid biosynthesis, and (3) modulation of jasmonic acid and cytokinin signaling pathways by transcription factors. Notably, the genes involved in plant-pathogen interaction were shown to be successively modulated by both root and leaf organs, particularly plant disease defense genes (OsWRKY24, OsWRKY53, Os4CL3, OsPAL4, and MPK5), possibly indicating that nanoplastics affect rice growth indirectly through other biota. Finally, we associated biomass phenotypes with the temporal reprogramming of the rice transcriptome by weighted gene co-expression network analysis, noting a significant correlation with photosynthesis, carbon metabolism, and phenylpropanoid biosynthesis that may reflect the mechanisms of biomass reduction. Functional analysis further identified PsbY, MYB, cytochrome P450, and AP2/ERF as hub genes governing these pathways.
Overall, our work provides an understanding of the molecular mechanisms of rice in response to nanoplastics, which in turn suggests how rice might behave in a nanoplastic pollution scenario.
|
5
|
Xie X, Wang F, Wang G, Zhu W, Du X, Wang H. Learning the cellular activity representation based on gene regulatory networks for prediction of tumor response to drugs. Artif Intell Med 2024; 152:102864. [PMID: 38640702 DOI: 10.1016/j.artmed.2024.102864] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Revised: 01/28/2024] [Accepted: 03/30/2024] [Indexed: 04/21/2024]
Abstract
Predicting the response of tumor cells to anti-tumor drugs is critical to realizing cancer precision medicine. Currently, most existing methods ignore the regulatory relationships between genes and thus have unsatisfactory predictive performance. In this paper, we propose to predict anti-tumor drug efficacy via learning the activity representation of tumor cells based on a priori knowledge of gene regulation networks (GRNs). Specifically, the method simulates the cellular biosystem by synthesizing a cell-gene activity network and then infers a new low-dimensional activity representation for tumor cells from the raw high-dimensional expression profile. The simulated cell-gene network mainly comprises known gene regulatory networks collected from multiple resources and fuses tumor cells by linking them to hotspot genes that are over- or under-expressed in them. The resulting activity representation not only reflects the shallow expression profile (hotspot genes) but also mines in-depth information on gene regulation activity in tumor cells before treatment. Finally, we build deep learning models on the activity representation for predicting drug efficacy in tumor cells. Experimental results on the benchmark GDSC dataset demonstrate the superior performance of the proposed method over state-of-the-art (SOTA) methods, with the highest AUC of 0.954 in the efficacy label prediction and the best R2 of 0.834 in the regression of half maximal inhibitory concentration (IC50) values, suggesting the potential value of the proposed method in practice.
|
6
|
Ma X, Li Z, Du Z, Xu Y, Chen Y, Zhuo L, Fu X, Liu R. Advancing cancer driver gene detection via Schur complement graph augmentation and independent subspace feature extraction. Comput Biol Med 2024; 174:108484. [PMID: 38643595 DOI: 10.1016/j.compbiomed.2024.108484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2024] [Revised: 03/18/2024] [Accepted: 04/15/2024] [Indexed: 04/23/2024]
Abstract
Accurately identifying cancer driver genes (CDGs) is crucial for guiding cancer treatment and has recently received great attention from researchers. However, the high complexity and heterogeneity of cancer gene regulatory networks limit the prediction accuracy of existing deep learning models. To address this, we introduce a model called SCIS-CDG that utilizes Schur complement graph augmentation and independent subspace feature extraction techniques to effectively predict potential CDGs. Firstly, a random Schur complement strategy is adopted to generate two augmented views of the gene network within a graph contrastive learning framework. Rapid randomization of the random Schur complement strategy enhances the model's generalization and its ability to handle complex networks effectively. Upholding the Schur complement principle in expectation promotes the preservation of the original gene network's vital structure in the augmented views. Subsequently, we employ feature extraction technology using multiple independent subspaces, each trained with independent weights to reduce inter-subspace dependence and improve the model's expressiveness. Concurrently, we introduce a feature expansion component based on the structure of the gene network to address issues arising from the limited dimensionality of node features. Moreover, it can alleviate the challenges posed by the heterogeneity of cancer gene networks to some extent. Finally, we integrate a learnable attention weight mechanism into the graph neural network (GNN) encoder, utilizing feature expansion technology to optimize the significance of various feature levels in the prediction task. Following extensive experimental validation, the SCIS-CDG model has exhibited high efficiency in identifying known CDGs and uncovering potential unknown CDGs in external datasets. In particular, its performance has improved significantly compared with previous conventional GNN models.
The code and data are publicly available at: https://github.com/mxqmxqmxq/SCIS-CDG.
|
7
|
Li F, Zhu Y, Wang T, Tang J, Huang Y, Gu J, Mai Y, Wang M, Zhang Z, Ning J, Kang B, Wang J, Zhou T, Cui Y, Pan G. Characterization of gene regulatory networks underlying key properties in human hematopoietic stem cell ontogeny. CELL REGENERATION (LONDON, ENGLAND) 2024; 13:9. [PMID: 38630195 PMCID: PMC11024070 DOI: 10.1186/s13619-024-00192-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Accepted: 03/23/2024] [Indexed: 04/20/2024]
Abstract
Human hematopoiesis starts in the early yolk sac and undergoes site- and stage-specific changes over development. The intrinsic mechanism underlying property changes in hematopoiesis ontogeny remains poorly understood. Here, we analyzed the single-cell transcriptomes of human primary hematopoietic stem/progenitor cells (HSPCs) at different developmental stages, including yolk sac (YS), AGM, fetal liver (FL), umbilical cord blood (UCB) and adult peripheral blood (PB) mobilized HSPCs. These stage-specific HSPCs display differential intrinsic properties, such as metabolism, self-renewal, and differentiation potential. We then generated highly correlated gene regulatory network (GRN) modules underlying the differential HSC key properties. In particular, we identified GRNs and key regulators controlling lymphoid potential, self-renewal, and aerobic respiration in human HSCs. Introducing selected regulators promotes key HSC functions in HSPCs derived from human pluripotent stem cells. Therefore, GRNs underlying key intrinsic properties of human HSCs provide a valuable guide to generate fully functional HSCs in vitro.
|
8
|
Mitra S, Sil P, Subbaroyan A, Martin OC, Samal A. Preponderance of generalized chain functions in reconstructed Boolean models of biological networks. Sci Rep 2024; 14:6734. [PMID: 38509145 PMCID: PMC10954731 DOI: 10.1038/s41598-024-57086-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Accepted: 03/14/2024] [Indexed: 03/22/2024] Open
Abstract
Boolean networks (BNs) have been extensively used to model gene regulatory networks (GRNs). The dynamics of BNs depend on the network architecture and regulatory logic rules (Boolean functions (BFs)) associated with nodes. Nested canalyzing functions (NCFs) have been shown to be enriched among the BFs in the large-scale studies of reconstructed Boolean models. The central question we address here is whether that enrichment is due to certain sub-types of NCFs. We build on one sub-type of NCFs, the chain functions (or chain-0 functions) proposed by Gat-Viks and Shamir. First, we propose two other sub-types of NCFs, namely, the class of chain-1 functions and generalized chain functions, the union of the chain-0 and chain-1 types. Next, we find that the fraction of NCFs that are chain-0 (also holds for chain-1) functions decreases exponentially with the number of inputs. We provide analytical treatment for this and other observations on BFs. Then, by analyzing three different datasets of reconstructed Boolean models we find that generalized chain functions are significantly enriched within the NCFs. Lastly we illustrate that upon imposing the constraints of generalized chain functions on three different GRNs we are able to obtain biologically viable Boolean models.
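As an illustration of the nested canalyzing property discussed in this abstract, the sketch below checks by brute force whether a small Boolean function is nested canalyzing. It assumes the usual recursive definition: some input has a value that fixes the output, and the function restricted to the other value of that input is again nested canalyzing, with every input essential. The helper names `truth_table`, `restrict`, and `is_nested_canalyzing` are hypothetical and are not taken from the paper, and the chain-0 and chain-1 sub-types are not implemented here.

```python
from itertools import product


def truth_table(f, k):
    """Map every k-bit input tuple to the 0/1 output of the callable f."""
    return {x: int(f(*x)) for x in product((0, 1), repeat=k)}


def restrict(table, i, a):
    """Truth table of the sub-function obtained by fixing input i to value a."""
    return {x[:i] + x[i + 1:]: v for x, v in table.items() if x[i] == a}


def is_nested_canalyzing(table, k):
    """Brute-force test of the recursive nested canalyzing definition.

    Exponential in k, so only suitable for small numbers of inputs.
    """
    if k == 1:
        # A one-input function qualifies only if its input is essential.
        return table[(0,)] != table[(1,)]
    for i in range(k):
        for a in (0, 1):
            fixed = restrict(table, i, a)
            if len(set(fixed.values())) == 1:  # input i = a fixes the output
                rest = restrict(table, i, 1 - a)
                if is_nested_canalyzing(rest, k - 1):
                    return True
    return False


if __name__ == "__main__":
    and3 = truth_table(lambda x, y, z: x and y and z, 3)
    xor2 = truth_table(lambda x, y: x ^ y, 2)
    print(is_nested_canalyzing(and3, 3))  # True
    print(is_nested_canalyzing(xor2, 2))  # False
```

The exhaustive search over canalyzing inputs keeps the code short; for functions with many inputs a layered or polynomial-form test would be preferable.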
|
9
|
Vahab N, Bonu T, Kuhlmann L, Ramialison M, Tyagi S. Uncovering co-regulatory modules and gene regulatory networks in the heart through machine learning-based analysis of large-scale epigenomic data. Comput Biol Med 2024; 171:108068. [PMID: 38354497 DOI: 10.1016/j.compbiomed.2024.108068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2023] [Revised: 12/30/2023] [Accepted: 01/27/2024] [Indexed: 02/16/2024]
Abstract
The availability of large-scale epigenomic data from various cell types and conditions has yielded valuable insights for evaluating and learning features predicting the co-binding of transcription factors (TFs). However, prior attempts to develop models predicting motif co-occurrence lacked scalability for globally analyzing any motif combination or making cross-species predictions. Moreover, mapping co-regulatory modules (CRMs) to gene regulatory networks (GRNs) is crucial for understanding underlying function. Currently, no comprehensive pipeline exists for large-scale, rapid, and accurate CRM and GRN identification. In this study, we analyzed and evaluated different TF binding characteristics facilitating biologically significant co-binding to identify all potential clusters of co-binding TFs. We curated the UniBind database, containing ChIP-Seq data from over 1983 samples and 232 TFs, and implemented two machine learning models to predict CRMs and the potential regulatory networks they operate on. The two models, a Convolutional Neural Network (CNN) and a Random Forest Classifier (RFC), were used to predict co-binding between TFs and were compared using precision-recall Receiver Operating Characteristic (ROC) curves. CNN outperformed RFC (AUC 0.94 vs. 0.88) and achieved higher F1 scores (0.938 vs. 0.872). The CRMs generated by the clustering algorithm were validated against ChipAtlas and MCOT, revealing additional motifs forming CRMs. We predicted 200k CRMs for 50k+ human genes, validated against recent CRM prediction methods with 100% overlap. Further, we narrowed our focus to study heart-related regulatory motifs, filtering the generated CRMs to report 1784 cardiac CRMs containing at least four cardiac TFs. Identified cardiac CRMs revealed potential novel regulators like ARID3A and RXRB for SCAD, including known TFs like PPARG for F11R.
Our findings highlight the importance of the NKX family of transcription factors in cardiac development and provide potential targets for further investigation in cardiac disease.
|
10
|
Wu S, Jin K, Tang M, Xia Y, Gao W. Inference of Gene Regulatory Networks Based on Multi-view Hierarchical Hypergraphs. Interdiscip Sci 2024:10.1007/s12539-024-00604-3. [PMID: 38342857 DOI: 10.1007/s12539-024-00604-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Revised: 11/26/2023] [Accepted: 01/03/2024] [Indexed: 02/13/2024]
Abstract
Since gene regulation is a complex process in which multiple genes act simultaneously, accurately inferring gene regulatory networks (GRNs) is a long-standing challenge in systems biology. Although graph neural networks can formally describe intricate gene expression mechanisms, current GRN inference methods based on graph learning regard only transcription factor (TF)-target gene interactions as pairwise relationships, and cannot model the many-to-many high-order regulatory patterns that prevail among genes. Moreover, these methods often rely on limited prior regulatory knowledge, ignoring the structural information of GRNs in gene expression profiles. Therefore, we propose a multi-view hierarchical hypergraphs GRN (MHHGRN) inference model. Specifically, multiple heterogeneous biological information is integrated to construct multi-view hierarchical hypergraphs of TFs and target genes, using hypergraph convolution networks to model higher order complex regulatory relationships. Meanwhile, the coupled information diffusion mechanism and the cross-domain messaging mechanism facilitate the information sharing between genes to optimise gene embedding representations. Finally, a unique channel attention mechanism is used to adaptively learn feature representations from multiple views for GRN inference. Experimental results show that MHHGRN achieves better results than the baseline methods on the E. coli and S. cerevisiae benchmark datasets of the DREAM5 challenge, and it has excellent cross-species generalization, achieving comparable or better performance on scRNA-seq datasets from five mouse and two human cell lines.
|
11
|
Vághy MA, Otero-Muras I, Pájaro M, Szederkényi G. A Kinetic Finite Volume Discretization of the Multidimensional PIDE Model for Gene Regulatory Networks. Bull Math Biol 2024; 86:22. [PMID: 38253903 PMCID: PMC10803439 DOI: 10.1007/s11538-023-01251-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Accepted: 12/22/2023] [Indexed: 01/24/2024]
Abstract
In this paper, a finite volume discretization scheme for partial integro-differential equations (PIDEs) describing the temporal evolution of protein distribution in gene regulatory networks is proposed. It is shown that the obtained set of ODEs can be formally represented as a compartmental kinetic system with a strongly connected reaction graph. This allows the application of the theory of nonnegative and compartmental systems for the qualitative analysis of the approximating dynamics. In this framework, it is straightforward to show the existence, uniqueness and stability of equilibria. Moreover, the computation of the stationary probability distribution can be traced back to the solution of linear equations. The discretization scheme is presented separately for one-dimensional and multidimensional models. Illustrative computational examples show the precision of the approach and good agreement with previous results in the literature.
|
12
|
Gutierrez-Tordera L, Papandreou C, Novau-Ferré N, García-González P, Rojas M, Marquié M, Chapado LA, Papagiannopoulos C, Fernàndez-Castillo N, Valero S, Folch J, Ettcheto M, Camins A, Boada M, Ruiz A, Bulló M. Exploring small non-coding RNAs as blood-based biomarkers to predict Alzheimer's disease. Cell Biosci 2024; 14:8. [PMID: 38229129 PMCID: PMC10790437 DOI: 10.1186/s13578-023-01190-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Accepted: 12/27/2023] [Indexed: 01/18/2024] Open
Abstract
BACKGROUND Alzheimer's disease (AD) diagnosis relies on clinical symptoms complemented with biological biomarkers, the Amyloid Tau Neurodegeneration (ATN) framework. Small non-coding RNAs (sncRNAs) in the blood have emerged as potential predictors of AD. We identified sncRNA signatures specific to ATN and AD, and evaluated their contribution to improving AD conversion prediction beyond ATN alone. METHODS This nested case-control study was conducted within the ACE cohort and included MCI patients matched by sex. Patients free of type 2 diabetes underwent cerebrospinal fluid (CSF) and plasma collection and were followed up for a median of 2.45 years. Plasma sncRNAs were profiled using small RNA-sequencing. Conditional logistic and Cox regression analyses with elastic net penalties were performed to identify sncRNA signatures for A+(T|N)+ and AD. Weighted scores were computed using cross-validation, and the association of these scores with AD risk was assessed using multivariable Cox regression models. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses of the identified signatures were performed. RESULTS The study sample consisted of 192 patients, including 96 A+(T|N)+ and 96 A-T-N- patients. We constructed a classification model based on a 6-miRNA signature for ATN. The model could classify MCI patients into A-T-N- and A+(T|N)+ groups with an area under the curve of 0.7335 (95% CI, 0.7327 to 0.7342). However, the addition of the model to conventional risk factors did not improve the prediction of AD beyond the conventional model plus ATN status (C-statistic: 0.805 [95% CI, 0.758 to 0.852] compared to 0.829 [95% CI, 0.786 to 0.872]). The AD-related 15-sncRNA signature exhibited better predictive performance than the conventional model plus ATN status (C-statistic: 0.849 [95% CI, 0.808 to 0.890]). When ATN was included in this model, the prediction further improved to 0.875 (95% CI, 0.840 to 0.910).
The miRNA-target interaction network and functional analysis, including GO and KEGG pathway enrichment analysis, suggested that the miRNAs in both signatures are involved in neuronal pathways associated with AD. CONCLUSIONS The AD-related sncRNA signature holds promise in predicting AD conversion, providing insights into early AD development and potential targets for prevention.
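The C-statistics reported in this abstract measure how well a risk score orders patients by time to event. The sketch below shows one common definition, Harrell's concordance index for right-censored data, as a plain pairwise loop. The abstract does not say which estimator or software the study used, so this is an assumption for illustration only, and the function and argument names are hypothetical.

```python
def concordance_index(times, events, risks):
    """Harrell's C for right-censored survival data.

    times  : follow-up time per patient
    events : 1 if the event was observed, 0 if censored
    risks  : predicted risk score (higher means earlier expected event)
    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's follow-up time.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Toy data: the risk scores order the three patients perfectly.
    print(concordance_index([1, 2, 3], [1, 1, 1], [3.0, 2.0, 1.0]))  # 1.0
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives a scale for reading the reported values of 0.805 to 0.875.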
|
13
|
Manosalva Pérez N, Ferrari C, Engelhorn J, Depuydt T, Nelissen H, Hartwig T, Vandepoele K. MINI-AC: inference of plant gene regulatory networks using bulk or single-cell accessible chromatin profiles. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2024; 117:280-301. [PMID: 37788349 DOI: 10.1111/tpj.16483] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2023] [Revised: 09/13/2023] [Accepted: 09/16/2023] [Indexed: 10/05/2023]
Abstract
Gene regulatory networks (GRNs) represent the interactions between transcription factors (TF) and their target genes. Plant GRNs control transcriptional programs involved in growth, development, and stress responses, ultimately affecting diverse agricultural traits. While recent developments in accessible chromatin (AC) profiling technologies make it possible to identify context-specific regulatory DNA, learning the underlying GRNs remains a major challenge. We developed MINI-AC (Motif-Informed Network Inference based on Accessible Chromatin), a method that combines AC data from bulk or single-cell experiments with TF binding site (TFBS) information to learn GRNs in plants. We benchmarked MINI-AC using bulk AC datasets from different Arabidopsis thaliana tissues and showed that it outperforms other methods to identify correct TFBS. In maize, a crop with a complex genome and abundant distal AC regions, MINI-AC successfully inferred leaf GRNs with experimentally confirmed, both proximal and distal, TF-target gene interactions. Furthermore, we showed that both AC regions and footprints are valid alternatives to infer AC-based GRNs with MINI-AC. Finally, we combined MINI-AC predictions from bulk and single-cell AC datasets to identify general and cell-type specific maize leaf regulators. Focusing on C4 metabolism, we identified diverse regulatory interactions in specialized cell types for this photosynthetic pathway. MINI-AC represents a powerful tool for inferring accurate AC-derived GRNs in plants and identifying known and novel candidate regulators, improving our understanding of gene regulation in plants.
|
14
|
Fox J, Cummins B, Moseley RC, Gameiro M, Haase SB. A yeast cell cycle pulse generator model shows consistency with multiple oscillatory and checkpoint mutant datasets. Math Biosci 2024; 367:109102. [PMID: 37939998 PMCID: PMC10842220 DOI: 10.1016/j.mbs.2023.109102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2023] [Revised: 09/13/2023] [Accepted: 10/27/2023] [Indexed: 11/10/2023]
Abstract
Modeling biological systems holds great promise for speeding up the rate of discovery in systems biology by predicting experimental outcomes and suggesting targeted interventions. However, this process is dogged by an identifiability issue, in which network models and their parameters are not sufficiently constrained by coarse and noisy data to ensure unique solutions. In this work, we evaluated the capability of a simplified yeast cell-cycle network model to reproduce multiple observed transcriptomic behaviors under genomic mutations. We matched time-series data from both cycling and checkpoint arrested cells to model predictions using an asynchronous multi-level Boolean approach. We showed that this single network model, despite its simplicity, is capable of exhibiting dynamical behavior similar to the datasets in most cases, and we demonstrated the drop in severity of the identifiability issue that results from matching multiple datasets.
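To make the asynchronous updating mentioned in this abstract concrete, the sketch below enumerates asynchronous successors in a toy two-gene network. It is a simplification under stated assumptions: the paper uses a multi-level approach, whereas this example is plain two-level Boolean, and the network (A activates B, B represses A), the rule list, and the function name `async_successors` are hypothetical and unrelated to the yeast model.

```python
def async_successors(state, rules):
    """Return all states reachable by updating exactly one node.

    state : tuple of 0/1 values, one per node
    rules : list of callables; rules[i](state) gives node i's target value
    Under asynchronous updating, only one node whose target differs from
    its current value changes per step.
    """
    successors = []
    for i, rule in enumerate(rules):
        target = int(rule(state))
        if target != state[i]:
            successors.append(state[:i] + (target,) + state[i + 1:])
    return successors


# Hypothetical negative feedback loop: node 0 is A, node 1 is B.
RULES = [
    lambda s: not s[1],  # A is on unless B represses it
    lambda s: s[0],      # B follows its activator A
]

if __name__ == "__main__":
    state = (0, 0)
    for _ in range(4):
        print(state, "->", async_successors(state, RULES))
        state = async_successors(state, RULES)[0]
```

In this toy loop every state has exactly one successor, so the trajectory is a four-state cycle; in larger networks a state can have several successors, which is where matching against multiple datasets helps constrain the model.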
|
15
|
Kim H, Choi H, Lee D, Kim J. A review on gene regulatory network reconstruction algorithms based on single cell RNA sequencing. Genes Genomics 2024; 46:1-11. [PMID: 38032470 DOI: 10.1007/s13258-023-01473-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2023] [Accepted: 10/24/2023] [Indexed: 12/01/2023]
Abstract
BACKGROUND Understanding gene regulatory networks (GRNs) is essential for unraveling the molecular mechanisms governing cellular behavior. With the advent of high-throughput transcriptome measurement technology, researchers have aimed to reverse engineer biological systems, extracting gene regulatory rules from their outputs, which are represented by gene expression data. Bulk RNA sequencing, a widely used method for measuring gene expression, has been employed for GRN reconstruction. However, it falls short in capturing dynamic changes in gene expression at the level of individual cells since it averages gene expression across mixed cell populations. OBJECTIVE In this review, we provide an overview of 15 GRN reconstruction tools and discuss their respective strengths and limitations, particularly in the context of single cell RNA sequencing (scRNA-seq). METHODS Recent advancements in scRNA-seq break new ground in GRN reconstruction. They offer snapshots of individual cell transcriptomes and capture dynamic changes. We emphasize how these technological breakthroughs have enhanced GRN reconstruction. CONCLUSION GRN reconstructors can be classified based on their requirement for a cellular trajectory, which represents a dynamical cellular process such as differentiation, aging, or disease progression. Benchmarking studies support the superiority of GRN reconstructors that do not require trajectory analysis in identifying regulator-target relationships. However, methods equipped with trajectory analysis demonstrate better performance in identifying key regulatory factors. In conclusion, researchers should select a suitable GRN reconstructor based on their specific research objectives.
|
16
|
Sigvardsson M. Early B-Cell Factor 1: An Archetype for a Lineage-Restricted Transcription Factor Linking Development to Disease. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2024; 1459:143-156. [PMID: 39017843 DOI: 10.1007/978-3-031-62731-6_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 07/18/2024]
Abstract
The development of highly specialized blood cells from hematopoietic stem cells (HSCs) in the bone marrow (BM) is dependent upon a stringently orchestrated network of stage- and lineage-restricted transcription factors (TFs). Thus, the same stem cell can give rise to various types of differentiated blood cells. One of the key regulators of B-lymphocyte development is early B-cell factor 1 (EBF1). This TF belongs to a small, but evolutionarily conserved, family of proteins that harbor a Zn-coordinating motif and an IPT/TIG (immunoglobulin-like, plexins, transcription factors/transcription factor immunoglobulin) domain, creating a unique DNA-binding domain (DBD). EBF proteins play critical roles in diverse developmental processes, including body segmentation in the Drosophila melanogaster embryo, and retina formation in mice. While several EBF family members are expressed in neuronal cells, adipocytes, and BM stroma cells, only B-lymphoid cells express EBF1. In the absence of EBF1, hematopoietic progenitor cells (HPCs) fail to activate the B-lineage program. This has been attributed to the ability of EBF1 to act as a pioneering factor that can remodel chromatin, thereby creating a B-lymphoid-specific epigenetic landscape. Conditional inactivation of the Ebf1 gene in B-lineage cells has revealed additional functions of this protein in relation to the control of proliferation and apoptosis. This may explain why EBF1 is frequently targeted by mutations in human leukemia cases. This chapter provides an overview of the biochemical and functional properties of the EBF family proteins, with a focus on the roles of EBF1 in normal and malignant B-lymphocyte development.
|
17
|
Cheng J, Cheng M, Lusis AJ, Yang X. Gene Regulatory Networks in Coronary Artery Disease. Curr Atheroscler Rep 2023; 25:1013-1023. [PMID: 38008808 DOI: 10.1007/s11883-023-01170-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 11/09/2023] [Indexed: 11/28/2023]
Abstract
PURPOSE OF REVIEW Coronary artery disease is a complex disorder and the leading cause of mortality worldwide. As technologies for the generation of high-throughput multiomics data have advanced, gene regulatory network modeling has become an increasingly powerful tool in understanding coronary artery disease. This review summarizes recent and novel gene regulatory network tools for bulk tissue and single cell data, existing databases for network construction, and applications of gene regulatory networks in coronary artery disease. RECENT FINDINGS New gene regulatory network tools can integrate multiomics data to elucidate complex disease mechanisms at unprecedented cellular and spatial resolutions. At the same time, updates to coronary artery disease expression data in existing databases have enabled researchers to build gene regulatory networks to study novel disease mechanisms. Gene regulatory networks have proven extremely useful in understanding CAD heritability beyond what is explained by GWAS loci and in identifying mechanisms and key driver genes underlying disease onset and progression. Gene regulatory networks can holistically and comprehensively address the complex nature of coronary artery disease. In this review, we discuss key algorithmic approaches to construct gene regulatory networks and highlight state-of-the-art methods that model specific modes of gene regulation. We also explore recent applications of these tools in coronary artery disease patient data repositories to understand disease heritability and shared and distinct disease mechanisms and key driver genes across tissues, between sexes, and between species.
Grants
- DK120342, HL148577, and HL147883 (AJL). NS111378, NS117148, HL147883 (XY) NIH HHS
|
18
|
Hsiao YC, Dutta A. Nonlinear control designs and their application to cancer differentiation therapy. Math Biosci 2023; 366:109105. [PMID: 37944795 DOI: 10.1016/j.mbs.2023.109105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2023] [Revised: 10/29/2023] [Accepted: 10/31/2023] [Indexed: 11/12/2023]
Abstract
We designed three new controllers for cancer differentiation therapy: a sigmoid-based controller, a polynomial dynamic inversion-based controller, and a proportional-integral-derivative (PID) impulsive controller. We compared these three controllers to existing control strategies to show the improvement in performance and to compare their robustness. The sigmoid-based controller adds a sigmoid term associated with the error of the controlled state and a selected observed state. The sigmoid term is multiplied by a control gain, thereby decreasing the control effort for state transition. The polynomial dynamic inversion-based controller adds a cubic error term to the error dynamics, aiming to achieve a shorter convergence time to the desired value of the controlled state. The PID impulsive controller considers the accumulated controlled state error and the rate of change of the controlled state error, thereby forcing the controlled state to converge to the desired value and alleviating the damping effect in the steady state. For the considered cancer network, the three new cancer control strategies exhibit superior and robust performance. The PID impulsive controller has a significant improvement in robustness compared to the impulsive controller and has greater potential for cancer differentiation therapy.
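As a rough illustration of the PID idea described in this abstract (a control signal built from the current error, the accumulated error, and the rate of change of the error), the following minimal Python sketch applies a plain discrete-time PID law to a toy first-order plant. The plant `dx/dt = -x + u`, the gains, the step size, and the name `simulate_pid` are all assumptions chosen for illustration; this is not the authors' impulsive controller and not their cancer network model.

```python
def simulate_pid(target, kp=2.0, ki=1.0, kd=0.1, dt=0.01, steps=2000):
    """Drive a toy plant dx/dt = -x + u toward `target` with a PID law.

    All parameters are illustrative assumptions, not values from the paper.
    """
    x = 0.0                    # controlled state
    integral = 0.0             # accumulated error (integral term)
    prev_error = target - x    # previous error, used for the derivative term
    for _ in range(steps):
        error = target - x
        integral += error * dt
        derivative = (error - prev_error) / dt
        # Control effort: proportional + integral + derivative contributions.
        u = kp * error + ki * integral + kd * derivative
        # Forward-Euler update of the toy plant.
        x += (-x + u) * dt
        prev_error = error
    return x


final_state = simulate_pid(target=1.0)
print(f"final state: {final_state:.4f}")
```

In this sketch the integral term is what removes the steady-state offset that a purely proportional law would leave on this plant, while the derivative term damps the approach to the target.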
|
19
|
Cingiz MÖ. k-Strong Inference Algorithm: A Hybrid Information Theory Based Gene Network Inference Algorithm. Mol Biotechnol 2023:10.1007/s12033-023-00929-2. [PMID: 37950851 DOI: 10.1007/s12033-023-00929-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/23/2023] [Accepted: 10/05/2023] [Indexed: 11/13/2023]
Abstract
Gene networks allow researchers to understand the underlying mechanisms between diseases and genes while reducing the need for wet lab experiments. Numerous gene network inference (GNI) algorithms have been presented in the literature to infer accurate gene networks. We proposed a hybrid GNI algorithm, k-Strong Inference Algorithm (ksia), to infer more reliable and robust gene networks from omics datasets. To increase reliability, ksia integrates Pearson correlation coefficient (PCC) and Spearman rank correlation coefficient (SCC) scores to determine mutual information scores between molecules, which increases the diversity of relation predictions. To infer a more robust gene network, ksia applies three different elimination steps to remove redundant and spurious relations between genes. The performance of ksia was evaluated on a microbe microarray database in an overlap analysis with other GNI algorithms, namely ARACNE, C3NET, CLR, and MRNET. Ksia inferred fewer relations due to its strict elimination steps. However, ksia generally performed better on Escherichia coli (E. coli) and Saccharomyces cerevisiae (yeast) gene expression datasets in terms of F-measure and precision values. The integration of association estimator scores and three elimination stages slightly increases the performance of ksia-based gene networks. The ksia R package and its user manual are available at https://github.com/ozgurcingiz/ksia .
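The precision and F-measure mentioned in this abstract are standard ways to score an inferred network against a reference network. The sketch below shows one way to compute them in Python, treating edges as undirected gene pairs. The function names, the gene labels, and the toy edge lists are hypothetical, and this is not the evaluation code of the ksia package.

```python
def edge_set(pairs):
    """Normalize gene pairs into undirected edges so (a, b) equals (b, a)."""
    return {frozenset(pair) for pair in pairs}


def score_network(predicted, reference):
    """Return (precision, recall, F-measure) of predicted edges vs. a reference."""
    pred, ref = edge_set(predicted), edge_set(reference)
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical toy networks, for illustration only.
predicted = [("geneA", "geneB"), ("geneB", "geneC"), ("geneD", "geneA")]
reference = [("geneB", "geneA"), ("geneC", "geneB"),
             ("geneC", "geneD"), ("geneD", "geneE")]

precision, recall, f_measure = score_network(predicted, reference)
print(f"precision={precision:.3f} recall={recall:.3f} F={f_measure:.3f}")
```

A method that predicts fewer edges after strict filtering can still score well on precision, since precision depends only on the fraction of predicted edges that are correct.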
|
20
|
Pulver C, Grun D, Duc J, Sheppard S, Planet E, Coudray A, de Fondeville R, Pontis J, Trono D. Statistical learning quantifies transposable element-mediated cis-regulation. Genome Biol 2023; 24:258. [PMID: 37950299 PMCID: PMC10637000 DOI: 10.1186/s13059-023-03085-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2022] [Accepted: 10/09/2023] [Indexed: 11/12/2023] Open
Abstract
BACKGROUND Transposable elements (TEs) have colonized the genomes of most metazoans, and many TE-embedded sequences function as cis-regulatory elements (CREs) for genes involved in a wide range of biological processes from early embryogenesis to innate immune responses. Because of their repetitive nature, TEs have the potential to form CRE platforms enabling the coordinated and genome-wide regulation of protein-coding genes by only a handful of trans-acting transcription factors (TFs). RESULTS Here, we directly test this hypothesis through mathematical modeling and demonstrate that differences in expression at protein-coding genes alone are sufficient to estimate the magnitude and significance of TE-contributed cis-regulatory activities, even in contexts where TE-derived transcription fails to do so. We leverage hundreds of overexpression experiments and estimate that, overall, gene expression is influenced by TE-embedded CREs situated within approximately 500 kb of promoters. Focusing on the cis-regulatory potential of TEs within the gene regulatory network of human embryonic stem cells, we find that pluripotency-specific and evolutionarily young TE subfamilies can be reactivated by TFs involved in post-implantation embryogenesis. Finally, we show that TE subfamilies can be split into truly regulatorily active versus inactive fractions based on additional information such as matched epigenomic data, observing that TF binding may better predict TE cis-regulatory activity than differences in histone marks. CONCLUSION Our results suggest that TE-embedded CREs contribute to gene regulation during and beyond gastrulation. On a methodological level, we provide a statistical tool that infers TE-dependent cis-regulation from RNA-seq data alone, thus facilitating the study of TEs in the next-generation sequencing era.
|
21
|
Xu B, Hwangbo DS, Saurabh S, Rosensweig C, Allada R, Kath WL, Braun R. Temperature-driven coordination of circadian transcriptome regulation. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.10.27.563979. [PMID: 37961403 PMCID: PMC10634908 DOI: 10.1101/2023.10.27.563979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/15/2023]
Abstract
The circadian rhythm is an evolutionarily conserved molecular oscillator that enables species to anticipate rhythmic changes in their environment. At a molecular level, the core clock genes induce a circadian oscillation in thousands of genes in a tissue-specific manner, orchestrating myriad biological processes. While studies have investigated how the core clock circuit responds to environmental perturbations such as temperature, the downstream effects of such perturbations on circadian regulation remain poorly understood. By analyzing bulk RNA sequencing of Drosophila fat bodies harvested from flies subjected to different environmental conditions, we demonstrate a highly condition-specific circadian transcriptome. Further employing a reference-based gene regulatory network (Reactome), we find evidence of increased gene-gene coordination at low temperatures and synchronization of rhythmic genes that are network neighbors. Our results point to the mechanisms by which the circadian clock mediates the fly's response to seasonal changes in temperature.
|
22
|
Nemati Bajestan M, Piroozkhah M, Chaleshi V, Ghiasi NE, Jamshidi N, Mirfakhraie R, Balaii H, Shahrokh S, Asadzadeh Aghdaei H, Salehi Z, Nazemalhosseini Mojarad E. Expression Analysis of Long Noncoding RNA-MALAT1 and Interleukin-6 in Inflammatory Bowel Disease Patients. IRANIAN JOURNAL OF ALLERGY, ASTHMA, AND IMMUNOLOGY 2023; 22:482-494. [PMID: 38085149 DOI: 10.18502/ijaai.v22i5.13997] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/05/2022] [Accepted: 07/29/2023] [Indexed: 12/18/2023]
Abstract
Inflammatory bowel disease (IBD) manifests as chronic inflammation within the gastrointestinal tract. The study focuses on a long noncoding RNA (lncRNA) known as Metastasis-associated lung adenocarcinoma transcript 1 (MALAT1). MALAT1 regulates proinflammatory cytokines, and its misregulation has been linked with various autoimmune diseases. The role of IL6 in immune-triggered conditions, including IBD, is another focal point. In this research, the expression of MALAT1 and IL6 in IBD patients was analyzed to uncover potential interactions. The study involved 33 IBD patients (13 with Crohn's disease and 20 with ulcerative colitis) and 20 healthy counterparts. Quantitative real-time polymerase chain reaction determined the MALAT1 and IL6 gene expression levels. The competitive endogenous RNA (ceRNA) regulatory network was constructed using several tools, including LncRRIsearch and Cytoscape. The Inflammatory Bowel Disease database was examined to understand IL6's role in IBD. Drugs potentially targeting these genes were also identified using DGIdb. Results indicated a notable elevation in the expression levels of MALAT1 and IL6 in IBD patients versus healthy controls. MALAT1 and IL6 did not show a direct linear correlation, but IL6 could serve as MALAT1's target. Analyses unveiled interactions between MALAT1 and IL6, regulated by hsa-miR-202-3p, hsa-miR-1-3p, and hsa-miR-9-5p. IL6's pivotal role in IBD-associated inflammation, likely interacting with other cytokines, was accentuated. Moreover, potential drugs like CILOBRADINE for MALAT1 and SILTUXIMAB for IL6 were identified. This research underscored MALAT1 and IL6's potential value as targets in diagnosis and treatment for IBD patients.
|
23
|
Ovadia S, Cui G, Elkon R, Cohen-Gulkar M, Zuk-Bar N, Tuoc T, Jing N, Ashery-Padan R. SWI/SNF complexes are required for retinal pigmented epithelium differentiation and for the inhibition of cell proliferation and neural differentiation programs. Development 2023; 150:dev201488. [PMID: 37522516 PMCID: PMC10482007 DOI: 10.1242/dev.201488] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/27/2022] [Accepted: 07/14/2023] [Indexed: 08/01/2023]
Abstract
During embryonic development, tissue-specific transcription factors and chromatin remodelers function together to ensure gradual, coordinated differentiation of multiple lineages. Here, we define this regulatory interplay in the developing retinal pigmented epithelium (RPE), a neuroectodermal lineage essential for the development, function and maintenance of the adjacent retina. We present a high-resolution spatial transcriptomic atlas of the developing mouse RPE and the adjacent ocular mesenchyme obtained by geographical position sequencing (Geo-seq) of a single developmental stage of the eye that encompasses young and more mature ocular progenitors. These transcriptomic data, available online, reveal the key transcription factors and their gene regulatory networks during RPE and ocular mesenchyme differentiation. Moreover, conditional inactivation followed by Geo-seq revealed that this differentiation program is dependent on the activity of SWI/SNF complexes, shown here to control the expression and activity of RPE transcription factors and, at the same time, inhibit neural progenitor and cell proliferation genes. The findings reveal the roles of the SWI/SNF complexes in controlling the intersection between RPE and neural cell fates and the coupling of cell-cycle exit and differentiation.
|
24
|
Owen LJ, Rainger J, Bengani H, Kilanowski F, FitzPatrick DR, Papanastasiou AS. Characterization of an eye field-like state during optic vesicle organoid development. Development 2023; 150:dev201432. [PMID: 37306293 PMCID: PMC10445745 DOI: 10.1242/dev.201432] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/07/2022] [Accepted: 06/02/2023] [Indexed: 06/13/2023]
Abstract
Specification of the eye field (EF) within the neural plate marks the earliest detectable stage of eye development. Experimental evidence, primarily from non-mammalian model systems, indicates that the stable formation of this group of cells requires the activation of a set of key transcription factors. This crucial event is challenging to probe in mammals and, quantitatively, little is known regarding the regulation of the transition of cells to this ocular fate. Using optic vesicle organoids to model the onset of the EF, we generate time-course transcriptomic data allowing us to identify dynamic gene expression programmes that characterize this cellular-state transition. Integrating this with chromatin accessibility data suggests a direct role of canonical EF transcription factors in regulating these gene expression changes, and highlights candidate cis-regulatory elements through which these transcription factors act. Finally, we begin to test a subset of these candidate enhancer elements, within the organoid system, by perturbing the underlying DNA sequence and measuring transcriptomic changes during EF activation.
|
25
|
Wang XM, Ming K, Wang S, Wang J, Li PL, Tian RF, Liu SY, Cheng X, Chen Y, Shi W, Wan J, Hu M, Tian S, Zhang X, She ZG, Li H, Ding Y, Zhang XJ. Network-based analysis identifies key regulatory transcription factors involved in skin aging. Exp Gerontol 2023; 178:112202. [PMID: 37178875 DOI: 10.1016/j.exger.2023.112202] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2023] [Revised: 05/07/2023] [Accepted: 05/08/2023] [Indexed: 05/15/2023]
Abstract
Skin aging is a complex process involving intricate genetic and environmental factors. In this study, we performed a comprehensive analysis of the transcriptional regulatory landscape of skin aging in canines. Weighted Gene Co-expression Network Analysis (WGCNA) was employed to identify aging-related gene modules. We subsequently validated the expression changes of these module genes in single-cell RNA sequencing (scRNA-seq) data of human aging skin. Notably, basal cell (BC), spinous cell (SC), mitotic cell (MC), and fibroblast (FB) were identified as the cell types with the most significant gene expression changes during aging. By integrating GENIE3 and RcisTarget, we constructed gene regulation networks (GRNs) for aging-related modules and identified core transcription factors (TFs) by intersecting significantly enriched TFs within the GRNs with hub TFs from WGCNA analysis, revealing key regulators of skin aging. Furthermore, we demonstrated the conserved role of CTCF and RAD21 in skin aging using an H2O2-stimulated cell aging model in HaCaT cells. Our findings provide new insights into the transcriptional regulatory landscape of skin aging and unveil potential targets for future intervention strategies against age-related skin disorders in both canines and humans.
|