1
Musilova J, Vafek Z, Puniya BL, Zimmer R, Helikar T, Sedlar K. Augusta: From RNA-Seq to gene regulatory networks and Boolean models. Comput Struct Biotechnol J 2024; 23:783-790. [PMID: 38312198 PMCID: PMC10837063 DOI: 10.1016/j.csbj.2024.01.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Revised: 01/17/2024] [Accepted: 01/19/2024] [Indexed: 02/06/2024] Open
Abstract
Computational models of gene regulation help to understand regulatory mechanisms and are extensively used in a wide range of areas, e.g., biotechnology or medicine, with significant benefits. Unfortunately, there are only a few computational gene regulatory models of whole genomes allowing static and dynamic analysis due to the lack of sophisticated tools for their reconstruction. Here, we describe Augusta, an open-source Python package for Gene Regulatory Network (GRN) and Boolean Network (BN) inference from high-throughput gene expression data. Augusta can reconstruct genome-wide models suitable for static and dynamic analyses. Augusta uses a unique approach where the first estimation of a GRN inferred from expression data is further refined by predicting transcription factor binding motifs in promoters of regulated genes and by incorporating verified interactions obtained from databases. Moreover, a refined GRN is transformed into a draft BN by searching in the curated model database and setting logical rules to incoming edges of target genes, which can be further manually edited as the model is provided in the SBML file format. The approach is applicable even if information about the organism under study is not available in the databases, which is typically the case for non-model organisms including most microbes. Augusta can be operated from the command line and, thus, is easy to use for automated prediction of models for various genomes. The Augusta package is freely available at github.com/JanaMus/Augusta. Documentation and tutorials are available at augusta.readthedocs.io.
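As a rough illustration of what turning a signed GRN into a draft Boolean network can look like, here is a minimal Python sketch. It does not use Augusta's API; the gene names, the edge signs, and the update rule (a gene is ON when at least one activator is ON and no inhibitor is ON) are all assumptions chosen for the example.

```python
from itertools import product

# Hypothetical signed GRN: target -> list of (regulator, sign).
# sign +1 means activation, -1 means inhibition. Not taken from the paper.
GRN = {
    "geneA": [("geneC", -1)],
    "geneB": [("geneA", +1)],
    "geneC": [("geneA", +1), ("geneB", +1)],
}


def next_state(state, grn):
    """Apply one synchronous update to every gene.

    Assumed rule: a gene is ON when at least one activator is ON and no
    inhibitor is ON. A gene with no activators is ON when all of its
    inhibitors are OFF.
    """
    new = {}
    for gene, edges in grn.items():
        activators = [state[reg] for reg, sign in edges if sign > 0]
        inhibitors = [state[reg] for reg, sign in edges if sign < 0]
        activated = any(activators) if activators else True
        new[gene] = activated and not any(inhibitors)
    return new


def find_fixed_points(grn):
    """Enumerate all states and keep those that the update leaves unchanged."""
    genes = sorted(grn)
    fixed = []
    for values in product([False, True], repeat=len(genes)):
        state = dict(zip(genes, values))
        if next_state(state, grn) == state:
            fixed.append(state)
    return fixed


if __name__ == "__main__":
    start = {"geneA": True, "geneB": False, "geneC": False}
    print(next_state(start, GRN))
    print(find_fixed_points(GRN))
```

Exhaustive enumeration is only practical for a handful of genes; genome-wide models of the kind the abstract describes need dedicated simulation tools.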
Affiliation(s)
- Jana Musilova
  - Department of Biomedical Engineering, Faculty of Electrical Engineering and Communication, Brno University of Technology, Brno 61600, Czech Republic
  - Department of Biochemistry, University of Nebraska-Lincoln, Lincoln 68588, NE, USA
- Zdenek Vafek
  - Department of Biochemistry, University of Nebraska-Lincoln, Lincoln 68588, NE, USA
  - Institute of Forensic Engineering, Brno University of Technology, Brno 61200, Czech Republic
- Bhanwar Lal Puniya
  - Department of Biochemistry, University of Nebraska-Lincoln, Lincoln 68588, NE, USA
- Ralf Zimmer
  - Department of Informatics, Ludwig-Maximilians-Universität München, Munich 80539, Germany
- Tomas Helikar
  - Department of Biochemistry, University of Nebraska-Lincoln, Lincoln 68588, NE, USA
- Karel Sedlar
  - Department of Biomedical Engineering, Faculty of Electrical Engineering and Communication, Brno University of Technology, Brno 61600, Czech Republic
  - Department of Informatics, Ludwig-Maximilians-Universität München, Munich 80539, Germany
2
Castano-Duque L, Lebar MD, Mack BM, Lohmar JM, Carter-Wientjes C. Investigating the Impact of Flavonoids on Aspergillus flavus: Insights into Cell Wall Damage and Biofilms. J Fungi (Basel) 2024; 10:665. [PMID: 39330424 PMCID: PMC11433479 DOI: 10.3390/jof10090665] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2024] [Revised: 09/19/2024] [Accepted: 09/20/2024] [Indexed: 09/28/2024] Open
Abstract
Aspergillus flavus, a fungus known for producing aflatoxins, poses significant threats to agriculture and global health. Flavonoids, plant-derived compounds, inhibit A. flavus proliferation and mitigate aflatoxin production, although the precise molecular and physical mechanisms underlying these effects remain poorly understood. In this study, we investigated three flavonoids (apigenin, luteolin, and quercetin) applied to A. flavus NRRL 3357. We determined the following: (1) glycosylated luteolin led to a 10% reduction in maximum fungal growth capacity; (2) quercetin affected cell wall integrity by triggering extreme mycelial collapse, while apigenin and luteolin caused peeling of the outer layer of the cell wall; (3) luteolin exhibited the highest antioxidant capacity in the environment compared to apigenin and quercetin; (4) osmotic stress assays did not reveal morphological defects; (5) flavonoids promoted cell adherence, a precursor for biofilm formation; and (6) RNA sequencing analysis revealed that flavonoids impact expression of putative cell wall and plasma membrane biosynthesis genes. Our findings suggest that the differential effects of quercetin, luteolin, and apigenin on membrane integrity and biofilm formation may be driven by their interactions with fungal cell walls. These insights may inform the development of novel antifungal additives or plant breeding strategies focusing on plant-derived compounds in crop protection.
Affiliation(s)
- Lina Castano-Duque
- United States Department of Agriculture—Agriculture Research Services, New Orleans, LA 70124, USA; (M.D.L.); (B.M.M.); (J.M.L.); (C.C.-W.)
3
Hsiao YC, Dutta A. Network Modeling and Control of Dynamic Disease Pathways, Review and Perspectives. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:1211-1230. [PMID: 38498762 DOI: 10.1109/tcbb.2024.3378155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/20/2024]
Abstract
Dynamic disease pathways are a combination of complex dynamical processes among bio-molecules in a cell that leads to diseases. Network modeling of disease pathways considers disease-related bio-molecules (e.g. DNA, RNA, transcription factors, enzymes, proteins, and metabolites) and their interactions (e.g. DNA methylation, histone modification, alternative splicing, and protein modification) to study disease progression and predict therapeutic responses. These bio-molecules and their interactions are the basic elements in the study of the misregulation of disease-related gene expression that leads to abnormal cellular responses. Gene regulatory networks, cell signaling networks, and metabolic networks are the three major types of intracellular networks for the study of the cellular responses elicited by extracellular signals. The disease-related cellular responses can be prevented or regulated by designing control strategies to manipulate these extracellular or other intracellular signals. The paper reviews the regulatory mechanisms, the dynamic models, and the control strategies for each intracellular network. The applications, limitations, and prospects for modeling and control are also discussed.
4
Koch D, Nandan A, Ramesan G, Koseska A. Biological computations: Limitations of attractor-based formalisms and the need for transients. Biochem Biophys Res Commun 2024; 720:150069. [PMID: 38754165 DOI: 10.1016/j.bbrc.2024.150069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Revised: 04/15/2024] [Accepted: 05/07/2024] [Indexed: 05/18/2024]
Abstract
Living systems, from single cells to higher vertebrates, receive a continuous stream of non-stationary inputs that they sense, e.g., via cell surface receptors or sensory organs. By integrating this time-varying, multi-sensory, and often noisy information with memory using complex molecular or neuronal networks, they generate a variety of responses beyond simple stimulus-response association, including avoidance behavior, life-long learning or social interactions. In a broad sense, these processes can be understood as a type of biological computation. Taking as a basis generic features of biological computations, such as real-time responsiveness or robustness and flexibility of the computation, we highlight the limitations of the current attractor-based framework for understanding computations in biological systems. We argue that frameworks based on transient dynamics away from attractors are better suited for the description of computations performed by neuronal and signaling networks. In particular, we discuss how quasi-stable transient dynamics from ghost states that emerge at criticality have a promising potential for developing an integrated framework of computations that can help us understand how living systems actively process information and learn from their continuously changing environment.
Affiliation(s)
- Daniel Koch
  - Lise Meitner Group Cellular Computations and Learning, Max Planck Institute for Neurobiology of Behaviour - Caesar, Bonn, Germany
- Akhilesh Nandan
  - Lise Meitner Group Cellular Computations and Learning, Max Planck Institute for Neurobiology of Behaviour - Caesar, Bonn, Germany
- Gayathri Ramesan
  - Lise Meitner Group Cellular Computations and Learning, Max Planck Institute for Neurobiology of Behaviour - Caesar, Bonn, Germany
- Aneta Koseska
  - Lise Meitner Group Cellular Computations and Learning, Max Planck Institute for Neurobiology of Behaviour - Caesar, Bonn, Germany
5
Jia C, Grima R. Holimap: an accurate and efficient method for solving stochastic gene network dynamics. Nat Commun 2024; 15:6557. [PMID: 39095346 PMCID: PMC11297302 DOI: 10.1038/s41467-024-50716-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2024] [Accepted: 07/13/2024] [Indexed: 08/04/2024] Open
Abstract
Gene-gene interactions are crucial to the control of sub-cellular processes but our understanding of their stochastic dynamics is hindered by the lack of simulation methods that can accurately and efficiently predict how the distributions of gene product numbers vary across parameter space. To overcome these difficulties, here we present Holimap (high-order linear-mapping approximation), an approach that approximates the protein or mRNA number distributions of a complex gene regulatory network by the distributions of a much simpler reaction system. We demonstrate Holimap's computational advantages over conventional methods by applying it to predict the stochastic time-dependent dynamics of various gene networks, including transcriptional networks ranging from simple autoregulatory loops to complex randomly connected networks, post-transcriptional networks, and post-translational networks. Holimap is ideally suited to study how the intricate network of gene-gene interactions results in precise coordination and control of gene expression.
Affiliation(s)
- Chen Jia
  - Applied and Computational Mathematics Division, Beijing Computational Science Research Center, Beijing, China
- Ramon Grima
  - School of Biological Sciences, University of Edinburgh, Edinburgh, UK
6
Huang X, Zhang H. Detecting responsible nodes in differential Bayesian networks. Stat Med 2024; 43:3294-3312. [PMID: 38831542 DOI: 10.1002/sim.10125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2023] [Revised: 03/25/2024] [Accepted: 05/18/2024] [Indexed: 06/05/2024]
Abstract
To study the roles that different nodes play in differentiating Bayesian networks under two states, such as control versus disease, we formulate two node-specific scores to facilitate such assessment. The first score is motivated by the prediction invariance property of a causal model. The second score results from modifying an existing score constructed for differential analysis of undirected networks. We develop strategies based on these scores to identify nodes responsible for topological differences between two Bayesian networks. Synthetic data and real-life data from designed experiments are used to demonstrate the efficacy of the proposed methods in detecting responsible nodes.
Affiliation(s)
- Xianzheng Huang
  - Department of Statistics, University of South Carolina, Columbia, South Carolina, USA
- Hongmei Zhang
  - Division of Epidemiology, Biostatistics, and Environmental Health, School of Public Health, University of Memphis, Memphis, Tennessee
7
Zhou X, Pan J, Chen L, Zhang S, Chen Y. DeepIMAGER: Deeply Analyzing Gene Regulatory Networks from scRNA-seq Data. Biomolecules 2024; 14:766. [PMID: 39062480 PMCID: PMC11274664 DOI: 10.3390/biom14070766] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2024] [Revised: 06/22/2024] [Accepted: 06/25/2024] [Indexed: 07/28/2024] Open
Abstract
Understanding the dynamics of gene regulatory networks (GRNs) across diverse cell types poses a challenge yet holds immense value in unraveling the molecular mechanisms governing cellular processes. Current computational methods, which rely solely on expression changes from bulk RNA-seq and/or scRNA-seq data, often result in high rates of false positives and low precision. Here, we introduce an advanced computational tool, DeepIMAGER, for inferring cell-specific GRNs through deep learning and data integration. DeepIMAGER employs a supervised approach that transforms the co-expression patterns of gene pairs into image-like representations and leverages transcription factor (TF) binding information for model training. It is trained using comprehensive datasets that encompass scRNA-seq profiles and ChIP-seq data, capturing TF-gene pair information across various cell types. Comprehensive validations on six cell lines show DeepIMAGER exhibits superior performance in ten popular GRN inference tools and has remarkable robustness against dropout-zero events. DeepIMAGER was applied to scRNA-seq datasets of multiple myeloma (MM) and detected potential GRNs for TFs of RORC, MITF, and FOXD2 in MM dendritic cells. This technical innovation, combined with its capability to accurately decode GRNs from scRNA-seq, establishes DeepIMAGER as a valuable tool for unraveling complex regulatory networks in various cell types.
Affiliation(s)
- Xiguo Zhou
  - College of Computer and Information Engineering, Tianjin Normal University, Tianjin 300387, China; (X.Z.); (J.P.); (L.C.)
- Jingyi Pan
  - College of Computer and Information Engineering, Tianjin Normal University, Tianjin 300387, China; (X.Z.); (J.P.); (L.C.)
- Liang Chen
  - College of Computer and Information Engineering, Tianjin Normal University, Tianjin 300387, China; (X.Z.); (J.P.); (L.C.)
- Shaoqiang Zhang
  - College of Computer and Information Engineering, Tianjin Normal University, Tianjin 300387, China; (X.Z.); (J.P.); (L.C.)
- Yong Chen
  - Department of Biological and Biomedical Sciences, Rowan University, Glassboro, NJ 08028, USA
8
Deepika, Madhu, Upadhyay SK. Deciphering the features and functions of serine/arginine protein kinases in bread wheat. Plant Gene 2024; 38:100451. [DOI: 10.1016/j.plgene.2024.100451] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 10/09/2024]
9
Wu H, Shi W, Wang MD. Developing a novel causal inference algorithm for personalized biomedical causal graph learning using meta machine learning. BMC Med Inform Decis Mak 2024; 24:137. [PMID: 38802809 PMCID: PMC11129385 DOI: 10.1186/s12911-024-02510-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Accepted: 04/15/2024] [Indexed: 05/29/2024] Open
Abstract
BACKGROUND Modeling causality through graphs, referred to as causal graph learning, offers an appropriate description of the dynamics of causality. The majority of current machine learning models in clinical decision support systems only predict associations between variables, whereas causal graph learning models causality dynamics through graphs. However, building personalized causal graphs for each individual is challenging due to the limited amount of data available for each patient. METHOD In this study, we present a new algorithmic framework using meta-learning for learning personalized causal graphs in biomedicine. Our framework extracts common patterns from multiple patient graphs and applies this information to develop individualized graphs. In multi-task causal graph learning, the proposed optimized initial guess of shared commonality enables the rapid adoption of knowledge to new tasks for efficient causal graph learning. RESULTS Experiments on one real-world biomedical causal graph learning benchmark data and four synthetic benchmarks show that our algorithm outperformed the baseline methods. Our algorithm can better understand the underlying patterns in the data, leading to more accurate predictions of the causal graph. Specifically, we reduce the structural hamming distance by 50-75%, indicating an improvement in graph prediction accuracy. Additionally, the false discovery rate is decreased by 20-30%, demonstrating that our algorithm made fewer incorrect predictions compared to the baseline algorithms. CONCLUSION To the best of our knowledge, this is the first study to demonstrate the effectiveness of meta-learning in personalized causal graph learning and cause inference modeling for biomedicine. In addition, the proposed algorithm can also be generalized to transnational research areas where integrated analysis is necessary for various distributions of datasets, including different clinical institutions.
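The two metrics reported in the RESULTS, structural Hamming distance (SHD) and false discovery rate (FDR), can be sketched in Python as below. This is an illustrative reading rather than the authors' evaluation code: the toy edge lists are hypothetical, and counting a reversed edge as a single error is one common SHD convention (some implementations count it as two).

```python
def _group_by_pair(edges):
    """Group directed edges by their unordered node pair."""
    grouped = {}
    for u, v in edges:
        grouped.setdefault(frozenset((u, v)), set()).add((u, v))
    return grouped


def structural_hamming_distance(true_edges, pred_edges):
    """Count node pairs whose edge status differs between the two graphs.

    A missing, extra, or reversed edge each contributes 1 (assumed convention).
    """
    true_pairs = _group_by_pair(true_edges)
    pred_pairs = _group_by_pair(pred_edges)
    all_pairs = true_pairs.keys() | pred_pairs.keys()
    return sum(1 for pair in all_pairs if true_pairs.get(pair) != pred_pairs.get(pair))


def false_discovery_rate(true_edges, pred_edges):
    """Fraction of predicted directed edges that are absent from the true graph."""
    true_set, pred_set = set(true_edges), set(pred_edges)
    if not pred_set:
        return 0.0  # no predictions, so no false discoveries
    return len(pred_set - true_set) / len(pred_set)


# Hypothetical toy graphs, not data from the paper.
TRUE_EDGES = [("X", "Y"), ("Y", "Z"), ("X", "Z")]
PRED_EDGES = [("X", "Y"), ("Z", "Y"), ("X", "W")]

if __name__ == "__main__":
    print(structural_hamming_distance(TRUE_EDGES, PRED_EDGES))
    print(false_discovery_rate(TRUE_EDGES, PRED_EDGES))
```

In the toy graphs, the reversed Y-Z edge, the missing X-Z edge, and the spurious X-W edge each add one to the SHD, while only the two predicted edges absent from the true graph count toward the FDR.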
Affiliation(s)
- Hang Wu
  - Coulter Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, USA
- Wenqi Shi
  - Department of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, USA
- May D Wang
  - Coulter Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, USA
10
Zinati Y, Takiddeen A, Emad A. GRouNdGAN: GRN-guided simulation of single-cell RNA-seq data using causal generative adversarial networks. Nat Commun 2024; 15:4055. [PMID: 38744843 DOI: 10.1038/s41467-024-48516-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Accepted: 05/01/2024] [Indexed: 05/16/2024] Open
Abstract
We introduce GRouNdGAN, a gene regulatory network (GRN)-guided reference-based causal implicit generative model for simulating single-cell RNA-seq data, in silico perturbation experiments, and benchmarking GRN inference methods. Through the imposition of a user-defined GRN in its architecture, GRouNdGAN simulates steady-state and transient-state single-cell datasets where genes are causally expressed under the control of their regulating transcription factors (TFs). Training on six experimental reference datasets, we show that our model captures non-linear TF-gene dependencies and preserves gene identities, cell trajectories, pseudo-time ordering, and technical and biological noise, with no user manipulation and only implicit parameterization. GRouNdGAN can synthesize cells under new conditions to perform in silico TF knockout experiments. Benchmarking various GRN inference algorithms reveals that GRouNdGAN effectively bridges the existing gap between simulated and biological data benchmarks of GRN inference algorithms, providing gold standard ground truth GRNs and realistic cells corresponding to the biological system of interest.
Affiliation(s)
- Yazdan Zinati
  - Department of Electrical and Computer Engineering, McGill University, Montreal, QC, Canada
- Abdulrahman Takiddeen
  - Department of Electrical and Computer Engineering, McGill University, Montreal, QC, Canada
- Amin Emad
  - Department of Electrical and Computer Engineering, McGill University, Montreal, QC, Canada
  - Mila, Quebec AI Institute, Montreal, QC, Canada
  - The Rosalind and Morris Goodman Cancer Institute, Montreal, QC, Canada
11
Tomasi F, Pozzi M, Lauria M. Investigating the mechanisms underlying resistance to chemoterapy and to CRISPR-Cas9 in cancer cell lines. Sci Rep 2024; 14:5402. [PMID: 38443409 PMCID: PMC10915165 DOI: 10.1038/s41598-024-55138-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Accepted: 02/20/2024] [Indexed: 03/07/2024] Open
Abstract
Cancer is one of the major causes of death worldwide and the development of multidrug resistance (MDR) in cancer cells is the principal cause of chemotherapy failure. To gain insights into the specific mechanisms of MDR in cancer cell lines, we developed a novel method for the combined analysis of recently published datasets on drug sensitivity and CRISPR loss-of-function screens for the same set of cancer cell lines. For our analysis, we first selected cell lines that consistently exhibit drug resistance across several classes of compounds. We then identified putative resistance genes for each class of compound and used inferred gene regulatory networks (GRNs) to study possible mechanisms underlying the development of MDR in the identified cancer cell lines. We show that the same method of analysis can also be used to identify cell lines that consistently exhibit resistance to the gene knockout effect of the CRISPR-Cas9 technique and to study the possible underlying mechanisms. In the GRN associated to the drug resistant cell lines, we identify genes previously associated with resistance (UHMK1, RALYL, MGST3, USP9X, and ESRG), genes for which an indirect association can be identified (SPINK13, LINC00664, MRPL38, and EMILIN3), and genes that are found to be overexpressed in non-resistant cancer cell lines (MRPL38, EMILIN3 and RALYL). In the GRNs associated to the CRISPR-Cas9 resistance mechanism, none of the identified genes has been previously reported in the admittedly sparse literature on the subject. However, some of these genes have a common role: APBB2, RUNX1T1, ZBTB7C, and ISX regulate transcription, while APBB2, BTG3, ZBTB7C, SZRD1 and LEF1 have a function in regulating proliferation, suggesting a role for these two pathways. While our results are specific for the lung cancer cell lines we selected for this work, our method of analysis can be applied to cell lines from other tissues and for which the required data is available.
Affiliation(s)
- Matteo Pozzi
  - CIBIO Department, University of Trento, Povo, Italy
  - Fondazione Bruno Kessler, Povo, Italy
- Mario Lauria
  - Department of Mathematics, University of Trento, Povo, Italy
  - Fondazione The Microsoft Research - University of Trento Centre for Computational and Systems Biology, Rovereto, Italy
12
Tian J, Lei J, Roeder K. From local to global gene co-expression estimation using single-cell RNA-seq data. Biometrics 2024; 80:ujae001. [PMID: 38465983 PMCID: PMC10926266 DOI: 10.1093/biomtc/ujae001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Revised: 10/01/2023] [Accepted: 01/15/2024] [Indexed: 03/12/2024]
Abstract
In genomics studies, the investigation of gene relationships often brings important biological insights. Currently, the large heterogeneous datasets impose new challenges for statisticians because gene relationships are often local. They change from one sample point to another, may only exist in a subset of the sample, and can be nonlinear or even nonmonotone. Most previous dependence measures do not specifically target local dependence relationships, and the ones that do are computationally costly. In this paper, we explore a state-of-the-art network estimation technique that characterizes gene relationships at the single cell level, under the name of cell-specific gene networks. We first show that averaging the cell-specific gene relationship over a population gives a novel univariate dependence measure, the averaged Local Density Gap (aLDG), that accumulates local dependence and can detect any nonlinear, nonmonotone relationship. Together with a consistent nonparametric estimator, we establish its robustness on both the population and empirical levels. Then, we show that averaging the cell-specific gene relationship over mini-batches determined by some external structure information (eg, spatial or temporal factor) better highlights meaningful local structure change points. We explore the application of aLDG and its minibatch variant in many scenarios, including pairwise gene relationship estimation, bifurcating point detection in cell trajectory, and spatial transcriptomics structure visualization. Both simulations and real data analysis show that aLDG outperforms existing ones.
Affiliation(s)
- Jinjin Tian
  - Department of Statistics and Data Science, Carnegie Mellon University, 15213, Pittsburgh, PA, United States
- Jing Lei
  - Department of Statistics and Data Science, Carnegie Mellon University, 15213, Pittsburgh, PA, United States
- Kathryn Roeder
  - Department of Statistics and Data Science, Carnegie Mellon University, 15213, Pittsburgh, PA, United States
13
Samad SS, Schwartz JM, Francavilla C. Functional selectivity of Receptor Tyrosine Kinases regulates distinct cellular outputs. Front Cell Dev Biol 2024; 11:1348056. [PMID: 38259512 PMCID: PMC10800419 DOI: 10.3389/fcell.2023.1348056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Accepted: 12/19/2023] [Indexed: 01/24/2024] Open
Abstract
Functional selectivity refers to the activation of differential signalling and cellular outputs downstream of the same membrane-bound receptor when activated by two or more different ligands. Functional selectivity has been described and extensively studied for G-protein Coupled Receptors (GPCRs), leading to specific therapeutic options for dysregulated GPCR functions. However, studies regarding the functional selectivity of Receptor Tyrosine Kinases (RTKs) remain sparse. Here, we will summarize recent data about RTK functional selectivity focusing on how the nature and the amount of RTK ligands and the crosstalk of RTKs with other membrane proteins regulate the specificity of RTK signalling. In addition, we will discuss how structural changes in RTKs upon ligand binding affect selective signalling pathways. Much remains to be known about the integration of different signals affecting RTK signalling specificity to orchestrate long-term cellular outcomes. Recent advancements in omics, specifically quantitative phosphoproteomics, and in systems biology methods to study, model and integrate different types of large-scale omics data have increased our ability to compare several signals affecting RTK functional selectivity in a global, system-wide fashion. We will discuss how such methods facilitate the exploration of important signalling hubs and enable data-driven predictions aiming at improving the efficacy of therapeutics for diseases like cancer, where redundant RTK signalling pathways often compromise treatment efficacy.
Affiliation(s)
- Sakim S. Samad
  - Division of Molecular and Cellular Functions, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, United Kingdom
  - Division of Evolution, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, United Kingdom
- Jean-Marc Schwartz
  - Division of Evolution, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, United Kingdom
- Chiara Francavilla
  - Division of Molecular and Cellular Functions, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, United Kingdom
  - Section of Protein Science and Biotherapeutics, Department of Bioengineering and Biomedicine, Danish Technical University, Lyngby, Denmark
14
Smiley KO, Munley KM, Aghi K, Lipshutz SE, Patton TM, Pradhan DS, Solomon-Lane TK, Sun SED. Sex diversity in the 21st century: Concepts, frameworks, and approaches for the future of neuroendocrinology. Horm Behav 2024; 157:105445. [PMID: 37979209 PMCID: PMC10842816 DOI: 10.1016/j.yhbeh.2023.105445] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 08/04/2023] [Revised: 10/11/2023] [Accepted: 10/18/2023] [Indexed: 11/20/2023]
Abstract
Sex is ubiquitous and variable throughout the animal kingdom. Historically, scientists have used reductionist methodologies that rely on a priori sex categorizations, in which two discrete sexes are inextricably linked with gamete type. However, this binarized operationalization does not adequately reflect the diversity of sex observed in nature. This is due, in part, to the fact that sex exists across many levels of biological analysis, including genetic, molecular, cellular, morphological, behavioral, and population levels. Furthermore, the biological mechanisms governing sex are embedded in complex networks that dynamically interact with other systems. To produce the most accurate and scientifically rigorous work examining sex in neuroendocrinology and to capture the full range of sex variability and diversity present in animal systems, we must critically assess the frameworks, experimental designs, and analytical methods used in our research. In this perspective piece, we first propose a new conceptual framework to guide the integrative study of sex. Then, we provide practical guidance on research approaches for studying sex-associated variables, including factors to consider in study design, selection of model organisms, experimental methodologies, and statistical analyses. We invite fellow scientists to conscientiously apply these modernized approaches to advance our biological understanding of sex and to encourage academically and socially responsible outcomes of our work. By expanding our conceptual frameworks and methodological approaches to the study of sex, we will gain insight into the unique ways that sex exists across levels of biological organization to produce the vast array of variability and diversity observed in nature.
Affiliation(s)
- Kristina O Smiley
  - Department of Psychological and Brain Sciences, University of Massachusetts Amherst, 639 North Pleasant Street, Morrill IVN Neuroscience, Amherst, MA 01003, USA
- Kathleen M Munley
  - Department of Psychology, University of Houston, 3695 Cullen Boulevard, Houston, TX 77204, USA
- Krisha Aghi
  - Department of Integrative Biology and Physiology, University of California Los Angeles, 405 Hilgard Ave, Los Angeles, CA 90095, USA
- Sara E Lipshutz
  - Department of Biology, Duke University, 130 Science Drive, Durham, NC 27708, USA
- Tessa M Patton
  - Bioinformatics Program, Loyola University Chicago, 1032 West Sheridan Road, LSB 317, Chicago, IL 60660, USA
- Devaleena S Pradhan
  - Department of Biological Sciences, Idaho State University, 921 South 8th Avenue, Mail Stop 8007, Pocatello, ID 83209, USA
- Tessa K Solomon-Lane
  - Scripps, Pitzer, Claremont McKenna Colleges, 925 North Mills Avenue, Claremont, CA 91711, USA
- Simón E D Sun
  - Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA
15
|
Chomthong M, Griffiths H. Prospects and perspectives: inferring physiological and regulatory targets for CAM from molecular and modelling approaches. Annals of Botany 2023; 132:583-596. [PMID: 37742290 PMCID: PMC10799989 DOI: 10.1093/aob/mcad142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2023] [Revised: 08/26/2023] [Accepted: 09/21/2023] [Indexed: 09/26/2023]
Abstract
BACKGROUND AND SCOPE: This review summarizes recent advances in our understanding of Crassulacean Acid Metabolism (CAM) by integrating evolutionary, ecological, physiological, metabolic and molecular perspectives. A number of key control loops which moderate the expression of CAM phases, and their metabolic and molecular control, are explored. These include nocturnal stomatal opening, activation of phosphoenolpyruvate carboxylase by a specific protein kinase, interactions with circadian clock control, as well as daytime decarboxylation and activation of Rubisco. The vacuolar storage and release of malic acid and the interplay between the supply and demand for carbohydrate reserves are also key metabolic control points. FUTURE OPPORTUNITIES: We identify open questions and opportunities, with experimentation informed by top-down molecular modelling approaches allied with bottom-up mechanistic modelling systems. For example, mining transcriptomic datasets using high-speed systems approaches will help to identify targets for future genetic manipulation experiments to define the regulation of CAM (whether circadian or metabolic control). We emphasize that inferences arising from computational approaches or advanced nuclear sequencing techniques can identify potential genes and transcription factors as regulatory targets. However, these outputs then require systematic evaluation, using genetic manipulation in key model organisms over a developmental progression, combining gene silencing and metabolic flux analysis and modelling to define functionality across the CAM day-night cycle. From an evolutionary perspective, the origins and function of CAM succulents and responses to water deficits are set against the mesophyll and hydraulic limitations imposed by cell and tissue succulence in contrasting morphological lineages. We highlight the interplay between traits across shoots (3D vein density, mesophyll conductance and cell shrinkage) and roots (xylem embolism and segmentation). Thus, molecular, biophysical and biochemical processes help to curtail water losses and exploit rapid rehydration during restorative rain events. In the face of a changing climate, we hope such approaches will stimulate opportunities for future research.
Affiliation(s)
- Methawi Chomthong
  - Department of Plant Sciences, University of Cambridge, Cambridge, CB2 3EA, UK
- Howard Griffiths
  - Department of Plant Sciences, University of Cambridge, Cambridge, CB2 3EA, UK
16
Ni L, Yu Q, You R, Chen C, Peng B. Development of the RF-GSEA Method for Identifying Disulfidptosis-Related Genes and Application in Hepatocellular Carcinoma. Curr Issues Mol Biol 2023; 45:9450-9470. [PMID: 38132439 PMCID: PMC10741996 DOI: 10.3390/cimb45120593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 11/22/2023] [Accepted: 11/23/2023] [Indexed: 12/23/2023] Open
Abstract
Disulfidptosis is a newly discovered cellular programmed cell death mode. Presently, a considerable number of genes related to disulfidptosis remain undiscovered, and its significance in hepatocellular carcinoma remains unrevealed. We have developed a powerful analytical method called RF-GSEA for identifying potential genes associated with disulfidptosis. This method draws inspiration from gene regulation networks and graph theory, and it is implemented through a combination of random forest regression model and Gene Set Enrichment Analysis. Subsequently, to validate the practical application value of this method, we applied it to hepatocellular carcinoma. Based on the RF-GSEA method, we developed a disulfidptosis-related signature. Lastly, we looked into how the disulfidptosis-related signature is connected to HCC prognosis, the tumor microenvironment, the effectiveness of immunotherapy, and the sensitivity of chemotherapy drugs. The RF-GSEA method identified a total of 220 disulfidptosis-related genes, from which 7 were selected to construct the disulfidptosis-related signature. The high-disulfidptosis-related score group had a worse prognosis compared to the low-disulfidptosis-related score group and showed lower infiltration levels of immune-promoting cells. The high-disulfidptosis-related score group had a higher likelihood of benefiting from immunotherapy compared to the low-disulfidptosis-related score group. The RF-GSEA method is a powerful tool for identifying disulfidptosis-related genes. The disulfidptosis-related signature effectively predicts HCC prognosis, immunotherapy response, and drug sensitivity.
Affiliation(s)
- Bin Peng
  - School of Public Health, Chongqing Medical University, Chongqing 400016, China
17
Jiang Z, Chen C, Xu Z, Wang X, Zhang M, Zhang D. SIGNET: transcriptome-wide causal inference for gene regulatory networks. Sci Rep 2023; 13:19371. [PMID: 37938594 PMCID: PMC10632394 DOI: 10.1038/s41598-023-46295-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2023] [Accepted: 10/30/2023] [Indexed: 11/09/2023] Open
Abstract
Gene regulation plays an important role in understanding the mechanisms of human biology and diseases. However, inferring causal relationships between all genes is challenging due to the large number of genes in the transcriptome. Here, we present SIGNET (Statistical Inference on Gene Regulatory Networks), a flexible software package that reveals networks of causal regulation between genes built upon large-scale transcriptomic and genotypic data at the population level. Like Mendelian randomization, SIGNET uses genotypic variants as natural instrumental variables to establish such causal relationships but constructs a transcriptome-wide gene regulatory network with high confidence. SIGNET makes such a computationally heavy task feasible by deploying a well-designed statistical algorithm over a parallel computing environment. It also provides a user-friendly interface allowing for parameter tuning, efficient parallel computing scheduling, interactive network visualization, and confirmatory results retrieval. The open-source SIGNET software is freely available (https://www.zstats.org/signet/).
Affiliation(s)
- Zhongli Jiang
  - Department of Statistics, Purdue University, West Lafayette, IN, 47907, USA
- Zhenyu Xu
  - Department of Statistics, Purdue University, West Lafayette, IN, 47907, USA
- Min Zhang
  - Department of Statistics, Purdue University, West Lafayette, IN, 47907, USA
  - Department of Epidemiology and Biostatistics, University of California, Irvine, CA, 92617, USA
- Dabao Zhang
  - Department of Epidemiology and Biostatistics, University of California, Irvine, CA, 92617, USA.
18
Caliskan A, Arga KY. A Differential Transcriptional Regulome Approach to Unpack Cancer Biology: Insights on Renal Cell Carcinoma Subtypes. OMICS: A Journal of Integrative Biology 2023; 27:536-545. [PMID: 37943533 DOI: 10.1089/omi.2023.0167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2023]
Abstract
Cancer research calls for new approaches that account for the regulatory complexities of biology. We present, in this study, the differential transcriptional regulome (DIFFREG) approach for the identification and prioritization of key transcriptional regulators and apply it to the case of renal cell carcinoma (RCC) biology. Of note, RCC has a poor prognosis and the biomarker and drug discovery studies to date have tended to focus on gene expression independent from mutations and/or post-translational modifications. DIFFREG focuses on the differential regulation between transcription factors (TFs) and their target genes rather than differential gene expression and integrates transcriptome profiling with the human transcriptional regulatory network to analyze differential gene regulation between healthy and RCC cases. In this study, RNA-seq tissue samples (n = 1020) from the Cancer Genome Atlas (TCGA), including healthy and tumor subjects, were integrated with a comprehensive human TF-gene interactome dataset (1122603 interactions between 1289 TFs and 25177 genes). Comparative analysis of DIFFREG profiles, consisting of perturbed TF-gene interactions, from three common subtypes (clear cell RCC, papillary RCC and chromophobe RCC) revealed subtype-specific alterations, supporting the hypothesis that these signatures in the transcriptional regulome profiles may be considered potential biomarkers that may play an important role in elucidating the molecular mechanisms of RCC development and translating knowledge about the genetic basis of RCC into the clinic. In addition, these indicators may help oncologists make the best decisions for diagnosis and prognosis management.
Affiliation(s)
- Aysegul Caliskan
  - Department of Bioengineering, Marmara University, Istanbul, Turkey
  - Department of Pharmacy, Faculty of Pharmacy, Istinye University, Istanbul, Turkey
- Kazim Yalcin Arga
  - Department of Bioengineering, Marmara University, Istanbul, Turkey
  - Genetic and Metabolic Diseases Research and Investigation Center (GEMHAM), Marmara University, Istanbul, Turkey
19
Oubounyt M, Adlung L, Patroni F, Wenke NK, Maier A, Hartung M, Baumbach J, Elkjaer ML. Inference of differential key regulatory networks and mechanistic drug repurposing candidates from scRNA-seq data with SCANet. Bioinformatics 2023; 39:btad644. [PMID: 37862243 PMCID: PMC10628438 DOI: 10.1093/bioinformatics/btad644] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2023] [Revised: 09/07/2023] [Accepted: 10/19/2023] [Indexed: 10/22/2023] Open
Abstract
MOTIVATION: The reconstruction of small key regulatory networks that explain the differences in the development of cell (sub)types from single-cell RNA sequencing is a yet unresolved computational problem. RESULTS: To this end, we have developed SCANet, an all-in-one package for single-cell profiling that covers the whole differential mechanotyping workflow, from inference of trait/cell-type-specific gene co-expression modules, driver gene detection, and transcriptional gene regulatory network reconstruction to mechanistic drug repurposing candidate prediction. To illustrate the power of SCANet, we examined data from two studies. First, we identify the drivers of the mechanotype of a cytokine storm associated with increased mortality in patients with acute respiratory illness. Secondly, we find 20 drugs for eight potential pharmacological targets in cellular driver mechanisms in the intestinal stem cells of obese mice. AVAILABILITY AND IMPLEMENTATION: SCANet is a free, open-source, and user-friendly Python package that can be seamlessly integrated into single-cell-based systems medicine research and mechanistic drug discovery.
Affiliation(s)
- Mhaned Oubounyt
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg 22607, Germany
- Lorenz Adlung
  - Department of Medicine, Hamburg Center for Translational Immunology (HCTI) and Center for Biomedical AI (bAIome), University Medical Center Hamburg-Eppendorf (UKE), Hamburg 20246, Germany
- Fabio Patroni
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg 22607, Germany
  - Center for Molecular Biology and Genetic Engineering (CBMEG), State University of Campinas (Unicamp), Campinas, SP 13083-875, Brazil
- Nina Kerstin Wenke
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg 22607, Germany
- Andreas Maier
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg 22607, Germany
- Michael Hartung
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg 22607, Germany
- Jan Baumbach
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg 22607, Germany
  - Department of Mathematics and Computer Science, University of Southern Denmark, Odense 5000, Denmark
- Maria L Elkjaer
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg 22607, Germany
  - Department of Neurology, Odense University Hospital, Odense 5000, Denmark
  - Institute of Clinical Research, University of Southern Denmark, Odense 5000, Denmark
  - Institute of Molecular Medicine, University of Southern Denmark, Odense 5000, Denmark
20
Wu Y, Qian B, Wang A, Dong H, Zhu E, Ma B. iLSGRN: inference of large-scale gene regulatory networks based on multi-model fusion. Bioinformatics 2023; 39:btad619. [PMID: 37851379 PMCID: PMC10589915 DOI: 10.1093/bioinformatics/btad619] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Revised: 10/04/2023] [Accepted: 10/17/2023] [Indexed: 10/19/2023] Open
Abstract
MOTIVATION: Gene regulatory networks (GRNs) are a way of describing the interaction between genes, which contribute to revealing the different biological mechanisms in the cell. Reconstructing GRNs based on gene expression data has been a central computational problem in systems biology. However, due to the high dimensionality and non-linearity of large-scale GRNs, accurately and efficiently inferring GRNs is still a challenging task. RESULTS: In this article, we propose a new approach, iLSGRN, to reconstruct large-scale GRNs from steady-state and time-series gene expression data based on non-linear ordinary differential equations. Firstly, the regulatory gene recognition algorithm calculates the Maximal Information Coefficient between genes and excludes redundant regulatory relationships to achieve dimensionality reduction. Then, the feature fusion algorithm constructs a model leveraging the feature importance derived from XGBoost (eXtreme Gradient Boosting) and RF (Random Forest) models, which can effectively train the non-linear ordinary differential equations model of GRNs and improve the accuracy and stability of the inference algorithm. The extensive experiments on different scale datasets show that our method makes sensible improvement compared with the state-of-the-art methods. Furthermore, we perform cross-validation experiments on the real gene datasets to validate the robustness and effectiveness of the proposed method. AVAILABILITY AND IMPLEMENTATION: The proposed method is written in the Python language, and is available at: https://github.com/lab319/iLSGRN.
Affiliation(s)
- Yiming Wu
  - School of Information Science and Technology, Dalian Maritime University, Dalian 116026, China
- Bing Qian
  - School of Information Science and Technology, Dalian Maritime University, Dalian 116026, China
- Anqi Wang
  - Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong 999077, China
- Heng Dong
  - School of Information Science and Technology, Dalian Maritime University, Dalian 116026, China
- Enqiang Zhu
  - Institution of Computing Science and Technology, Guangzhou University, Guangzhou 510006, China
- Baoshan Ma
  - School of Information Science and Technology, Dalian Maritime University, Dalian 116026, China
21
Belova T, Biondi N, Hsieh PH, Lutsik P, Chudasama P, Kuijjer M. Heterogeneity in the gene regulatory landscape of leiomyosarcoma. NAR Cancer 2023; 5:zcad037. [PMID: 37492373 PMCID: PMC10365024 DOI: 10.1093/narcan/zcad037] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/09/2023] [Revised: 07/06/2023] [Accepted: 07/18/2023] [Indexed: 07/27/2023] Open
Abstract
Characterizing inter-tumor heterogeneity is crucial for selecting suitable cancer therapy, as the presence of diverse molecular subgroups of patients can be associated with disease outcome or response to treatment. While cancer subtypes are often characterized by differences in gene expression, the mechanisms driving these differences are generally unknown. We set out to model the regulatory mechanisms driving sarcoma heterogeneity based on patient-specific, genome-wide gene regulatory networks. We developed a new computational framework, PORCUPINE, which combines knowledge on biological pathways with permutation-based network analysis to identify pathways that exhibit significant regulatory heterogeneity across a patient population. We applied PORCUPINE to patient-specific leiomyosarcoma networks modeled on data from The Cancer Genome Atlas and validated our results in an independent dataset from the German Cancer Research Center. PORCUPINE identified 37 heterogeneously regulated pathways, including pathways representing potential targets for treatment of subgroups of leiomyosarcoma patients, such as FGFR and CTLA4 inhibitory signaling. We validated the detected regulatory heterogeneity through analysis of networks and chromatin states in leiomyosarcoma cell lines. We showed that the heterogeneity identified with PORCUPINE is not associated with methylation profiles or clinical features, thereby suggesting an independent mechanism of patient heterogeneity driven by the complex landscape of gene regulatory interactions.
Affiliation(s)
- Tatiana Belova
  - Computational Biology and Systems Medicine Group, Centre for Molecular Medicine Norway, University of Oslo, Oslo, Norway
- Nicola Biondi
  - Precision Sarcoma Research Group, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases, Heidelberg, Germany
- Ping-Han Hsieh
  - Computational Biology and Systems Medicine Group, Centre for Molecular Medicine Norway, University of Oslo, Oslo, Norway
- Pavlo Lutsik
  - Division of Cancer Epigenomics, German Cancer Research Center (DKFZ), Heidelberg, Germany
  - Department of Oncology, Catholic University (KU) Leuven, Leuven, Belgium
- Priya Chudasama
  - Precision Sarcoma Research Group, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases, Heidelberg, Germany
- Marieke L Kuijjer
  - Computational Biology and Systems Medicine Group, Centre for Molecular Medicine Norway, University of Oslo, Oslo, Norway
  - Department of Pathology, Leiden University Medical Center, Leiden, the Netherlands
  - Leiden Center for Computational Oncology, Leiden University Medical Center, Leiden, the Netherlands
22
Latapiat V, Saez M, Pedroso I, Martin AJM. Unraveling patient heterogeneity in complex diseases through individualized co-expression networks: a perspective. Front Genet 2023; 14:1209416. [PMID: 37636264 PMCID: PMC10449456 DOI: 10.3389/fgene.2023.1209416] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Accepted: 07/24/2023] [Indexed: 08/29/2023] Open
Abstract
This perspective highlights the potential of individualized networks as a novel strategy for studying complex diseases through patient stratification, enabling advancements in precision medicine. We emphasize the impact of interpatient heterogeneity resulting from genetic and environmental factors and discuss how individualized networks improve our ability to develop treatments and enhance diagnostics. Integrating system biology, combining multimodal information such as genomic and clinical data has reached a tipping point, allowing the inference of biological networks at a single-individual resolution. This approach generates a specific biological network per sample, representing the individual from which the sample originated. The availability of individualized networks enables applications in personalized medicine, such as identifying malfunctions and selecting tailored treatments. In essence, reliable, individualized networks can expedite research progress in understanding drug response variability by modeling heterogeneity among individuals and enabling the personalized selection of pharmacological targets for treatment. Therefore, developing diverse and cost-effective approaches for generating these networks is crucial for widespread application in clinical services.
Affiliation(s)
- Verónica Latapiat
  - Programa de Doctorado en Genómica Integrativa, Vicerrectoría de Investigación, Universidad Mayor, Santiago, Chile
  - Vicerrectoría de Investigación, Universidad Mayor, Santiago, Chile
  - Laboratorio de Redes Biológicas, Centro Científico y Tecnológico de Excelencia Ciencia & Vida, Fundación Ciencia & Vida, Santiago, Chile
- Mauricio Saez
  - Centro de Oncología de Precisión, Facultad de Medicina y Ciencias de la Salud, Universidad Mayor, Santiago, Chile
  - Laboratorio de Investigación en Salud de Precisión, Departamento de Procesos Diagnósticos y Evaluación, Facultad de Ciencias de la Salud, Universidad Católica de Temuco, Temuco, Chile
- Inti Pedroso
  - Vicerrectoría de Investigación, Universidad Mayor, Santiago, Chile
- Alberto J. M. Martin
  - Laboratorio de Redes Biológicas, Centro Científico y Tecnológico de Excelencia Ciencia & Vida, Fundación Ciencia & Vida, Santiago, Chile
  - Escuela de Ingeniería, Facultad de Ingeniería, Arquitectura y Diseño, Universidad San Sebastián, Santiago, Chile
23
Williams RTP, King DC, Mastroianni IR, Hill JL, Apenes NW, Ramirez G, Miner EC, Moore A, Coleman K, Nishimura EO. Transcriptome profiling of the Caenorhabditis elegans intestine reveals that ELT-2 negatively and positively regulates intestinal gene expression within the context of a gene regulatory network. Genetics 2023; 224:iyad088. [PMID: 37183501 PMCID: PMC10411582 DOI: 10.1093/genetics/iyad088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Revised: 04/28/2023] [Accepted: 04/30/2023] [Indexed: 05/16/2023] Open
Abstract
ELT-2 is the major transcription factor (TF) required for Caenorhabditis elegans intestinal development. ELT-2 expression initiates in embryos to promote development and then persists after hatching through the larval and adult stages. Though the sites of ELT-2 binding are characterized and the transcriptional changes that result from ELT-2 depletion are known, an intestine-specific transcriptome profile spanning developmental time has been missing. We generated this dataset by performing Fluorescence Activated Cell Sorting on intestine cells at distinct developmental stages. We analyzed this dataset in conjunction with previously conducted ELT-2 studies to evaluate the role of ELT-2 in directing the intestinal gene regulatory network through development. We found that only 33% of intestine-enriched genes in the embryo were direct targets of ELT-2 but that number increased to 75% by the L3 stage. This suggests additional TFs promote intestinal transcription especially in the embryo. Furthermore, only half of ELT-2's direct target genes were dependent on ELT-2 for their proper expression levels, and an equal proportion of those responded to elt-2 depletion with over-expression as with under-expression. That is, ELT-2 can either activate or repress direct target genes. Additionally, we observed that ELT-2 repressed its own promoter, implicating new models for its autoregulation. Together, our results illustrate that ELT-2 impacts roughly 20-50% of intestine-specific genes, that ELT-2 both positively and negatively controls its direct targets, and that the current model of the intestinal regulatory network is incomplete as the factors responsible for directing the expression of many intestinal genes remain unknown.
Affiliation(s)
- Robert T P Williams
  - Department of Biochemistry and Molecular Biology, Colorado State University, Fort Collins, CO 80523, USA
- David C King
  - Department of Biochemistry and Molecular Biology, Colorado State University, Fort Collins, CO 80523, USA
- Izabella R Mastroianni
  - Department of Biochemistry and Molecular Biology, Colorado State University, Fort Collins, CO 80523, USA
  - Laboratory of Immunoregulation, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA
- Jessica L Hill
  - Department of Biochemistry and Molecular Biology, Colorado State University, Fort Collins, CO 80523, USA
- Nicolai W Apenes
  - Department of Biochemistry and Molecular Biology, Colorado State University, Fort Collins, CO 80523, USA
  - Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Gabriela Ramirez
  - Department of Biochemistry and Molecular Biology, Colorado State University, Fort Collins, CO 80523, USA
  - Department of Cell and Molecular Biology, Colorado State University, Fort Collins, CO 80523, USA
- E Catherine Miner
  - Department of Biochemistry and Molecular Biology, Colorado State University, Fort Collins, CO 80523, USA
  - College of Veterinary Medicine and Biomedical Sciences, Colorado State University, Fort Collins, CO 80523, USA
- Andrew Moore
  - Department of Biochemistry and Molecular Biology, Colorado State University, Fort Collins, CO 80523, USA
- Karissa Coleman
  - Department of Biochemistry and Molecular Biology, Colorado State University, Fort Collins, CO 80523, USA
- Erin Osborne Nishimura
  - Department of Biochemistry and Molecular Biology, Colorado State University, Fort Collins, CO 80523, USA
24
Marku M, Pancaldi V. From time-series transcriptomics to gene regulatory networks: A review on inference methods. PLoS Comput Biol 2023; 19:e1011254. [PMID: 37561790 PMCID: PMC10414591 DOI: 10.1371/journal.pcbi.1011254] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 08/12/2023] Open
Abstract
Inference of gene regulatory networks has been an active area of research for around 20 years, leading to the development of sophisticated inference algorithms based on a variety of assumptions and approaches. With the ever increasing demand for more accurate and powerful models, the inference problem remains of broad scientific interest. The abstract representation of biological systems through gene regulatory networks represents a powerful method to study such systems, encoding different amounts and types of information. In this review, we summarize the different types of inference algorithms specifically based on time-series transcriptomics, giving an overview of the main applications of gene regulatory networks in computational biology. This review is intended to give an updated reference of regulatory networks inference tools to biologists and researchers new to the topic and guide them in selecting the appropriate inference method that best fits their questions, aims, and experimental data.
Affiliation(s)
- Malvina Marku
  - CRCT, Université de Toulouse, Inserm, CNRS, Université Toulouse III-Paul Sabatier, Centre de Recherches en Cancérologie de Toulouse, Toulouse, France
- Vera Pancaldi
  - CRCT, Université de Toulouse, Inserm, CNRS, Université Toulouse III-Paul Sabatier, Centre de Recherches en Cancérologie de Toulouse, Toulouse, France
  - Barcelona Supercomputing Center, Barcelona, Spain
25
Jiang Z, Chen C, Xu Z, Wang X, Zhang M, Zhang D. SIGNET: Transcriptome-wide Causal Inference for Gene Regulatory Networks. Research Square 2023:rs.3.rs-3180043. [PMID: 37546848 PMCID: PMC10402199 DOI: 10.21203/rs.3.rs-3180043/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/08/2023]
Abstract
Gene regulation plays an important role in understanding the mechanisms of human biology and diseases. However, inferring causal relationships between all genes is challenging due to the large number of genes in the transcriptome. Here, we present SIGNET (Statistical Inference on Gene Regulatory Networks), a flexible software package that reveals networks of causal regulation between genes built upon large-scale transcriptomic and genotypic data at the population level. Like Mendelian randomization, SIGNET uses genotypic variants as natural instrumental variables to establish such causal relationships but constructs a transcriptome-wide gene regulatory network with high confidence. SIGNET makes such a computationally heavy task feasible by deploying a well-designed statistical algorithm over a parallel computing environment. It also provides a user-friendly interface allowing for parameter tuning, efficient parallel computing scheduling, interactive network visualization, and confirmatory results retrieval. The open-source SIGNET software is freely available (https://www.zstats.org/signet/).
Affiliation(s)
- Zhongli Jiang
  - Department of Statistics, Purdue University, West Lafayette, 47907, Indiana, United States
- Chen Chen
  - UCB Pharma, Brussels, 1070, Belgium
  - These authors contributed to this project as research assistants when they studied in the Department of Statistics, Purdue University
- Zhenyu Xu
  - Department of Statistics, Purdue University, West Lafayette, 47907, Indiana, United States
  - These authors contributed to this project as research assistants when they studied in the Department of Statistics, Purdue University
- Xiaojian Wang
  - ByteDance, Shanghai, 201107, China
  - These authors contributed to this project as research assistants when they studied in the Department of Statistics, Purdue University
- Min Zhang
  - Department of Statistics, Purdue University, West Lafayette, 47907, Indiana, United States
  - Department of Epidemiology and Biostatistics, University of California, Irvine, 92617, California, United States
- Dabao Zhang
  - Department of Statistics, Purdue University, West Lafayette, 47907, Indiana, United States
26
Littman R, Cheng M, Wang N, Peng C, Yang X. SCING: Inference of robust, interpretable gene regulatory networks from single cell and spatial transcriptomics. iScience 2023; 26:107124. [PMID: 37434694 PMCID: PMC10331489 DOI: 10.1016/j.isci.2023.107124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2022] [Revised: 03/31/2023] [Accepted: 06/09/2023] [Indexed: 07/13/2023] Open
Abstract
Gene regulatory network (GRN) inference is an integral part of understanding physiology and disease. Single cell/nuclei RNA-seq (scRNA-seq/snRNA-seq) data has been used to elucidate cell-type GRNs; however, the accuracy and speed of current scRNAseq-based GRN approaches are suboptimal. Here, we present Single Cell INtegrative Gene regulatory network inference (SCING), a gradient boosting and mutual information-based approach for identifying robust GRNs from scRNA-seq, snRNA-seq, and spatial transcriptomics data. Performance evaluation using Perturb-seq datasets, held-out data, and the mouse cell atlas combined with the DisGeNET database demonstrates the improved accuracy and biological interpretability of SCING compared to existing methods. We applied SCING to the entire mouse single cell atlas, human Alzheimer's disease (AD), and mouse AD spatial transcriptomics. SCING GRNs reveal unique disease subnetwork modeling capabilities, have intrinsic capacity to correct for batch effects, retrieve disease relevant genes and pathways, and are informative on spatial specificity of disease pathogenesis.
Affiliation(s)
- Russell Littman
  - Department of Integrative Biology & Physiology, UCLA, Los Angeles, CA, USA
  - Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA, USA
- Michael Cheng
  - Department of Integrative Biology & Physiology, UCLA, Los Angeles, CA, USA
  - Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA, USA
- Ning Wang
  - Department of Integrative Biology & Physiology, UCLA, Los Angeles, CA, USA
- Chao Peng
  - Department of Neurology, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
- Xia Yang
  - Department of Integrative Biology & Physiology, UCLA, Los Angeles, CA, USA
  - Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA, USA
  - Institute for Quantitative and Computational Biosciences (QCBio), Los Angeles, CA, USA
  - Molecular Biology Institute (MBI), Los Angeles, CA, USA
  - Brain Research Institute (BRI), Los Angeles, CA, USA

27
Zhang J, Hu C, Zhang Q. Gene regulatory network inference based on a nonhomogeneous dynamic Bayesian network model with an improved Markov Monte Carlo sampling. BMC Bioinformatics 2023; 24:264. [PMID: 37355560 DOI: 10.1186/s12859-023-05381-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2023] [Accepted: 06/07/2023] [Indexed: 06/26/2023] Open
Abstract
A nonhomogeneous dynamic Bayesian network model, which combines the dynamic Bayesian network and the multi-change point process, solves the limitations of the dynamic Bayesian network in modeling non-stationary gene expression data to a certain extent. However, certain problems persist, such as the low network reconstruction accuracy and poor model convergence. Therefore, we propose an MD-birth move based on the Manhattan distance of the data points to increase the rationality of the multi-change point process. The underlying concept of the MD-birth move is that the direction of movement of the change point is assumed to have a larger Manhattan distance between the variance and the mean of its left and right data points. Considering the data instability characteristics, we propose a Markov chain Monte Carlo sampling method based on node-dependent particle filtering in addition to the multi-change point process. The candidate parent nodes to be sampled, which are close to the real state, are pushed to the high probability area through the particle filter, and the candidate parent node set to be sampled that is far from the real state is pushed to the low probability area and then sampled. In terms of reconstructing the gene regulatory network, the model proposed in this paper (FC-DBN) has better network reconstruction accuracy and model convergence speed than other corresponding models on the Saccharomyces cerevisiae data and RAF data.
Affiliation(s)
- Jiayao Zhang
  - College of Artificial Intelligence and Big Data, Hefei University, Hefei, 230031, China
- Chunling Hu
  - College of Artificial Intelligence and Big Data, Hefei University, Hefei, 230031, China
- Qianqian Zhang
  - College of Artificial Intelligence and Big Data, Hefei University, Hefei, 230031, China

28
Mathews J, Chang AJ, Devlin L, Levin M. Cellular signaling pathways as plastic, proto-cognitive systems: Implications for biomedicine. Patterns (N Y) 2023; 4:100737. [PMID: 37223267 PMCID: PMC10201306 DOI: 10.1016/j.patter.2023.100737] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 05/25/2023]
Abstract
Many aspects of health and disease are modeled using the abstraction of a "pathway": a set of protein or other subcellular activities with specified functional linkages between them. This metaphor is a paradigmatic case of a deterministic, mechanistic framework that focuses biomedical intervention strategies on altering the members of this network or the up-/down-regulation links between them, that is, rewiring the molecular hardware. However, protein pathways and transcriptional networks exhibit interesting and unexpected capabilities such as trainability (memory) and information processing in a context-sensitive manner. Specifically, they may be amenable to manipulation via their history of stimuli (equivalent to experiences in behavioral science). If true, this would enable a new class of biomedical interventions that target aspects of the dynamic physiological "software" implemented by pathways and gene-regulatory networks. Here, we briefly review clinical and laboratory data that show how high-level cognitive inputs and mechanistic pathway modulation interact to determine outcomes in vivo. Further, we propose an expanded view of pathways from the perspective of basal cognition and argue that a broader understanding of pathways and how they process contextual information across scales will catalyze progress in many areas of physiology and neurobiology. We argue that this fuller understanding of the functionality and tractability of pathways must go beyond a focus on the mechanistic details of protein and drug structure to encompass their physiological history as well as their embedding within higher levels of organization in the organism, with numerous implications for data science addressing health and disease.
Exploiting tools and concepts from behavioral and cognitive sciences to explore a proto-cognitive metaphor for the pathways underlying health and disease is more than a philosophical stance on biochemical processes; at stake is a new roadmap for overcoming the limitations of today's pharmacological strategies and for inferring future therapeutic interventions for a wide range of disease states.
Affiliation(s)
- Juanita Mathews
  - Allen Discovery Center at Tufts University, Medford, MA, USA
- Liam Devlin
  - Allen Discovery Center at Tufts University, Medford, MA, USA
- Michael Levin
  - Allen Discovery Center at Tufts University, Medford, MA, USA
  - Wyss Institute for Biologically Inspired Engineering at Harvard University, Boston, MA, USA

29
Wang Q, Guo M, Chen J, Duan R. A gene regulatory network inference model based on pseudo-siamese network. BMC Bioinformatics 2023; 24:163. [PMID: 37085776 PMCID: PMC10122305 DOI: 10.1186/s12859-023-05253-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2022] [Accepted: 03/24/2023] [Indexed: 04/23/2023] Open
Abstract
MOTIVATION: Gene regulatory networks (GRNs) arise from the intricate interactions between transcription factors (TFs) and their target genes during the growth and development of organisms. The inference of GRNs can unveil the underlying gene interactions in living systems and facilitate the investigation of the relationship between gene expression patterns and phenotypic traits. Although several machine-learning models have been proposed for inferring GRNs from single-cell RNA sequencing (scRNA-seq) data, some of these models, such as Boolean and tree-based networks, suffer from sensitivity to noise and may encounter difficulties in handling the high noise and dimensionality of actual scRNA-seq data, as well as the sparse nature of gene regulation relationships. Thus, inferring large-scale information from GRNs remains a formidable challenge.
RESULTS: This study proposes a multilevel, multi-structure framework called a pseudo-Siamese GRN (PSGRN) for inferring large-scale GRNs from time-series expression datasets. Based on the pseudo-Siamese network, we applied a gated recurrent unit to capture the time features of each TF and target matrix and learn the spatial features of the matrices after merging by applying the DenseNet framework. Finally, we applied a sigmoid function to evaluate interactions. We constructed two maize sub-datasets, including gene expression levels and GRNs, using existing open-source maize multi-omics data and compared them to other GRN inference methods, including GENIE3, GRNBoost2, nonlinear ordinary differential equations, CNNC, and DGRNS. Our results show that PSGRN outperforms state-of-the-art methods. This study proposed a new framework: a PSGRN that allows GRNs to be inferred from scRNA-seq data, elucidating the temporal and spatial features of TFs and their target genes. The results show the model's robustness and generalization, laying a theoretical foundation for maize genotype-phenotype associations with implications for breeding work.
Affiliation(s)
- Qian Wang
  - School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China
- Maozu Guo
  - School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China
- Jian Chen
  - College of Agronomy and Biotechnology, China Agricultural University, Beijing, China
- Ran Duan
  - School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China

30
Agarwal M, Sharma A, Kagoo R A, Bhargava A. Interactions between genes altered during cardiotoxicity and neurotoxicity in zebrafish revealed using induced network modules analysis. Sci Rep 2023; 13:6257. [PMID: 37069190 PMCID: PMC10110561 DOI: 10.1038/s41598-023-33145-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Accepted: 04/07/2023] [Indexed: 04/19/2023] Open
Abstract
As the manufacturing and development of new synthetic compounds increase to keep pace with the expanding global demand, adverse health effects due to these compounds are emerging as critical public health concerns. Zebrafish have become a prominent model organism to study toxicology due to their genomic similarity to humans, optical clarity, well-defined developmental stages, short generation time, and cost-effective maintenance. They also provide a shorter time frame for in vivo toxicology evaluation than mammalian experimental systems. Here, we used meta-analysis to examine the alteration in genes during cardiotoxicity and neurotoxicity in zebrafish, caused by chemical exposure of any kind. First, we searched the literature comprehensively for genes that are altered during neurotoxicity and cardiotoxicity followed by meta-analysis using ConsensusPathDB. Since constant communication between the heart and the brain is an important physiological phenomenon, we also analyzed interactions among genes altered simultaneously during cardiotoxicity and neurotoxicity using induced network modules analysis in ConsensusPathDB. We observed inflammation and regeneration as the major pathways involved in cardiotoxicity and neurotoxicity. A large number of intermediate genes and input genes anchored in these pathways are molecular regulators of cell cycle progression and cell death and are implicated in tumor manifestation. We propose potential predictive biomarkers for neurotoxicity and cardiotoxicity and the major pathways potentially implicated in the manifestation of a particular toxicity phenotype.
Affiliation(s)
- Manusmriti Agarwal
  - Department of Biotechnology, Indian Institute of Technology Hyderabad (IITH), Kandi, Telangana, 502284, India
- Ankush Sharma
  - Department of Biotechnology, Indian Institute of Technology Hyderabad (IITH), Kandi, Telangana, 502284, India
- Andrea Kagoo R
  - Department of Biotechnology, Indian Institute of Technology Hyderabad (IITH), Kandi, Telangana, 502284, India
- Anamika Bhargava
  - Department of Biotechnology, Indian Institute of Technology Hyderabad (IITH), Kandi, Telangana, 502284, India

31
Pandey AK, Loscalzo J. Network medicine: an approach to complex kidney disease phenotypes. Nat Rev Nephrol 2023:10.1038/s41581-023-00705-0. [PMID: 37041415 DOI: 10.1038/s41581-023-00705-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Accepted: 03/13/2023] [Indexed: 04/13/2023]
Abstract
Scientific reductionism has been the basis of disease classification and understanding for more than a century. However, the reductionist approach of characterizing diseases from a limited set of clinical observations and laboratory evaluations has proven insufficient in the face of an exponential growth in data generated from transcriptomics, proteomics, metabolomics and deep phenotyping. A new systematic method is necessary to organize these datasets and build new definitions of what constitutes a disease that incorporates both biological and environmental factors to more precisely describe the ever-growing complexity of phenotypes and their underlying molecular determinants. Network medicine provides such a conceptual framework to bridge these vast quantities of data while providing an individualized understanding of disease. The modern application of network medicine principles is yielding new insights into the pathobiology of chronic kidney diseases and renovascular disorders by expanding the understanding of pathogenic mediators, novel biomarkers and new options for renal therapeutics. These efforts affirm network medicine as a robust paradigm for elucidating new advances in the diagnosis and treatment of kidney disorders.
Affiliation(s)
- Arvind K Pandey
  - Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women's Hospital, and Harvard Medical School, Boston, MA, USA
- Joseph Loscalzo
  - Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women's Hospital, and Harvard Medical School, Boston, MA, USA

32
Xu J, Zhang A, Liu F, Zhang X. STGRNS: an interpretable transformer-based method for inferring gene regulatory networks from single-cell transcriptomic data. Bioinformatics 2023; 39:btad165. [PMID: 37004161 PMCID: PMC10085635 DOI: 10.1093/bioinformatics/btad165] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 12/25/2022] [Revised: 02/28/2023] [Accepted: 03/25/2023] [Indexed: 04/03/2023] Open
Abstract
MOTIVATION: Single-cell RNA-sequencing (scRNA-seq) technologies provide an opportunity to infer cell-specific gene regulatory networks (GRNs), which is an important challenge in systems biology. Although numerous methods have been developed for inferring GRNs from scRNA-seq data, it is still a challenge to deal with cellular heterogeneity.
RESULTS: To address this challenge, we developed an interpretable transformer-based method, named STGRNS, for inferring GRNs from scRNA-seq data. In this algorithm, a gene expression motif technique is proposed to convert gene pairs into contiguous sub-vectors, which can be used as input for the transformer encoder. By avoiding missing phase-specific regulations in a network, the gene expression motif can improve the accuracy of GRN inference for different types of scRNA-seq data. To assess the performance of STGRNS, we ran comparative experiments against some popular methods on extensive benchmark datasets, including 21 static and 27 time-series scRNA-seq datasets. All the results show that STGRNS is superior to the other methods compared. In addition, STGRNS also proved to be more interpretable than "black box" deep learning methods, whose predictions are well known to be difficult to explain clearly.
AVAILABILITY AND IMPLEMENTATION: The source code and data are available at https://github.com/zhanglab-wbgcas/STGRNS.
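The abstract gives only a one-sentence description of how a gene pair becomes a sequence of contiguous sub-vectors. The sketch below is one possible reading, not the STGRNS implementation: it assumes fixed-length, non-overlapping windows and simple concatenation of the two genes' windows, and the function name and window size are hypothetical.

```python
def pair_to_subvectors(tf_expr, target_expr, window):
    """Split a (TF, target) expression pair into contiguous sub-vectors.

    Each sub-vector concatenates the same window of cells (or time points)
    from both genes, giving a token sequence a transformer encoder could
    consume. A trailing partial window is dropped (an assumption).
    """
    if len(tf_expr) != len(target_expr):
        raise ValueError("expression vectors must have the same length")
    if window <= 0:
        raise ValueError("window must be positive")
    subvectors = []
    for start in range(0, len(tf_expr) - window + 1, window):
        stop = start + window
        subvectors.append(list(tf_expr[start:stop]) + list(target_expr[start:stop]))
    return subvectors


# Hypothetical expression values for one TF and one candidate target gene
tf_expression = [1, 2, 3, 4, 5]
target_expression = [6, 7, 8, 9, 10]
tokens = pair_to_subvectors(tf_expression, target_expression, window=2)
print(tokens)
```

Keeping windows contiguous means each sub-vector reflects one local stretch of the data, which is one way a phase-specific relationship could stay visible instead of being averaged over the whole vector.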
Affiliation(s)
- Jing Xu
  - Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Aidi Zhang
  - Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China
- Fang Liu
  - Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China
- Xiujun Zhang
  - Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074, China
  - Center of Economic Botany, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan, 430074 China

33
Oubounyt M, Elkjaer ML, Laske T, Grønning AB, Moeller M, Baumbach J. De-novo reconstruction and identification of transcriptional gene regulatory network modules differentiating single-cell clusters. NAR Genom Bioinform 2023; 5:lqad018. [PMID: 36879901 PMCID: PMC9985332 DOI: 10.1093/nargab/lqad018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2022] [Revised: 01/16/2023] [Accepted: 02/09/2023] [Indexed: 03/07/2023] Open
Abstract
Single-cell RNA sequencing (scRNA-seq) technology provides an unprecedented opportunity to understand gene functions and interactions at single-cell resolution. While computational tools for scRNA-seq data analysis to decipher differential gene expression profiles and differential pathway expression exist, we still lack methods to learn differential regulatory disease mechanisms directly from the single-cell data. Here, we provide a new methodology, named DiNiro, to unravel such mechanisms de novo and report them as small, easily interpretable transcriptional regulatory network modules. We demonstrate that DiNiro is able to uncover novel, relevant, and deep mechanistic models that not just predict but explain differential cellular gene expression programs. DiNiro is available at https://exbio.wzw.tum.de/diniro/.
Affiliation(s)
- Mhaned Oubounyt
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
- Maria L Elkjaer
  - Department of Neurology, Odense University Hospital, Odense, Denmark
  - Institute of Clinical Research, University of Southern Denmark, Odense, Denmark
  - Institute of Molecular Medicine, University of Southern Denmark, Odense, Denmark
- Tanja Laske
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Alexander G B Grønning
  - Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Marcus J Moeller
  - Heisenberg Chair of Preventive and Translational Nephrology, Department of Nephrology, Rheumatology and Clinical Immunology, RWTH Aachen University, Aachen, Germany
- Jan Baumbach
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
  - Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark

34
Gonzalez A, Leon DA, Perera Y, Perez R. On the gene expression landscape of cancer. PLoS One 2023; 18:e0277786. [PMID: 36802377 PMCID: PMC9942972 DOI: 10.1371/journal.pone.0277786] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2022] [Accepted: 11/03/2022] [Indexed: 02/23/2023] Open
Abstract
Kauffman's picture of normal and tumor states as attractors in an abstract state space is used to interpret gene expression data for 15 cancer localizations obtained from The Cancer Genome Atlas. A principal component analysis of these data unveils the following qualitative aspects about tumors: 1) The state of a tissue in gene expression space can be described by a few variables. In particular, there is a single variable describing the progression from a normal tissue to a tumor. 2) Each cancer localization is characterized by a gene expression profile, in which genes have specific weights in the definition of the cancer state. There are no fewer than 2500 differentially expressed genes, which lead to power-like tails in the expression distribution functions. 3) Tumors in different localizations share hundreds or even thousands of differentially expressed genes. There are 6 genes common to the 15 studied tumor localizations. 4) The tumor region is a kind of attractor. Tumors in advanced stages converge to this region independently of patient age or genetic characteristics. 5) There is a landscape of cancer in gene expression space with an approximate border separating normal tissues from tumors.
Affiliation(s)
- Augusto Gonzalez
  - University of Electronic Sciences and Technology of China, Chengdu, People's Republic of China
  - Institute of Cybernetics, Mathematics and Physics, Havana, Cuba
- Dario A. Leon
  - Institute of Cybernetics, Mathematics and Physics, Havana, Cuba
  - Department of Mechanical Engineering and Technology Management, Norwegian University of Life Sciences, Ås, Norway
- Yasser Perera
  - China-Cuba Biotechnology Joint Innovation Center, Yongzhou, People's Republic of China
  - Center of Genetic Engineering and Biotechnology, Havana, Cuba
- Rolando Perez
  - University of Electronic Sciences and Technology of China, Chengdu, People's Republic of China
  - Center of Molecular Immunology, Havana, Cuba

35
Kutchy NA, Morenikeji OB, Memili A, Ugur MR. Deciphering sperm functions using biological networks. Biotechnol Genet Eng Rev 2023:1-25. [PMID: 36722689 DOI: 10.1080/02648725.2023.2168912] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2022] [Indexed: 02/02/2023]
Abstract
The global human population is exponentially increasing, which requires the production of quality food through efficient reproduction as well as sustainable production of livestock. Lack of knowledge and technology for assessing semen quality and predicting bull fertility is hindering advances in animal science and food animal production and causing millions of dollars of economic losses annually. The intent of this systemic review is to summarize methods from computational biology for analysis of gene, metabolite, and protein networks to identify potential markers that can be applied to improve livestock reproduction, with a focus on bull fertility. We provide examples of available gene, metabolic, and protein networks and computational biology methods to show how the interactions between genes, proteins, and metabolites together drive the complex process of spermatogenesis and regulate fertility in animals. We demonstrate the use of the National Center for Biotechnology Information (NCBI) and Ensembl for finding gene sequences, and then use them to create and understand gene, protein and metabolite networks for sperm associated factors to elucidate global cellular processes in sperm. This study highlights the value of mapping complex biological pathways among livestock and potential for conducting studies on promoting livestock improvement for global food security.
Affiliation(s)
- Naseer A Kutchy
  - Department of Anatomy, Physiology and Pharmacology, School of Veterinary Medicine, St. George's University, St. George's, Grenada
  - Department of Animal Sciences, School of Environmental and Biological Sciences Rutgers, The State University of New Jersey, New Brunswick, NJ, USA
- Olanrewaju B Morenikeji
  - Division of Biological and Health Sciences, University of Pittsburgh at Bradford, Bradford, PA, USA
- Aylin Memili
  - Department of Nutrition, Gillings School of Global Public Health, University of North Carolina-Chapel Hill, Chapel Hill, NC, USA

36
Fan Z, Luo Y, Lu H, Wang T, Feng Y, Zhao W, Kim P, Zhou X. SPASCER: spatial transcriptomics annotation at single-cell resolution. Nucleic Acids Res 2023; 51:D1138-D1149. [PMID: 36243975 PMCID: PMC9825565 DOI: 10.1093/nar/gkac889] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 08/15/2022] [Revised: 09/21/2022] [Accepted: 10/13/2022] [Indexed: 01/30/2023] Open
Abstract
In recent years, the explosive growth of spatial technologies has enabled the characterization of spatial heterogeneity of tissue architectures. Compared to traditional sequencing, spatial transcriptomics preserves the spatial information of each captured location and provides novel insights into diverse spatially related biological contexts. Even though two spatial transcriptomics databases exist, they provide limited analytical information. Information such as spatial heterogeneity of genes and cells, cell-cell communication activities in space, and the cell type compositions in the microenvironment are critical clues to unveil the mechanism of tumorigenesis and embryo differentiation. Therefore, we constructed a new spatial transcriptomics database, named SPASCER (https://ccsm.uth.edu/SPASCER), designed to help understand the heterogeneity of tissue organizations, region-specific microenvironment, and intercellular interactions across tissue architectures at multiple levels. SPASCER contains datasets from 43 studies, including 1082 sub-datasets from 16 organ types across four species. scRNA-seq was integrated to deconvolve/map spatial transcriptomics, and processed with spatial cell-cell interaction, gene pattern and pathway enrichment analysis. Cell-cell interaction and gene regulatory network analyses were also performed on scRNA-seq data from matched spatial transcriptomics. The application of SPASCER will provide new insights into tissue architecture and a solid foundation for the mechanistic understanding of many biological processes in healthy and diseased tissues.
Affiliation(s)
- Zhiwei Fan
  - West China School of Public Health and West China Fourth Hospital, Sichuan University, Chengdu 610041, China
  - Center for Computational Systems Medicine, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Yangyang Luo
  - West China Hospital, Sichuan University, Chengdu 610041, China
- Huifen Lu
  - West China Hospital, Sichuan University, Chengdu 610041, China
- Tiangang Wang
  - School of Life Science and Technology, Xidian University, Xi’an 710126, China
- YuZhou Feng
  - West China Hospital, Sichuan University, Chengdu 610041, China
- Weiling Zhao
  - Center for Computational Systems Medicine, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Pora Kim
  - Center for Computational Systems Medicine, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Xiaobo Zhou
  - Center for Computational Systems Medicine, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
  - McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
  - School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA

37
Koumadorakis DE, Krokidis MG, Dimitrakopoulos GN, Vrahatis AG. A Consensus Gene Regulatory Network for Neurodegenerative Diseases Using Single-Cell RNA-Seq Data. Adv Exp Med Biol 2023; 1423:215-224. [PMID: 37525047 DOI: 10.1007/978-3-031-31978-5_20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/02/2023]
Abstract
Gene regulatory network (GRN) inference from gene expression data is a highly complex and challenging task in systems biology. Despite the challenges, GRNs have emerged, and for complex diseases such as neurodegenerative diseases, they have the potential to provide vital information and identify key regulators. However, every GRN method predicts results based on its own assumptions, providing limited biological insights. For that reason, the current work focused on the development of an ensemble method from individual GRN methods to address this issue. Four state-of-the-art GRN algorithms were selected to form a consensus GRN from their common gene interactions. Each algorithm uses a different construction method, and for a more robust behavior, both static and dynamic methods were selected as well. The algorithms were applied to a scRNA-seq dataset from the CK-p25 Mus musculus model during neurodegeneration. The top subnetworks were constructed from the consensus network, and potential key regulators were identified. The results also demonstrated the overlap between the algorithms for the current dataset and the necessity for an ensemble approach. This work aims to demonstrate the creation of an ensemble network and provide insights into whether a combination of different GRN methods can produce valuable results.
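Forming a consensus network from the interactions that several methods have in common can be sketched as a vote over edges. The example below is an illustration under stated assumptions, not the authors' pipeline: the function name, the `min_support` parameter, and the toy edge lists are hypothetical, and each method's output is taken to be a plain list of directed (regulator, target) pairs.

```python
from collections import Counter


def consensus_edges(edge_lists, min_support=None):
    """Return edges reported by at least `min_support` methods.

    With the default (None), an edge must appear in every method's output,
    i.e. the strict intersection of all predicted networks.
    """
    if min_support is None:
        min_support = len(edge_lists)
    votes = Counter()
    for edges in edge_lists:
        votes.update(set(edges))  # each method votes at most once per edge
    return {edge for edge, count in votes.items() if count >= min_support}


# Hypothetical outputs of four GRN inference methods
method_outputs = [
    [("Tf1", "GeneA"), ("Tf1", "GeneB"), ("Tf2", "GeneC")],
    [("Tf1", "GeneA"), ("Tf2", "GeneC"), ("Tf3", "GeneA")],
    [("Tf1", "GeneA"), ("Tf1", "GeneB"), ("Tf2", "GeneC")],
    [("Tf1", "GeneA"), ("Tf3", "GeneA")],
]

strict = consensus_edges(method_outputs)
relaxed = consensus_edges(method_outputs, min_support=3)
print("all four methods agree:", sorted(strict))
print("at least three agree:", sorted(relaxed))
```

Lowering `min_support` trades precision for coverage. When methods overlap only partly, a strict intersection can become very small.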
Affiliation(s)
- Dimitrios E Koumadorakis
  - Bioinformatics and Human Electrophysiology Lab (BiHELab), Department of Informatics, Ionian University, Corfu, Greece
- Marios G Krokidis
  - Bioinformatics and Human Electrophysiology Lab (BiHELab), Department of Informatics, Ionian University, Corfu, Greece
- Georgios N Dimitrakopoulos
  - Bioinformatics and Human Electrophysiology Lab (BiHELab), Department of Informatics, Ionian University, Corfu, Greece
- Aristidis G Vrahatis
  - Bioinformatics and Human Electrophysiology Lab (BiHELab), Department of Informatics, Ionian University, Corfu, Greece

38
Galindez G, Sadegh S, Baumbach J, Kacprowski T, List M. Network-based approaches for modeling disease regulation and progression. Comput Struct Biotechnol J 2022; 21:780-795. [PMID: 36698974 PMCID: PMC9841310 DOI: 10.1016/j.csbj.2022.12.022] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/15/2022] [Revised: 12/14/2022] [Accepted: 12/14/2022] [Indexed: 12/23/2022] Open
Abstract
Molecular interaction networks lay the foundation for studying how biological functions are controlled by the complex interplay of genes and proteins. Investigating perturbed processes using biological networks has been instrumental in uncovering mechanisms that underlie complex disease phenotypes. Rapid advances in omics technologies have prompted the generation of high-throughput datasets, enabling large-scale, network-based analyses. Consequently, various modeling techniques, including network enrichment, differential network extraction, and network inference, have proven to be useful for gaining new mechanistic insights. We provide an overview of recent network-based methods and their core ideas to facilitate the discovery of disease modules or candidate mechanisms. Knowledge generated from these computational efforts will benefit biomedical research, especially drug development and precision medicine. We further discuss current challenges and provide perspectives in the field, highlighting the need for more integrative and dynamic network approaches to model disease development and progression.
Affiliation(s)
- Gihanna Galindez
  - Division Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics of Technische Universität Braunschweig and Hannover Medical School, Braunschweig, Germany
  - Braunschweig Integrated Centre of Systems Biology (BRICS), TU Braunschweig, Braunschweig, Germany
- Sepideh Sadegh
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Jan Baumbach
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
  - Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
- Tim Kacprowski
  - Division Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics of Technische Universität Braunschweig and Hannover Medical School, Braunschweig, Germany
  - Braunschweig Integrated Centre of Systems Biology (BRICS), TU Braunschweig, Braunschweig, Germany
- Markus List
  - Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany

39
Global Distribution and Diversity of Prevalent Sewage Water Plasmidomes. mSystems 2022; 7:e0019122. [PMID: 36069451 PMCID: PMC9600348 DOI: 10.1128/msystems.00191-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022] Open
Abstract
Sewage water from around the world contains an abundance of short plasmids, several of which harbor antimicrobial resistance genes (ARGs). The global dynamics of plasmid-derived antimicrobial resistance and functions are only starting to be unveiled. Here, we utilized a previously created data set of 159,332 assumed small plasmids from 24 different global sewage samples. The detailed phylogeny, as well as the interplay between their protein domains, ARGs, and predicted bacterial host genera, were investigated to understand sewage plasmidome dynamics globally. A total of 58,429 circular elements carried genes encoding plasmid-related features, and MASH distance analyses showed a high degree of diversity. A single (yet diverse) cluster of 520 predicted Acinetobacter plasmids was predominant among the European sewage water. Our results suggested a prevalence of plasmid-backbone gene combinations over others. This could be related to selected bacterial genera that act as bacterial hosts. These combinations also mirrored the geographical locations of the sewage samples. Our functional domain network analysis identified three groups of plasmids. However, these backbone domains were not exclusive to any given group, and Acinetobacter was the dominant host genus among the theta-replicating plasmids, which contained a reservoir of the macrolide resistance gene pair msr(E) and mph(E). Macrolide resistance genes were the most common in the sewage plasmidomes and were found in the largest number of unique plasmids. While msr(E) and mph(E) were limited to Acinetobacter, erm(B) was disseminated among a range of Firmicutes plasmids, including Staphylococcus and Streptococcus, highlighting a potential reservoir of antibiotic resistance for these pathogens from around the globe.
IMPORTANCE Antimicrobial resistance is a global threat to human health, as it inhibits our ability to treat infectious diseases. This study utilizes sewage water plasmidomes to identify plasmid-derived features and highlights antimicrobial resistance genes, particularly macrolide resistance genes, as abundant in sewage water plasmidomes in Firmicutes and Acinetobacter hosts. The emergence of macrolide resistance in these bacteria suggests that macrolide selective pressure exists in sewage water and that the resident bacteria can readily acquire macrolide resistance via small plasmids.

40
Talarmain L, Clarke MA, Shorthouse D, Cabrera-Cosme L, Kent DG, Fisher J, Hall BA. HOXA9 has the hallmarks of a biological switch with implications in blood cancers. Nat Commun 2022; 13:5829. [PMID: 36192425 PMCID: PMC9530117 DOI: 10.1038/s41467-022-33189-w] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/30/2021] [Accepted: 09/07/2022] [Indexed: 11/09/2022] Open
Abstract
Blood malignancies arise from the dysregulation of haematopoiesis. The type of blood cell and the specific order of oncogenic events initiating abnormal growth ultimately determine the cancer subtype and subsequent clinical outcome. HOXA9 plays an important role in acute myeloid leukaemia (AML) prognosis by promoting blood cell expansion and altering differentiation; however, the function of HOXA9 in other blood malignancies is still unclear. Here, we highlight the biological switch and prognosis marker properties of HOXA9 in AML and chronic myeloproliferative neoplasms (MPN). First, we establish the ability of HOXA9 to stratify AML patients with distinct cellular and clinical outcomes. Then, through the use of a computational network model of MPN, we show that the self-activation of HOXA9 and its relationship to JAK2 and TET2 can explain the branching progression of JAK2/TET2 mutant MPN patients towards divergent clinical characteristics. Finally, we predict a connection between the RUNX1 and MYB genes and a suppressive role for the NOTCH pathway in MPN diseases.
Affiliation(s)
- Laure Talarmain
  - Peter MacCallum Cancer Centre, 305 Grattan Street, Melbourne, VIC, 3000, Australia
- Matthew A Clarke
  - UCL Cancer Institute, University College London, Paul O'Gorman Building, 72 Huntley Street, London, WC1E 6BT, United Kingdom
- David Shorthouse
  - Department of Medical Physics and Biomedical Engineering, Malet Place Engineering Building, University College London, Gower Street, London, WC1E 6BT, United Kingdom
- Lilia Cabrera-Cosme
  - York Biomedical Research Institute, Department of Biology, University of York, York, YO10 5DD, United Kingdom
- David G Kent
  - York Biomedical Research Institute, Department of Biology, University of York, York, YO10 5DD, United Kingdom
- Jasmin Fisher
  - UCL Cancer Institute, University College London, Paul O'Gorman Building, 72 Huntley Street, London, WC1E 6BT, United Kingdom
- Benjamin A Hall
  - Department of Medical Physics and Biomedical Engineering, Malet Place Engineering Building, University College London, Gower Street, London, WC1E 6BT, United Kingdom

41
Wang Y, Zhang C, Wang Y, Liu X, Zhang Z. Enhancer RNA (eRNA) in Human Diseases. Int J Mol Sci 2022; 23:11582. [PMID: 36232885 PMCID: PMC9569849 DOI: 10.3390/ijms231911582] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2022] [Revised: 09/22/2022] [Accepted: 09/24/2022] [Indexed: 11/16/2022] Open
Abstract
Enhancer RNAs (eRNAs), a class of non-coding RNAs (ncRNAs) transcribed from enhancer regions, serve as a type of critical regulatory element in gene expression. There is increasing evidence demonstrating that the aberrant expression of eRNAs can be broadly detected in various human diseases. Some studies also revealed the potential clinical utility of eRNAs in these diseases. In this review, we summarized the recent studies regarding the pathological mechanisms of eRNAs as well as their potential utility across human diseases, including cancers, neurodegenerative disorders, cardiovascular diseases and metabolic diseases. It could help us to understand how eRNAs are engaged in the processes of diseases and to obtain better insight of eRNAs in diagnosis, prognosis or therapy. The studies we reviewed here indicate the enormous therapeutic potency of eRNAs across human diseases.
Affiliation(s)
- Yunzhe Wang
  - MOE Key Laboratory of Metabolism and Molecular Medicine, Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
- Chenyang Zhang
  - Department of Pathology, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
- Yuxiang Wang
  - Department of Pathology, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
- Xiuping Liu
  - Department of Pathology, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China
- Zhao Zhang
  - MOE Key Laboratory of Metabolism and Molecular Medicine, Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Fudan University, Shanghai 200032, China

42
Chaudhuri S, Srivastava A. Network approach to understand biological systems: From single to multilayer networks. J Biosci 2022. [DOI: 10.1007/s12038-022-00285-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]

43
Zhang N, Nanshan M, Cao J. A joint estimation approach to sparse additive ordinary differential equations. Statistics and Computing 2022; 32:69. [PMID: 36033975 PMCID: PMC9395913 DOI: 10.1007/s11222-022-10117-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2021] [Accepted: 06/13/2022] [Indexed: 06/15/2023]
Abstract
Ordinary differential equations (ODEs) are widely used to characterize the dynamics of complex systems in real applications. In this article, we propose a novel joint estimation approach for generalized sparse additive ODEs where observations are allowed to be non-Gaussian. The new method is unified with existing collocation methods by considering the likelihood, ODE fidelity and sparse regularization simultaneously. We design a block coordinate descent algorithm for optimizing the non-convex and non-differentiable objective function. The global convergence of the algorithm is established. The simulation study and two applications demonstrate the superior performance of the proposed method in estimation and improved performance of identifying the sparse structure.
Affiliation(s)
- Nan Zhang
  - School of Data Science, Fudan University, Shanghai, China
- Muye Nanshan
  - School of Data Science, Fudan University, Shanghai, China
- Jiguo Cao
  - Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, Canada

44
Coleman S, Kirk PDW, Wallace C. Consensus clustering for Bayesian mixture models. BMC Bioinformatics 2022; 23:290. [PMID: 35864476 PMCID: PMC9306175 DOI: 10.1186/s12859-022-04830-8] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 02/17/2021] [Accepted: 07/05/2022] [Indexed: 11/16/2022] Open
Abstract
BACKGROUND Cluster analysis is an integral part of precision medicine and systems biology, used to define groups of patients or biomolecules. Consensus clustering is an ensemble approach that is widely used in these areas, which combines the output from multiple runs of a non-deterministic clustering algorithm. Here we consider the application of consensus clustering to a broad class of heuristic clustering algorithms that can be derived from Bayesian mixture models (and extensions thereof) by adopting an early stopping criterion when performing sampling-based inference for these models. While the resulting approach is non-Bayesian, it inherits the usual benefits of consensus clustering, particularly in terms of computational scalability and providing assessments of clustering stability/robustness.
RESULTS In simulation studies, we show that our approach can successfully uncover the target clustering structure, while also exploring different plausible clusterings of the data. We show that, when a parallel computation environment is available, our approach offers significant reductions in runtime compared to performing sampling-based Bayesian inference for the underlying model, while retaining many of the practical benefits of the Bayesian approach, such as exploring different numbers of clusters. We propose a heuristic to decide upon ensemble size and the early stopping criterion, and then apply consensus clustering to a clustering algorithm derived from a Bayesian integrative clustering method. We use the resulting approach to perform an integrative analysis of three 'omics datasets for budding yeast and find clusters of co-expressed genes with shared regulatory proteins. We validate these clusters using data external to the analysis.
CONCLUSIONS Our approach can be used as a wrapper for essentially any existing sampling-based Bayesian clustering implementation, and enables meaningful clustering analyses to be performed using such implementations, even when computational Bayesian inference is not feasible, e.g. due to poor exploration of the target density (often as a result of increasing numbers of features) or a limited computational budget that does not allow sufficient samples to be drawn from a single chain. This enables researchers to straightforwardly extend the applicability of existing software to much larger datasets, including implementations of sophisticated models such as those that jointly model multiple datasets.
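The core of consensus clustering, combining the output of multiple runs of a non-deterministic clustering algorithm, is often summarized as a consensus matrix: the fraction of runs in which two items fall in the same cluster. The sketch below shows only that generic step under stated assumptions. The function name `consensus_matrix` and the toy label vectors are hypothetical, the runs are supplied directly as label lists rather than produced by a sampler, and none of this reproduces the early-stopping scheme of the cited paper.

```python
def consensus_matrix(label_runs):
    """Fraction of runs in which each pair of items shares a cluster.

    label_runs: list of label lists, one per clustering run; every run
    labels the same items in the same order. Label values themselves are
    arbitrary, only co-membership within a run matters.
    """
    n_runs = len(label_runs)
    n_items = len(label_runs[0])
    counts = [[0] * n_items for _ in range(n_items)]
    for labels in label_runs:
        for i in range(n_items):
            for j in range(n_items):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / n_runs for c in row] for row in counts]


# Three hypothetical runs over four items. Run 2 uses swapped label names
# but encodes the same partition as run 1.
runs = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
]
cm = consensus_matrix(runs)
```

Entries close to 0 or 1 indicate pairs on which the runs agree, while intermediate values flag unstable assignments; a final partition can then be derived by clustering this matrix.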
Affiliation(s)
- Stephen Coleman
  - MRC Biostatistics Unit, University of Cambridge, Cambridge, UK
- Paul D. W. Kirk
  - MRC Biostatistics Unit, University of Cambridge, Cambridge, UK
  - Cambridge Institute of Therapeutic Immunology and Infectious Disease, University of Cambridge, Cambridge, UK
- Chris Wallace
  - MRC Biostatistics Unit, University of Cambridge, Cambridge, UK
  - Cambridge Institute of Therapeutic Immunology and Infectious Disease, University of Cambridge, Cambridge, UK

45
Suter P, Kuipers J, Beerenwinkel N. Discovering gene regulatory networks of multiple phenotypic groups using dynamic Bayesian networks. Brief Bioinform 2022; 23:bbac219. [PMID: 35679575 PMCID: PMC9294428 DOI: 10.1093/bib/bbac219] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2021] [Revised: 04/29/2022] [Accepted: 05/10/2022] [Indexed: 11/13/2022] Open
Abstract
Dynamic Bayesian networks (DBNs) can be used for the discovery of gene regulatory networks (GRNs) from time series gene expression data. Here, we suggest a strategy for learning DBNs from gene expression data by employing a Bayesian approach that is scalable to large networks and is targeted at learning models with high predictive accuracy. Our framework can be used to learn DBNs for multiple groups of samples and highlight differences and similarities in their GRNs. We learn these DBN models based on different structural and parametric assumptions and select the optimal model based on the cross-validated predictive accuracy. We show in simulation studies that our approach is better equipped to prevent overfitting than techniques used in previous studies. We applied the proposed DBN-based approach to two time series transcriptomic datasets from the Gene Expression Omnibus database, each comprising data from distinct phenotypic groups of the same tissue type. In the first case, we used DBNs to characterize responders and non-responders to anti-cancer therapy. In the second case, we compared normal to tumor cells of colorectal tissue. The classification accuracy reached by the DBN-based classifier for both datasets was higher than reported previously. For the colorectal cancer dataset, our analysis suggested that GRNs for cancer and normal tissues have a lot of differences, which are most pronounced in the neighborhoods of oncogenes and known cancer tissue markers. The identified differences in gene networks of cancer and normal cells may be used for the discovery of targeted therapies.
Affiliation(s)
- Polina Suter
  - Department of Biosystems Science and Engineering, ETH Zurich, Matternstrasse 26, 4058 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Switzerland
- Jack Kuipers
  - Department of Biosystems Science and Engineering, ETH Zurich, Matternstrasse 26, 4058 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Switzerland
- Niko Beerenwinkel
  - Department of Biosystems Science and Engineering, ETH Zurich, Matternstrasse 26, 4058 Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Switzerland

46
Approaches in Gene Coexpression Analysis in Eukaryotes. Biology 2022; 11:1019. [PMID: 36101400 PMCID: PMC9312353 DOI: 10.3390/biology11071019] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/06/2022] [Revised: 06/28/2022] [Accepted: 07/04/2022] [Indexed: 11/22/2022]
Abstract
Simple Summary: Genes whose expression levels rise and fall similarly in a large set of samples may be considered coexpressed. Gene coexpression analysis refers to the en masse discovery of coexpressed genes from a large variety of transcriptomic experiments. The type of biological network used to study gene coexpression, known as a Gene Coexpression Network, consists of an undirected graph depicting genes and their coexpression relationships. Coexpressed genes are clustered in smaller subnetworks, the predominant biological roles of which can be determined through enrichment analysis. By studying well-annotated gene partners, the attribution of new roles to genes of unknown function or assumption for participation in common metabolic pathways can be achieved, through a guilt-by-association approach. In this review, we present key issues in gene coexpression analysis, as well as the most popular tools that perform it.
Abstract: Gene coexpression analysis constitutes a widely used practice for gene partner identification and gene function prediction, consisting of many intricate procedures. The analysis begins with the collection of primary transcriptomic data and their preprocessing, continues with the calculation of the similarity between genes based on their expression values in the selected sample dataset and results in the construction and visualisation of a gene coexpression network (GCN) and its evaluation using biological term enrichment analysis. As gene coexpression analysis has been studied extensively, we present most parts of the methodology in a clear manner and the reasoning behind the selection of some of the techniques. In this review, we offer a comprehensive and comprehensible account of the steps required for performing a complete gene coexpression analysis in eukaryotic organisms. We comment on the use of RNA-Seq vs. microarrays, as well as the best practices for GCN construction. Furthermore, we recount the most popular webtools and standalone applications performing gene coexpression analysis, with details on their methods, features and outputs.
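The similarity and network-construction steps described above can be sketched in a few lines. This is an illustrative assumption rather than a method taken from the review: Pearson correlation is used as the similarity measure, the cutoff of 0.9 is arbitrary, and the names `pearson`, `coexpression_edges`, and the toy expression values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles.

    Assumes neither profile is constant (non-zero variance).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def coexpression_edges(expression, threshold=0.9):
    """Undirected edges between genes whose correlation meets the threshold.

    expression: dict mapping gene name -> list of values across samples.
    Each edge is stored once as an alphabetically ordered pair.
    """
    edges = set()
    for gene_1, gene_2 in combinations(sorted(expression), 2):
        if pearson(expression[gene_1], expression[gene_2]) >= threshold:
            edges.add((gene_1, gene_2))
    return edges


# Hypothetical toy data: three genes measured in four samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],   # rises and falls with geneA
    "geneC": [4.0, 1.0, 3.0, 2.0],   # unrelated pattern
}
network = coexpression_edges(expression)
```

In practice the preprocessing, similarity measure, and threshold all influence the resulting network, which is why the review discusses them as separate design choices.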

47
Seçilmiş D, Hillerton T, Sonnhammer ELL. GRNbenchmark - a web server for benchmarking directed gene regulatory network inference methods. Nucleic Acids Res 2022; 50:W398-W404. [PMID: 35609981 PMCID: PMC9252735 DOI: 10.1093/nar/gkac377] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/28/2022] [Revised: 04/20/2022] [Accepted: 05/19/2022] [Indexed: 11/30/2022] Open
Abstract
Accurate inference of gene regulatory networks (GRN) is an essential component of systems biology, and there is a constant development of new inference methods. The most common approach to assess accuracy for publications is to benchmark the new method against a selection of existing algorithms. This often leads to a very limited comparison, potentially biasing the results, which may stem from tuning the benchmark's properties or incorrect application of other methods. These issues can be avoided by a web server with a broad range of data properties and inference algorithms, that makes it easy to perform comprehensive benchmarking of new methods, and provides a more objective assessment. Here we present https://GRNbenchmark.org/ - a new web server for benchmarking GRN inference methods, which provides the user with a set of benchmarks with several datasets, each spanning a range of properties including multiple noise levels. As soon as the web server has performed the benchmarking, the accuracy results are made privately available to the user via interactive summary plots and underlying curves. The user can then download these results for any purpose, and decide whether or not to make them public to share with the community.
Affiliation(s)
- Deniz Seçilmiş
  - Department of Biochemistry and Biophysics, Stockholm University, Science for Life Laboratory, Box 1031, 17121 Solna, Sweden
- Thomas Hillerton
  - Department of Biochemistry and Biophysics, Stockholm University, Science for Life Laboratory, Box 1031, 17121 Solna, Sweden
- Erik L L Sonnhammer
  - Department of Biochemistry and Biophysics, Stockholm University, Science for Life Laboratory, Box 1031, 17121 Solna, Sweden

48
Lombardo SD, Wangsaputra IF, Menche J, Stevens A. Network Approaches for Charting the Transcriptomic and Epigenetic Landscape of the Developmental Origins of Health and Disease. Genes (Basel) 2022; 13:764. [PMID: 35627149 PMCID: PMC9141211 DOI: 10.3390/genes13050764] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/04/2022] [Revised: 04/04/2022] [Accepted: 04/13/2022] [Indexed: 02/04/2023] Open
Abstract
The early developmental phase is of critical importance for human health and disease later in life. To decipher the molecular mechanisms at play, current biomedical research is increasingly relying on large quantities of diverse omics data. The integration and interpretation of the different datasets pose a critical challenge towards the holistic understanding of the complex biological processes that are involved in early development. In this review, we outline the major transcriptomic and epigenetic processes and the respective datasets that are most relevant for studying the periconceptional period. We cover both basic data processing and analysis steps, as well as more advanced data integration methods. A particular focus is given to network-based methods. Finally, we review the medical applications of such integrative analyses.
Affiliation(s)
- Salvo Danilo Lombardo
  - Max Perutz Labs, Department of Structural and Computational Biology, University of Vienna, 1030 Vienna, Austria
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, 1030 Vienna, Austria
- Ivan Fernando Wangsaputra
  - Maternal and Fetal Health Research Group, Division of Developmental Biology and Medicine, Faculty of Biology, Medicine and Health, University of Manchester, Manchester M13 9WL, UK
- Jörg Menche
  - Max Perutz Labs, Department of Structural and Computational Biology, University of Vienna, 1030 Vienna, Austria
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, 1030 Vienna, Austria
  - Faculty of Mathematics, University of Vienna, 1030 Vienna, Austria
- Adam Stevens
  - Maternal and Fetal Health Research Group, Division of Developmental Biology and Medicine, Faculty of Biology, Medicine and Health, University of Manchester, Manchester M13 9WL, UK

49
Mekedem M, Ravel P, Colinge J. Application of modular response analysis to medium- to large-size biological systems. PLoS Comput Biol 2022; 18:e1009312. [PMID: 35442961 PMCID: PMC9060349 DOI: 10.1371/journal.pcbi.1009312] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/08/2021] [Revised: 05/02/2022] [Accepted: 03/31/2022] [Indexed: 11/18/2022] Open
Abstract
The development of high-throughput genomic technologies associated with recent genetic perturbation techniques such as short hairpin RNA (shRNA), gene trapping, or gene editing (CRISPR/Cas9) has made it possible to obtain large perturbation data sets. These data sets are invaluable sources of information regarding the function of genes, and they offer unique opportunities to reverse engineer gene regulatory networks in specific cell types. Modular response analysis (MRA) is a well-accepted mathematical modeling method that is precisely aimed at such network inference tasks, but its use has been limited to rather small biological systems so far. In this study, we show that MRA can be employed on large systems with almost 1,000 network components. In particular, we show that MRA performance surpasses general-purpose mutual information-based algorithms. Part of these competitive results was obtained by the application of a novel heuristic that pruned MRA-inferred interactions a posteriori. We also exploited a block structure in MRA linear algebra to parallelize large system resolutions.
Affiliation(s)
- Meriem Mekedem
  - Université de Montpellier, Montpellier, France
  - Institut de Recherche en Cancérologie de Montpellier, Inserm U1194, Montpellier, France
  - Institut régional du Cancer Montpellier, Montpellier, France
- Patrice Ravel
  - Université de Montpellier, Montpellier, France
  - Institut de Recherche en Cancérologie de Montpellier, Inserm U1194, Montpellier, France
  - Institut régional du Cancer Montpellier, Montpellier, France
  - Faculté de Pharmacie, Université de Montpellier, Montpellier, France
- Jacques Colinge
  - Université de Montpellier, Montpellier, France
  - Institut de Recherche en Cancérologie de Montpellier, Inserm U1194, Montpellier, France
  - Institut régional du Cancer Montpellier, Montpellier, France
  - Faculté de Médecine, Université de Montpellier, Montpellier, France

50
Herrero R, Leon DA, Gonzalez A. A one-dimensional parameter-free model for carcinogenesis in gene expression space. Sci Rep 2022; 12:4748. [PMID: 35306505 PMCID: PMC8934350 DOI: 10.1038/s41598-022-08502-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2022] [Accepted: 03/02/2022] [Indexed: 12/03/2022] Open
Abstract
A small portion of a tissue defines a microstate in gene expression space. Mutations, epigenetic events or external factors cause microstate displacements, which are modeled by combining small independent gene expression variations and large Lévy jumps resulting from the collective variations of a set of genes. The risk of cancer in a tissue is estimated as the probability that a microstate transits from the normal to the tumor region in gene expression space. The formula coming from the contribution of large Lévy jumps seems to provide a qualitatively correct description of the lifetime risk of cancer in 8 tissues, and reveals an interesting connection between the risk and the way the tissue is protected against infections.
Affiliation(s)
- Dario A Leon
  - Institute of Cybernetics, Mathematics and Physics, 10400, Havana, Cuba
  - S3 Centre, Istituto Nanoscienze, CNR, 41125, Modena, Italy
- Augusto Gonzalez
  - Institute of Cybernetics, Mathematics and Physics, 10400, Havana, Cuba