1
Melo D, Pallares LF, Ayroles JF. Reassessing the modularity of gene co-expression networks using the Stochastic Block Model. PLoS Comput Biol 2024; 20:e1012300. [PMID: 39074140 PMCID: PMC11309492 DOI: 10.1371/journal.pcbi.1012300] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Revised: 08/08/2024] [Accepted: 07/07/2024] [Indexed: 07/31/2024] Open
Abstract
Finding communities in gene co-expression networks is a common first step toward extracting biological insight from these complex datasets. Most community detection algorithms expect genes to be organized into assortative modules, that is, groups of genes that are more associated with each other than with genes in other groups. While it is reasonable to expect that these modules exist, using methods that assume they exist a priori is risky, as it guarantees that alternative organizations of gene interactions will be ignored. Here, we ask: can we find meaningful communities without imposing a modular organization on gene co-expression networks, and how modular are these communities? For this, we use a recently developed community detection method, the weighted degree corrected stochastic block model (SBM), that does not assume that assortative modules exist. Instead, the SBM attempts to efficiently use all information contained in the co-expression network to separate the genes into hierarchically organized blocks of genes. Using RNAseq gene expression data measured in two tissues derived from an outbred population of Drosophila melanogaster, we show that (a) the SBM is able to find ten times as many groups as competing methods, that (b) several of those gene groups are not modular, and that (c) the functional enrichment for non-modular groups is as strong as for modular communities. These results show that the transcriptome is structured in more complex ways than traditionally thought and that we should revisit the long-standing assumption that modularity is the main driver of the structuring of gene co-expression networks.
Affiliation(s)
- Diogo Melo
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
  - Department of Ecology and Evolutionary Biology, Princeton University, Princeton, New Jersey, United States of America
- Luisa F. Pallares
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
  - Department of Ecology and Evolutionary Biology, Princeton University, Princeton, New Jersey, United States of America
  - Friedrich Miescher Laboratory of the Max Planck Society, Tübingen, Germany
- Julien F. Ayroles
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
  - Department of Ecology and Evolutionary Biology, Princeton University, Princeton, New Jersey, United States of America
2
Melo D, Pallares LF, Ayroles JF. Reassessing the modularity of gene co-expression networks using the Stochastic Block Model. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.05.31.542906. [PMID: 37398186 PMCID: PMC10312592 DOI: 10.1101/2023.05.31.542906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
Finding communities in gene co-expression networks is a common first step toward extracting biological insight from these complex datasets. Most community detection algorithms expect genes to be organized into assortative modules, that is, groups of genes that are more associated with each other than with genes in other groups. While it is reasonable to expect that these modules exist, using methods that assume they exist a priori is risky, as it guarantees that alternative organizations of gene interactions will be ignored. Here, we ask: can we find meaningful communities without imposing a modular organization on gene co-expression networks, and how modular are these communities? For this, we use a recently developed community detection method, the weighted degree corrected stochastic block model (SBM), that does not assume that assortative modules exist. Instead, the SBM attempts to efficiently use all information contained in the co-expression network to separate the genes into hierarchically organized blocks of genes. Using RNA-seq gene expression data measured in two tissues derived from an outbred population of Drosophila melanogaster, we show that (a) the SBM is able to find ten times as many groups as competing methods, that (b) several of those gene groups are not modular, and that (c) the functional enrichment for non-modular groups is as strong as for modular communities. These results show that the transcriptome is structured in more complex ways than traditionally thought and that we should revisit the long-standing assumption that modularity is the main driver of the structuring of gene co-expression networks.
Affiliation(s)
- Diogo Melo
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
  - Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ, USA
- Luisa F Pallares
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
  - Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ, USA
  - Friedrich Miescher Laboratory of the Max Planck Society, Tübingen, Germany
- Julien F Ayroles
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
  - Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ, USA
3
Russell M, Aqil A, Saitou M, Gokcumen O, Masuda N. Gene communities in co-expression networks across different tissues. ARXIV 2023:arXiv:2305.12963v2. [PMID: 37292479 PMCID: PMC10246089] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
With the recent availability of tissue-specific gene expression data, e.g., provided by the GTEx Consortium, there is interest in comparing gene co-expression patterns across tissues. One promising approach to this problem is to use a multilayer network analysis framework and perform multilayer community detection. Communities in gene co-expression networks reveal groups of genes similarly expressed across individuals, potentially involved in related biological processes responding to specific environmental stimuli or sharing common regulatory variations. We construct a multilayer network in which each of the four layers is an exocrine gland tissue-specific gene co-expression network. We develop methods for multilayer community detection with correlation matrix input and an appropriate null model. Our correlation matrix input method identifies five groups of genes that are similarly co-expressed in multiple tissues (a community that spans multiple layers, which we call a generalist community) and two groups of genes that are co-expressed in just one tissue (a community that lies primarily within just one layer, which we call a specialist community). We further found gene co-expression communities where the genes physically cluster across the genome significantly more than expected by chance (on chromosomes 1 and 11). This clustering hints at underlying regulatory elements determining similar expression patterns across individuals and cell types. We suggest that KRTAP3-1, KRTAP3-3, and KRTAP3-5 share regulatory elements in skin and pancreas. Furthermore, we find that CELA3A and CELA3B share associated expression quantitative trait loci in the pancreas. The results indicate that our multilayer community detection method for correlation matrix input extracts biologically interesting communities of genes.
Affiliation(s)
- Alber Aqil
  - Department of Biological Sciences, University at Buffalo
- Marie Saitou
  - Faculty of Biosciences, Norwegian University of Life Sciences
- Omer Gokcumen
  - Department of Biological Sciences, University at Buffalo
- Naoki Masuda
  - Department of Mathematics, University at Buffalo
  - Institute for Artificial Intelligence and Data Science, University at Buffalo
4
Russell M, Aqil A, Saitou M, Gokcumen O, Masuda N. Gene communities in co-expression networks across different tissues. PLoS Comput Biol 2023; 19:e1011616. [PMID: 37976327 PMCID: PMC10691702 DOI: 10.1371/journal.pcbi.1011616] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Revised: 12/01/2023] [Accepted: 10/19/2023] [Indexed: 11/19/2023] Open
Abstract
With the recent availability of tissue-specific gene expression data, e.g., provided by the GTEx Consortium, there is interest in comparing gene co-expression patterns across tissues. One promising approach to this problem is to use a multilayer network analysis framework and perform multilayer community detection. Communities in gene co-expression networks reveal groups of genes similarly expressed across individuals, potentially involved in related biological processes responding to specific environmental stimuli or sharing common regulatory variations. We construct a multilayer network in which each of the four layers is an exocrine gland tissue-specific gene co-expression network. We develop methods for multilayer community detection with correlation matrix input and an appropriate null model. Our correlation matrix input method identifies five groups of genes that are similarly co-expressed in multiple tissues (a community that spans multiple layers, which we call a generalist community) and two groups of genes that are co-expressed in just one tissue (a community that lies primarily within just one layer, which we call a specialist community). We further found gene co-expression communities where the genes physically cluster across the genome significantly more than expected by chance (on chromosomes 1 and 11). This clustering hints at underlying regulatory elements determining similar expression patterns across individuals and cell types. We suggest that KRTAP3-1, KRTAP3-3, and KRTAP3-5 share regulatory elements in skin and pancreas. Furthermore, we find that CELA3A and CELA3B share associated expression quantitative trait loci in the pancreas. The results indicate that our multilayer community detection method for correlation matrix input extracts biologically interesting communities of genes.
Affiliation(s)
- Madison Russell
  - Department of Mathematics, State University of New York at Buffalo, Buffalo, New York, United States of America
- Alber Aqil
  - Department of Biological Sciences, State University of New York at Buffalo, Buffalo, New York, United States of America
- Marie Saitou
  - Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway
- Omer Gokcumen
  - Department of Biological Sciences, State University of New York at Buffalo, Buffalo, New York, United States of America
- Naoki Masuda
  - Department of Mathematics, State University of New York at Buffalo, Buffalo, New York, United States of America
  - Institute for Artificial Intelligence and Data Science, State University of New York at Buffalo, Buffalo, New York, United States of America
5
Ravikumar V, Xu T, Al-Holou WN, Fattahi S, Rao A. Efficient Inference of Spatially-Varying Gaussian Markov Random Fields With Applications in Gene Regulatory Networks. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2023; 20:2920-2932. [PMID: 37276119 PMCID: PMC10623339 DOI: 10.1109/tcbb.2023.3282028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2023]
Abstract
In this paper, we study the problem of inferring spatially-varying Gaussian Markov random fields (SV-GMRF) where the goal is to learn a network of sparse, context-specific GMRFs representing network relationships between genes. An important application of SV-GMRFs is in inference of gene regulatory networks from spatially-resolved transcriptomics datasets. Current work on inference of SV-GMRFs is based on regularized maximum likelihood estimation (MLE) and suffers from overwhelmingly high computational cost due to its highly nonlinear nature. To alleviate this challenge, we propose a simple and efficient optimization problem in lieu of MLE that comes equipped with strong statistical and computational guarantees. Our proposed optimization problem is extremely efficient in practice: we can solve instances of SV-GMRFs with more than 2 million variables in less than 2 minutes. We apply the developed framework to study how gene regulatory networks in glioblastoma are spatially rewired within tissue, and identify prominent activity of the transcription factor HES4 and ribosomal proteins as characterizing the gene expression network in the tumor peri-vascular niche that is known to harbor treatment-resistant stem cells.
6
Dong M, He Y, Jiang Y, Zou F. Joint gene network construction by single-cell RNA sequencing data. Biometrics 2023; 79:915-925. [PMID: 35184277 PMCID: PMC10548400 DOI: 10.1111/biom.13645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2021] [Revised: 11/30/2021] [Accepted: 02/07/2022] [Indexed: 11/26/2022]
Abstract
In contrast to differential gene expression analysis at the single-gene level, gene regulatory network (GRN) analysis depicts complex transcriptomic interactions among genes for a better understanding of the underlying genetic architectures of human diseases and traits. Recent advances in single-cell RNA sequencing (scRNA-seq) allow constructing GRNs at a much finer resolution than bulk RNA-seq and microarray data. However, scRNA-seq data are inherently sparse, which hinders the direct application of the popular Gaussian graphical models (GGMs). Furthermore, most existing approaches for constructing GRNs with scRNA-seq data only consider gene networks under one condition. To better understand GRNs across different but related conditions at single-cell resolution, we propose to construct Joint Gene Networks with scRNA-seq data (JGNsc) under the GGMs framework. To facilitate the use of GGMs, JGNsc first proposes a hybrid imputation procedure that combines a Bayesian zero-inflated Poisson model with an iterative low-rank matrix completion step to efficiently impute zero-inflated counts resulting from technical artifacts. JGNsc then transforms the imputed data via a nonparanormal transformation, based on which joint GGMs are constructed. We demonstrate JGNsc and assess its performance using synthetic data. The application of JGNsc to two cancer clinical studies of medulloblastoma and glioblastoma yields novel insights in addition to confirming well-known biological results.
Affiliation(s)
- Meichen Dong
  - Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Yiping He
  - Department of Pathology, School of Medicine, Duke University, Durham, North Carolina, USA
- Yuchao Jiang
  - Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
  - Department of Genetics, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Fei Zou
  - Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
  - Department of Genetics, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
7
Seal S, Li Q, Basner EB, Saba LM, Kechris K. RCFGL: Rapid Condition adaptive Fused Graphical Lasso and application to modeling brain region co-expression networks. PLoS Comput Biol 2023; 19:e1010758. [PMID: 36607897 PMCID: PMC9821764 DOI: 10.1371/journal.pcbi.1010758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2022] [Accepted: 11/24/2022] [Indexed: 01/07/2023] Open
Abstract
Inferring gene co-expression networks is a useful process for understanding gene regulation and pathway activity. The networks are usually undirected graphs where genes are represented as nodes and an edge represents a significant co-expression relationship. When expression data of multiple (p) genes in multiple (K) conditions (e.g., treatments, tissues, strains) are available, joint estimation of networks harnessing shared information across them can significantly increase the power of analysis. In addition, examining condition-specific patterns of co-expression can provide insights into the underlying cellular processes activated in a particular condition. Condition adaptive fused graphical lasso (CFGL) is an existing method that incorporates condition specificity in a fused graphical lasso (FGL) model for estimating multiple co-expression networks. However, with computational complexity of O(p^2 K log K), the current implementation of CFGL is prohibitively slow even for a moderate number of genes and can only be used for a maximum of three conditions. In this paper, we propose a faster alternative of CFGL named rapid condition adaptive fused graphical lasso (RCFGL). In RCFGL, we incorporate the condition specificity into another popular model for joint network estimation, known as fused multiple graphical lasso (FMGL). We use a more efficient algorithm in the iterative steps compared to CFGL, enabling faster computation with complexity of O(p^2 K) and making it easily generalizable for more than three conditions. We also present a novel screening rule to determine if the full network estimation problem can be broken down into estimation of smaller disjoint sub-networks, thereby reducing the complexity further. We demonstrate the computational advantage and superior performance of our method compared to two non-condition adaptive methods, FGL and FMGL, and one condition adaptive method, CFGL, in both simulation study and real data analysis.
We used RCFGL to jointly estimate the gene co-expression networks in different brain regions (conditions) using a cohort of heterogeneous stock rats. We also provide an accommodating C and Python based package that implements RCFGL.
Affiliation(s)
- Souvik Seal
  - Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, Colorado, United States of America
- Qunhua Li
  - Department of Statistics, Pennsylvania State University, University Park, Pennsylvania, United States of America
- Elle Butler Basner
  - Department of Statistics, Pennsylvania State University, University Park, Pennsylvania, United States of America
- Laura M. Saba
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, Colorado, United States of America
- Katerina Kechris
  - Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, Colorado, United States of America
8
Wang MG, Ou-Yang L, Yan H, Zhang XF. Inferring Gene Co-Expression Networks by Incorporating Prior Protein-Protein Interaction Networks. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:2894-2906. [PMID: 34383650 DOI: 10.1109/tcbb.2021.3103407] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/13/2023]
Abstract
Inferring gene co-expression networks from high-throughput gene expression data is an important task in bioinformatics. Gene networks often exhibit modular structures. Although several Gaussian graphical model-based methods have been developed to estimate gene co-expression networks by incorporating the modular structural prior, none of them takes into account the modular structures captured by the prior networks (e.g., protein interaction networks). In this study, we propose a novel prior network-dependent gene network inference (pGNI) method to estimate gene co-expression networks by integrating gene expression data and prior protein interaction network data. The underlying modular structure is learned from both sets of data. Through simulation studies, we demonstrate the feasibility and effectiveness of our method. We also apply our method to two real datasets. The modular structures in the networks estimated by our method are biologically significant.
9
Ferreira F, Gysi D, Castro D, Ferreira TB. The nosographic structure of posttraumatic stress symptoms across trauma types: An exploratory network analysis approach. J Trauma Stress 2022; 35:1115-1128. [PMID: 35246860 DOI: 10.1002/jts.22818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2020] [Revised: 12/06/2021] [Accepted: 01/11/2022] [Indexed: 11/12/2022]
Abstract
The nosographic structure of posttraumatic stress disorder (PTSD) remains unclear, and attempts to determine its symptomatic organization have been unsatisfactory. Several explanations have been suggested, and the impact of trauma type is receiving increasing attention. As little is known about the differential impact of trauma type on the nosographic structure of PTSD, we explored the nosology of PTSD and the effect of trauma type on its symptomatic organization. We reanalyzed five cross-sectional psychopathological networks involving different trauma types, encompassing a broad range of traumatic events in veterans, war-related trauma in veterans, sexual abuse, terrorist attacks, and various traumatic events in refugees. The weighted topological overlap was used to estimate the networks and attribute weights to their links. Coexpression differential network analysis was used to identify the common and specific network structures of the connections across different trauma types and to determine the importance of symptoms across the networks. We found a set of symptoms with more common connections with other symptoms, suggesting that these might constitute the prototypical nosographic structure of PTSD. We also found a set of symptoms that had a high number of specific connections with other symptoms; these connections varied according to trauma type. The importance of symptoms across the common and specific networks was ascertained. The present findings offer new insights into the symptomatic organization of PTSD and support previous research on the impact of trauma type on the nosology of this disorder.
Affiliation(s)
- Filipa Ferreira
  - Social Sciences Department, University Institute of Maia, Maia, Portugal
  - Centre for Psychology at University of Porto, Porto, Portugal
- Deisy Gysi
  - Center for Complex Network Research, Northeastern University, Boston, Massachusetts, USA
- Daniel Castro
  - Social Sciences Department, University Institute of Maia, Maia, Portugal
  - Centre for Psychology at University of Porto, Porto, Portugal
- Tiago Bento Ferreira
  - Social Sciences Department, University Institute of Maia, Maia, Portugal
  - Centre for Psychology at University of Porto, Porto, Portugal
10
Tan YT, Ou-Yang L, Jiang X, Yan H, Zhang XF. Identifying Gene Network Rewiring Based on Partial Correlation. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:513-521. [PMID: 32750866 DOI: 10.1109/tcbb.2020.3002906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Learning how gene regulatory networks change under different conditions is an important task. Several Gaussian graphical model-based methods have been proposed to deal with this task by inferring differential networks from gene expression data. However, most existing methods define the differential networks as the difference of precision matrices, which may include false differential edges caused by changes in conditional variances. In addition, prior information about the condition-specific networks and the differential networks can be obtained from other domains. It is useful to incorporate prior information into differential network analysis. In this study, we propose a new differential network analysis method to address the above challenges. Instead of using the precision matrices, we define the differential networks as the difference of partial correlations, which can exclude the spurious differential edges due to changes in conditional variances. Furthermore, prior information from multiple hypothesis testing is incorporated using a weighted fused penalty. Simulation studies show that our method outperforms the competing methods. We also apply our method to identify the differential network between luminal A and basal-like subtypes of breast cancers and the differential network between acute myeloid leukemia tumors and normal samples. The hub genes in the differential networks identified by our method carry out important biological functions.
11
Shalaby MN, Sakoury MMA, Abdi E, Elgamal S, Elrkbwey S, Ramadan W, Taiar R. The Impact of Resistance Training on Gene Expression of IGF1 and Athletes’ Physiological Parameters. Open Access Maced J Med Sci 2021. [DOI: 10.3889/oamjms.2021.7215] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 02/07/2023] Open
Abstract
AIM: The purpose of this study was to verify the effect of an 8-week resistance training program on IGF1, gene expression, and physical performance in male student-athletes.
METHODS: The study population was 20 male students divided into two equal groups. The parameters estimated were IGF1, gene expression, and muscle strength testing. Blood was drawn to verify the concentration of the variables, using kits and the ELISA method in addition to the PCR technique.
RESULTS: The results revealed a significant increase in IGF1, and gene expression differed between students. Furthermore, muscle strength testing revealed significant changes.
CONCLUSION: The results suggest that a resistance training program may impact fitness and muscle strength, as well as anabolic activity, through an IGF1 increase accompanied by varied gene expression.
12
Leng J, Wu LY. Importance-Penalized Joint Graphical Lasso (IPJGL): differential network inference via GGMs. Bioinformatics 2021; 38:770-777. [PMID: 34718410 PMCID: PMC8756181 DOI: 10.1093/bioinformatics/btab751] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/12/2021] [Revised: 10/03/2021] [Accepted: 10/27/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION: Differential network inference is a fundamental and challenging problem to reveal gene interactions and regulation relationships under different conditions. Many algorithms have been developed for this problem; however, they do not consider the differences between the importance of genes, which may not fit the real-world situation. Different genes have different mutation probabilities, and the vital genes associated with basic life activities have less fault tolerance to mutation. Equally treating all genes may bias the results of differential network inference. Thus, it is necessary to consider the importance of genes in the models of differential network inference.
RESULTS: Based on the Gaussian graphical model with adaptive gene importance regularization, we develop a novel Importance-Penalized Joint Graphical Lasso method (IPJGL) for differential network inference. The presented method is validated by the simulation experiments as well as the real datasets. Furthermore, to precisely evaluate the results of differential network inference, we propose a new metric named APC2 for the differential levels of gene pairs. We apply IPJGL to analyze the TCGA colorectal and breast cancer datasets and find some candidate cancer genes with significant survival analysis results, including SOST for colorectal cancer and RBBP8 for breast cancer. We also conduct further analysis based on the interactions in the Reactome database and confirm the utility of our method.
AVAILABILITY AND IMPLEMENTATION: R source code of Importance-Penalized Joint Graphical Lasso is freely available at https://github.com/Wu-Lab/IPJGL.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jiacheng Leng
  - IAM, MADIS, NCMIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
13
Abstract
Cancer is a genetic disease in which multiple genes are perturbed. Thus, information about the regulatory relationships between genes is necessary for the identification of biomarkers and therapeutic targets. In this review, methods for inference of gene regulatory networks (GRNs) from transcriptomics data that are used in cancer research are introduced. The methods are classified into three categories according to the analysis model. The first category includes methods that use pair-wise measures between genes, including correlation coefficient and mutual information. The second category includes methods that determine the genetic regulatory relationship using multivariate measures, which consider the expression profiles of all genes concurrently. The third category includes methods using supervised and integrative approaches. The supervised approach estimates the regulatory relationship using a supervised learning method that constructs a regression or classification model for predicting whether there is a regulatory relationship between genes with input data of gene expression profiles and class labels of prior biological knowledge. The integrative method is an expansion of the supervised method and uses more data and biological knowledge for predicting the regulatory relationship. Furthermore, simulation and experimental validation of the estimated GRNs are also discussed in this review. This review identified that most GRN inference methods are not specific for cancer transcriptome data, and such methods are required for better understanding of cancer pathophysiology. In addition, more systematic methods for validation of the estimated GRNs need to be developed in the context of cancer biology.
14
scLink: Inferring Sparse Gene Co-expression Networks from Single-cell Expression Data. GENOMICS PROTEOMICS & BIOINFORMATICS 2021; 19:475-492. [PMID: 34252628 PMCID: PMC8896229 DOI: 10.1016/j.gpb.2020.11.006] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 08/28/2020] [Revised: 10/23/2020] [Accepted: 12/26/2020] [Indexed: 11/23/2022]
Abstract
A system-level understanding of the regulation and coordination mechanisms of gene expression is essential for studying the complexity of biological processes in health and disease. With the rapid development of single-cell RNA sequencing technologies, it is now possible to investigate gene interactions in a cell type-specific manner. Here we propose the scLink method, which uses statistical network modeling to understand the co-expression relationships among genes and construct sparse gene co-expression networks from single-cell gene expression data. We use both simulation and real data studies to demonstrate the advantages of scLink and its ability to improve single-cell gene network analysis. The scLink R package is available at https://github.com/Vivianstats/scLink.
|
15
|
Clemente-Moreno MJ, Omranian N, Sáez P, Figueroa CM, Del-Saz N, Elso M, Poblete L, Orf I, Cuadros-Inostroza A, Cavieres L, Bravo L, Fernie A, Ribas-Carbó M, Flexas J, Nikoloski Z, Brotman Y, Gago J. Cytochrome respiration pathway and sulphur metabolism sustain stress tolerance to low temperature in the Antarctic species Colobanthus quitensis. THE NEW PHYTOLOGIST 2020; 225:754-768. [PMID: 31489634 DOI: 10.1111/nph.16167] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 05/10/2019] [Accepted: 08/22/2019] [Indexed: 05/28/2023]
Abstract
Understanding the strategies employed by plant species that live in extreme environments offers the possibility to discover stress tolerance mechanisms. We studied the physiological, antioxidant and metabolic responses to three temperature conditions (4, 15, and 23°C) of Colobanthus quitensis (CQ), one of the only two native vascular species in Antarctica. We also employed Dianthus chinensis (DC) to assess the effects of the treatments in a non-Antarctic species from the same family. Using fused LASSO modelling, we associated physiological and biochemical antioxidant responses with primary metabolism. This approach allowed us to highlight the metabolic pathways driving the response specific to CQ. Low temperature imposed dramatic reductions in photosynthesis (up to 88%) but not in respiration (sustaining rates of 3.0-4.2 μmol CO₂ m⁻² s⁻¹) in CQ, and no change in the physiological stress parameters was found. Its notable antioxidant capacity and mitochondrial cytochrome respiratory activity (20 and two times higher than DC, respectively), which ensure ATP production even at low temperature, were significantly associated with sulphur-containing metabolites and polyamines. Our findings potentially open new biotechnological opportunities regarding the role of antioxidant compounds and respiratory mechanisms associated with sulphur metabolism in stress tolerance strategies to low temperature.
Affiliation(s)
- María José Clemente-Moreno
  - Research Group on Plant Biology under Mediterranean Conditions, Instituto de Agroecología y Economía del Agua (INAGEA), Universitat de les Illes Balears (UIB), cta. Valldemossa km 7,5, 07122, Palma de Mallorca, Spain
- Nooshin Omranian
  - Systems Biology and Mathematical Modeling Group, Max-Planck-Institut für Molekulare Pflanzenphysiologie, 14476, Potsdam-Golm, Germany
  - Bioinformatics, Institute of Biochemistry and Biology, University of Potsdam, Karl-Liebknecht-Str. 24-25, 14476, Potsdam, Germany
- Patricia Sáez
  - Laboratorio Cultivo de Tejidos Vegetales, Centro de Biotecnología, Departamento de Silvicultura, Facultad de Ciencias Forestales, Universidad de Concepción, 4030000, Concepción, Chile
- Carlos María Figueroa
  - Instituto de Agrobiotecnología del Litoral, UNL, CONICET, FBCB, 3000, Santa Fe, Argentina
- Néstor Del-Saz
  - Laboratorio de Fisiología Vegetal, Departamento de Botánica, Facultad de Ciencias Naturales y Oceanográficas, Universidad de Concepción, 4030000, Concepción, Chile
- Mhartyn Elso
  - Laboratorio Cultivo de Tejidos Vegetales, Centro de Biotecnología, Departamento de Silvicultura, Facultad de Ciencias Forestales, Universidad de Concepción, 4030000, Concepción, Chile
- Leticia Poblete
  - Laboratorio Cultivo de Tejidos Vegetales, Centro de Biotecnología, Departamento de Silvicultura, Facultad de Ciencias Forestales, Universidad de Concepción, 4030000, Concepción, Chile
- Isabel Orf
  - Department of Life Sciences, Ben Gurion University of the Negev, 8410501, Beer Sheva, Israel
- Lohengrin Cavieres
  - ECOBIOSIS, Departamento de Botánica, Facultad de Ciencias Naturales y Oceanográficas, Universidad de Concepción, 4030000, Concepción, Chile
- León Bravo
  - Laboratorio de Fisiología y Biología Molecular Vegetal, Departamento de Cs. Agronómicas y Recursos Naturales, Facultad de Ciencias Agropecuarias y Forestales, Instituto de Agroindustria, Universidad de La Frontera, Temuco, Chile
  - Center of Plant, Soil Interaction and Natural Resources Biotechnology, Scientific and Technological Bioresource Nucleus, Universidad de La Frontera, 4811230, Temuco, Chile
- Alisdair Fernie
  - Central Metabolism Group, Molecular Physiology Department, Max-Planck-Institut für Molekulare Pflanzenphysiologie, 14476, Golm, Germany
- Miquel Ribas-Carbó
  - Research Group on Plant Biology under Mediterranean Conditions, Instituto de Agroecología y Economía del Agua (INAGEA), Universitat de les Illes Balears (UIB), cta. Valldemossa km 7,5, 07122, Palma de Mallorca, Spain
- Jaume Flexas
  - Research Group on Plant Biology under Mediterranean Conditions, Instituto de Agroecología y Economía del Agua (INAGEA), Universitat de les Illes Balears (UIB), cta. Valldemossa km 7,5, 07122, Palma de Mallorca, Spain
- Zoran Nikoloski
  - Systems Biology and Mathematical Modeling Group, Max-Planck-Institut für Molekulare Pflanzenphysiologie, 14476, Potsdam-Golm, Germany
  - Bioinformatics, Institute of Biochemistry and Biology, University of Potsdam, Karl-Liebknecht-Str. 24-25, 14476, Potsdam, Germany
  - Center of Plant System Biology and Biotechnology (CPSBB), 4000, Plovdiv, Bulgaria
- Yariv Brotman
  - Department of Life Sciences, Ben Gurion University of the Negev, 8410501, Beer Sheva, Israel
- Jorge Gago
  - Research Group on Plant Biology under Mediterranean Conditions, Instituto de Agroecología y Economía del Agua (INAGEA), Universitat de les Illes Balears (UIB), cta. Valldemossa km 7,5, 07122, Palma de Mallorca, Spain
|