1
Corrigendum: The human milk proteome and allergy of mother and child: exploring associations with protein abundances and protein network connectivity. Front Immunol 2023; 14:1276180. [PMID: 37795098 PMCID: PMC10545895 DOI: 10.3389/fimmu.2023.1276180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Accepted: 08/22/2023] [Indexed: 10/06/2023]
Abstract
This corrects the article DOI: 10.3389/fimmu.2022.977470.
2
Identification of shared biological features in four different lung cell lines infected with SARS-CoV-2 virus through RNA-seq analysis. Front Genet 2023; 14:1235927. [PMID: 37662846 PMCID: PMC10468990 DOI: 10.3389/fgene.2023.1235927] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Accepted: 08/02/2023] [Indexed: 09/05/2023]
Abstract
The COVID-19 pandemic caused by SARS-CoV-2 has resulted in millions of confirmed cases and deaths worldwide. Understanding the biological mechanisms of SARS-CoV-2 infection is crucial for the development of effective therapies. This study conducts differential expression (DE) analysis, pathway analysis, and differential network (DN) analysis on RNA-seq data of four lung cell lines, NHBE, A549, A549.ACE2, and Calu3, to identify their common and unique biological features in response to SARS-CoV-2 infection. DE analysis shows that cell line A549.ACE2 has the highest number of DE genes, while cell line NHBE has the lowest. Among the DE genes identified for the four cell lines, 12 genes overlap, and these are associated with various health conditions. The most significant signaling pathways vary among the four cell lines. Only one pathway, "cytokine-cytokine receptor interaction", is significant in all four cell lines; it is related to inflammation and immune response. The DN analysis reveals considerable variation in the differential connectivity of the most significant pathway shared among the four lung cell lines. These findings help to elucidate the mechanisms of SARS-CoV-2 infection and potential therapeutic targets.
3
Differential Co-Abundance Network Analyses for Microbiome Data Adjusted for Clinical Covariates Using Jackknife Pseudo-Values. ARXIV 2023:arXiv:2303.13702v1. [PMID: 36994149 PMCID: PMC10055480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/31/2023]
Abstract
A recent breakthrough in differential network (DN) analysis of microbiome data has been realized with the advent of next-generation sequencing technologies. DN analysis disentangles the microbial co-abundance among taxa by comparing the network properties between two or more graphs under different biological conditions. However, the existing methods for DN analysis of microbiome data do not adjust for other clinical differences between subjects. We propose a Statistical Approach via Pseudo-value Information and Estimation for Differential Network Analysis (SOHPIE-DNA) that incorporates additional covariates such as continuous age and categorical BMI. SOHPIE-DNA is a regression technique adopting jackknife pseudo-values that can be implemented readily for the analysis. We demonstrate through simulations that SOHPIE-DNA consistently reaches higher recall and F1-score, while maintaining similar precision and accuracy to existing methods (NetCoMi and MDiNE). Lastly, we apply SOHPIE-DNA to two real datasets from the American Gut Project and the Diet Exchange Study to showcase its utility. The analysis of the Diet Exchange Study shows that SOHPIE-DNA can also incorporate the temporal change of connectivity of taxa with the inclusion of additional covariates. As a result, our method has found taxa that are related to the prevention of intestinal inflammation and severity of fatigue in advanced metastatic cancer patients.
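The pseudo-value regression idea described in this abstract can be illustrated with a minimal sketch. It is not the SOHPIE-DNA implementation: the connectivity score (mean absolute Pearson correlation), the ordinary least squares fit, and all function names below are assumptions chosen for illustration. Only the jackknife pseudo-value formula, n times the full-sample estimate minus (n - 1) times the leave-one-out estimate, is the standard definition.

```python
import numpy as np


def jackknife_pseudo_values(data, statistic):
    """Return one jackknife pseudo-value per sample (row) for `statistic`.

    pseudo_i = n * theta(all samples) - (n - 1) * theta(all samples except i)
    """
    data = np.asarray(data, dtype=float)
    n = data.shape[0]
    full = np.asarray(statistic(data), dtype=float)
    pseudo = np.empty((n,) + full.shape)
    for i in range(n):
        leave_one_out = np.asarray(statistic(np.delete(data, i, axis=0)), dtype=float)
        pseudo[i] = n * full - (n - 1) * leave_one_out
    return pseudo


def connectivity(data):
    """Hypothetical connectivity score (an assumption, not the paper's choice):
    mean absolute Pearson correlation of each taxon with all other taxa."""
    corr = np.corrcoef(data, rowvar=False)
    np.fill_diagonal(corr, 0.0)
    return np.abs(corr).sum(axis=1) / (corr.shape[0] - 1)


def regress_pseudo_values(pseudo, covariates):
    """Ordinary least squares of each taxon's pseudo-values on the covariates.

    An intercept column is added; the result has shape (1 + n_covariates, n_taxa).
    """
    design = np.column_stack([np.ones(len(covariates)), covariates])
    coef, *_ = np.linalg.lstsq(design, pseudo, rcond=None)
    return coef


# Toy data: 30 subjects, 5 taxa, a binary group label and a continuous age.
rng = np.random.default_rng(0)
abundance = rng.normal(size=(30, 5))
covariates = np.column_stack([rng.integers(0, 2, 30), rng.normal(50, 10, 30)])
pseudo = jackknife_pseudo_values(abundance, connectivity)
beta = regress_pseudo_values(pseudo, covariates)
```

Each row of `beta` after the intercept gives, per taxon, the estimated change in connectivity associated with one covariate, which is how a group effect can be read off while adjusting for the others.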
4
The human milk proteome and allergy of mother and child: Exploring associations with protein abundances and protein network connectivity. Front Immunol 2022; 13:977470. [PMID: 36311719 PMCID: PMC9613325 DOI: 10.3389/fimmu.2022.977470] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2022] [Accepted: 09/23/2022] [Indexed: 11/13/2022]
Abstract
Background: The human milk proteome comprises a vast number of proteins with immunomodulatory functions, but it is not clear how this relates to allergy of the mother or allergy development in the breastfed infant. This study aimed to explore the relation between the human milk proteome and allergy of both mother and child.
Methods: Proteins were analyzed in milk samples from a subset of 300 mother-child dyads from the Canadian CHILD Cohort Study, selected based on maternal and child allergy phenotypes. For this selection, the definition of "allergy" included food allergy, eczema, allergic rhinitis, and asthma. Proteins were analyzed with non-targeted shotgun proteomics using filter-aided sample preparation (FASP) and nanoLC-Orbitrap-MS/MS. Protein abundances, based on label-free quantification, were compared using multiple statistical approaches, including univariate, multivariate, and network analyses.
Results: Using univariate analysis, we observed a trend that milk for infants who develop an allergy by 3 years of age contains higher abundances of immunoglobulin chains, irrespective of the allergy status of the mother. This observation suggests a difference in the milk's immunological potential, which might be related to the development of the infant's immune system. Furthermore, network analysis showed overall increased connectivity of proteins in the milk of allergic mothers and milk for infants who ultimately develop an allergy. This difference in connectivity was especially noted for proteins involved in the protein translation machinery and may be due to the physiological status of the mother, which is reflected in the interconnectedness of proteins in her milk. In addition, it was shown that network analysis complements the other methods for data analysis by revealing complex associations between the milk proteome and mother-child allergy status.
Conclusion: Together, these findings give new insights into how the human milk proteome, through differences in the abundance of individual proteins and protein-protein associations, relates to the allergy status of mother and child. In addition, these results inspire new research directions into the complex interplay of the mother-milk-infant triad and allergy.
5
Identification of Transcription Factors Regulating SARS-CoV-2 Tropism Factor Expression by Inferring Cell-Type-Specific Transcriptional Regulatory Networks in Human Lungs. Viruses 2022; 14:v14040837. [PMID: 35458567 PMCID: PMC9026071 DOI: 10.3390/v14040837] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/03/2022] [Revised: 04/02/2022] [Accepted: 04/05/2022] [Indexed: 02/04/2023]
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the virus that caused the coronavirus disease 2019 (COVID-19) pandemic. Though previous studies have suggested that SARS-CoV-2 cellular tropism depends on the host-cell-expressed proteins, whether transcriptional regulation controls SARS-CoV-2 tropism factors in human lung cells remains unclear. In this study, we used computational approaches to identify transcription factors (TFs) regulating SARS-CoV-2 tropism for different types of lung cells. We constructed transcriptional regulatory networks (TRNs) controlling SARS-CoV-2 tropism factors for healthy donors and COVID-19 patients using lung single-cell RNA-sequencing (scRNA-seq) data. Through differential network analysis, we found that the altered regulatory role of TFs in the same cell types of healthy and SARS-CoV-2-infected networks may be partially responsible for differential tropism factor expression. In addition, we identified the TFs with high centralities from each cell type and proposed currently available drugs that target these TFs as potential candidates for the treatment of SARS-CoV-2 infection. Altogether, our work provides valuable cell-type-specific TRN models for understanding the transcriptional regulation and gene expression of SARS-CoV-2 tropism factors.
6
Inferring Differential Networks by Integrating Gene Expression Data With Additional Knowledge. Front Genet 2021; 12:760155. [PMID: 34858477 PMCID: PMC8632038 DOI: 10.3389/fgene.2021.760155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2021] [Accepted: 10/13/2021] [Indexed: 11/23/2022]
Abstract
Evidence increasingly indicates the involvement of gene network rewiring in disease development and cell differentiation. With the accumulation of high-throughput gene expression data, it is now possible to infer the changes of gene networks between two different states or cell types via computational approaches. However, the distribution diversity of multi-platform gene expression data and the sparseness and high noise rate of single-cell RNA sequencing (scRNA-seq) data raise new challenges for existing differential network estimation methods. Furthermore, most existing methods rely purely on gene expression data and ignore the additional information provided by existing biological knowledge. In this study, to address these challenges, we propose a general framework, named weighted joint sparse penalized D-trace model (WJSDM), to infer differential gene networks by integrating multi-platform gene expression data and multiple sources of prior biological knowledge. First, a non-paranormal graphical model is employed to tackle gene expression data with missing values. Then we propose a weighted group bridge penalty to integrate multi-platform gene expression data and various existing biological knowledge. Experimental results on synthetic data demonstrate the effectiveness of our method in inferring differential networks. We apply our method to the gene expression data of ovarian cancer and the scRNA-seq data of circulating tumor cells of prostate cancer, and infer the differential networks associated with platinum resistance of ovarian cancer and anti-androgen resistance of prostate cancer. By analyzing the estimated differential networks, we gain some important biological insights into the mechanisms underlying platinum resistance of ovarian cancer and anti-androgen resistance of prostate cancer.
7
Brain connectivity alteration detection via matrix-variate differential network model. Biometrics 2021; 77:1409-1421. [PMID: 32829503 PMCID: PMC7900256 DOI: 10.1111/biom.13359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2019] [Revised: 08/10/2020] [Accepted: 08/14/2020] [Indexed: 10/23/2022]
Abstract
Brain functional connectivity reveals the synchronization of brain systems through correlations in neurophysiological measures of brain activities. Growing evidence now suggests that the brain connectivity network experiences alterations with the presence of numerous neurological disorders, thus differential brain network analysis may provide new insights into disease pathologies. The data from neurophysiological measurement are often multidimensional and in a matrix form, posing a challenge in brain connectivity analysis. Existing graphical model estimation methods either assume a vector normal distribution that in essence requires the columns of the matrix data to be independent or fail to address the estimation of differential networks across different populations. To tackle these issues, we propose an innovative matrix-variate differential network (MVDN) model. We exploit the D-trace loss function and a Lasso-type penalty to directly estimate the spatial differential partial correlation matrix and use an alternating direction method of multipliers algorithm for the optimization problem. Theoretical and simulation studies demonstrate that MVDN significantly outperforms other state-of-the-art methods in dynamic differential network analysis. We illustrate with a functional connectivity analysis of an attention deficit hyperactivity disorder dataset. The hub nodes and differential interaction patterns identified are consistent with existing experimental studies.
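A rough sketch of the D-trace idea mentioned in this abstract is given below. It covers only the ordinary vector-valued case, not the matrix-variate MVDN model, and it swaps the ADMM solver for a plain proximal-gradient loop to keep the example short. It assumes the commonly used form of the differential D-trace loss, L(D) = 1/4 (<Sx D, D Sy> + <Sy D, D Sx>) - <D, Sx - Sy>, whose unpenalized minimizer is inv(Sy) - inv(Sx). All function names are hypothetical.

```python
import numpy as np


def dtrace_gradient(delta, cov_x, cov_y):
    """Gradient of the (assumed) differential D-trace loss at `delta`."""
    return 0.5 * (cov_x @ delta @ cov_y + cov_y @ delta @ cov_x) - (cov_x - cov_y)


def soft_threshold(values, threshold):
    """Element-wise soft-thresholding, the proximal operator of the L1 penalty."""
    return np.sign(values) * np.maximum(np.abs(values) - threshold, 0.0)


def fit_differential_network(cov_x, cov_y, lam, n_iter=500):
    """Proximal-gradient estimate of inv(cov_y) - inv(cov_x) with a Lasso penalty.

    The step size is 1 / (largest eigenvalue of cov_x * largest eigenvalue of
    cov_y), an upper bound on the Lipschitz constant of the gradient.
    """
    cov_x = np.asarray(cov_x, dtype=float)
    cov_y = np.asarray(cov_y, dtype=float)
    step = 1.0 / (np.linalg.eigvalsh(cov_x)[-1] * np.linalg.eigvalsh(cov_y)[-1])
    delta = np.zeros_like(cov_x)
    for _ in range(n_iter):
        gradient = dtrace_gradient(delta, cov_x, cov_y)
        delta = soft_threshold(delta - step * gradient, step * lam)
    return delta


# Toy covariance matrices for two conditions.
cov_x = np.array([[2.0, 0.5], [0.5, 1.0]])
cov_y = np.array([[1.0, 0.2], [0.2, 1.5]])
delta_hat = fit_differential_network(cov_x, cov_y, lam=0.05)
```

The appeal of this loss is that the difference of the two precision matrices is estimated directly, so neither precision matrix has to be sparse or estimated on its own. Larger values of `lam` shrink more entries of the estimate to exactly zero.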
8
Disparity-filtered differential correlation network analysis: a case study on CRC metabolomics. J Integr Bioinform 2021; 18:jib-2021-0030. [PMID: 34792303 PMCID: PMC8709737 DOI: 10.1515/jib-2021-0030] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/31/2021] [Accepted: 10/18/2021] [Indexed: 11/15/2022]
Abstract
Differential network analysis has become a widely used technique to investigate changes of interactions among different conditions. Although the relationship between observed interactions and biochemical mechanisms is hard to establish, differential network analysis can provide useful insights about dysregulated pathways and candidate biomarkers. The available methods to detect differential interactions are heterogeneous and often rely on assumptions that are unrealistic in many applications. To address these issues, we develop a novel method for differential network analysis, using the so-called disparity filter as network reduction technique. In addition, we propose a classification model based on the inferred network interactions. The main novelty of this work lies in its ability to preserve connections that are statistically significant with respect to a null model without favouring any resolution scale, as a hard threshold would do, and without Gaussian assumptions. The method was tested using a published metabolomic dataset on colorectal cancer (CRC). Detected hub metabolites were consistent with recent literature and the classifier was able to distinguish CRC from polyp and healthy subjects with great accuracy. In conclusion, the proposed method provides a new simple and effective framework for the identification of differential interaction patterns and improves the biological interpretation of metabolomics data.
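The disparity filter named in this abstract can be sketched as follows. This is the generic filter for weighted undirected networks, not the authors' pipeline: an edge of weight w at a node with degree k and strength s gets the null-model p-value (1 - w/s)^(k - 1), and is kept if it is significant for at least one of its endpoints. The function name, the 0.05 threshold, and the assumption that the input holds non-negative weights (for example, absolute differences in correlation) are illustrative choices.

```python
import numpy as np


def disparity_filter(weights, alpha=0.05):
    """Boolean mask of edges kept by the disparity filter.

    `weights` is a symmetric, non-negative adjacency matrix with a zero diagonal.
    """
    weights = np.asarray(weights, dtype=float)
    degree = (weights > 0).sum(axis=1)
    strength = weights.sum(axis=1)

    # Share of each node's total strength carried by each of its edges.
    with np.errstate(divide="ignore", invalid="ignore"):
        share = np.where(strength[:, None] > 0, weights / strength[:, None], 0.0)

    # Probability of a share at least this large when the node's strength is
    # split uniformly at random over its `degree` edges.
    p_value = (1.0 - share) ** (degree[:, None] - 1)

    significant = (p_value < alpha) & (weights > 0)
    return significant | significant.T


# Node 0 has one dominant edge (to node 1) and two weak ones.
toy = np.array([
    [0.0, 10.0, 1.0, 1.0],
    [10.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
])
kept = disparity_filter(toy)
```

Because each edge is judged relative to its own node's strength, weak but locally dominant edges can survive where a single global weight threshold would remove them, which matches the abstract's point about not favouring any resolution scale.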
9
Assisted differential network analysis for gene expression data. Genet Epidemiol 2021; 45:604-620. [PMID: 34174112 PMCID: PMC8376770 DOI: 10.1002/gepi.22419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2021] [Revised: 05/10/2021] [Accepted: 05/17/2021] [Indexed: 11/12/2022]
Abstract
In the analysis of gene expression data, when there are two or more disease conditions/groups (e.g., diseased and normal, responder and nonresponder, and multiple stages/subtypes), differential analysis has been extensively conducted to identify key differences and has important implications. Network analysis takes a system perspective and can be more informative than analysis limited to simple statistics such as mean and variance. In differential network analysis, a common practice is to first estimate a gene expression network for each condition/group, and then apply spectral clustering to the network difference(s) to identify key genes and biological mechanisms that lead to the differences. Compared to "simple" analysis such as regression, differential network analysis can be more challenging because of the significantly larger number of parameters. In this study, taking advantage of the increasing popularity of multidimensional profiling data, we develop an assisted analysis strategy and propose incorporating regulator information to improve the identification of key genes (that lead to the differences in gene expression networks). An effective computational algorithm is developed. Comprehensive simulation is conducted, showing that the proposed approach can outperform the benchmark alternatives in identification accuracy. With The Cancer Genome Atlas lung adenocarcinoma data, we analyze the expressions of genes in the KEGG cell cycle pathway, assisted by copy number variation data. The proposed assisted analysis leads to identification results similar to the alternatives but different estimations. Overall, this study can deliver an efficient and cost-effective way of improving differential network analysis.
10
The Analysis of Gene Expression Data Incorporating Tumor Purity Information. Front Genet 2021; 12:642759. [PMID: 34497631 PMCID: PMC8419469 DOI: 10.3389/fgene.2021.642759] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2020] [Accepted: 07/30/2021] [Indexed: 12/03/2022]
Abstract
The tumor microenvironment is composed of tumor cells, stroma cells, immune cells, blood vessels, and other associated non-cancerous cells. Gene expression measurements on tumor samples are an average over cells in the microenvironment. However, research questions often seek answers about tumor cells rather than the surrounding non-tumor tissue. Previous studies have suggested that the tumor purity (TP), the proportion of tumor cells in a solid tumor sample, has a confounding effect on differential expression (DE) analysis of high vs. low survival groups. We investigate three ways of incorporating the TP information in two statistical methods used for analyzing gene expression data, namely, differential network (DN) analysis and DE analysis. Analysis 1 ignores the TP information completely, Analysis 2 uses a truncated sample by removing the low TP samples, and Analysis 3 uses TP as a covariate in the underlying statistical models. We use three gene expression data sets related to three different cancers from The Cancer Genome Atlas (TCGA) for our investigation. In all three cancer datasets, the networks from Analysis 2 show more differential connectivity between the two networks than those from Analysis 1. Similarly, Analysis 1 identified more differentially expressed genes than Analysis 2. Results of DN and DE analyses using Analysis 3 were mostly consistent with those of Analysis 1 across the three cancers. However, Analysis 3 identified additional cancer-related genes in both DN and DE analyses. Our findings suggest that using TP as a covariate in a linear model is appropriate for DE analysis, but a more robust model is needed for DN analysis. However, because true DN or DE patterns are not known for the empirical datasets, simulated datasets can be used to study the statistical properties of these methods in future studies.
11
SeqNet: An R Package for Generating Gene-Gene Networks and Simulating RNA-Seq Data. J Stat Softw 2021; 98:10.18637/jss.v098.i12. [PMID: 34321962 PMCID: PMC8315007 DOI: 10.18637/jss.v098.i12] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/12/2022]
Abstract
Gene expression data provide an abundant resource for inferring connections in gene regulatory networks. While methodologies developed for this task have shown success, a challenge remains in comparing the performance among methods. Gold-standard datasets are scarce and limited in use. And while tools for simulating expression data are available, they are not designed to resemble the data obtained from RNA-seq experiments. SeqNet is an R package that provides tools for generating a rich variety of gene network structures and simulating RNA-seq data from them. This produces in silico RNA-seq data for benchmarking and assessing gene network inference methods. The package is available on CRAN and on GitHub at https://github.com/tgrimes/SeqNet.
12
Predicting Brain Regions Related to Alzheimer's Disease Based on Global Feature. Front Comput Neurosci 2021; 15:659838. [PMID: 34093157 PMCID: PMC8175859 DOI: 10.3389/fncom.2021.659838] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/28/2021] [Accepted: 04/22/2021] [Indexed: 11/15/2022]
Abstract
Alzheimer's disease (AD) is a neurodegenerative disease that commonly affects the elderly; early diagnosis and timely treatment are very important to delay the course of the disease. In the past, most brain regions related to AD were identified based on imaging methods, and only some atrophic brain regions could be identified. In this work, the authors used mathematical models to identify the potential brain regions related to AD. In this study, 20 patients with AD and 13 healthy controls (non-AD) were recruited by the neurology outpatient department or the neurology ward of Peking University First Hospital from September 2017 to March 2019. First, diffusion tensor imaging (DTI) was used to construct the brain structural network. Next, the authors defined a new local feature index, 2hop-connectivity, to measure the correlation between different regions. Compared with traditional graph theory indices, 2hop-connectivity exploits the higher-order information of the graph structure. For this purpose, the authors proposed a novel algorithm called 2hopRWR to measure 2hop-connectivity. Then, a new index, the global feature score (GFS), was proposed by combining five local features, namely degree centrality, betweenness centrality, closeness centrality, the number of maximal cliques, and 2hop-connectivity, to judge which brain regions are related to AD. As a result, the top ten brain regions identified using the GFS scoring difference between the AD and the non-AD groups were associated with AD by literature verification. The results of the literature validation comparing GFS with the local features showed that GFS was superior to individual local features. Finally, the results of the canonical correlation analysis showed that the GFS was significantly correlated with the scores of the Mini-Mental State Examination (MMSE) scale and the Montreal Cognitive Assessment (MoCA) scale. Therefore, the authors believe the GFS can also be used as a new biomarker to assist in diagnosis and objective monitoring of disease progression. In addition, the method proposed in this paper can be used as a differential network analysis method for network analysis in other domains.
13
Comparing Statistical Tests for Differential Network Analysis of Gene Modules. Front Genet 2021; 12:630215. [PMID: 34093641 PMCID: PMC8170128 DOI: 10.3389/fgene.2021.630215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2020] [Accepted: 04/19/2021] [Indexed: 11/13/2022]
Abstract
Genes often work together to perform complex biological processes, and "networks" provide a versatile framework for representing the interactions between multiple genes. Differential network analysis (DiNA) quantifies how this network structure differs between two or more groups/phenotypes (e.g., disease subjects and healthy controls), with the goal of determining whether differences in network structure can help explain differences between phenotypes. In this paper, we focus on gene co-expression networks, although in principle, the methods studied can be used for DiNA for other types of features (e.g., metabolome, epigenome, microbiome, proteome, etc.). Three common applications of DiNA involve (1) testing whether the connections to a single gene differ between groups, (2) testing whether the connection between a pair of genes differs between groups, or (3) testing whether the connections within a "module" (a subset of 3 or more genes) differ between groups. This article focuses on the third application, as there is a lack of studies comparing statistical methods for identifying differentially co-expressed modules (DCMs). Through extensive simulations, we compare several previously proposed test statistics and a new p-norm difference test (PND). We demonstrate that the true positive rate of the proposed PND test is competitive with and often higher than the other methods, while controlling the false positive rate. The R package discoMod (differentially co-expressed modules) implements the proposed method and provides a full pipeline for identifying DCMs: clustering tools to derive gene modules, tests to identify DCMs, and methods for visualizing the results.
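To make the module-level test concrete, here is a minimal sketch of a p-norm difference statistic with a permutation p-value. The abstract does not give the exact PND definition or the discoMod interface, so everything below is an assumption: the statistic is taken to be the p-norm of the differences between the two groups' within-module Pearson correlations, the null distribution comes from shuffling group labels, and the function names are hypothetical.

```python
import numpy as np


def pnorm_difference(group1, group2, p=2):
    """p-norm of the differences between the two groups' gene-gene correlations.

    Each input is a samples-by-genes array restricted to the genes of one module.
    Only the upper triangle is used, so each gene pair is counted once.
    """
    corr1 = np.corrcoef(group1, rowvar=False)
    corr2 = np.corrcoef(group2, rowvar=False)
    upper = np.triu_indices_from(corr1, k=1)
    return float(np.sum(np.abs(corr1[upper] - corr2[upper]) ** p) ** (1.0 / p))


def permutation_pvalue(group1, group2, p=2, n_perm=999, seed=0):
    """Observed statistic and a permutation p-value from shuffled group labels."""
    rng = np.random.default_rng(seed)
    observed = pnorm_difference(group1, group2, p)
    pooled = np.vstack([group1, group2])
    n1 = len(group1)
    exceed = 0
    for _ in range(n_perm):
        order = rng.permutation(len(pooled))
        permuted = pnorm_difference(pooled[order[:n1]], pooled[order[n1:]], p)
        exceed += int(permuted >= observed)
    # The +1 terms count the observed labelling as one of the permutations.
    return observed, (exceed + 1) / (n_perm + 1)


# Toy module of 4 genes: correlated in the cases, independent in the controls.
rng = np.random.default_rng(42)
shared = rng.normal(size=(40, 1))
cases = shared + 0.5 * rng.normal(size=(40, 4))
controls = rng.normal(size=(40, 4))
statistic, p_value = permutation_pvalue(cases, controls, n_perm=199)
```

A small p-value suggests that the module's co-expression pattern differs between the groups. Larger values of `p` put more weight on the few gene pairs with the biggest correlation change.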
14
WDNE: an integrative graphical model for inferring differential networks from multi-platform gene expression data with missing values. Brief Bioinform 2021; 22:6272792. [PMID: 33975339 DOI: 10.1093/bib/bbab086] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/03/2021] [Revised: 02/14/2021] [Accepted: 02/23/2021] [Indexed: 11/14/2022]
Abstract
The mechanisms controlling biological processes, such as the development of disease or cell differentiation, can be investigated by examining changes in the networks of gene dependencies between states in the process. High-throughput experimental methods, like microarray and RNA sequencing, have been widely used to gather gene expression data, which paves the way to infer gene dependencies based on computational methods. However, most differential network analysis methods are designed to deal with fully observed data, whereas missing values, such as the dropout events in single-cell RNA-sequencing data, are frequent. New methods are needed to take account of these missing values. Moreover, since the changes of gene dependencies may be driven by certain perturbed genes, considering the changes in gene expression levels may promote the identification of gene network rewiring. In this study, a novel weighted differential network estimation (WDNE) model is proposed to handle multi-platform gene expression data with missing values and take account of changes in gene expression levels. Simulation studies demonstrate that WDNE outperforms state-of-the-art differential network estimation methods. When WDNE is applied to infer differential gene networks associated with drug resistance in ovarian tumors, cell differentiation, and breast tumor heterogeneity, the hub genes in the estimated differential gene networks can provide important insights into the underlying mechanisms. Furthermore, a Matlab toolbox, differential network analysis toolbox, was developed to implement the WDNE model and visualize the estimated differential networks.
15
A Novel Method to Identify the Differences Between Two Single Cell Groups at Single Gene, Gene Pair, and Gene Module Levels. Front Genet 2021; 12:648898. [PMID: 33790951 PMCID: PMC8005607 DOI: 10.3389/fgene.2021.648898] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/02/2021] [Accepted: 02/15/2021] [Indexed: 11/13/2022]
Abstract
Single-cell sequencing technology can not only reveal the heterogeneity of cells from a molecular perspective, but also discover new cell types. Although there are many effective methods for dropout imputation, cell clustering, and lineage reconstruction based on single-cell RNA sequencing (RNA-seq) data, there is no systematic pipeline for comparing two single-cell clusters at the molecular level. In this study, we present a novel pipeline for comparing two single-cell clusters, including calling differential gene expression, coexpression network modules, and so on. The pipeline could reveal mechanisms behind the biological difference between cell clusters and cell types, and identify cell-type-specific molecular mechanisms. We applied the pipeline to two well-known single-cell datasets, Usoskin from mouse brain and Xin from human pancreas, which contained 622 and 1,600 cells, respectively, each composed of four types of cells. As a result, we identified many significant differential genes, differential gene coexpression, and network modules among the cell clusters, which confirmed that different cell clusters might perform different functions.
16
Construction of a disease-specific lncRNA-miRNA-mRNA regulatory network reveals potential regulatory axes and prognostic biomarkers for hepatocellular carcinoma. Cancer Med 2020; 9:9219-9235. [PMID: 33232580 PMCID: PMC7774738 DOI: 10.1002/cam4.3526] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/17/2020] [Revised: 08/14/2020] [Accepted: 09/21/2020] [Indexed: 01/04/2023]
Abstract
Hepatocellular carcinoma (HCC) is a heterogeneous malignancy with a high incidence and poor prognosis. Exploration of the underlying mechanisms and effective prognostic indicators is conducive to clinical management and optimization of treatment. The RNA-seq and clinical phenotype data of HCC were retrieved from The Cancer Genome Atlas (TCGA), and differential expression analysis was performed. Then, a differential lncRNA-miRNA-mRNA regulatory network was constructed, and the key genes were further identified and validated. By integrating this network with the online tool-based ceRNA network, an HCC-specific ceRNA network was obtained, and lncRNA-miRNA-mRNA regulatory axes were extracted. RNAs associated with prognosis were further obtained, and multivariate Cox regression models were established to identify the prognostic signature and nomogram. As a result, 198 DElncRNAs, 120 DEmiRNAs, and 2827 DEmRNAs were identified, and 30 key genes identified from the differential network were enriched in four cancer-related pathways. Four HCC-specific lncRNA-miRNA-mRNA regulatory axes were extracted, and SNHG11, CRNDE, MYLK-AS1, E2F3, and CHEK1 were found to be related to HCC prognosis. Multivariate Cox regression analysis identified a prognostic signature, comprising CRNDE, MYLK-AS1, and CHEK1, for overall survival (OS) of HCC. A nomogram comprising the prognostic signature and pathological stage was established and showed some net clinical benefits. For 1-year, 3-year, and 5-year survival, the AUC of the prognostic signature was 0.777 (0.657-0.865), 0.722 (0.640-0.848), and 0.630 (0.528-0.823), respectively, and the AUC of the nomogram was 0.751 (0.664-0.870), 0.773 (0.707-0.849), and 0.734 (0.638-0.845), respectively. These results provided clues for the study of potential biomarkers and therapeutic targets for HCC. In addition, the obtained 30 key genes and 4 regulatory axes might also help elucidate the underlying mechanism of HCC.
|
17
|
DNF: A differential network flow method to identify rewiring drivers for gene regulatory networks. Neurocomputing 2020; 410:202-210. [PMID: 34025035 PMCID: PMC8139126 DOI: 10.1016/j.neucom.2020.05.028] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 02/08/2023]
Abstract
Differential network analysis has become an important approach in identifying driver genes in development and disease. However, most studies capture only local features of the underlying gene-regulatory network topology. These approaches are vulnerable to noise and other changes which mask driver-gene activity. Therefore, methods are urgently needed which can separate the impact of true regulatory elements from stochastic changes and downstream effects. We propose the differential network flow (DNF) method to identify key regulators of progression in development or disease. Given the network representation of consecutive biological states, DNF quantifies the essentiality of each node by differences in the distribution of network flow, which are capable of capturing comprehensive topological differences from local to global feature domains. DNF achieves more accurate driver-gene identification than other state-of-the-art methods when applied to four human datasets from The Cancer Genome Atlas and three single-cell RNA-seq datasets of murine neural and hematopoietic differentiation. Furthermore, we predict key regulators of crosstalk between separate networks underlying both neuronal differentiation and the progression of neurodegenerative disease, among which APP is predicted as a driver gene of neural stem cell differentiation. Our method is a new approach for quantifying the essentiality of genes across networks of different biological states.
|
18
|
System-Based Differential Gene Network Analysis for Characterizing a Sample-Specific Subnetwork. Biomolecules 2020; 10:biom10020306. [PMID: 32075209 PMCID: PMC7072632 DOI: 10.3390/biom10020306] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 01/17/2020] [Revised: 02/03/2020] [Accepted: 02/08/2020] [Indexed: 12/18/2022] Open
Abstract
Gene network estimation is a key method for understanding fundamental cellular systems from high-throughput omics data. However, existing gene network analysis relies on a sufficient number of samples and must handle a huge number of nodes and estimated edges, which remain difficult to interpret, especially when trying to discover the clinically relevant portions of the network. Here, we propose a novel method to extract a biomedically significant subnetwork using a Bayesian network, a type of unsupervised machine learning method that can be used as an explainable and interpretable artificial intelligence algorithm. Our method quantifies sample-specific networks using our proposed Edge Contribution value (ECv) based on the estimated system, which enables condition-specific subnetwork extraction using a limited number of samples. We applied this method to an Epithelial-Mesenchymal Transition (EMT) data set that is related to the process of metastasis and thus prognosis in cancer biology. We established our method-driven EMT network representing putative gene interactions. Furthermore, we found that the sample-specific ECv patterns of this EMT network can characterize the survival of lung cancer patients. These results show that our method unveils explainable network differences in biological and clinical features through artificial intelligence technology.
|
19
|
Open Data for Differential Network Analysis in Glioma. Int J Mol Sci 2020; 21:E547. [PMID: 31952211 PMCID: PMC7013918 DOI: 10.3390/ijms21020547] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 10/27/2019] [Revised: 12/29/2019] [Accepted: 01/03/2020] [Indexed: 12/20/2022] Open
Abstract
The complexity of cancer diseases demands bioinformatic techniques and translational research based on big data and personalized medicine. Open data enables researchers to accelerate cancer studies, save resources and foster collaboration. Several tools and programming approaches are available for analyzing data, including annotation, clustering, comparison and extrapolation, merging, enrichment, functional association and statistics. We exploit openly available data via cancer gene expression analysis, apply refinement as well as enrichment analysis via gene ontology, and conclude with graph-based visualization of the involved protein interaction networks as a basis for signaling. The different databases allowed for the construction of huge networks or of more specific ones consisting of high-confidence interactions only. Several genes associated with glioma were isolated via a network analysis from top hub nodes as well as from an outlier analysis. The latter approach highlights a mitogen-activated protein kinase, a member of the histone deacetylases, and a protein phosphatase as genes uncommonly associated with glioma. Cluster analysis from top hub nodes shows that several identified glioma-associated gene products function within protein complexes, including epidermal growth factors as well as cell cycle proteins and RAS proto-oncogenes. By using selected exemplary tools and open-access resources for cancer research and differential network analysis, we highlight disturbed signaling components in brain cancer subtypes of glioma.
|
20
|
Integrated Univariate, Multivariate, and Correlation-Based Network Analyses Reveal Metabolite-Specific Effects on Bacterial Growth and Biofilm Formation in Necrotizing Soft Tissue Infections. J Proteome Res 2020; 19:688-698. [PMID: 31833369 DOI: 10.1021/acs.jproteome.9b00565] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 12/17/2022]
Abstract
Necrotizing soft-tissue infections (NSTIs) have multiple causes, risk factors, anatomical locations, and pathogenic mechanisms. In patients with NSTI, circulating metabolites may serve as substrates that affect bacterial adaptation at the site of infection. Metabolic signatures associated with NSTI may be useful as diagnostic and prognostic markers and as novel targets for therapy. This study used untargeted metabolomics analyses of plasma from NSTI patients (n = 34) and healthy (noninfected) controls (n = 24) to identify the metabolic signatures and connectivity patterns among metabolites associated with NSTI. Metabolite-metabolite association networks were employed to compare the metabolic profiles of NSTI patients and noninfected surgical controls. Out of 97 metabolites detected, the abundance of 33 was significantly altered in NSTI patients. Analysis of metabolite-metabolite association networks showed a more densely connected network, with 20 metabolites differentially connected between NSTI patients and controls. A selected set of significantly altered metabolites was tested in vitro to investigate their potential influence on the growth and biofilm formation of an NSTI group A streptococcal strain. Chemically defined media supplemented with the selected metabolites (ornithine, ribose, urea, and glucuronic acid) revealed metabolite-specific effects on both bacterial growth and biofilm formation. This study identifies for the first time an NSTI-specific metabolic signature with implications for optimized diagnostics and therapies.
|
21
|
Reconstruction of Cell-type-Specific Interactomes at Single-Cell Resolution. Cell Syst 2019; 9:559-568.e4. [PMID: 31786210 PMCID: PMC6943823 DOI: 10.1016/j.cels.2019.10.007] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 03/22/2019] [Revised: 07/13/2019] [Accepted: 10/22/2019] [Indexed: 01/03/2023]
Abstract
The human interactome is instrumental in the systems-level study of the cell and the contextualization of disease-associated gene perturbations. However, reference organismal interactomes do not capture the cell-type-specific context in which proteins and modules preferentially act. Here, we introduce SCINET, a computational framework that reconstructs an ensemble of cell-type-specific interactomes by integrating a global, context-independent reference interactome with a single-cell gene-expression profile. SCINET addresses technical challenges of single-cell data by robustly imputing, transforming, and normalizing the initially noisy and sparse expression data. Inferred cell-level gene interaction probabilities and group-level interaction strengths define cell-type-specific interactomes. We use SCINET to reconstruct and analyze interactomes of the major human brain and immune cell types, revealing specificity and modularity of perturbations associated with neurodegenerative, neuropsychiatric, and autoimmune disorders. We report cell-type interactomes for brain and immune cell types, together with the SCINET package.
|
22
|
BioNetStat: A Tool for Biological Networks Differential Analysis. Front Genet 2019; 10:594. [PMID: 31293621 PMCID: PMC6598498 DOI: 10.3389/fgene.2019.00594] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 02/21/2019] [Accepted: 06/05/2019] [Indexed: 01/25/2023] Open
Abstract
The study of interactions among biological components can be carried out by using methods grounded in network theory. Most of these methods focus on the comparison of two biological networks (e.g., control vs. disease). However, biological systems often present more than two biological states (e.g., tumor grades). To compare two or more networks simultaneously, we developed BioNetStat, a Bioconductor package with a user-friendly graphical interface. BioNetStat compares correlation networks based on the probability distribution of a feature of the graph (e.g., centrality measures). The analysis of structural alterations in the network reveals significant modifications in the system. For example, the analysis of centrality measures provides information about how the relevance of the nodes changes among the biological states. We evaluated the performance of BioNetStat in both toy models and two case studies, the latter related to gene expression of tumor cells and to plant metabolism. Results based on simulated scenarios suggest that the statistical power of BioNetStat is less sensitive to an increase in the number of networks than that of Gene Set Coexpression Analysis (GSCA). Also, besides being able to identify nodes with modified centralities, BioNetStat identified altered networks associated with signaling pathways that were not identified by other methods.
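The general idea of comparing correlation networks across more than two states through a centrality measure can be illustrated with a minimal sketch. This is not BioNetStat's implementation or API: the function names, the 0.7 correlation threshold, the use of degree as the centrality, the distance-to-mean statistic, and the simulated data are all assumptions made for illustration. A label-permutation test asks whether the per-state degree vectors differ more than expected by chance.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def degree_vector(samples, threshold=0.7):
    """Degree of each variable in the thresholded |correlation| network.

    samples: list of rows, one per sample, each a list of variable values.
    """
    n_vars = len(samples[0])
    cols = [[row[j] for row in samples] for j in range(n_vars)]
    deg = [0] * n_vars
    for i in range(n_vars):
        for j in range(i + 1, n_vars):
            if abs(pearson(cols[i], cols[j])) >= threshold:
                deg[i] += 1
                deg[j] += 1
    return deg


def group_statistic(samples, labels, threshold=0.7):
    """Mean Euclidean distance of each state's degree vector to the average vector."""
    groups = sorted(set(labels))
    degs = [
        degree_vector([s for s, l in zip(samples, labels) if l == g], threshold)
        for g in groups
    ]
    n_vars = len(degs[0])
    mean = [sum(d[i] for d in degs) / len(degs) for i in range(n_vars)]
    dists = [sqrt(sum((d[i] - mean[i]) ** 2 for i in range(n_vars))) for d in degs]
    return sum(dists) / len(dists)


def permutation_test(samples, labels, n_perm=200, threshold=0.7, seed=0):
    """Permute state labels to obtain an empirical p-value for the statistic."""
    rng = random.Random(seed)
    observed = group_statistic(samples, labels, threshold)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if group_statistic(samples, shuffled, threshold) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Simulated data (hypothetical): three states with different wiring among x, y, z.
data_rng = random.Random(1)
samples, labels = [], []
for state in ("A", "B", "C"):
    for _ in range(10):
        x = data_rng.gauss(0, 1)
        noise = data_rng.gauss(0, 0.1)
        y = x + noise if state == "A" else data_rng.gauss(0, 1)
        z = x + noise if state == "B" else data_rng.gauss(0, 1)
        samples.append([x, y, z])
        labels.append(state)

observed, p_value = permutation_test(samples, labels)
print(f"statistic={observed:.3f}, permutation p-value={p_value:.3f}")
```

Permuting labels keeps the group sizes fixed, so the null distribution reflects only the reassignment of samples to states. Other centralities or distances between feature distributions could be substituted for the degree-based statistic used here.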
|
23
|
Abstract
Identification of differential gene regulators with significant changes under disparate conditions is essential to understand the complex biological mechanisms of a disease. Differential Network Analysis (DiNA) examines different biological processes based on gene regulatory networks that represent regulatory interactions between genes with a graph model. While most studies in DiNA have used correlation-based inference to construct gene regulatory networks from gene expression data because of its intuitive representation and simple implementation, that approach cannot represent causal effects or multivariate effects between genes. In this paper, we propose an approach named Differential Gene Regulatory Network (DiffGRN) that infers differential gene regulation between two groups. We infer the gene regulatory networks of the two groups using Random LASSO, and then we identify differential gene regulations with the proposed significance test. The advantages of DiffGRN are that it captures multivariate effects of genes that regulate a gene simultaneously, identifies the causality of gene regulations, and discovers differential gene regulators between regression-based gene regulatory networks. We assessed DiffGRN in simulation experiments and showed that it outperforms the current state-of-the-art correlation-based method, DINGO. DiffGRN was then applied to gene expression data in asthma. The DiNA of the asthma data recovered a number of gene regulations reported in the biological literature, such as those involving ADAM12 and RELB.
|
24
|
Abstract
Differential network analysis (DiNA) denotes a recent class of network-based bioinformatics algorithms which focus on the differences in network topologies between two states of a cell, such as healthy and disease, to identify key players in the discriminating biological processes. In contrast to conventional differential analysis, DiNA identifies changes in the interplay between molecules, rather than changes in single molecules. This ability is especially important in cases where effectors are changed, e.g. mutated, but their expression is not. A number of different DiNA approaches have been proposed, yet a comparative assessment of their performance in different settings is still lacking. In this paper, we evaluate 10 different DiNA algorithms regarding their ability to recover genetic key players from transcriptome data. We construct high-quality regulatory networks and enrich them with co-expression data from four different types of cancer. Next, we assess the results of applying DiNA algorithms on these data sets using a gold standard list (GSL). We find that local DiNA algorithms are generally superior to global algorithms, and that all DiNA algorithms outperform conventional differential expression analysis. We also assess the ability of DiNA methods to exploit additional knowledge in the underlying cellular networks. To this end, we enrich the cancer-type specific networks with known regulatory miRNAs and compare the algorithms' performance in networks with and without miRNA. We find that including miRNAs consistently and considerably improves the performance of almost all tested algorithms. Our results underline the advantages of comprehensive cell models for the analysis of -omics data.
|
25
|
Plasma and Serum Metabolite Association Networks: Comparability within and between Studies Using NMR and MS Profiling. J Proteome Res 2017; 16:2547-2559. [PMID: 28517934 PMCID: PMC5645760 DOI: 10.1021/acs.jproteome.7b00106] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Indexed: 11/28/2022]
Abstract
Blood is one of the most used biofluids in metabolomics studies, and the serum and plasma fractions are routinely used as a proxy for blood itself. Here we investigated the association networks of an array of 29 metabolites identified and quantified via NMR in the plasma and serum samples of two cohorts of ∼1000 healthy blood donors each. A second study of 377 individuals was used to extract plasma and serum samples from the same individual on which a set of 122 metabolites were detected and quantified using FIA–MS/MS. Four different inference algorithms (ARACNE, CLR, CORR, and PCLRC) were used to obtain consensus networks. The plasma and serum networks obtained from different studies showed different topological properties with the serum network being more connected than the plasma network. On a global level, metabolite association networks from plasma and serum fractions obtained from the same blood sample of healthy people show similar topologies, and at a local level, some differences arise like in the case of amino acids.
|
26
|
Differential network analysis reveals dysfunctional regulatory networks in gastric carcinogenesis. Am J Cancer Res 2015; 5:2605-2625. [PMID: 26609471 PMCID: PMC4633893] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2015] [Accepted: 08/04/2015] [Indexed: 06/05/2023] Open
Abstract
Gastric carcinoma is one of the most common cancers in the world. A large number of differentially expressed genes have been identified as being associated with gastric cancer progression; however, little is known about the underlying regulatory mechanisms. To address this problem, we developed a differential networking approach that combines a nascent methodology, differential coexpression analysis (DCEA), with two novel quantitative methods for differential regulation analysis. We first applied DCEA to a gene expression dataset of gastric normal mucosa, adenoma and carcinoma samples to identify gene interconnection changes during cancer progression, based on which we inferred normal-, adenoma-, and carcinoma-specific gene regulation networks using a linear regression model. Cancer genes and drug targets were enriched in each network. To investigate the dynamic changes of gene regulation during carcinogenesis, we then designed two quantitative methods to prioritize differentially regulated genes (DRGs) and gene pairs or links (DRLs) between adjacent stages. Known cancer genes and drug targets were ranked significantly higher. The top 4% normal vs. adenoma DRGs (36 genes) and top 6% adenoma vs. carcinoma DRGs (56 genes) proved to be worthy of further investigation to explore their association with gastric cancer. Out of the 16 DRGs in the two top-10 DRG lists of the normal vs. adenoma and adenoma vs. carcinoma comparisons, 15 have been reported to be gastric cancer or cancer related. Based on our inferred differential networking information and known signaling pathways, we generated testable hypotheses on the roles of GATA6, ESRRG and their signaling pathways in gastric carcinogenesis.
Compared with established approaches which build genome-scale GRNs, or sub-networks around differentially expressed genes, the present one proved to be better at enriching cancer genes and drug targets, and prioritizing disease-related genes on the dataset we considered. We propose this extendable differential networking framework as a promising way to gain insights into gene regulatory mechanisms underlying cancer progression and other phenotypic changes.
|
27
|
Differential metabolic and coexpression networks of plant metabolism. Trends Plant Sci 2015; 20:266-268. [PMID: 25791509 DOI: 10.1016/j.tplants.2015.02.002] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 11/10/2014] [Revised: 02/04/2015] [Accepted: 02/19/2015] [Indexed: 05/14/2023]
Abstract
Recent analyses have demonstrated that plant metabolic networks do not differ in their structural properties and that genes involved in basic metabolic processes show smaller coexpression than genes involved in specialized metabolism. By contrast, our analysis reveals differences in the structure of plant metabolic networks and patterns of coexpression for genes in (non)specialized metabolism. Here we caution that conclusions concerning the organization of plant metabolism based on network-driven analyses strongly depend on the computational approaches used.
|
28
|
Comparative study of computational methods for reconstructing genetic networks of cancer-related pathways. Cancer Inform 2014; 13:55-66. [PMID: 25288880 PMCID: PMC4179645 DOI: 10.4137/cin.s13781] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 03/17/2014] [Revised: 05/08/2014] [Accepted: 05/10/2014] [Indexed: 12/16/2022] Open
Abstract
Network reconstruction is an important yet challenging task in systems biology. While many methods have been recently proposed for reconstructing biological networks from diverse data types, properties of estimated networks and differences between reconstruction methods are not well understood. In this paper, we conduct a comprehensive empirical evaluation of seven existing network reconstruction methods, by comparing the estimated networks with different sparsity levels for both normal and tumor samples. The results suggest substantial heterogeneity in networks reconstructed using different reconstruction methods. Our findings also provide evidence for significant differences between networks of normal and tumor samples, even after accounting for the considerable variability in structures of networks estimated using different reconstruction methods. These differences can offer new insight into changes in mechanisms of genetic interaction associated with cancer initiation and progression.
|
29
|
Conservation and evolution of gene coexpression networks in human and chimpanzee brains. Proc Natl Acad Sci U S A 2006; 103:17973-8. [PMID: 17101986 PMCID: PMC1693857 DOI: 10.1073/pnas.0605938103] [Citation(s) in RCA: 413] [Impact Index Per Article: 22.9] [Received: 07/13/2006] [Indexed: 12/28/2022] Open
Abstract
Comparisons of gene expression between human and non-human primate brains have identified hundreds of differentially expressed genes, yet translating these lists into key functional distinctions between species has proved difficult. Here we provide a more integrated view of human brain evolution by examining the large-scale organization of gene coexpression networks in human and chimpanzee brains. We identify modules of coexpressed genes that correspond to discrete brain regions and quantify their conservation between the species. Module conservation in cerebral cortex is significantly weaker than module conservation in subcortical brain regions, revealing a striking gradient that parallels known evolutionary hierarchies. We introduce a method for identifying species-specific network connections and demonstrate how differential network connectivity can be used to identify key drivers of evolutionary change. By integrating our results with comparative genomic sequence data and estimates of protein sequence divergence rates, we confirm a number of network predictions and validate these findings. Our results provide insights into the molecular bases of primate brain organization and demonstrate the general utility of weighted gene coexpression network analysis.
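Differential network connectivity in a weighted coexpression setting can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' pipeline: the soft-threshold power `beta=6`, the scaling of each connectivity by the network maximum, the helper names, and the expression values are all hypothetical. Each gene's connectivity is the sum of its adjacencies |cor|^beta to all other genes, and the difference of scaled connectivities flags genes that are wired differently between the two networks.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def connectivity(expr, beta=6):
    """Weighted connectivity per gene: sum of |cor|**beta over all other genes.

    expr: dict mapping gene name -> list of expression values across samples.
    """
    genes = list(expr)
    k = {g: 0.0 for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            adjacency = abs(pearson(expr[g], expr[h])) ** beta
            k[g] += adjacency
            k[h] += adjacency
    return k


def differential_connectivity(expr_a, expr_b, beta=6):
    """Difference of connectivities, each scaled by its network's maximum."""
    ka = connectivity(expr_a, beta)
    kb = connectivity(expr_b, beta)
    max_a = max(ka.values()) or 1.0  # avoid dividing by zero in an empty network
    max_b = max(kb.values()) or 1.0
    return {g: ka[g] / max_a - kb[g] / max_b for g in ka}


# Hypothetical expression values for three genes in two species.
human = {"G1": [1, 2, 3, 4, 5], "G2": [2, 4, 6, 8, 10], "G3": [5, 3, 4, 1, 2]}
chimp = {"G1": [1, 2, 3, 4, 5], "G2": [3, 1, 4, 1, 5], "G3": [5, 4, 3, 2, 1]}

dc = differential_connectivity(human, chimp)
for gene, value in sorted(dc.items(), key=lambda kv: -abs(kv[1])):
    print(f"{gene}: {value:+.3f}")
```

In this toy data G2 is tightly coexpressed with G1 only in the first network, so it receives a large positive score, while G1 is a hub in both networks and scores near zero.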
|