1
Sibilio P, Conte F, Huang Y, Castaldi PJ, Hersh CP, DeMeo DL, Silverman EK, Paci P. Correlation-based network integration of lung RNA sequencing and DNA methylation data in chronic obstructive pulmonary disease. Heliyon 2024; 10:e31301. [PMID: 38807864 PMCID: PMC11130701 DOI: 10.1016/j.heliyon.2024.e31301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Revised: 05/08/2024] [Accepted: 05/14/2024] [Indexed: 05/30/2024]
Abstract
Chronic Obstructive Pulmonary Disease (COPD) is a heterogeneous, chronic inflammatory process of the lungs and, like other complex diseases, is caused by both genetic and environmental factors. Detailed understanding of the molecular mechanisms of complex diseases requires the study of the interplay among different biomolecular layers, and thus the integration of different omics data types. In this study, we investigated COPD-associated molecular mechanisms through a correlation-based network integration of lung tissue RNA-seq and DNA methylation data of COPD cases (n = 446) and controls (n = 346) derived from the Lung Tissue Research Consortium. First, we performed a SWIM-network based analysis to build separate correlation networks for RNA-seq and DNA methylation data for our case-control study population. Then, we developed a method to integrate the results into a coupled network of differentially expressed and differentially methylated genes to investigate their relationships across both molecular layers. The functional enrichment analysis of the nodes of the coupled network revealed a strikingly significant enrichment in Immune System components, both innate and adaptive, as well as immune-system component communication (interleukin and cytokine-cytokine signaling). Our analysis allowed us to reveal novel putative COPD-associated genes and to analyze their relationships, both at the transcriptomics and epigenomics levels, thus contributing to an improved understanding of COPD pathogenesis.
Affiliation(s)
- Pasquale Sibilio
  - Department of Computer, Control and Management Engineering, Sapienza University of Rome, Rome, Italy
  - Institute for Systems Analysis and Computer Science "Antonio Ruberti", National Research Council, Rome, Italy
- Federica Conte
  - Institute for Systems Analysis and Computer Science "Antonio Ruberti", National Research Council, Rome, Italy
- Yichen Huang
  - Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Peter J Castaldi
  - Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Craig P Hersh
  - Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Dawn L DeMeo
  - Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Edwin K Silverman
  - Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Paola Paci
  - Department of Computer, Control and Management Engineering, Sapienza University of Rome, Rome, Italy
  - Institute for Systems Analysis and Computer Science "Antonio Ruberti", National Research Council, Rome, Italy
  - Karolinska Institutet, 17177, Stockholm, Sweden
2
Moreno J, Gluud LL, Galsgaard ED, Hvid H, Mazzoni G, Das V. Identification of ligand and receptor interactions in CKD and MASH through the integration of single cell and spatial transcriptomics. PLoS One 2024; 19:e0302853. [PMID: 38768139 PMCID: PMC11104622 DOI: 10.1371/journal.pone.0302853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2023] [Accepted: 04/10/2024] [Indexed: 05/22/2024]
Abstract
BACKGROUND Chronic Kidney Disease (CKD) and Metabolic dysfunction-associated steatohepatitis (MASH) are metabolic fibroinflammatory diseases. Combining single-cell (scRNAseq) and spatial transcriptomics (ST) could give unprecedented molecular disease understanding at single-cell resolution. A more comprehensive analysis of the cell-specific ligand-receptor (L-R) interactions could provide pivotal information about signaling pathways in CKD and MASH. To achieve this, we created an integrative analysis framework in CKD and MASH from two available human cohorts.
RESULTS The analytical framework identified L-R pairs involved in cellular crosstalk in CKD and MASH. Interactions between cell types identified using scRNAseq data were validated by checking the spatial co-presence using the ST data and the co-expression of the communicating targets. Multiple L-R protein pairs identified are known key players in CKD and MASH, while others are novel potential targets previously observed only in animal models.
CONCLUSION Our study highlights the importance of integrating different modalities of transcriptomic data for a better understanding of the molecular mechanisms. Single-cell resolution from scRNAseq data, combined with tissue slide investigations and visualization of cell-cell interactions obtained through ST, paves the way for identifying potential therapeutic targets and developing effective therapies.
Affiliation(s)
- Jaime Moreno
  - Digital Science and Innovation, Computational Biology – AI & Digital Research, Novo Nordisk A/S, Maløv, Denmark
- Lise Lotte Gluud
  - Gastro Unit, Copenhagen University Hospital Hvidovre, Hvidovre, Denmark
  - Dept of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
- Henning Hvid
  - Global Drug Discovery, Novo Nordisk A/S, Maløv, Denmark
- Gianluca Mazzoni
  - Digital Science and Innovation, Computational Biology – AI & Digital Research, Novo Nordisk A/S, Maløv, Denmark
- Vivek Das
  - Digital Science and Innovation, Computational Biology – AI & Digital Research, Novo Nordisk A/S, Maløv, Denmark
3
Soto-Cardinault C, Childs KL, Góngora-Castillo E. Network Analysis of Publicly Available RNA-seq Provides Insights into the Molecular Mechanisms of Plant Defense against Multiple Fungal Pathogens in Arabidopsis thaliana. Genes (Basel) 2023; 14:2223. [PMID: 38137044 PMCID: PMC10743233 DOI: 10.3390/genes14122223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2023] [Revised: 12/06/2023] [Accepted: 12/12/2023] [Indexed: 12/24/2023]
Abstract
Fungal pathogens can have devastating effects on global crop production, leading to annual economic losses ranging from 10% to 23%. In light of climate change-related challenges, researchers anticipate an increase in fungal infections as a result of shifting environmental conditions. However, plants have developed intricate molecular mechanisms for effective defense against fungal attacks. Understanding these mechanisms is essential to the development of new strategies for protecting crops from multiple fungal threats. Public omics databases provide valuable resources for research on plant-pathogen interactions; however, integrating data from different studies can be challenging due to experimental variation. In this study, we aimed to identify the core genes that defend against the pathogenic fungi Colletotrichum higginsianum and Botrytis cinerea in Arabidopsis thaliana. Using a custom framework to control batch effects and construct Gene Co-expression Networks in publicly available RNA-seq datasets from infected A. thaliana plants, we successfully identified a gene module that was responsive to both pathogens. We also performed gene annotation to reveal the roles of previously unknown protein-coding genes in plant defenses against fungal infections. This research demonstrates the potential of publicly available RNA-seq data for identifying the core genes involved in defending against multiple fungal pathogens.
Affiliation(s)
- Cynthia Soto-Cardinault
  - Unidad de Biotecnología, Centro de Investigación Científica de Yucatán, Mérida 97205, Mexico
- Kevin L. Childs
  - Plant Biology Department, Michigan State University, East Lansing, MI 48824, USA
- Elsa Góngora-Castillo
  - CONAHCYT-Unidad de Biotecnología, Centro de Investigación Científica de Yucatán, Mérida 97205, Mexico
4
Vandenbon A, Diez D. A universal tool for predicting differentially active features in single-cell and spatial genomics data. Sci Rep 2023; 13:11830. [PMID: 37481581 PMCID: PMC10363154 DOI: 10.1038/s41598-023-38965-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Accepted: 07/18/2023] [Indexed: 07/24/2023]
Abstract
With the growing complexity of single-cell and spatial genomics data, unbiased and efficient exploratory data analysis tools are increasingly important. One common exploratory data analysis step is the prediction of genes with different levels of activity in a subset of cells or locations inside a tissue. We previously developed singleCellHaystack, a method for predicting differentially expressed genes from single-cell transcriptome data, without relying on comparisons between clusters of cells. Here we present an update to singleCellHaystack, which is now a universally applicable method for predicting differentially active features: (1) singleCellHaystack now accepts continuous features that can be RNA or protein expression, chromatin accessibility or module scores from single-cell, spatial and even bulk genomics data, and (2) it can handle 1D trajectories, 2-3D spatial coordinates, as well as higher-dimensional latent spaces as input coordinates. Performance has been drastically improved, with up to a tenfold reduction in computational time and scalability to millions of cells, making singleCellHaystack a suitable tool for exploratory analysis of atlas-level datasets. singleCellHaystack is available as packages in both R and Python.
Affiliation(s)
- Alexis Vandenbon
  - Institute for Life and Medical Sciences, Kyoto University, 53 Shougoin Kawahara-cho, Sakyo-ku, Kyoto, 606-8507, Japan
  - Institute for Liberal Arts and Sciences, Kyoto University, Yoshidanihonmatsu-cho, Sakyo-ku, Kyoto, 606-8501, Japan
- Diego Diez
  - Immunology Frontier Research Center, Osaka University, 3-1, Yamada-oka, Suita, Osaka, 565-0871, Japan
  - Open and Transdisciplinary Research Institute (OTRI), Osaka University, 1-1, Yamada-oka, Suita, Osaka, 565-0871, Japan
5
Haczeyni F, Steensels S, Stein BD, Jordan JM, Li L, Dartigue V, Sarklioglu SS, Qiao J, Zhou XK, Dannenberg AJ, Iyengar NM, Yu H, Cantley LC, Ersoy BA. Submitochondrial Protein Translocation Upon Stress Inhibits Thermogenic Energy Expenditure. bioRxiv 2023:2023.05.04.539294. [PMID: 37205525 PMCID: PMC10187325 DOI: 10.1101/2023.05.04.539294] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2023]
Abstract
Mitochondria-rich brown adipocytes dissipate cellular fuel as heat by thermogenic energy expenditure (TEE). Prolonged nutrient excess or cold exposure impairs TEE and contributes to the pathogenesis of obesity, but the mechanisms remain incompletely understood. Here we report that stress-induced proton leak into the matrix interface of the mitochondrial inner membrane (IM) mobilizes a group of proteins from the IM into the matrix, which in turn alter mitochondrial bioenergetics. We further determine a smaller subset that correlates with obesity in human subcutaneous adipose tissue. We go on to show that the top factor on this short list, acyl-CoA thioesterase 9 (ACOT9), migrates from the IM into the matrix upon stress, where it enzymatically deactivates and prevents the utilization of acetyl-CoA in TEE. The loss of ACOT9 protects mice against the complications of obesity by maintaining unobstructed TEE. Overall, our results introduce aberrant protein translocation as a strategy to identify pathogenic factors.
One-Sentence Summary Thermogenic stress impairs mitochondrial energy utilization by forcing translocation of IM-bound proteins into the matrix.
6
Obayashi T, Kodate S, Hibara H, Kagaya Y, Kinoshita K. COXPRESdb v8: an animal gene coexpression database navigating from a global view to detailed investigations. Nucleic Acids Res 2022; 51:D80-D87. [PMID: 36350658 PMCID: PMC9825429 DOI: 10.1093/nar/gkac983] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 09/22/2022] [Revised: 10/12/2022] [Accepted: 10/15/2022] [Indexed: 11/10/2022]
Abstract
Gene coexpression is the synchronization of gene expression across many cellular and environmental conditions and is widely used to infer the biological function of genes. Gene coexpression information is complex, comprising a complete graph of all genes in the genome, and requires appropriate visualization and analysis tools. Since its initial release in 2007, the animal gene expression database COXPRESdb (https://coxpresdb.jp) has been continuously improved by adding new gene coexpression data and analysis tools. Here, we report COXPRESdb version 8, which has been enhanced with new features for an overview, summary, and individual examination of coexpression relationships: CoexMap to display coexpression on a genome scale, pathway enrichment analysis to summarize the function of coexpressed genes, and CoexPub to bridge coexpression and existing knowledge. COXPRESdb also facilitates downstream analyses such as interspecies comparisons by integrating RNAseq and microarray coexpression data in a union-type gene coexpression. COXPRESdb strongly supports users with the new coexpression data and enhanced functionality.
Affiliation(s)
- Takeshi Obayashi
  - To whom correspondence should be addressed. Tel: +81 22 795 4741; Fax: +81 22 795 4765
- Shun Kodate
  - Tohoku Medical Megabank Organization, Tohoku University, Sendai, 980-8573, Japan
- Himiko Hibara
  - Graduate School of Information Sciences, Tohoku University, 6-3-09, Aramaki-Aza-Aoba, Aoba-ku, Sendai, 980-8679, Japan
- Yuki Kagaya
  - Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
7
Approaches in Gene Coexpression Analysis in Eukaryotes. Biology 2022; 11:1019. [PMID: 36101400 PMCID: PMC9312353 DOI: 10.3390/biology11071019] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/06/2022] [Revised: 06/28/2022] [Accepted: 07/04/2022] [Indexed: 11/22/2022]
Abstract
Simple Summary Genes whose expression levels rise and fall similarly in a large set of samples may be considered coexpressed. Gene coexpression analysis refers to the en masse discovery of coexpressed genes from a large variety of transcriptomic experiments. The biological networks used to study gene coexpression, known as Gene Coexpression Networks, are undirected graphs depicting genes and their coexpression relationships. Coexpressed genes are clustered in smaller subnetworks, the predominant biological roles of which can be determined through enrichment analysis. By studying well-annotated gene partners, new roles can be attributed to genes of unknown function, or their participation in common metabolic pathways can be assumed, through a guilt-by-association approach. In this review, we present key issues in gene coexpression analysis, as well as the most popular tools that perform it.
Abstract Gene coexpression analysis constitutes a widely used practice for gene partner identification and gene function prediction, consisting of many intricate procedures. The analysis begins with the collection of primary transcriptomic data and their preprocessing, continues with the calculation of the similarity between genes based on their expression values in the selected sample dataset and results in the construction and visualisation of a gene coexpression network (GCN) and its evaluation using biological term enrichment analysis. As gene coexpression analysis has been studied extensively, we present most parts of the methodology in a clear manner and the reasoning behind the selection of some of the techniques. In this review, we offer a comprehensive and comprehensible account of the steps required for performing a complete gene coexpression analysis in eukaryotic organisms. We comment on the use of RNA-Seq vs. microarrays, as well as the best practices for GCN construction. Furthermore, we recount the most popular webtools and standalone applications performing gene coexpression analysis, with details on their methods, features and outputs.
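The similarity-then-network steps named in this abstract can be illustrated with a minimal sketch. It assumes Pearson correlation as the similarity measure, a hard cutoff of 0.9 on its absolute value, and connected components as a stand-in for subnetwork clustering; none of these choices is stated in the abstract. The function names `coexpression_edges` and `coexpression_modules` are hypothetical, and the expression values are invented toy data.

```python
import numpy as np


def coexpression_edges(expr, gene_names, threshold=0.9):
    """Return undirected edges (gene_a, gene_b, r) where |Pearson r| >= threshold.

    expr has one row per gene and one column per sample.
    """
    corr = np.corrcoef(np.asarray(expr, dtype=float))  # gene-by-gene matrix
    edges = []
    for i in range(len(gene_names)):
        for j in range(i + 1, len(gene_names)):
            r = float(corr[i, j])
            # Unsigned network (assumption): strong negative correlation
            # also creates an edge.
            if abs(r) >= threshold:
                edges.append((gene_names[i], gene_names[j], r))
    return edges


def coexpression_modules(edges, gene_names):
    """Group genes into connected components using a small union-find."""
    parent = {g: g for g in gene_names}

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path halving
            g = parent[g]
        return g

    for a, b, _ in edges:
        parent[find(a)] = find(b)

    groups = {}
    for g in gene_names:
        groups.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in groups.values())


# Toy data (invented): four genes measured in five samples.
genes = ["geneA", "geneB", "geneC", "geneD"]
expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],
    [3.0, 1.0, 4.0, 1.0, 5.0],
    [6.0, 2.0, 8.0, 2.0, 10.5],
]

edges = coexpression_edges(expr, genes)
for a, b, r in edges:
    print(f"{a} -- {b}  r={r:.3f}")
print(coexpression_modules(edges, genes))
```

In this toy input, geneA pairs with geneB and geneC pairs with geneD, so two modules result. Real analyses would typically add the preprocessing and enrichment steps the abstract mentions, which this sketch leaves out.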