1
Morin A, Chu C, Pavlidis P. Identifying Reproducible Transcription Regulator Coexpression Patterns with Single Cell Transcriptomics. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.02.15.580581. [PMID: 38559016 PMCID: PMC10979919 DOI: 10.1101/2024.02.15.580581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
The proliferation of single cell transcriptomics has potentiated our ability to unveil patterns that reflect dynamic cellular processes, rather than cell type compositional effects that emerge from bulk tissue samples. In this study, we leverage a broad collection of single cell RNA-seq data to identify the gene partners whose expression is most coordinated with each human and mouse transcription regulator (TR). We assembled 120 human and 103 mouse scRNA-seq datasets from the literature (>28 million cells), constructing a single cell coexpression network for each. We aimed to understand the consistency of TR coexpression profiles across a broad sampling of biological contexts, rather than examine the preservation of context-specific signals. Our workflow therefore explicitly prioritizes the patterns that are most reproducible across cell types. Towards this goal, we characterize the similarity of each TR's coexpression within and across species. We create single cell coexpression rankings for each TR, demonstrating that this aggregated information recovers literature curated targets on par with ChIP-seq data. We then combine the coexpression and ChIP-seq information to identify candidate regulatory interactions supported across methods and species. Finally, we highlight interactions for the important neural TR ASCL1 to demonstrate how our compiled information can be adopted for community use.
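The idea of aggregating per-dataset coexpression into one ranking per TR can be illustrated with a minimal sketch. The abstract does not specify the correlation measure or the aggregation rule, so Pearson correlation and mean-rank aggregation are assumptions here, and the function names (`pearson`, `aggregate_tr_ranking`) are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def aggregate_tr_ranking(datasets, tr):
    """Rank genes by coexpression with `tr` in each dataset, then order by mean rank.

    datasets: list of dicts mapping gene name -> expression values across cells.
    Assumption: mean-rank aggregation; ties are broken alphabetically by gene name.
    """
    rank_sums, counts = {}, {}
    for expr in datasets:
        if tr not in expr:
            continue  # this dataset did not measure the TR
        scores = {g: pearson(expr[tr], v) for g, v in expr.items() if g != tr}
        ordered = sorted(scores, key=lambda g: (-scores[g], g))
        for rank, gene in enumerate(ordered, start=1):
            rank_sums[gene] = rank_sums.get(gene, 0) + rank
            counts[gene] = counts.get(gene, 0) + 1
    mean_rank = {g: rank_sums[g] / counts[g] for g in rank_sums}
    return sorted(mean_rank, key=lambda g: (mean_rank[g], g))
```

Genes that rank highly in many datasets rise to the top, which matches the stated emphasis on reproducibility over context-specific signal.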
2
Cote AC, Young HE, Huckins LM. Critical reasoning on the co-expression module QTL in the dorsolateral prefrontal cortex. HGG ADVANCES 2024; 5:100311. [PMID: 38773772 PMCID: PMC11214266 DOI: 10.1016/j.xhgg.2024.100311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Revised: 05/16/2024] [Accepted: 05/16/2024] [Indexed: 05/24/2024] Open
Abstract
Expression quantitative trait locus (eQTL) analysis is a popular method of gaining insight into the function of regulatory variation. While cis-eQTL resources have been instrumental in linking genome-wide association study variants to gene function, complex trait heritability may be additionally mediated by other forms of gene regulation. Toward this end, novel eQTL methods leverage gene co-expression (module-QTL) to investigate joint regulation of gene modules by single genetic variants. Here we broadly define a "module-QTL" as the association of a genetic variant with a summary measure of gene co-expression. This approach aims to reduce the multiple testing burden of a trans-eQTL search through the consolidation of gene-based testing and to provide biological context to eQTLs shared between genes. In this article we provide an in-depth examination of the co-expression module eQTL (module-QTL) through literature review, theoretical investigation, and real-data application of the module-QTL to three large prefrontal cortex genotype-RNA sequencing datasets. We find that the module-QTLs in our study that are disease-associated and reproducible are not additionally informative beyond cis- or trans-eQTLs for module genes. Through comparison to prior studies, we highlight promises and limitations of the module-QTL across study designs and provide recommendations for further investigation of the module-QTL framework.
Affiliation(s)
- Alanna C Cote
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA.
- Hannah E Young
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Laura M Huckins
- Department of Psychiatry, Yale University School of Medicine, New Haven, CT 06511, USA.
3
Mitra S, Bp K, C R S, Saikumar NV, Philip P, Narayanan M. Alzheimer's disease rewires gene coexpression networks coupling different brain regions. NPJ Syst Biol Appl 2024; 10:50. [PMID: 38724582 PMCID: PMC11082197 DOI: 10.1038/s41540-024-00376-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2023] [Accepted: 04/17/2024] [Indexed: 05/12/2024] Open
Abstract
Connectome studies have shown how Alzheimer's disease (AD) disrupts functional and structural connectivity among brain regions. But the molecular basis of such disruptions is less studied, with most genomic/transcriptomic studies performing within-brain-region analyses. To inspect how AD rewires the correlation structure among genes in different brain regions, we performed an Inter-brain-region Differential Correlation (Inter-DC) analysis of RNA-seq data from Mount Sinai Brain Bank on four brain regions (frontal pole, superior temporal gyrus, parahippocampal gyrus and inferior frontal gyrus, comprising 264 AD and 372 control human post-mortem samples). An Inter-DC network was assembled from all pairs of genes across two brain regions that gained (or lost) correlation strength in the AD group relative to controls at FDR 1%. The differentially correlated (DC) genes in this network complemented known differentially expressed genes in AD, and likely reflect cell-intrinsic changes since we adjusted for cell compositional effects. Each brain region used a distinctive set of DC genes when coupling with other regions, with parahippocampal gyrus showing the most rewiring, consistent with its known vulnerability to AD. The Inter-DC network revealed master dysregulation hubs in AD (at genes ZKSCAN1, SLC5A3, RCC1, IL17RB, PLK4, etc.), inter-region gene modules enriched for known AD pathways (synaptic signaling, endocytosis, etc.), and candidate signaling molecules that could mediate region-region communication. The Inter-DC network generated in this study is a valuable resource of gene pairs, pathways and signaling molecules whose inter-brain-region functional coupling is disrupted in AD, thereby offering a new perspective on AD etiology.
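One common way to test whether a gene pair gains or loses correlation between two groups is to compare Fisher z-transformed correlations. The abstract does not state which test the Inter-DC analysis used, so the sketch below is an assumption, and the function name `differential_correlation` is hypothetical. Multiple-testing correction (the abstract mentions FDR 1%) is not included.

```python
from math import atanh, erfc, sqrt

def differential_correlation(r_case, n_case, r_ctrl, n_ctrl):
    """Fisher z-test for a difference between two independent correlations.

    r_case, r_ctrl: correlations of the same gene pair in each group (|r| < 1).
    n_case, n_ctrl: number of samples in each group (each must exceed 3).
    Returns (z, p) where p is the two-sided p-value under a normal approximation.
    """
    if n_case <= 3 or n_ctrl <= 3:
        raise ValueError("each group needs more than 3 samples")
    se = sqrt(1.0 / (n_case - 3) + 1.0 / (n_ctrl - 3))
    z = (atanh(r_case) - atanh(r_ctrl)) / se
    p = erfc(abs(z) / sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p
```

For example, `differential_correlation(0.6, 264, 0.1, 372)` uses the group sizes quoted in the abstract with made-up correlation values; a positive `z` indicates a gain of correlation in the case group.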
Affiliation(s)
- Sanga Mitra
- Bioinformatics and Integrative Data Science group, Department of Computer Science and Engineering, Indian Institute of Technology (IIT) Madras, Chennai, India
- Kailash Bp
- Bioinformatics and Integrative Data Science group, Department of Computer Science and Engineering, Indian Institute of Technology (IIT) Madras, Chennai, India
- Srivatsan C R
- Bioinformatics and Integrative Data Science group, Department of Computer Science and Engineering, Indian Institute of Technology (IIT) Madras, Chennai, India
- Naga Venkata Saikumar
- Bioinformatics and Integrative Data Science group, Department of Computer Science and Engineering, Indian Institute of Technology (IIT) Madras, Chennai, India
- Philge Philip
- Centre for Integrative Biology and Systems Medicine, IIT Madras, Chennai, India
- Robert Bosch Centre for Data Science and Artificial Intelligence, IIT Madras, Chennai, India
- Manikandan Narayanan
- Bioinformatics and Integrative Data Science group, Department of Computer Science and Engineering, Indian Institute of Technology (IIT) Madras, Chennai, India.
- Centre for Integrative Biology and Systems Medicine, IIT Madras, Chennai, India.
- Robert Bosch Centre for Data Science and Artificial Intelligence, IIT Madras, Chennai, India.
- Sudha Gopalakrishnan Brain Centre, IIT Madras, Chennai, India.
4
Forabosco P, Pala M, Crobu F, Diana MA, Marongiu M, Cusano R, Angius A, Steri M, Orrù V, Schlessinger D, Fiorillo E, Devoto M, Cucca F. Transcriptome organization of white blood cells through gene co-expression network analysis in a large RNA-seq dataset. Front Immunol 2024; 15:1350111. [PMID: 38629067 PMCID: PMC11018966 DOI: 10.3389/fimmu.2024.1350111] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2023] [Accepted: 03/13/2024] [Indexed: 04/19/2024] Open
Abstract
Gene co-expression network analysis enables identification of biologically meaningful clusters of co-regulated genes (modules) in an unsupervised manner. We present here the largest study conducted thus far of co-expression networks in white blood cells (WBC) based on RNA-seq data from 624 individuals. We identify 41 modules, 13 of them related to specific immune-related functions and cell types (e.g. neutrophils, B and T cells, NK cells, and plasmacytoid dendritic cells); we highlight biologically relevant lncRNAs for each annotated module of co-expressed genes. We further characterize with unprecedented resolution the modules in T cell sub-types, through the availability of 95 immune phenotypes obtained by flow cytometry in the same individuals. This study provides novel insights into the transcriptional architecture of human leukocytes, showing how network analysis can advance our understanding of coding and non-coding gene interactions in immune system cells.
Affiliation(s)
- Paola Forabosco
- Istituto di Ricerca Genetica e Biomedica (IRGB), Consiglio Nazionale delle Ricerche (CNR), Cagliari, Italy
- Mauro Pala
- Istituto di Ricerca Genetica e Biomedica (IRGB), Consiglio Nazionale delle Ricerche (CNR), Cagliari, Italy
- Francesca Crobu
- Istituto di Ricerca Genetica e Biomedica (IRGB), Consiglio Nazionale delle Ricerche (CNR), Cagliari, Italy
- Maria Antonietta Diana
- Istituto di Ricerca Genetica e Biomedica (IRGB), Consiglio Nazionale delle Ricerche (CNR), Cagliari, Italy
- Mara Marongiu
- Istituto di Ricerca Genetica e Biomedica (IRGB), Consiglio Nazionale delle Ricerche (CNR), Cagliari, Italy
- Roberto Cusano
- CRS4-Next Generation Sequencing (NGS) Core, Parco POLARIS, Cagliari, Italy
- Andrea Angius
- Istituto di Ricerca Genetica e Biomedica (IRGB), Consiglio Nazionale delle Ricerche (CNR), Cagliari, Italy
- Maristella Steri
- Istituto di Ricerca Genetica e Biomedica (IRGB), Consiglio Nazionale delle Ricerche (CNR), Cagliari, Italy
- Valeria Orrù
- Istituto di Ricerca Genetica e Biomedica (IRGB), Consiglio Nazionale delle Ricerche (CNR), Cagliari, Italy
- David Schlessinger
- Laboratory of Genetics and Genomics, National Institute on Aging, National Institutes of Health (NIH), Baltimore, MD, United States
- Edoardo Fiorillo
- Istituto di Ricerca Genetica e Biomedica (IRGB), Consiglio Nazionale delle Ricerche (CNR), Cagliari, Italy
- Marcella Devoto
- Istituto di Ricerca Genetica e Biomedica (IRGB), Consiglio Nazionale delle Ricerche (CNR), Cagliari, Italy
- Dipartimento di Medicina Traslazionale e di Precisione, Università Sapienza, Roma, Italy
- Francesco Cucca
- Istituto di Ricerca Genetica e Biomedica (IRGB), Consiglio Nazionale delle Ricerche (CNR), Cagliari, Italy
- Dipartimento di Scienze Biomediche, Università degli Studi di Sassari, Sassari, Italy
5
García-Blay Ó, Verhagen PGA, Martin B, Hansen MMK. Exploring the role of transcriptional and post-transcriptional processes in mRNA co-expression. Bioessays 2023; 45:e2300130. [PMID: 37926676 DOI: 10.1002/bies.202300130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Revised: 09/18/2023] [Accepted: 10/09/2023] [Indexed: 11/07/2023]
Abstract
Co-expression of two or more genes at the single-cell level is usually associated with functional co-regulation. While mRNA co-expression-measured as the correlation in mRNA levels-can be influenced by both transcriptional and post-transcriptional events, transcriptional regulation is typically considered dominant. We review and connect the literature describing transcriptional and post-transcriptional regulation of co-expression. To enhance our understanding, we integrate four datasets spanning single-cell gene expression data, single-cell promoter activity data and individual transcript half-lives. Confirming expectations, we find that positive co-expression necessitates promoter coordination and similar mRNA half-lives. Surprisingly, negative co-expression is favored by differences in mRNA half-lives, contrary to initial predictions from stochastic simulations. Notably, this association manifests specifically within clusters of genes. We further observe a striking compensation between promoter coordination and mRNA half-lives, which additional stochastic simulations suggest might give rise to the observed co-expression patterns. These findings raise intriguing questions about the functional advantages conferred by this compensation between distal kinetic steps.
Affiliation(s)
- Óscar García-Blay
- Institute for Molecules and Materials, Radboud University, AJ, Nijmegen, the Netherlands
- Pieter G A Verhagen
- Institute for Molecules and Materials, Radboud University, AJ, Nijmegen, the Netherlands
- Benjamin Martin
- Institute for Molecules and Materials, Radboud University, AJ, Nijmegen, the Netherlands
- Maike M K Hansen
- Institute for Molecules and Materials, Radboud University, AJ, Nijmegen, the Netherlands
6
Davis S, Scott C, Oetjen J, Charles PD, Kessler BM, Ansorge O, Fischer R. Deep topographic proteomics of a human brain tumour. Nat Commun 2023; 14:7710. [PMID: 38001067 PMCID: PMC10673928 DOI: 10.1038/s41467-023-43520-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2023] [Accepted: 11/13/2023] [Indexed: 11/26/2023] Open
Abstract
The spatial organisation of cellular protein expression profiles within tissue determines cellular function and is key to understanding disease pathology. To define molecular phenotypes in the spatial context of tissue, there is a need for unbiased, quantitative technology capable of mapping proteomes within tissue structures. Here, we present a workflow for spatially-resolved, quantitative proteomics of tissue that generates maps of protein abundance across tissue slices derived from a human atypical teratoid-rhabdoid tumour at three spatial resolutions, the highest being 40 µm, to reveal distinct abundance patterns of thousands of proteins. We employ spatially-aware algorithms that do not require prior knowledge of the fine tissue structure to detect proteins and pathways with spatial abundance patterns and correlate proteins in the context of tissue heterogeneity and cellular features such as extracellular matrix or proximity to blood vessels. We identify PYGL, ASPH and CD45 as spatial markers for tumour boundary and reveal immune response-driven, spatially-organised protein networks of the extracellular tumour matrix. Overall, we demonstrate spatially-aware deep proteo-phenotyping of tissue heterogeneity, to re-define understanding tissue biology and pathology at the molecular level.
Affiliation(s)
- Simon Davis
- Target Discovery Institute, Centre for Medicines Discovery, Nuffield Department of Medicine, University of Oxford, Roosevelt Drive, Oxford, OX3 7FZ, UK
- Chinese Academy for Medical Sciences Oxford Institute, Nuffield Department of Medicine, University of Oxford, Roosevelt Drive, Oxford, OX3 7FZ, UK
- Connor Scott
- Academic Unit of Neuropathology, Nuffield Department of Clinical Neurosciences, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DU, UK
- Janina Oetjen
- Bruker Daltonics GmbH & Co. KG, Fahrenheitstraße 4, 28359, Bremen, Germany
- Philip D Charles
- Target Discovery Institute, Centre for Medicines Discovery, Nuffield Department of Medicine, University of Oxford, Roosevelt Drive, Oxford, OX3 7FZ, UK
- Big Data Institute, Nuffield Department of Medicine, University of Oxford, Roosevelt Drive, Oxford, OX3 7FZ, UK
- Benedikt M Kessler
- Target Discovery Institute, Centre for Medicines Discovery, Nuffield Department of Medicine, University of Oxford, Roosevelt Drive, Oxford, OX3 7FZ, UK
- Chinese Academy for Medical Sciences Oxford Institute, Nuffield Department of Medicine, University of Oxford, Roosevelt Drive, Oxford, OX3 7FZ, UK
- Olaf Ansorge
- Academic Unit of Neuropathology, Nuffield Department of Clinical Neurosciences, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DU, UK
- Roman Fischer
- Target Discovery Institute, Centre for Medicines Discovery, Nuffield Department of Medicine, University of Oxford, Roosevelt Drive, Oxford, OX3 7FZ, UK.
- Chinese Academy for Medical Sciences Oxford Institute, Nuffield Department of Medicine, University of Oxford, Roosevelt Drive, Oxford, OX3 7FZ, UK.
7
Suresh H, Crow M, Jorstad N, Hodge R, Lein E, Dobin A, Bakken T, Gillis J. Comparative single-cell transcriptomic analysis of primate brains highlights human-specific regulatory evolution. Nat Ecol Evol 2023; 7:1930-1943. [PMID: 37667001 PMCID: PMC10627823 DOI: 10.1038/s41559-023-02186-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2023] [Accepted: 08/02/2023] [Indexed: 09/06/2023]
Abstract
Enhanced cognitive function in humans is hypothesized to result from cortical expansion and increased cellular diversity. However, the mechanisms that drive these phenotypic innovations remain poorly understood, in part because of the lack of high-quality cellular resolution data in human and non-human primates. Here, we take advantage of single-cell expression data from the middle temporal gyrus of five primates (human, chimp, gorilla, macaque and marmoset) to identify 57 homologous cell types and generate cell type-specific gene co-expression networks for comparative analysis. Although orthologue expression patterns are generally well conserved, we find 24% of genes with extensive differences between human and non-human primates (3,383 out of 14,131), which are also associated with multiple brain disorders. To assess the functional significance of gene expression differences in an evolutionary context, we evaluate changes in network connectivity across meta-analytic co-expression networks from 19 animals. We find that a subset of these genes has deeply conserved co-expression across all non-human animals, and strongly divergent co-expression relationships in humans (139 out of 3,383, <1% of primate orthologues). Genes with human-specific cellular expression and co-expression profiles (such as NHEJ1, GTF2H2, C2 and BBS5) typically evolve under relaxed selective constraints and may drive rapid evolutionary change in brain function.
Affiliation(s)
- Hamsini Suresh
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Ed Lein
- Allen Institute for Brain Science, Seattle, WA, USA
- Alexander Dobin
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Jesse Gillis
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA.
- Department of Physiology, University of Toronto, Toronto, Ontario, Canada.
8
Radulescu E, Chen Q, Pergola G, Di Carlo P, Han S, Shin JH, Hyde TM, Kleinman JE, Weinberger DR. Investigating trait variability of gene co-expression network architecture in brain by controlling for genomic risk of schizophrenia. PLoS Genet 2023; 19:e1010989. [PMID: 37831723 PMCID: PMC10599557 DOI: 10.1371/journal.pgen.1010989] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Revised: 10/25/2023] [Accepted: 09/20/2023] [Indexed: 10/15/2023] Open
Abstract
The effect of schizophrenia (SCZ) genetic risk on gene expression in brain remains elusive. A popular approach to this problem has been the application of gene co-expression network algorithms (e.g., WGCNA). To improve reliability with this method it is critical to remove unwanted sources of variance while also preserving biological signals of interest. In this WGCNA study of RNA-Seq data from postmortem prefrontal cortex (78 neurotypical donors, EUR ancestry), we tested the effects of SCZ genetic risk on co-expression networks. Specifically, we implemented a novel design in which gene expression was adjusted by linear regression models to preserve or remove variance explained by the biological signal of interest (GWAS genomic scores for SCZ risk (GS-SCZ), with genomic scores for height (GS-Ht) as a negative control), while removing variance explained by covariates of non-interest. We calculated co-expression networks from adjusted expression (GS-SCZ and GS-Ht preserved or removed), and consensus between them (representative of a "background" network free of genomic score effects). We then tested the overlap between GS-SCZ preserved modules and background networks, reasoning that modules with reduced overlap would be most affected by GS-SCZ biology. Additionally, we tested these modules for convergence of SCZ risk (i.e., enrichment in PGC3 SCZ GWAS priority genes, enrichment in SCZ risk heritability and relevant biological ontologies). Our results highlight key aspects of GS-SCZ effects on brain co-expression networks, specifically: 1) preserving/removing SCZ genetic risk alters the co-expression modules; 2) biological pathways enriched in modules affected by GS-SCZ implicate processes of transcription, translation and metabolism that converge to influence synaptic transmission; 3) priority PGC3 SCZ GWAS genes and SCZ risk heritability are enriched in modules associated with GS-SCZ effects. Overall, our results indicate that gene co-expression networks that selectively integrate information about genetic risk can reveal novel combinations of biological pathways involved in schizophrenia.
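The "preserve or remove" adjustment described in the abstract above can be sketched for a single gene with one genomic score and one nuisance covariate. This is a simplified illustration under stated assumptions, not the study's model: the real analysis presumably used many covariates, and the names `fit_two_predictors` and `adjust_expression` are hypothetical.

```python
def fit_two_predictors(y, score, cov):
    """Ordinary least squares for y ~ intercept + score + cov (centered normal equations)."""
    n = len(y)
    my, ms, mc = sum(y) / n, sum(score) / n, sum(cov) / n
    yc = [v - my for v in y]
    sc = [v - ms for v in score]
    cc = [v - mc for v in cov]
    s_ss = sum(a * a for a in sc)
    s_cc = sum(a * a for a in cc)
    s_sc = sum(a * b for a, b in zip(sc, cc))
    s_sy = sum(a * b for a, b in zip(sc, yc))
    s_cy = sum(a * b for a, b in zip(cc, yc))
    det = s_ss * s_cc - s_sc * s_sc
    if det == 0:
        raise ValueError("predictors are constant or collinear")
    b_score = (s_sy * s_cc - s_cy * s_sc) / det
    b_cov = (s_cy * s_ss - s_sy * s_sc) / det
    return b_score, b_cov, ms, mc

def adjust_expression(y, score, cov, keep_score):
    """Remove the covariate effect from y; also remove the score effect unless keep_score."""
    b_score, b_cov, ms, mc = fit_two_predictors(y, score, cov)
    adjusted = []
    for yi, si, ci in zip(y, score, cov):
        value = yi - b_cov * (ci - mc)        # always strip the nuisance covariate
        if not keep_score:
            value -= b_score * (si - ms)      # "removed" version also strips the score
        adjusted.append(value)
    return adjusted
```

Fitting both terms jointly and then subtracting only the nuisance term keeps the score-related variance from being absorbed by a correlated covariate; networks would then be built separately from the "preserved" and "removed" expression values.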
Affiliation(s)
- Eugenia Radulescu
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, Maryland United States of America
- Qiang Chen
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, Maryland United States of America
- Giulio Pergola
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, Maryland United States of America
- Group of Psychiatric Neuroscience, Department of Translational Biomedicine and Neuroscience, University of Bari Aldo Moro, Bari, Italy
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America
- Pasquale Di Carlo
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, Maryland United States of America
- Shizhong Han
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, Maryland United States of America
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America
- Joo Heon Shin
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, Maryland United States of America
- Thomas M. Hyde
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, Maryland United States of America
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America
- Joel E. Kleinman
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, Maryland United States of America
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America
- Daniel R. Weinberger
- Lieber Institute for Brain Development, Johns Hopkins Medical Campus, Baltimore, Maryland United States of America
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America
- Department of Neurology, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America
- Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America
- McKusick-Nathans Department of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America
9
Liao C, Moyses-Oliveira M, De Esch CE, Bhavsar R, Nuttle X, Li A, Yu A, Burt ND, Erdin S, Fu JM, Wang M, Morley T, Han L, Dion PA, Rouleau GA, Zhang B, Brennand KJ, Talkowski ME, Ruderfer DM. Convergent coexpression of autism-associated genes suggests some novel risk genes may not be detectable in large-scale genetic studies. CELL GENOMICS 2023; 3:100277. [PMID: 37082147 PMCID: PMC10112287 DOI: 10.1016/j.xgen.2023.100277] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2022] [Revised: 12/01/2022] [Accepted: 02/10/2023] [Indexed: 03/21/2023]
Abstract
Autism spectrum disorder (ASD) is a heritable neurodevelopmental disorder characterized by deficits in social interactions and communication. Protein-altering variants in many genes have been shown to contribute to ASD; however, understanding the convergence across many genes remains a challenge. We demonstrate that coexpression patterns from 993 human postmortem brains are significantly correlated with the transcriptional consequences of CRISPR perturbations in human neurons. Across 71 ASD risk genes, there was significant tissue-specific convergence implicating synaptic pathways. Tissue-specific convergence was further demonstrated across schizophrenia and atrial fibrillation risk genes. The degree of ASD convergence was significantly correlated with ASD association from rare variation and differential expression in ASD brains. Positively convergent genes showed intolerance to functional mutations and had shorter coding lengths than known risk genes even after removing association with ASD. These results indicate that convergent coexpression can identify potentially novel genes that are unlikely to be discovered by sequencing studies.
10
Cha J, Lavi M, Kim J, Shomron N, Lee I. Imputation of single-cell transcriptome data enables the reconstruction of networks predictive of breast cancer metastasis. Comput Struct Biotechnol J 2023; 21:2296-2304. [PMID: 37035549 PMCID: PMC10073994 DOI: 10.1016/j.csbj.2023.03.036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/14/2022] [Revised: 03/21/2023] [Accepted: 03/21/2023] [Indexed: 03/30/2023] Open
Abstract
Single-cell transcriptome data provide a unique opportunity to explore the gene networks of a particular cell type. However, insufficient capture rate and high dimensionality of single-cell RNA sequencing (scRNA-seq) data challenge cell-type-specific gene network (CGN) reconstruction. Here, we demonstrated that the imputation of scRNA-seq data enables reconstruction of CGNs by effective retrieval of gene functional associations. We reconstructed CGNs for seven primary and nine metastatic breast cancer cell lines using scRNA-seq data with imputation. Key genes for primary or metastatic cell lines were prioritized based on network centrality measures and CGN hub genes that were presumed to be the major determinant of cell type characteristics. To identify novel genes in breast cancer metastasis, we used the average rank difference of centrality between the primary and metastatic cell lines. Genes predicted using CGN centrality analysis were more enriched for known breast cancer metastatic genes than those predicted using differential expression. The molecular chaperone CCT2 was identified as a novel gene for breast metastasis during knockdown assays of several candidate genes. Overall, our study demonstrated an effective CGN reconstruction technique with imputation of scRNA-seq data and the feasibility of identifying key genes for particular cell subsets using single-cell network analysis.
Affiliation(s)
- Junha Cha
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Republic of Korea
- Michael Lavi
- Faculty of Medicine and Edmond J Safra Center for Bioinformatics, Tel Aviv University, Tel Aviv 69978, Israel
- Junhan Kim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Republic of Korea
- Noam Shomron
- Faculty of Medicine and Edmond J Safra Center for Bioinformatics, Tel Aviv University, Tel Aviv 69978, Israel
- Corresponding author.
- Insuk Lee
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Republic of Korea
- POSTECH Biotech Center, Pohang University of Science and Technology (POSTECH), Pohang 37673, Republic of Korea
- Corresponding author at: Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Republic of Korea.
11
Yang M, Harrison BR, Promislow DEL. In search of a Drosophila core cellular network with single-cell transcriptome data. G3 GENES|GENOMES|GENETICS 2022; 12:6670625. [PMID: 35976114 PMCID: PMC9526075 DOI: 10.1093/g3journal/jkac212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2022] [Accepted: 08/03/2022] [Indexed: 11/29/2022]
Abstract
Along with specialized functions, cells of multicellular organisms also perform essential functions common to most if not all cells. Whether diverse cells do this by using the same set of genes, interacting in a fixed coordinated fashion to execute essential functions, or a subset of genes specific to certain cells, remains a central question in biology. Here, we focus on gene coexpression to search for a core cellular network across a whole organism. Single-cell RNA-sequencing measures gene expression of individual cells, enabling researchers to discover gene expression patterns that contribute to the diversity of cell functions. Current efforts to study cellular functions focus primarily on identifying differentially expressed genes across cells. However, patterns of coexpression between genes are probably more indicative of biological processes than are the expression of individual genes. We constructed cell-type-specific gene coexpression networks using single-cell transcriptome datasets covering diverse cell types from the fruit fly, Drosophila melanogaster. We detected a set of highly coordinated genes preserved across cell types and present this as the best estimate of a core cellular network. This core is very small compared with cell-type-specific gene coexpression networks and shows dense connectivity. Gene members of this core tend to be ancient genes and are enriched for those encoding ribosomal proteins. Overall, we find evidence for a core cellular network in diverse cell types of the fruit fly. The topological, structural, functional, and evolutionary properties of this core indicate that it accounts for only a minority of essential functions.
Affiliation(s)
- Ming Yang
  - Department of Laboratory Medicine and Pathology, University of Washington School of Medicine, Seattle, WA 98195, USA
- Benjamin R Harrison
  - Department of Laboratory Medicine and Pathology, University of Washington School of Medicine, Seattle, WA 98195, USA
- Daniel E L Promislow
  - Department of Laboratory Medicine and Pathology, University of Washington School of Medicine, Seattle, WA 98195, USA
  - Department of Biology, University of Washington, Seattle, WA 98195, USA
|
12
|
Shared regulation and functional relevance of local gene co-expression revealed by single cell analysis. Commun Biol 2022; 5:876. [PMID: 36028576 PMCID: PMC9418141 DOI: 10.1038/s42003-022-03831-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/09/2022] [Accepted: 08/10/2022] [Indexed: 02/01/2023] Open
Abstract
Most human genes are co-expressed with a nearby gene. Previous studies have revealed this local gene co-expression to be widespread across chromosomes and across dozens of tissues. Yet these studies have so far used bulk RNA-seq, which averages gene expression measurements across millions of cells, so it remains unclear whether this co-expression stems from transcription events in single cells. Here, we leverage single cell datasets in >85 individuals to identify gene co-expression across cells, unbiased by cell-type heterogeneity and benefiting from the co-occurrence of transcription events in single cells. We discover >3800 co-expressed gene pairs in two human cell types, induced pluripotent stem cells (iPSCs) and lymphoblastoid cell lines (LCLs), and (i) compare single cell to bulk RNA-seq in identifying local gene co-expression, (ii) show that many co-expressed genes – but not the majority – are composed of functionally related genes and (iii) using proteomics data, provide evidence that their co-expression is maintained up to the protein level. Finally, using single cell RNA-sequencing (scRNA-seq) and single cell ATAC-sequencing (scATAC-seq) data for the same single cells, we identify gene-enhancer associations and reveal that >95% of co-expressed gene pairs share regulatory elements. These results elucidate the potential reasons for co-expression in single cell gene regulatory networks and warrant a deeper study of shared regulatory elements, with a view to explaining disease comorbidity that arises when several genes are affected. Our in-depth view of local gene co-expression and regulatory element co-activity advances our understanding of the shared regulatory architecture between genes.
Using single-cell data from cell lines, the co-expression of genes and co-activity of regulatory elements is analyzed, providing insight into shared architecture and regulation between genes.
|
13
|
Figueiredo RQ, Del Ser SD, Raschka T, Hofmann-Apitius M, Kodamullil AT, Mubeen S, Domingo-Fernández D. Elucidating gene expression patterns across multiple biological contexts through a large-scale investigation of transcriptomic datasets. BMC Bioinformatics 2022; 23:231. [PMID: 35705903 PMCID: PMC9202106 DOI: 10.1186/s12859-022-04765-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2022] [Accepted: 06/03/2022] [Indexed: 11/10/2022] Open
Abstract
Distinct gene expression patterns within cells are foundational for the diversity of functions and unique characteristics observed in specific contexts, such as human tissues and cell types. Though some biological processes commonly occur across contexts, by harnessing the vast amounts of available gene expression data, we can decipher the processes that are unique to a specific context. Therefore, with the goal of developing a portrait of context-specific patterns to better elucidate how they govern distinct biological processes, this work presents a large-scale exploration of transcriptomic signatures across three different contexts (i.e., tissues, cell types, and cell lines) by leveraging over 600 gene expression datasets categorized into 98 subcontexts. The strongest pairwise correlations between genes from these subcontexts are used for the construction of co-expression networks. Using a network-based approach, we then pinpoint patterns that are unique and common across these subcontexts. First, we focused on patterns at the level of individual nodes and evaluated their functional roles using a human protein-protein interactome as a referential network. Next, within each context, we systematically overlaid the co-expression networks to identify specific and shared correlations as well as relations already described in scientific literature. Additionally, in a pathway-level analysis, we overlaid node and edge sets from co-expression networks against pathway knowledge to identify biological processes that are related to specific subcontexts or groups of them. Finally, we have released our data and scripts at https://zenodo.org/record/5831786 and https://github.com/ContNeXt/, respectively, and developed ContNeXt (https://contnext.scai.fraunhofer.de/), a web application to explore the networks generated in this work.
Affiliation(s)
- Rebeca Queiroz Figueiredo
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53757, Sankt Augustin, Germany
  - Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53115, Bonn, Germany
- Sara Díaz Del Ser
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53757, Sankt Augustin, Germany
  - Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53115, Bonn, Germany
- Tamara Raschka
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53757, Sankt Augustin, Germany
  - Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53115, Bonn, Germany
  - Fraunhofer Center for Machine Learning, Sankt Augustin, Germany
- Martin Hofmann-Apitius
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53757, Sankt Augustin, Germany
  - Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53115, Bonn, Germany
- Alpha Tom Kodamullil
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53757, Sankt Augustin, Germany
- Sarah Mubeen
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53757, Sankt Augustin, Germany
  - Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53115, Bonn, Germany
  - Fraunhofer Center for Machine Learning, Sankt Augustin, Germany
- Daniel Domingo-Fernández
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53757, Sankt Augustin, Germany
  - Fraunhofer Center for Machine Learning, Sankt Augustin, Germany
  - Enveda Biosciences, Boulder, CO, 80301, USA
|
14
|
Ding J, Zhao S, Chen X, Luo C, Peng J, Zhu J, Shen Y, Luo Z, Chen J. Prognostic and Diagnostic Values of Semaphorin 5B and Its Correlation With Tumor-Infiltrating Immune Cells in Kidney Renal Clear-Cell Carcinoma. Front Genet 2022; 13:835355. [PMID: 35480320 PMCID: PMC9035641 DOI: 10.3389/fgene.2022.835355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2021] [Accepted: 03/11/2022] [Indexed: 11/18/2022] Open
Abstract
Background: Semaphorin 5B (SEMA5B) has been described to be involved in the development and progression of cancer. However, its potential diagnostic and prognostic roles and its correlation with tumor-infiltrating immune cells in kidney renal clear-cell carcinoma (KIRC) have not been clearly reported yet. Methods: The mRNA level of SEMA5B was analyzed via the TCGA and GTEx database as well as the CCLE dataset and verified by GSE53757 and GSE40435 datasets. Meanwhile, the protein level of SEMA5B was analyzed by CPTAC and validated by HPA. The diagnostic value of SEMA5B was analyzed according to the TCGA database and validated by GSE53757, GSE46699, and GSE11024 + GSE46699 datasets. Then, the survival analysis was conducted using GEPIA2. R software (v3.6.3) was applied to investigate the relevance between SEMA5B and immune checkpoints and m6A RNA methylation regulator expression. The correlation between SEMA5B and MMRs and DNMT expression and tumor-infiltrating immune cells was explored via TIMER2. Co-expressed genes of SEMA5B were assessed by cBioPortal, and enrichment analysis was conducted by Metascape. The methylation analysis was conducted with MEXPRESS and MethSurv online tools. Gene set enrichment analysis (GSEA) was applied to annotate the biological function of SEMA5B. Results: SEMA5B was significantly upregulated at both the mRNA and protein levels in KIRC. Further analysis demonstrated that the mRNA expression of SEMA5B was significantly correlated with gender, age, T stage, pathologic stage, and histologic grade. High levels of SEMA5B were found to be a favorable prognostic factor and novel diagnostic biomarker for KIRC. SEMA5B expression was shown to be significantly associated with the abundance of immune cells in KIRC. Also, SEMA5B expression was significantly correlated with the abundance of MMR genes, DNMTs, and m6A regulators in KIRC. Enrichment analysis indicated that the co-expressed genes may be involved in crosslinking in the extracellular matrix (ECM). GSEA disclosed that SYSTEMIC_LUPUS_ERYTHEMATOSUS and NABA_ECM_REGULATORS were prominently enriched in the SEMA5B low-expression phenotype. Finally, the methylation analysis demonstrated a correlation between hypermethylation of the SEMA5B gene and a poor prognosis in KIRC. Conclusion: Increased SEMA5B expression correlated with immune cell infiltration and can serve as a favorable prognostic factor and a novel diagnostic biomarker for KIRC.
Affiliation(s)
- Junping Ding
  - Departments of Urology of Affiliated Liutie Central Hospital of Guangxi Medical University, Liuzhou, China
- Shubin Zhao
  - Departments of Urology of Affiliated Liutie Central Hospital of Guangxi Medical University, Liuzhou, China
- Xianhua Chen
  - Departments of Clinical Laboratory, Key Laboratory of Medical Molecular Diagnostics of Liuzhou, Key Laboratory for Nucleic Acid Molecular Diagnosis and Application of Guangxi Health & Wellness Commission, Affiliated Liutie Central Hospital of Guangxi Medical University, Liuzhou, China
- Changjun Luo
  - Departments of Cardiology of Affiliated Liutie Central Hospital of Guangxi Medical University, Liuzhou, China
- Jinjian Peng
  - Departments of Urology of Affiliated Liutie Central Hospital of Guangxi Medical University, Liuzhou, China
- Jiantan Zhu
  - Departments of Urology of Affiliated Liutie Central Hospital of Guangxi Medical University, Liuzhou, China
- Yongqi Shen
  - Departments of Oncology of Affiliated Liutie Central Hospital of Guangxi Medical University, Liuzhou, China
- Zhou Luo
  - Departments of Infectious Diseases of Affiliated Liutie Central Hospital of Guangxi Medical University, Liuzhou, China
- Jianlin Chen
  - Departments of Clinical Laboratory, Key Laboratory of Medical Molecular Diagnostics of Liuzhou, Key Laboratory for Nucleic Acid Molecular Diagnosis and Application of Guangxi Health & Wellness Commission, Affiliated Liutie Central Hospital of Guangxi Medical University, Liuzhou, China
|
15
|
Burks DJ, Sengupta S, De R, Mittler R, Azad RK. The Arabidopsis gene co-expression network. PLANT DIRECT 2022; 6:e396. [PMID: 35492683 PMCID: PMC9039629 DOI: 10.1002/pld3.396] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/26/2021] [Revised: 02/05/2022] [Accepted: 02/08/2022] [Indexed: 06/14/2023]
Abstract
Identifying genes that interact to confer a biological function to an organism is one of the main goals of functional genomics. High-throughput technologies for assessment and quantification of genome-wide gene expression patterns have enabled systems-level analyses to infer pathways or networks of genes involved in different functions under many different conditions. Here, we leveraged the publicly available, information-rich RNA-Seq datasets of the model plant Arabidopsis thaliana to construct a gene co-expression network, which was partitioned into clusters or modules that harbor genes correlated by expression. Gene ontology and pathway enrichment analyses were performed to assess functional terms and pathways that were enriched within the different gene modules. By interrogating the co-expression network for genes in different modules that associate with a gene of interest, diverse functional roles of the gene can be deciphered. By mapping genes differentially expressed under a certain condition in Arabidopsis onto the co-expression network, we demonstrate the ability of the network to uncover novel genes that are likely transcriptionally active but prone to be missed by standard statistical approaches because they fall outside of the confidence zone of detection. To our knowledge, this is the first A. thaliana co-expression network constructed using the entire mRNA-Seq datasets (>20,000) available at the NCBI SRA database. The developed network can serve as a useful resource for the Arabidopsis research community to interrogate specific genes of interest within the network, retrieve the respective interactomes, decipher gene modules that are transcriptionally altered under a certain condition or stage, and gain understanding of gene functions.
Affiliation(s)
- David J. Burks
  - Department of Biological Sciences and BioDiscovery Institute, College of Science, University of North Texas, Denton, Texas, USA
- Soham Sengupta
  - Department of Biological Sciences and BioDiscovery Institute, College of Science, University of North Texas, Denton, Texas, USA
- Ronika De
  - Department of Biological Sciences and BioDiscovery Institute, College of Science, University of North Texas, Denton, Texas, USA
- Ron Mittler
  - The Division of Plant Sciences and Interdisciplinary Plant Group, College of Agriculture, Food and Natural Resources, Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, Missouri, USA
  - Department of Surgery, University of Missouri School of Medicine, Columbia, Missouri, USA
- Rajeev K. Azad
  - Department of Biological Sciences and BioDiscovery Institute, College of Science, University of North Texas, Denton, Texas, USA
  - Department of Mathematics, University of North Texas, Denton, Texas, USA
|
16
|
Johnson KA, Krishnan A. Robust normalization and transformation techniques for constructing gene coexpression networks from RNA-seq data. Genome Biol 2022; 23:1. [PMID: 34980209 PMCID: PMC8721966 DOI: 10.1186/s13059-021-02568-9] [Citation(s) in RCA: 52] [Impact Index Per Article: 26.0] [Received: 09/28/2020] [Accepted: 12/06/2021] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND: Constructing gene coexpression networks is a powerful approach for analyzing high-throughput gene expression data towards module identification, gene function prediction, and disease-gene prioritization. While optimal workflows for constructing coexpression networks, including good choices for data pre-processing, normalization, and network transformation, have been developed for microarray-based expression data, such well-tested choices do not exist for RNA-seq data. Almost all studies that compare data processing and normalization methods for RNA-seq focus on the end goal of determining differential gene expression. RESULTS: Here, we present a comprehensive benchmarking and analysis of 36 different workflows, each with a unique set of normalization and network transformation methods, for constructing coexpression networks from RNA-seq datasets. We test these workflows on both large, homogenous datasets and small, heterogeneous datasets from various labs. We analyze the workflows in terms of aggregate performance, individual method choices, and the impact of multiple dataset experimental factors. Our results demonstrate that between-sample normalization has the biggest impact, with counts adjusted by size factors producing networks that most accurately recapitulate known tissue-naive and tissue-aware gene functional relationships. CONCLUSIONS: Based on this work, we provide concrete recommendations on robust procedures for building an accurate coexpression network from an RNA-seq dataset. In addition, researchers can examine all the results in great detail at https://krishnanlab.github.io/RNAseq_coexpression to make appropriate choices for coexpression analysis based on the experimental factors of their RNA-seq dataset.
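As a rough illustration of what "counts adjusted by size factors" can look like in practice, the sketch below computes median-of-ratios size factors, divides the counts by them, and takes Pearson correlations between genes. The choice of the median-of-ratios estimator, the log1p transform, and the function names `size_factors` and `coexpression` are assumptions made for this example; the abstract does not state which estimator or transformation the benchmarked workflows use.

```python
import numpy as np

def size_factors(counts):
    """Median-of-ratios size factors (an assumed estimator); counts is genes x samples."""
    counts = np.asarray(counts, dtype=float)
    # Use only genes with nonzero counts in every sample so the logs are finite.
    expressed = np.all(counts > 0, axis=1)
    log_counts = np.log(counts[expressed])
    # Per-gene log geometric mean across samples serves as the reference.
    log_geo_mean = log_counts.mean(axis=1, keepdims=True)
    # Each sample's factor is the median ratio of its counts to the reference.
    return np.exp(np.median(log_counts - log_geo_mean, axis=0))

def coexpression(counts):
    """Gene-by-gene Pearson correlation after size-factor adjustment."""
    counts = np.asarray(counts, dtype=float)
    normalized = counts / size_factors(counts)
    return np.corrcoef(np.log1p(normalized))

# Hypothetical toy matrix: 4 genes (rows) x 4 samples (columns).
counts = np.array([
    [10, 22, 31, 38],
    [12, 20, 33, 41],
    [50, 45, 20, 12],
    [8, 15, 25, 30],
])
network = coexpression(counts)
print(network.round(2))
```

The resulting matrix is symmetric with a unit diagonal; a real workflow would typically follow it with a network transformation step, which is omitted here.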
Affiliation(s)
- Kayla A Johnson
  - Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI, 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, 48824, USA
- Arjun Krishnan
  - Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI, 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, 48824, USA
|
17
|
Lin A, Forsyth JK, Hoftman GD, Kushan-Wells L, Jalbrzikowski M, Dokuru D, Coppola G, Fiksinski A, Zinkstok J, Vorstman J, Nachun D, Bearden CE. Transcriptomic profiling of whole blood in 22q11.2 reciprocal copy number variants reveals that cell proportion highly impacts gene expression. Brain Behav Immun Health 2021; 18:100386. [PMID: 34841284 PMCID: PMC8607166 DOI: 10.1016/j.bbih.2021.100386] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/30/2021] [Accepted: 10/31/2021] [Indexed: 11/24/2022] Open
Abstract
22q11.2 reciprocal copy number variants (CNVs) offer a powerful quasi-experimental "reverse-genetics" paradigm to elucidate how gene dosage (i.e., deletions and duplications) disrupts the transcriptome to cause further downstream effects. Clinical profiles of 22q11.2 CNV carriers indicate that disrupted gene expression causes alterations in neuroanatomy, cognitive function, and psychiatric disease risk. However, interpreting transcriptomic signal in bulk tissue requires careful consideration of potential changes in cell composition. We first characterized transcriptomic dysregulation in peripheral blood from reciprocal 22q11.2 CNV carriers using differential expression analysis and weighted gene co-expression network analysis (WGCNA) to identify modules of co-expressed genes. We also assessed for group differences in cell composition and re-characterized transcriptomic differences after accounting for cell type proportions and medication usage. Finally, to explore whether CNV-related transcriptomic changes relate to downstream phenotypes associated with 22q11.2 CNVs, we tested for associations of gene expression with neuroimaging measures and behavioral traits, including IQ and psychosis or ASD diagnosis. 22q11.2 deletion carriers (22qDel) showed widespread expression changes at the individual gene as well as module eigengene level compared to 22q11.2 duplication carriers (22qDup) and controls. 22qDup showed increased expression of 5 genes within the 22q11.2 locus, and CDH6 located outside of the locus. Downregulated modules in 22qDel implicated altered immune and inflammatory processes. Cell-type deconvolution analyses revealed significant differences between CNV and control groups in T-cell, mast cell, and macrophage proportions; differential expression of individual genes between groups was substantially attenuated after adjusting for cell composition. Individual gene, module eigengene, and cell proportions were not significantly associated with psychiatric or neuroanatomic traits. Our findings suggest broad immune-related dysfunction in 22qDel and highlight the importance of understanding differences in cell composition when interpreting transcriptomic changes in clinical populations. Results also suggest novel directions for future investigation to test whether 22q11.2 CNV effects on macrophages have implications for brain-related microglial function that may contribute to psychiatric phenotypes in 22q11.2 CNV carriers.
Affiliation(s)
- Amy Lin
  - Department of Psychiatry and Biobehavioral Sciences, Semel Institute for Neuroscience and Human Behavior, University of California, Los Angeles, Los Angeles, CA, USA
  - Neuroscience Interdepartmental Program, University of California at Los Angeles, Los Angeles, CA, USA
- Jennifer K. Forsyth
  - Department of Psychiatry and Biobehavioral Sciences, Semel Institute for Neuroscience and Human Behavior, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Psychology, University of Washington, WA, USA
- Gil D. Hoftman
  - Department of Psychiatry and Biobehavioral Sciences, Semel Institute for Neuroscience and Human Behavior, University of California, Los Angeles, Los Angeles, CA, USA
- Leila Kushan-Wells
  - Department of Psychiatry and Biobehavioral Sciences, Semel Institute for Neuroscience and Human Behavior, University of California, Los Angeles, Los Angeles, CA, USA
- Deepika Dokuru
  - Department of Psychiatry and Biobehavioral Sciences, Semel Institute for Neuroscience and Human Behavior, University of California, Los Angeles, Los Angeles, CA, USA
- Giovanni Coppola
  - Department of Psychiatry and Biobehavioral Sciences, Semel Institute for Neuroscience and Human Behavior, University of California, Los Angeles, Los Angeles, CA, USA
- Ania Fiksinski
  - Wilhelmina Children's Hospital & University Medical Center Utrecht, Brain Center, the Netherlands
  - Maastricht University, Department of Psychiatry and Neuropsychology, Division of Mental Health, MHeNS, the Netherlands
- Janneke Zinkstok
  - Department of Psychiatry and Brain Center, University Medical Center Utrecht, the Netherlands
- Jacob Vorstman
  - Program in Genetics and Genome Biology, Research Institute, The Hospital for Sick Children, Toronto, Canada
  - Department of Psychiatry, University of Toronto, Toronto, Canada
- Daniel Nachun
  - Department of Psychiatry and Biobehavioral Sciences, Semel Institute for Neuroscience and Human Behavior, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Pathology, Stanford University, Stanford, CA, USA
- Carrie E. Bearden
  - Department of Psychiatry and Biobehavioral Sciences, Semel Institute for Neuroscience and Human Behavior, University of California, Los Angeles, Los Angeles, CA, USA
  - Department of Psychology, University of California, Los Angeles, Los Angeles, CA, USA
|
18
|
Jiang Y, Urresti J, Pagel KA, Pramod AB, Iakoucheva LM, Radivojac P. Prioritizing de novo autism risk variants with calibrated gene- and variant-scoring models. Hum Genet 2021; 141:1595-1613. [PMID: 34549350 DOI: 10.1007/s00439-021-02356-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/02/2021] [Accepted: 08/26/2021] [Indexed: 12/17/2022]
Abstract
Whole-exome and whole-genome sequencing studies in autism spectrum disorder (ASD) have identified hundreds of thousands of exonic variants. Only a handful of them, primarily loss-of-function variants, have been shown to increase the risk for ASD, while the contributory roles of other variants, including most missense variants, remain unknown. New approaches that combine tissue-specific molecular profiles with patients' genetic data can thus play an important role in elucidating the functional impact of exonic variation and improve understanding of ASD pathogenesis. Here, we integrate spatio-temporal gene co-expression networks from the developing human brain and protein-protein interaction networks to first reach accurate prioritization of ASD risk genes based on their connectivity patterns with previously known high-confidence ASD risk genes. We subsequently integrate these gene scores with variant pathogenicity predictions to further prioritize individual exonic variants based on the positive-unlabeled learning framework with gene- and variant-score calibration. We demonstrate that this approach discriminates among variants between cases and controls at the high end of the prediction range. Finally, we experimentally validate our top-scoring de novo mutation NP_001243143.1:p.Phe309Ser in the sodium/potassium-transporting ATPase ATP1A3 to disrupt protein binding with different partners.
Affiliation(s)
- Yuxiang Jiang
  - Department of Computer Science, Indiana University, Bloomington, IN, USA
- Jorge Urresti
  - Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
- Kymberleigh A Pagel
  - Department of Computer Science, Indiana University, Bloomington, IN, USA
  - Institute for Computational Medicine, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, USA
- Akula Bala Pramod
  - Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
- Lilia M Iakoucheva
  - Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
- Predrag Radivojac
  - Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
|
19
|
Ruiz-Cantos M, Hutchison CE, Shoulders CC. Musings from the Tribbles Research and Innovation Network. Cancers (Basel) 2021; 13:4517. [PMID: 34572744 PMCID: PMC8467127 DOI: 10.3390/cancers13184517] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/03/2021] [Revised: 09/02/2021] [Accepted: 09/04/2021] [Indexed: 11/16/2022] Open
Abstract
This commentary integrates historical and modern findings that underpin our understanding of the cell-specific functions of the Tribbles (TRIB) proteins that bear on tumorigenesis. We touch on the initial discovery of roles played by mammalian TRIB proteins in a diverse range of cell-types and pathologies, for example, TRIB1 in regulatory T-cells, TRIB2 in acute myeloid leukaemia and TRIB3 in gliomas; the origins and diversity of TRIB1 transcripts; microRNA-mediated (miRNA) regulation of TRIB1 transcript decay and translation; the substantial conformational changes that ensue on binding of TRIB1 to the transcription factor C/EBPα; and the unique pocket formed by TRIB1 to sequester its C-terminal motif bearing a binding site for the E3 ubiquitin ligase COP1. Unashamedly, the narrative is relayed through the perspective of the Tribbles Research and Innovation Network, and its establishment, progress and future ambitions: the growth of TRIB and COP1 research to hasten discovery of their cell-specific contributions to health and obesity-related cancers.
|
20
|
Harris BD, Crow M, Fischer S, Gillis J. Single-cell co-expression analysis reveals that transcriptional modules are shared across cell types in the brain. Cell Syst 2021; 12:748-756.e3. [PMID: 34015329 PMCID: PMC8298279 DOI: 10.1016/j.cels.2021.04.010] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 09/30/2020] [Revised: 02/11/2021] [Accepted: 04/23/2021] [Indexed: 12/27/2022]
Abstract
Gene-gene relationships are commonly measured via the co-variation of gene expression across samples, also known as gene co-expression. Because shared expression patterns are thought to reflect shared function, co-expression networks describe functional relationships between genes, including co-regulation. However, the heterogeneity of cell types in bulk RNA-seq samples creates connections in co-expression networks that potentially obscure co-regulatory modules. The brain initiative cell census network (BICCN) single-cell RNA sequencing (scRNA-seq) datasets provide an unparalleled opportunity to understand how gene-gene relationships shape cell identity. Comparison of the BICCN data (500,000 cells/nuclei across 7 BICCN datasets) with that of bulk RNA-seq networks (2,000 mouse brain samples across 52 studies) reveals a consistent topology reflecting a shared co-regulatory signal. Differential signals between broad cell classes persist in driving variation at finer levels, indicating that convergent regulatory processes affect cell phenotype at multiple scales.
Affiliation(s)
- Benjamin D Harris
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
  - Cold Spring Harbor School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Megan Crow
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Stephan Fischer
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Jesse Gillis
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
  - Cold Spring Harbor School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
|
21
|
Abstract
Transcriptomes are known to organize themselves into gene co-expression clusters or modules where groups of genes display distinct patterns of coordinated or synchronous expression across independent biological samples. The functional significance of these co-expression clusters is suggested by the fact that highly coexpressed groups of genes tend to be enriched in genes involved in common functions and biological processes. While gene co-expression is widely assumed to reflect close regulatory proximity, the validity of this assumption remains unclear. Here we use a simple synthetic gene regulatory network (GRN) model and contrast the resulting co-expression structure produced by these networks with their known regulatory architecture and with the co-expression structure measured in available human expression data. Using randomization tests, we found that the levels of co-expression observed in simulated expression data were, just as with empirical data, significantly higher than expected by chance. When examining the source of correlated expression, we found that individual regulators, both in simulated and experimental data, fail, on average, to display correlated expression with their immediate targets. However, highly correlated gene pairs tend to share at least one common regulator, while most gene pairs sharing common regulators do not necessarily display correlated expression. Our results demonstrate that widespread co-expression naturally emerges in regulatory networks, and that it is a reliable and direct indicator of active co-regulation in a given cellular context.
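The randomization test mentioned in the abstract can be sketched roughly as follows. This is a generic permutation test, not the authors' procedure: the test statistic (mean absolute pairwise Pearson correlation), the shuffling scheme, the toy data with a single shared driver, and the names `mean_abs_corr` and `randomization_test` are all assumptions introduced for illustration.

```python
import numpy as np

def mean_abs_corr(expr):
    """Mean absolute off-diagonal Pearson correlation; expr is genes x samples."""
    corr = np.corrcoef(expr)
    upper = np.triu_indices_from(corr, k=1)
    return np.abs(corr[upper]).mean()

def randomization_test(expr, n_perm=200, seed=0):
    """Compare observed co-expression with a null built by shuffling each gene."""
    rng = np.random.default_rng(seed)
    observed = mean_abs_corr(expr)
    null = np.empty(n_perm)
    for i in range(n_perm):
        # Shuffle every gene's values across samples independently, which
        # breaks gene-gene dependence but keeps each gene's own distribution.
        shuffled = rng.permuted(expr, axis=1)
        null[i] = mean_abs_corr(shuffled)
    # One-sided empirical p-value with the usual +1 correction.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, null, p_value

# Hypothetical toy data: 20 genes driven by one shared signal plus noise.
rng = np.random.default_rng(1)
shared = rng.normal(size=100)
expr = shared + 0.5 * rng.normal(size=(20, 100))
observed, null, p_value = randomization_test(expr)
print(f"observed={observed:.2f}, null mean={null.mean():.2f}, p={p_value:.3f}")
```

Shuffling within each gene preserves marginal expression distributions, so any excess of the observed statistic over the null reflects dependence between genes rather than differences in their individual variability.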
|
22
|
Single-cell network biology for resolving cellular heterogeneity in human diseases. Exp Mol Med 2020; 52:1798-1808. [PMID: 33244151 PMCID: PMC8080824 DOI: 10.1038/s12276-020-00528-0] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 07/30/2020] [Revised: 08/26/2020] [Accepted: 08/31/2020] [Indexed: 01/10/2023] Open
Abstract
Understanding cellular heterogeneity is the holy grail of biology and medicine. Cells harboring identical genomes show a wide variety of behaviors in multicellular organisms. Genetic circuits underlying cell-type identities will facilitate the understanding of the regulatory programs for differentiation and maintenance of distinct cellular states. Such a cell-type-specific gene network can be inferred from coregulatory patterns across individual cells. Conventional methods of transcriptome profiling using tissue samples provide only average signals of diverse cell types. Therefore, reconstructing gene regulatory networks for a particular cell type is not feasible with tissue-based transcriptome data. Recently, single-cell omics technology has emerged and enabled the capture of the transcriptomic landscape of every individual cell. Although single-cell gene expression studies have already opened up new avenues, network biology using single-cell transcriptome data will further accelerate our understanding of cellular heterogeneity. In this review, we provide an overview of single-cell network biology and summarize recent progress in method development for network inference from single-cell RNA sequencing (scRNA-seq) data. Then, we describe how cell-type-specific gene networks can be utilized to study regulatory programs specific to disease-associated cell types and cellular states. Moreover, with scRNA-seq data, modeling personal or patient-specific gene networks is feasible. Therefore, we also introduce potential applications of single-cell network biology for precision medicine. We envision a rapid paradigm shift toward single-cell network analysis for systems biology in the near future.
Gene regulatory networks reconstructed from single-cell RNA sequencing datasets are allowing researchers to better understand the molecular circuits and cell states that contribute to complex human disease. Junha Cha and Insuk Lee from Yonsei University in Seoul, South Korea, review the concept of ‘single-cell network biology’, which involves using computational algorithms on gene expression data from thousands of cells to infer functional interactions in various biological contexts. This systems biology approach to analyzing the profiles of messenger RNA in single cells is helping researchers discover new signaling pathways that could serve as disease biomarkers or therapeutic targets. In the future, patient-specific models of personal gene networks could explain why certain genetic variants affect disease risk. This research could also eventually lead to new types of individualized medical treatments.
|
23
|
Tarbier M, Mackowiak SD, Frade J, Catuara-Solarz S, Biryukova I, Gelali E, Menéndez DB, Zapata L, Ossowski S, Bienko M, Gallant CJ, Friedländer MR. Nuclear gene proximity and protein interactions shape transcript covariations in mammalian single cells. Nat Commun 2020; 11:5445. [PMID: 33116115 PMCID: PMC7595044 DOI: 10.1038/s41467-020-19011-5] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 11/20/2019] [Accepted: 09/15/2020] [Indexed: 01/19/2023]
Abstract
Single-cell RNA sequencing studies on gene co-expression patterns could yield important regulatory and functional insights, but have so far been limited by the confounding effects of differentiation and cell cycle. We apply a tailored experimental design that eliminates these confounders, and report thousands of intrinsically covarying gene pairs in mouse embryonic stem cells. These covariations form a network with biological properties, outlining known and novel gene interactions. We provide the first evidence that miRNAs naturally induce transcriptome-wide covariations and compare the relative importance of nuclear organization, transcriptional and post-transcriptional regulation in defining covariations. We find that nuclear organization has the greatest impact, and that genes encoding physically interacting proteins specifically tend to covary, suggesting importance for protein complex formation. Our results lend support to the concept of post-transcriptional RNA operons, but we further present evidence that nuclear proximity of genes may provide substantial functional regulation in mammalian single cells.
Gene expression covariation can be studied by single-cell RNA sequencing. Here the authors analyze intrinsically covarying gene pairs by eliminating the confounding effects in single-cell experiments and observe covariation of proximal genes and miRNA-induced covariation of target mRNAs.
Affiliation(s)
- Marcel Tarbier: Science for Life Laboratory, Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
- Sebastian D Mackowiak: Science for Life Laboratory, Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
- João Frade: Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Barcelona, Spain
- Silvina Catuara-Solarz: Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Barcelona, Spain
- Inna Biryukova: Science for Life Laboratory, Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
- Eleni Gelali: Science for Life Laboratory, Department of Medical Biochemistry and Biophysics, Karolinska Institute, Stockholm, Sweden
- Diego Bárcena Menéndez: Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Barcelona, Spain
- Luis Zapata: Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Barcelona, Spain; Center for Evolution and Cancer, The Institute of Cancer Research, London, UK
- Stephan Ossowski: Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Barcelona, Spain; Department of Experimental and Health Sciences, University Pompeu Fabra, Barcelona, Spain; Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, Germany
- Magda Bienko: Science for Life Laboratory, Department of Medical Biochemistry and Biophysics, Karolinska Institute, Stockholm, Sweden
- Caroline J Gallant: Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden
- Marc R Friedländer: Science for Life Laboratory, Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
|
24
|
Kolberg L, Kerimov N, Peterson H, Alasoo K. Co-expression analysis reveals interpretable gene modules controlled by trans-acting genetic variants. eLife 2020; 9:e58705. [PMID: 32880574 PMCID: PMC7470823 DOI: 10.7554/elife.58705] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 05/08/2020] [Accepted: 08/20/2020] [Indexed: 12/16/2022]
Abstract
Understanding the causal processes that contribute to disease onset and progression is essential for developing novel therapies. Although trans-acting expression quantitative trait loci (trans-eQTLs) can directly reveal cellular processes modulated by disease variants, detecting trans-eQTLs remains challenging due to their small effect sizes. Here, we analysed gene expression and genotype data from six blood cell types from 226 to 710 individuals. We used co-expression modules inferred from gene expression data with five methods as traits in trans-eQTL analysis to limit multiple testing and improve interpretability. In addition to replicating three established associations, we discovered a novel trans-eQTL near SLC39A8 regulating a module of metallothionein genes in LPS-stimulated monocytes. Interestingly, this effect was mediated by a transient cis-eQTL present only in early LPS response and lost before the trans effect appeared. Our analyses highlight how co-expression combined with functional enrichment analysis improves the identification and prioritisation of trans-eQTLs when applied to emerging cell-type-specific datasets.
Affiliation(s)
- Liis Kolberg: Institute of Computer Science, University of Tartu, Tartu, Estonia
- Nurlan Kerimov: Institute of Computer Science, University of Tartu, Tartu, Estonia
- Hedi Peterson: Institute of Computer Science, University of Tartu, Tartu, Estonia
- Kaur Alasoo: Institute of Computer Science, University of Tartu, Tartu, Estonia
|