1
Chen X. Reimagining Cortical Connectivity by Deconstructing Its Molecular Logic into Building Blocks. Cold Spring Harb Perspect Biol 2024; 16:a041509. [PMID: 38621822 PMCID: PMC11529856 DOI: 10.1101/cshperspect.a041509] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/17/2024]
Abstract
Comprehensive maps of neuronal connectivity provide a foundation for understanding the structure of neural circuits. In a circuit, neurons are diverse in morphology, electrophysiology, gene expression, activity, and other neuronal properties. Thus, constructing a comprehensive connectivity map requires associating various properties of neurons, including their connectivity, at cellular resolution. A commonly used approach is to use the gene expression profiles as an anchor to which all other neuronal properties are associated. Recent advances in genomics and anatomical techniques dramatically improved the ability to determine and associate the long-range projections of neurons with their gene expression profiles. These studies revealed unprecedented details of the gene-projection relationship, but also highlighted conceptual challenges in understanding this relationship. In this article, I delve into the findings and the challenges revealed by recent studies using state-of-the-art neuroanatomical and transcriptomic techniques. Building upon these insights, I propose an approach that focuses on understanding the gene-projection relationship through basic features in gene expression profiles and projections, respectively, that associate with underlying cellular processes. I then discuss how the developmental trajectories of projections and gene expression profiles create additional challenges and necessitate interrogating the gene-projection relationship across time. Finally, I explore complementary strategies that, together, can provide a comprehensive view of the gene-projection relationship.
Affiliation(s)
- Xiaoyin Chen
  - Allen Institute for Brain Science, Seattle, Washington 98109, USA
2
Morin A, Chu C, Pavlidis P. Identifying Reproducible Transcription Regulator Coexpression Patterns with Single Cell Transcriptomics. bioRxiv 2024:2024.02.15.580581. [PMID: 38559016 PMCID: PMC10979919 DOI: 10.1101/2024.02.15.580581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
The proliferation of single cell transcriptomics has potentiated our ability to unveil patterns that reflect dynamic cellular processes, rather than cell type compositional effects that emerge from bulk tissue samples. In this study, we leverage a broad collection of single cell RNA-seq data to identify the gene partners whose expression is most coordinated with each human and mouse transcription regulator (TR). We assembled 120 human and 103 mouse scRNA-seq datasets from the literature (>28 million cells), constructing a single cell coexpression network for each. We aimed to understand the consistency of TR coexpression profiles across a broad sampling of biological contexts, rather than examine the preservation of context-specific signals. Our workflow therefore explicitly prioritizes the patterns that are most reproducible across cell types. Towards this goal, we characterize the similarity of each TR's coexpression within and across species. We create single cell coexpression rankings for each TR, demonstrating that this aggregated information recovers literature curated targets on par with ChIP-seq data. We then combine the coexpression and ChIP-seq information to identify candidate regulatory interactions supported across methods and species. Finally, we highlight interactions for the important neural TR ASCL1 to demonstrate how our compiled information can be adopted for community use.
Affiliation(s)
- Alexander Morin
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
  - Department of Psychiatry, University of British Columbia, Vancouver, BC, Canada
  - Graduate Program in Bioinformatics, University of British Columbia, Vancouver, BC, Canada
- Chingpan Chu
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
  - Department of Psychiatry, University of British Columbia, Vancouver, BC, Canada
  - Graduate Program in Bioinformatics, University of British Columbia, Vancouver, BC, Canada
- Paul Pavlidis
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
  - Department of Psychiatry, University of British Columbia, Vancouver, BC, Canada
3
Sarwar A, Rue M, French L, Cross H, Chen X, Gillis J. Cross-expression analysis reveals patterns of coordinated gene expression in spatial transcriptomics. bioRxiv 2024:2024.09.17.613579. [PMID: 39345494 PMCID: PMC11429685 DOI: 10.1101/2024.09.17.613579] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/01/2024]
Abstract
Spatial transcriptomics promises to transform our understanding of tissue biology by molecularly profiling individual cells in situ. A fundamental question it allows us to ask is how nearby cells orchestrate their gene expression. To investigate this, we introduce cross-expression, a novel framework for discovering gene pairs that coordinate their expression across neighboring cells. Just as co-expression quantifies synchronized gene expression within the same cells, cross-expression measures coordinated gene expression between spatially adjacent cells, allowing us to understand tissue gene expression programs with single cell resolution. Using this framework, we recover ligand-receptor partners and discover gene combinations marking anatomical regions. More generally, we create cross-expression networks to find gene modules with orchestrated expression patterns. Finally, we provide an efficient R package to facilitate cross-expression analysis, quantify effect sizes, and generate novel visualizations to better understand spatial gene expression programs.
Affiliation(s)
- Ameer Sarwar
  - Department of Cell and Systems Biology and Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
- Mara Rue
  - Allen Institute for Brain Science, Seattle, WA, USA
- Leon French
  - Department of Physiology and Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
- Helen Cross
  - Allen Institute for Brain Science, Seattle, WA, USA
- Xiaoyin Chen
  - Allen Institute for Brain Science, Seattle, WA, USA
- Jesse Gillis
  - Department of Physiology and Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada
4
Shad MA, Li X, Rao MJ, Luo Z, Li X, Ali A, Wang L. Exploring Lignin Biosynthesis Genes in Rice: Evolution, Function, and Expression. Int J Mol Sci 2024; 25:10001. [PMID: 39337489 PMCID: PMC11432410 DOI: 10.3390/ijms251810001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2024] [Revised: 09/09/2024] [Accepted: 09/11/2024] [Indexed: 09/30/2024] [Open Access]
Abstract
Lignin is nature's second most abundant vascular plant biopolymer, playing significant roles in mechanical support, water transport, and stress responses. This study identified 90 lignin biosynthesis genes in rice based on phylogeny and motif constitution, and they belong to PAL, C4H, 4CL, HCT, C3H, CCoAOMT, CCR, F5H, COMT, and CAD families. Duplication events contributed largely to the expansion of these gene families, such as PAL, CCoAOMT, CCR, and CAD families, mainly attributed to tandem and segmental duplication. Microarray data of 33 tissue samples covering the entire life cycle of rice suggested fairly high PAL, HCT, C3H, CCoAOMT, CCR, COMT, and CAD gene expressions and rather variable C4H, 4CL, and F5H expressions. Some members of lignin-related genes (OsCCRL11, OsHCT1/2/5, OsCCoAOMT1/3/5, OsCOMT, OsC3H, OsCAD2, and OsPAL1/6) were expressed in all tissues examined. The expression patterns of lignin-related genes can be divided into two major groups with eight subgroups, each showing a distinct co-expression in tissues representing typically primary and secondary cell wall constitutions. Some lignin-related genes were strongly co-expressed in tissues typical of secondary cell walls. Combined HPLC analysis showed increased lignin monomer (H, G, and S) contents from young to old growth stages in five genotypes. Based on 90 genes' microarray data, 27 genes were selected for qRT-PCR gene expression analysis. Four genes (OsPAL9, OsCAD8C, OsCCR8, and OsCOMTL4) were significantly negatively correlated with lignin monomers. Furthermore, eleven genes were co-expressed in certain genotypes during secondary growth stages. Among them, six genes (OsC3H, OsCAD2, OsCCR2, OsCOMT, OsPAL2, and OsPAL8) were overlapped with microarray gene expressions, highlighting their importance in lignin biosynthesis.
Affiliation(s)
- Munsif Ali Shad
  - State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangxi Key Laboratory of Sugarcane Biology, College of Agriculture, Guangxi University, 100 Daxue Rd., Nanning 530004, China
- Xukai Li
  - College of Life Sciences, Shanxi Agricultural University, Taigu 030801, China
  - Biomass & Bioenergy Research Centre, College of Plant Science & Technology, Huazhong Agricultural University, Wuhan 430070, China
- Muhammad Junaid Rao
  - State Key Laboratory of Subtropical Silviculture, College of Forestry and Biotechnology, Zhejiang A & F University, Hangzhou 311300, China
- Zixuan Luo
  - State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangxi Key Laboratory of Sugarcane Biology, College of Agriculture, Guangxi University, 100 Daxue Rd., Nanning 530004, China
- Xianlong Li
  - State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangxi Key Laboratory of Sugarcane Biology, College of Agriculture, Guangxi University, 100 Daxue Rd., Nanning 530004, China
- Aamir Ali
  - College of Agriculture, Shanxi Agricultural University, Taigu 030801, China
- Lingqiang Wang
  - State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangxi Key Laboratory of Sugarcane Biology, College of Agriculture, Guangxi University, 100 Daxue Rd., Nanning 530004, China
  - Biomass & Bioenergy Research Centre, College of Plant Science & Technology, Huazhong Agricultural University, Wuhan 430070, China
5
Unger Avila P, Padvitski T, Leote AC, Chen H, Saez-Rodriguez J, Kann M, Beyer A. Gene regulatory networks in disease and ageing. Nat Rev Nephrol 2024; 20:616-633. [PMID: 38867109 DOI: 10.1038/s41581-024-00849-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/15/2024] [Indexed: 06/14/2024]
Abstract
The precise control of gene expression is required for the maintenance of cellular homeostasis and proper cellular function, and the declining control of gene expression with age is considered a major contributor to age-associated changes in cellular physiology and disease. The coordination of gene expression can be represented through models of the molecular interactions that govern gene expression levels, so-called gene regulatory networks. Gene regulatory networks can represent interactions that occur through signal transduction, those that involve regulatory transcription factors, or statistical models of gene-gene relationships based on the premise that certain sets of genes tend to be coexpressed across a range of conditions and cell types. Advances in experimental and computational technologies have enabled the inference of these networks on an unprecedented scale and at unprecedented precision. Here, we delineate different types of gene regulatory networks and their cell-biological interpretation. We describe methods for inferring such networks from large-scale, multi-omics datasets and present applications that have aided our understanding of cellular ageing and disease mechanisms.
Affiliation(s)
- Paula Unger Avila
  - Cluster of Excellence on Cellular Stress Responses in Aging-associated Diseases (CECAD), University of Cologne, Cologne, Germany
- Tsimafei Padvitski
  - Cluster of Excellence on Cellular Stress Responses in Aging-associated Diseases (CECAD), University of Cologne, Cologne, Germany
- Ana Carolina Leote
  - Cluster of Excellence on Cellular Stress Responses in Aging-associated Diseases (CECAD), University of Cologne, Cologne, Germany
- He Chen
  - Cluster of Excellence on Cellular Stress Responses in Aging-associated Diseases (CECAD), University of Cologne, Cologne, Germany
  - Department II of Internal Medicine, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Julio Saez-Rodriguez
  - Faculty of Medicine and Heidelberg University Hospital, Institute for Computational Biomedicine, Heidelberg University, Heidelberg, Germany
- Martin Kann
  - Department II of Internal Medicine, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
  - Center for Molecular Medicine Cologne, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
- Andreas Beyer
  - Cluster of Excellence on Cellular Stress Responses in Aging-associated Diseases (CECAD), University of Cologne, Cologne, Germany
  - Center for Molecular Medicine Cologne, Faculty of Medicine and University Hospital Cologne, University of Cologne, Cologne, Germany
  - Institute for Genetics, Faculty of Mathematics and Natural Sciences, University of Cologne, Cologne, Germany
6
García-Blay Ó, Verhagen PGA, Martin B, Hansen MMK. Exploring the role of transcriptional and post-transcriptional processes in mRNA co-expression. Bioessays 2023; 45:e2300130. [PMID: 37926676 DOI: 10.1002/bies.202300130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Revised: 09/18/2023] [Accepted: 10/09/2023] [Indexed: 11/07/2023]
Abstract
Co-expression of two or more genes at the single-cell level is usually associated with functional co-regulation. While mRNA co-expression (measured as the correlation in mRNA levels) can be influenced by both transcriptional and post-transcriptional events, transcriptional regulation is typically considered dominant. We review and connect the literature describing transcriptional and post-transcriptional regulation of co-expression. To enhance our understanding, we integrate four datasets spanning single-cell gene expression data, single-cell promoter activity data and individual transcript half-lives. Confirming expectations, we find that positive co-expression necessitates promoter coordination and similar mRNA half-lives. Surprisingly, negative co-expression is favored by differences in mRNA half-lives, contrary to initial predictions from stochastic simulations. Notably, this association manifests specifically within clusters of genes. We further observe a striking compensation between promoter coordination and mRNA half-lives, which additional stochastic simulations suggest might give rise to the observed co-expression patterns. These findings raise intriguing questions about the functional advantages conferred by this compensation between distal kinetic steps.
Affiliation(s)
- Óscar García-Blay
  - Institute for Molecules and Materials, Radboud University, AJ, Nijmegen, the Netherlands
- Pieter G A Verhagen
  - Institute for Molecules and Materials, Radboud University, AJ, Nijmegen, the Netherlands
- Benjamin Martin
  - Institute for Molecules and Materials, Radboud University, AJ, Nijmegen, the Netherlands
- Maike M K Hansen
  - Institute for Molecules and Materials, Radboud University, AJ, Nijmegen, the Netherlands
7
Young RL, Price SM, Schumer M, Wang S, Cummings ME. Individual variation in preference behavior in sailfin fish refines the neurotranscriptomic pathway for mate preference. Ecol Evol 2023; 13:e10323. [PMID: 37492456 PMCID: PMC10363800 DOI: 10.1002/ece3.10323] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2023] [Revised: 06/22/2023] [Accepted: 06/30/2023] [Indexed: 07/27/2023] [Open Access]
Abstract
Social interactions can drive distinct gene expression profiles which may vary by social context. Here we use female sailfin molly fish (Poecilia latipinna) to identify genomic profiles associated with preference behavior in distinct social contexts: male interactions (mate choice) versus female interactions (shoaling partner preference). We measured the behavior of 15 females interacting in a non-contact environment with either two males or two females for 30 min followed by whole-brain transcriptomic profiling by RNA sequencing. We profiled females that exhibited high levels of social affiliation and great variation in preference behavior to identify an order of magnitude more differentially expressed genes associated with behavioral variation than with differences in social context. Using a linear model (limma), we took advantage of the individual variation in preference behavior to identify unique gene sets that exhibited distinct correlational patterns of expression with preference behavior in each social context. By combining limma and weighted gene co-expression network analysis (WGCNA) approaches we identified a refined set of 401 genes robustly associated with mate preference that is independent of shoaling partner preference or general social affiliation. While our refined gene set confirmed the involvement of neural plasticity pathways in moderating female preference behavior, we also discovered that our preference-associated genes were enriched for 'immune system' gene ontology categories. We hypothesize that the association between mate preference and transcriptomic immune function is driven by the less well-known role of these genes in neural plasticity which is likely involved in higher-order learning and processing during mate choice decisions.
Affiliation(s)
- Rebecca L. Young
  - Department of Integrative Biology, University of Texas, Austin, Texas, USA
- Sarah M. Price
  - Department of Integrative Biology, University of Texas, Austin, Texas, USA
- Molly Schumer
  - Department of Ecology and Evolutionary Biology, Princeton University, Princeton, New Jersey, USA
  - Present address: Department of Biology, Stanford University, Stanford, California, USA
- Silu Wang
  - Department of Integrative Biology, University of Texas, Austin, Texas, USA
  - Present address: Department of Integrative Biology, University of California, Berkeley, Berkeley, California, USA
8
Depuydt T, De Rybel B, Vandepoele K. Charting plant gene functions in the multi-omics and single-cell era. Trends Plant Sci 2023; 28:283-296. [PMID: 36307271 DOI: 10.1016/j.tplants.2022.09.008] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 05/20/2022] [Revised: 09/09/2022] [Accepted: 09/30/2022] [Indexed: 06/16/2023]
Abstract
Despite the increased access to high-quality plant genome sequences, the set of genes with a known function remains far from complete. With the advent of novel bulk and single-cell omics profiling methods, we are entering a new era where advanced and highly integrative functional annotation strategies are being developed to elucidate the functions of all plant genes. Here, we review different multi-omics approaches to improve functional and regulatory gene characterization and highlight the power of machine learning and network biology to fully exploit the complementary information embedded in different omics layers. Finally, we discuss the potential of emerging single-cell methods and algorithms to further increase the resolution, allowing generation of functional insights about plant biology.
Affiliation(s)
- Thomas Depuydt
  - Ghent University, Department of Plant Biotechnology and Bioinformatics, Ghent, Belgium; Vlaams Instituut voor Biotechnologie, Center for Plant Systems Biology, Ghent, Belgium
- Bert De Rybel
  - Ghent University, Department of Plant Biotechnology and Bioinformatics, Ghent, Belgium; Vlaams Instituut voor Biotechnologie, Center for Plant Systems Biology, Ghent, Belgium
- Klaas Vandepoele
  - Ghent University, Department of Plant Biotechnology and Bioinformatics, Ghent, Belgium; Vlaams Instituut voor Biotechnologie, Center for Plant Systems Biology, Ghent, Belgium; Ghent University, Bioinformatics Institute Ghent, Ghent, Belgium
9
Szklarczyk D, Kirsch R, Koutrouli M, Nastou K, Mehryary F, Hachilif R, Gable AL, Fang T, Doncheva NT, Pyysalo S, Bork P, Jensen LJ, von Mering C. The STRING database in 2023: protein-protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res 2023; 51:D638-D646. [PMID: 36370105 PMCID: PMC9825434 DOI: 10.1093/nar/gkac1000] [Citation(s) in RCA: 1720] [Impact Index Per Article: 1720.0] [Received: 09/15/2022] [Revised: 10/10/2022] [Accepted: 10/19/2022] [Indexed: 11/13/2022] [Open Access]
Abstract
Much of the complexity within cells arises from functional and regulatory interactions among proteins. The core of these interactions is increasingly known, but novel interactions continue to be discovered, and the information remains scattered across different database resources, experimental modalities and levels of mechanistic detail. The STRING database (https://string-db.org/) systematically collects and integrates protein-protein interactions, both physical interactions and functional associations. The data originate from a number of sources: automated text mining of the scientific literature, computational interaction predictions from co-expression, conserved genomic context, databases of interaction experiments and known complexes/pathways from curated sources. All of these interactions are critically assessed, scored, and subsequently automatically transferred to less well-studied organisms using hierarchical orthology information. The data can be accessed via the website, but also programmatically and via bulk downloads. The most recent developments in STRING (version 12.0) are: (i) it is now possible to create, browse and analyze a full interaction network for any novel genome of interest, by submitting its complement of encoded proteins, (ii) the co-expression channel now uses variational auto-encoders to predict interactions, and it covers two new sources, single-cell RNA-seq and experimental proteomics data and (iii) the confidence in each experimentally derived interaction is now estimated based on the detection method used, and communicated to the user in the web-interface. Furthermore, STRING continues to enhance its facilities for functional enrichment analysis, which are now fully available also for user-submitted genomes.
Affiliation(s)
- Damian Szklarczyk
  - Department of Molecular Life Sciences, University of Zurich, 8057 Zurich, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Rebecca Kirsch
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 Copenhagen N, Denmark
- Mikaela Koutrouli
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 Copenhagen N, Denmark
- Katerina Nastou
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 Copenhagen N, Denmark
- Farrokh Mehryary
  - TurkuNLP lab, Department of Computing, University of Turku, 20014 Turku, Finland
- Radja Hachilif
  - Department of Molecular Life Sciences, University of Zurich, 8057 Zurich, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Annika L Gable
  - Department of Molecular Life Sciences, University of Zurich, 8057 Zurich, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Tao Fang
  - Department of Molecular Life Sciences, University of Zurich, 8057 Zurich, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Nadezhda T Doncheva
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 Copenhagen N, Denmark
- Sampo Pyysalo
  - TurkuNLP lab, Department of Computing, University of Turku, 20014 Turku, Finland
- Peer Bork
  - Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
  - Yonsei Frontier Lab (YFL), Yonsei University, Seoul 03722, South Korea
  - Max Delbrück Centre for Molecular Medicine, 13125 Berlin, Germany
  - Department of Bioinformatics, Biozentrum, University of Würzburg, 97074 Würzburg, Germany
- Lars J Jensen
  - Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, 2200 Copenhagen N, Denmark
- Christian von Mering
  - Department of Molecular Life Sciences, University of Zurich, 8057 Zurich, Switzerland
  - SIB Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
10
Chen Z, King WC, Hwang A, Gerstein M, Zhang J. DeepVelo: Single-cell transcriptomic deep velocity field learning with neural ordinary differential equations. Sci Adv 2022; 8:eabq3745. [PMID: 36449617 PMCID: PMC9710871 DOI: 10.1126/sciadv.abq3745] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Indexed: 05/14/2023]
Abstract
Recent advances in single-cell sequencing technologies have provided unprecedented opportunities to measure the gene expression profile and RNA velocity of individual cells. However, modeling transcriptional dynamics is computationally challenging because of the high-dimensional, sparse nature of the single-cell gene expression measurements and the nonlinear regulatory relationships. Here, we present DeepVelo, a neural network-based ordinary differential equation that can model complex transcriptome dynamics by describing continuous-time gene expression changes within individual cells. We apply DeepVelo to public datasets from different sequencing platforms to (i) formulate transcriptome dynamics on different time scales, (ii) measure the instability of cell states, and (iii) identify developmental driver genes via perturbation analysis. Benchmarking against the state-of-the-art methods shows that DeepVelo can learn a more accurate representation of the velocity field. Furthermore, our perturbation studies reveal that single-cell dynamical systems could exhibit chaotic properties. In summary, DeepVelo allows data-driven discoveries of differential equations that delineate single-cell transcriptome dynamics.
Affiliation(s)
- Zhanlin Chen
  - Department of Statistics and Data Science, Yale University, New Haven, CT 06520, USA
- William C. King
  - Healthcare and Life Sciences, Microsoft, Redmond, WA 98052, USA
- Aheyon Hwang
  - Mathematical, Computational, and Systems Biology, University of California, Irvine, Irvine, CA 92697, USA
- Mark Gerstein
  - Department of Statistics and Data Science, Yale University, New Haven, CT 06520, USA
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
  - Department of Computer Science, Yale University, New Haven, CT 06520, USA
  - Corresponding author
- Jing Zhang
  - Department of Computer Science, University of California, Irvine, Irvine, CA 92697, USA
  - Corresponding author
11
Cha J, Yu J, Cho JW, Hemberg M, Lee I. scHumanNet: a single-cell network analysis platform for the study of cell-type specificity of disease genes. Nucleic Acids Res 2022; 51:e8. [PMID: 36350625 PMCID: PMC9881140 DOI: 10.1093/nar/gkac1042] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/20/2022] [Revised: 09/19/2022] [Accepted: 10/25/2022] [Indexed: 11/10/2022] [Open Access]
Abstract
A major challenge in single-cell biology is identifying cell-type-specific gene functions, which may substantially improve precision medicine. Differential expression analysis of genes is a popular, yet insufficient approach, and complementary methods that associate function with cell type are required. Here, we describe scHumanNet (https://github.com/netbiolab/scHumanNet), a single-cell network analysis platform for resolving cellular heterogeneity across gene functions in humans. Based on cell-type-specific gene networks (CGNs) constructed under the guidance of the HumanNet reference interactome, scHumanNet displayed higher functional relevance to the cellular context than CGNs built by other methods on single-cell transcriptome data. Cellular deconvolution of gene signatures based on network compactness across cell types revealed breast cancer prognostic markers associated with T cells. scHumanNet could also prioritize genes associated with particular cell types using CGN centrality and identified the differential hubness of CGNs between disease and healthy conditions. We demonstrated the usefulness of scHumanNet by uncovering T-cell-specific functional effects of GITR, a prognostic gene for breast cancer, and functional defects in autism spectrum disorder genes specific for inhibitory neurons. These results suggest that scHumanNet will advance our understanding of cell-type specificity across human disease genes.
Affiliation(s)
- Junha Cha
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Republic of Korea
- Jiwon Yu
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Republic of Korea
- Jae-Won Cho
  - Evergrande Center for Immunologic Disease, Harvard Medical School and Brigham and Women's Hospital, Boston, MA, USA
- Martin Hemberg
  - Correspondence may also be addressed to Martin Hemberg. Tel: +1 857 307 1422
- Insuk Lee
  - To whom correspondence should be addressed. Tel: +82 2 2123 5559; Fax: +82 2 362 7265
12
Yang M, Harrison BR, Promislow DEL. In search of a Drosophila core cellular network with single-cell transcriptome data. G3 (Bethesda) 2022; 12:6670625. [PMID: 35976114 PMCID: PMC9526075 DOI: 10.1093/g3journal/jkac212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2022] [Accepted: 08/03/2022] [Indexed: 11/29/2022]
Abstract
Along with specialized functions, cells of multicellular organisms also perform essential functions common to most if not all cells. Whether diverse cells do this by using the same set of genes, interacting in a fixed coordinated fashion to execute essential functions, or a subset of genes specific to certain cells, remains a central question in biology. Here, we focus on gene coexpression to search for a core cellular network across a whole organism. Single-cell RNA-sequencing measures gene expression of individual cells, enabling researchers to discover gene expression patterns that contribute to the diversity of cell functions. Current efforts to study cellular functions focus primarily on identifying differentially expressed genes across cells. However, patterns of coexpression between genes are probably more indicative of biological processes than is the expression of individual genes. We constructed cell-type-specific gene coexpression networks using single-cell transcriptome datasets covering diverse cell types from the fruit fly, Drosophila melanogaster. We detected a set of highly coordinated genes preserved across cell types and present this as the best estimate of a core cellular network. This core is very small compared with cell-type-specific gene coexpression networks and shows dense connectivity. Gene members of this core tend to be ancient genes and are enriched for those encoding ribosomal proteins. Overall, we find evidence for a core cellular network in diverse cell types of the fruit fly. The topological, structural, functional, and evolutionary properties of this core indicate that it accounts for only a minority of essential functions.
Affiliation(s)
- Ming Yang
  - Department of Laboratory Medicine and Pathology, University of Washington School of Medicine, Seattle, WA 98195, USA
- Benjamin R Harrison
  - Department of Laboratory Medicine and Pathology, University of Washington School of Medicine, Seattle, WA 98195, USA
- Daniel E L Promislow
  - Department of Laboratory Medicine and Pathology, University of Washington School of Medicine, Seattle, WA 98195, USA
  - Department of Biology, University of Washington, Seattle, WA 98195, USA
|
13
|
Shared regulation and functional relevance of local gene co-expression revealed by single cell analysis. Commun Biol 2022; 5:876. [PMID: 36028576 PMCID: PMC9418141 DOI: 10.1038/s42003-022-03831-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/09/2022] [Accepted: 08/10/2022] [Indexed: 02/01/2023] Open
Abstract
Most human genes are co-expressed with a nearby gene. Previous studies have revealed this local gene co-expression to be widespread across chromosomes and across dozens of tissues. Yet, so far these studies used bulk RNA-seq, averaging gene expression measurements across millions of cells, thus being unclear if this co-expression stems from transcription events in single cells. Here, we leverage single cell datasets in >85 individuals to identify gene co-expression across cells, unbiased by cell-type heterogeneity and benefiting from the co-occurrence of transcription events in single cells. We discover >3800 co-expressed gene pairs in two human cell types, induced pluripotent stem cells (iPSCs) and lymphoblastoid cell lines (LCLs) and (i) compare single cell to bulk RNA-seq in identifying local gene co-expression, (ii) show that many co-expressed genes – but not the majority – are composed of functionally related genes and (iii) using proteomics data, provide evidence that their co-expression is maintained up to the protein level. Finally, using single cell RNA-sequencing (scRNA-seq) and single cell ATAC-sequencing (scATAC-seq) data for the same single cells, we identify gene-enhancer associations and reveal that >95% of co-expressed gene pairs share regulatory elements. These results elucidate the potential reasons for co-expression in single cell gene regulatory networks and warrant a deeper study of shared regulatory elements, in view of explaining disease comorbidity due to affecting several genes. Our in-depth view of local gene co-expression and regulatory element co-activity advances our understanding of the shared regulatory architecture between genes. Using single-cell data from cell lines, the co-expression of genes and co-activity of regulatory elements is analyzed, providing insight into shared architecture and regulation between genes.
|
14
|
Johnson KA, Krishnan A. Robust normalization and transformation techniques for constructing gene coexpression networks from RNA-seq data. Genome Biol 2022; 23:1. [PMID: 34980209 PMCID: PMC8721966 DOI: 10.1186/s13059-021-02568-9] [Citation(s) in RCA: 52] [Impact Index Per Article: 26.0] [Received: 09/28/2020] [Accepted: 12/06/2021] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND: Constructing gene coexpression networks is a powerful approach for analyzing high-throughput gene expression data towards module identification, gene function prediction, and disease-gene prioritization. While optimal workflows for constructing coexpression networks, including good choices for data pre-processing, normalization, and network transformation, have been developed for microarray-based expression data, such well-tested choices do not exist for RNA-seq data. Almost all studies that compare data processing and normalization methods for RNA-seq focus on the end goal of determining differential gene expression. RESULTS: Here, we present a comprehensive benchmarking and analysis of 36 different workflows, each with a unique set of normalization and network transformation methods, for constructing coexpression networks from RNA-seq datasets. We test these workflows on both large, homogenous datasets and small, heterogeneous datasets from various labs. We analyze the workflows in terms of aggregate performance, individual method choices, and the impact of multiple dataset experimental factors. Our results demonstrate that between-sample normalization has the biggest impact, with counts adjusted by size factors producing networks that most accurately recapitulate known tissue-naive and tissue-aware gene functional relationships. CONCLUSIONS: Based on this work, we provide concrete recommendations on robust procedures for building an accurate coexpression network from an RNA-seq dataset. In addition, researchers can examine all the results in great detail at https://krishnanlab.github.io/RNAseq_coexpression to make appropriate choices for coexpression analysis based on the experimental factors of their RNA-seq dataset.
Affiliation(s)
- Kayla A Johnson
  - Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI, 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, 48824, USA
- Arjun Krishnan
  - Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI, 48824, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, 48824, USA
|
15
|
A critical period of translational control during brain development at codon resolution. Nat Struct Mol Biol 2022; 29:1277-1290. [PMID: 36482253 PMCID: PMC9758057 DOI: 10.1038/s41594-022-00882-9] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 11/22/2021] [Accepted: 10/19/2022] [Indexed: 12/13/2022]
Abstract
Translation modulates the timing and amplification of gene expression after transcription. Brain development requires uniquely complex gene expression patterns, but large-scale measurements of translation directly in the prenatal brain are lacking. We measure the reactants, synthesis and products of mRNA translation spanning mouse neocortex neurogenesis, and discover a transient window of dynamic regulation at mid-gestation. Timed translation upregulation of chromatin-binding proteins like Satb2, which is essential for neuronal subtype differentiation, restricts protein expression in neuronal lineages despite broad transcriptional priming in progenitors. In contrast, translation downregulation of ribosomal proteins sharply decreases ribosome biogenesis, coinciding with a major shift in protein synthesis dynamics at mid-gestation. Changing activity of eIF4EBP1, a direct inhibitor of ribosome biogenesis, is concurrent with ribosome downregulation and affects neurogenesis of the Satb2 lineage. Thus, the molecular logic of brain development includes the refinement of transcriptional programs by translation. Modeling of the developmental neocortex translatome is provided as an open-source searchable resource at https://shiny.mdc-berlin.de/cortexomics.
|