1. Zhao Y, Yu ZM, Cui T, Li LD, Li YY, Qian FC, Zhou LW, Li Y, Fang QL, Huang XM, Zhang QY, Cai FH, Dong FJ, Shang DS, Li CQ, Wang QY. scBlood: A comprehensive single-cell accessible chromatin database of blood cells. Comput Struct Biotechnol J 2024; 23:2746-2753. [PMID: 39050785 PMCID: PMC11266868 DOI: 10.1016/j.csbj.2024.06.015] [Received: 04/16/2024] [Revised: 06/17/2024] [Accepted: 06/18/2024] [Indexed: 07/27/2024] Open
Abstract
The advent of single cell transposase-accessible chromatin sequencing (scATAC-seq) technology enables us to explore the genomic characteristics and chromatin accessibility of blood cells at the single-cell level. To fully make sense of the roles and regulatory complexities of blood cells, it is critical to collect and analyze these rapidly accumulating scATAC-seq datasets at a system level. Here, we present scBlood (https://bio.liclab.net/scBlood/), a comprehensive single-cell accessible chromatin database of blood cells. The current version of scBlood catalogs 770,907 blood cells and 452,247 non-blood cells from ∼400 high-quality scATAC-seq samples covering 30 tissues and 21 disease types. All data hosted on scBlood have undergone preprocessing from raw fastq files and multiple standards of quality control. Furthermore, we conducted comprehensive downstream analyses, including multi-sample integration analysis, cell clustering and annotation, differential chromatin accessibility analysis, functional enrichment analysis, co-accessibility analysis, gene activity score calculation, and transcription factor (TF) enrichment analysis. In summary, scBlood provides a user-friendly interface for searching, browsing, analyzing, visualizing, and downloading scATAC-seq data of interest. This platform facilitates insights into the functions and regulatory mechanisms of blood cells, as well as their involvement in blood-related diseases.
Affiliation(s)
- Yu Zhao
  - The First Affiliated Hospital & MOE Key Lab of Rare Pediatric Diseases, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
  - School of Computer, University of South China, Hengyang, Hunan 421001, China
- Zheng-Min Yu
  - School of Computer, University of South China, Hengyang, Hunan 421001, China
- Ting Cui
  - The First Affiliated Hospital & MOE Key Lab of Rare Pediatric Diseases, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
  - Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- Li-Dong Li
  - School of Computer, University of South China, Hengyang, Hunan 421001, China
- Yan-Yu Li
  - School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
- Feng-Cui Qian
  - The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- Li-Wei Zhou
  - State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- Ye Li
  - Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- Qiao-Li Fang
  - School of Computer, University of South China, Hengyang, Hunan 421001, China
- Xue-Mei Huang
  - School of Computer, University of South China, Hengyang, Hunan 421001, China
- Qin-Yi Zhang
  - Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- Fu-Hong Cai
  - School of Computer, University of South China, Hengyang, Hunan 421001, China
- Fu-Juan Dong
  - School of Computer, University of South China, Hengyang, Hunan 421001, China
- De-Si Shang
  - The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- Chun-Quan Li
  - The First Affiliated Hospital & MOE Key Lab of Rare Pediatric Diseases, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
  - School of Computer, University of South China, Hengyang, Hunan 421001, China
  - Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
  - Hunan Provincial Key Laboratory of Multi-omics And Artificial Intelligence of Cardiovascular Diseases, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
  - Hunan Provincial Maternal and Child Health Care Hospital, National Health Commission Key Laboratory of Birth Defect Research and Prevention, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
  - The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
  - The First Affiliated Hospital, Institute of Cardiovascular Disease, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
- Qiu-Yu Wang
  - School of Computer, University of South China, Hengyang, Hunan 421001, China
  - Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
  - Hunan Provincial Key Laboratory of Multi-omics And Artificial Intelligence of Cardiovascular Diseases, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China
  - Hunan Provincial Maternal and Child Health Care Hospital, National Health Commission Key Laboratory of Birth Defect Research and Prevention, Hengyang Medical School, University of South China, Hengyang, Hunan 421001, China

2. Hu Y, Wan S, Luo Y, Li Y, Wu T, Deng W, Jiang C, Jiang S, Zhang Y, Liu N, Yang Z, Chen F, Li B, Qu K. Benchmarking algorithms for single-cell multi-omics prediction and integration. Nat Methods 2024:10.1038/s41592-024-02429-w. [PMID: 39322753 DOI: 10.1038/s41592-024-02429-w] [Received: 04/14/2023] [Accepted: 08/19/2024] [Indexed: 09/27/2024]
Abstract
The development of single-cell multi-omics technology has greatly enhanced our understanding of biology, and in parallel, numerous algorithms have been proposed to predict the protein abundance and/or chromatin accessibility of cells from single-cell transcriptomic information and to integrate various types of single-cell multi-omics data. However, few studies have systematically compared and evaluated the performance of these algorithms. Here, we present a benchmark study of 14 protein abundance/chromatin accessibility prediction algorithms and 18 single-cell multi-omics integration algorithms using 47 single-cell multi-omics datasets. Our benchmark study showed overall totalVI and scArches outperformed the other algorithms for predicting protein abundance, and LS_Lab was the top-performing algorithm for the prediction of chromatin accessibility in most cases. Seurat, MOJITOO and scAI emerge as leading algorithms for vertical integration, whereas totalVI and UINMF excel beyond their counterparts in both horizontal and mosaic integration scenarios. Additionally, we provide a pipeline to assist researchers in selecting the optimal multi-omics prediction and integration algorithm.
Affiliation(s)
- Yinlei Hu
  - Department of Oncology, The First Affiliated Hospital of USTC, School of Basic Medical Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
  - Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
  - School of Mathematical Science, University of Science and Technology of China, Hefei, China
- Siyuan Wan
  - Department of Oncology, The First Affiliated Hospital of USTC, School of Basic Medical Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
  - Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
  - School of Artificial Intelligence and Data Science, University of Science and Technology of China, Hefei, China
- Yuanhanyu Luo
  - Tsinghua Institute of Multidisciplinary Biomedical Research, Tsinghua University, Beijing, China
  - National Institute of Biological Sciences, Beijing, China
- Yuanzhe Li
  - Department of Oncology, The First Affiliated Hospital of USTC, School of Basic Medical Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
  - Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
  - School of Artificial Intelligence and Data Science, University of Science and Technology of China, Hefei, China
- Tong Wu
  - National Institute of Biological Sciences, Beijing, China
  - College of Life Sciences, Beijing Normal University, Beijing, China
- Wentao Deng
  - Department of Oncology, The First Affiliated Hospital of USTC, School of Basic Medical Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
  - Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
- Chen Jiang
  - Department of Oncology, The First Affiliated Hospital of USTC, School of Basic Medical Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
  - Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
- Shan Jiang
  - National Institute of Biological Sciences, Beijing, China
- Yueping Zhang
  - School of Artificial Intelligence and Data Science, University of Science and Technology of China, Hefei, China
- Nianping Liu
  - School of Biomedical Engineering, Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, China
- Zongcheng Yang
  - Department of Oncology, The First Affiliated Hospital of USTC, School of Basic Medical Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
- Falai Chen
  - School of Mathematical Science, University of Science and Technology of China, Hefei, China
  - School of Artificial Intelligence and Data Science, University of Science and Technology of China, Hefei, China
- Bin Li
  - Tsinghua Institute of Multidisciplinary Biomedical Research, Tsinghua University, Beijing, China
  - National Institute of Biological Sciences, Beijing, China
- Kun Qu
  - Department of Oncology, The First Affiliated Hospital of USTC, School of Basic Medical Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China
  - Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China
  - School of Artificial Intelligence and Data Science, University of Science and Technology of China, Hefei, China
  - School of Biomedical Engineering, Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, China

3. Jin S, Plikus MV, Nie Q. CellChat for systematic analysis of cell-cell communication from single-cell transcriptomics. Nat Protoc 2024:10.1038/s41596-024-01045-4. [PMID: 39289562 DOI: 10.1038/s41596-024-01045-4] [Received: 07/31/2023] [Accepted: 06/27/2024] [Indexed: 09/19/2024]
Abstract
Recent advances in single-cell sequencing technologies offer an opportunity to explore cell-cell communication in tissues systematically and with reduced bias. A key challenge is integrating known molecular interactions and measurements into a framework to identify and analyze complex cell-cell communication networks. Previously, we developed a computational tool, named CellChat, that infers and analyzes cell-cell communication networks from single-cell transcriptomic data within an easily interpretable framework. CellChat quantifies the signaling communication probability between two cell groups using a simplified mass-action-based model, which incorporates the core interaction between ligands and receptors with multisubunit structure along with modulation by cofactors. Importantly, CellChat performs a systematic and comparative analysis of cell-cell communication using a variety of quantitative metrics and machine-learning approaches. CellChat v2 is an updated version that includes additional comparison functionalities, an expanded database of ligand-receptor pairs along with rich functional annotations, and an Interactive CellChat Explorer. Here we provide a step-by-step protocol for using CellChat v2 on single-cell transcriptomic data, including inference and analysis of cell-cell communication from one dataset and identification of altered intercellular communication, signals and cell populations from different datasets across biological conditions. The R implementation of CellChat v2 toolkit and its tutorials together with the graphic outputs are available at https://github.com/jinworks/CellChat . This protocol typically takes ~5 min depending on dataset size and requires a basic understanding of R and single-cell data analysis but no specialized bioinformatics training for its implementation.
Affiliation(s)
- Suoqin Jin
  - School of Mathematics and Statistics, Wuhan University, Wuhan, China
  - Hubei Key Laboratory of Computational Science, Wuhan University, Wuhan, China
- Maksim V Plikus
  - NSF-Simons Center for Multiscale Cell Fate Research, University of California, Irvine, Irvine, CA, USA
  - Department of Developmental and Cell Biology, University of California, Irvine, Irvine, CA, USA
- Qing Nie
  - NSF-Simons Center for Multiscale Cell Fate Research, University of California, Irvine, Irvine, CA, USA
  - Department of Developmental and Cell Biology, University of California, Irvine, Irvine, CA, USA
  - Department of Mathematics, University of California, Irvine, Irvine, CA, USA

4. Loers JU, Vermeirssen V. A single-cell multimodal view on gene regulatory network inference from transcriptomics and chromatin accessibility data. Brief Bioinform 2024; 25:bbae382. [PMID: 39207727 PMCID: PMC11359808 DOI: 10.1093/bib/bbae382] [Received: 04/05/2024] [Revised: 06/27/2024] [Accepted: 07/23/2024] [Indexed: 09/04/2024] Open
Abstract
Eukaryotic gene regulation is a combinatorial, dynamic, and quantitative process that plays a vital role in development and disease and can be modeled at a systems level in gene regulatory networks (GRNs). The wealth of multi-omics data measured on the same samples and even on the same cells has lifted the field of GRN inference to the next stage. Combinations of (single-cell) transcriptomics and chromatin accessibility allow the prediction of fine-grained regulatory programs that go beyond mere correlation of transcription factor and target gene expression, with enhancer GRNs (eGRNs) modeling molecular interactions between transcription factors, regulatory elements, and target genes. In this review, we highlight the key components for successful (e)GRN inference from (sc)RNA-seq and (sc)ATAC-seq data exemplified by state-of-the-art methods as well as open challenges and future developments. Moreover, we address preprocessing strategies, metacell generation and computational omics pairing, transcription factor binding site detection, and linear and three-dimensional approaches to identify chromatin interactions as well as dynamic and causal eGRN inference. We believe that the integration of transcriptomics together with epigenomics data at a single-cell level is the new standard for mechanistic network inference, and that it can be further advanced with integrating additional omics layers and spatiotemporal data, as well as with shifting the focus towards more quantitative and causal modeling strategies.
Affiliation(s)
- Jens Uwe Loers
  - Lab for Computational Biology, Integromics and Gene Regulation (CBIGR), Cancer Research Institute Ghent (CRIG), Corneel Heymanslaan 10, 9000 Ghent, Belgium
  - Department of Biomedical Molecular Biology, Ghent University, Zwijnaarde-Technologiepark 71, 9052 Ghent, Belgium
  - Department of Biomolecular Medicine, Ghent University, Corneel Heymanslaan 10, 9000 Ghent, Belgium
- Vanessa Vermeirssen
  - Lab for Computational Biology, Integromics and Gene Regulation (CBIGR), Cancer Research Institute Ghent (CRIG), Corneel Heymanslaan 10, 9000 Ghent, Belgium
  - Department of Biomedical Molecular Biology, Ghent University, Zwijnaarde-Technologiepark 71, 9052 Ghent, Belgium
  - Department of Biomolecular Medicine, Ghent University, Corneel Heymanslaan 10, 9000 Ghent, Belgium

5. Li Y, Ma A, Wang Y, Guo Q, Wang C, Fu H, Liu B, Ma Q. Enhancer-driven gene regulatory networks inference from single-cell RNA-seq and ATAC-seq data. Brief Bioinform 2024; 25:bbae369. [PMID: 39082647 PMCID: PMC11289686 DOI: 10.1093/bib/bbae369] [Received: 04/02/2024] [Revised: 06/19/2024] [Accepted: 07/15/2024] [Indexed: 08/03/2024] Open
Abstract
Deciphering the intricate relationships between transcription factors (TFs), enhancers, and genes through the inference of enhancer-driven gene regulatory networks (eGRNs) is crucial in understanding gene regulatory programs in a complex biological system. This study introduces STREAM, a novel method that leverages a Steiner forest problem model, a hybrid biclustering pipeline, and submodular optimization to infer eGRNs from jointly profiled single-cell transcriptome and chromatin accessibility data. Compared to existing methods, STREAM demonstrates enhanced performance in terms of TF recovery, TF-enhancer linkage prediction, and enhancer-gene relation discovery. Application of STREAM to an Alzheimer's disease dataset and a diffuse small lymphocytic lymphoma dataset reveals its ability to identify TF-enhancer-gene relations associated with pseudotime, as well as key TF-enhancer-gene relations and TF cooperation underlying tumor cells.
Affiliation(s)
- Yang Li
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, United States
- Anjun Ma
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, United States
  - Pelotonia Institute for Immuno-Oncology, The James Comprehensive Cancer Center, The Ohio State University, Columbus, OH 43210, United States
- Yizhong Wang
  - School of Mathematics, Shandong University, Jinan, Shandong 250100, China
- Qi Guo
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, United States
- Cankun Wang
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, United States
- Hongjun Fu
  - Department of Neuroscience, College of Medicine, The Ohio State University, Columbus, OH 43210, United States
- Bingqiang Liu
  - School of Mathematics, Shandong University, Jinan, Shandong 250100, China
- Qin Ma
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, United States
  - Pelotonia Institute for Immuno-Oncology, The James Comprehensive Cancer Center, The Ohio State University, Columbus, OH 43210, United States

6. Sun F, Li H, Sun D, Fu S, Gu L, Shao X, Wang Q, Dong X, Duan B, Xing F, Wu J, Xiao M, Zhao F, Han JDJ, Liu Q, Fan X, Li C, Wang C, Shi T. Single-cell omics: experimental workflow, data analyses and applications. Sci China Life Sci 2024:10.1007/s11427-023-2561-0. [PMID: 39060615 DOI: 10.1007/s11427-023-2561-0] [Received: 12/07/2023] [Accepted: 04/18/2024] [Indexed: 07/28/2024]
Abstract
Cells are the fundamental units of biological systems and exhibit unique development trajectories and molecular features. Our exploration of how the genomes orchestrate the formation and maintenance of each cell, and control the cellular phenotypes of various organisms, is both captivating and intricate. Since the inception of the first single-cell RNA technology, technologies related to single-cell sequencing have experienced rapid advancements in recent years. These technologies have expanded horizontally to include single-cell genome, epigenome, proteome, and metabolome, while vertically, they have progressed to integrate multiple omics data and incorporate additional information such as spatial scRNA-seq and CRISPR screening. Single-cell omics represent a groundbreaking advancement in the biomedical field, offering profound insights into the understanding of complex diseases, including cancers. Here, we comprehensively summarize recent advances in single-cell omics technologies, with a specific focus on the methodology section. This overview aims to guide researchers in selecting appropriate methods for single-cell sequencing and related data analysis.
Affiliation(s)
- Fengying Sun
  - Department of Clinical Laboratory, the Affiliated Wuhu Hospital of East China Normal University (The Second People's Hospital of Wuhu City), Wuhu, 241000, China
- Haoyan Li
  - Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
- Dongqing Sun
  - Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
  - Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Shaliu Fu
  - Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
  - Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
  - Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
  - Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210, China
- Lei Gu
  - Center for Single-cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
- Xin Shao
  - Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
  - National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing, 314103, China
- Qinqin Wang
  - Center for Single-cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
- Xin Dong
  - Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
  - Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Bin Duan
  - Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
  - Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
  - Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
  - Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210, China
- Feiyang Xing
  - Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
  - Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Jun Wu
  - Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China
- Minmin Xiao
  - Department of Clinical Laboratory, the Affiliated Wuhu Hospital of East China Normal University (The Second People's Hospital of Wuhu City), Wuhu, 241000, China
- Fangqing Zhao
  - Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, 100101, China
- Jing-Dong J Han
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Center for Quantitative Biology (CQB), Peking University, Beijing, 100871, China
- Qi Liu
  - Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
  - Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
  - Research Institute of Intelligent Computing, Zhejiang Lab, Hangzhou, 311121, China
  - Shanghai Research Institute for Intelligent Autonomous Systems, Shanghai, 201210, China
- Xiaohui Fan
  - Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
  - National Key Laboratory of Chinese Medicine Modernization, Innovation Center of Yangtze River Delta, Zhejiang University, Jiaxing, 314103, China
  - Zhejiang Key Laboratory of Precision Diagnosis and Therapy for Major Gynecological Diseases, Women's Hospital, Zhejiang University School of Medicine, Hangzhou, 310006, China
- Chen Li
  - Center for Single-cell Omics, School of Public Health, Shanghai Jiao Tong University School of Medicine, Shanghai, 200025, China
- Chenfei Wang
  - Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration (Tongji University), Ministry of Education, Orthopaedic Department, Tongji Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai, 200082, China
  - Frontier Science Center for Stem Cells, School of Life Sciences and Technology, Tongji University, Shanghai, 200092, China
- Tieliu Shi
  - Department of Clinical Laboratory, the Affiliated Wuhu Hospital of East China Normal University (The Second People's Hospital of Wuhu City), Wuhu, 241000, China
  - Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, the Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China
  - Key Laboratory of Advanced Theory and Application in Statistics and Data Science-MOE, School of Statistics, East China Normal University, Shanghai, 200062, China

7. Chang Z, Xu Y, Dong X, Gao Y, Wang C. Single-cell and spatial multiomic inference of gene regulatory networks using SCRIPro. Bioinformatics 2024; 40:btae466. [PMID: 39024032 PMCID: PMC11288411 DOI: 10.1093/bioinformatics/btae466] [Received: 01/31/2024] [Revised: 06/05/2024] [Accepted: 07/17/2024] [Indexed: 07/20/2024] Open
Abstract
MOTIVATION: The burgeoning generation of single-cell or spatial multiomic data allows for the characterization of gene regulation networks (GRNs) at an unprecedented resolution. However, the accurate reconstruction of GRNs from sparse and noisy single-cell or spatial multiomic data remains challenging.
RESULTS: Here, we present SCRIPro, a comprehensive computational framework that robustly infers GRNs for both single-cell and spatial multi-omics data. SCRIPro first improves sample coverage through a density clustering approach based on multiomic and spatial similarities. Additionally, SCRIPro scans transcriptional regulator (TR) importance by performing chromatin reconstruction and in silico deletion analyses using a comprehensive reference covering 1,292 human and 994 mouse TRs. Finally, SCRIPro combines TR-target importance scores derived from multiomic data with TR-target expression levels to ensure precise GRN reconstruction. We benchmarked SCRIPro on various datasets, including single-cell multiomic data from human B-cell lymphoma, mouse hair follicle development, Stereo-seq of mouse embryos, and Spatial-ATAC-RNA from mouse brain. SCRIPro outperforms existing motif-based methods and accurately reconstructs cell type-specific, stage-specific, and region-specific GRNs. Overall, SCRIPro emerges as a streamlined and fast method capable of reconstructing TR activities and GRNs for both single-cell and spatial multi-omic data.
AVAILABILITY: SCRIPro is available at https://github.com/wanglabtongji/SCRIPro.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Zhanhe Chang
  - Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration of Ministry of Education, Department of Orthopedics, Tongji Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
  - Frontier Science Center for Stem Cell Research, Tongji University, Shanghai, China
  - Institute for Regenerative Medicine, Shanghai East Hospital, Shanghai Key Laboratory of Signaling and Disease Research, School of Life Sciences and Technology, Tongji University, Shanghai, China
- Yunfan Xu
  - Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration of Ministry of Education, Department of Orthopedics, Tongji Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
  - Frontier Science Center for Stem Cell Research, Tongji University, Shanghai, China
- Xin Dong
  - Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration of Ministry of Education, Department of Orthopedics, Tongji Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
  - Frontier Science Center for Stem Cell Research, Tongji University, Shanghai, China
- Yawei Gao
  - Frontier Science Center for Stem Cell Research, Tongji University, Shanghai, China
  - Institute for Regenerative Medicine, Shanghai East Hospital, Shanghai Key Laboratory of Signaling and Disease Research, School of Life Sciences and Technology, Tongji University, Shanghai, China
- Chenfei Wang
  - Key Laboratory of Spine and Spinal Cord Injury Repair and Regeneration of Ministry of Education, Department of Orthopedics, Tongji Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
  - Frontier Science Center for Stem Cell Research, Tongji University, Shanghai, China
  - National Key Laboratory of Autonomous Intelligent Unmanned Systems, Tongji University, Shanghai 200120, China
  - Frontier Science Center for Intelligent Autonomous Systems, Tongji University, Shanghai 200120, China
|
8
|
Warns J, Kim YI, O'Rourke R, Sagerström CG. scMultiome analysis identifies a single caudal hindbrain compartment in the developing zebrafish nervous system. Neural Dev 2024; 19:12. [PMID: 38970093 PMCID: PMC11225431 DOI: 10.1186/s13064-024-00189-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2024] [Accepted: 06/27/2024] [Indexed: 07/07/2024] Open
Abstract
BACKGROUND A key step in nervous system development involves the coordinated control of neural progenitor specification and positioning. A long-standing model for the vertebrate CNS postulates that transient anatomical compartments - known as neuromeres - function to position neural progenitors along the embryonic anteroposterior neuraxis. Such neuromeres are apparent in the embryonic hindbrain - that contains six rhombomeres with morphologically apparent boundaries - but other neuromeres lack clear morphological boundaries and have instead been defined by different criteria, such as differences in gene expression patterns and the outcomes of transplantation experiments. Accordingly, the caudal hindbrain (CHB) posterior to rhombomere (r) 6 has been variably proposed to contain from two to five 'pseudo-rhombomeres', but the lack of comprehensive molecular data has precluded a detailed definition of such structures. METHODS We used single-cell Multiome analysis, which allows simultaneous characterization of gene expression and chromatin state of individual cell nuclei, to identify and characterize CHB progenitors in the developing zebrafish CNS. RESULTS We identified CHB progenitors as a transcriptionally distinct population, that also possesses a unique profile of accessible transcription factor binding motifs, relative to both r6 and the spinal cord. This CHB population can be subdivided along its dorsoventral axis based on molecular characteristics, but we do not find any molecular evidence that it contains multiple pseudo-rhombomeres. We further observe that the CHB is closely related to r6 at the earliest embryonic stages, but becomes more divergent over time, and that it is defined by a unique gene regulatory network. CONCLUSIONS We conclude that the early CHB represents a single neuromere compartment that cannot be molecularly subdivided into pseudo-rhombomeres and that it may share an embryonic origin with r6.
Affiliation(s)
- Jessica Warns
- Section of Developmental Biology, Department of Pediatrics, University of Colorado Medical School, 12801 E. 17th Avenue, Aurora, CO, 80045, USA
- Department of Science and Math, Northern State University, 1200 S. Jay St, Aberdeen, SD, 57401, USA
| | - Yong-Il Kim
- Section of Developmental Biology, Department of Pediatrics, University of Colorado Medical School, 12801 E. 17th Avenue, Aurora, CO, 80045, USA
| | - Rebecca O'Rourke
- Section of Developmental Biology, Department of Pediatrics, University of Colorado Medical School, 12801 E. 17th Avenue, Aurora, CO, 80045, USA
| | - Charles G Sagerström
- Section of Developmental Biology, Department of Pediatrics, University of Colorado Medical School, 12801 E. 17th Avenue, Aurora, CO, 80045, USA
| |
|
9
|
Moeckel C, Mouratidis I, Chantzi N, Uzun Y, Georgakopoulos-Soares I. Advances in computational and experimental approaches for deciphering transcriptional regulatory networks: Understanding the roles of cis-regulatory elements is essential, and recent research utilizing MPRAs, STARR-seq, CRISPR-Cas9, and machine learning has yielded valuable insights. Bioessays 2024; 46:e2300210. [PMID: 38715516 PMCID: PMC11444527 DOI: 10.1002/bies.202300210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 04/22/2024] [Accepted: 04/23/2024] [Indexed: 05/16/2024]
Abstract
Understanding the influence of cis-regulatory elements on gene regulation poses numerous challenges given complexities stemming from variations in transcription factor (TF) binding, chromatin accessibility, structural constraints, and cell-type differences. This review discusses the role of gene regulatory networks in enhancing understanding of transcriptional regulation and covers construction methods ranging from expression-based approaches to supervised machine learning. Additionally, key experimental methods, including MPRAs and CRISPR-Cas9-based screening, which have significantly contributed to understanding TF binding preferences and cis-regulatory element functions, are explored. Lastly, the potential of machine learning and artificial intelligence to unravel cis-regulatory logic is analyzed. These computational advances have far-reaching implications for precision medicine, therapeutic target discovery, and the study of genetic variations in health and disease.
Affiliation(s)
- Camille Moeckel
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
| | - Ioannis Mouratidis
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, USA
| | - Nikol Chantzi
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
| | - Yasin Uzun
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, USA
- Department of Pediatrics, The Pennsylvania State University College of Medicine, Hershey, PA, USA
| | - Ilias Georgakopoulos-Soares
- Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, USA
| |
|
10
|
Xie H, Deng YM, Li JY, Xie KH, Tao T, Zhang JF. Predicting the risk of primary Sjögren's syndrome with key N7-methylguanosine-related genes: A novel XGBoost model. Heliyon 2024; 10:e31307. [PMID: 38803884 PMCID: PMC11128997 DOI: 10.1016/j.heliyon.2024.e31307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Revised: 05/10/2024] [Accepted: 05/14/2024] [Indexed: 05/29/2024] Open
Abstract
Objectives N7-methylguanosine (m7G) plays a crucial role in mRNA metabolism and other biological processes. However, its regulators' function in Primary Sjögren's Syndrome (PSS) remains enigmatic. Methods We screened five key m7G-related genes across multiple datasets, leveraging statistical and machine learning computations. Based on these genes, we developed a prediction model employing the extreme gradient boosting decision tree (XGBoost) method to assess PSS risk. Immune infiltration in PSS samples was analyzed using the ssGSEA method, revealing the immune landscape of PSS patients. Results The XGBoost model exhibited high accuracy, AUC, sensitivity, and specificity in the training, test, and extra-test sets. The decision curve confirmed its clinical utility. Our findings suggest that m7G methylation might contribute to PSS pathogenesis through immune modulation. Conclusions m7G regulators play an important role in the development of PSS. Our study of m7G-related genes may inform future immunotherapy strategies for PSS.
Affiliation(s)
- Hui Xie
- Department of Radiotherapy, Affiliated Hospital (Clinical College) of Xiangnan University, Chenzhou, 423000, PR China
- Faculty of Applied Sciences, Macao Polytechnic University, Macao, 999078, PR China
| | - Yin-mei Deng
- Department of Nursing, Affiliated Hospital (Clinical College) of Xiangnan University, Chenzhou, 423000, PR China
| | - Jiao-yan Li
- Department of Rheumatology and Clinical Immunology, The First Hospital of Changsha, 410005, Changsha, PR China
| | - Kai-hong Xie
- Department of Oncology, Affiliated Hospital (Clinical College) of Xiangnan University, Chenzhou, 423000, PR China
| | - Tan Tao
- Faculty of Applied Sciences, Macao Polytechnic University, Macao, 999078, PR China
| | - Jian-fang Zhang
- Department of Physical Examination, Center for Disease Control and Prevention of Beihu District, Chenzhou, 423000, PR China
| |
|
11
|
Ledru N, Wilson PC, Muto Y, Yoshimura Y, Wu H, Li D, Asthana A, Tullius SG, Waikar SS, Orlando G, Humphreys BD. Predicting proximal tubule failed repair drivers through regularized regression analysis of single cell multiomic sequencing. Nat Commun 2024; 15:1291. [PMID: 38347009 PMCID: PMC10861555 DOI: 10.1038/s41467-024-45706-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/20/2023] [Accepted: 01/31/2024] [Indexed: 02/15/2024] Open
Abstract
Renal proximal tubule epithelial cells have considerable intrinsic repair capacity following injury. However, a fraction of injured proximal tubule cells fails to undergo normal repair and assumes a proinflammatory and profibrotic phenotype that may promote fibrosis and chronic kidney disease. The healthy to failed repair change is marked by cell state-specific transcriptomic and epigenomic changes. Single nucleus joint RNA- and ATAC-seq sequencing offers an opportunity to study the gene regulatory networks underpinning these changes in order to identify key regulatory drivers. We develop a regularized regression approach to construct genome-wide parametric gene regulatory networks using multiomic datasets. We generate a single nucleus multiomic dataset from seven adult human kidney samples and apply our method to study drivers of a failed injury response associated with kidney disease. We demonstrate that our approach is a highly effective tool for predicting key cis- and trans-regulatory elements underpinning the healthy to failed repair transition and use it to identify NFAT5 as a driver of the maladaptive proximal tubule state.
Affiliation(s)
- Nicolas Ledru
- Division of Nephrology, Department of Medicine, Washington University in St. Louis School of Medicine, St. Louis, MO, USA
| | - Parker C Wilson
- Division of Anatomic and Molecular Pathology, Department of Pathology and Immunology, Washington University in St. Louis, St. Louis, MO, USA
| | - Yoshiharu Muto
- Division of Nephrology, Department of Medicine, Washington University in St. Louis School of Medicine, St. Louis, MO, USA
| | - Yasuhiro Yoshimura
- Division of Nephrology, Department of Medicine, Washington University in St. Louis School of Medicine, St. Louis, MO, USA
| | - Haojia Wu
- Division of Nephrology, Department of Medicine, Washington University in St. Louis School of Medicine, St. Louis, MO, USA
| | - Dian Li
- Division of Nephrology, Department of Medicine, Washington University in St. Louis School of Medicine, St. Louis, MO, USA
| | - Amish Asthana
- Department of Surgery, Wake Forest Baptist Medical Center; Wake Forest Institute for Regenerative Medicine, Wake Forest School of Medicine, Winston Salem, NC, USA
| | - Stefan G Tullius
- Division of Transplant Surgery and Transplant Surgery Research Laboratory, Department of Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
| | - Sushrut S Waikar
- Section of Nephrology, Department of Medicine, Boston University Chobanian and Avedisian School of Medicine, Boston Medical Center, Boston, MA, USA
| | - Giuseppe Orlando
- Department of Surgery, Wake Forest Baptist Medical Center; Wake Forest Institute for Regenerative Medicine, Wake Forest School of Medicine, Winston Salem, NC, USA
| | - Benjamin D Humphreys
- Division of Nephrology, Department of Medicine, Washington University in St. Louis School of Medicine, St. Louis, MO, USA
- Department of Developmental Biology, Washington University in St. Louis School of Medicine, St. Louis, MO, USA
| |
|
12
|
Lin Y, Wu TY, Chen X, Wan S, Chao B, Xin J, Yang JYH, Wong WH, Wang YXR. Data integration and inference of gene regulation using single-cell temporal multimodal data with scTIE. Genome Res 2024; 34:119-133. [PMID: 38190633 PMCID: PMC10903952 DOI: 10.1101/gr.277960.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2023] [Accepted: 12/13/2023] [Indexed: 01/10/2024]
Abstract
Single-cell technologies offer unprecedented opportunities to dissect gene regulatory mechanisms in context-specific ways. Although there are computational methods for extracting gene regulatory relationships from scRNA-seq and scATAC-seq data, the data integration problem, essential for accurate cell type identification, has been mostly treated as a standalone challenge. Here we present scTIE, a unified method that integrates temporal multimodal data and infers regulatory relationships predictive of cellular state changes. scTIE uses an autoencoder to embed cells from all time points into a common space by using iterative optimal transport, followed by extracting interpretable information to predict cell trajectories. Using a variety of synthetic and real temporal multimodal data sets, we show scTIE achieves effective data integration while preserving more biological signals than existing methods, particularly in the presence of batch effects and noise. Furthermore, on the exemplar multiome data set we generated from differentiating mouse embryonic stem cells over time, we show scTIE captures regulatory elements highly predictive of cell transition probabilities, providing new potentials to understand the regulatory landscape driving developmental processes.
Affiliation(s)
- Yingxin Lin
- School of Mathematics and Statistics, The University of Sydney, NSW 2006, Australia
- Charles Perkins Centre, The University of Sydney, NSW 2006, Australia
- Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR 999077, China
| | - Tung-Yu Wu
- Department of Statistics, Stanford University, Stanford, California 94305-4020, USA
| | - Xi Chen
- Department of Statistics, Stanford University, Stanford, California 94305-4020, USA
| | - Sheng Wan
- Institute of Electronics, National Yang Ming Chiao Tung University, Hsinchu 30010, Taiwan
| | - Brian Chao
- Department of Electrical Engineering, Stanford University, Stanford, California 94305-9505, USA
| | - Jingxue Xin
- Department of Statistics, Stanford University, Stanford, California 94305-4020, USA
| | - Jean Y H Yang
- School of Mathematics and Statistics, The University of Sydney, NSW 2006, Australia
- Charles Perkins Centre, The University of Sydney, NSW 2006, Australia
- Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR 999077, China
| | - Wing H Wong
- Department of Statistics, Stanford University, Stanford, California 94305-4020, USA
- Department of Biomedical Data Science, Stanford University, Stanford, California 94305-5464, USA
- Bio-X Program, Stanford University, Stanford, California 94305, USA
| | - Y X Rachel Wang
- School of Mathematics and Statistics, The University of Sydney, NSW 2006, Australia
| |
|
13
|
Huang X, Song C, Zhang G, Li Y, Zhao Y, Zhang Q, Zhang Y, Fan S, Zhao J, Xie L, Li C. scGRN: a comprehensive single-cell gene regulatory network platform of human and mouse. Nucleic Acids Res 2024; 52:D293-D303. [PMID: 37889053 PMCID: PMC10767939 DOI: 10.1093/nar/gkad885] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 09/19/2023] [Accepted: 10/12/2023] [Indexed: 10/28/2023] Open
Abstract
Gene regulatory networks (GRNs) are interpretable graph models encompassing the regulatory interactions between transcription factors (TFs) and their downstream target genes. Making sense of the topology and dynamics of GRNs is fundamental to interpreting the mechanisms of disease etiology and translating corresponding findings into novel therapies. Recent advances in single-cell multi-omics techniques have prompted the computational inference of GRNs from single-cell transcriptomic and epigenomic data at an unprecedented resolution. Here, we present scGRN (https://bio.liclab.net/scGRN/), a comprehensive single-cell multi-omics gene regulatory network platform of human and mouse. The current version of scGRN catalogs 237 051 cell type-specific GRNs (62 999 692 TF-target gene pairs), covering 160 tissues/cell lines and 1324 single-cell samples. scGRN is the first resource documenting large-scale cell type-specific GRN information of diverse human and mouse conditions inferred from single-cell multi-omics data. We have implemented multiple online tools for effective GRN analysis, including differential TF-target network analysis, TF enrichment analysis, and pathway downstream analysis. We also provided details about TF binding to promoters, super-enhancers and typical enhancers of target genes in GRNs. Taken together, scGRN is an integrative and useful platform for searching, browsing, analyzing, visualizing and downloading GRNs of interest, enabling insight into the differences in regulatory mechanisms across diverse conditions.
Affiliation(s)
- Xuemei Huang
- The First Affiliated Hospital & MOE Key Lab of Rare Pediatric Diseases, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- Hunan Provincial Key Laboratory of Multi-omics and Artificial Intelligence of Cardiovascular Diseases & College of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- School of Computer, University of South China, Hengyang, Hunan, 421001, China
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
| | - Chao Song
- The First Affiliated Hospital & MOE Key Lab of Rare Pediatric Diseases, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- Hunan Provincial Key Laboratory of Multi-omics and Artificial Intelligence of Cardiovascular Diseases & College of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- The First Affiliated Hospital, Department of Cardiology, Hengyang Medical School, University of South China, Hengyang, China
| | - Guorui Zhang
- The First Affiliated Hospital & MOE Key Lab of Rare Pediatric Diseases, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- Hunan Provincial Key Laboratory of Multi-omics and Artificial Intelligence of Cardiovascular Diseases & College of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
| | - Ye Li
- The First Affiliated Hospital & MOE Key Lab of Rare Pediatric Diseases, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- Hunan Provincial Key Laboratory of Multi-omics and Artificial Intelligence of Cardiovascular Diseases & College of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
| | - Yu Zhao
- The First Affiliated Hospital & MOE Key Lab of Rare Pediatric Diseases, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- Hunan Provincial Key Laboratory of Multi-omics and Artificial Intelligence of Cardiovascular Diseases & College of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- School of Computer, University of South China, Hengyang, Hunan, 421001, China
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
| | - Qinyi Zhang
- Hunan Provincial Key Laboratory of Multi-omics and Artificial Intelligence of Cardiovascular Diseases & College of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
| | - Yuexin Zhang
- The First Affiliated Hospital & MOE Key Lab of Rare Pediatric Diseases, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- Hunan Provincial Key Laboratory of Multi-omics and Artificial Intelligence of Cardiovascular Diseases & College of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- The First Affiliated Hospital, Institute of Cardiovascular Disease, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
| | - Shifan Fan
- The First Affiliated Hospital & MOE Key Lab of Rare Pediatric Diseases, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- Hunan Provincial Key Laboratory of Multi-omics and Artificial Intelligence of Cardiovascular Diseases & College of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- School of Computer, University of South China, Hengyang, Hunan, 421001, China
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
| | - Jun Zhao
- The First Affiliated Hospital & MOE Key Lab of Rare Pediatric Diseases, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
| | - Liyuan Xie
- The First Affiliated Hospital & MOE Key Lab of Rare Pediatric Diseases, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- Hunan Provincial Key Laboratory of Multi-omics and Artificial Intelligence of Cardiovascular Diseases & College of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- School of Computer, University of South China, Hengyang, Hunan, 421001, China
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
| | - Chunquan Li
- The First Affiliated Hospital & MOE Key Lab of Rare Pediatric Diseases, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- Hunan Provincial Key Laboratory of Multi-omics and Artificial Intelligence of Cardiovascular Diseases & College of Basic Medical Sciences, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- School of Computer, University of South China, Hengyang, Hunan, 421001, China
- Hunan Provincial Maternal and Child Health Care Hospital, National Health Commission Key Laboratory of Birth Defect Research and Prevention, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
- The First Affiliated Hospital, Cardiovascular Lab of Big Data and Imaging Artificial Intelligence, Hengyang Medical School, University of South China, Hengyang, Hunan, 421001, China
| |
|
14
|
Kim YI, O'Rourke R, Sagerström CG. scMultiome analysis identifies embryonic hindbrain progenitors with mixed rhombomere identities. eLife 2023; 12:e87772. [PMID: 37947350 PMCID: PMC10662952 DOI: 10.7554/elife.87772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2023] [Accepted: 11/09/2023] [Indexed: 11/12/2023] Open
Abstract
Rhombomeres serve to position neural progenitors in the embryonic hindbrain, thereby ensuring appropriate neural circuit formation, but the molecular identities of individual rhombomeres and the mechanism whereby they form have not been fully established. Here, we apply scMultiome analysis in zebrafish to molecularly resolve all rhombomeres for the first time. We find that rhombomeres become molecularly distinct between 10hpf (end of gastrulation) and 13hpf (early segmentation). While the embryonic hindbrain transiently contains alternating odd- versus even-type rhombomeres, our scMultiome analyses do not detect extensive odd versus even molecular characteristics in the early hindbrain. Instead, we find that each rhombomere displays a unique gene expression and chromatin profile. Prior to the appearance of distinct rhombomeres, we detect three hindbrain progenitor clusters (PHPDs) that correlate with the earliest visually observed segments in the hindbrain primordium that represent prospective rhombomere r2/r3 (possibly including r1), r4, and r5/r6, respectively. We further find that the PHPDs form in response to Fgf and RA morphogens and that individual PHPD cells co-express markers of multiple mature rhombomeres. We propose that the PHPDs contain mixed-identity progenitors and that their subdivision into individual rhombomeres requires the resolution of mixed transcription and chromatin states.
Affiliation(s)
- Yong-Il Kim
- Section of Developmental Biology, Department of Pediatrics, University of Colorado Medical School, Aurora, United States
| | - Rebecca O'Rourke
- Section of Developmental Biology, Department of Pediatrics, University of Colorado Medical School, Aurora, United States
| | - Charles G Sagerström
- Section of Developmental Biology, Department of Pediatrics, University of Colorado Medical School, Aurora, United States
| |
|
15
|
Badia-I-Mompel P, Wessels L, Müller-Dott S, Trimbour R, Ramirez Flores RO, Argelaguet R, Saez-Rodriguez J. Gene regulatory network inference in the era of single-cell multi-omics. Nat Rev Genet 2023; 24:739-754. [PMID: 37365273 DOI: 10.1038/s41576-023-00618-5] [Citation(s) in RCA: 70] [Impact Index Per Article: 70.0] [Accepted: 05/12/2023] [Indexed: 06/28/2023]
Abstract
The interplay between chromatin, transcription factors and genes generates complex regulatory circuits that can be represented as gene regulatory networks (GRNs). The study of GRNs is useful to understand how cellular identity is established, maintained and disrupted in disease. GRNs can be inferred from experimental data - historically, bulk omics data - and/or from the literature. The advent of single-cell multi-omics technologies has led to the development of novel computational methods that leverage genomic, transcriptomic and chromatin accessibility information to infer GRNs at an unprecedented resolution. Here, we review the key principles of inferring GRNs that encompass transcription factor-gene interactions from transcriptomics and chromatin accessibility data. We focus on the comparison and classification of methods that use single-cell multimodal data. We highlight challenges in GRN inference, in particular with respect to benchmarking, and potential further developments using additional data modalities.
Affiliation(s)
- Pau Badia-I-Mompel
- Heidelberg University, Faculty of Medicine, Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
| | - Lorna Wessels
- Heidelberg University, Faculty of Medicine, Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
- Department of Vascular Biology and Tumor Angiogenesis, European Center for Angioscience, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany
| | - Sophia Müller-Dott
- Heidelberg University, Faculty of Medicine, Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
| | - Rémi Trimbour
- Heidelberg University, Faculty of Medicine, Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
- Institut Pasteur, Université Paris Cité, CNRS UMR 3738, Machine Learning for Integrative Genomics Group, Paris, France
| | - Ricardo O Ramirez Flores
- Heidelberg University, Faculty of Medicine, Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
| | | | - Julio Saez-Rodriguez
- Heidelberg University, Faculty of Medicine, Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
| |
|
16
|
Kim D, Tran A, Kim HJ, Lin Y, Yang JYH, Yang P. Gene regulatory network reconstruction: harnessing the power of single-cell multi-omic data. NPJ Syst Biol Appl 2023; 9:51. [PMID: 37857632 PMCID: PMC10587078 DOI: 10.1038/s41540-023-00312-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Accepted: 10/02/2023] [Indexed: 10/21/2023] Open
Abstract
Inferring gene regulatory networks (GRNs) is a fundamental challenge in biology that aims to unravel the complex relationships between genes and their regulators. Deciphering these networks plays a critical role in understanding the underlying regulatory crosstalk that drives many cellular processes and diseases. Recent advances in sequencing technology have led to the development of state-of-the-art GRN inference methods that exploit matched single-cell multi-omic data. By employing diverse mathematical and statistical methodologies, these methods aim to reconstruct more comprehensive and precise gene regulatory networks. In this review, we give a brief overview on the statistical and methodological foundations commonly used in GRN inference methods. We then compare and contrast the latest state-of-the-art GRN inference methods for single-cell matched multi-omics data, and discuss their assumptions, limitations and opportunities. Finally, we discuss the challenges and future directions that hold promise for further advancements in this rapidly developing field.
Affiliation(s)
- Daniel Kim
- School of Mathematics and Statistics, University of Sydney, Camperdown, NSW, Australia
- Computational Systems Biology Unit, Children's Medical Research Institute, University of Sydney, Camperdown, NSW, Australia
- Sydney Precision Data Science Centre, University of Sydney, Camperdown, NSW, Australia
| | - Andy Tran
- School of Mathematics and Statistics, University of Sydney, Camperdown, NSW, Australia
- Sydney Precision Data Science Centre, University of Sydney, Camperdown, NSW, Australia
- Charles Perkins Centre, University of Sydney, Camperdown, NSW, Australia
| | - Hani Jieun Kim
- Computational Systems Biology Unit, Children's Medical Research Institute, University of Sydney, Camperdown, NSW, Australia
- Sydney Precision Data Science Centre, University of Sydney, Camperdown, NSW, Australia
| | - Yingxin Lin
- School of Mathematics and Statistics, University of Sydney, Camperdown, NSW, Australia
- Sydney Precision Data Science Centre, University of Sydney, Camperdown, NSW, Australia
- Charles Perkins Centre, University of Sydney, Camperdown, NSW, Australia
| | - Jean Yee Hwa Yang
- School of Mathematics and Statistics, University of Sydney, Camperdown, NSW, Australia.
- Sydney Precision Data Science Centre, University of Sydney, Camperdown, NSW, Australia.
- Charles Perkins Centre, University of Sydney, Camperdown, NSW, Australia.
| | - Pengyi Yang
- School of Mathematics and Statistics, University of Sydney, Camperdown, NSW, Australia.
- Computational Systems Biology Unit, Children's Medical Research Institute, University of Sydney, Camperdown, NSW, Australia.
- Sydney Precision Data Science Centre, University of Sydney, Camperdown, NSW, Australia.
- Charles Perkins Centre, University of Sydney, Camperdown, NSW, Australia.
| |
|
17
|
Guo M, Wikenheiser-Brokamp KA, Kitzmiller JA, Jiang C, Wang G, Wang A, Preissl S, Hou X, Buchanan J, Karolak JA, Miao Y, Frank DB, Zacharias WJ, Sun X, Xu Y, Gu M, Stankiewicz P, Kalinichenko VV, Wambach JA, Whitsett JA. Single Cell Multiomics Identifies Cells and Genetic Networks Underlying Alveolar Capillary Dysplasia. Am J Respir Crit Care Med 2023; 208:709-725. [PMID: 37463497 PMCID: PMC10515568 DOI: 10.1164/rccm.202210-2015oc] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 10/31/2022] [Accepted: 07/18/2023] [Indexed: 07/20/2023] Open
Abstract
Rationale: Alveolar capillary dysplasia with misalignment of pulmonary veins (ACDMPV) is a lethal developmental disorder of lung morphogenesis caused by insufficiency of FOXF1 (forkhead box F1) transcription factor function. The cellular and transcriptional mechanisms by which FOXF1 deficiency disrupts human lung formation are unknown. Objectives: To identify cell types, gene networks, and cell-cell interactions underlying the pathogenesis of ACDMPV. Methods: We used single-nucleus RNA and assay for transposase-accessible chromatin sequencing, immunofluorescence confocal microscopy, and RNA in situ hybridization to identify cell types and molecular networks influenced by FOXF1 in ACDMPV lungs. Measurements and Main Results: Pathogenic single-nucleotide variants and copy-number variant deletions involving the FOXF1 gene locus in all subjects with ACDMPV (n = 6) were accompanied by marked changes in lung structure, including deficient alveolar development and a paucity of pulmonary microvasculature. Single-nucleus RNA and assay for transposase-accessible chromatin sequencing identified alterations in cell number and gene expression in endothelial cells (ECs), pericytes, fibroblasts, and epithelial cells in ACDMPV lungs. Distinct cell-autonomous roles for FOXF1 in capillary ECs and pericytes were identified. Pathogenic variants involving the FOXF1 gene locus disrupt gene expression in EC progenitors, inhibiting the differentiation or survival of capillary 2 ECs and cell-cell interactions necessary for both pulmonary vasculogenesis and alveolar type 1 cell differentiation. Loss of the pulmonary microvasculature was associated with increased VEGFA (vascular endothelial growth factor A) signaling and marked expansion of systemic bronchial ECs expressing COL15A1 (collagen type XV α 1 chain). 
Conclusions: Distinct FOXF1 gene regulatory networks were identified in subsets of pulmonary endothelial and fibroblast progenitors, providing both cellular and molecular targets for the development of therapies for ACDMPV and other diffuse lung diseases of infancy.
Affiliation(s)
- Minzhe Guo
- The Perinatal Institute and Section of Neonatology, Perinatal and Pulmonary Biology
- Department of Pediatrics and
| | - Kathryn A. Wikenheiser-Brokamp
- The Perinatal Institute and Section of Neonatology, Perinatal and Pulmonary Biology
- Division of Pathology and Laboratory Medicine
- Department of Pathology & Laboratory Medicine, College of Medicine, University of Cincinnati, Cincinnati, Ohio
| | - Joseph A. Kitzmiller
- The Perinatal Institute and Section of Neonatology, Perinatal and Pulmonary Biology
| | - Cheng Jiang
- The Perinatal Institute and Section of Neonatology, Perinatal and Pulmonary Biology
| | - Guolun Wang
- The Perinatal Institute and Section of Neonatology, Perinatal and Pulmonary Biology
- Center for Lung Regenerative Medicine
| | - Allen Wang
- Center for Epigenomics & Department of Cellular & Molecular Medicine
| | - Sebastian Preissl
- Center for Epigenomics & Department of Cellular & Molecular Medicine
- Institute of Experimental and Clinical Pharmacology and Toxicology, Faculty of Medicine, University of Freiburg, Freiburg, Germany
| | - Xiaomeng Hou
- Center for Epigenomics & Department of Cellular & Molecular Medicine
| | - Justin Buchanan
- Center for Epigenomics & Department of Cellular & Molecular Medicine
| | - Justyna A. Karolak
- Department of Genetics and Pharmaceutical Microbiology, Poznan University of Medical Sciences, Poznan, Poland
| | - Yifei Miao
- The Perinatal Institute and Section of Neonatology, Perinatal and Pulmonary Biology
- Division of Developmental Biology, and
- Center for Stem Cell and Organoid Medicine, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio
- Department of Pediatrics and
| | - David B. Frank
- Penn-CHOP Lung Biology Institute and
- Penn Cardiovascular Institute, University of Pennsylvania, Philadelphia, Pennsylvania
- Division of Cardiology, Department of Pediatrics, Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania
| | - William J. Zacharias
- The Perinatal Institute and Section of Neonatology, Perinatal and Pulmonary Biology
- Department of Pediatrics and
| | - Xin Sun
- Department of Pediatrics, and
- Department of Biological Sciences, University of California, San Diego, La Jolla, California
| | - Yan Xu
- The Perinatal Institute and Section of Neonatology, Perinatal and Pulmonary Biology
- Division of Biomedical Informatics
- Department of Pediatrics and
| | - Mingxia Gu
- The Perinatal Institute and Section of Neonatology, Perinatal and Pulmonary Biology
- Division of Developmental Biology, and
- Center for Stem Cell and Organoid Medicine, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio
- Department of Pediatrics and
| | - Pawel Stankiewicz
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas; and
| | - Vladimir V. Kalinichenko
- The Perinatal Institute and Section of Neonatology, Perinatal and Pulmonary Biology
- Center for Lung Regenerative Medicine
- Department of Pediatrics and
| | - Jennifer A. Wambach
- Edward Mallinckrodt Department of Pediatrics, Washington University School of Medicine and St. Louis Children’s Hospital, St. Louis, Missouri
| | - Jeffrey A. Whitsett
- The Perinatal Institute and Section of Neonatology, Perinatal and Pulmonary Biology
- Department of Pediatrics and
| |
|
18
|
Duan Z, Dai Y, Hwang A, Lee C, Xie K, Xiao C, Xu M, Girgenti MJ, Zhang J. iHerd: an integrative hierarchical graph representation learning framework to quantify network changes and prioritize risk genes in disease. PLoS Comput Biol 2023; 19:e1011444. [PMID: 37695793 PMCID: PMC10513318 DOI: 10.1371/journal.pcbi.1011444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2023] [Revised: 09/21/2023] [Accepted: 08/19/2023] [Indexed: 09/13/2023] Open
Abstract
Different genes form complex networks within cells to carry out critical cellular functions, while network alterations in this process can potentially introduce downstream transcriptome perturbations and phenotypic variations. Therefore, developing efficient and interpretable methods to quantify network changes and pinpoint driver genes across conditions is crucial. We propose a hierarchical graph representation learning method, called iHerd. Given a set of networks, iHerd first hierarchically generates a series of coarsened sub-graphs in a data-driven manner, representing network modules at different resolutions (e.g., the level of signaling pathways). Then, it sequentially learns low-dimensional node representations at all hierarchical levels via efficient graph embedding. Lastly, iHerd projects separate gene embeddings onto the same latent space in its graph alignment module to calculate a rewiring index for driver gene prioritization. To demonstrate its effectiveness, we applied iHerd on a tumor-to-normal GRN rewiring analysis and cell-type-specific GCN analysis using single-cell multiome data of the brain. We showed that iHerd can effectively pinpoint novel and well-known risk genes in different diseases. Distinct from existing models, iHerd's graph coarsening for hierarchical learning allows us to successfully classify network driver genes into early and late divergent genes (EDGs and LDGs), emphasizing genes with extensive network changes across and within signaling pathway levels. This unique approach for driver gene classification can provide us with deeper molecular insights. The code is freely available at https://github.com/aicb-ZhangLabs/iHerd. All other relevant data are within the manuscript and supporting information files.
Affiliation(s)
- Ziheng Duan
- Department of Computer Science, University of California, Irvine, California, United States of America
| | - Yi Dai
- Department of Computer Science, University of California, Irvine, California, United States of America
| | - Ahyeon Hwang
- Department of Computer Science, University of California, Irvine, California, United States of America
| | - Cheyu Lee
- Department of Computer Science, University of California, Irvine, California, United States of America
| | - Kaichi Xie
- Department of Computer Science, University of California, Davis, California, United States of America
| | - Chutong Xiao
- Department of Computer Science, University of California, Irvine, California, United States of America
| | - Min Xu
- Department of Computational Biology, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
| | - Matthew J. Girgenti
- Department of Psychiatry, School of Medicine, Yale University, New Haven, Connecticut, United States of America
- Clinical Neurosciences Division, National Center for PTSD, U.S. Department of Veterans Affairs, West Haven, Connecticut, United States of America
| | - Jing Zhang
- Department of Computer Science, University of California, Irvine, California, United States of America
| |
|
19
|
Gaulton KJ, Preissl S, Ren B. Interpreting non-coding disease-associated human variants using single-cell epigenomics. Nat Rev Genet 2023; 24:516-534. [PMID: 37161089 PMCID: PMC10629587 DOI: 10.1038/s41576-023-00598-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Accepted: 03/27/2023] [Indexed: 05/11/2023]
Abstract
Genome-wide association studies (GWAS) have linked hundreds of thousands of sequence variants in the human genome to common traits and diseases. However, translating this knowledge into a mechanistic understanding of disease-relevant biology remains challenging, largely because such variants are predominantly in non-protein-coding sequences that still lack functional annotation at cell-type resolution. Recent advances in single-cell epigenomics assays have enabled the generation of cell type-, subtype- and state-resolved maps of the epigenome in heterogeneous human tissues. These maps have facilitated cell type-specific annotation of candidate cis-regulatory elements and their gene targets in the human genome, enhancing our ability to interpret the genetic basis of common traits and diseases.
Affiliation(s)
- Kyle J Gaulton
- Department of Paediatrics, Paediatric Diabetes Research Center, University of California San Diego School of Medicine, La Jolla, CA, USA.
| | - Sebastian Preissl
- Center for Epigenomics, University of California San Diego School of Medicine, La Jolla, CA, USA.
- Institute of Experimental and Clinical Pharmacology and Toxicology, Faculty of Medicine, University of Freiburg, Freiburg, Germany.
| | - Bing Ren
- Center for Epigenomics, University of California San Diego School of Medicine, La Jolla, CA, USA.
- Department of Cellular and Molecular Medicine, University of California San Diego School of Medicine, La Jolla, CA, USA.
- Ludwig Institute for Cancer Research, La Jolla, CA, USA.
| |
|
20
|
Lin Y, Wu TY, Chen X, Wan S, Chao B, Xin J, Yang JY, Wong WH, Wang YXR. scTIE: data integration and inference of gene regulation using single-cell temporal multimodal data. bioRxiv 2023:2023.05.18.541381. [PMID: 37292801 PMCID: PMC10245711 DOI: 10.1101/2023.05.18.541381] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/10/2023]
Abstract
Single-cell technologies offer unprecedented opportunities to dissect gene regulatory mechanisms in context-specific ways. Although there are computational methods for extracting gene regulatory relationships from scRNA-seq and scATAC-seq data, the data integration problem, essential for accurate cell type identification, has been mostly treated as a standalone challenge. Here we present scTIE, a unified method that integrates temporal multimodal data and infers regulatory relationships predictive of cellular state changes. scTIE uses an autoencoder to embed cells from all time points into a common space using iterative optimal transport, followed by extracting interpretable information to predict cell trajectories. Using a variety of synthetic and real temporal multimodal datasets, we demonstrate scTIE achieves effective data integration while preserving more biological signals than existing methods, particularly in the presence of batch effects and noise. Furthermore, on the exemplar multiome dataset we generated from differentiating mouse embryonic stem cells over time, we demonstrate scTIE captures regulatory elements highly predictive of cell transition probabilities, providing new potentials to understand the regulatory landscape driving developmental processes.
Affiliation(s)
- Yingxin Lin
- School of Mathematics and Statistics, The University of Sydney, NSW, Australia
- Charles Perkins Centre, The University of Sydney, NSW, Australia
- Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR, China
| | - Tung-Yu Wu
- Department of Statistics, Stanford University, CA, USA
| | - Xi Chen
- Department of Statistics, Stanford University, CA, USA
| | - Sheng Wan
- Institute of Electronics, National Yang Ming Chiao Tung University, Hsinchu, Taiwan
| | - Brian Chao
- Department of Electrical Engineering, Stanford University, CA, USA
| | - Jingxue Xin
- Department of Statistics, Stanford University, CA, USA
| | - Jean Y.H. Yang
- School of Mathematics and Statistics, The University of Sydney, NSW, Australia
- Charles Perkins Centre, The University of Sydney, NSW, Australia
- Laboratory of Data Discovery for Health Limited (D24H), Science Park, Hong Kong SAR, China
| | - Wing H. Wong
- Department of Statistics, Stanford University, CA, USA
- Department of Biomedical Data Science, Stanford University, CA, USA
- Bio-X Program, Stanford University, CA, USA
| | - Y. X. Rachel Wang
- School of Mathematics and Statistics, The University of Sydney, NSW, Australia
| |
|
21
|
Genome and Transcriptome-Wide Analysis of OsWRKY and OsNAC Gene Families in Oryza sativa and Their Response to White-Backed Planthopper Infestation. Int J Mol Sci 2022; 23:ijms232315396. [PMID: 36499722 PMCID: PMC9739594 DOI: 10.3390/ijms232315396] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/02/2022] [Revised: 11/27/2022] [Accepted: 12/02/2022] [Indexed: 12/12/2022] Open
Abstract
Plants are threatened by a wide variety of herbivorous insect assaults, and display a variety of inherent and induced defenses that shield them against herbivore attacks. Looking at the massive damage caused by the white-backed planthopper (WBPH), Sogatella furcifera, we undertook a study to identify and functionally annotate OsWRKY and OsNAC transcription factors (TFs) in rice, especially their involvement in WBPH stress. OsWRKY and OsNAC TFs are involved in various developmental processes and responses to biotic and abiotic stresses. However, no comprehensive reports are available on the specific physiological functions of most of the OsWRKY and OsNAC genes in rice during WBPH infestation. The current study aimed to comprehensively explore the OsWRKY and OsNAC genes by analyzing their phylogenetic relationships, subcellular localizations, exon-intron arrangements, conserved motif identities, chromosomal allocations, interaction networks and differential gene expressions during stress conditions. Comparative phylogenetic trees of 101 OsWRKY with 72 AtWRKY genes, and 121 OsNAC with 110 AtNAC genes were constructed to study relationships among these TFs across species. Phylogenetic relationships classified OsWRKY and OsNAC into eight and nine clades, respectively. Most TFs in the same clade had similar genomic features that represented similar functions, and had a high degree of co-expression. Some OsWRKYs (Os09g0417800 (OsWRKY62), Os11g0117600 (OsWRKY50) and Os11g0117400 (OsWRKY104)) and OsNACs (Os05g0442700, Os12g0630800, Os01g0862800 and Os12g0156100) showed significantly higher expressions under WBPH infestation, based on transcriptome datasets. This study provides valuable information and clues about predicting the potential roles of OsWRKYs and OsNACs in rice, by combining their genome-wide characterization, expression profiling, protein-protein interactions and gene expressions under WBPH stress. These findings may require additional investigation to understand their metabolic and expression processes, and to develop rice cultivars that are resistant to WBPH.
|
22
|
Zhang Q, Jin S, Zou X. scAB detects multiresolution cell states with clinical significance by integrating single-cell genomics and bulk sequencing data. Nucleic Acids Res 2022; 50:12112-12130. [PMID: 36440766 PMCID: PMC9757078 DOI: 10.1093/nar/gkac1109] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 09/26/2022] [Revised: 10/31/2022] [Accepted: 11/05/2022] [Indexed: 11/29/2022] Open
Abstract
Although single-cell sequencing has provided a powerful tool to deconvolute cellular heterogeneity of diseases like cancer, extrapolating clinical significance or identifying clinically-relevant cells remains challenging. Here, we propose a novel computational method scAB, which integrates single-cell genomics data with clinically annotated bulk sequencing data via a knowledge- and graph-guided matrix factorization model. Once combined, scAB provides a coarse- and fine-grain multiresolution perspective of phenotype-associated cell states and prognostic signatures previously not visible by single-cell genomics. We use scAB to enhance liver cancer single-cell RNA-seq data, identifying clinically-relevant previously unrecognized cancer and stromal cell subsets whose signatures show a stronger poor-survival association. The identified fine-grain cell subsets are associated with distinct cancer hallmarks and prognosis power. Furthermore, scAB demonstrates its utility as a biomarker identification tool, with the ability to predict immunotherapy, drug responses and survival when applied to melanoma single-cell RNA-seq datasets and glioma single-cell ATAC-seq datasets. Across multiple single-cell and bulk datasets from different cancer types, we also demonstrate the superior performance of scAB in generating prognosis signatures and survival predictions over existing models. Overall, scAB provides an efficient tool for prioritizing clinically-relevant cell subsets and predictive signatures, utilizing large publicly available databases to improve prognosis and treatments.
Affiliation(s)
- Qinran Zhang
- School of Mathematics and Statistics, Wuhan University, Wuhan 430072, China
- Hubei Key Laboratory of Computational Science, Wuhan University, Wuhan 430072, China
| | - Suoqin Jin
- To whom correspondence should be addressed. Tel: +86 027 68752957; Fax: +86 027 68752256;
| | - Xiufen Zou
- Correspondence may also be addressed to Xiufen Zou. Tel: +86 027 68752957; Fax: +86 027 68752256;
| |
|
23
|
Jiang J, Lyu P, Li J, Huang S, Tao J, Blackshaw S, Qian J, Wang J. IReNA: Integrated regulatory network analysis of single-cell transcriptomes and chromatin accessibility profiles. iScience 2022; 25:105359. [PMID: 36325073 PMCID: PMC9619378 DOI: 10.1016/j.isci.2022.105359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2022] [Revised: 09/19/2022] [Accepted: 10/12/2022] [Indexed: 11/16/2022] Open
Abstract
Recently, single-cell RNA sequencing (scRNA-seq) and single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) have been developed to separately measure transcriptomes and chromatin accessibility profiles at the single-cell resolution. However, few methods can reliably integrate these data to perform regulatory network analysis. Here, we developed integrated regulatory network analysis (IReNA) for network inference through the integrated analysis of scRNA-seq and scATAC-seq data, network modularization, transcription factor enrichment, and construction of simplified intermodular regulatory networks. Using public datasets, we showed that integrated network analysis of scRNA-seq data with scATAC-seq data is more precise to identify known regulators than scRNA-seq data analysis alone. Moreover, IReNA outperformed currently available methods in identifying known regulators. IReNA facilitates the systems-level understanding of biological regulatory mechanisms and is available at https://github.com/jiang-junyao/IReNA.
Affiliation(s)
- Junyao Jiang
- CAS Key Laboratory of Regenerative Biology, Guangdong Provincial Key Laboratory of Biocomputing, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
| | - Pin Lyu
- Department of Ophthalmology, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
| | - Jinlian Li
- CAS Key Laboratory of Regenerative Biology, Guangdong Provincial Key Laboratory of Biocomputing, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
| | - Sunan Huang
- CAS Key Laboratory of Regenerative Biology, Guangdong Provincial Key Laboratory of Biocomputing, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
| | - Jiawang Tao
- CAS Key Laboratory of Regenerative Biology, Guangdong Provincial Key Laboratory of Biocomputing, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
| | - Seth Blackshaw
- Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
| | - Jiang Qian
- Department of Ophthalmology, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
| | - Jie Wang
- CAS Key Laboratory of Regenerative Biology, Guangdong Provincial Key Laboratory of Biocomputing, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
- State Key Laboratory of Respiratory Disease, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
- China-New Zealand Joint Laboratory on Biomedicine and Health, Guangzhou 510530, China
- Corresponding author
| |
|
24
|
Xu J, Pratt HE, Moore JE, Gerstein MB, Weng Z. Building integrative functional maps of gene regulation. Hum Mol Genet 2022; 31:R114-R122. [PMID: 36083269 PMCID: PMC9585680 DOI: 10.1093/hmg/ddac195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2022] [Revised: 08/03/2022] [Accepted: 08/09/2022] [Indexed: 11/13/2022] Open
Abstract
Every cell in the human body inherits a copy of the same genetic information. The three billion base pairs of DNA in the human genome, and the roughly 50 000 coding and non-coding genes they contain, must thus encode all the complexity of human development and cell and tissue type diversity. Differences in gene regulation, or the modulation of gene expression, enable individual cells to interpret the genome differently to carry out their specific functions. Here we discuss recent and ongoing efforts to build gene regulatory maps, which aim to characterize the regulatory roles of all sequences in a genome. Many researchers and consortia have identified such regulatory elements using functional assays and evolutionary analyses; we discuss the results, strengths and shortcomings of their approaches. We also discuss new techniques the field can leverage and emerging challenges it will face while striving to build gene regulatory maps of ever-increasing resolution and comprehensiveness.
Affiliation(s)
- Jinrui Xu
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
| | - Henry E Pratt
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Jill E Moore
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| | - Mark B Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
- Department of Computer Science, Yale University, New Haven, CT 06520, USA
- Department of Statistics and Data Science, Yale University, New Haven, CT 06520, USA
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
| |
|