1
Yan F, Suzuki A, Iwaya C, Pei G, Chen X, Yoshioka H, Yu M, Simon LM, Iwata J, Zhao Z. Single-cell multiomics decodes regulatory programs for mouse secondary palate development. Nat Commun 2024; 15:821. [PMID: 38280850 PMCID: PMC10821874 DOI: 10.1038/s41467-024-45199-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Accepted: 01/17/2024] [Indexed: 01/29/2024]
Abstract
Perturbations in gene regulation during palatogenesis can lead to cleft palate, which is among the most common congenital birth defects. Here, we perform single-cell multiome sequencing and profile chromatin accessibility and gene expression simultaneously within the same cells (n = 36,154) isolated from mouse secondary palate across embryonic days (E) 12.5, E13.5, E14.0, and E14.5. We construct five trajectories representing continuous differentiation of cranial neural crest-derived multipotent cells into distinct lineages. By linking open chromatin signals to gene expression changes, we characterize the underlying lineage-determining transcription factors. In silico perturbation analysis identifies transcription factors SHOX2 and MEOX2 as important regulators of the development of the anterior and posterior palate, respectively. In conclusion, our study charts epigenetic and transcriptional dynamics in palatogenesis, serving as a valuable resource for further cleft palate research.
Affiliation(s)
- Fangfang Yan
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Akiko Suzuki
  - Department of Diagnostic and Biomedical Sciences, School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX, 77054, USA
  - Center for Craniofacial Research, The University of Texas Health Science Center at Houston, Houston, TX, 77054, USA
  - Department of Oral and Craniofacial Sciences, School of Dentistry, University of Missouri - Kansas City, Kansas City, Missouri, 64108, USA
- Chihiro Iwaya
  - Department of Diagnostic and Biomedical Sciences, School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX, 77054, USA
  - Center for Craniofacial Research, The University of Texas Health Science Center at Houston, Houston, TX, 77054, USA
- Guangsheng Pei
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Xian Chen
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Hiroki Yoshioka
  - Department of Diagnostic and Biomedical Sciences, School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX, 77054, USA
  - Center for Craniofacial Research, The University of Texas Health Science Center at Houston, Houston, TX, 77054, USA
- Meifang Yu
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Lukas M Simon
  - Therapeutic Innovation Center, Baylor College of Medicine, Houston, TX, 77030, USA
- Junichi Iwata
  - Department of Diagnostic and Biomedical Sciences, School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX, 77054, USA
  - Center for Craniofacial Research, The University of Texas Health Science Center at Houston, Houston, TX, 77054, USA
  - MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences, Houston, TX, 77030, USA
- Zhongming Zhao
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
2
Chen Y, Zhang S. Automatic Cell Type Annotation Using Marker Genes for Single-Cell RNA Sequencing Data. Biomolecules 2022; 12:biom12101539. [PMID: 36291748 PMCID: PMC9599378 DOI: 10.3390/biom12101539] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2022] [Revised: 10/01/2022] [Accepted: 10/11/2022] [Indexed: 11/16/2022]
Abstract
Recent advances in single-cell RNA sequencing (scRNA-seq) technology are attracting growing attention. Cell type annotation plays an essential role in scRNA-seq data analysis. Several computational methods have been proposed for automatic annotation. Traditional cell type annotation first clusters the cells using unsupervised learning methods based on the gene expression profiles, then labels the clusters using the aggregated cluster-level expression profiles and the marker genes’ information. Such a procedure relies heavily on the clustering results. As the purity of clusters cannot be guaranteed, false detection of cluster features may lead to wrong annotations. In this paper, we improve this procedure and propose an Automatic Cell type Annotation Method (ACAM). ACAM delineates a clear framework to conduct automatic cell annotation through representative cluster identification, representative cluster annotation using marker genes, and the remaining cells’ classification. Experiments on seven real datasets show that ACAM performs better than six well-known cell type annotation methods.
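The annotation workflow described in this abstract (label clusters from marker genes, then classify the remaining cells) can be illustrated with a minimal sketch. This is not the ACAM implementation: the scoring rule (mean marker expression), the nearest-centroid classifier, the function names, and the toy expression values are all assumptions chosen for illustration.

```python
from statistics import mean


def annotate_clusters(expr, clusters, markers):
    """Label each cluster with the cell type whose marker genes have the
    highest mean expression across the cluster (hypothetical scoring rule)."""
    labels = {}
    for cid, cells in clusters.items():
        scores = {
            ctype: mean(expr[c].get(g, 0.0) for c in cells for g in genes)
            for ctype, genes in markers.items()
        }
        labels[cid] = max(scores, key=scores.get)
    return labels


def classify_remaining(expr, clusters, labels, remaining):
    """Assign each unclustered cell to the label of the nearest centroid
    (squared Euclidean distance over all genes seen in `expr`)."""
    genes = sorted({g for profile in expr.values() for g in profile})
    # Pool cells from clusters that received the same label.
    pooled = {}
    for cid, cells in clusters.items():
        pooled.setdefault(labels[cid], []).extend(cells)
    centroids = {
        ctype: [mean(expr[c].get(g, 0.0) for c in cells) for g in genes]
        for ctype, cells in pooled.items()
    }
    result = {}
    for cell in remaining:
        vec = [expr[cell].get(g, 0.0) for g in genes]
        result[cell] = min(
            centroids,
            key=lambda t: sum((a - b) ** 2 for a, b in zip(vec, centroids[t])),
        )
    return result


# Toy data (made-up expression values for illustration only).
expr = {
    "c1": {"CD3E": 5.0, "CD19": 0.0},
    "c2": {"CD3E": 4.0, "CD19": 1.0},
    "c3": {"CD3E": 0.0, "CD19": 6.0},
    "c4": {"CD3E": 1.0, "CD19": 5.0},
    "c5": {"CD3E": 3.5, "CD19": 0.5},  # cell left out of the clusters
}
clusters = {0: ["c1", "c2"], 1: ["c3", "c4"]}
markers = {"T cell": ["CD3E"], "B cell": ["CD19"]}

labels = annotate_clusters(expr, clusters, markers)
print(labels)
print(classify_remaining(expr, clusters, labels, ["c5"]))
```

The sketch treats every cluster as representative; selecting which clusters are pure enough to annotate is the step the abstract highlights and is not modeled here.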
Affiliation(s)
- Yu Chen
  - School of Mathematical Sciences, Fudan University, Shanghai 200433, China
- Shuqin Zhang
  - School of Mathematical Sciences, Fudan University, Shanghai 200433, China
  - Key Laboratory of Mathematics for Nonlinear Science (Ministry of Education), Fudan University, Shanghai 200433, China
  - Shanghai Key Laboratory for Contemporary Applied Mathematics, Fudan University, Shanghai 200433, China
  - Correspondence:
3
Jia P, Hu R, Yan F, Dai Y, Zhao Z. scGWAS: landscape of trait-cell type associations by integrating single-cell transcriptomics-wide and genome-wide association studies. Genome Biol 2022; 23:220. [PMID: 36253801 PMCID: PMC9575201 DOI: 10.1186/s13059-022-02785-w] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 02/23/2022] [Accepted: 10/05/2022] [Indexed: 02/02/2023]
Abstract
BACKGROUND: The rapid accumulation of single-cell RNA sequencing (scRNA-seq) data presents unique opportunities to decode the genetically mediated cell-type specificity in complex diseases. Here, we develop a new method, scGWAS, which effectively leverages scRNA-seq data to achieve two goals: (1) to infer the cell types in which the disease-associated genes manifest and (2) to construct cellular modules which imply disease-specific activation of different processes.
RESULTS: scGWAS only utilizes the average gene expression for each cell type followed by virtual search processes to construct the null distributions of module scores, making it scalable to large scRNA-seq datasets. We demonstrated scGWAS in 40 genome-wide association studies (GWAS) datasets (average sample size N ≈ 154,000) using 18 scRNA-seq datasets from nine major human/mouse tissues (totaling 1.08 million cells) and identified 2533 trait and cell-type associations, each with significant modules for further investigation. The module genes were validated using disease or clinically annotated references from ClinVar, OMIM, and pLI variants.
CONCLUSIONS: We showed that the trait-cell type associations identified by scGWAS, while generally constrained to trait-tissue associations, could recapitulate many well-studied relationships and also reveal novel relationships, providing insights into the unsolved trait-tissue associations. Moreover, in each specific cell type, the associations with different traits were often mediated by different sets of risk genes, implying disease-specific activation of driving processes. In summary, scGWAS is a powerful tool for exploring the genetic basis of complex diseases at the cell type level using single-cell expression data.
Affiliation(s)
- Peilin Jia
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Ruifeng Hu
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Fangfang Yan
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Yulin Dai
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Zhongming Zhao
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
  - Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
  - MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences, Houston, TX 77030, USA
4
Fang S, Liu S, Yang D, Yang L, Hu CD, Wan J. Decoding regulatory associations of G-quadruplex with epigenetic and transcriptomic functional components. Front Genet 2022; 13:957023. [PMID: 36092921 PMCID: PMC9452811 DOI: 10.3389/fgene.2022.957023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2022] [Accepted: 07/29/2022] [Indexed: 02/02/2023]
Abstract
G-quadruplex (G4) has been previously observed to be associated with gene expression. In this study, we performed integrative analysis on G4 multi-omics data from in silico prediction and ChIP-seq in the human genome. Potential G4 sites were classified into three distinct groups: high-confidence G4-forming locations (G4-II), and groups containing only ChIP-seq-detected G4s (G4-I) or only predicted G4 motif candidates (G4-III). We explored the associations of different-confidence G4 groups with other epigenetic regulatory elements, including CpG islands, chromatin status, enhancers, super-enhancers, G4 locations relative to genes, and DNA methylation. Our elastic net regression model revealed that G4 structures could correlate with gene expression in two opposite ways depending on their locations relative to the genes as well as the G4-forming DNA strand. Some transcription factors were identified to be over-represented with G4 emergence. The motif analysis discovered distinct consensus sequences enriched in the G4 feet, the flanking regions of two groups of G4s. We found high GC content in the feet of high-confidence G4s (G4-II) when compared to high TA content in solely predicted G4 feet of G4-III. Overall, we uncovered the comprehensive associations of G4 formations or predictions with other epigenetic and transcriptional elements which potentially coordinate gene transcription.
Affiliation(s)
- Shuyi Fang
  - Department of BioHealth Informatics, Indiana University School of Informatics and Computing, Indiana University—Purdue University Indianapolis, Indianapolis, IN, United States
- Sheng Liu
  - Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, United States
  - The Collaborative Core for Cancer Bioinformatics (CB) shared by Indiana University Simon Comprehensive Cancer Center and Purdue University Center for Cancer Research, Indianapolis, IN, United States
- Danzhou Yang
  - Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, IN, United States
  - Purdue University Center for Cancer Research, Purdue University, West Lafayette, IN, United States
- Lei Yang
  - Department of Pediatrics, Indiana University School of Medicine, Indianapolis, IN, United States
  - Herman B Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN, United States
  - Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, United States
- Chang-Deng Hu
  - Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, IN, United States
  - Purdue University Center for Cancer Research, Purdue University, West Lafayette, IN, United States
- Jun Wan
  - Department of BioHealth Informatics, Indiana University School of Informatics and Computing, Indiana University—Purdue University Indianapolis, Indianapolis, IN, United States
  - Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, United States
  - The Collaborative Core for Cancer Bioinformatics (CB) shared by Indiana University Simon Comprehensive Cancer Center and Purdue University Center for Cancer Research, Indianapolis, IN, United States
  - Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, United States
  - *Correspondence: Jun Wan,
5
Drug-Target Network Study Reveals the Core Target-Protein Interactions of Various COVID-19 Treatments. Genes (Basel) 2022; 13:genes13071210. [PMID: 35885993 PMCID: PMC9316565 DOI: 10.3390/genes13071210] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Revised: 07/01/2022] [Accepted: 07/03/2022] [Indexed: 02/04/2023]
Abstract
The coronavirus disease 2019 (COVID-19) pandemic has caused a dramatic loss of human life and devastated the worldwide economy. Numerous efforts have been made to mitigate COVID-19 symptoms and reduce the death rate. We conducted literature mining of more than 250 thousand published works and curated the 174 most widely used COVID-19 medications. Overlaid with the human protein-protein interaction (PPI) network, we used Steiner tree analysis to extract a core subnetwork that grew from the pharmacological targets of ten credible drugs ascertained by the CTD database. The resultant core subnetwork consisted of 34 interconnected genes, which were associated with 36 drugs. Immune cell membrane receptors, the downstream cellular signaling cascade, and severe COVID-19 symptom risk were significantly enriched for the core subnetwork genes. The lung mast cell was most enriched for the target genes among 1355 human tissue-cell types. Human bronchoalveolar lavage fluid COVID-19 single-cell RNA-Seq data highlighted the fact that T cells and macrophages have the most overlapping genes from the core subnetwork. Overall, we constructed an actionable human target-protein module that mainly involved anti-inflammatory/antiviral entry functions and highly overlapped with COVID-19-severity-related genes. Our findings could serve as a knowledge base for guiding drug discovery or drug repurposing to confront the fast-evolving SARS-CoV-2 virus and other severe infectious diseases.
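The idea of extracting a connected core subnetwork from a set of seed targets can be sketched with a simple Steiner tree heuristic. The version below grows a tree by repeatedly attaching the nearest remaining terminal along a shortest path in an unweighted graph. It is a generic textbook-style heuristic and is not claimed to be the algorithm used in the study; the graph, node names, and function names are hypothetical.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def nearest_path(graph, sources, targets):
    """Multi-source BFS: shortest path from any node in `sources`
    to the closest node in `targets`, or None if unreachable."""
    parent = {s: None for s in sources}
    queue = deque(sorted(sources))
    while queue:
        node = queue.popleft()
        if node in targets:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nb in sorted(graph.get(node, ())):
            if nb not in parent:
                parent[nb] = node
                queue.append(nb)
    return None


def steiner_subnetwork(graph, terminals):
    """Approximate Steiner tree: start from one terminal and repeatedly
    connect the nearest remaining terminal via a shortest path."""
    terminals = sorted(terminals)
    tree_nodes = {terminals[0]}
    tree_edges = set()
    remaining = set(terminals[1:])
    while remaining:
        path = nearest_path(graph, tree_nodes, remaining)
        if path is None:
            raise ValueError("terminals are not all connected")
        tree_edges.update(frozenset(pair) for pair in zip(path, path[1:]))
        tree_nodes.update(path)
        remaining -= set(path)
    return tree_nodes, tree_edges


# Hypothetical interaction network; A, C, and E play the role of seed targets.
graph = build_graph([("A", "B"), ("B", "C"), ("B", "D"), ("D", "E"), ("A", "F")])
nodes, edges = steiner_subnetwork(graph, {"A", "C", "E"})
print(sorted(nodes))
print(len(edges))
```

Nodes B and D enter the result only because they connect the seeds, which is the sense in which a core subnetwork can contain genes beyond the original drug targets.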
6
Dai Y, Hu R, Liu A, Cho KS, Manuel AM, Li X, Dong X, Jia P, Zhao Z. WebCSEA: web-based cell-type-specific enrichment analysis of genes. Nucleic Acids Res 2022; 50:W782-W790. [PMID: 35610053 DOI: 10.1093/nar/gkac392] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 03/27/2022] [Revised: 04/19/2022] [Accepted: 05/04/2022] [Indexed: 02/02/2023]
Abstract
Human complex traits and common diseases show tissue and cell-type specificity. Recently, single-cell RNA sequencing (scRNA-seq) technology has successfully depicted cellular heterogeneity in human tissue, providing an unprecedented opportunity to understand the context-specific expression of complex trait-associated genes in human tissue-cell types (TCs). Here, we present the first web-based application to quickly assess the cell-type specificity of genes, named Web-based Cell-type Specific Enrichment Analysis of Genes (WebCSEA, available at https://bioinfo.uth.edu/webcsea/). Specifically, we curated a total of 111 scRNA-seq panels of human tissues and 1,355 TCs from 61 different general tissues across 11 human organ systems. We adapted our previous decoding tissue-specificity (deTS) algorithm to measure the enrichment for each tissue-cell type (TC). To overcome the potential bias from the number of signature genes between different TCs, we further developed a permutation-based method that accurately estimates the TC-specificity of a given inquiry gene list. WebCSEA also provides an interactive heatmap that displays the cell-type specificity across 1,355 human TCs, and other interactive and static visualizations of cell-type specificity by human organ system, developmental stage, and top-ranked tissues and cell types. In short, WebCSEA is a one-click application that provides a comprehensive exploration of the TC-specificity of genes across a map of major human TCs.
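A permutation-based enrichment estimate of the kind mentioned in this abstract can be illustrated generically. The sketch below compares the overlap between a query gene list and one signature gene set against overlaps obtained from random gene lists of the same size. It is not the WebCSEA or deTS procedure; the overlap statistic, the function name, the parameters, and the synthetic gene identifiers are assumptions.

```python
import random


def permutation_enrichment(query, signature, background, n_perm=1000, seed=0):
    """Empirical p-value for the overlap between `query` and `signature`.

    The null distribution comes from random gene lists of the same size
    as `query`, drawn without replacement from `background`.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    query = set(query)
    signature = set(signature)
    background = sorted(set(background))
    observed = len(query & signature)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(background, len(query))
        if len(signature.intersection(sample)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Synthetic identifiers g0..g99 (not real genes).
background = [f"g{i}" for i in range(100)]
signature = background[:10]

print(permutation_enrichment(background[:10], signature, background))
print(permutation_enrichment(background[50:60], signature, background))
```

Because the null lists match the query in size, the p-value reflects how unusual the overlap is for a list of that length, rather than the raw overlap count.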
Affiliation(s)
- Yulin Dai
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Ruifeng Hu
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
  - Center for Advanced Parkinson Research, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
  - Genomics and Bioinformatics Hub, Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
- Andi Liu
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
  - Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Kyung Serk Cho
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Astrid Marilyn Manuel
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Xiaoyang Li
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
  - Department of Biostatistics and Data Science, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Xianjun Dong
  - Center for Advanced Parkinson Research, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
  - Genomics and Bioinformatics Hub, Department of Neurology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
- Peilin Jia
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Zhongming Zhao
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
  - MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences, Houston, TX 77030, USA