1. Huang F, Welner RS, Chen JY, Yue Z. PAGER-scFGA: unveiling cell functions and molecular mechanisms in cell trajectories through single-cell functional genomics analysis. Frontiers in Bioinformatics 2024; 4:1336135. [PMID: 38690527 PMCID: PMC11058213 DOI: 10.3389/fbinf.2024.1336135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2023] [Accepted: 04/01/2024] [Indexed: 05/02/2024] Open
Abstract
Background: Understanding how cells and tissues respond to stress factors and perturbations during disease processes is crucial for developing effective prevention, diagnosis, and treatment strategies. Single-cell RNA sequencing (scRNA-seq) enables high-resolution identification of cells and exploration of cell heterogeneity, shedding light on cell differentiation/maturation and functional differences. Recent advancements in multimodal sequencing technologies have focused on improving access to cell-specific subgroups for functional genomics analysis. To facilitate the functional annotation of cell groups and characterization of molecular mechanisms underlying cell trajectories, we introduce the Pathways, Annotated Gene Lists, and Gene Signatures Electronic Repository for Single-Cell Functional Genomics Analysis (PAGER-scFGA). Results: We have developed PAGER-scFGA, which integrates cell functional annotations and gene-set enrichment analysis into popular single-cell analysis pipelines such as Scanpy. Using differentially expressed genes (DEGs) from pairwise cell clusters, PAGER-scFGA infers cell functions through the enrichment of potential cell-marker genesets. Moreover, PAGER-scFGA provides pathways, annotated gene lists, and gene signatures (PAGs) enriched in specific cell subsets with tissue compositions and continuous transitions along cell trajectories. Additionally, PAGER-scFGA enables the construction of a gene subcellular map based on DEGs and allows examination of the gene functional compartments (GFCs) underlying cell maturation/differentiation. In a real-world case study of mouse natural killer (mNK) cells, PAGER-scFGA revealed two major stages of natural killer (NK) cells and three trajectories from the precursor stage to NK T-like mature stage within blood, spleen, and bone marrow tissues. As the trajectories progress to later stages, the DEGs exhibit greater divergence and variability. 
However, the DEGs in different trajectories still interact within a network during NK cell maturation. Notably, PAGER-scFGA unveiled cell cytotoxicity, exocytosis, and the response to interleukin (IL) signaling pathways and associated network models during the progression from precursor NK cells to mature NK cells. Conclusion: PAGER-scFGA enables in-depth exploration of functional insights and presents a comprehensive knowledge map of gene networks and GFCs, which can be utilized for future studies and hypothesis generation. It is expected to become an indispensable tool for inferring cell functions and detecting molecular mechanisms within cell trajectories in single-cell studies. The web app (accessible at https://au-singlecell.streamlit.app/) is publicly available.
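The enrichment step described in this abstract can be illustrated with a minimal over-representation sketch. It assumes a one-sided hypergeometric test, which is a common choice for gene-set enrichment but is not confirmed by the abstract as the statistic PAGER-scFGA uses. The function names `hypergeom_enrichment_p` and `enrich`, the gene identifiers, and the gene-set names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_gene_set, n_degs, n_overlap):
    """P(X >= n_overlap) when n_degs genes are drawn without replacement
    from a universe that contains n_gene_set annotated genes."""
    max_overlap = min(n_gene_set, n_degs)
    total = comb(n_universe, n_degs)
    tail = sum(
        comb(n_gene_set, k) * comb(n_universe - n_gene_set, n_degs - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


def enrich(degs, gene_sets, universe):
    """Rank gene sets by enrichment p-value for a list of DEGs (assumed workflow)."""
    universe = set(universe)
    degs = set(degs) & universe
    results = []
    for name, genes in gene_sets.items():
        genes = set(genes) & universe
        overlap = degs & genes
        p = hypergeom_enrichment_p(len(universe), len(genes), len(degs), len(overlap))
        results.append((name, len(overlap), p))
    # Smallest p-value first: the most over-represented gene set leads the list.
    return sorted(results, key=lambda r: r[2])


# Toy data only; these are placeholder identifiers, not real marker genes.
toy_universe = [f"g{i}" for i in range(20)]
toy_sets = {"set_A": toy_universe[:5], "set_B": toy_universe[10:15]}
for row in enrich(toy_universe[:4], toy_sets, toy_universe):
    print(row)
```

In practice the p-values would also need multiple-testing correction across gene sets, which this sketch leaves out.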
Affiliation(s)
- Fengyuan Huang
  - Department of Biomedical Informatics and Data Science, School of Medicine, University of Alabama at Birmingham, Birmingham, AL, United States
- Robert S. Welner
  - Hematology & Oncology, School of Medicine, University of Alabama at Birmingham, Birmingham, AL, United States
- Jake Y. Chen
  - Department of Biomedical Informatics and Data Science, School of Medicine, University of Alabama at Birmingham, Birmingham, AL, United States
- Zongliang Yue
  - Health Outcome Research and Policy Department, Harrison College of Pharmacy, Auburn University, Auburn, AL, United States
2. Nguyen T, Wei Y, Nakada Y, Chen JY, Zhou Y, Walcott G, Zhang J. Analysis of cardiac single-cell RNA-sequencing data can be improved by the use of artificial-intelligence-based tools. Sci Rep 2023; 13:6821. [PMID: 37100826 PMCID: PMC10133286 DOI: 10.1038/s41598-023-32293-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 10/04/2022] [Accepted: 03/25/2023] [Indexed: 04/28/2023] Open
Abstract
Single-cell RNA sequencing (scRNAseq) enables researchers to identify and characterize populations and subpopulations of different cell types in hearts recovering from myocardial infarction (MI) by characterizing the transcriptomes in thousands of individual cells. However, the effectiveness of the currently available tools for processing and interpreting these immense datasets is limited. We incorporated three Artificial Intelligence (AI) techniques into a toolkit for evaluating scRNAseq data: AI Autoencoding separates data from different cell types and subpopulations of cell types (cluster analysis); AI Sparse Modeling identifies genes and signaling mechanisms that are differentially activated between subpopulations (pathway/gene set enrichment analysis), and AI Semisupervised Learning tracks the transformation of cells from one subpopulation into another (trajectory analysis). Autoencoding was often used in data denoising; yet, in our pipeline, Autoencoding was exclusively used for cell embedding and clustering. The performance of our AI scRNAseq toolkit and other highly cited non-AI tools was evaluated with three scRNAseq datasets obtained from the Gene Expression Omnibus database. Autoencoder was the only tool to identify differences between the cardiomyocyte subpopulations found in mice that underwent MI or sham-MI surgery on postnatal day (P) 1. Statistically significant differences between cardiomyocytes from P1-MI mice and mice that underwent MI on P8 were identified for six cell-cycle phases and five signaling pathways when the data were analyzed via Sparse Modeling, compared to just one cell-cycle phase and one pathway when the data were analyzed with non-AI techniques. Only Semisupervised Learning detected trajectories between the predominant cardiomyocyte clusters in hearts collected on P28 from pigs that underwent apical resection (AR) on P1, and on P30 from pigs that underwent AR on P1 and MI on P28. 
In another dataset, pig scRNAseq data were collected after the injection of CCND2-overexpressing human induced pluripotent stem cell-derived cardiomyocytes (CCND2hiPSC) into injured P28 pig hearts; only the AI-based technique could demonstrate that host cardiomyocytes increase proliferation through the HIPPO/YAP and MAPK signaling pathways. For the cluster, pathway/gene set enrichment, and trajectory analyses of scRNAseq datasets generated from studies of myocardial regeneration in mice and pigs, our AI-based toolkit identified results that non-AI techniques did not discover. These results were validated and were important in explaining myocardial regeneration.
Affiliation(s)
- Thanh Nguyen
  - Department of Biomedical Engineering, University of Alabama at Birmingham, Birmingham, AL, 35233, USA
- Yuhua Wei
  - Department of Biomedical Engineering, University of Alabama at Birmingham, Birmingham, AL, 35233, USA
- Yuji Nakada
  - Department of Biomedical Engineering, University of Alabama at Birmingham, Birmingham, AL, 35233, USA
- Jake Y Chen
  - Informatics Institute, School of Medicine, University of Alabama at Birmingham, Birmingham, AL, 35233, USA
- Yang Zhou
  - Department of Biomedical Engineering, University of Alabama at Birmingham, Birmingham, AL, 35233, USA
- Gregory Walcott
  - Department of Medicine, Cardiovascular Diseases, University of Alabama at Birmingham, Birmingham, AL, 35233, USA
- Jianyi Zhang
  - Department of Biomedical Engineering, University of Alabama at Birmingham, Birmingham, AL, 35233, USA
  - Department of Medicine, Cardiovascular Diseases, University of Alabama at Birmingham, Birmingham, AL, 35233, USA
  - Department of Biomedical Engineering, School of Medicine and School of Engineering, University of Alabama at Birmingham, 1670 University Blvd, Volker Hall G094J, Birmingham, AL, 35233, USA
3. Slominski AT, Slominski RM, Raman C, Chen JY, Athar M, Elmets C. Neuroendocrine signaling in the skin with a special focus on the epidermal neuropeptides. Am J Physiol Cell Physiol 2022; 323:C1757-C1776. [PMID: 36317800 PMCID: PMC9744652 DOI: 10.1152/ajpcell.00147.2022] [Citation(s) in RCA: 63] [Impact Index Per Article: 31.5] [Received: 04/06/2022] [Revised: 10/21/2022] [Accepted: 10/21/2022] [Indexed: 11/07/2022]
Abstract
The skin, which is composed of the epidermis, dermis, and subcutaneous tissue, is the largest organ in the human body and plays a crucial role in regulating the body's homeostasis. Its functions are regulated by local neuroendocrine and immune systems with a plethora of signaling molecules produced by resident and immune cells. In addition, neurotransmitters, endocrine factors, neuropeptides, and cytokines released from nerve endings play a central role in the skin's responses to stress. These molecules act on the corresponding receptors in an intra-, juxta-, para-, or autocrine fashion. The epidermis, as the outermost component of the skin, forms a barrier that directly protects against environmental stressors. This protection is assured by an intrinsic keratinocyte differentiation program, the pigmentary system, and local nervous, immune, endocrine, and microbiome elements. These constituents communicate cross-functionally among themselves and with corresponding systems in the dermis and hypodermis to secure the basic epidermal functions and maintain local (skin) and global (systemic) homeostasis. The neurohormonal mediators and cytokines used in these communications regulate physiological skin functions separately or in concert. Disturbances in the functions of these systems lead to cutaneous pathology that includes inflammatory disorders (e.g., psoriasis and allergic or atopic dermatitis), keratinocytic hyperproliferative disorders (e.g., seborrheic and solar keratoses), dysfunction of adnexal structures (e.g., hair follicles and eccrine and sebaceous glands), hypersensitivity reactions, pigmentary disorders (vitiligo, melasma, and hypo- or hyperpigmentary responses), premature aging, and malignancies (melanoma and nonmelanoma skin cancers). These cellular, molecular, and neural components preserve skin integrity, protect against skin pathologies, and can act as "messengers of the skin" to the central organs, all to preserve organismal survival.
Affiliation(s)
- Andrzej T Slominski
  - Department of Dermatology, University of Alabama at Birmingham, Birmingham, Alabama
  - Comprehensive Cancer Center, Cancer Chemoprevention Program, University of Alabama at Birmingham, Birmingham, Alabama
  - VA Medical Center, Birmingham, Alabama
- Radomir M Slominski
  - Graduate Biomedical Sciences Program, University of Alabama at Birmingham, Birmingham, Alabama
- Chander Raman
  - Department of Dermatology, University of Alabama at Birmingham, Birmingham, Alabama
- Jake Y Chen
  - Informatics Institute, University of Alabama at Birmingham, Birmingham, Alabama
- Mohammad Athar
  - Department of Dermatology, University of Alabama at Birmingham, Birmingham, Alabama
  - VA Medical Center, Birmingham, Alabama
- Craig Elmets
  - Department of Dermatology, University of Alabama at Birmingham, Birmingham, Alabama
  - Comprehensive Cancer Center, Cancer Chemoprevention Program, University of Alabama at Birmingham, Birmingham, Alabama
  - VA Medical Center, Birmingham, Alabama
4. Nguyen T, Yue Z, Slominski R, Welner R, Zhang J, Chen JY. WINNER: A network biology tool for biomolecular characterization and prioritization. Front Big Data 2022; 5:1016606. [DOI: 10.3389/fdata.2022.1016606] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2022] [Accepted: 10/14/2022] [Indexed: 11/06/2022] Open
Abstract
Background and contribution: In network biology, molecular functions can be characterized by network-based inference, or “guilt-by-association.” PageRank-like tools have been applied in the study of biomolecular interaction networks to estimate the relative significance of all molecules in the network. However, there is a great deal of inherent noise in widely accessible data sets for gene-to-gene associations or protein-protein interactions. Developing robust tests to expand, filter, and rank molecular entities in disease-specific networks remains an ad hoc data analysis process. Results: We describe a new biomolecular characterization and prioritization tool called Weighted In-Network Node Expansion and Ranking (WINNER). It takes any molecular interaction network data as input and generates an optionally expanded network with all the nodes ranked according to their relevance to one another in the network. To help users assess the robustness of results, WINNER provides two different types of statistics. The first type is a node-expansion p-value, which helps evaluate the statistical significance of adding “non-seed” molecules to the original biomolecular interaction network consisting of “seed” molecules and molecular interactions. The second type is a node-ranking p-value, which helps evaluate the relative statistical significance of the contribution of each node to the overall network architecture. We validated the robustness of WINNER in ranking top molecules by spiking in noise in several network permutation experiments. We found that node degree-preserving randomization of the gene network produced normally distributed ranking scores, which outperform those made with other gene network randomization techniques. Furthermore, we validated that a more significant proportion of the WINNER-ranked genes was associated with disease biology than of genes ranked by existing methods such as PageRank.
We demonstrated the performance of WINNER with a few case studies, including Alzheimer's disease, breast cancer, myocardial infarction, and triple-negative breast cancer (TNBC). In all these case studies, the expanded and top-ranked genes identified by WINNER reveal disease biology more significantly than those identified by other gene prioritization software tools, including Ingenuity Pathway Analysis (IPA) and DiAMOND. Conclusion: WINNER ranking strongly correlates with other ranking methods when the network covers sufficient node and edge information, indicating a high network quality. WINNER users can use this new tool to robustly evaluate a list of candidate genes, proteins, or metabolites produced from high-throughput biology experiments, as long as gene/protein/metabolic network information is available.
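The idea of a node-ranking p-value under degree-preserving randomization can be sketched as follows. This is not WINNER's actual algorithm, which the abstract does not specify: it assumes plain unweighted PageRank as the ranking score, double-edge swaps as the randomization, and an empirical p-value. All function names and the toy network are hypothetical.

```python
import random


def pagerank(nodes, edges, damping=0.85, iters=100):
    """Power-iteration PageRank on an undirected, unweighted graph."""
    nbrs = {n: set() for n in nodes}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    n_nodes = len(nodes)
    rank = {n: 1.0 / n_nodes for n in nodes}
    for _ in range(iters):
        # Rank held by isolated nodes is spread evenly over all nodes.
        dangling = sum(rank[n] for n in nodes if not nbrs[n])
        rank = {
            n: (1 - damping) / n_nodes
            + damping * (sum(rank[m] / len(nbrs[m]) for m in nbrs[n]) + dangling / n_nodes)
            for n in nodes
        }
    return rank


def degree_preserving_shuffle(edges, n_swaps, rng):
    """Randomize edges with double-edge swaps; every node keeps its degree."""
    edge_list = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edge_list}
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        if len({a, b, c, d}) < 4:
            continue  # the swap would create a self-loop
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue  # the swap would create a duplicate edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return edge_list


def ranking_p_values(nodes, edges, n_random=200, seed=0):
    """Empirical p-value: how often a randomized network scores a node at least as high."""
    rng = random.Random(seed)
    observed = pagerank(nodes, edges)
    hits = {n: 0 for n in nodes}
    for _ in range(n_random):
        null = pagerank(nodes, degree_preserving_shuffle(edges, 10 * len(edges), rng))
        for n in nodes:
            if null[n] >= observed[n]:
                hits[n] += 1
    return {n: (hits[n] + 1) / (n_random + 1) for n in nodes}


# Toy network with placeholder node names.
toy_nodes = list("ABCDEF")
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E"), ("E", "F")]
print(ranking_p_values(toy_nodes, toy_edges, n_random=50))
```

Because degree is held fixed, a small p-value here reflects a node's position in the network beyond what its number of neighbors alone would explain.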
5. Weng Z, Yue Z, Zhu Y, Chen JY. DEMA: a distance-bounded energy-field minimization algorithm to model and layout biomolecular networks with quantitative features. Bioinformatics 2022; 38:i359-i368. [PMID: 35758816 PMCID: PMC9235497 DOI: 10.1093/bioinformatics/btac261] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022] Open
Abstract
SUMMARY: In biology, graph layout algorithms can reveal comprehensive biological contexts by visually positioning graph nodes in their relevant neighborhoods. A layout algorithm or engine commonly takes a set of nodes and edges and produces layout coordinates for the nodes according to edge constraints. However, current layout engines normally do not consider node, edge, or node-set properties during layout and only curate these properties after the layout is created. Here, we propose a new layout algorithm, the distance-bounded energy-field minimization algorithm (DEMA), to natively consider various biological factors, i.e., the strength of gene-to-gene association, the gene's relative contribution weight, and the functional groups of genes, to enhance the interpretation of complex network graphs. In DEMA, we introduce a parameterized energy model where nodes are repelled by the network topology and attracted by a few biological factors, i.e., interaction coefficient, effect coefficient, and fold change of gene expression. We generalize these factors as gene weights, protein-protein interaction weights, gene-to-gene correlations, and gene set annotations, the four parameterized functional properties used in DEMA. Moreover, DEMA considers further attraction, repulsion, and grouping coefficients to enable different preferences in generating network views. Applying DEMA, we performed two case studies using genetic data in autism spectrum disorder and Alzheimer's disease, respectively, for gene candidate discovery. Furthermore, we implemented our algorithm as a plugin for Cytoscape, an open-source software platform for visualizing networks, which makes it convenient to use. Our software and demo can be freely accessed at http://discovery.informatics.uab.edu/dema. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
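The general idea of an energy-minimizing layout can be sketched as below. This is not DEMA's energy model, which the abstract does not spell out: the sketch assumes a quadratic attraction along weighted edges, an inverse-distance repulsion between all node pairs, and gradient descent with step halving. The names `energy`, `gradient`, `initial_positions`, and `layout`, the `repulsion` parameter, and the toy gene labels are all hypothetical.

```python
import math
import random


def energy(pos, edges, repulsion=1.0):
    """Attraction 0.5 * w * d^2 per weighted edge plus repulsion / d per node pair."""
    nodes = list(pos)
    total = sum(0.5 * w * math.dist(pos[u], pos[v]) ** 2 for (u, v), w in edges.items())
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            total += repulsion / max(math.dist(pos[u], pos[v]), 1e-9)
    return total


def gradient(pos, edges, repulsion=1.0):
    """Gradient of the energy above with respect to every node coordinate."""
    grad = {n: [0.0, 0.0] for n in pos}
    for (u, v), w in edges.items():
        for k in range(2):
            diff = pos[u][k] - pos[v][k]
            grad[u][k] += w * diff
            grad[v][k] -= w * diff
    nodes = list(pos)
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            d = max(math.dist(pos[u], pos[v]), 1e-9)
            for k in range(2):
                g = -repulsion * (pos[u][k] - pos[v][k]) / d ** 3
                grad[u][k] += g
                grad[v][k] -= g
    return grad


def initial_positions(nodes, seed=0):
    rng = random.Random(seed)
    return {n: (rng.uniform(-1, 1), rng.uniform(-1, 1)) for n in nodes}


def layout(nodes, edges, steps=200, seed=0):
    """Gradient descent that only accepts steps which lower the energy."""
    pos = initial_positions(nodes, seed)
    current, step = energy(pos, edges), 0.1
    for _ in range(steps):
        grad = gradient(pos, edges)
        trial = {n: (pos[n][0] - step * grad[n][0], pos[n][1] - step * grad[n][1]) for n in nodes}
        trial_energy = energy(trial, edges)
        if trial_energy < current:
            pos, current = trial, trial_energy
        else:
            step *= 0.5  # overshoot: retry with a smaller step
    return pos


# Toy network: a larger edge weight stands in for a stronger gene-to-gene association.
toy_nodes = ["geneA", "geneB", "geneC", "geneD"]
toy_edges = {("geneA", "geneB"): 5.0, ("geneB", "geneC"): 0.5, ("geneC", "geneD"): 1.0}
print(layout(toy_nodes, toy_edges))
```

Accepting only energy-lowering steps keeps the sketch stable without tuning a learning rate; a production layout engine would likely use a more efficient optimizer.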
Affiliation(s)
- Zhenyu Weng
  - Communication and Information Security Lab, Institute of Big Data Technologies, Shenzhen Graduate School, Peking University, Shenzhen 518055, China
- Zongliang Yue
  - Informatics Institute, School of Medicine, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Yuesheng Zhu
  - Communication and Information Security Lab, Institute of Big Data Technologies, Shenzhen Graduate School, Peking University, Shenzhen 518055, China
- Jake Yue Chen
  - Informatics Institute, School of Medicine, University of Alabama at Birmingham, Birmingham, AL 35294, USA
6. Gao S, Wang S, Zhao Z, Zhang C, Liu Z, Ye P, Xu Z, Yi B, Jiao K, Naik GA, Wei S, Rais-Bahrami S, Bae S, Yang WH, Sonpavde G, Liu R, Wang L. TUBB4A interacts with MYH9 to protect the nucleus during cell migration and promotes prostate cancer via GSK3β/β-catenin signalling. Nat Commun 2022; 13:2792. [PMID: 35589707 PMCID: PMC9120517 DOI: 10.1038/s41467-022-30409-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 11/17/2020] [Accepted: 04/28/2022] [Indexed: 01/22/2023] Open
Abstract
Human tubulin beta class IVa (TUBB4A) is a member of the β-tubulin family. In most normal tissues, TUBB4A expression is low or absent, but it is highly expressed in human prostate cancer. Here we show that high expression levels of TUBB4A are associated with aggressive prostate cancers and poor patient survival, especially for African-American men. Additionally, in prostate cancer cells, TUBB4A knockout (KO) reduces cell growth and migration but induces DNA damage through increased γH2AX and 53BP1. Furthermore, during constricted cell migration, TUBB4A interacts with MYH9 to protect the nucleus, but either TUBB4A KO or MYH9 knockdown leads to severe DNA damage and reduces the NF-κB signaling response. Also, TUBB4A KO retards tumor growth and metastasis. Functional analysis reveals that TUBB4A/GSK3β binds to the N-terminal of MYH9, and that TUBB4A KO reduces MYH9-mediated GSK3β ubiquitination and degradation, leading to decreased activation of β-catenin signaling and its relevant epithelial-mesenchymal transition. Likewise, prostate-specific deletion of Tubb4a reduces spontaneous tumor growth and metastasis via inhibition of NF-κB, cyclin D1, and c-MYC signaling activation. Our results suggest an oncogenic role of TUBB4A and provide a potentially actionable therapeutic target for prostate cancers with TUBB4A overexpression. The β-tubulin family protein TUBB4A is highly expressed in cancer but its molecular role is unclear. Here, the authors show that TUBB4A is required to protect the nucleus from genomic instability during migration and that its overexpression promotes cancer progression.
Affiliation(s)
- Song Gao
  - Department of Genetics, University of Alabama at Birmingham, Birmingham, AL, USA
- Shuaibin Wang
  - Department of Genetics, University of Alabama at Birmingham, Birmingham, AL, USA
- Zhiying Zhao
  - Department of Genetics, University of Alabama at Birmingham, Birmingham, AL, USA
- Chao Zhang
  - Department of Genetics, University of Alabama at Birmingham, Birmingham, AL, USA
- Zhicao Liu
  - Department of Genetics, University of Alabama at Birmingham, Birmingham, AL, USA
- Ping Ye
  - Department of Genetics, University of Alabama at Birmingham, Birmingham, AL, USA
- Zhifang Xu
  - Department of Genetics, University of Alabama at Birmingham, Birmingham, AL, USA
- Baozhu Yi
  - Department of Genetics, University of Alabama at Birmingham, Birmingham, AL, USA
- Kai Jiao
  - Department of Genetics, University of Alabama at Birmingham, Birmingham, AL, USA
- Gurudatta A Naik
  - Department of O'Neal Comprehensive Cancer Center, University of Alabama at Birmingham, Birmingham, AL, USA
- Shi Wei
  - Department of O'Neal Comprehensive Cancer Center, University of Alabama at Birmingham, Birmingham, AL, USA; Department of Pathology, University of Alabama at Birmingham, Birmingham, AL, USA
- Soroush Rais-Bahrami
  - Department of O'Neal Comprehensive Cancer Center, University of Alabama at Birmingham, Birmingham, AL, USA; Department of Urology, University of Alabama at Birmingham, Birmingham, AL, USA; Department of Radiology, University of Alabama at Birmingham, Birmingham, AL, USA
- Sejong Bae
  - Department of O'Neal Comprehensive Cancer Center, University of Alabama at Birmingham, Birmingham, AL, USA; Department of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
- Wei-Hsiung Yang
  - Department of Biomedical Sciences, Mercer University School of Medicine, Savannah, GA, USA
- Runhua Liu
  - Department of Genetics, University of Alabama at Birmingham, Birmingham, AL, USA; Department of O'Neal Comprehensive Cancer Center, University of Alabama at Birmingham, Birmingham, AL, USA
- Lizhong Wang
  - Department of Genetics, University of Alabama at Birmingham, Birmingham, AL, USA; Department of O'Neal Comprehensive Cancer Center, University of Alabama at Birmingham, Birmingham, AL, USA
7. Yue Z, Slominski R, Bharti S, Chen JY. PAGER Web APP: An Interactive, Online Gene Set and Network Interpretation Tool for Functional Genomics. Front Genet 2022; 13:820361. [PMID: 35495152 PMCID: PMC9039620 DOI: 10.3389/fgene.2022.820361] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/22/2021] [Accepted: 03/17/2022] [Indexed: 12/30/2022] Open
Abstract
Functional genomics studies have helped researchers annotate differentially expressed gene lists, extract gene expression signatures, and identify biological pathways from omics profiling experiments conducted on biological samples. The current geneset, network, and pathway analysis (GNPA) web servers, e.g., DAVID, EnrichR, WebGestaltR, or PAGER, do not allow automated integrative functional genomic downstream analysis. In this study, we developed a new web-based interactive application, “PAGER Web APP”, which supports online R scripting of integrative GNPA. In a case study of melanoma drug resistance, we showed that the new PAGER Web APP enabled us to discover highly relevant pathways and network modules, leading to novel biological insights. We also compared the pathway analysis results that PAGER Web APP retrieved from PAGER, EnrichR, and WebGestaltR to show its advantages in integrative GNPA. The interactive web APP is publicly accessible at https://aimed-lab.shinyapps.io/PAGERwebapp/.
Affiliation(s)
- Zongliang Yue
  - Informatics Institute in the School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, United States
- Radomir Slominski
  - Informatics Institute in the School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, United States
  - Graduate Biomedical Sciences Program, The University of Alabama at Birmingham, Birmingham, AL, United States
- Samuel Bharti
  - Informatics Institute in the School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, United States
- Jake Y. Chen
  - Informatics Institute in the School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, United States
  - *Correspondence: Jake Y. Chen,
8. Zindl CL, Witte SJ, Laufer VA, Gao M, Yue Z, Janowski KM, Cai B, Frey BF, Silberger DJ, Harbour SN, Singer JR, Turner H, Lund FE, Vallance BA, Rosenberg AF, Schoeb TR, Chen JY, Hatton RD, Weaver CT. A nonredundant role for T cell-derived interleukin 22 in antibacterial defense of colonic crypts. Immunity 2022; 55:494-511.e11. [PMID: 35263568 PMCID: PMC9126440 DOI: 10.1016/j.immuni.2022.02.003] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 06/03/2020] [Revised: 11/11/2021] [Accepted: 02/04/2022] [Indexed: 02/05/2023]
Abstract
Interleukin (IL)-22 is central to immune defense at barrier sites. We examined the contributions of innate lymphoid cell (ILC) and T cell-derived IL-22 during Citrobacter rodentium (C.r) infection using mice that both report Il22 expression and allow lineage-specific deletion. ILC-derived IL-22 activated STAT3 in C.r-colonized surface intestinal epithelial cells (IECs) but only temporally restrained bacterial growth. T cell-derived IL-22 induced a more robust and extensive activation of STAT3 in IECs, including IECs lining colonic crypts, and T cell-specific deficiency of IL-22 led to pathogen invasion of the crypts and increased mortality. This reflected a requirement for T cell-derived IL-22 for the expression of a host-protective transcriptomic program that included AMPs, neutrophil-recruiting chemokines, and mucin-related molecules, and it restricted IFNγ-induced proinflammatory genes. Our findings demonstrate spatiotemporal differences in the production and action of IL-22 by ILCs and T cells during infection and reveal an indispensable role for IL-22-producing T cells in the protection of the intestinal crypts.
Affiliation(s)
- Carlene L Zindl
  - Department of Pathology, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Steven J Witte
  - Department of Pathology, University of Alabama at Birmingham, Birmingham, AL 35294, USA; Department of Medicine, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Vincent A Laufer
  - Department of Pathology, University of Alabama at Birmingham, Birmingham, AL 35294, USA; Department of Medicine, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Min Gao
  - Department of Genetics, University of Alabama at Birmingham, Birmingham, AL 35294, USA; Informatics Institute, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Zongliang Yue
  - Department of Genetics, University of Alabama at Birmingham, Birmingham, AL 35294, USA; Informatics Institute, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Karen M Janowski
  - Department of Genetics, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Baiyi Cai
  - Department of Pathology, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Blake F Frey
  - Department of Pathology, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Daniel J Silberger
  - Department of Pathology, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Stacey N Harbour
  - Department of Pathology, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Jeffrey R Singer
  - Department of Pathology, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Henrietta Turner
  - Department of Pathology, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Frances E Lund
  - Department of Microbiology, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Bruce A Vallance
  - Department of Pediatrics, University of British Columbia, Vancouver, BC V6H 3V4, Canada
- Alexander F Rosenberg
  - Informatics Institute, University of Alabama at Birmingham, Birmingham, AL 35294, USA; Department of Microbiology, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Trenton R Schoeb
  - Department of Genetics, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Jake Y Chen
  - Department of Genetics, University of Alabama at Birmingham, Birmingham, AL 35294, USA; Informatics Institute, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Robin D Hatton
  - Department of Pathology, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Casey T Weaver
  - Department of Pathology, University of Alabama at Birmingham, Birmingham, AL 35294, USA
9. Tang Z, Fan W, Li Q, Wang D, Wen M, Wang J, Li X, Zhou Y. MVIP: multi-omics portal of viral infection. Nucleic Acids Res 2021; 50:D817-D827. [PMID: 34718748 PMCID: PMC8689837 DOI: 10.1093/nar/gkab958] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 08/15/2021] [Revised: 09/30/2021] [Accepted: 10/05/2021] [Indexed: 12/13/2022] Open
Abstract
Virus infections are major threats to living organisms and cause many diseases, such as COVID-19, caused by SARS-CoV-2, which has led to millions of deaths. To develop effective strategies to control viral infection, we need to understand its molecular events in host cells. Virus-related functional genomic datasets are growing rapidly; however, an integrative platform for systematically investigating host responses to viruses is missing. Here, we developed a user-friendly multi-omics portal of viral infection named MVIP (https://mvip.whu.edu.cn/). We manually collected available high-throughput sequencing data obtained under viral infection and unified their detailed metadata, including virus, host species, infection time, assay, and target. We processed multi-layered omics data, including RNA-seq, ChIP-seq, and CLIP-seq, from more than 4900 virus-infected samples covering 77 viruses and 33 host species with standard pipelines. In addition, we integrated these genome-wide signals into customized genome browsers and developed multiple dynamic charts to present the information, such as time-course and differential gene expression profiles, alternative splicing changes, and enriched GO/KEGG terms. Furthermore, we implemented several tools for efficiently mining virus-host interactions by virus, host, and gene. MVIP would help users retrieve large-scale functional information and promote the understanding of virus-host interactions.
Affiliation(s)
- Zhidong Tang
  - State Key Laboratory of Virology, College of Life Sciences, Wuhan University, Wuhan 430072, China
- Weiliang Fan
  - State Key Laboratory of Virology, College of Life Sciences, Wuhan University, Wuhan 430072, China
- Qiming Li
  - State Key Laboratory of Virology, College of Life Sciences, Wuhan University, Wuhan 430072, China
- Dehe Wang
  - State Key Laboratory of Virology, College of Life Sciences, Wuhan University, Wuhan 430072, China
- Miaomiao Wen
  - Institute for Advanced Studies, Wuhan University, Wuhan 430072, China
- Junhao Wang
  - State Key Laboratory of Virology, College of Life Sciences, Wuhan University, Wuhan 430072, China
- Xingqiao Li
  - State Key Laboratory of Virology, College of Life Sciences, Wuhan University, Wuhan 430072, China
- Yu Zhou
  - State Key Laboratory of Virology, College of Life Sciences, Wuhan University, Wuhan 430072, China; Institute for Advanced Studies, Wuhan University, Wuhan 430072, China; RNA Institute, Wuhan University, Wuhan 430072, China; Frontier Science Center for Immunology and Metabolism, Wuhan University, Wuhan 430072, China
Collapse
|
10
|
Nguyen T, Zhang T, Fox G, Zeng S, Cao N, Pan C, Chen JY. Linking clinotypes to phenotypes and genotypes from laboratory test results in comprehensive physical exams. BMC Med Inform Decis Mak 2021; 21:51. [PMID: 33627109 PMCID: PMC7903607 DOI: 10.1186/s12911-021-01387-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/11/2020] [Accepted: 01/06/2021] [Indexed: 11/10/2022] Open
Abstract
Background: In this work, we aimed to demonstrate how to utilize lab test results and other clinical information to support precision medicine research and clinical decisions on complex diseases, with the support of electronic medical record facilities. We defined "clinotypes" as clinical information that could be observed and measured objectively using biomedical instruments. From well-known 'omic' problem definitions, we defined problems using clinotype information, including stratifying patients (identifying sub-cohorts of interest for future studies), mining significant associations between clinotypes and specific phenotypes (diseases), and discovering potential linkages between clinotype and genomic information. We solved these problems by integrating public omic databases and applying advanced machine learning and visual analytic techniques to two-year health exam records from a large population of healthy southern Chinese individuals (n = 91,354). When developing the solution, we carefully addressed missing information, class imbalance, and non-uniform data annotation. Results: We organized the techniques and solutions addressing the problems and issues above into the CPA (Clinotype Prediction and Association-finding) framework. At the data preprocessing step, we handled the missing value issue with a prediction accuracy of 0.760. We curated 12,635 clinotype-gene associations. We found 147 associations between 147 chronic disease phenotypes and clinotypes, which improved the disease predictive performance to an average AUC of 0.967. We mined 182 significant clinotype-clinotype associations among 69 clinotypes. Conclusions: Our results showed strong potential connectivity between the omics information and the clinical lab test information. The results further emphasized the need to utilize and integrate clinical information, especially lab test results, in future PheWAS and omic studies. Furthermore, they showed that clinotype information could initiate an alternative research direction and serve as an independent field of data to support the well-known 'phenome' and 'genome' research.
Affiliation(s)
- Thanh Nguyen
  - Informatics Institute, School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, USA
- Tongbin Zhang
  - School of First Clinical Medical Sciences - School of Information and Engineering, Wenzhou Medical University, Zhejiang, China
  - Department of Computer Technology and Information Management, The First Affiliated Hospital of Wenzhou Medical University, Zhejiang, China
- Geoffrey Fox
  - School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN, USA
- Sisi Zeng
  - School of First Clinical Medical Sciences - School of Information and Engineering, Wenzhou Medical University, Zhejiang, China
- Ni Cao
  - School of First Clinical Medical Sciences - School of Information and Engineering, Wenzhou Medical University, Zhejiang, China
- Chuandi Pan
  - School of First Clinical Medical Sciences - School of Information and Engineering, Wenzhou Medical University, Zhejiang, China
  - Department of Computer Technology and Information Management, The First Affiliated Hospital of Wenzhou Medical University, Zhejiang, China
- Jake Y Chen
  - Informatics Institute, School of Medicine, The University of Alabama at Birmingham, Birmingham, AL, USA
|
11
|
Yue Z, Zhang E, Xu C, Khurana S, Batra N, Dang SDH, Cimino JJ, Chen JY. PAGER-CoV: a comprehensive collection of pathways, annotated gene-lists and gene signatures for coronavirus disease studies. Nucleic Acids Res 2021; 49:D589-D599. [PMID: 33245774 PMCID: PMC7778959 DOI: 10.1093/nar/gkaa1094] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/17/2020] [Revised: 10/23/2020] [Accepted: 10/27/2020] [Indexed: 12/13/2022] Open
Abstract
PAGER-CoV (http://discovery.informatics.uab.edu/PAGER-CoV/) is a new web-based database that can help biomedical researchers interpret coronavirus-related functional genomic study results in the context of curated knowledge of host viral infection, inflammatory response, organ damage, and tissue repair. The new database consists of 11 835 PAGs (Pathways, Annotated gene-lists, or Gene signatures) from 33 public data sources. Through the web user interface, users can search by a query gene or a query term and retrieve significantly matched PAGs with all the curated information. Users can navigate from a PAG of interest to other related PAGs through either shared PAG-to-PAG co-membership relationships or PAG-to-PAG regulatory relationships, totaling 19 996 993. Users can also retrieve enriched PAGs from an input list of COVID-19 functional study result genes, customize the search data sources, and export all results for subsequent offline data analysis. In a case study, we performed a gene set enrichment analysis (GSEA) of a COVID-19 RNA-seq data set from the Gene Expression Omnibus database. Compared with the results using the standard PAGER database, PAGER-CoV allows for more sensitive matching of known immune-related gene signatures. We expect PAGER-CoV to be invaluable for biomedical researchers to find molecular biology mechanisms and tailored therapeutics to treat COVID-19 patients.
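As an illustration of what "retrieving enriched PAGs from an input gene list" can involve, the sketch below computes a one-sided hypergeometric (over-representation) p-value for the overlap between a query gene list and one gene set. This is an assumption made for illustration: the abstract does not state which enrichment statistic PAGER-CoV uses, and the function name `hypergeom_enrichment_p` and all numbers in the usage lines are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_universe, n_set, n_query, n_overlap):
    """One-sided hypergeometric p-value P(X >= n_overlap).

    Assumed model (not taken from the abstract): n_query genes are drawn
    without replacement from a universe of n_universe genes, of which
    n_set belong to the gene set (PAG) being tested.
    """
    upper = min(n_set, n_query)          # largest possible overlap
    total = comb(n_universe, n_query)    # number of possible query lists
    tail = sum(
        comb(n_set, k) * comb(n_universe - n_set, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 20,000 genes in the universe, a 150-gene PAG,
# a 300-gene query list, and 12 genes shared between the two.
p_value = hypergeom_enrichment_p(20000, 150, 300, 12)
print(f"enrichment p-value: {p_value:.3g}")
```

In practice such p-values would be computed for every candidate gene set and then corrected for multiple testing; that step is omitted here.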
Affiliation(s)
- Zongliang Yue
  - Informatics Institute, School of Medicine, The University of Alabama at Birmingham, Birmingham, AL 35223, USA
- Eric Zhang
  - Informatics Institute, School of Medicine, The University of Alabama at Birmingham, Birmingham, AL 35223, USA
- Clark Xu
  - University of Wisconsin-Madison School of Medicine and Public Health, Institute of Clinical and Translational Research, Madison, WI 53705-2221, USA
- Sunny Khurana
  - Informatics Institute, School of Medicine, The University of Alabama at Birmingham, Birmingham, AL 35223, USA
- Nishant Batra
  - Informatics Institute, School of Medicine, The University of Alabama at Birmingham, Birmingham, AL 35223, USA
- Son Do Hai Dang
  - Informatics Institute, School of Medicine, The University of Alabama at Birmingham, Birmingham, AL 35223, USA
- James J Cimino
  - Informatics Institute, School of Medicine, The University of Alabama at Birmingham, Birmingham, AL 35223, USA
- Jake Y Chen
  - Informatics Institute, School of Medicine, The University of Alabama at Birmingham, Birmingham, AL 35223, USA
|
12
|
Yue Z, Yan D, Guo G, Chen JY. Biological Network Mining. Methods Mol Biol 2021; 2328:139-151. [PMID: 34251623 DOI: 10.1007/978-1-0716-1534-8_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
In this book chapter, we introduce a pipeline to mine significant biomedical entities (or bioentities) in biological networks. Our focus is on prioritizing both the bioentities themselves and the associations between them in order to reveal their biological functions. We introduce three tools, BEERE, WIPER, and PAGER 2.0, that can be used together for network analysis and function interpretation: (1) BEERE is a network analysis tool for "Biomedical Entity Expansion, Ranking and Explorations," (2) WIPER is an entity-to-entity association ranking tool, and (3) PAGER 2.0 is a service for gene enrichment analysis.
Affiliation(s)
- Zongliang Yue
  - The University of Alabama at Birmingham, Birmingham, AL, USA
- Da Yan
  - The University of Alabama at Birmingham, Birmingham, AL, USA
- Guimu Guo
  - The University of Alabama at Birmingham, Birmingham, AL, USA
- Jake Y Chen
  - The University of Alabama at Birmingham, Birmingham, AL, USA
|
13
|
Association study based on topological constraints of protein-protein interaction networks. Sci Rep 2020; 10:10797. [PMID: 32612246 PMCID: PMC7329836 DOI: 10.1038/s41598-020-67875-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/20/2020] [Accepted: 06/15/2020] [Indexed: 12/17/2022] Open
Abstract
The non-random interaction pattern of a protein–protein interaction network (PIN) is biologically informative, but its potential has not been fully utilized in omics studies. Here, we propose a network-permutation-based association study (NetPAS) method that gauges the observed interactions between two sets of genes by comparing the empirical network with permutation null models. This enables NetPAS to evaluate relationships, constrained by network topology, between gene sets related to different phenotypes. We demonstrated the utility of NetPAS on 50 well-curated gene sets and in a comparison of association studies using Z-scores, modified Zʹ-scores, p-values, and Jaccard indices. Using NetPAS, a weighted human disease network was generated from the association scores of 19 gene sets from OMIM. We also applied NetPAS to gene sets derived from gene ontology and pathway annotations and showed that NetPAS uncovered functional terms missed by DAVID and WebGestalt. Overall, we show that NetPAS can take the topological constraints of molecular networks into account and offer perspectives that existing methods do not.
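The general idea of scoring the interactions between two gene sets against a permutation null can be sketched as follows. This is a minimal illustration, not the NetPAS implementation: the null model used here (shuffling node labels while keeping the network wiring fixed) is an assumption, the abstract does not specify the permutation scheme, and the function names and toy network are hypothetical.

```python
import random
import statistics


def count_cross_edges(edges, set_a, set_b):
    """Count edges with one endpoint in set_a and the other in set_b."""
    return sum(
        1
        for u, v in edges
        if (u in set_a and v in set_b) or (u in set_b and v in set_a)
    )


def permutation_z_score(edges, set_a, set_b, n_perm=1000, seed=0):
    """Z-score of the observed cross-set edge count against a label-shuffling null.

    Assumed null model: node labels are permuted at random, so the
    topology (degree sequence, wiring) is preserved while gene identities
    are randomized.
    """
    rng = random.Random(seed)
    nodes = sorted({node for edge in edges for node in edge})
    observed = count_cross_edges(edges, set_a, set_b)

    null_counts = []
    for _ in range(n_perm):
        shuffled = nodes[:]
        rng.shuffle(shuffled)
        relabel = dict(zip(nodes, shuffled))
        permuted = [(relabel[u], relabel[v]) for u, v in edges]
        null_counts.append(count_cross_edges(permuted, set_a, set_b))

    null_mean = statistics.fmean(null_counts)
    null_sd = statistics.pstdev(null_counts)
    z = (observed - null_mean) / null_sd if null_sd > 0 else float("nan")
    return observed, null_mean, z


# Toy network (hypothetical): sets A and B are densely connected to each other.
toy_edges = [
    ("a1", "b1"), ("a2", "b2"), ("a1", "b2"),
    ("c1", "c2"), ("c2", "c3"), ("c3", "c4"), ("c1", "c4"),
]
observed, null_mean, z = permutation_z_score(toy_edges, {"a1", "a2"}, {"b1", "b2"})
print(f"observed={observed}, null mean={null_mean:.2f}, z={z:.2f}")
```

A large positive Z-score indicates more interactions between the two sets than expected under the null; an empirical p-value could be obtained from the same `null_counts` list.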
|
14
|
Yue Z, Nguyen T, Zhang E, Zhang J, Chen JY. WIPER: Weighted in-Path Edge Ranking for biomolecular association networks. Quantitative Biology 2019; 7:313-326. [PMID: 38525413 PMCID: PMC10959292 DOI: 10.1007/s40484-019-0180-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/01/2019] [Revised: 08/02/2019] [Accepted: 08/08/2019] [Indexed: 10/25/2022]
Abstract
Background: In network biology, researchers generate biomolecular networks from candidate genes or proteins experimentally derived from high-throughput data and from known biomolecular associations. Current bioinformatics research focuses on characterizing candidate genes/proteins, or nodes, with network characteristics, e.g., betweenness centrality. However, few research reports characterize and prioritize biomolecular associations ("edges"), which can represent gene regulatory events essential to biological processes. Method: We developed Weighted In-Path Edge Ranking (WIPER), a new computational algorithm that evaluates all biomolecular interactions/associations ("edges") in a network model and generates a rank order of every edge based on its in-path traversal score and the result of a statistical significance test. To validate whether WIPER worked as designed, we tested the algorithm on synthetic network models. Results: Our results showed that WIPER can reliably discover both critical "well traversed in-path edges", which are statistically more traversed than normal edges, and "peripheral in-path edges", which are less traversed than normal edges. Compared with other simple measures such as betweenness centrality, WIPER provides better biological interpretations. In the case study analyzing postnatal pig heart gene expression, WIPER highlighted new signaling pathways suggestive of cardiomyocyte regeneration and proliferation. In the case study of Alzheimer's disease genetic disorder associations, WIPER reported the SRC:APP, AR:APP, APP:FYN, and APP:NES edges (gene-gene associations) as both statistically and biologically important based on PubMed co-citation. Conclusion: We believe that WIPER will become an essential software tool to help biologists discover and validate essential signaling/regulatory events from high-throughput biology data in the context of biological networks. Availability: The free WIPER API is described at discovery.informatics.uab.edu/wiper/.
Affiliation(s)
- Zongliang Yue
  - Informatics Institute, School of Medicine, University of Alabama, Birmingham, AL 35233, USA
- Thanh Nguyen
  - Informatics Institute, School of Medicine, University of Alabama, Birmingham, AL 35233, USA
- Eric Zhang
  - Department of Biomedical Engineering, University of Alabama, Birmingham, AL 35233, USA
- Jianyi Zhang
  - Department of Biomedical Engineering, University of Alabama, Birmingham, AL 35233, USA
- Jake Y. Chen
  - Informatics Institute, School of Medicine, University of Alabama, Birmingham, AL 35233, USA
  - Department of Biomedical Engineering, University of Alabama, Birmingham, AL 35233, USA
  - Department of Computer Science, University of Alabama, Birmingham, AL 35233, USA
|
15
|
Yue Z, Willey CD, Hjelmeland AB, Chen JY. BEERE: a web server for biomedical entity expansion, ranking and explorations. Nucleic Acids Res 2019; 47:W578-W586. [PMID: 31114876 PMCID: PMC6602520 DOI: 10.1093/nar/gkz428] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 03/14/2019] [Revised: 05/04/2019] [Accepted: 05/20/2019] [Indexed: 12/02/2022] Open
Abstract
BEERE (Biomedical Entity Expansion, Ranking and Explorations) is a new web-based data analysis tool to help biomedical researchers characterize any input list of genes/proteins, biomedical terms, or their combinations, i.e. 'biomedical entities', in the context of existing literature. Specifically, BEERE first aims to help users examine the credibility of known entity-to-entity associative or semantic relationships, supported by database or literature references, among the genes/terms in the user input. Then, it helps users uncover the relative importance of each entity (a gene or a term) within the user input by computing ranking scores for all entities. Finally, it helps users hypothesize new gene functions or genotype-phenotype associations through an interactive visual interface of the constructed global entity relationship network. The output from BEERE includes: a list of the original entities matched with known relationships in databases; any expanded entities that may be generated from the analysis; the ranks and ranking scores, reported with statistical significance, for each entity; and an interactive graphical display of the gene or term network with data provenance annotations that link to external data sources. The web server is free and open to all users with no login requirement and can be accessed at http://discovery.informatics.uab.edu/beere/.
Affiliation(s)
- Zongliang Yue
  - Informatics Institute, School of Medicine, the University of Alabama at Birmingham, AL 35233, USA
- Christopher D Willey
  - Department of Radiation Oncology, School of Medicine, the University of Alabama at Birmingham, AL 35233, USA
- Anita B Hjelmeland
  - Department of Cell, Developmental and Integrative Biology, School of Medicine, the University of Alabama at Birmingham, AL 35233, USA
- Jake Y Chen
  - Informatics Institute, School of Medicine, the University of Alabama at Birmingham, AL 35233, USA
|
16
|
Yue Z, Neylon MT, Nguyen T, Ratliff T, Chen JY. "Super Gene Set" Causal Relationship Discovery from Functional Genomics Data. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018; 15:1991-1998. [PMID: 30040650 PMCID: PMC6380687 DOI: 10.1109/tcbb.2018.2858755] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/08/2023]
Abstract
In this article, we present a computational framework to identify "causal relationships" among super gene sets. By "causal relationships," we refer to both stimulatory and inhibitory regulatory relationships, whether through direct or indirect mechanisms. By super gene sets, we refer to "pathways, annotated lists, and gene signatures," or PAGs. To identify causal relationships among PAGs, we extend previous work on identifying PAG-to-PAG regulatory relationships by further requiring them to be significantly enriched with gene-to-gene co-expression pairs across the two PAGs involved. This is achieved by developing a quantitative metric based on PAG-to-PAG Co-expressions (PPC), which we use to infer the likelihood that the PAG-to-PAG relationships under examination are causal, either stimulatory or inhibitory. Since true causal relationships are unknown, we approximate the overall performance of inferring causal relationships with the performance of recalling known r-type PAG-to-PAG relationships from causal PAG-to-PAG inference, using a functional genomics benchmark dataset from the GEO database. We report an area-under-curve (AUC) performance of 0.81 for both precision and recall. By applying our framework to a myeloid-derived suppressor cells (MDSC) dataset, we further demonstrate that this framework is effective in helping build multi-scale biomolecular systems models with new insights on regulatory and causal links for downstream biological interpretations.
Affiliation(s)
- Zongliang Yue
  - Informatics Institute, the University of Alabama at Birmingham, Birmingham, AL 35233, US
- Michael T. Neylon
  - School of Informatics and Computing, Indiana University, Indianapolis, IN 46202, US
- Thanh Nguyen
  - Informatics Institute, the University of Alabama at Birmingham, Birmingham, AL 35233, US
- Timothy Ratliff
  - Purdue University Center for Cancer Research, West Lafayette, IN 47906, US
- Jake Y. Chen
  - Informatics Institute, the University of Alabama at Birmingham, Birmingham, AL 35233, US
|