1
Lei Y, Huang XT, Guo X, Hang Katie Chan K, Gao L. DeepGRNCS: deep learning-based framework for jointly inferring gene regulatory networks across cell subpopulations. Brief Bioinform 2024; 25:bbae334. [PMID: 38980373 PMCID: PMC11232306 DOI: 10.1093/bib/bbae334] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Revised: 06/03/2024] [Accepted: 07/01/2024] [Indexed: 07/10/2024]
Abstract
Inferring gene regulatory networks (GRNs) allows us to obtain a deeper understanding of cellular function and disease pathogenesis. Recent advances in single-cell RNA sequencing (scRNA-seq) technology have improved the accuracy of GRN inference. However, many methods for inferring individual GRNs from scRNA-seq data are limited because they overlook intercellular heterogeneity and similarities between different cell subpopulations, which are often present in the data. Here, we propose a deep learning-based framework, DeepGRNCS, for jointly inferring GRNs across cell subpopulations. We follow the commonly accepted hypothesis that the expression of a target gene can be predicted based on the expression of transcription factors (TFs) due to underlying regulatory relationships. We initially processed scRNA-seq data by discretizing data scattering using the equal-width method. Then, we trained deep learning models to predict target gene expression from TFs. By individually removing each TF from the expression matrix, we used pre-trained deep model predictions to infer regulatory relationships between TFs and genes, thereby constructing the GRN. Our method outperforms existing GRN inference methods for various simulated and real scRNA-seq datasets. Finally, we applied DeepGRNCS to non-small cell lung cancer scRNA-seq data to identify key genes in each cell subpopulation and analyzed their biological relevance. In conclusion, DeepGRNCS effectively predicts cell subpopulation-specific GRNs. The source code is available at https://github.com/Nastume777/DeepGRNCS.
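The equal-width discretization step mentioned in the abstract can be illustrated with a minimal sketch. The function name `equal_width_bins`, the number of bins, and the toy expression values are assumptions made for illustration; the abstract does not state the bin count or how DeepGRNCS implements this step.

```python
def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins equal-width intervals over [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant vector has no spread, so everything falls into bin 0.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would otherwise land in bin n_bins, so clamp it
    # into the last valid bin (n_bins - 1).
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical expression values for one gene across six cells.
expression = [0.0, 1.0, 2.5, 5.0, 7.5, 10.0]
bins = equal_width_bins(expression, n_bins=4)
print(bins)
```

Each bin spans the same range of expression values, so sparsely populated ranges still receive their own bin; that is the property that distinguishes equal-width from equal-frequency binning.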
Affiliation(s)
- Yahui Lei
- School of Computer Science and Technology, Xidian University, Xi’an 710071, Shaanxi, China
- Xiao-Tai Huang
- School of Computer Science and Technology, Xidian University, Xi’an 710071, Shaanxi, China
- Xingli Guo
- School of Computer Science and Technology, Xidian University, Xi’an 710071, Shaanxi, China
- Kei Hang Katie Chan
- Department of Electrical Engineering, City University of Hong Kong, Hong Kong SAR, China
- Department of Biomedical Sciences, City University of Hong Kong, Hong Kong SAR, China
- Department of Epidemiology and Center for Global Cardiometabolic Health, Brown University, Providence, RI, United States
- Lin Gao
- School of Computer Science and Technology, Xidian University, Xi’an 710071, Shaanxi, China
2
Munshi TA, Jahan LN, Howladar MF, Hashan M. Prediction of gross calorific value from coal analysis using decision tree-based bagging and boosting techniques. Heliyon 2024; 10:e23395. [PMID: 38169874 PMCID: PMC10758790 DOI: 10.1016/j.heliyon.2023.e23395] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 11/12/2023] [Accepted: 12/03/2023] [Indexed: 01/05/2024]
Abstract
The calorific value of any fuel is one of the crucial parameters to grade the fuel's burning capability. The bomb calorimeter has historically been used to calculate coal's gross calorific value (GCV). However, for many years, engineers and scientists have been trying to measure coal's GCV without a bomb calorimeter, using only laboratory-derived ultimate and/or proximate analyses to eliminate tedious and time-consuming laboratory analyses. In this study, Extra trees, Bagging, Decision tree, and Adaptive boosting are developed for the first time in coal's GCV modeling. In addition, the prediction and computational efficiency of previously applied decision tree-based algorithms, such as Random forest, Gradient boosting, and XGBoost are investigated. Well-established empirical models, namely Schuster, Mazumdar, Channiwala and Parikh, Parikh et al. and Central Fuel Research Institute of India are examined to compare their efficiency with newly developed algorithms. Proximate and ultimate analysis parameters are ranked based on their significance in GCV modeling. The studied models are tuned using an exhaustive grid search technique. Statistical indexes, such as explained variance (EV), mean absolute error (MAE), coefficient of determination (R²), mean squared error (MSE), maximum error, minimum error, and mean absolute percentage error (MAPE) are used to critique these models. To accomplish the goals, 7430 data points containing ten coal features, namely ash, moisture, fixed carbon, volatile matter, hydrogen, carbon, sulfur, nitrogen, oxygen, and GCV are selected from the U.S. Geological Survey Coal Quality (COALQUAL) database. It has been found that, due to simplicity and location-specific constraints, empirical models could not correlate proximate and/or ultimate analyses with GCV. Bagging and boosting techniques tested here performed well with a coefficient of determination (R²) of over 0.97. The XGBoost model outperforms other tree-based algorithms with the highest coefficient of determination (R² of 0.9974) and lowest error values (MSE of 14703.3, max_error of 1027.2, MAE of 89.2, MAPE of 0.009). Ranked by performance from highest to lowest, the studied models are XGBoost, Extra trees, Random forest, Bagging, Gradient boosting, Decision tree, and Adaptive boosting. The correlation heatmap and scatterplots used here clearly indicate that oxygen and carbon are the most significant, whereas volatile matter and sulfur are the least essential rank parameters for GCV modeling. The strategy suggested in this research can aid engineers/operators in obtaining a rapid and accurate determination of the GCV with a few coal features, thus lessening complicated, tedious, expensive, and time-consuming laboratory efforts.
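Several of the statistical indexes named in the abstract (MAE, MSE, maximum error, MAPE, and the coefficient of determination) have standard definitions that can be sketched in a few lines. The function name `regression_metrics` and the toy values below are illustrative assumptions, not the study's code or data; MAPE is returned as a fraction rather than a percentage, which is also an assumption about the convention behind the reported figure.

```python
def regression_metrics(y_true, y_pred):
    """Compute common regression error metrics from paired observations."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    mse = ss_res / n
    max_error = max(abs(e) for e in errors)
    # MAPE as a fraction; assumes no true value is zero.
    mape = sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "max_error": max_error, "MAPE": mape, "R2": r2}


# Hypothetical measured and predicted values, for illustration only.
measured = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]
metrics = regression_metrics(measured, predicted)
print(metrics)
```

Because MSE squares each residual, it is reported in squared units of the target and is far more sensitive to large individual errors than MAE, which is why the two can differ by orders of magnitude on the same model.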
Affiliation(s)
- Tanveer Alam Munshi
- Department of Petroleum and Mining Engineering, Shahjalal University of Science and Technology, Sylhet, 3114, Bangladesh
- Labiba Nusrat Jahan
- Department of Petroleum and Mining Engineering, Shahjalal University of Science and Technology, Sylhet, 3114, Bangladesh
- M. Farhad Howladar
- Department of Petroleum and Mining Engineering, Shahjalal University of Science and Technology, Sylhet, 3114, Bangladesh
- Mahamudul Hashan
- Department of Petroleum and Mining Engineering, Shahjalal University of Science and Technology, Sylhet, 3114, Bangladesh
3
Li L, Sun L, Chen G, Wong CW, Ching WK, Liu ZP. LogBTF: gene regulatory network inference using Boolean threshold network model from single-cell gene expression data. Bioinformatics 2023; 39:btad256. [PMID: 37079737 PMCID: PMC10172039 DOI: 10.1093/bioinformatics/btad256] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/28/2022] [Revised: 02/25/2023] [Accepted: 04/13/2023] [Indexed: 04/22/2023]
Abstract
MOTIVATION: From a systematic perspective, it is crucial to infer and analyze gene regulatory networks (GRNs) from high-throughput single-cell RNA sequencing data. However, most existing GRN inference methods mainly focus on the network topology, and only a few of them consider how to explicitly describe the updated logic rules of regulation in GRNs to obtain their dynamics. Moreover, some inference methods also fail to deal with the over-fitting problem caused by the noise in time series data. RESULTS: In this article, we propose a novel embedded Boolean threshold network method called LogBTF, which effectively infers GRNs by integrating regularized logistic regression and a Boolean threshold function. First, the continuous gene expression values are converted into Boolean values and the elastic net regression model is adopted to fit the binarized time series data. Then, the estimated regression coefficients are applied to represent the unknown Boolean threshold function of the candidate Boolean threshold network as the dynamical equations. To overcome the multi-collinearity and over-fitting problems, a new and effective approach is designed to optimize the network topology by adding a perturbation design matrix to the input data and thereafter setting sufficiently small elements of the output coefficient vector to zero. In addition, the cross-validation procedure is implemented into the Boolean threshold network model framework to strengthen the inference capability. Finally, extensive experiments on one simulated Boolean value dataset, dozens of simulation datasets, and three real single-cell RNA sequencing datasets demonstrate that the LogBTF method can infer GRNs from time series data more accurately than some other alternative methods for GRN inference. AVAILABILITY AND IMPLEMENTATION: The source data and code are available at https://github.com/zpliulab/LogBTF.
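A Boolean threshold network updates each gene to 1 when a weighted sum of the current gene states exceeds a threshold, and to 0 otherwise. The sketch below shows one generic form of that update rule. The function name `threshold_step`, the three-gene weights, the biases, and the strict `> 0` comparison are all hypothetical choices for illustration; the abstract does not specify the exact functional form LogBTF uses or how ties are handled.

```python
def threshold_step(state, weights, biases):
    """Apply one synchronous Boolean threshold update to all genes."""
    next_state = []
    for row, bias in zip(weights, biases):
        # Weighted input from every gene's current state plus an intercept.
        total = bias + sum(w * x for w, x in zip(row, state))
        next_state.append(1 if total > 0 else 0)
    return next_state


# Hypothetical 3-gene network: weights[i][j] is the effect of gene j on gene i.
weights = [
    [0.0, -1.0, 0.0],  # gene 0 is repressed by gene 1
    [1.0, 0.0, 0.0],   # gene 1 is activated by gene 0
    [1.0, 1.0, 0.0],   # gene 2 is activated by gene 0 or gene 1
]
biases = [0.5, -0.5, -0.5]

state = [1, 0, 0]
trajectory = [state]
for _ in range(4):
    state = threshold_step(state, weights, biases)
    trajectory.append(state)
print(trajectory)
```

In a regression-based setting such as the one the abstract describes, the rows of `weights` would play the role of fitted coefficients and `biases` the role of intercepts, with the sign of each coefficient indicating activation or repression.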
Affiliation(s)
- Lingyu Li
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan 250061, China
- Advanced Modeling and Applied Computing Laboratory, Department of Mathematics, The University of Hong Kong, Hong Kong, China
- Liangjie Sun
- Advanced Modeling and Applied Computing Laboratory, Department of Mathematics, The University of Hong Kong, Hong Kong, China
- Guangyi Chen
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan 250061, China
- Chi-Wing Wong
- Advanced Modeling and Applied Computing Laboratory, Department of Mathematics, The University of Hong Kong, Hong Kong, China
- Wai-Ki Ching
- Advanced Modeling and Applied Computing Laboratory, Department of Mathematics, The University of Hong Kong, Hong Kong, China
- Zhi-Ping Liu
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan 250061, China
4
Tu YH, Juan HF, Huang HC. Context-dependent gene regulatory network reveals regulation dynamics and cell trajectories using unspliced transcripts. Brief Bioinform 2023; 24:6991202. [PMID: 36653899 DOI: 10.1093/bib/bbac633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2022] [Revised: 12/06/2022] [Accepted: 12/29/2022] [Indexed: 01/20/2023]
Abstract
Gene regulatory networks govern complex gene expression programs in various biological phenomena, including embryonic development, cell fate decisions and oncogenesis. Single-cell techniques are increasingly being used to study gene expression, providing higher resolution than traditional approaches. However, inferring a comprehensive gene regulatory network across different cell types remains a challenge. Here, we propose to construct context-dependent gene regulatory networks (CDGRNs) from single-cell RNA sequencing data utilizing both spliced and unspliced transcript expression levels. A gene regulatory network is decomposed into subnetworks corresponding to different transcriptomic contexts. Each subnetwork comprises the consensus active regulation pairs of transcription factors and their target genes shared by a group of cells, inferred by a Gaussian mixture model. We find that the union of gene regulation pairs in all contexts is sufficient to reconstruct differentiation trajectories. Functions specific to the cell cycle, cell differentiation or tissue-specific functions are enriched throughout the developmental process in each context. Surprisingly, we also observe that the network entropy of CDGRNs decreases along differentiation trajectories, indicating directionality in differentiation. Overall, CDGRN allows us to establish the connection between gene regulation at the molecular level and cell differentiation at the macroscopic level.
Affiliation(s)
- Yueh-Hua Tu
- Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei, 115, Taiwan
- Taiwan International Graduate Program on Bioinformatics, National Taiwan University, Taipei, 106, Taiwan
- Hsueh-Fen Juan
- Taiwan International Graduate Program on Bioinformatics, National Taiwan University, Taipei, 106, Taiwan
- Department of Life Science, National Taiwan University, Taipei, 106, Taiwan
- Hsuan-Cheng Huang
- Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taipei, 112, Taiwan
5
Keyl P, Bischoff P, Dernbach G, Bockmayr M, Fritz R, Horst D, Blüthgen N, Montavon G, Müller KR, Klauschen F. Single-cell gene regulatory network prediction by explainable AI. Nucleic Acids Res 2023; 51:e20. [PMID: 36629274 PMCID: PMC9976884 DOI: 10.1093/nar/gkac1212] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 09/07/2022] [Revised: 11/16/2022] [Accepted: 12/06/2022] [Indexed: 01/12/2023]
Abstract
The molecular heterogeneity of cancer cells contributes to the often partial response to targeted therapies and relapse of disease due to the escape of resistant cell populations. While single-cell sequencing has started to improve our understanding of this heterogeneity, it offers a mostly descriptive view on cellular types and states. To obtain more functional insights, we propose scGeneRAI, an explainable deep learning approach that uses layer-wise relevance propagation (LRP) to infer gene regulatory networks from static single-cell RNA sequencing data for individual cells. We benchmark our method with synthetic data and apply it to single-cell RNA sequencing data of a cohort of human lung cancers. From the predicted single-cell networks our approach reveals characteristic network patterns for tumor cells and normal epithelial cells and identifies subnetworks that are observed only in (subgroups of) tumor cells of certain patients. While current state-of-the-art methods are limited by their ability to only predict average networks for cell populations, our approach facilitates the reconstruction of networks down to the level of single cells which can be utilized to characterize the heterogeneity of gene regulation within and across tumors.
Affiliation(s)
- Philipp Keyl
- Institute of Pathology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität Berlin, Charitéplatz 1, 10117 Berlin, Germany
- Philip Bischoff
- Institute of Pathology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität Berlin, Charitéplatz 1, 10117 Berlin, Germany
- Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Anna-Louisa-Karsch-Straße 2, 10178 Berlin, Germany
- German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Berlin partner site, Germany
- Gabriel Dernbach
- Institute of Pathology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität Berlin, Charitéplatz 1, 10117 Berlin, Germany
- BIFOLD – Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
- Michael Bockmayr
- Institute of Pathology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität Berlin, Charitéplatz 1, 10117 Berlin, Germany
- Department of Pediatric Hematology and Oncology, University Medical Center Hamburg-Eppendorf, Martinistr. 52, 20246 Hamburg, Germany
- Mildred Scheel Cancer Career Center HaTriCS4, University Medical Center Hamburg-Eppendorf, Martinistr. 52, 20246 Hamburg, Germany
- Rebecca Fritz
- Institute of Pathology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität Berlin, Charitéplatz 1, 10117 Berlin, Germany
- David Horst
- Institute of Pathology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität Berlin, Charitéplatz 1, 10117 Berlin, Germany
- German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Berlin partner site, Germany
- Nils Blüthgen
- Institute of Pathology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität Berlin, Charitéplatz 1, 10117 Berlin, Germany
- Institut für Biologie, Humboldt University, Free University of Berlin, Unter den Linden 6, 10099 Berlin, Germany
- Grégoire Montavon
- BIFOLD – Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
- Machine Learning Group, Technical University of Berlin, Marchstr. 23, 10587 Berlin, Germany
- Klaus-Robert Müller
- BIFOLD – Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
- Machine Learning Group, Technical University of Berlin, Marchstr. 23, 10587 Berlin, Germany
- Department of Artificial Intelligence, Korea University, Seoul 136-713, South Korea
- Max-Planck-Institute for Informatics, Stuhlsatzenhausweg 4, 66123 Saarbrücken, Germany
- Frederick Klauschen
- Institute of Pathology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität Berlin, Charitéplatz 1, 10117 Berlin, Germany
- German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Berlin partner site, Germany
- BIFOLD – Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
- Institute of Pathology, Ludwig-Maximilians-University Munich, Thalkirchner Str. 36, 80337 München, Germany
- German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Munich partner site, Germany
6
Su M, Pan T, Chen QZ, Zhou WW, Gong Y, Xu G, Yan HY, Li S, Shi QZ, Zhang Y, He X, Jiang CJ, Fan SC, Li X, Cairns MJ, Wang X, Li YS. Data analysis guidelines for single-cell RNA-seq in biomedical studies and clinical applications. Mil Med Res 2022; 9:68. [PMID: 36461064 PMCID: PMC9716519 DOI: 10.1186/s40779-022-00434-8] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 09/27/2022] [Accepted: 11/18/2022] [Indexed: 12/03/2022]
Abstract
The application of single-cell RNA sequencing (scRNA-seq) in biomedical research has advanced our understanding of the pathogenesis of disease and provided valuable insights into new diagnostic and therapeutic strategies. With the expansion of capacity for high-throughput scRNA-seq, including clinical samples, the analysis of these huge volumes of data has become a daunting prospect for researchers entering this field. Here, we review the workflow for typical scRNA-seq data analysis, covering raw data processing and quality control, basic data analysis applicable for almost all scRNA-seq data sets, and advanced data analysis that should be tailored to specific scientific questions. While summarizing the current methods for each analysis step, we also provide an online repository of software and wrapped-up scripts to support the implementation. Recommendations and caveats are pointed out for some specific analysis tasks and approaches. We hope this resource will be helpful to researchers engaging with scRNA-seq, in particular for emerging clinical applications.
Affiliation(s)
- Min Su
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
- Tao Pan
- College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
- Qiu-Zhen Chen
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
- Wei-Wei Zhou
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081 Heilongjiang China
- Yi Gong
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
- Department of Immunology, Nanjing Medical University, Nanjing, 211166 China
- Gang Xu
- College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
- Huan-Yu Yan
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
- Si Li
- College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
- Qiao-Zhen Shi
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
- Ya Zhang
- College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
- Xiao He
- Department of Laboratory Medicine, Women and Children’s Hospital of Chongqing Medical University, Chongqing, 401174 China
- Shi-Cai Fan
- Shenzhen Institute for Advanced Study, University of Electronic Science and Technology of China, Shenzhen, 518110 Guangdong China
- Xia Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081 Heilongjiang China
- Murray J. Cairns
- School of Biomedical Sciences and Pharmacy, Faculty of Health and Medicine, the University of Newcastle, University Drive, Callaghan, NSW 2308 Australia
- Precision Medicine Research Program, Hunter Medical Research Institute, New Lambton Heights, NSW 2305 Australia
- Xi Wang
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
- Yong-Sheng Li
- College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
7
Mao G, Zeng R, Peng J, Zuo K, Pang Z, Liu J. Reconstructing gene regulatory networks of biological function using differential equations of multilayer perceptrons. BMC Bioinformatics 2022; 23:503. [PMID: 36434499 PMCID: PMC9700916 DOI: 10.1186/s12859-022-05055-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2022] [Accepted: 11/14/2022] [Indexed: 11/26/2022]
Abstract
BACKGROUND: Building biological networks with a certain function is a challenge in systems biology. For the functionality of small (fewer than ten nodes) biological networks, most methods are implemented by exhausting all possible network topological spaces. This exhaustive approach is difficult to scale to large-scale biological networks. Moreover, regulatory relationships are complex and often nonlinear or non-monotonic, which makes inference using linear models challenging. RESULTS: In this paper, we propose a multi-layer perceptron-based differential equation method, which operates by training a fully connected neural network (NN) to simulate the transcription rate of genes in traditional differential equations. We verify whether the regulatory network constructed by the NN method can continue to achieve the expected biological function by verifying the degree of overlap between the regulatory network discovered by NN and the regulatory network constructed by the Hill function. We also validate our approach by adapting to noise signals, regulator knockout, and constructing large-scale gene regulatory networks using link-knockout techniques. We apply a real dataset (the mesoderm inducer Xenopus Brachyury expression) to construct the core topology of the gene regulatory network and find that Xbra is only strongly expressed at moderate levels of activin signaling. CONCLUSION: We have demonstrated from the results that this method has the ability to identify the underlying network topology and functional mechanisms, and can also be applied to larger and more complex gene network topologies.
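The Hill-function baseline mentioned in the abstract refers to the conventional way of writing a gene's transcription rate inside a differential equation. A minimal sketch of that conventional form, for one target gene activated by one regulator and integrated with a forward Euler step, is given below. The function names, the parameter values, and the single-regulator setup are illustrative assumptions; they are not taken from the paper, and the paper's method replaces the hand-written rate term with a trained neural network.

```python
def hill_activation(regulator, k_half, hill_n):
    """Activating Hill function: 0 at no regulator, approaching 1 at saturation."""
    r_n = regulator ** hill_n
    return r_n / (k_half ** hill_n + r_n)


def simulate_target(regulator, beta, decay, k_half, hill_n,
                    x0=0.0, dt=0.01, steps=2000):
    """Integrate dx/dt = beta * hill(regulator) - decay * x with forward Euler."""
    x = x0
    for _ in range(steps):
        rate = beta * hill_activation(regulator, k_half, hill_n) - decay * x
        x += dt * rate
    return x


# Hypothetical parameters: with the regulator held at its half-saturation
# level, the Hill term is 0.5, so the steady state is beta * 0.5 / decay.
final_level = simulate_target(regulator=1.0, beta=2.0, decay=1.0,
                              k_half=1.0, hill_n=2)
print(final_level)
```

The Hill coefficient `hill_n` controls how switch-like the response is; a learned rate function, as proposed in the abstract, removes the need to fix this shape in advance.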
Affiliation(s)
- Guo Mao
- Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Deya Road, Changsha, 410073 China
- Ruigeng Zeng
- Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Deya Road, Changsha, 410073 China
- Jintao Peng
- Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Deya Road, Changsha, 410073 China
- Ke Zuo
- Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Deya Road, Changsha, 410073 China
- Zhengbin Pang
- Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Deya Road, Changsha, 410073 China
- Jie Liu
- Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Deya Road, Changsha, 410073 China
- Laboratory of Software Engineering for Complex System, National University of Defense Technology, Deya Road, Changsha, 410073 China
8
Roels J, Van Hulle J, Lavaert M, Kuchmiy A, Strubbe S, Putteman T, Vandekerckhove B, Leclercq G, Van Nieuwerburgh F, Boehme L, Taghon T. Transcriptional dynamics and epigenetic regulation of E and ID protein encoding genes during human T cell development. Front Immunol 2022; 13:960918. [PMID: 35967340 PMCID: PMC9366357 DOI: 10.3389/fimmu.2022.960918] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2022] [Accepted: 07/05/2022] [Indexed: 12/05/2022]
Abstract
T cells are generated from hematopoietic stem cells through a highly organized developmental process, in which stage-specific molecular events drive maturation towards αβ and γδ T cells. Although many of the mechanisms that control αβ- and γδ-lineage differentiation are shared between human and mouse, important differences have also been observed. Here, we studied the regulatory dynamics of the E and ID protein encoding genes during pediatric human T cell development by evaluating changes in chromatin accessibility, histone modifications and bulk and single cell gene expression. We profiled patterns of ID/E protein activity and identified up- and downstream regulators and targets, respectively. In addition, we compared transcription of E and ID protein encoding genes in human versus mouse to predict both shared and unique activities in these species, and in prenatal versus pediatric human T cell differentiation to identify regulatory changes during development. This analysis showed a putative involvement of TCF3/E2A in the development of γδ T cells. In contrast, in αβ T cell precursors a pivotal pre-TCR-driven population with high ID gene expression and low predicted E protein activity was identified. Finally, in prenatal but not postnatal thymocytes, high HEB/TCF12 levels were found to counteract high ID levels to sustain thymic development. In summary, we uncovered novel insights in the regulation of E and ID proteins on a cross-species and cross-developmental level.
MESH Headings
- Animals
- Cell Differentiation/genetics
- Child
- Epigenesis, Genetic
- Hematopoietic Stem Cells/metabolism
- Humans
- Mice
- Receptors, Antigen, T-Cell, alpha-beta/genetics
- Receptors, Antigen, T-Cell, alpha-beta/metabolism
- Receptors, Antigen, T-Cell, gamma-delta/genetics
- Receptors, Antigen, T-Cell, gamma-delta/metabolism
- Transcription Factors/metabolism
Affiliation(s)
- Juliette Roels
- Department of Diagnostic Sciences, Ghent University, Ghent, Belgium
- Department of Biomolecular Medicine, Ghent University, Ghent, Belgium
- Cancer Research Institute Ghent (CRIG), Ghent, Belgium
- Jolien Van Hulle
- Department of Diagnostic Sciences, Ghent University, Ghent, Belgium
- Marieke Lavaert
- Department of Diagnostic Sciences, Ghent University, Ghent, Belgium
- Anna Kuchmiy
- Department of Diagnostic Sciences, Ghent University, Ghent, Belgium
- Cancer Research Institute Ghent (CRIG), Ghent, Belgium
- Steven Strubbe
- Department of Diagnostic Sciences, Ghent University, Ghent, Belgium
- Tom Putteman
- Department of Diagnostic Sciences, Ghent University, Ghent, Belgium
- Bart Vandekerckhove
- Department of Diagnostic Sciences, Ghent University, Ghent, Belgium
- Cancer Research Institute Ghent (CRIG), Ghent, Belgium
- Georges Leclercq
- Department of Diagnostic Sciences, Ghent University, Ghent, Belgium
- Cancer Research Institute Ghent (CRIG), Ghent, Belgium
- Filip Van Nieuwerburgh
- Cancer Research Institute Ghent (CRIG), Ghent, Belgium
- Laboratory of Pharmaceutical Biotechnology, Ghent University, Ghent, Belgium
- Lena Boehme
- Department of Diagnostic Sciences, Ghent University, Ghent, Belgium
- *Correspondence: Lena Boehme; Tom Taghon
- Tom Taghon
- Department of Diagnostic Sciences, Ghent University, Ghent, Belgium
- Cancer Research Institute Ghent (CRIG), Ghent, Belgium
- *Correspondence: Lena Boehme; Tom Taghon