1
Liu J, Ma J, Wen J, Zhou X. A Cell Cycle-Aware Network for Data Integration and Label Transferring of Single-Cell RNA-Seq and ATAC-Seq. Adv Sci (Weinh) 2024:e2401815. [PMID: 38887194 DOI: 10.1002/advs.202401815] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2024] [Revised: 04/22/2024] [Indexed: 06/20/2024]
Abstract
In recent years, the integration of single-cell multi-omics data has provided a more comprehensive understanding of cell functions and internal regulatory mechanisms than any single omics layer alone, but it still faces many challenges, such as omics-variance, sparsity, cell heterogeneity, and confounding factors. The cell cycle is known to be a confounder when analyzing other factors in single-cell RNA-seq data, but it is not clear how it affects integrated single-cell multi-omics data. Here, a cell cycle-aware network (CCAN) is developed to remove cell cycle effects from the integrated single-cell multi-omics data while keeping the cell type-specific variations. This is the first computational model to study the cell-cycle effects in the integration of single-cell multi-omics data. Validations on several benchmark datasets show the outstanding performance of CCAN in a variety of downstream analyses and applications, including removing cell cycle effects and batch effects of scRNA-seq datasets from different protocols, integrating paired and unpaired scRNA-seq and scATAC-seq data, accurately transferring cell type labels from scRNA-seq to scATAC-seq data, and characterizing the differentiation process from hematopoietic stem cells to different lineages in the integration of differentiation data.
Affiliation(s)
- Jiajia Liu
- Center for Computational Systems Medicine, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Jian Ma
- Department of Electronic Information and Computer Engineering, The Engineering & Technical College of Chengdu University of Technology, Leshan, Sichuan, 614000, China
- Jianguo Wen
- Center for Computational Systems Medicine, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Xiaobo Zhou
- Center for Computational Systems Medicine, McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
2
Sun H, Qiu J, Qiu J. Epigenetic regulation of innate lymphoid cells. Eur J Immunol 2024:e2350379. [PMID: 38824666 DOI: 10.1002/eji.202350379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 05/17/2024] [Accepted: 05/21/2024] [Indexed: 06/04/2024]
Abstract
Innate lymphoid cells (ILCs) lack antigen-specific receptors and are considered the innate arm of the immune system, phenotypically and functionally mirroring CD4+ helper T cells. ILCs are categorized into groups 1, 2, and 3 based on transcription factors and cytokine expression. ILCs predominantly reside in mucosal tissues and play important roles in regional immune responses. The development and function of ILC subsets are controlled by both transcriptional and epigenetic mechanisms, which have been extensively studied in recent years. Epigenetic regulation refers to heritable changes in gene expression that occur without affecting DNA sequences. This mainly includes chromatin status, histone modifications, and DNA methylation. In this review, we summarize recent discoveries on epigenetic mechanisms regulating ILC development and function, and how these regulations affect disease progression under pathological conditions. Although the ablation of specific epigenetic regulators can cause global changes in corresponding epigenetic modifications to the chromatin, only a subset of genes with altered epigenetic modifications change their mRNA expression, resulting in specific outcomes in cell differentiation and function. Therefore, elucidating epigenetic mechanisms underlying the regulation of ILCs will provide potential targets for the diagnosis and treatment of inflammatory diseases.
Affiliation(s)
- Hanxiao Sun
- Department of Laboratory Medicine, Department of Blood Transfusion, Tongren Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Jinxin Qiu
- CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Ju Qiu
- CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
3
Kuraz Abebe B, Wang J, Guo J, Wang H, Li A, Zan L. A review of the role of epigenetic studies for intramuscular fat deposition in beef cattle. Gene 2024; 908:148295. [PMID: 38387707 DOI: 10.1016/j.gene.2024.148295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Revised: 01/23/2024] [Accepted: 02/15/2024] [Indexed: 02/24/2024]
Abstract
Intramuscular fat (IMF) deposition profoundly influences meat quality and economic value in beef cattle production. Meanwhile, contemporary developments in epigenetics have opened new outlooks for understanding the molecular basis of IMF regulation, and it has become a key area of research worldwide. Therefore, the aim of this paper was to provide insight and synthesis into the intricate relationship between epigenetic mechanisms and IMF deposition in beef cattle. The methodology involves a thorough analysis of existing literature, including pertinent books, academic journals, and online resources, to provide a comprehensive overview of the role of epigenetic studies in IMF deposition in beef cattle. This review summarizes the contemporary studies in epigenetic mechanisms in IMF regulation, high-resolution epigenomic mapping, single-cell epigenomics, multi-omics integration, epigenome editing approaches, longitudinal studies in cattle growth, environmental epigenetics, machine learning in epigenetics, ethical and regulatory considerations, and translation to industry practices from perspectives of IMF deposition in beef cattle. Moreover, this paper highlights DNA methylation, histone modifications, acetylation, phosphorylation, ubiquitylation, non-coding RNAs, DNA hydroxymethylation, epigenetic readers, writers, and erasers, chromatin immunoprecipitation followed by sequencing, whole genome bisulfite sequencing, epigenome-wide association studies, and their profound impact on the expression of crucial genes governing adipogenesis and lipid metabolism. Nutrition and stress also have significant influences on epigenetic modifications and IMF deposition. The key findings underscore the pivotal role of epigenetic studies in understanding and enhancing IMF deposition in beef cattle, with implications for precision livestock farming and ethical livestock management.
In conclusion, this review highlights the crucial significance of epigenetic pathways and environmental factors in affecting IMF deposition in beef cattle, providing insightful information for improving the economics and meat quality of cattle production.
Affiliation(s)
- Belete Kuraz Abebe
- College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, People's Republic of China; Department of Animal Science, Werabe University, P.O. Box 46, Werabe, Ethiopia
- Jianfang Wang
- College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, People's Republic of China
- Juntao Guo
- College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, People's Republic of China
- Hongbao Wang
- College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, People's Republic of China
- Anning Li
- College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, People's Republic of China
- Linsen Zan
- College of Animal Science and Technology, Northwest A&F University, Yangling, Shaanxi 712100, People's Republic of China; National Beef Cattle Improvement Center, Northwest A&F University, Yangling, Shaanxi 712100, People's Republic of China.
4
Shu C, Street K, Breton CV, Bastain TM, Wilson ML. A review of single-cell transcriptomics and epigenomics studies in maternal and child health. Epigenomics 2024:1-20. [PMID: 38709139 DOI: 10.1080/17501911.2024.2343276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Accepted: 04/11/2024] [Indexed: 05/07/2024] Open
Abstract
Single-cell sequencing technologies enhance our understanding of cellular dynamics throughout pregnancy. We outlined the workflow of single-cell sequencing techniques and reviewed single-cell studies in maternal and child health. We conducted a literature review of single-cell studies on maternal and child health using PubMed. We summarized the findings from 16 single-cell atlases of the human and mammalian placenta across gestational stages and 31 single-cell studies on maternal exposures and complications including infection, obesity, diet, gestational diabetes, pre-eclampsia, environmental exposure and preterm birth. Single-cell studies provide insights into novel cell types in the placenta and cell type-specific marks associated with maternal exposures and complications.
Affiliation(s)
- Chang Shu
- Center for Genetic Epidemiology, Division of Epidemiology & Genetics, Department of Population & Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA USA
- Kelly Street
- Division of Biostatistics, Department of Population & Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA USA
- Carrie V Breton
- Division of Environmental Health, Department of Population & Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA USA
- Theresa M Bastain
- Division of Environmental Health, Department of Population & Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA USA
- Melissa L Wilson
- Division of Disease Prevention, Policy, & Global Health, Department of Population & Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA USA
5
Li J, Pan X, Yuan Y, Shen HB. TFvelo: gene regulation inspired RNA velocity estimation. Nat Commun 2024; 15:1387. [PMID: 38360714 DOI: 10.1038/s41467-024-45661-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2023] [Accepted: 01/30/2024] [Indexed: 02/17/2024] Open
Abstract
RNA velocity is closely related to cell fate and is an important indicator for predicting cell states, with an elegant physical explanation derived from single-cell RNA-seq data. Most existing RNA velocity models aim to extract dynamics from the phase delay between unspliced and spliced mRNA for each individual gene. However, unspliced/spliced mRNA abundance may not provide sufficient signal for dynamic modeling, leading to poor fit in phase portraits. Motivated by the idea that RNA velocity could be driven by transcriptional regulation, we propose TFvelo, which extends the RNA velocity concept to various single-cell datasets without relying on splicing information, by introducing gene regulatory information. Our experiments on synthetic data and multiple scRNA-Seq datasets show that TFvelo can accurately fit gene dynamics on phase portraits, and effectively infer cell pseudo-time and trajectory from RNA abundance data. TFvelo opens a robust and accurate avenue for modeling RNA velocity for single-cell data.
Affiliation(s)
- Jiachen Li
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China
- Xiaoyong Pan
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China
- Ye Yuan
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China.
- Hong-Bin Shen
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China.
6
Martens LD, Fischer DS, Yépez VA, Theis FJ, Gagneur J. Modeling fragment counts improves single-cell ATAC-seq analysis. Nat Methods 2024; 21:28-31. [PMID: 38049697 PMCID: PMC10776385 DOI: 10.1038/s41592-023-02112-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/03/2022] [Accepted: 10/25/2023] [Indexed: 12/06/2023]
Abstract
Single-cell ATAC sequencing coverage in regulatory regions is typically binarized as an indicator of open chromatin. Here we show that binarization is an unnecessary step that neither improves goodness of fit, clustering, cell type identification nor batch integration. Fragment counts, but not read counts, should instead be modeled, which preserves quantitative regulatory information. These results have immediate implications for single-cell ATAC sequencing analysis.
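As a rough illustration of the distinction this abstract draws, the sketch below contrasts binarizing a cell-by-peak count matrix with keeping quantitative fragment counts. The toy matrix, the function names `binarize` and `reads_to_fragments`, and the halve-and-round-up conversion are assumptions made for illustration; the abstract does not say how fragment counts are derived. The conversion assumes paired-end data in which one fragment contributes at most two reads to a peak.

```python
import numpy as np


def binarize(counts):
    """Collapse per-peak counts to a 0/1 open-chromatin indicator."""
    return (np.asarray(counts) > 0).astype(int)


def reads_to_fragments(read_counts):
    """Approximate fragment counts from read counts.

    Assumption (not from the abstract): each fragment yields up to two
    reads in a peak, so the read count is halved and rounded up.
    """
    return np.ceil(np.asarray(read_counts) / 2).astype(int)


# Hypothetical toy matrix: 2 cells (rows) x 4 peaks (columns) of read counts.
reads = np.array([[0, 1, 2, 4],
                  [3, 0, 6, 1]])

binary = binarize(reads)               # discards all quantitative signal
fragments = reads_to_fragments(reads)  # keeps a count per peak

print(binary)
print(fragments)
```

In this toy example, the peaks with 2 and 4 reads in the first cell become indistinguishable after binarization, while the fragment-count matrix still separates them.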
Affiliation(s)
- Laura D Martens
- School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
- Computational Health Center, Helmholtz Center Munich, Neuherberg, Germany
- Helmholtz Association, Munich School for Data Science (MUDS), Munich, Germany
- David S Fischer
- Computational Health Center, Helmholtz Center Munich, Neuherberg, Germany
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
- Vicente A Yépez
- School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
- Fabian J Theis
- School of Computation, Information and Technology, Technical University of Munich, Garching, Germany.
- Computational Health Center, Helmholtz Center Munich, Neuherberg, Germany.
- Helmholtz Association, Munich School for Data Science (MUDS), Munich, Germany.
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany.
- Julien Gagneur
- School of Computation, Information and Technology, Technical University of Munich, Garching, Germany.
- Computational Health Center, Helmholtz Center Munich, Neuherberg, Germany.
- Helmholtz Association, Munich School for Data Science (MUDS), Munich, Germany.
- Institute of Human Genetics, School of Medicine, Technical University of Munich, Munich, Germany.
7
Ramakrishnan M, Zhou M, Ceasar SA, Ali DJ, Maharajan T, Vinod KK, Sharma A, Ahmad Z, Wei Q. Epigenetic modifications and miRNAs determine the transition of somatic cells into somatic embryos. Plant Cell Rep 2023; 42:1845-1873. [PMID: 37792027 DOI: 10.1007/s00299-023-03071-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/02/2023] [Accepted: 09/13/2023] [Indexed: 10/05/2023]
Abstract
KEY MESSAGE: This review discusses the epigenetic changes during somatic embryo (SE) development, highlights the genes and miRNAs involved in the transition of somatic cells into SEs as a result of epigenetic changes, and draws insights on biotechnological opportunities to study SE development. Somatic embryogenesis from somatic cells occurs in a series of steps. The transition of somatic cells into somatic embryos (SEs) is the most critical step under genetic and epigenetic regulations. Major regulatory genes such as SERK, WUS, BBM, FUS3/FUSA3, AGL15, and PKL control SE steps and development by turning on and off other regulatory genes. Gene transcription profiles of somatic cells during SE development are the result of epigenetic changes, such as DNA and histone protein modifications, that control and decide the fate of SE formation. Depending on the type of somatic cells and the treatment with plant growth regulators, epigenetic changes take place dynamically. Either hypermethylation or hypomethylation of SE-related genes promotes the transition of somatic cells. For example, the reduced levels of DNA methylation of SERK and WUS promote SE initiation. Histone modifications also promote SE induction by regulating SE-related genes in somatic cells. In addition, miRNAs contribute to the various stages of SE by regulating the expression of auxin signaling pathway genes (TIR1, AFB2, ARF6, and ARF8), transcription factors (CUC1 and CUC2), and growth-regulating factors (GRFs) involved in SE formation. These epigenetic and miRNA functions are unique and have the potential to regenerate bipolar structures from somatic cells when a pluripotent state is induced. However, an integrated overview of the key regulators involved in SE development and downstream processes is lacking.
Therefore, this review discusses epigenetic modifications involved in SE development, SE-related genes and miRNAs associated with epigenetics, and common cis-regulatory elements in the promoters of SE-related genes. Finally, we highlight future biotechnological opportunities to alter epigenetic pathways using the genome editing tool and to study the transition mechanism of somatic cells.
Affiliation(s)
- Muthusamy Ramakrishnan
- State Key Laboratory of Tree Genetics and Breeding, Co-Innovation Center for Sustainable Forestry in Southern China, Bamboo Research Institute, Key Laboratory of National Forestry and Grassland Administration On Subtropical Forest Biodiversity Conservation, School of Life Sciences, Nanjing Forestry University, Nanjing, 210037, Jiangsu, China
- Mingbing Zhou
- State Key Laboratory of Subtropical Silviculture, Bamboo Industry Institute, Zhejiang A&F University, Lin'an, Hangzhou, 311300, Zhejiang, China
- Zhejiang Provincial Collaborative Innovation Center for Bamboo Resources and High-Efficiency Utilization, Zhejiang A&F University, Lin'an, Hangzhou, 311300, Zhejiang, China
- Stanislaus Antony Ceasar
- Department of Biosciences, Rajagiri College of Social Sciences (Autonomous), Kalamassery, Kochi, 683104, Kerala, India
- Doulathunnisa Jaffar Ali
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, 210096, Jiangsu, China
- Theivanayagam Maharajan
- Department of Biosciences, Rajagiri College of Social Sciences (Autonomous), Kalamassery, Kochi, 683104, Kerala, India
- Anket Sharma
- State Key Laboratory of Subtropical Silviculture, Bamboo Industry Institute, Zhejiang A&F University, Lin'an, Hangzhou, 311300, Zhejiang, China
- Zishan Ahmad
- State Key Laboratory of Tree Genetics and Breeding, Co-Innovation Center for Sustainable Forestry in Southern China, Bamboo Research Institute, Key Laboratory of National Forestry and Grassland Administration On Subtropical Forest Biodiversity Conservation, School of Life Sciences, Nanjing Forestry University, Nanjing, 210037, Jiangsu, China
- Qiang Wei
- State Key Laboratory of Tree Genetics and Breeding, Co-Innovation Center for Sustainable Forestry in Southern China, Bamboo Research Institute, Key Laboratory of National Forestry and Grassland Administration On Subtropical Forest Biodiversity Conservation, School of Life Sciences, Nanjing Forestry University, Nanjing, 210037, Jiangsu, China.
8
Wang Q, Zhang J, Liu Z, Duan Y, Li C. Integrative approaches based on genomic techniques in the functional studies on enhancers. Brief Bioinform 2023; 25:bbad442. [PMID: 38048082 PMCID: PMC10694556 DOI: 10.1093/bib/bbad442] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Revised: 10/22/2023] [Accepted: 11/08/2023] [Indexed: 12/05/2023] Open
Abstract
With the development of sequencing technology and the dramatic drop in sequencing cost, the functions of noncoding genes are being characterized in a wide variety of fields (e.g. biomedicine). Enhancers are noncoding DNA elements with vital transcription regulation functions. Tens of thousands of enhancers have been identified in the human genome; however, the location, function, target genes and regulatory mechanisms of most enhancers have not been elucidated thus far. As high-throughput sequencing techniques have advanced, omics approaches have been extensively employed in enhancer research. Multidimensional genomic data integration enables the full exploration of the data and provides novel perspectives for screening, identification and characterization of the function and regulatory mechanisms of unknown enhancers. However, multidimensional genomic data are still difficult to integrate genome-wide owing to their complex variety, massive volume, high rarity, etc. To facilitate the choice of appropriate methods for studying enhancers efficiently, we delineate the principles, data processing modes and progress of various omics approaches to study enhancers and summarize the applications of traditional machine learning and deep learning in multi-omics integration in the enhancer field. In addition, the challenges encountered during the integration of multiple omics data are addressed. Overall, this review provides a comprehensive foundation for enhancer analysis.
Affiliation(s)
- Qilin Wang
- School of Engineering Medicine, Beihang University, Beijing 100191, China
- School of Biological Science and Medical Engineering, Beihang University, Beijing 100191, China
- Junyou Zhang
- School of Engineering Medicine, Beihang University, Beijing 100191, China
- School of Biological Science and Medical Engineering, Beihang University, Beijing 100191, China
- Zhaoshuo Liu
- School of Engineering Medicine, Beihang University, Beijing 100191, China
- School of Biological Science and Medical Engineering, Beihang University, Beijing 100191, China
- Yingying Duan
- School of Engineering Medicine, Beihang University, Beijing 100191, China
- School of Biological Science and Medical Engineering, Beihang University, Beijing 100191, China
- Chunyan Li
- School of Engineering Medicine, Beihang University, Beijing 100191, China
- School of Biological Science and Medical Engineering, Beihang University, Beijing 100191, China
- Key Laboratory of Big Data-Based Precision Medicine (Ministry of Industry and Information Technology), Beihang University, Beijing 100191, China
- Beijing Advanced Innovation Center for Big Data-Based Precision Medicine, Beihang University, Beijing 100191, China
9
Lei Q, Yuan B, Liu K, Peng L, Xia Z. A novel prognostic related lncRNA signature associated with amino acid metabolism in glioma. Front Immunol 2023; 14:1014378. [PMID: 37114036 PMCID: PMC10126287 DOI: 10.3389/fimmu.2023.1014378] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2022] [Accepted: 03/13/2023] [Indexed: 04/29/2023] Open
Abstract
Background: Glioma is one of the deadliest malignant brain tumors in adults; it is highly invasive and has a poor prognosis, and long non-coding RNAs (lncRNAs) have key roles in its progression. Amino acid metabolism reprogramming is an emerging hallmark of cancer. However, the diverse amino acid metabolism programs and their prognostic value during glioma progression remain unclear. Thus, we aim to find potential amino acid metabolism-related prognostic hub genes in glioma, elaborate and verify their functions, and further explore their impact on glioma. Methods: Glioblastoma (GBM) and low-grade glioma (LGG) patients' data were downloaded from the TCGA and CGGA datasets. LncRNAs associated with amino acid metabolism were identified via correlation analysis. LASSO analysis and Cox regression analysis were conducted to identify lncRNAs related to prognosis. GSVA and GSEA were performed to predict the potential biological functions of the lncRNAs. Somatic mutation data and CNV data were further analyzed to demonstrate genomic alterations and their correlation with risk scores. Human glioma cell lines U251 and U87-MG were used for further validation in in vitro experiments. Results: Eight amino acid metabolism-related lncRNAs with high prognostic value were identified via Cox regression and LASSO regression analyses. The high risk-score group presented a significantly poorer prognosis compared with the low risk-score group, with more clinicopathological features and characteristic genomic aberrations. Our results provided new insights into the biological functions of the above signature lncRNAs, which participate in the amino acid metabolism of glioma. LINC01561, one of the eight identified lncRNAs, was selected for further verification. In in vitro experiments, siRNA-mediated LINC01561 silencing suppressed glioma cells' viability, migration, and proliferation.
Conclusion: Novel amino acid metabolism-related lncRNAs associated with the survival of glioma patients were identified, and the lncRNA signature can predict glioma prognosis and therapy response and may have vital roles in glioma. The study also emphasizes the importance of amino acid metabolism in glioma, particularly for deeper research at the molecular level.
Affiliation(s)
- Qiang Lei
- Department of Neurology, The Second Xiangya Hospital, Central South University, Changsha, Hunan, China
- Bo Yuan
- Department of Cerebrovascular Surgery, The Second People’s Hospital of Hunan Province, The Hospital of Hunan University of Chinese Medicine, Changsha, Hunan, China
- Kun Liu
- Department of Cerebrovascular Surgery, The Second People’s Hospital of Hunan Province, The Hospital of Hunan University of Chinese Medicine, Changsha, Hunan, China
- Li Peng
- Department of Ophthalmology, Central South University Xiangya School of Medicine Affiliated Haikou Hospital, Haikou, Hainan, China
- Department of Ophthalmology, The Second Xiangya Hospital, Central South University, Changsha, Hunan, China
- *Correspondence: Zhiwei Xia; Li Peng
- Zhiwei Xia
- Department of Neurology, Hunan Aerospace Hospital, Changsha, Hunan, China
- *Correspondence: Zhiwei Xia; Li Peng
10
Ramakrishnan M, Zhang Z, Mullasseri S, Kalendar R, Ahmad Z, Sharma A, Liu G, Zhou M, Wei Q. Epigenetic stress memory: A new approach to study cold and heat stress responses in plants. Front Plant Sci 2022; 13:1075279. [PMID: 36570899 PMCID: PMC9772030 DOI: 10.3389/fpls.2022.1075279] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 10/20/2022] [Accepted: 11/23/2022] [Indexed: 05/28/2023]
Abstract
Understanding plant stress memory under extreme temperatures such as cold and heat could contribute to plant development. Plants employ different types of stress memories, such as somatic, intergenerational and transgenerational, regulated by epigenetic changes such as DNA and histone modifications and microRNAs (miRNA), playing a key role in gene regulation from early development to maturity. In most cases, cold and heat stresses result in short-term epigenetic modifications that can return to baseline modification levels after stress cessation. Nevertheless, some of the modifications may be stable and passed on as stress memory, potentially allowing them to be inherited across generations, whereas some of the modifications are reactivated during sexual reproduction or embryogenesis. Several stress-related genes are involved in stress memory inheritance by turning on and off transcription profiles and epigenetic changes. Vernalization is the best example of somatic stress memory. Changes in the chromatin structure of the Flowering Locus C (FLC) gene, a MADS-box transcription factor (TF), maintain cold stress memory during mitosis. FLC expression suppresses flowering at high levels during winter; and during vernalization, B3 TFs, cold memory cis-acting element and polycomb repressive complex 1 and 2 (PRC1 and 2) silence FLC activation. In contrast, the repression of SQUAMOSA promoter-binding protein-like (SPL) TF and the activation of Heat Shock TF (HSFA2) are required for heat stress memory. However, it is still unclear how stress memory is inherited by offspring, and the integrated view of the regulatory mechanisms of stress memory and mitotic and meiotic heritable changes in plants is still scarce. Thus, in this review, we focus on the epigenetic regulation of stress memory and discuss the application of new technologies in developing epigenetic modifications to improve stress memory.
Affiliation(s)
- Muthusamy Ramakrishnan
- Co-Innovation Center for Sustainable Forestry in Southern China, Bamboo Research Institute, Key Laboratory of National Forestry and Grassland Administration on Subtropical Forest Biodiversity Conservation, College of Biology and the Environment, Nanjing Forestry University, Nanjing, Jiangsu, China
- Zhijun Zhang
- Bamboo Industry Institute, Zhejiang A&F University, Hangzhou, Zhejiang, China
- School of Forestry and Biotechnology, Zhejiang A&F University, Hangzhou, Zhejiang, China
- Sileesh Mullasseri
- Department of Zoology, St. Albert’s College (Autonomous), Kochi, Kerala, India
- Ruslan Kalendar
- Helsinki Institute of Life Science HiLIFE, Biocenter 3, University of Helsinki, Helsinki, Finland
- National Laboratory Astana, Nazarbayev University, Astana, Kazakhstan
- Zishan Ahmad
- Co-Innovation Center for Sustainable Forestry in Southern China, Bamboo Research Institute, Key Laboratory of National Forestry and Grassland Administration on Subtropical Forest Biodiversity Conservation, College of Biology and the Environment, Nanjing Forestry University, Nanjing, Jiangsu, China
- Anket Sharma
- State Key Laboratory of Subtropical Silviculture, Bamboo Industry Institute, Zhejiang A&F University, Hangzhou, Zhejiang, China
- Guohua Liu
- Co-Innovation Center for Sustainable Forestry in Southern China, Bamboo Research Institute, Key Laboratory of National Forestry and Grassland Administration on Subtropical Forest Biodiversity Conservation, College of Biology and the Environment, Nanjing Forestry University, Nanjing, Jiangsu, China
- Mingbing Zhou
- State Key Laboratory of Subtropical Silviculture, Bamboo Industry Institute, Zhejiang A&F University, Hangzhou, Zhejiang, China
- Zhejiang Provincial Collaborative Innovation Center for Bamboo Resources and High-Efficiency Utilization, Zhejiang A&F University, Hangzhou, Zhejiang, China
- Qiang Wei
- Co-Innovation Center for Sustainable Forestry in Southern China, Bamboo Research Institute, Key Laboratory of National Forestry and Grassland Administration on Subtropical Forest Biodiversity Conservation, College of Biology and the Environment, Nanjing Forestry University, Nanjing, Jiangsu, China
11
|
Wang L, Feng Y, Wang J, Jin X, Zhang Q, Ackah M, Wang Y, Xu D, Zhao W. ATAC-seq exposes differences in chromatin accessibility leading to distinct leaf shapes in mulberry. Plant Direct 2022; 6:e464. [PMID: 36540416 PMCID: PMC9755926 DOI: 10.1002/pld3.464] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/19/2022] [Accepted: 10/30/2022] [Indexed: 06/17/2023]
Abstract
Mulberry leaf shape is an important agronomic trait indicating yield, growth, development, and habitat variation. China was the earliest country in the world to grow mulberry for sericulture, and it is also one of the great contributions of the Chinese nation to human civilization. ATAC-seq (Assay for Transposase Accessible Chromatin using sequencing) is a recently developed technique for genome-wide analysis of chromatin accessibility. The samples used for ATAC sequencing in this study were divided into two groups of whole leaves (CK-1 and CK-2) and lobed leaves (HL-1 and HL-2), with two replicates in each group. The related motif analysis, differential expression motif screening, and functional annotation of mulberry leaf shape differences were performed by raw letter analysis to finally obtain the transcription factors (TFs) that lead to the production of heteromorphic leaves. These transcription factors are common in plants, especially the TCP family, shown to be associated with leaf development and growth in other woody plants and are a potential transcription factor responsible for leaf shape differences in mulberry. Dissecting the regulatory mechanisms of leaf shape of different forms of mulberry leaves by ATAC-seq is an important way to protect mulberry germplasm resources and improve mulberry yield. It is conducive to cultivating mulberry varieties with high resistance to adversity, promoting the sustainable development of sericulture, and protecting and improving the ecological environment.
Affiliation(s)
- Lei Wang
- School of Biology and Technology, Jiangsu University of Science and Technology, Zhenjiang, China
- Yuming Feng
- School of Biology and Technology, Jiangsu University of Science and Technology, Zhenjiang, China
- Jiangying Wang
- Leisure Agriculture Laboratory, Lianyungang Academy of Agricultural Sciences, Lianyungang, China
- Xin Jin
- School of Biology and Technology, Jiangsu University of Science and Technology, Zhenjiang, China
- Qiaonan Zhang
- School of Biology and Technology, Jiangsu University of Science and Technology, Zhenjiang, China
- Michael Ackah
- School of Biology and Technology, Jiangsu University of Science and Technology, Zhenjiang, China
- Yuhua Wang
- School of Biology and Technology, Jiangsu University of Science and Technology, Zhenjiang, China
- Dayong Xu
- Leisure Agriculture Laboratory, Lianyungang Academy of Agricultural Sciences, Lianyungang, China
- Weiguo Zhao
- School of Biology and Technology, Jiangsu University of Science and Technology, Zhenjiang, China
12
|
Shu H, Ding F, Zhou J, Xue Y, Zhao D, Zeng J, Ma J. Boosting single-cell gene regulatory network reconstruction via bulk-cell transcriptomic data. Brief Bioinform 2022; 23:6693602. [PMID: 36070863 DOI: 10.1093/bib/bbac389] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/30/2022] [Revised: 08/09/2022] [Accepted: 08/11/2022] [Indexed: 11/12/2022] Open
Abstract
Computational recovery of gene regulatory network (GRN) has recently undergone a great shift from bulk-cell towards designing algorithms targeting single-cell data. In this work, we investigate whether the widely available bulk-cell data could be leveraged to assist the GRN predictions for single cells. We infer cell-type-specific GRNs from both the single-cell RNA sequencing data and the generic GRN derived from the bulk cells by constructing a weakly supervised learning framework based on the axial transformer. We verify our assumption that the bulk-cell transcriptomic data are a valuable resource, which could improve the prediction of single-cell GRN by conducting extensive experiments. Our GRN-transformer achieves the state-of-the-art prediction accuracy in comparison to existing supervised and unsupervised approaches. In addition, we show that our method can identify important transcription factors and potential regulations for Alzheimer's disease risk genes by using the predicted GRN. Availability: The implementation of GRN-transformer is available at https://github.com/HantaoShu/GRN-Transformer.
Affiliation(s)
- Hantao Shu
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
- Fan Ding
- Department of Computer Science, Purdue University, IN 47907, United States
- Jingtian Zhou
- Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037, United States
- Bioinformatics Program, University of California, San Diego, La Jolla, CA 92093, United States
- Yexiang Xue
- Department of Computer Science, Purdue University, IN 47907, United States
- Dan Zhao
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
- Jianyang Zeng
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
- Jianzhu Ma
- Institute for Artificial Intelligence, Peking University, Beijing 100091, China
13
|
Shi P, Nie Y, Yang J, Zhang W, Tang Z, Xu J. Fundamental and practical approaches for single-cell ATAC-seq analysis. aBIOTECH 2022; 3:212-223. [PMID: 36313930 PMCID: PMC9590475 DOI: 10.1007/s42994-022-00082-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2022] [Accepted: 09/07/2022] [Indexed: 11/28/2022]
Abstract
Assays for transposase-accessible chromatin through high-throughput sequencing (ATAC-seq) are effective tools in the study of genome-wide chromatin accessibility landscapes. With the rapid development of single-cell technology, open chromatin regions that play essential roles in epigenetic regulation have been measured at the single-cell level using single-cell ATAC-seq approaches. The application of scATAC-seq has become as popular as that of scRNA-seq. However, owing to the nature of scATAC-seq data, which are sparse and noisy, processing the data requires different methodologies and empirical experience. This review presents a practical guide for processing scATAC-seq data, from quality evaluation to downstream analysis, for various applications. In addition to the epigenomic profiling from scATAC-seq, we also discuss recent studies in which the function of non-coding variants has been investigated based on cell type-specific cis-regulatory elements and how to use the by-product genetic information obtained from scATAC-seq to infer single-cell copy number variants and trace cell lineage. We anticipate that this review will assist researchers in designing and implementing scATAC-seq assays to facilitate research in diverse fields.
Affiliation(s)
- Peiyu Shi
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, 510275 China
- Yage Nie
- Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, 510275 China
- Jiawen Yang
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, 510275 China
- Weixing Zhang
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, 510275 China
- Zhongjie Tang
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, 510275 China
- Jin Xu
- State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou, 510275 China
14
|
Single-cell specific and interpretable machine learning models for sparse scChIP-seq data imputation. PLoS One 2022; 17:e0270043. [PMID: 35776722 PMCID: PMC9249201 DOI: 10.1371/journal.pone.0270043] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/17/2021] [Accepted: 06/02/2022] [Indexed: 11/19/2022] Open
Abstract
Motivation: Single-cell Chromatin ImmunoPrecipitation DNA-Sequencing (scChIP-seq) analysis is challenging due to data sparsity. A high degree of sparsity in biological high-throughput single-cell data is generally handled with imputation methods that complete the data, but specific methods for scChIP-seq are lacking. We present SIMPA, a scChIP-seq data imputation method leveraging predictive information within bulk data from the ENCODE project to impute missing protein-DNA interacting regions of target histone marks or transcription factors. Results: Imputations using machine learning models trained for each single cell, each ChIP protein target, and each genomic region accurately preserve cell type clustering and improve pathway-related gene identification on real human data. Results on bulk data simulating single cells show that the imputations are single-cell specific, as the imputed profiles are closer to the simulated cell than to other cells related to the same ChIP protein target and the same cell type. Simulations also show that 100 input genomic regions are already enough to train single-cell specific models for the imputation of thousands of undetected regions. Furthermore, SIMPA enables the interpretation of machine learning models by revealing interaction sites of a given single cell that are most important for the imputation model trained for a specific genomic region. The corresponding feature importance values derived from promoter-interaction profiles of H3K4me3, an activating histone mark, highly correlate with co-expression of genes that are present within the cell-type specific pathways in 2 real human and mouse datasets. SIMPA's interpretable imputation method allows users to gain a deep understanding of individual cells and, consequently, of sparse scChIP-seq datasets. Availability and implementation: Our interpretable imputation algorithm was implemented in Python and is available at https://github.com/salbrec/SIMPA.
15
|
Vadapalli S, Abdelhalim H, Zeeshan S, Ahmed Z. Artificial intelligence and machine learning approaches using gene expression and variant data for personalized medicine. Brief Bioinform 2022; 23:6590150. [PMID: 35595537 DOI: 10.1093/bib/bbac191] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 03/08/2022] [Revised: 04/02/2022] [Accepted: 04/26/2022] [Indexed: 12/16/2022] Open
Abstract
Precision medicine uses genetic, environmental and lifestyle factors to more accurately diagnose and treat disease in specific groups of patients, and it is considered one of the most promising medical efforts of our time. The use of genetics is arguably the most data-rich and complex component of precision medicine. The grand challenge today is the successful assimilation of genetics into precision medicine that translates across different ancestries, diverse diseases and other distinct populations, which will require clever use of artificial intelligence (AI) and machine learning (ML) methods. Our goal here was to review and compare scientific objectives, methodologies, datasets, data sources, ethics and gaps of AI/ML approaches used in genomics and precision medicine. We selected high-quality literature published within the last 5 years that was indexed and available through PubMed Central. Our scope was narrowed to articles that reported application of AI/ML algorithms for statistical and predictive analyses using whole genome and/or whole exome sequencing for gene variants, and RNA-seq and microarrays for gene expression. We did not limit our search to specific diseases or data sources. Based on the scope of our review and comparative analysis criteria, we identified 32 different AI/ML approaches applied in variable genomics studies and report widely adapted AI/ML algorithms for predictive diagnostics across several diseases.
Affiliation(s)
- Sreya Vadapalli
- Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA
- Habiba Abdelhalim
- Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA
- Saman Zeeshan
- Rutgers Cancer Institute of New Jersey, Rutgers University, 195 Little Albany St, New Brunswick, NJ, USA
- Zeeshan Ahmed
- Rutgers Institute for Health, Health Care Policy and Aging Research, Rutgers University, 112 Paterson St, New Brunswick, NJ, USA
- Department of Medicine, Robert Wood Johnson Medical School, Rutgers Biomedical and Health Sciences, 125 Paterson St, New Brunswick, NJ, USA
16
|
Xu S, Skarica M, Hwang A, Dai Y, Lee C, Girgenti MJ, Zhang J. Translator: A Transfer Learning Approach to Facilitate Single-Cell ATAC-Seq Data Analysis from Reference Dataset. J Comput Biol 2022; 29:619-633. [PMID: 35584295 PMCID: PMC9464368 DOI: 10.1089/cmb.2021.0596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Abstract
Recent advances in single-cell sequencing assay for transposase-accessible chromatin (scATAC-seq) have allowed simultaneous epigenetic profiling over thousands of individual cells to dissect the cellular heterogeneity and elucidate regulatory mechanisms at the finest possible resolution. However, scATAC-seq is challenging to model computationally due to the ultra-high dimensionality, low signal-to-noise ratio, complex feature interactions, and high vulnerability to various confounding factors. In this study, we present Translator, an efficient transfer learning approach to capture generalizable chromatin interactions from high-quality (HQ) reference scATAC-seq data to obtain robust cell representations in low-to-moderate quality target scATAC-seq data. We applied Translator on various simulated and real scATAC-seq datasets and demonstrated that Translator could learn more biologically meaningful cell representations than other methods by incorporating information learned from the reference data, thus facilitating various downstream analyses such as clustering and motif enrichment measurements. Moreover, Translator's block-wise deep learning framework can handle nonlinear relationships with restricted connections using fewer parameters to boost computational efficiency through Graphics Processing Unit (GPU) parallelism. Finally, we have implemented Translator as a free software package available for the community to leverage large-scale, HQ reference data to study target scATAC-seq data.
Affiliation(s)
- Siwei Xu
- Department of Computer Science, University of California, Irvine, California, USA
- Mario Skarica
- Department of Neuroscience, School of Medicine, Yale University, New Haven, Connecticut, USA
- Ahyeon Hwang
- Mathematical, Computational, and Systems Biology, University of California, Irvine, California, USA
- Yi Dai
- Department of Computer Science, University of California, Irvine, California, USA
- Cheyu Lee
- Department of Computer Science, University of California, Irvine, California, USA
- Matthew J Girgenti
- Department of Psychiatry, School of Medicine, Yale University, New Haven, Connecticut, USA
- Clinical Neurosciences Division, National Center for PTSD U.S. Department of Veterans Affairs, Washington, DC, USA
- Jing Zhang
- Department of Computer Science, University of California, Irvine, California, USA
17
|
Kalinina A, Lagace D. Single-Cell and Single-Nucleus RNAseq Analysis of Adult Neurogenesis. Cells 2022; 11:cells11101633. [PMID: 35626670 PMCID: PMC9139993 DOI: 10.3390/cells11101633] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/21/2022] [Revised: 05/02/2022] [Accepted: 05/07/2022] [Indexed: 02/04/2023] Open
Abstract
The complexity of adult neurogenesis is becoming increasingly apparent as we learn more about cellular heterogeneity and diversity of the neurogenic lineages and stem cell niches within the adult brain. This complexity has been unraveled in part due to single-cell and single-nucleus RNA sequencing (sc-RNAseq and sn-RNAseq) studies that have focused on adult neurogenesis. This review summarizes 33 published studies in the field of adult neurogenesis that have used sc- or sn-RNAseq methods to answer questions about the three main regions that host adult neural stem cells (NSCs): the subventricular zone (SVZ), the dentate gyrus (DG) of the hippocampus, and the hypothalamus. The review explores the similarities and differences in methodology between these studies and provides an overview of how these studies have advanced the field and expanded possibilities for the future.
18
|
Mani DR, Krug K, Zhang B, Satpathy S, Clauser KR, Ding L, Ellis M, Gillette MA, Carr SA. Cancer proteogenomics: current impact and future prospects. Nat Rev Cancer 2022; 22:298-313. [PMID: 35236940 DOI: 10.1038/s41568-022-00446-5] [Citation(s) in RCA: 66] [Impact Index Per Article: 33.0] [Accepted: 01/21/2022] [Indexed: 02/07/2023]
Abstract
Genomic analyses in cancer have been enormously impactful, leading to the identification of driver mutations and development of targeted therapies. But the functions of the vast majority of somatic mutations and copy number variants in tumours remain unknown, and the causes of resistance to targeted therapies and methods to overcome them are poorly defined. Recent improvements in mass spectrometry-based proteomics now enable direct examination of the consequences of genomic aberrations, providing deep and quantitative characterization of tumour tissues. Integration of proteins and their post-translational modifications with genomic, epigenomic and transcriptomic data constitutes the new field of proteogenomics, and is already leading to new biological and diagnostic knowledge with the potential to improve our understanding of malignant transformation and therapeutic outcomes. In this Review we describe recent developments in proteogenomics and key findings from the proteogenomic analysis of a wide range of cancers. Considerations relevant to the selection and use of samples for proteogenomics and the current technologies used to generate, analyse and integrate proteomic with genomic data are described. Applications of proteogenomics in translational studies and immuno-oncology are rapidly emerging, and the prospect for their full integration into therapeutic trials and clinical care seems bright.
Affiliation(s)
- D R Mani
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, USA
- Karsten Krug
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, USA
- Bing Zhang
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX, USA
- Shankha Satpathy
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, USA
- Karl R Clauser
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, USA
- Li Ding
- Department of Medicine and Genetics, Siteman Cancer Center, Washington University in St. Louis, St. Louis, MO, USA
- Matthew Ellis
- Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, TX, USA
- Michael A Gillette
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, USA
- Division of Pulmonary and Critical Care Medicine, Massachusetts General Hospital, Boston, MA, USA
- Steven A Carr
- Broad Institute of Massachusetts Institute of Technology and Harvard, Cambridge, MA, USA
19
|
Huang J, Sheng J, Wang D. Manifold learning analysis suggests strategies to align single-cell multimodal data of neuronal electrophysiology and transcriptomics. Commun Biol 2021; 4:1308. [PMID: 34799674 PMCID: PMC8604989 DOI: 10.1038/s42003-021-02807-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/26/2021] [Accepted: 10/26/2021] [Indexed: 12/30/2022] Open
Abstract
Recent single-cell multimodal data reveal multi-scale characteristics of single cells, such as transcriptomics, morphology, and electrophysiology. However, integrating and analyzing such multimodal data to deeper understand functional genomics and gene regulation in various cellular characteristics remains elusive. To address this, we applied and benchmarked multiple machine learning methods to align gene expression and electrophysiological data of single neuronal cells in the mouse brain from the Brain Initiative. We found that nonlinear manifold learning outperforms other methods. After manifold alignment, the cells form clusters highly corresponding to transcriptomic and morphological cell types, suggesting a strong nonlinear relationship between gene expression and electrophysiology at the cell-type level. Also, the electrophysiological features are highly predictable by gene expression on the latent space from manifold alignment. The aligned cells further show continuous changes of electrophysiological features, implying cross-cluster gene expression transitions. Functional enrichment and gene regulatory network analyses for those cell clusters revealed potential genome functions and molecular mechanisms from gene expression to neuronal electrophysiology.
Affiliation(s)
- Jiawei Huang
- Department of Statistics, University of Wisconsin - Madison, Madison, WI, 53706, USA
- Carl H. Lindner College of Business, University of Cincinnati, Cincinnati, OH, 45223, USA
- Jie Sheng
- Waisman Center, University of Wisconsin - Madison, Madison, WI, 53705, USA
- Daifeng Wang
- Waisman Center, University of Wisconsin - Madison, Madison, WI, 53705, USA
- Department of Biostatistics and Medical Informatics, University of Wisconsin - Madison, Madison, WI, 53706, USA
- Department of Computer Sciences, University of Wisconsin - Madison, Madison, WI, 53706, USA
20
|
Zhang H, Lu T, Liu S, Yang J, Sun G, Cheng T, Xu J, Chen F, Yen K. Comprehensive understanding of Tn5 insertion preference improves transcription regulatory element identification. NAR Genom Bioinform 2021; 3:lqab094. [PMID: 34729473 PMCID: PMC8557372 DOI: 10.1093/nargab/lqab094] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/28/2021] [Revised: 09/20/2021] [Accepted: 09/29/2021] [Indexed: 12/11/2022] Open
Abstract
Tn5 transposase, which can efficiently tagment the genome, has been widely adopted as a molecular tool in next-generation sequencing, from short-read sequencing to more complex methods such as assay for transposase-accessible chromatin using sequencing (ATAC-seq). Here, we systematically map Tn5 insertion characteristics across several model organisms, finding critical parameters that affect its insertion. On naked genomic DNA, we found that Tn5 insertion is not uniformly distributed or random. To uncover drivers of these biases, we used a machine learning framework, which revealed that DNA shape cooperatively works with DNA motif to affect Tn5 insertion preference. These intrinsic insertion preferences can be modeled using nucleotide dependence information from DNA sequences, and we developed a computational pipeline to correct for these biases in ATAC-seq data. Using our pipeline, we show that bias correction improves the overall performance of ATAC-seq peak detection, recovering many potential false-negative peaks. Furthermore, we found that these peaks are bound by transcription factors, underscoring the biological relevance of capturing this additional information. These findings highlight the benefits of an improved understanding and precise correction of Tn5 insertion preference.
Affiliation(s)
- Houyu Zhang
- State Key Laboratory of Experimental Hematology, National Clinical Research Center for Blood Diseases, Institute of Hematology and Blood Diseases Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Tianjin 300020, China
- Ting Lu
- State Key Laboratory of Experimental Hematology, National Clinical Research Center for Blood Diseases, Institute of Hematology and Blood Diseases Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Tianjin 300020, China
- Shan Liu
- State Key Laboratory of Experimental Hematology, National Clinical Research Center for Blood Diseases, Institute of Hematology and Blood Diseases Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Tianjin 300020, China
- Jianyu Yang
- Department of Developmental Biology, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
- Guohuan Sun
- State Key Laboratory of Experimental Hematology, National Clinical Research Center for Blood Diseases, Institute of Hematology and Blood Diseases Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Tianjin 300020, China
- Tao Cheng
- State Key Laboratory of Experimental Hematology, National Clinical Research Center for Blood Diseases, Institute of Hematology and Blood Diseases Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Tianjin 300020, China
- Jin Xu
- Division of Cell, Developmental and Integrative Biology, School of Medicine, South China University of Technology, Guangzhou 510006, China
- Fangyao Chen
- Department of Epidemiology and Biostatistics, School of Public Health, Xi'an Jiaotong University Health Science Center, Xi'an, Shaanxi 710061, China
- Kuangyu Yen
- State Key Laboratory of Experimental Hematology, National Clinical Research Center for Blood Diseases, Institute of Hematology and Blood Diseases Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Tianjin 300020, China
21
|
Chen S, Liu Q, Cui X, Feng Z, Li C, Wang X, Zhang X, Wang Y, Jiang R. OpenAnnotate: a web server to annotate the chromatin accessibility of genomic regions. Nucleic Acids Res 2021; 49:W483-W490. [PMID: 33999180 PMCID: PMC8262705 DOI: 10.1093/nar/gkab337] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 03/03/2021] [Revised: 04/12/2021] [Accepted: 04/20/2021] [Indexed: 12/13/2022] Open
Abstract
Chromatin accessibility, as a powerful marker of active DNA regulatory elements, provides valuable information for understanding regulatory mechanisms. The revolution in high-throughput methods has accumulated massive chromatin accessibility profiles in public repositories. Nevertheless, utilization of these data is hampered by cumbersome collection, time-consuming processing, and manual chromatin accessibility (openness) annotation of genomic regions. To fill this gap, we developed OpenAnnotate (http://health.tsinghua.edu.cn/openannotate/) as the first web server for efficiently annotating openness of massive genomic regions across various biosample types, tissues, and biological systems. In addition to the annotation resource from 2729 comprehensive profiles of 614 biosample types of human and mouse, OpenAnnotate provides user-friendly functionalities, ultra-efficient calculation, real-time browsing, intuitive visualization, and elaborate application notebooks. We show its unique advantages compared to existing databases and toolkits by effectively revealing cell type-specificity, identifying regulatory elements and 3D chromatin contacts, deciphering gene functional relationships, inferring functions of transcription factors, and unprecedentedly promoting single-cell data analyses. We anticipate OpenAnnotate will provide a promising avenue for researchers to construct a more holistic perspective to understand regulatory mechanisms.
Affiliation(s)
- Shengquan Chen
- Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing 100084, China
- Qiao Liu
- Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing 100084, China
- Xuejian Cui
- Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing 100084, China
- Zhanying Feng
- CEMS, NCMIS, MDIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China
- Chunquan Li
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
- Xiaowo Wang
- Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing 100084, China
- Xuegong Zhang
- Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing 100084, China
- Yong Wang
- CEMS, NCMIS, MDIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China
- Rui Jiang
- Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing 100084, China
|
22
|
Yu F, Sankaran VG, Yuan GC. CUT&RUNTools 2.0: a pipeline for single-cell and bulk-level CUT&RUN and CUT&Tag data analysis. Bioinformatics 2021; 38:252-254. [PMID: 34244724 PMCID: PMC8696090 DOI: 10.1093/bioinformatics/btab507] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 02/10/2021] [Revised: 07/01/2021] [Accepted: 07/07/2021] [Indexed: 02/03/2023]
Abstract
MOTIVATION: Genome-wide profiling of transcription factor binding and chromatin states is a widely used approach for mechanistic understanding of gene regulation. Recent technology development has enabled such profiling at single-cell resolution. However, an end-to-end computational pipeline for analyzing such data is still lacking. RESULTS: Here, we have developed a flexible pipeline for analysis and visualization of single-cell CUT&Tag and CUT&RUN data, which provides functions for sequence alignment, quality control, dimensionality reduction, cell clustering, data aggregation and visualization. Furthermore, it is also seamlessly integrated with the functions in original CUT&RUNTools for population-level analyses. As such, this provides a valuable toolbox for the community. AVAILABILITY AND IMPLEMENTATION: https://github.com/fl-yu/CUT-RUNTools-2.0. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Fulong Yu
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Division of Hematology/Oncology, Boston Children’s Hospital, Boston, MA 02115, USA; Department of Pediatrics, Harvard Medical School, Boston, MA 02115, USA; Program in Medical & Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02115, USA
- Vijay G Sankaran
- Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Division of Hematology/Oncology, Boston Children’s Hospital, Boston, MA 02115, USA; Department of Pediatrics, Harvard Medical School, Boston, MA 02115, USA; Program in Medical & Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02115, USA
|
23
|
Chen S, Yan G, Zhang W, Li J, Jiang R, Lin Z. RA3 is a reference-guided approach for epigenetic characterization of single cells. Nat Commun 2021; 12:2177. [PMID: 33846355 PMCID: PMC8041798 DOI: 10.1038/s41467-021-22495-4] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 06/16/2020] [Accepted: 03/18/2021] [Indexed: 12/13/2022]
Abstract
The recent advancements in single-cell technologies, including single-cell chromatin accessibility sequencing (scCAS), have enabled profiling the epigenetic landscapes for thousands of individual cells. However, the characteristics of scCAS data, including high dimensionality, high degree of sparsity and high technical variation, make the computational analysis challenging. Reference-guided approaches, which utilize the information in existing datasets, may facilitate the analysis of scCAS data. Here, we present RA3 (Reference-guided Approach for the Analysis of single-cell chromatin Accessibility data), which utilizes the information in massive existing bulk chromatin accessibility and annotated scCAS data. RA3 simultaneously models (1) the shared biological variation among scCAS data and the reference data, and (2) the unique biological variation in scCAS data that identifies distinct subpopulations. We show that RA3 achieves superior performance when used on several scCAS datasets, and on references constructed using various approaches. Altogether, these analyses demonstrate the wide applicability of RA3 in analyzing scCAS data.
Affiliation(s)
- Shengquan Chen
- Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, China
- Department of Statistics, The Chinese University of Hong Kong, Hong Kong SAR, China
- Guanao Yan
- School of Mathematical Sciences, Zhejiang University, Hangzhou, China
- Wenyu Zhang
- Department of Statistics, The Chinese University of Hong Kong, Hong Kong SAR, China
- Jinzhao Li
- Department of Statistics, The Chinese University of Hong Kong, Hong Kong SAR, China
- Rui Jiang
- Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, China
- Zhixiang Lin
- Department of Statistics, The Chinese University of Hong Kong, Hong Kong SAR, China
|
24
|
Rai MF, Wu CL, Capellini TD, Guilak F, Dicks AR, Muthuirulan P, Grandi F, Bhutani N, Westendorf JJ. Single Cell Omics for Musculoskeletal Research. Curr Osteoporos Rep 2021; 19:131-140. [PMID: 33559841 PMCID: PMC8743139 DOI: 10.1007/s11914-021-00662-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Accepted: 01/19/2021] [Indexed: 02/04/2023]
Abstract
PURPOSE OF REVIEW: The ability to analyze the molecular events occurring within individual cells as opposed to populations of cells is revolutionizing our understanding of musculoskeletal tissue development and disease. Single cell studies have the great potential of identifying cellular subpopulations that work in a synchronized fashion to regenerate and repair damaged tissues during normal homeostasis. In addition, such studies can elucidate how these processes break down in disease as well as identify cellular subpopulations that drive the disease. This review highlights three emerging technologies: single cell RNA sequencing (scRNA-seq), Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq), and Cytometry by Time-Of-Flight (CyTOF) mass cytometry. RECENT FINDINGS: Technological and bioinformatic tools to analyze the transcriptome, epigenome, and proteome at the individual cell level have advanced rapidly, making data collection relatively easy; however, understanding how to access and interpret the data remains a challenge for many scientists. It is, therefore, of paramount significance to educate the musculoskeletal community on how single cell technologies can be used to answer research questions and advance translation. This article summarizes talks given during a workshop on "Single Cell Omics" at the 2020 annual meeting of the Orthopedic Research Society. Studies that applied scRNA-seq, ATAC-seq, and CyTOF mass cytometry to cartilage development and osteoarthritis are reviewed. This body of work shows how these cutting-edge tools can advance our understanding of the cellular heterogeneity and trajectories of lineage specification during development and disease.
Affiliation(s)
- Muhammad Farooq Rai
- Department of Orthopaedic Surgery, Washington University, St. Louis, MO, USA
- Chia-Lung Wu
- Department of Orthopaedic Surgery, Washington University and Shriners Hospitals for Children, St. Louis, MO, USA
- Terence D Capellini
- Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Farshid Guilak
- Department of Orthopaedic Surgery, Washington University and Shriners Hospitals for Children, St. Louis, MO, USA
- Amanda R Dicks
- Department of Orthopaedic Surgery, Washington University and Shriners Hospitals for Children, St. Louis, MO, USA
- Fiorella Grandi
- Department of Orthopedic Surgery, Stanford University, Stanford, CA, USA
- Nidhi Bhutani
- Department of Orthopedic Surgery, Stanford University, Stanford, CA, USA
|
25
|
Sharma R, Pandey N, Mongia A, Mishra S, Majumdar A, Kumar V. FITs: forest of imputation trees for recovering true signals in single-cell open chromatin profiles. NAR Genom Bioinform 2020; 2:lqaa091. [PMID: 33575635 PMCID: PMC7676476 DOI: 10.1093/nargab/lqaa091] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/30/2020] [Revised: 10/13/2020] [Accepted: 10/21/2020] [Indexed: 12/15/2022]
Abstract
The advent of single-cell open-chromatin profiling technology has facilitated the analysis of heterogeneity in the activity of regulatory regions at single-cell resolution. However, stochasticity and the low amount of relevant DNA available cause high drop-out rates and noise in single-cell open-chromatin profiles. We introduce here a robust method called forest of imputation trees (FITs) to recover original signals from highly sparse and noisy single-cell open-chromatin profiles. FITs builds multiple imputation trees to avoid bias during the restoration of read-count matrices. It resolves the challenging issue of recovering open chromatin signals without blurring out information at genomic sites with cell-type-specific activity. Besides visualization and classification, FITs-based imputation also improved accuracy in the detection of enhancers, the calculation of pathway enrichment scores, and the prediction of chromatin interactions. FITs is generalized for wider applicability, especially for highly sparse read-count matrices. The superiority of FITs in recovering signals of minority cells also makes it highly useful for single-cell open-chromatin profiles from in vivo samples. The software is freely available at https://reggenlab.github.io/FITs/.
Affiliation(s)
- Rachesh Sharma
- Department of Electronic and Communication Engineering, Indraprastha Institute of Information Technology Delhi, Okhla Industrial Estate, Phase-III, New Delhi 110020, India
- Neetesh Pandey
- Department of Computational Biology, Indraprastha Institute of Information Technology Delhi, Okhla Industrial Estate, Phase-III, New Delhi 110020, India
- Aanchal Mongia
- Department of Electronic and Communication Engineering, Indraprastha Institute of Information Technology Delhi, Okhla Industrial Estate, Phase-III, New Delhi 110020, India
- Shreya Mishra
- Department of Computational Biology, Indraprastha Institute of Information Technology Delhi, Okhla Industrial Estate, Phase-III, New Delhi 110020, India
- Angshul Majumdar
- Department of Electronic and Communication Engineering, Indraprastha Institute of Information Technology Delhi, Okhla Industrial Estate, Phase-III, New Delhi 110020, India
- Vibhor Kumar
- Department of Computational Biology, Indraprastha Institute of Information Technology Delhi, Okhla Industrial Estate, Phase-III, New Delhi 110020, India
|
26
|
Cis-regulatory units of grass genomes identified by their DNA methylation. Proc Natl Acad Sci U S A 2020; 117:25198-25199. [PMID: 33008886 DOI: 10.1073/pnas.2017729117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
|