1
Sidorenko D, Pushkov S, Sakip A, Leung GHD, Lok SWY, Urban A, Zagirova D, Veviorskiy A, Tihonova N, Kalashnikov A, Kozlova E, Naumov V, Pun FW, Aliper A, Ren F, Zhavoronkov A. Precious2GPT: the combination of multiomics pretrained transformer and conditional diffusion for artificial multi-omics multi-species multi-tissue sample generation. NPJ Aging 2024; 10:37. [PMID: 39117678 PMCID: PMC11310469 DOI: 10.1038/s41514-024-00163-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Accepted: 07/22/2024] [Indexed: 08/10/2024]
Abstract
Synthetic data generation in omics mimics real-world biological data, providing alternatives for training and evaluation of genomic analysis tools, controlling differential expression, and exploring data architecture. We previously developed Precious1GPT, a multimodal transformer trained on transcriptomic and methylation data, along with metadata, for predicting biological age and identifying dual-purpose therapeutic targets potentially implicated in aging and age-associated diseases. In this study, we introduce Precious2GPT, a multimodal architecture that integrates Conditional Diffusion (CDiffusion) and decoder-only Multi-omics Pretrained Transformer (MoPT) models trained on gene expression and DNA methylation data. Precious2GPT excels in synthetic data generation, outperforming Conditional Generative Adversarial Networks (CGANs), CDiffusion, and MoPT. We demonstrate that Precious2GPT is capable of generating representative synthetic data that captures tissue- and age-specific information from real transcriptomics and methylomics data. Notably, Precious2GPT surpasses other models in age prediction accuracy using the generated data, and it can generate data beyond 120 years of age. Furthermore, we showcase the potential of using this model in identifying gene signatures and potential therapeutic targets in a colorectal cancer case study.
Affiliation(s)
- Denis Sidorenko
  - Insilico Medicine Hong Kong Ltd., Unit 310, 3/F, Building 8W Hong Kong Science and Technology Park, Hong Kong SAR, China
- Stefan Pushkov
  - Insilico Medicine Hong Kong Ltd., Unit 310, 3/F, Building 8W Hong Kong Science and Technology Park, Hong Kong SAR, China
- Akhmed Sakip
  - Insilico Medicine Hong Kong Ltd., Unit 310, 3/F, Building 8W Hong Kong Science and Technology Park, Hong Kong SAR, China
- Geoffrey Ho Duen Leung
  - Insilico Medicine Hong Kong Ltd., Unit 310, 3/F, Building 8W Hong Kong Science and Technology Park, Hong Kong SAR, China
- Sarah Wing Yan Lok
  - Insilico Medicine Hong Kong Ltd., Unit 310, 3/F, Building 8W Hong Kong Science and Technology Park, Hong Kong SAR, China
- Anatoly Urban
  - Insilico Medicine Hong Kong Ltd., Unit 310, 3/F, Building 8W Hong Kong Science and Technology Park, Hong Kong SAR, China
- Diana Zagirova
  - Insilico Medicine Hong Kong Ltd., Unit 310, 3/F, Building 8W Hong Kong Science and Technology Park, Hong Kong SAR, China
- Alexander Veviorskiy
  - Insilico Medicine AI Limited, Level 6, Unit 08, Block A, IRENA HQ Building, Masdar City, Abu Dhabi, UAE
- Nina Tihonova
  - Insilico Medicine AI Limited, Level 6, Unit 08, Block A, IRENA HQ Building, Masdar City, Abu Dhabi, UAE
- Aleksandr Kalashnikov
  - Insilico Medicine AI Limited, Level 6, Unit 08, Block A, IRENA HQ Building, Masdar City, Abu Dhabi, UAE
- Ekaterina Kozlova
  - Insilico Medicine Hong Kong Ltd., Unit 310, 3/F, Building 8W Hong Kong Science and Technology Park, Hong Kong SAR, China
- Vladimir Naumov
  - Insilico Medicine Hong Kong Ltd., Unit 310, 3/F, Building 8W Hong Kong Science and Technology Park, Hong Kong SAR, China
- Frank W Pun
  - Insilico Medicine Hong Kong Ltd., Unit 310, 3/F, Building 8W Hong Kong Science and Technology Park, Hong Kong SAR, China
- Alex Aliper
  - Insilico Medicine AI Limited, Level 6, Unit 08, Block A, IRENA HQ Building, Masdar City, Abu Dhabi, UAE
- Feng Ren
  - Insilico Medicine Shanghai Ltd., Suite 902, Tower C, Changtai Plaza, 2889 Jinke Road, Pudong, Shanghai, 201203, China
- Alex Zhavoronkov
  - Insilico Medicine Hong Kong Ltd., Unit 310, 3/F, Building 8W Hong Kong Science and Technology Park, Hong Kong SAR, China
  - Insilico Medicine AI Limited, Level 6, Unit 08, Block A, IRENA HQ Building, Masdar City, Abu Dhabi, UAE
  - Buck Institute for Research on Aging, Novato, CA, 94945, USA
2
Dwaraka VB, Aronica L, Carreras-Gallo N, Robinson JL, Hennings T, Carter MM, Corley MJ, Lin A, Turner L, Smith R, Mendez TL, Went H, Ebel ER, Sonnenburg ED, Sonnenburg JL, Gardner CD. Unveiling the epigenetic impact of vegan vs. omnivorous diets on aging: insights from the Twins Nutrition Study (TwiNS). BMC Med 2024; 22:301. [PMID: 39069614 DOI: 10.1186/s12916-024-03513-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2024] [Accepted: 07/02/2024] [Indexed: 07/30/2024]
Abstract
BACKGROUND: Geroscience focuses on interventions to mitigate molecular changes associated with aging. Lifestyle modifications, medications, and social factors influence the aging process, yet the complex molecular mechanisms require an in-depth exploration of the epigenetic landscape. The specific epigenetic clock and predictor effects of a vegan diet, compared to an omnivorous diet, remain underexplored despite potential impacts on aging-related outcomes.
METHODS: This study examined the impact of an entirely plant-based or healthy omnivorous diet over 8 weeks on blood DNA methylation in paired twins. Various measures of epigenetic age acceleration (PC GrimAge, PC PhenoAge, DunedinPACE) were assessed, along with system-specific effects (Inflammation, Heart, Hormone, Liver, and Metabolic). Methylation surrogates of clinical, metabolite, and protein markers were analyzed to observe diet-specific shifts.
RESULTS: Distinct responses were observed, with the vegan cohort exhibiting significant decreases in overall epigenetic age acceleration, aligning with anti-aging effects of plant-based diets. Diet-specific shifts were noted in the analysis of methylation surrogates, demonstrating the influence of diet on complex trait prediction through DNA methylation markers. An epigenome-wide analysis revealed differentially methylated loci specific to each diet, providing insights into the affected pathways.
CONCLUSIONS: This study suggests that a short-term vegan diet is associated with epigenetic age benefits and reduced calorie intake. The use of epigenetic biomarker proxies (EBPs) highlights their potential for assessing dietary impacts and facilitating personalized nutrition strategies for healthy aging. Future research should explore the long-term effects of vegan diets on epigenetic health and overall well-being, considering the importance of proper nutrient supplementation.
TRIAL REGISTRATION: Clinicaltrials.gov identifier: NCT05297825.
Affiliation(s)
- Varun B Dwaraka
  - TruDiagnostic, Inc, 881 Corporate Dr, Lexington, KY, 40503, USA
- Lucia Aronica
  - Stanford Prevention Research Center, Department of Medicine, School of Medicine, Stanford University, 3180 Porter Dr, Palo Alto, Stanford, CA, 94305, USA
- Jennifer L Robinson
  - Stanford Prevention Research Center, Department of Medicine, School of Medicine, Stanford University, 3180 Porter Dr, Palo Alto, Stanford, CA, 94305, USA
- Tayler Hennings
  - Seattle Children's Research Institute, Seattle, WA, 98101, USA
- Matthew M Carter
  - Department of Microbiology and Immunology, School of Medicine, Stanford University, Palo Alto, CA, USA
- Michael J Corley
  - Department of Medicine, Division of Infectious Diseases, Weill Cornell Medicine, New York, NY, USA
- Aaron Lin
  - TruDiagnostic, Inc, 881 Corporate Dr, Lexington, KY, 40503, USA
- Logan Turner
  - TruDiagnostic, Inc, 881 Corporate Dr, Lexington, KY, 40503, USA
- Ryan Smith
  - TruDiagnostic, Inc, 881 Corporate Dr, Lexington, KY, 40503, USA
- Tavis L Mendez
  - TruDiagnostic, Inc, 881 Corporate Dr, Lexington, KY, 40503, USA
- Hannah Went
  - TruDiagnostic, Inc, 881 Corporate Dr, Lexington, KY, 40503, USA
- Emily R Ebel
  - Department of Microbiology and Immunology, School of Medicine, Stanford University, Palo Alto, CA, USA
- Erica D Sonnenburg
  - Department of Microbiology and Immunology, School of Medicine, Stanford University, Palo Alto, CA, USA
- Justin L Sonnenburg
  - Department of Microbiology and Immunology, School of Medicine, Stanford University, Palo Alto, CA, USA
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
  - Center for Human Microbiome Studies, Stanford University School of Medicine, Stanford, CA, USA
- Christopher D Gardner
  - Stanford Prevention Research Center, Department of Medicine, School of Medicine, Stanford University, 3180 Porter Dr, Palo Alto, Stanford, CA, 94305, USA
3
Sagy N, Meyrom N, Beckerman P, Pleniceanu O, Bar DZ. Kidney-specific methylation patterns correlate with kidney function and are lost upon kidney disease progression. Clin Epigenetics 2024; 16:27. [PMID: 38347603 PMCID: PMC10863297 DOI: 10.1186/s13148-024-01642-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2023] [Accepted: 02/07/2024] [Indexed: 02/15/2024]
Abstract
BACKGROUND: Chronological and biological age correlate with DNA methylation levels at specific sites in the genome. Linear combinations of multiple methylation sites, termed epigenetic clocks, can estimate chronological age and predict multiple health-related outcomes. However, why some sites correlate with lifespan, healthspan, or specific medical conditions remains poorly understood. Kidney fibrosis is the common pathway for chronic kidney disease, which affects 10% of European and US populations.
RESULTS: Here we identify epigenetic clocks and methylation sites that correlate with kidney function. Moreover, we identify methylation sites that have a unique methylation signature in the kidney. Methylation levels in the majority of these sites correlate with kidney state and function. When kidney function deteriorates, all of these sites regress toward the common methylation pattern observed in other tissues. Interestingly, while the majority of sites are less methylated in the kidney and become more methylated with loss of function, a fraction of the sites are highly methylated in the kidney and become less methylated when kidney function declines. These methylation sites are enriched for specific transcription-factor binding sites. In a large subset of sites, changes in methylation patterns are accompanied by changes in gene expression in kidneys of chronic kidney disease patients.
CONCLUSIONS: These results support the information theory of aging, and the hypothesis that the unique tissue identity, as captured by methylation patterns, is lost as tissue function declines. However, this information loss is not random, but guided toward a baseline that is dependent on the genomic loci.
SIGNIFICANCE STATEMENT: DNA methylation at specific sites accurately reflects chronological and biological age. We identify sites that have a unique methylation pattern in the kidney. Methylation levels in the majority of these sites correlate with kidney state and function. Moreover, when kidney function deteriorates, all of these sites regress toward the common methylation pattern observed in other tissues. Thus, the unique methylation signature of the kidney is degraded, and epigenetic information is lost, when kidney disease progresses. These methylation sites are enriched for specific and methylation-sensitive transcription-factor binding sites, and associated genes show disease-dependent changes in expression. These results support the information theory of aging, and the hypothesis that the unique tissue identity, as captured by methylation patterns, is lost as tissue function declines.
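The abstract above describes epigenetic clocks as linear combinations of methylation sites. As a rough illustration only, the Python sketch below computes an age estimate as an intercept plus a weighted sum of methylation beta values. The CpG identifiers, weights, and intercept are hypothetical and are not taken from this study or from any published clock.

```python
from typing import Mapping

# Hypothetical illustration: the CpG IDs, weights, and intercept below are
# invented and do not correspond to any published epigenetic clock.


def linear_clock(betas: Mapping[str, float],
                 weights: Mapping[str, float],
                 intercept: float) -> float:
    """Return intercept + sum(weight * beta) over the clock's CpG sites."""
    missing = [cpg for cpg in weights if cpg not in betas]
    if missing:
        raise KeyError(f"missing methylation values for: {missing}")
    return intercept + sum(w * betas[cpg] for cpg, w in weights.items())


# Toy clock with three sites; beta values lie between 0 and 1.
weights = {"cpg_A": 40.0, "cpg_B": -20.0, "cpg_C": 10.0}
sample = {"cpg_A": 0.5, "cpg_B": 0.25, "cpg_C": 0.75}

predicted_age = linear_clock(sample, weights, intercept=30.0)
print(predicted_age)  # 30 + 20 - 5 + 7.5 = 52.5
```

In practice the weights of such a model are fitted by penalized regression on large training cohorts; the sketch only shows the prediction step.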
Affiliation(s)
- Naor Sagy
  - Department of Oral Biology, Goldschleger School of Dental Medicine, The Faculty of Medical and Health Sciences, Tel Aviv University, 69978, Tel Aviv, Israel
- Noa Meyrom
  - Department of Oral Biology, Goldschleger School of Dental Medicine, The Faculty of Medical and Health Sciences, Tel Aviv University, 69978, Tel Aviv, Israel
- Pazit Beckerman
  - Kidney Research Lab, The Institute of Nephrology and Hypertension, Sheba Medical Center, Tel-Hashomer and The Faculty of Medical and Health Sciences, Tel-Aviv University, Tel Aviv, Israel
- Oren Pleniceanu
  - Kidney Research Lab, The Institute of Nephrology and Hypertension, Sheba Medical Center, Tel-Hashomer and The Faculty of Medical and Health Sciences, Tel-Aviv University, Tel Aviv, Israel
- Daniel Z Bar
  - Department of Oral Biology, Goldschleger School of Dental Medicine, The Faculty of Medical and Health Sciences, Tel Aviv University, 69978, Tel Aviv, Israel
  - The AI and Data Science Center (TAD), Tel Aviv University, 69978, Tel Aviv, Israel
4
Bai X, Bao Y, Bei S, Bu C, Cao R, Cao Y, Cen H, Chao J, Chen F, Chen H, Chen K, Chen M, Chen M, Chen M, Chen Q, Chen R, Chen S, Chen T, Chen X, Chen X, Cheng Y, Chu Y, Cui Q, Dong L, Du Z, Duan G, Fan S, Fan Z, Fang X, Fang Z, Feng Z, Fu S, Gao F, Gao G, Gao H, Gao W, Gao X, Gao X, Gao X, Gong J, Gong J, Gou Y, Gu S, Guo AY, Guo G, Guo X, Han C, Hao D, Hao L, He Q, He S, He S, Hu W, Huang K, Huang T, Huang X, Huang Y, Jia P, Jia Y, Jiang C, Jiang M, Jiang S, Jiang T, Jiang X, Jin E, Jin W, Kang H, Kang H, Kong D, Lan L, Lei W, Li CY, Li C, Li C, Li H, Li J, Li J, Li L, Li P, Li R, Li X, Li Y, Li Y, Li Z, Liao X, Lin S, Lin Y, Ling Y, Liu B, Liu CJ, Liu D, Liu GH, Liu L, Liu S, Liu W, Liu X, Liu X, Liu Y, Liu Y, Lu M, Lu T, Luo H, Luo H, Luo M, Luo S, Luo X, Ma L, Ma Y, Mai J, Meng J, Meng X, Meng Y, Meng Y, Miao W, Miao YR, Ni L, Nie Z, Niu G, Niu X, Niu Y, Pan R, Pan S, Peng D, Peng J, Qi J, Qi Y, Qian Q, Qin Y, Qu H, Ren J, Ren J, Sang Z, Shang K, Shen WK, Shen Y, Shi Y, Song S, Song T, Su T, Sun J, Sun Y, Sun Y, Sun Y, Tang B, Tang D, Tang Q, Tang Z, Tian D, Tian F, Tian W, Tian Z, Wang A, Wang G, Wang G, Wang J, Wang J, Wang P, Wang P, Wang W, Wang Y, Wang Y, Wang Y, Wang Y, Wang Z, Wei H, Wei Y, Wei Z, Wu D, Wu G, Wu S, Wu S, Wu W, Wu W, Wu Z, Xia Z, Xiao J, Xiao L, Xiao Y, Xie G, Xie GY, Xie J, Xie Y, Xiong J, Xiong Z, Xu D, Xu S, Xu T, Xu T, Xue Y, Xue Y, Yan C, Yang D, Yang F, Yang F, Yang H, Yang J, Yang K, Yang N, Yang QY, Yang S, Yang X, Yang X, Yang X, Yang YG, Ye W, Yu C, Yu F, Yu S, Yuan C, Yuan H, Zeng J, Zhai S, Zhang C, Zhang F, Zhang G, Zhang M, Zhang P, Zhang Q, Zhang R, Zhang S, Zhang W, Zhang W, Zhang W, Zhang X, Zhang X, Zhang Y, Zhang Y, Zhang Y, Zhang YE, Zhang Y, Zhang Z, Zhang Z, Zhao D, Zhao F, Zhao G, Zhao M, Zhao W, Zhao W, Zhao X, Zhao Y, Zhao Y, Zhao Z, Zheng X, Zheng Y, Zhou C, Zhou H, Zhou X, Zhou X, Zhou Y, Zhou Y, Zhu J, Zhu L, Zhu R, Zhu T, Zong W, Zou D, Zuo Z. 
Database Resources of the National Genomics Data Center, China National Center for Bioinformation in 2024. Nucleic Acids Res 2024; 52:D18-D32. [PMID: 38018256 PMCID: PMC10767964 DOI: 10.1093/nar/gkad1078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 10/12/2023] [Accepted: 10/27/2023] [Indexed: 11/30/2023]
Abstract
The National Genomics Data Center (NGDC), which is a part of the China National Center for Bioinformation (CNCB), provides a family of database resources to support the global academic and industrial communities. With the rapid accumulation of multi-omics data at an unprecedented pace, CNCB-NGDC continuously expands and updates core database resources through big data archiving, integrative analysis and value-added curation. Importantly, NGDC collaborates closely with major international databases and initiatives to ensure seamless data exchange and interoperability. Over the past year, significant efforts have been dedicated to integrating diverse omics data, synthesizing expanding knowledge, developing new resources, and upgrading major existing resources. Particularly, several database resources are newly developed for the biodiversity of protists (P10K), bacteria (NTM-DB, MPA) as well as plant (PPGR, SoyOmics, PlantPan) and disease/trait association (CROST, HervD Atlas, HALL, MACdb, BioKA, RePoS, PGG.SV, NAFLDkb). All the resources and services are publicly accessible at https://ngdc.cncb.ac.cn.
5
Martínez-Enguita D, Dwivedi SK, Jörnsten R, Gustafsson M. NCAE: data-driven representations using a deep network-coherent DNA methylation autoencoder identify robust disease and risk factor signatures. Brief Bioinform 2023; 24:bbad293. [PMID: 37587790 PMCID: PMC10516364 DOI: 10.1093/bib/bbad293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Revised: 07/25/2023] [Accepted: 07/29/2023] [Indexed: 08/18/2023]
Abstract
Precision medicine relies on the identification of robust disease and risk factor signatures from omics data. However, current knowledge-driven approaches may overlook novel or unexpected phenomena due to the inherent biases in biological knowledge. In this study, we present a data-driven signature discovery workflow for DNA methylation analysis utilizing network-coherent autoencoders (NCAEs) with biologically relevant latent embeddings. First, we explored the architecture space of autoencoders trained on a large-scale pan-tissue compendium (n = 75 272) of human epigenome-wide association studies. We observed the emergence of co-localized patterns in the deep autoencoder latent space representations that corresponded to biological network modules. We determined the NCAE configuration with the strongest co-localization and centrality signals in the human protein interactome. Leveraging the NCAE embeddings, we then trained interpretable deep neural networks for risk factor (aging, smoking) and disease (systemic lupus erythematosus) prediction and classification tasks. Remarkably, our NCAE embedding-based models outperformed existing predictors, revealing novel DNA methylation signatures enriched in gene sets and pathways associated with the studied condition in each case. Our data-driven biomarker discovery workflow provides a generally applicable pipeline to capture relevant risk factor and disease information. By surpassing the limitations of knowledge-driven methods, our approach enhances the understanding of complex epigenetic processes, facilitating the development of more effective diagnostic and therapeutic strategies.
Affiliation(s)
- David Martínez-Enguita
  - Bioinformatics, Department of Physics, Chemistry and Biology, Linköping University, Sweden
- Sanjiv K Dwivedi
  - Bioinformatics, Department of Physics, Chemistry and Biology, Linköping University, Sweden
- Rebecka Jörnsten
  - Department of Mathematical Sciences, Chalmers University of Technology, Sweden
- Mika Gustafsson
  - Bioinformatics, Department of Physics, Chemistry and Biology, Linköping University, Sweden
6
Xue Y, Bao Y, Zhang Z, Zhao W, Xiao J, He S, Zhang G, Li Y, Zhao G, Chen R, Ma Y, Chen M, Li C, Jiang S, Zou D, Gong Z, Zhao X, Wang Y, Zhu J, Zhang Z, Zhao W, Xue Y, Bao Y, Song S, Zhang G, Ling Y, Wang Y, Yang J, Zhuang X, Duan G, Wu G, Chen X, Tian D, Li Z, Sun Y, Du Z, Hao L, Song S, Gao Y, Xiao J, Zhang Z, Bao Y, Tang B, Zhao W, Zhang Y, Zhang H, Zhang Z, Qian Q, Zhang Z, Xiao J, Kang H, Huang T, Chen X, Xia Z, Zhou X, Chao J, Tang B, Wang Z, Zhu J, Du Z, Zhang S, Xiao J, Tian W, Wang W, Zhao W, Wu S, Huang Y, Zhang M, Gong Z, Wang G, Zheng X, Zong W, Zhao W, Xing P, Li R, Liu Z, Bao Y, Lu M, Zhang Y, Yang F, Mai J, Gao Q, Xu X, Kang H, Hou L, Shang Y, Qain Q, Liu J, Jiang M, Zhang H, Bu C, Wang J, Zhang Z, Zhang Z, Zeng J, Li J, Xiao J, Pan S, Kang H, Liu X, Lin S, Yuan N, Zhang Z, Bao Y, Jia P, Zheng X, Zong W, Li Z, Sun Y, Ma Y, Xiong Z, Wu S, Yang F, Zhao W, Bu C, Du Z, Xiao J, Bao Y, Chen X, Chen T, Zhang S, Sun Y, Yu C, Tang B, Zhu J, Dong L, Zhai S, Sun Y, Chen Q, Yang X, Zhang X, Sang Z, Wang Y, Zhao Y, Chen H, Lan L, Wang Y, Zhao W, Wang A, Yu C, Wang Y, Zhang S, Ma Y, Jia Y, Zhao X, Chen M, Li C, Tian D, Tang B, Pan Y, Dong L, Liu X, Song S, Liu X, Tian D, Li C, Tang B, Wang Z, Zhang R, Pan Y, Wang Y, Zou D, Song S, Li C, Zou D, Ma L, Gong Z, Zhu J, Teng X, Li L, Li N, Cui Y, Duan G, Zhang M, Jin T, Kang H, Wang Z, Wu G, Huang T, Zhao W, Jin E, Zhang T, Zhang Z, Zhao W, Xue Y, Bao Y, Song S, Xu T, Zou D, Chen M, Niu G, Pan R, Zhu T, Chu Y, Hao L, Sang J, Pan R, Zou D, Zhang Y, Wang Z, Chen M, Zhang Y, Xu T, Yao Q, Zhu T, Niu G, Hao L, Xiong Z, Yang F, Wang G, Li R, Zong W, Zhang M, Zou D, Zhao W, Wang G, Yang F, Wu S, Zhang X, Guo X, Ma Y, Xiong Z, Li R, Li Z, Liu L, Feng C, Qin Y, Xiao J, Ma L, Jing W, Luo S, Li Z, Ma L, Jiang S, Qian Q, Zhu T, Zong W, Shang Y, Jin T, Zhang Y, Chen M, Wu Z, Chu Y, Zhang R, Luo S, Jing W, Zou D, Bao Y, Xiao J, Zhang Z, Zou D, Liu L, Qin Y, Luo S, Jing W, Li Q, Liu P, Sun Y, Ma L, Jiang S, Fan Z, Zhao W, Xiao J, Bao 
Y, Zhang Z, Shen WK, Guo AY, Zuo Z, Ren J, Zhang X, Xiao Y, Li X, Zhang X, Xiao Y, Li X, Liu D, Zhang C, Xue Y, Zhao Z, Jiang T, Wu W, Zhao F, Meng X, Chen M, Gou Y, Chen M, Xue Y, Peng D, Xue Y, Luo H, Gao F, Ning W, Xue Y, Liu W, Ling Y, Cao R, Zhang G, Wei Y, Xue Y, Liu CJ, Guo AY, Xie GY, Guo AY, Yuan H, Su T, Zhang YE, Zhou C, Wang P, Zhang G, Zhou Y, Chen M, Guo G, Zhang Q, Guo AY, Fu S, Tan X, Xue Y, Tang D, Xue Y, Zhang W, Xue Y, Luo M, Guo AY, Xie Y, Ren J, Miao YR, Guo AY, Zhou Y, Chen M, Guo G, Huang X, Feng Z, Xue Y, Liu CJ, Guo AY, Liao X, Gao X, Wang J, Xie G, Guo AY, Yuan C, Chen M, Yang D, Tian F, Gao G, Wu W, Chen M, Han C, Xue Y, Cui Q, Xiao C, Li CY, Luo X, Ren J, Zhang X, Xiao Y, Li X, Tang Q, Guo AY, Luo H, Gao F, Xue Y, Bao Y, Zhang Z, Zhao W, Xiao J, He S, Zhang G, Li Y, Zhao G, Chen R. Database Resources of the National Genomics Data Center, China National Center for Bioinformation in 2023. Nucleic Acids Res 2023; 51:D18-D28. [PMID: 36420893 PMCID: PMC9825504 DOI: 10.1093/nar/gkac1073] [Citation(s) in RCA: 97] [Impact Index Per Article: 97.0] [Received: 09/15/2022] [Revised: 10/14/2022] [Accepted: 10/27/2022] [Indexed: 11/27/2022]
Abstract
The National Genomics Data Center (NGDC), part of the China National Center for Bioinformation (CNCB), provides a family of database resources to support global academic and industrial communities. With the explosive accumulation of multi-omics data generated at an unprecedented rate, CNCB-NGDC constantly expands and updates core database resources by big data archive, integrative analysis and value-added curation. In the past year, efforts have been devoted to integrating multiple omics data, synthesizing the growing knowledge, developing new resources and upgrading a set of major resources. Particularly, several database resources are newly developed for infectious diseases and microbiology (MPoxVR, KGCoV, ProPan), cancer-trait association (ASCancer Atlas, TWAS Atlas, Brain Catalog, CCAS) as well as tropical plants (TCOD). Importantly, given the global health threat caused by monkeypox virus and SARS-CoV-2, CNCB-NGDC has newly constructed the monkeypox virus resource, along with frequent updates of SARS-CoV-2 genome sequences, variants as well as haplotypes. All the resources and services are publicly accessible at https://ngdc.cncb.ac.cn.
7
Zhang M, Zong W, Zou D, Wang G, Zhao W, Yang F, Wu S, Zhang X, Guo X, Ma Y, Xiong Z, Zhang Z, Bao Y, Li R. MethBank 4.0: an updated database of DNA methylation across a variety of species. Nucleic Acids Res 2022; 51:D208-D216. [PMID: 36318250 PMCID: PMC9825483 DOI: 10.1093/nar/gkac969] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/15/2022] [Revised: 10/05/2022] [Accepted: 10/13/2022] [Indexed: 11/05/2022]
Abstract
DNA methylation, as the most intensively studied epigenetic mark, regulates gene expression in numerous biological processes including development, aging, and disease. With the rapid accumulation of whole-genome bisulfite sequencing data, integrating, archiving, analyzing, and visualizing those data becomes critical. Since its first publication in 2015, MethBank has been continuously updated to include more DNA methylomes across more diverse species. Here, we present MethBank 4.0 (https://ngdc.cncb.ac.cn/methbank/), which reports an increase of 309% in data volume, with 1449 single-base resolution methylomes of 23 species, covering 236 tissues/cell lines and 15 biological contexts. Value-added information, such as more rigorous quality evaluation, more standardized metadata, and comprehensive downstream annotations have been integrated in the new version. Moreover, expert-curated knowledge modules of featured differentially methylated genes associated with biological contexts and methylation analysis tools have been incorporated as new components of MethBank. In addition, MethBank 4.0 is equipped with a series of new web interfaces to browse, search, and visualize DNA methylation profiles and related information. With all these improvements, we believe the updated MethBank 4.0 will serve as a fundamental resource to provide a wide range of data services for the global research community.
Affiliation(s)
- Wei Zhao
  - National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
  - China National Center for Bioinformation, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Fei Yang
  - National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
  - China National Center for Bioinformation, Beijing 100101, China
- Song Wu
  - National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
  - China National Center for Bioinformation, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Xinran Zhang
  - National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
  - China National Center for Bioinformation, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Xutong Guo
  - National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
  - China National Center for Bioinformation, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Yingke Ma
  - National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
  - China National Center for Bioinformation, Beijing 100101, China
- Zhuang Xiong
  - National Genomics Data Center & CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
  - China National Center for Bioinformation, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Zhang Zhang
  - Correspondence may also be addressed to Zhang Zhang. Tel: +86 10 84097261;
- Yiming Bao
  - Correspondence may also be addressed to Yiming Bao. Tel: +86 10 84097858;
- Rujiao Li
  - To whom correspondence should be addressed. Tel: +86 10 84097638;
8
Identification of COVID-19-Associated DNA Methylation Variations by Integrating Methylation Array and scRNA-Seq Data at Cell-Type Resolution. Genes (Basel) 2022; 13:genes13071109. [DOI: 10.3390/genes13071109] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/18/2022] [Revised: 06/16/2022] [Accepted: 06/17/2022] [Indexed: 01/27/2023]
Abstract
Single-cell transcriptome studies have revealed immune dysfunction in COVID-19 patients, including lymphopenia, T cell exhaustion, and increased levels of pro-inflammatory cytokines, while DNA methylation plays an important role in the regulation of immune response and inflammatory response. The specific cell types of immune responses regulated by DNA methylation in COVID-19 patients will be better understood by exploring the COVID-19 DNA methylation variation at the cell-type level. Here, we developed an analytical pipeline to explore single-cell DNA methylation variations in COVID-19 patients by transferring bulk-tissue-level knowledge to the single-cell level. We discovered that the methylation variations in the whole blood of COVID-19 patients showed significant cell-type specificity with remarkable enrichment in gamma-delta T cells and presented a phenomenon of hypermethylation and low expression. Furthermore, we identified five genes whose methylation variations were associated with several cell types. Among them, S100A9, AHNAK, and CX3CR1 have been reported as potential COVID-19 biomarkers previously, and the others (TRAF3IP3 and LFNG) are closely associated with the immune and virus-related signaling pathways. We propose that they might serve as potential epigenetic biomarkers for COVID-19 and could play roles in important biological processes such as the immune response and antiviral activity.