1
Wang S, Li H, Zhang K, Wu H, Pang S, Wu W, Ye L, Su J, Zhang Y. scSID: A lightweight algorithm for identifying rare cell types by capturing differential expression from single-cell sequencing data. Comput Struct Biotechnol J 2024; 23:589-600. [PMID: 38274993 PMCID: PMC10809081 DOI: 10.1016/j.csbj.2023.12.043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Revised: 12/27/2023] [Accepted: 12/27/2023] [Indexed: 01/27/2024] Open
Abstract
Single-cell RNA sequencing (scRNA-seq) is currently an important technology for identifying cell types and studying diseases at the genetic level. Identifying rare cell types is biologically important as one of the downstream data analyses of single-cell RNA sequencing. Although rare cell identification methods have been developed, most of these suffer from insufficient mining of intercellular similarities, low scalability, and being time-consuming. In this paper, we propose a single-cell similarity division algorithm (scSID) for identifying rare cells. It takes cell-to-cell similarity into consideration by analyzing both inter-cluster and intra-cluster similarities, and discovers rare cell types based on the similarity differences. We show that scSID outperforms other existing methods by benchmarking it on different experimental datasets. Application of scSID to multiple datasets, including 68K PBMC and intestine, highlights its exceptional scalability and remarkable ability to identify rare cell populations.
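The abstract does not spell out the algorithm, so the following is only a minimal Python sketch of the general idea of reading rarity from similarity differences. It assumes Euclidean distance, a toy two-dimensional dataset, and a hypothetical helper `neighbor_gap`; none of these are taken from the scSID paper or its code.

```python
import numpy as np

def neighbor_gap(X, k=10):
    """For each cell, locate the largest jump in its sorted k-nearest-neighbour distances.

    Hypothetical helper for illustration only; not the scSID implementation.
    Returns (position, jump), where `position` counts the neighbours before the jump.
    """
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # pairwise distances
    np.fill_diagonal(d, np.inf)                                # ignore self-distance
    knn = np.sort(d, axis=1)[:, :k]                            # k smallest distances per cell
    gaps = np.diff(knn, axis=1)                                # jumps between consecutive neighbours
    return gaps.argmax(axis=1) + 1, gaps.max(axis=1)

rng = np.random.default_rng(0)
abundant = rng.normal(0.0, 1.0, size=(40, 2))   # toy abundant population
rare = rng.normal(20.0, 0.1, size=(4, 2))       # toy rare population: tight and far away
X = np.vstack([abundant, rare])

position, jump = neighbor_gap(X, k=10)
candidates = np.argsort(jump)[-4:]              # cells with the sharpest similarity drop
```

In this toy setting a cell from a small, tight group has a few very close neighbours followed by a sharp jump, so the size and location of the jump hint at both rarity and group size. Real expression data would need normalization and dimensionality reduction first.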
Affiliation(s)
- Shudong Wang
- Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, 266580, China
- Hengxiao Li
- Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, 266580, China
- Kuijie Zhang
- Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, 266580, China
- Hao Wu
- College of Information Engineering, Northwest A&F University, 712100, Yangling, China
- School of Software, Shandong University, 250100, Jinan, China
- Shanchen Pang
- Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, 266580, China
- Wenhao Wu
- Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, 266580, China
- Lan Ye
- Cancer Center, the Second Hospital of Shandong University, Jinan, 250033, China
- Jionglong Su
- School of AI and Advanced Computing, XJTLU Entrepreneur College (Taicang), Xi'an Jiaotong-Liverpool University, Suzhou, 215123, Jiangsu, China
- Yulin Zhang
- College of Mathematics and Systems Science, Shandong University of Science and Technology, Qingdao, 266590, China
2
van der Flier F, Estell D, Pricelius S, Dankmeyer L, van Stigt Thans S, Mulder H, Otsuka R, Goedegebuur F, Lammerts L, Staphorst D, van Dijk AD, de Ridder D, Redestig H. Enzyme structure correlates with variant effect predictability. Comput Struct Biotechnol J 2024; 23:3489-3497. [PMID: 39435338 PMCID: PMC11491678 DOI: 10.1016/j.csbj.2024.09.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2024] [Revised: 09/03/2024] [Accepted: 09/12/2024] [Indexed: 10/23/2024] Open
Abstract
Protein engineering increasingly relies on machine learning models to computationally pre-screen promising novel candidates. Although machine learning approaches have proven effective, their performance on prospective screening data leaves room for improvement; prediction accuracy can vary greatly from one protein variant to the next. So far, it is unclear what characterizes variants that are associated with large prediction error. In order to establish whether structural characteristics influence predictability, we created a novel high-order combinatorial dataset for an enzyme spanning 3,706 variants that can be partitioned into subsets of variants with mutations at positions exclusively belonging to a particular structural class. By training four different supervised variant effect prediction (VEP) models on structurally partitioned subsets of our data, we found that predictability strongly depended on all four structural characteristics we tested: buriedness, number of contact residues, proximity to the active site, and presence of secondary structure elements. These dependencies were also found in several single-mutation enzyme variant datasets, albeit with dataset-specific directions. Most importantly, we found that these dependencies were similar for all four models we tested, indicating that there are specific structure and function determinants that are insufficiently accounted for by current machine learning algorithms. Overall, our findings suggest that improvements can be made to VEP models by exploring new inductive biases and by leveraging different data modalities of protein variants, and that stratified dataset design can highlight areas of improvement for machine-learning-guided protein engineering.
Affiliation(s)
- Floris van der Flier
- Department of Plant Sciences, Wageningen University & Research, Wageningen, 6708 PB, the Netherlands
- Dave Estell
- Health & Biosciences, International Flavors and Fragrances, Palo Alto, 94304 CA, USA
- Sina Pricelius
- Health & Biosciences, International Flavors and Fragrances, Oegstgeest, 2342 BG, the Netherlands
- Lydia Dankmeyer
- Health & Biosciences, International Flavors and Fragrances, Oegstgeest, 2342 BG, the Netherlands
- Sander van Stigt Thans
- Health & Biosciences, International Flavors and Fragrances, Oegstgeest, 2342 BG, the Netherlands
- Harm Mulder
- Health & Biosciences, International Flavors and Fragrances, Oegstgeest, 2342 BG, the Netherlands
- Rei Otsuka
- Health & Biosciences, International Flavors and Fragrances, Oegstgeest, 2342 BG, the Netherlands
- Frits Goedegebuur
- Health & Biosciences, International Flavors and Fragrances, Oegstgeest, 2342 BG, the Netherlands
- Laurens Lammerts
- Health & Biosciences, International Flavors and Fragrances, Oegstgeest, 2342 BG, the Netherlands
- Diego Staphorst
- Health & Biosciences, International Flavors and Fragrances, Oegstgeest, 2342 BG, the Netherlands
- Aalt D.J. van Dijk
- Department of Plant Sciences, Wageningen University & Research, Wageningen, 6708 PB, the Netherlands
- Dick de Ridder
- Department of Plant Sciences, Wageningen University & Research, Wageningen, 6708 PB, the Netherlands
- Henning Redestig
- Health & Biosciences, International Flavors and Fragrances, Oegstgeest, 2342 BG, the Netherlands
3
Magalhães Borges V, Horimoto ARVR, Wijsman EM, Kimura L, Nunes K, Nato AQ, Mingroni-Netto RC. Genomic Exploration of Essential Hypertension in African-Brazilian Quilombo Populations: A Comprehensive Approach with Pedigree Analysis and Family-Based Association Studies. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.06.26.24309531. [PMID: 38978678 PMCID: PMC11230341 DOI: 10.1101/2024.06.26.24309531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Background: Essential Hypertension (EH) is a global health issue. Despite extensive research, much of EH heritability remains unexplained. We investigated the genetic basis of EH in African-derived individuals from partially isolated quilombo populations in Vale do Ribeira (SP-Brazil). Methods: Samples from 431 individuals (167 affected, 261 unaffected, 3 unknown) were genotyped using a 650k SNP array. Estimated global ancestry proportions were 47% African, 36% European, and 16% Native American. We constructed six pedigrees using additional data from 673 individuals and created three non-overlapping SNP subpanels. We phased haplotypes and performed local ancestry analysis to account for admixture. Genome-wide linkage analysis (GWLA) and fine-mapping via family-based association studies (FBAS) were conducted, prioritizing EH-associated genes through a systematic approach involving databases like PubMed, ClinVar, and GWAS Catalog. Results: Linkage analysis identified 22 regions of interest (ROIs) with LOD scores ranging from 1.45 to 3.03, encompassing 2363 genes. Fine-mapping (FBAS) identified 60 EH-related candidate genes and 117 suggestive/significant variants. Among these, 14 genes, including PHGDH, S100A10, MFN2, and RYR2, were strongly related to hypertension and harbored 29 suggestive/significant SNPs. Conclusions: Through a complementary approach (combining admixture-adjusted GWLA based on Markov chain Monte Carlo methods, FBAS on known and imputed data, and gene prioritization), new loci, variants, and candidate genes were identified. These findings provide targets for future research and for replication in other populations, facilitate personalized treatments, and improve public health for underrepresented African-derived populations. Limitations include restricted SNP coverage, self-reported pedigree data, and a lack of available EH genomic studies on admixed populations for independent validation, despite the genetic correlation analyses performed using summary statistics.
4
Li R, Qu R, Parisi F, Strino F, Lam H, Stanley JS, Cheng X, Myung P, Kluger Y. LMD: Cluster-Independent Multiscale Marker Identification in Single-cell RNA-seq Data. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.11.12.566780. [PMID: 38014159 PMCID: PMC10680591 DOI: 10.1101/2023.11.12.566780] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]
Abstract
Identifying accurate cell markers in single-cell RNA-seq data is crucial for understanding cellular diversity and function. Localized Marker Detector (LMD) is a novel tool to identify "localized genes" - genes exclusively expressed in groups of highly similar cells - thereby characterizing cellular diversity in a multi-resolution and fine-grained manner. LMD constructs a cell-cell affinity graph, diffuses the gene expression value across the cell graph, and assigns a score to each gene based on its diffusion dynamics. LMD's candidate markers can be grouped into functional gene modules, which accurately reflect cell types, subtypes, and other sources of variation such as cell cycle status. We apply LMD to mouse bone marrow and hair follicle dermal condensate datasets, where LMD facilitates cross-sample comparisons, identifying shared and sample-specific gene signatures and novel cell populations without requiring batch effect correction or integration methods. Furthermore, we assessed the performance of LMD across nine single-cell RNA sequencing datasets, compared it with six other methods aimed at achieving similar objectives, and found that LMD outperforms the other methods evaluated.
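The three steps named in the abstract (build a cell-cell affinity graph, diffuse a gene's expression across it, score the gene from its diffusion dynamics) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the LMD implementation: the kNN graph, the random-walk diffusion, the KL-divergence score, and the helper names `knn_random_walk` and `diffusion_score` are all choices made here, and the paper's actual scoring may differ.

```python
import numpy as np

def knn_random_walk(X, k=3):
    """Random-walk transition matrix and node degrees of a symmetrised kNN graph over cells."""
    n = X.shape[0]
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # pairwise distances
    np.fill_diagonal(d, np.inf)                                # no self-loops
    nbrs = np.argsort(d, axis=1)[:, :k]                        # k nearest cells per cell
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    A = np.maximum(A, A.T)                                     # make the graph undirected
    degree = A.sum(axis=1)
    return A / degree[:, None], degree

def diffusion_score(P, degree, expr, steps=8):
    """Toy score: KL divergence from the degree-proportional stationary distribution, summed over steps."""
    pi = degree / degree.sum()
    p = expr / expr.sum()              # treat normalised expression as a distribution over cells
    score = 0.0
    for _ in range(steps):
        mask = p > 0
        score += float(np.sum(p[mask] * np.log(p[mask] / pi[mask])))
        p = p @ P                      # one diffusion step on the cell graph
    return score

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, size=(25, 2)),    # toy large population
               rng.normal(10.0, 0.5, size=(5, 2))])   # toy small population
P, degree = knn_random_walk(X, k=3)

localized_gene = np.r_[np.zeros(25), np.ones(5)]      # expressed only in the small group
flat_gene = np.ones(30)                               # expressed everywhere

score_localized = diffusion_score(P, degree, localized_gene)
score_flat = diffusion_score(P, degree, flat_gene)
```

In this sketch a higher score means the expression stays concentrated for longer under diffusion, so the gene confined to the small group scores well above the uniformly expressed one.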
5
Zhang Y, Wu D, Yu T, Liu Y, Zhao C, Xue R. Prognostic value of TMTC1 in pan-cancer analysis. Heliyon 2024; 10:e38308. [PMID: 39397950 PMCID: PMC11471174 DOI: 10.1016/j.heliyon.2024.e38308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2023] [Revised: 09/20/2024] [Accepted: 09/22/2024] [Indexed: 10/15/2024] Open
Abstract
Background: Transmembrane and tetratricopeptide repeat containing 1 (TMTC1) is a recently discovered enzyme involved in the O-mannosylation of cadherins and protocadherins. It has been implicated in various types of cancer, but the overall prognostic significance of TMTC1 in pan-cancer and its potential as an immunotherapeutic target remain unclear. Methods: We applied various bioinformatics methods to investigate the potential oncogenic roles of TMTC1 using public databases. This analysis involved examining the expression, prognosis, genetic alterations, immune infiltration, immunotherapy response, drug sensitivity, and regulatory mechanisms of the TMTC1 gene in diverse cancer types. Results: In this study, we observed that TMTC1 expression is reduced in 19 types of cancer (ACC, BLCA, BRCA, CESC, COAD, ESCA, GBM, KICH, KIRC, KIRP, LAML, LUAD, LUSC, PRAD, READ, STAD, THCA, UCEC, and UCS) compared to normal tissues. Conversely, TMTC1 expression is elevated in OV and PAAD relative to normal tissues. Moreover, our analysis revealed that high expression of TMTC1 was associated with worse overall survival (OS) outcomes in patients with ACC, BLCA, COAD, GBM, KIRP, OV, STAD, and UCEC, but better OS outcomes in patients with CESC, KIRC, LUSC, and PAAD. Notably, patients with TMTC1 mutations or deep deletions demonstrated longer OS, while those with TMTC1 amplification showed shorter OS. There was a significant correlation between the expression level of TMTC1 and the infiltration of cancer-associated fibroblasts (CAFs) and endothelial cells. Using data from six real-world immunotherapy cohorts of BLCA, SKCM and RCC, we discovered that high TMTC1 expression was associated with better OS or progression-free survival (PFS). Lastly, through TMTC1-related gene enrichment analysis, some biological processes and pathways were found to be significantly enriched, such as vascular endothelial growth factor receptor signaling pathway and ECM-receptor interaction.
Conclusions: Our study demonstrates the prognostic significance of TMTC1 in pan-cancer and highlights its potential as an immunotherapeutic target.
Affiliation(s)
- Ying Zhang
- The International Peace Maternity and Child Health Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Shanghai Key Laboratory of Embryo Original Diseases, 200030, Shanghai, China
- Institute of Birth Defects and Rare Diseases, School of Medicine, Shanghai Jiao Tong University, 200030, Shanghai, China
- Dan Wu
- Department of Obstetrics and Gynecology, The First People's Hospital of Jiande, Hangzhou, China
- Tiantian Yu
- The International Peace Maternity and Child Health Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Shanghai Key Laboratory of Embryo Original Diseases, 200030, Shanghai, China
- Institute of Birth Defects and Rare Diseases, School of Medicine, Shanghai Jiao Tong University, 200030, Shanghai, China
- Yao Liu
- The International Peace Maternity and Child Health Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Shanghai Key Laboratory of Embryo Original Diseases, 200030, Shanghai, China
- Institute of Birth Defects and Rare Diseases, School of Medicine, Shanghai Jiao Tong University, 200030, Shanghai, China
- Chunbo Zhao
- Department of Obstetrics and Gynecology, The First People's Hospital of Jiande, Hangzhou, China
- Ruihong Xue
- The International Peace Maternity and Child Health Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Shanghai Key Laboratory of Embryo Original Diseases, 200030, Shanghai, China
- Institute of Birth Defects and Rare Diseases, School of Medicine, Shanghai Jiao Tong University, 200030, Shanghai, China
6
Liu Y, Zhao Z, Zeng Y, He M, Lyu Y, Yuan Q. Thermodynamics and Kinetics-Directed Regulation of Nucleic Acid-Based Molecular Recognition. SMALL METHODS 2024:e2401102. [PMID: 39392199 DOI: 10.1002/smtd.202401102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2024] [Revised: 09/28/2024] [Indexed: 10/12/2024]
Abstract
Nucleic acid-based molecular recognition plays crucial roles in various fields like biosensing and disease diagnostics. To achieve optimal detection and analysis, it is essential to regulate the response performance of nucleic acid probes or switches to match specific application requirements by regulating their thermodynamic and kinetic properties. However, the impacts of thermodynamic and kinetic theories on recognition performance are sometimes obscure, and the relevant conclusions are not intuitive. To promote the thorough understanding and rational utilization of these theories, this review focuses on the landmarks and recent advances of nucleic acid thermodynamics and kinetics and summarizes the thermodynamics- and kinetics-based strategies for regulating nucleic acid-based molecular recognition. We hope this review can provide reference and guidance for the future development and optimization of nucleic acid probes and switches, as well as for advancements in other nucleic acid-related fields.
Affiliation(s)
- Yihao Liu
- Molecular Science and Biomedicine Laboratory (MBL), State Key Laboratory of Chemo/Biosensing and Chemometrics, College of Chemistry and Chemical Engineering, Aptamer Engineering Center of Hunan Province, Hunan University, Changsha, 410082, China
- Zihan Zhao
- Molecular Science and Biomedicine Laboratory (MBL), State Key Laboratory of Chemo/Biosensing and Chemometrics, College of Chemistry and Chemical Engineering, Aptamer Engineering Center of Hunan Province, Hunan University, Changsha, 410082, China
- Yuqi Zeng
- Molecular Science and Biomedicine Laboratory (MBL), State Key Laboratory of Chemo/Biosensing and Chemometrics, College of Chemistry and Chemical Engineering, Aptamer Engineering Center of Hunan Province, Hunan University, Changsha, 410082, China
- Minze He
- Molecular Science and Biomedicine Laboratory (MBL), State Key Laboratory of Chemo/Biosensing and Chemometrics, College of Chemistry and Chemical Engineering, Aptamer Engineering Center of Hunan Province, Hunan University, Changsha, 410082, China
- Yifan Lyu
- Molecular Science and Biomedicine Laboratory (MBL), State Key Laboratory of Chemo/Biosensing and Chemometrics, College of Chemistry and Chemical Engineering, Aptamer Engineering Center of Hunan Province, Hunan University, Changsha, 410082, China
- Furong Laboratory, Changsha, 410082, China
- Quan Yuan
- Molecular Science and Biomedicine Laboratory (MBL), State Key Laboratory of Chemo/Biosensing and Chemometrics, College of Chemistry and Chemical Engineering, Aptamer Engineering Center of Hunan Province, Hunan University, Changsha, 410082, China
- Institute of Chemical Biology and Nanomedicine, College of Biology, Hunan University, Changsha, 410082, China
7
Kumari P, Kaur M, Dindhoria K, Ashford B, Amarasinghe SL, Thind AS. Advances in long-read single-cell transcriptomics. Hum Genet 2024; 143:1005-1020. [PMID: 38787419 PMCID: PMC11485027 DOI: 10.1007/s00439-024-02678-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Accepted: 05/07/2024] [Indexed: 05/25/2024]
Abstract
Long-read single-cell transcriptomics (scRNA-Seq) is revolutionizing the way we profile heterogeneity in disease. Traditional short-read scRNA-Seq methods are limited in their ability to provide complete transcript coverage, resolve isoforms, and identify novel transcripts. The scRNA-Seq protocols developed for long-read sequencing platforms overcome these limitations by enabling the characterization of full-length transcripts. Long-read scRNA-Seq techniques initially suffered from poor accuracy compared with short-read scRNA-Seq. However, with improvements in accuracy, accessibility, and cost efficiency, long reads are gaining popularity in the field of scRNA-Seq. This review details the advances in long-read scRNA-Seq, with an emphasis on library preparation protocols and downstream bioinformatics analysis tools.
Affiliation(s)
- Pallawi Kumari
- Institute of Microbial Technology, Council of Scientific and Industrial Research, Chandigarh, India
- Manmeet Kaur
- Institute of Microbial Technology, Council of Scientific and Industrial Research, Chandigarh, India
- Kiran Dindhoria
- Institute of Microbial Technology, Council of Scientific and Industrial Research, Chandigarh, India
- Bruce Ashford
- Illawarra Shoalhaven Local Health District (ISLHD), NSW Health, Wollongong, NSW, Australia
- Shanika L Amarasinghe
- Monash Biomedical Discovery Institute, Monash University, Clayton, VIC, 3800, Australia
- Walter and Eliza Hall Institute of Medical Research, 1G, Royal Parade, Parkville, VIC, 3025, Australia
- Amarinder Singh Thind
- Illawarra Shoalhaven Local Health District (ISLHD), NSW Health, Wollongong, NSW, Australia.
- The School of Chemistry and Molecular Bioscience (SCMB), University of Wollongong, Loftus St, Wollongong, NSW, 2500, Australia.
8
Zheng Y, Shang X. FindCSV: a long-read based method for detecting complex structural variations. BMC Bioinformatics 2024; 25:315. [PMID: 39342151 PMCID: PMC11439270 DOI: 10.1186/s12859-024-05937-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2024] [Accepted: 09/18/2024] [Indexed: 10/01/2024] Open
Abstract
BACKGROUND: Structural variations play a significant role in genetic diseases and evolutionary mechanisms. Extensive research has been conducted over the past decade to detect simple structural variations, leading to the development of well-established detection methods. However, recent studies have highlighted the potentially greater impact of complex structural variations on individuals compared to simple structural variations. Despite this, the field still lacks precise detection methods specifically designed for complex structural variations. Therefore, the development of a highly efficient and accurate detection method is of utmost importance. RESULTS: In response to this need, we propose a novel method called FindCSV, which leverages deep learning techniques and consensus sequences to enhance the detection of SVs using long-read sequencing data. Compared to current methods, FindCSV performs better in detecting complex and simple structural variations. CONCLUSIONS: FindCSV is a new method to detect complex and simple structural variations with reasonable accuracy in real and simulated data. The source code for the program is available at https://github.com/nwpuzhengyan/FindCSV.
Affiliation(s)
- Yan Zheng
- School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi'an, 710072, China.
- Xuequn Shang
- School of Computer Science, Northwestern Polytechnical University, West Youyi Road 127, Xi'an, 710072, China.
9
Wu C, Qin W, Lu W, Lin J, Yang H, Li C, Mao Y. Unraveling the immune landscape of lung adenocarcinoma: insights for tailoring therapeutic approaches. Discov Oncol 2024; 15:470. [PMID: 39331252 PMCID: PMC11436577 DOI: 10.1007/s12672-024-01396-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2024] [Accepted: 09/24/2024] [Indexed: 09/28/2024] Open
Abstract
Lung adenocarcinoma (LUAD), a prevalent type of non-small cell lung cancer (NSCLC), is known for its diversity and intricate tumor microenvironment (TME). Understanding the interaction between human immune-related genes (IRGs) and the TME is vital for creating accurate predictive models and specific treatments. We created a risk score based on IRGs and designed a nomogram to predict the prognosis of LUAD accurately. This involved a thorough examination of the TME and the infiltration of immune cells in both high-risk and low-risk LUAD groups. We also examined the association between characteristic genes (BIRC5 and BMP5) and immune cells, along with immune checkpoints in the TME. Our findings unveiled unique immune profiles and interactions among individuals in the high- and low-risk categories, which contribute to variations in prognosis. LUAD demonstrated significant associations between BIRC5, BMP5, immune cells, and checkpoints, suggesting their involvement in disease advancement and resistance to medication. Furthermore, by correlating our findings with a multidrug database, we identified specific LUAD patient subsets that might benefit from tailored treatments. Our study establishes a groundbreaking prognostic model for LUAD, which not only underscores the importance of the immune context in LUAD but also paves the way for advancing precision medicine strategies in this complex malignancy.
Affiliation(s)
- Changjiang Wu
- Department of Intensive Care Unit, Suzhou Kowloon Hospital, Shanghai Jiao Tong University School of Medicine, Suzhou, 215028, Jiangsu, China
- Wangshang Qin
- Genetic and Metabolic Central Laboratory, Maternal and Child Health Hospital of Guangxi Zhuang Autonomous Region, Nanning, 530003, Guangxi, China
- Wenqiang Lu
- Department of Thoracic Surgery, Suzhou Kowloon Hospital, Shanghai Jiao Tong University School of Medicine, Suzhou, 215028, Jiangsu, China
- Jingyu Lin
- Department of Science & Education, Suzhou Kowloon Hospital, Shanghai Jiao Tong University School of Medicine, Suzhou, 215028, Jiangsu, China
- Hongwei Yang
- Department of Clinical Laboratory, Suzhou BOE Hospital, Suzhou, 215028, Jiangsu, China
- Chunhong Li
- Central Laboratory, The Second Affiliated Hospital of Guilin Medical University, Guilin, 541199, Guangxi, China.
- Guangxi Health Commission Key Laboratory of Glucose and Lipid Metabolism Disorders, The Second Affiliated Hospital of Guilin Medical University, Guilin, 541199, Guangxi, China.
- Yiming Mao
- Department of Thoracic Surgery, Suzhou Kowloon Hospital, Shanghai Jiao Tong University School of Medicine, Suzhou, 215028, Jiangsu, China.
10
Su C, Lee D, Jin P, Zhang J. Cell-type-specific mapping of enhancers and target genes from single-cell multimodal data. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.09.24.614814. [PMID: 39386519 PMCID: PMC11463474 DOI: 10.1101/2024.09.24.614814] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/12/2024]
Abstract
Mapping enhancers and target genes in disease-related cell types has provided critical insights into the functional mechanisms of genetic variants identified by genome-wide association studies (GWAS). However, most existing analyses rely on bulk data or cultured cell lines, which may fail to identify cell-type-specific enhancers and target genes. Recently, single-cell multimodal data measuring both gene expression and chromatin accessibility within the same cells have enabled the inference of enhancer-gene pairs in a cell-type-specific and context-specific manner. However, this task is challenged by the data's high sparsity, sequencing depth variation, and the computational burden of analyzing a large number of enhancer-gene pairs. To address these challenges, we propose scMultiMap, a statistical method that infers enhancer-gene association from sparse multimodal counts using a joint latent-variable model. It adjusts for technical confounding, permits fast moment-based estimation and provides analytically derived p-values. In systematic analyses of blood and brain data, scMultiMap shows appropriate type I error control, high statistical power with greater reproducibility across independent datasets and stronger consistency with orthogonal data modalities. Meanwhile, its computational cost is less than 1% of existing methods. When applied to single-cell multimodal data from postmortem brain samples from Alzheimer's disease (AD) patients and controls, scMultiMap gave the highest heritability enrichment in microglia and revealed new insights into the regulatory mechanisms of AD GWAS variants in microglia.
Affiliation(s)
- Chang Su
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, USA
- Dongsoo Lee
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, USA
- Peng Jin
- Department of Human Genetics, School of Medicine, Emory University, Atlanta, GA, USA
- Jingfei Zhang
- Information Systems and Operations Management, Emory University, Atlanta, GA, USA
11
Yang K, Islas N, Jewell S, Jha A, Radens CM, Pleiss JA, Lynch KW, Barash Y, Choi PS. Machine learning-optimized targeted detection of alternative splicing. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.09.20.614162. [PMID: 39386495 PMCID: PMC11463589 DOI: 10.1101/2024.09.20.614162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/12/2024]
Abstract
RNA-sequencing (RNA-seq) is widely adopted for transcriptome analysis but has inherent biases which hinder the comprehensive detection and quantification of alternative splicing. To address this, we present an efficient targeted RNA-seq method that greatly enriches for splicing-informative junction-spanning reads. Local Splicing Variation sequencing (LSV-seq) utilizes multiplexed reverse transcription from highly scalable pools of primers anchored near splicing events of interest. Primers are designed using Optimal Prime, a novel machine learning algorithm trained on the performance of thousands of primer sequences. In experimental benchmarks, LSV-seq achieves high on-target capture rates and concordance with RNA-seq, while requiring significantly lower sequencing depth. Leveraging deep learning splicing code predictions, we used LSV-seq to target events with low coverage in GTEx RNA-seq data and newly discover hundreds of tissue-specific splicing events. Our results demonstrate the ability of LSV-seq to quantify splicing of events of interest at high-throughput and with exceptional sensitivity.
Affiliation(s)
- Kevin Yang
- Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
- Department of Pathology & Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Division of Cancer Pathobiology, The Children’s Hospital of Philadelphia, Philadelphia, PA, USA
- Nathaniel Islas
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
- San Jewell
- Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
- Anupama Jha
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Caleb M. Radens
- Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
- Jeffrey A. Pleiss
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
- Kristen W. Lynch
- Department of Biochemistry and Biophysics, University of Pennsylvania, Philadelphia, PA, USA
- Yoseph Barash
- Department of Genetics, University of Pennsylvania, Philadelphia, PA, USA
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA
- Peter S. Choi
- Department of Pathology & Laboratory Medicine, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Division of Cancer Pathobiology, The Children’s Hospital of Philadelphia, Philadelphia, PA, USA
|
12
|
Meir AY, Yun H, Hu J, Li J, Liu J, Bever A, Ratanatharathorn A, Song M, Heather Eliassen A, Chibnik L, Koenen K, Pare G, Stampfer MJ, Liang L. Cross omics risk scores of inflammation markers are associated with all-cause mortality: The Canadian Longitudinal Study on Aging. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.09.24.24313672. [PMID: 39399025 PMCID: PMC11469340 DOI: 10.1101/2024.09.24.24313672] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/15/2024]
Abstract
Inflammation is a critical component of chronic diseases, aging progression, and lifespan. Omics signatures may characterize inflammation status beyond blood biomarkers. We leveraged genetics (Polygenic-Risk-Score; PRS), metabolomics (Metabolomic-Risk-Score; MRS), and epigenetics (Epigenetic-Risk-Score; ERS) to build multi-omics-multi-marker risk scores for inflammation status represented by the level of circulating C-reactive protein (CRP), interleukin 6 (IL6), and tumor necrosis factor alpha (TNFa). We found that multi-omics risk scores generally outperformed single-omics risk scores in prediction of all-cause mortality in the Canadian Longitudinal Study on Aging. Compared with circulating inflammation biomarkers, some multi-omics risk scores had a higher HR for all-cause mortality when including both score and circulating IL6 in the same model (1-SD IL6 MRS-ERS: HR=1.77 [1.15, 2.72] vs. 1-SD circulating IL6: HR=1.11 [0.75, 1.66]; 1-SD IL6 PRS-MRS: HR=1.32 [1.21, 1.45] vs. 1-SD circulating IL6: HR=1.31 [1.12, 1.53]; 1-SD PRS-MRS-ERS: HR=1.62 [1.04, 2.53] vs. 1-SD circulating IL6: HR=1.16 [0.77, 1.74]). In the Nurses' Health Study (NHS), NHS II, and Health Professional Follow-up Study with available omics, 1-SD of IL6 PRS and 1-SD of IL6 PRS-MRS had HR=1.13 [1.00, 1.27] and HR=1.13 [1.01, 1.27], respectively, among individuals >65 years without mutual adjustment of the score and circulating IL6. Our study demonstrated that some multi-omics scores for inflammation markers may characterize important inflammation burden for an individual beyond that represented by blood biomarkers and improve our prediction capability for the aging process and lifespan.
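The hazard ratios in this abstract are reported per 1-SD increase with bracketed intervals. As a reading aid, the sketch below shows how such a figure relates to a log-scale coefficient. It assumes the intervals are 95% Wald intervals that are symmetric on the log scale, which the abstract does not state; the function names are hypothetical and are not taken from the paper.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval (assumed interval type)


def log_hr_and_se(hr, ci_low, ci_high, z=Z_95):
    """Recover the log hazard ratio and its standard error from a reported HR and CI.

    Assumes a Wald interval, exp(beta +/- z * se); this is an assumption,
    not something the abstract confirms.
    """
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


def hr_with_ci(beta, se, z=Z_95):
    """Map a log-scale coefficient and standard error back to HR and interval bounds."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Reported value for the 1-SD IL6 PRS-MRS score: HR=1.32 [1.21, 1.45]
beta, se = log_hr_and_se(1.32, 1.21, 1.45)
hr, low, high = hr_with_ci(beta, se)
```

Because published values are rounded, the round trip reproduces the reported bounds only approximately.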
|
13
|
Sashittal P, Zhang RY, Law BK, Strzalkowski A, Schmidt H, Bolondi A, Chan MM, Raphael BJ. Inferring cell differentiation maps from lineage tracing data. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.09.09.611835. [PMID: 39314473 PMCID: PMC11419031 DOI: 10.1101/2024.09.09.611835] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/25/2024]
Abstract
During development, multipotent cells differentiate through a hierarchy of increasingly restricted progenitor cell types until they realize specialized cell types. A cell differentiation map describes this hierarchy, and inferring these maps is an active area of research spanning traditional single marker lineage studies to data-driven trajectory inference methods on single-cell RNA-seq data. Recent high-throughput lineage tracing technologies profile lineages and cell types at scale, but current methods to infer cell differentiation maps from these data rely on simple models with restrictive assumptions about the developmental process. We introduce a mathematical framework for cell differentiation maps based on the concept of potency, and develop an algorithm, Carta, that infers an optimal cell differentiation map from single-cell lineage tracing data. The key insight in Carta is to balance the trade-off between the complexity of the cell differentiation map and the number of unobserved cell type transitions on the lineage tree. We show that Carta more accurately infers cell differentiation maps on both simulated and real data compared to existing methods. In models of mammalian trunk development and mouse hematopoiesis, Carta identifies important features of development that are not revealed by other methods, including convergent differentiation of specialized cell types, progenitor differentiation dynamics, and the refinement of routes of differentiation via new intermediate progenitors.
Affiliation(s)
- Palash Sashittal
- Dept. of Computer Science, Princeton University, Princeton; 08544 NJ, USA
| | - Richard Y. Zhang
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton; 08544 NJ, USA
| | - Benjamin K. Law
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton; 08544 NJ, USA
- Dept. of Molecular Biology, Princeton University, Princeton; 08544 NJ, USA
| | | | - Henri Schmidt
- Dept. of Computer Science, Princeton University, Princeton; 08544 NJ, USA
| | - Adriano Bolondi
- Dept. of Genome Regulation, Max Planck Institute for Molecular Genetics; 14195 Berlin, Germany
| | - Michelle M. Chan
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton; 08544 NJ, USA
- Dept. of Molecular Biology, Princeton University, Princeton; 08544 NJ, USA
|
14
|
Lee K, Cha H, Kim J, Jang Y, Son Y, Joe CY, Kim J, Kim J, Lee SH, Lee S. Dissecting transcriptome signals of anti-PD-1 response in lung adenocarcinoma. Sci Rep 2024; 14:21096. [PMID: 39256604 PMCID: PMC11387489 DOI: 10.1038/s41598-024-72108-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2024] [Accepted: 09/03/2024] [Indexed: 09/12/2024] Open
Abstract
Immune checkpoint blockades are actively adopted in diverse cancer types including metastatic melanoma and lung cancer. Despite durable responses in 20-30% of patients, we still lack molecular markers that could reliably predict patient response before treatment. Here we present a composite model for predicting anti-PD-1 response based on tumor mutation burden (TMB) and transcriptome sequencing data of 85 lung adenocarcinoma (LUAD) patients who received anti-PD-(L)1 treatment. We found that TMB was a good predictor (AUC = 0.81) for PD-L1 negative patients (n = 20). For PD-L1 positive patients (n = 65), we built an ensemble model of 100 XGBoost learning machines where gene expression, gene set activities and cell type composition were used as input features. The transcriptome-based models showed excellent accuracy (AUC > 0.9) and highlighted the contribution of T cell activities. Importantly, nonresponder patients with high prediction scores turned out to have high CTLA4 expression, which suggested that neoadjuvant CTLA4 combination therapy might be effective for these patients. Our data and analysis results provide valuable insights into developing biomarkers and strategies for treating LUAD patients using immune checkpoint inhibitors.
Affiliation(s)
- Kyeongmi Lee
- Department of Bio-Information Science, Ewha Womans University, Seoul, 03760, South Korea
| | - Honghui Cha
- Department of Health Sciences and Technology, Samsung Advanced Institute of Health Science and Technology, Sungkyunkwan University, Seoul, 06351, South Korea
| | - Jaewon Kim
- Ewha Research Center for Systems Biology (ERCSB), Ewha Womans University, Seoul, 03760, South Korea
| | - Yeongjun Jang
- Ewha Research Center for Systems Biology (ERCSB), Ewha Womans University, Seoul, 03760, South Korea
| | - Yelin Son
- Ewha Research Center for Systems Biology (ERCSB), Ewha Womans University, Seoul, 03760, South Korea
| | - Cheol Yong Joe
- Department of Health Sciences and Technology, Samsung Advanced Institute of Health Science and Technology, Sungkyunkwan University, Seoul, 06351, South Korea
- Division of Hematology-Oncology, Department of Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, 06351, South Korea
| | - Jaesang Kim
- Department of Life Sciences, Ewha Womans University, Seoul, 03760, South Korea
- Ewha-JAX Cancer Immunotherapy Research Center, Ewha Womans University, Seoul, 03760, South Korea
| | - Jhingook Kim
- Department of Lung Surgery, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, 06351, South Korea
| | - Se-Hoon Lee
- Department of Health Sciences and Technology, Samsung Advanced Institute of Health Science and Technology, Sungkyunkwan University, Seoul, 06351, South Korea.
- Division of Hematology-Oncology, Department of Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, 06351, South Korea.
| | - Sanghyuk Lee
- Department of Bio-Information Science, Ewha Womans University, Seoul, 03760, South Korea.
- Ewha Research Center for Systems Biology (ERCSB), Ewha Womans University, Seoul, 03760, South Korea.
- Department of Life Sciences, Ewha Womans University, Seoul, 03760, South Korea.
|
15
|
Campbell AM, Gavilan RG, Abanto Marin M, Yang C, Hauton C, van Aerle R, Martinez-Urtaza J. Evolutionary dynamics of the successful expansion of pandemic Vibrio parahaemolyticus ST3 in Latin America. Nat Commun 2024; 15:7828. [PMID: 39244587 PMCID: PMC11380683 DOI: 10.1038/s41467-024-52159-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Accepted: 08/27/2024] [Indexed: 09/09/2024] Open
Abstract
The underlying evolutionary mechanisms driving global expansions of pathogen strains are poorly understood. Vibrio parahaemolyticus is one of only two marine pathogens where variants have emerged in distinct climates globally. The success of a Vibrio parahaemolyticus clone (VpST3) in Latin America, the first spread identified outside its endemic region of tropical Asia, provided an invaluable opportunity to investigate mechanisms of VpST3 expansion into a distinct marine climate. A global collection of VpST3 isolates and novel Latin American isolates was used for evolutionary population genomics and pangenome analysis, and combined with oceanic climate data. We found a VpST3 population (LatAm-VpST3) introduced in Latin America well before the emergence of this clone in India, previously considered the onset of the VpST3 epidemic. LatAm-VpST3 underwent successful adaptation to local conditions over its evolutionary divergence from Asian VpST3 isolates, to become dominant in Latin America. Selection signatures were found in genes providing resilience to the distinct marine climate. Core genome mutations and accessory gene presences that promoted survival over long dispersals or increased environmental fitness were associated with environmental conditions. These results provide novel insights into the global expansion of this successful V. parahaemolyticus clone into regions with different climate scenarios.
Affiliation(s)
- Amy Marie Campbell
- School of Ocean and Earth Science, University of Southampton, National Oceanography Centre, Southampton, UK
- Centre for Environment, Fisheries and Aquaculture Science (CEFAS), Weymouth, UK
| | - Ronnie G Gavilan
- Centro Nacional de Salud Pública, Instituto Nacional de Salud, Lima, Peru
- Department of Genetics and Microbiology, Autonomous University of Barcelona, Barcelona, Spain
| | - Michel Abanto Marin
- Genomics and Bioinformatics Unit, Scientific and Technological Bioresource Nucleus (BIOREN), Universidad de La Frontera, Temuco, Chile
| | - Chao Yang
- The Center for Microbes, Development and Health, CAS Key Laboratory of Molecular Virology and Immunology, Shanghai Institute of Immunity and Infection, Chinese Academy of Sciences, Shanghai, China
| | - Chris Hauton
- School of Ocean and Earth Science, University of Southampton, National Oceanography Centre, Southampton, UK
| | - Ronny van Aerle
- Centre for Environment, Fisheries and Aquaculture Science (CEFAS), Weymouth, UK
| | - Jaime Martinez-Urtaza
- Centre for Environment, Fisheries and Aquaculture Science (CEFAS), Weymouth, UK.
- Department of Genetics and Microbiology, Autonomous University of Barcelona, Barcelona, Spain.
|
16
|
Williams MP, Flegontov P, Maier R, Huber CD. Testing times: disentangling admixture histories in recent and complex demographies using ancient DNA. Genetics 2024; 228:iyae110. [PMID: 39013011 PMCID: PMC11373510 DOI: 10.1093/genetics/iyae110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2024] [Revised: 04/08/2024] [Accepted: 06/11/2024] [Indexed: 07/18/2024] Open
Abstract
Our knowledge of human evolutionary history has been greatly advanced by paleogenomics. Since the 2020s, the study of ancient DNA has increasingly focused on reconstructing the recent past. However, the accuracy of paleogenomic methods in resolving questions of historical and archaeological importance amidst the increased demographic complexity and decreased genetic differentiation remains an open question. We evaluated the performance and behavior of two commonly used methods, qpAdm and the f3-statistic, on admixture inference under a diversity of demographic models and data conditions. We performed two complementary simulation approaches: first, we explored a wide demographic parameter space under four simple demographic models of varying complexities and configurations using branch-length data from two chromosomes; second, we analyzed a model of Eurasian history composed of 59 populations using whole-genome data modified with ancient DNA conditions such as SNP ascertainment, data missingness, and pseudohaploidization. We observe that population differentiation is the primary factor driving qpAdm performance. Notably, while complex gene flow histories influence which models are classified as plausible, they do not reduce overall performance. Under conditions reflective of the historical period, qpAdm most frequently identifies the true model as plausible among a small candidate set of closely related populations. To increase the utility for resolving fine-scaled hypotheses, we provide a heuristic for further distinguishing between candidate models that incorporates qpAdm model P-values and f3-statistics. Finally, we demonstrate a significant performance increase for qpAdm using whole-genome branch-length f2-statistics, highlighting the potential for improved demographic inference that could be achieved with future advancements in f-statistic estimations.
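For readers unfamiliar with the f3-statistic evaluated in this abstract, the sketch below computes its textbook form, f3(C; A, B) = mean over SNPs of (c - a)(c - b), from population allele frequencies. It is a naive illustration only: it omits the finite-sample bias correction and the standard-error estimation that practical analyses rely on, it is not the authors' pipeline, and the function name and toy frequencies are hypothetical.

```python
def f3_naive(c, a, b):
    """Naive f3(C; A, B): average of (c_i - a_i) * (c_i - b_i) over SNPs.

    c, a, b are equal-length sequences of allele frequencies for the target
    population C and the two candidate source populations A and B.
    No bias correction is applied (a simplification for illustration).
    """
    if not c or not (len(c) == len(a) == len(b)):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((ci - ai) * (ci - bi) for ci, ai, bi in zip(c, a, b)) / len(c)


# Toy allele frequencies (hypothetical values, not data from the study)
pop_a = [0.1, 0.8, 0.3, 0.6]
pop_b = [0.5, 0.2, 0.9, 0.4]

# A target built as an exact 50/50 mixture of A and B
mixed = [0.5 * x + 0.5 * y for x, y in zip(pop_a, pop_b)]

f3_mixed = f3_naive(mixed, pop_a, pop_b)  # negative: C lies between A and B
f3_self = f3_naive(pop_a, pop_a, pop_b)   # zero: every (c - a) term vanishes
```

A negative value is the classical signal that the target is a mixture of populations related to A and B; with an exact 50/50 mixture each term equals -(a - b)^2 / 4.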
Affiliation(s)
- Matthew P Williams
- Department of Biology, Pennsylvania State University, University Park, PA 16802, USA
| | - Pavel Flegontov
- Department of Biology and Ecology, University of Ostrava, Ostrava 701 03, Czechia
- Department of Human Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
| | - Robert Maier
- Department of Human Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
| | - Christian D Huber
- Department of Biology, Pennsylvania State University, University Park, PA 16802, USA
|
17
|
Wang D, Gazzara MR, Jewell S, Wales-McGrath B, Brown CD, Choi PS, Barash Y. A Deep Dive into Statistical Modeling of RNA Splicing QTLs Reveals New Variants that Explain Neurodegenerative Disease. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.09.01.610696. [PMID: 39282456 PMCID: PMC11398334 DOI: 10.1101/2024.09.01.610696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/22/2024]
Abstract
Genome-wide association studies (GWAS) have identified thousands of putative disease-causing variants with unknown regulatory effects. Efforts to connect these variants with splicing quantitative trait loci (sQTLs) have provided functional insights, yet sQTLs reported by existing methods cannot explain many GWAS signals. We show that current sQTL modeling approaches can be improved by considering alternative splicing representation, model calibration, and covariate integration. We then introduce MAJIQTL, a new pipeline for sQTL discovery. MAJIQTL includes two new statistical methods: a weighted multiple testing approach for sGene discovery and a model for sQTL effect size inference to improve variant prioritization. By applying MAJIQTL to GTEx, we find significantly more sGenes harboring sQTLs with functional significance. Notably, our analysis implicates the novel variant rs582283 in Alzheimer's disease. Using antisense oligonucleotides, we validate this variant's effect by blocking the implicated YBX3 binding site, leading to exon skipping in the gene MS4A3.
Affiliation(s)
- David Wang
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania
- Graduate Group in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania
| | - Matthew R. Gazzara
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania
- Graduate Group in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania
| | - San Jewell
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania
| | | | | | - Peter S. Choi
- Department of Pathology & Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania
- Division of Cancer Pathobiology, The Children’s Hospital of Philadelphia
| | - Yoseph Barash
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania
- Department of Computer and Information Sciences, School of Engineering, University of Pennsylvania
|
18
|
Lee AMJ, Foong MYM, Song BK, Chew FT. Genomic selection for crop improvement in fruits and vegetables: a systematic scoping review. MOLECULAR BREEDING : NEW STRATEGIES IN PLANT IMPROVEMENT 2024; 44:60. [PMID: 39267903 PMCID: PMC11391014 DOI: 10.1007/s11032-024-01497-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2024] [Accepted: 09/01/2024] [Indexed: 09/15/2024]
Abstract
To ensure the nutritional needs of an expanding global population, it is crucial to optimize the growing capabilities and breeding values of fruit and vegetable crops. While genomic selection, initially implemented in animal breeding, holds tremendous potential, its utilization in fruit and vegetable crops remains underexplored. In this systematic review, we reviewed 63 articles covering genomic selection and its applications across 25 different types of fruit and vegetable crops over the last decade. The traits examined were directly related to the edible parts of the crops and carried significant economic importance. Comparative analysis with WHO/FAO data identified potential economic drivers underlying the study focus of some crops and highlighted crops with potential for further genomic selection research and application. Factors affecting genomic selection accuracy in fruit and vegetable studies are discussed and suggestions made to assist in their implementation into plant breeding schemes. Genetic gain in fruits and vegetables can be improved by utilizing genomic selection to improve selection intensity, accuracy, and integration of genetic variation. However, the reduction of breeding cycle times may not be beneficial in crops with shorter life cycles such as leafy greens as compared to fruit trees. There is an urgent need to integrate genomic selection methods into ongoing breeding programs and assess the actual genomic estimated breeding values of progeny resulting from these breeding programs against the prediction models. Supplementary Information The online version contains supplementary material available at 10.1007/s11032-024-01497-2.
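The factors this abstract lists for genetic gain (selection intensity, accuracy, genetic variation, and breeding cycle time) correspond to the terms of the standard breeder's equation, gain per year = i * r * sigma_A / L. The sketch below is an illustration of that textbook formula, not a calculation from the review; the function name and all numeric inputs are hypothetical.

```python
def genetic_gain_per_year(intensity, accuracy, sigma_a, cycle_years):
    """Breeder's equation: expected genetic gain per year.

    intensity   -- selection intensity i (standardized selection differential)
    accuracy    -- selection accuracy r (correlation of predicted and true breeding value)
    sigma_a     -- additive genetic standard deviation, in trait units
    cycle_years -- length of one breeding cycle L, in years
    """
    if cycle_years <= 0:
        raise ValueError("cycle_years must be positive")
    return intensity * accuracy * sigma_a / cycle_years


# Hypothetical inputs for a long-cycle crop: the same selection scheme
# run with an 8-year cycle and with a 4-year cycle.
gain_long_cycle = genetic_gain_per_year(1.4, 0.6, 1.0, 8.0)
gain_short_cycle = genetic_gain_per_year(1.4, 0.6, 1.0, 4.0)
```

With the other terms held fixed, halving the cycle length doubles the expected gain per year, which is why cycle-time reduction matters most where cycles are long and there is room to shorten them.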
Affiliation(s)
- Adrian Ming Jern Lee
- Department of Biological Sciences, National University of Singapore, 14 Science Drive 4, Singapore, 117543 Republic of Singapore
- NUS Agritech Centre, National University of Singapore, 85 Science Park Dr, #01-03, Singapore, 118258 Republic of Singapore
| | - Melissa Yuin Mern Foong
- School of Science, Monash University Malaysia, Bandar Sunway, 47500 Subang Jaya, Selangor Darul Ehsan Malaysia
| | - Beng Kah Song
- School of Science, Monash University Malaysia, Bandar Sunway, 47500 Subang Jaya, Selangor Darul Ehsan Malaysia
| | - Fook Tim Chew
- Department of Biological Sciences, National University of Singapore, 14 Science Drive 4, Singapore, 117543 Republic of Singapore
- NUS Agritech Centre, National University of Singapore, 85 Science Park Dr, #01-03, Singapore, 118258 Republic of Singapore
|
19
|
Hurton MD, Miller JM, Lee MT. H3K4me2 distinguishes a distinct class of enhancers during the maternal-to-zygotic transition. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.08.26.609713. [PMID: 39253505 PMCID: PMC11383010 DOI: 10.1101/2024.08.26.609713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/11/2024]
Abstract
After egg fertilization, an initially silent embryonic genome is transcriptionally activated during the maternal-to-zygotic transition. In zebrafish, maternal vertebrate pluripotency factors Nanog, Pou5f3 (OCT4 homolog), and Sox19b (SOX2 homolog) (NPS) play essential roles in orchestrating embryonic genome activation, acting as "pioneers" that open condensed chromatin and mediate acquisition of activating histone modifications. However, some embryonic gene transcription still occurs in the absence of these factors, suggesting the existence of other mechanisms regulating genome activation. To identify chromatin signatures of these unknown pathways, we profiled the histone modification landscape of zebrafish embryos using CUT&RUN. Our regulatory map revealed two subclasses of enhancers distinguished by presence or absence of H3K4me2. Enhancers lacking H3K4me2 tend to require NPS factors for de novo activation, while enhancers bearing H3K4me2 are epigenetically bookmarked by DNA hypomethylation to recapitulate gamete activity in the embryo, independent of NPS pioneering. Thus, parallel enhancer activation pathways combine to induce transcriptional reprogramming to pluripotency in the early embryo.
Affiliation(s)
- Matthew D Hurton
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh PA 15213 U.S.A
| | - Jennifer M Miller
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh PA 15213 U.S.A
| | - Miler T Lee
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh PA 15213 U.S.A
|
20
|
Hugerth LW, Krog MC, Vomstein K, Du J, Bashir Z, Kaldhusdal V, Fransson E, Engstrand L, Nielsen HS, Schuppe-Koistinen I. Defining Vaginal Community Dynamics: daily microbiome transitions, the role of menstruation, bacteriophages, and bacterial genes. MICROBIOME 2024; 12:153. [PMID: 39160615 PMCID: PMC11331738 DOI: 10.1186/s40168-024-01870-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Accepted: 07/09/2024] [Indexed: 08/21/2024]
Abstract
BACKGROUND The composition of the vaginal microbiota during the menstrual cycle is dynamic, with some women remaining eu- or dysbiotic and others transitioning between these states. What defines these dynamics, and whether these differences are microbiome-intrinsic or mostly driven by the host is unknown. To address this, we characterized 49 healthy, young women by metagenomic sequencing of daily vaginal swabs during a menstrual cycle. We classified the dynamics of the vaginal microbiome and assessed the impact of host behavior as well as microbiome differences at the species, strain, gene, and phage levels. RESULTS Based on the daily shifts in community state types (CSTs) during a menstrual cycle, the vaginal microbiome was classified into four Vaginal Community Dynamics (VCDs) and reported in a classification tool, named VALODY: constant eubiotic, constant dysbiotic, menses-related, and unstable dysbiotic. The abundance of bacteria, phages, and bacterial gene content was compared between the four VCDs. Women with different VCDs showed significant differences in relative phage abundance and bacterial composition even when assigned to the same CST. Women with unstable VCDs had higher phage counts and were more likely dominated by L. iners. Their Gardnerella spp. strains were also more likely to harbor bacteriocin-coding genes. CONCLUSIONS The VCDs present a novel time series classification that highlights the complexity of varying degrees of vaginal dysbiosis. Knowing the differences in phage gene abundances and the genomic strains present allows a deeper understanding of the initiation and maintenance of permanent dysbiosis. Applying the VCDs to further characterize the different types of microbiome dynamics qualifies the investigation of disease and enables comparisons at individual and population levels. 
Based on our data, to be able to classify a dysbiotic sample into the accurate VCD, clinicians would need two to three mid-cycle samples and two samples during menses. In the future, it will be important to address whether transient VCDs pose a similar risk profile to persistent dysbiosis with similar clinical outcomes. This framework may aid interdisciplinary translational teams in deciphering the role of the vaginal microbiome in women's health and reproduction.
Affiliation(s)
- Luisa W Hugerth
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Husargatan 3, 75237, Uppsala, Sweden
- Department of Microbiology, Tumor and Cell Biology (MTC), Centre for Translational Microbiome Research, Karolinska Institutet, Nobels Väg 6, 17177, Stockholm, Sweden
| | - Maria Christine Krog
- The Recurrent Pregnancy Loss Unit, The Capital Region, Copenhagen University Hospitals, Rigshospitalet and Hvidovre Hospital, Blegdamsvej 9, 2100 Copenhagen and Kettegård Alle 30, 2650, Hvidovre, Denmark
- Department of Clinical Immunology, Copenhagen University Hospital, Rigshospitalet, Blegdamsvej 9, 2100, Copenhagen, Denmark
- Department of Clinical Medicine, Copenhagen University, Blegdamsvej 3B, 2200, Copenhagen, Denmark
| | - Kilian Vomstein
- The Recurrent Pregnancy Loss Unit, The Capital Region, Copenhagen University Hospitals, Rigshospitalet and Hvidovre Hospital, Blegdamsvej 9, 2100 Copenhagen and Kettegård Alle 30, 2650, Hvidovre, Denmark
- Department of Obstetrics and Gynecology, Copenhagen University Hospital, Hvidovre Hospital, Kettegård Alle 30, 2650, Hvidovre, Denmark
| | - Juan Du
- Department of Microbiology, Tumor and Cell Biology (MTC), Centre for Translational Microbiome Research, Karolinska Institutet, Nobels Väg 6, 17177, Stockholm, Sweden
| | - Zahra Bashir
- The Recurrent Pregnancy Loss Unit, The Capital Region, Copenhagen University Hospitals, Rigshospitalet and Hvidovre Hospital, Blegdamsvej 9, 2100 Copenhagen and Kettegård Alle 30, 2650, Hvidovre, Denmark
- Department of Obstetrics and Gynecology, Region Zealand, Slagelse Hospital, Fælledvej 13, 4200, Slagelse, Denmark
| | - Vilde Kaldhusdal
- Department of Medicine Solna, Division of Infectious Diseases, Karolinska Institutet, Department of Infectious Diseases, Karolinska University Hospital, Center for Molecular Medicine, Stockholm, Sweden
| | - Emma Fransson
- Department of Microbiology, Tumor and Cell Biology (MTC), Centre for Translational Microbiome Research, Karolinska Institutet, Nobels Väg 6, 17177, Stockholm, Sweden
- Department of Women's and Children's Health, Uppsala University, Dag Hammarskjölds Väg 20, 75185, Uppsala, Sweden
| | - Lars Engstrand
- Department of Microbiology, Tumor and Cell Biology (MTC), Centre for Translational Microbiome Research, Karolinska Institutet, Nobels Väg 6, 17177, Stockholm, Sweden
| | - Henriette Svarre Nielsen
- The Recurrent Pregnancy Loss Unit, The Capital Region, Copenhagen University Hospitals, Rigshospitalet and Hvidovre Hospital, Blegdamsvej 9, 2100 Copenhagen and Kettegård Alle 30, 2650, Hvidovre, Denmark.
- Department of Clinical Medicine, Copenhagen University, Blegdamsvej 3B, 2200, Copenhagen, Denmark.
- Department of Obstetrics and Gynecology, Copenhagen University Hospital, Hvidovre Hospital, Kettegård Alle 30, 2650, Hvidovre, Denmark.
| | - Ina Schuppe-Koistinen
- Department of Microbiology, Tumor and Cell Biology (MTC), Centre for Translational Microbiome Research, Karolinska Institutet, Nobels Väg 6, 17177, Stockholm, Sweden
|
21
|
Johnson OD, Paul S, Gutierrez JA, Russell WK, Ward MC. DNA damage-associated protein co-expression network in cardiomyocytes informs on tolerance to genetic variation and disease. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.08.14.607863. [PMID: 39185220 PMCID: PMC11343126 DOI: 10.1101/2024.08.14.607863] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/27/2024]
Abstract
Cardiovascular disease (CVD) is associated with both genetic variants and environmental factors. One unifying consequence of the molecular risk factors in CVD is DNA damage, which must be repaired by DNA damage response proteins. However, the impact of DNA damage on global cardiomyocyte protein abundance, and its relationship to CVD risk, remain unclear. We therefore treated induced pluripotent stem cell-derived cardiomyocytes with the DNA-damaging agent Doxorubicin (DOX) and a vehicle control, and identified 4,178 proteins that contribute to a network comprising 12 co-expressed modules and 403 hub proteins with high intramodular connectivity. Five modules correlate with DOX and represent distinct biological processes including RNA processing, chromatin regulation and metabolism. DOX-correlated hub proteins are depleted for proteins that vary in expression across individuals due to genetic variation but are enriched for proteins encoded by loss-of-function intolerant genes. While proteins associated with genetic risk for CVD, such as arrhythmia, are enriched in specific DOX-correlated modules, DOX-correlated hub proteins are not enriched for known CVD risk proteins. Instead, they are enriched among proteins that physically interact with CVD risk proteins. Our data demonstrate that DNA damage in cardiomyocytes induces diverse effects on biological processes through protein co-expression modules that are relevant for CVD, and that the level of protein connectivity in DNA damage-associated modules influences the tolerance to genetic variation.
Affiliation(s)
- Omar D. Johnson
- Biochemistry, Cellular and Molecular Biology Graduate Program, University of Texas Medical Branch, Galveston, Texas, USA
- MD-PhD Combined Degree Program, University of Texas Medical Branch, Galveston, Texas, USA
| | - Sayan Paul
- Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, Texas, USA
| | - Jose A. Gutierrez
- Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, Texas, USA
| | - William K. Russell
- Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, Texas, USA
| | - Michelle C. Ward
- Department of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, Texas, USA
|
22
|
Laperriere SM, Minch B, Weissman JL, Hou S, Yeh YC, Ignacio-Espinoza JC, Ahlgren NA, Moniruzzaman M, Fuhrman JA. Phylogenetic proximity drives temporal succession of marine giant viruses in a five-year metagenomic time-series. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.08.12.607631. [PMID: 39185240 PMCID: PMC11343133 DOI: 10.1101/2024.08.12.607631] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/27/2024]
Abstract
Nucleocytoplasmic Large DNA Viruses (NCLDVs, also called giant viruses) are widespread in marine systems and infect a broad range of microbial eukaryotes (protists). Recent biogeographic work has provided global snapshots of NCLDV diversity and community composition across the world's oceans, yet little information exists about the guiding 'rules' underpinning their community dynamics over time. We leveraged a five-year monthly metagenomic time-series to quantify the community composition of NCLDVs off the coast of Southern California and characterize these populations' temporal dynamics. NCLDVs were dominated by Algavirales (Phycodnaviruses, 59%) and Imitervirales (Mimiviruses, 36%). We identified clusters of NCLDVs with distinct classes of seasonal and non-seasonal temporal dynamics. Overall, NCLDV population abundances were often highly dynamic with a strong seasonal signal. The Imitervirales group had the highest relative abundance in the more oligotrophic late summer and fall, while Algavirales did so in winter. Generally, closely related strains had similar temporal dynamics, suggesting that evolutionary history is a key driver of the temporal niche of marine NCLDVs. However, a few closely related strains had drastically different seasonal dynamics, suggesting that while phylogenetic proximity often indicates ecological similarity, occasionally phenology can shift rapidly, possibly due to host-switching. Finally, we identified distinct functional content and possible host interactions of two major NCLDV orders, including connections of Imitervirales with primary producers like the diatom Chaetoceros and widespread marine grazers like Paraphysomonas and Spirotrichea ciliates. Together, our results reveal key insights into the season-specific effects of phylogenetically distinct giant virus communities on marine protist metabolism, biogeochemical fluxes and carbon cycling.
Affiliation(s)
- Sarah M. Laperriere
- Department of Biological Sciences, University of Southern California, Los Angeles, California, USA
| | - Benjamin Minch
- Department of Marine Biology and Ecology, Rosenstiel School of Marine, Atmospheric, and Earth Sciences, University of Miami, Miami, FL, USA
| | - JL Weissman
- Department of Biological Sciences, University of Southern California, Los Angeles, California, USA
- Department of Ecology and Evolution, Stony Brook University, Stony Brook, NY, USA
- Institute for Advanced Computational Science, Stony Brook University, Stony Brook, NY, USA
| | - Shengwei Hou
- Department of Biological Sciences, University of Southern California, Los Angeles, California, USA
- Department of Ocean Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China
| | - Yi-Chun Yeh
- Department of Biological Sciences, University of Southern California, Los Angeles, California, USA
| | | | | | - Mohammad Moniruzzaman
- Department of Marine Biology and Ecology, Rosenstiel School of Marine, Atmospheric, and Earth Sciences, University of Miami, Miami, FL, USA
| | - Jed A. Fuhrman
- Department of Biological Sciences, University of Southern California, Los Angeles, California, USA
|
23
|
Zou X, Gomez ZW, Reddy TE, Allen AS, Majoros WH. Bayesian Estimation of Allele-Specific Expression in the Presence of Phasing Uncertainty. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.08.09.607371. [PMID: 39211106 PMCID: PMC11361064 DOI: 10.1101/2024.08.09.607371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/04/2024]
Abstract
Motivation: Allele-specific expression (ASE) analyses aim to detect imbalanced expression of maternal versus paternal copies of an autosomal gene. Such allelic imbalance can result from a variety of cis-acting causes, including disruptive mutations within one copy of a gene that impact the stability of transcripts, as well as regulatory variants outside the gene that impact transcription initiation. Current methods for ASE estimation suffer from a number of shortcomings, such as relying on only one variant within a gene, assuming perfect phasing information across multiple variants within a gene, or failing to account for alignment biases and possible genotyping errors. Results: We developed BEASTIE, a Bayesian hierarchical model designed for precise ASE quantification at the gene level, based on given genotypes and RNA-Seq data. BEASTIE addresses the complexities of allelic mapping bias, genotyping error, and phasing errors by incorporating empirical phasing error rates derived from Genome-in-a-Bottle individual NA12878. BEASTIE surpasses existing methods in accuracy, especially in scenarios with high phasing errors. This improvement is critical for identifying rare genetic variants often obscured by such errors. Through rigorous validation on simulated data and application to real data from the 1000 Genomes Project, we establish the robustness of BEASTIE. These findings underscore the value of BEASTIE in revealing patterns of ASE across gene sets and pathways. Availability and Implementation: The software is freely available from https://github.com/x811zou/BEASTIE. BEASTIE is available as Python source code and as a Docker image. Supplementary information: Additional information is available online.
|
24
|
Trang KB, Sharma P, Cook L, Mount Z, Thomas RM, Kulkarni NN, Pahl MC, Pippin JA, Su C, Kaestner KH, O'Brien JM, Wagley Y, Hankenson KD, Jermusyk A, Hoskins JW, Amundadottir LT, Xu M, Brown KM, Anderson SA, Yang W, Titchenell PM, Seale P, Zemel BS, Chesi A, Romberg N, Levings MK, Grant SFA, Wells AD. 3D chromatin-based variant-to-gene maps across 57 human cell types reveal the cellular and genetic architecture of autoimmune disease susceptibility. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.08.12.24311676. [PMID: 39185517 PMCID: PMC11343244 DOI: 10.1101/2024.08.12.24311676] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/27/2024]
Abstract
A portion of the genetic basis for many common autoimmune disorders has been uncovered by genome-wide association studies (GWAS), but GWAS do not reveal causal variants, effector genes, or the cell types impacted by disease-associated variation. We have generated 3D genomic datasets consisting of promoter-focused Capture-C, Hi-C, ATAC-seq, and RNA-seq and integrated these data with GWAS of 16 autoimmune traits to physically map disease-associated variants to the effector genes they likely regulate in 57 human cell types. These 3D maps of gene cis-regulatory architecture are highly powered to identify the cell types most likely impacted by disease-associated genetic variation compared to 1D genomic features, and tend to implicate different effector genes than eQTL approaches in the same cell types. Most of the variants implicated by these cis-regulatory architectures are highly trait-specific, but nearly half of the target genes connected to these variants are shared across multiple autoimmune disorders in multiple cell types, suggesting a high level of genetic diversity and complexity among autoimmune diseases that nonetheless converge at the level of target gene and cell type. Substantial effector gene sharing led to the common enrichment of similar biological networks across disease and cell types. However, trait-specific pathways representing potential areas for disease-specific intervention were identified. To test this, we pharmacologically validated squalene synthase, a cholesterol biosynthetic enzyme encoded by the FDFT1 gene implicated by our approach in MS and SLE, as a novel immunomodulatory drug target controlling inflammatory cytokine production by human T cells. These data represent a comprehensive resource for basic discovery of gene cis-regulatory mechanisms, and the analyses reported reveal mechanisms by which autoimmune-associated variants act to regulate gene expression, function, and pathology across multiple, distinct tissues and cell types.
Affiliation(s)
- Khanh B Trang
- Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Division of Human Genetics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Prabhat Sharma
- Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Pathology, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Laura Cook
- Department of Microbiology and Immunology, University of Melbourne, at the Peter Doherty Institute for Infection and Immunity, Melbourne, VIC, Australia
- Department of Critical Care, Melbourne Medical School, University of Melbourne, Melbourne, VIC, Australia
- Division of Infectious Diseases, Department of Medicine, University of British Columbia, Vancouver, BC, Canada
| | - Zachary Mount
- Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Pathology, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Rajan M Thomas
- Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Pathology, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Nikhil N Kulkarni
- Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Pathology, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Matthew C Pahl
- Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Division of Human Genetics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - James A Pippin
- Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Division of Human Genetics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Chun Su
- Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Division of Human Genetics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Klaus H Kaestner
- Institute for Diabetes, Obesity and Metabolism, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Joan M O'Brien
- Scheie Eye Institute, Department of Ophthalmology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, PA, USA
- Penn Medicine Center for Ophthalmic Genetics in Complex Disease
| | - Yadav Wagley
- Department of Orthopedic Surgery, University of Michigan Medical School, Ann Arbor, MI, USA
| | - Kurt D Hankenson
- Department of Orthopedic Surgery, University of Michigan Medical School, Ann Arbor, MI, USA
| | - Ashley Jermusyk
- Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
| | - Jason W Hoskins
- Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
| | - Laufey T Amundadottir
- Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
| | - Mai Xu
- Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
| | - Kevin M Brown
- Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
| | - Stewart A Anderson
- Department of Child and Adolescent Psychiatry, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Wenli Yang
- Institute for Diabetes, Obesity and Metabolism, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Department of Cell and Developmental Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Paul M Titchenell
- Institute for Diabetes, Obesity and Metabolism, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Department of Physiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Patrick Seale
- Institute for Diabetes, Obesity and Metabolism, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Department of Cell and Developmental Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Babette S Zemel
- Division of Gastroenterology, Hepatology, and Nutrition, Children's Hospital of Philadelphia, PA, USA
- Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Alessandra Chesi
- Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Division of Human Genetics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Pathology, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Neil Romberg
- Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Division of Allergy and Immunology, Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Institute for Immunology and Immune Health, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
| | - Megan K Levings
- Department of Surgery, University of British Columbia, Vancouver, BC, Canada
- BC Children's Hospital Research Institute, Vancouver, BC, Canada
- School of Biomedical Engineering, University of British Columbia, Vancouver, BC, Canada
| | - Struan F A Grant
- Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Division of Human Genetics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Institute for Diabetes, Obesity and Metabolism, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Division of Endocrinology and Diabetes, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Andrew D Wells
- Center for Spatial and Functional Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Department of Pathology, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Institute for Immunology and Immune Health, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Department of Pathology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
|
25
|
Zhao X, Chen Z, Wang H, Sun H. Occlusion enhanced pan-cancer classification via deep learning. BMC Bioinformatics 2024; 25:260. [PMID: 39118043 PMCID: PMC11308240 DOI: 10.1186/s12859-024-05870-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Accepted: 07/12/2024] [Indexed: 08/10/2024] Open
Abstract
Quantitative measurement of RNA expression levels through RNA-Seq is an ideal replacement for conventional cancer diagnosis via microscope examination. Currently, cancer-related RNA-Seq studies focus on two aspects: classifying the status and tissue of origin of a sample and discovering marker genes. Existing studies typically identify marker genes by statistically comparing healthy and cancer samples. However, this approach overlooks marker genes with low expression level differences and may be influenced by experimental results. This paper introduces "GENESO," a novel framework for pan-cancer classification and marker gene discovery using the occlusion method in conjunction with deep learning. We first trained a baseline deep LSTM neural network capable of distinguishing the origins and statuses of samples utilizing RNA-Seq data. Then, we propose a novel marker gene discovery method called "Symmetrical Occlusion (SO)". It collaborates with the baseline LSTM network, mimicking the "gain of function" and "loss of function" of genes to quantitatively evaluate their importance in pan-cancer classification. By identifying the genes of utmost importance, we then isolate them to train new neural networks, resulting in higher-performance LSTM models that utilize only a reduced set of highly relevant genes. The baseline neural network achieves an impressive validation accuracy of 96.59% in pan-cancer classification. With the help of SO, the accuracy of the second network reaches 98.30%, while using 67% fewer genes. Notably, our method excels in identifying marker genes that are not differentially expressed. Moreover, we assessed the feasibility of our method using single-cell RNA-Seq data, employing known marker genes as a validation test.
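The occlusion idea in this abstract can be illustrated with a minimal sketch. This is not the GENESO implementation: the function name `symmetric_occlusion_scores`, the use of per-gene minimum and maximum values as the "loss" and "gain" fills, and the accuracy-drop score are all assumptions made for illustration, and `predict` stands in for any trained classifier (the paper uses an LSTM).

```python
import numpy as np

def symmetric_occlusion_scores(predict, X, y):
    """Score each feature (gene) by the mean accuracy drop when it is occluded.

    Each gene is overwritten twice for all samples: once with its minimum
    observed value (a stand-in for "loss of function") and once with its
    maximum (a stand-in for "gain of function"). Both fills are assumptions.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    low, high = X.min(axis=0), X.max(axis=0)
    base_acc = np.mean(predict(X) == y)      # accuracy on unmodified data
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drop = 0.0
        for fill in (low[j], high[j]):
            X_occ = X.copy()
            X_occ[:, j] = fill               # occlude gene j in every sample
            drop += base_acc - np.mean(predict(X_occ) == y)
        scores[j] = drop / 2.0               # average of the two directions
    return scores

# Toy usage: a hypothetical classifier that only looks at gene 0.
toy_predict = lambda X: (X[:, 0] > 0.5).astype(int)
X_toy = np.array([[0.0, 5.0], [1.0, 3.0], [0.0, 1.0], [1.0, 7.0]])
y_toy = np.array([0, 1, 0, 1])
toy_scores = symmetric_occlusion_scores(toy_predict, X_toy, y_toy)
```

In the toy example only gene 0 receives a non-zero score, because occluding gene 1 never changes the classifier's output. Genes ranked highly by such a score would be the candidates for retraining a smaller model.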
Grants
- 14106521 General Research Funds (GRF) from the Research Grants Council (RGC), University Grants Committee of the Hong Kong Special Administrative Region, China.
- 14100620 General Research Funds (GRF) from the Research Grants Council (RGC), University Grants Committee of the Hong Kong Special Administrative Region, China.
- 14105823 General Research Funds (GRF) from the Research Grants Council (RGC), University Grants Committee of the Hong Kong Special Administrative Region, China.
- 14115319 General Research Funds (GRF) from the Research Grants Council (RGC), University Grants Committee of the Hong Kong Special Administrative Region, China.
- 2141109 General Research Funds (GRF) from the Research Grants Council (RGC), University Grants Committee of the Hong Kong Special Administrative Region, China.
- 2141157 General Research Funds (GRF) from the Research Grants Council (RGC), University Grants Committee of the Hong Kong Special Administrative Region, China.
- 2141261 General Research Funds (GRF) from the Research Grants Council (RGC), University Grants Committee of the Hong Kong Special Administrative Region, China.
- 14105123 General Research Funds (GRF) from the Research Grants Council (RGC), University Grants Committee of the Hong Kong Special Administrative Region, China.
- 14103522 General Research Funds (GRF) from the Research Grants Council (RGC), University Grants Committee of the Hong Kong Special Administrative Region, China.
- 14120420 General Research Funds (GRF) from the Research Grants Council (RGC), University Grants Committee of the Hong Kong Special Administrative Region, China.
- 14120619 General Research Funds (GRF) from the Research Grants Council (RGC), University Grants Committee of the Hong Kong Special Administrative Region, China.
Affiliation(s)
- Xing Zhao
- Department of Orthopaedics and Traumatology, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
- Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, Guangdong, People's Republic of China
| | - Zigui Chen
- Department of Microbiology, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
| | - Huating Wang
- Department of Orthopaedics and Traumatology, Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Hong Kong, People's Republic of China
| | - Hao Sun
- Warshel Institute for Computational Biology, The Chinese University of Hong Kong, Shenzhen, Guangdong, People's Republic of China.
- Department of Chemical Pathology, Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Hong Kong, People's Republic of China.
|
26
|
Crestani C, Forde TL, Bell J, Lycett SJ, Oliveira LMA, Pinto TCA, Cobo-Ángel CG, Ceballos-Márquez A, Phuoc NN, Sirimanapong W, Chen SL, Jamrozy D, Bentley SD, Fontaine M, Zadoks RN. Genomic and functional determinants of host spectrum in Group B Streptococcus. PLoS Pathog 2024; 20:e1012400. [PMID: 39133742 PMCID: PMC11341095 DOI: 10.1371/journal.ppat.1012400] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2024] [Revised: 08/22/2024] [Accepted: 07/08/2024] [Indexed: 08/24/2024] Open
Abstract
Group B Streptococcus (GBS) is a major human and animal pathogen that threatens public health and food security. Spill-over and spill-back between host species is possible due to adaptation and amplification of GBS in new niches but the evolutionary and functional mechanisms underpinning those phenomena are poorly known. Based on analysis of 1,254 curated genomes from all major GBS host species and six continents, we found that the global GBS population comprises host-generalist, host-adapted and host-restricted sublineages, which are found across host groups, preferentially within one host group, or exclusively within one host group, respectively, and show distinct levels of recombination. Strikingly, the association of GBS genomes with the three major host groups (humans, cattle, fish) is driven by a single accessory gene cluster per host, regardless of sublineage or the breadth of host spectrum. Moreover, those gene clusters are shared with other streptococcal species occupying the same niche and are functionally relevant for host tropism. Our findings demonstrate (1) the heterogeneity of genome plasticity within a bacterial species of public health importance, enabling the identification of high-risk clones; (2) the contribution of inter-species gene transmission to the evolution of GBS; and (3) the importance of considering the role of animal hosts, and the accessory gene pool associated with their microbiota, in the evolution of multi-host bacterial pathogens. Collectively, these phenomena may explain the adaptation and clonal expansion of GBS in animal reservoirs and the risk of spill-over and spill-back between animals and humans.
Affiliation(s)
- Chiara Crestani
- Institute of Biodiversity, Animal Health & Comparative Medicine, University of Glasgow, Glasgow, Scotland, United Kingdom
| | - Taya L. Forde
- Institute of Biodiversity, Animal Health & Comparative Medicine, University of Glasgow, Glasgow, Scotland, United Kingdom
| | - John Bell
- Moredun Research Institute, Penicuik, Scotland, United Kingdom
| | - Samantha J. Lycett
- The Roslin Institute, University of Edinburgh, Midlothian, Scotland, United Kingdom
| | - Laura M. A. Oliveira
- Instituto de Microbiologia Paulo de Goes, Federal University of Rio de Janeiro, Rio de Janeiro, State of Rio de Janeiro, Brazil
| | - Tatiana C. A. Pinto
- Instituto de Microbiologia Paulo de Goes, Federal University of Rio de Janeiro, Rio de Janeiro, State of Rio de Janeiro, Brazil
| | | | | | - Nguyen N. Phuoc
- Faculty of Fisheries, University of Agriculture and Forestry, Hue University, Hue, Vietnam
| | - Wanna Sirimanapong
- Faculty of Veterinary Science, Mahidol University, Nakhon Pathom, Thailand
| | - Swaine L. Chen
- Infectious Diseases Translational Research Programme, Department of Medicine, Division of Infectious Diseases, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Laboratory of Bacterial Genomics, Genome Institute of Singapore, Singapore
| | - Dorota Jamrozy
- Parasites and Microbes Programme, Wellcome Sanger Institute, Hinxton, England, United Kingdom
| | - Stephen D. Bentley
- Parasites and Microbes Programme, Wellcome Sanger Institute, Hinxton, England, United Kingdom
| | | | - Ruth N. Zadoks
- Institute of Biodiversity, Animal Health & Comparative Medicine, University of Glasgow, Glasgow, Scotland, United Kingdom
- Moredun Research Institute, Penicuik, Scotland, United Kingdom
- Sydney School of Veterinary Science, Faculty of Science, University of Sydney, Camden, NSW, Australia
|
27
|
Zhang L, Zhou XM, Mallory X. SCCNAInfer: a robust and accurate tool to infer the absolute copy number on scDNA-seq data. BIOINFORMATICS (OXFORD, ENGLAND) 2024; 40:btae454. [PMID: 39067018 DOI: 10.1093/bioinformatics/btae454] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2024] [Revised: 06/13/2024] [Accepted: 07/26/2024] [Indexed: 07/30/2024]
Abstract
Motivation: Copy number alterations (CNAs) play an important role in disease progression, especially in cancer. Single-cell DNA sequencing (scDNA-seq) facilitates the detection of CNAs of each cell that is sequenced at a shallow and uneven coverage. However, the state-of-the-art CNA detection tools based on scDNA-seq are still subject to genome-wide errors due to the wrong estimation of the ploidy. Results: We developed SCCNAInfer, a computational tool that utilizes the subclonal signal inside the tumor cells to more accurately infer each cell's ploidy and CNAs. Given the segmentation result of an existing CNA detection method, SCCNAInfer clusters the cells, infers the ploidy of each subclone, refines the read count by bin clustering, and accurately infers the CNAs for each cell. Both simulated and real datasets show that SCCNAInfer consistently improves upon the state-of-the-art CNA detection tools such as Aneufinder, Ginkgo, SCOPE and SeCNV. Availability and Implementation: SCCNAInfer is freely available at https://github.com/compbio-mallory/SCCNAInfer. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Liting Zhang
- Department of Computer Science, Florida State University, Florida 32304, USA
| | - Xin Maizie Zhou
- Department of Biomedical Engineering, Vanderbilt University, Tennessee 37235, USA
| | - Xian Mallory
- Department of Computer Science, Florida State University, Florida 32304, USA
|
28
|
Wang Y, Yao J, Zhang Z, Wei L, Wang S. Generation of novel lipid metabolism-based signatures to predict prognosis and immunotherapy response for colorectal adenocarcinoma. Sci Rep 2024; 14:17158. [PMID: 39060344 PMCID: PMC11282063 DOI: 10.1038/s41598-024-67549-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Accepted: 07/12/2024] [Indexed: 07/28/2024] Open
Abstract
Lipid metabolism reprogramming is involved in epithelial-mesenchymal transition (EMT), cancer stemness and immune checkpoints (ICs), which influence the metastasis of cancer. This study aimed to generate lipid metabolism-based signatures to predict prognosis and immunotherapy and chemotherapy response for colorectal adenocarcinoma (COAD). Transcriptome data and clinical information of COAD patients were collected from The Cancer Genome Atlas (TCGA) database. The expression of EMT-, stem cell-, and IC-related genes was compared between COAD and control samples. Modules and genes correlated with EMT, IC and stemness signatures were identified through weighted gene co-expression network analysis (WGCNA). Prognostic signatures were generated, and the distribution of risk genes was then evaluated using single-cell RNA sequencing (scRNA-seq) data from the GSE132465 dataset. COAD patients exhibited increased EMT score and stemness along with decreased ICs. Next, 12 hub genes (PIK3CG, ALOX5AP, PIK3R5, TNFAIP8L2, DPEP2, PIK3CD, PIK3R6, GGT5, ELOVL4, PTGIS, CYP7B1 and PRKD1) were found within the green and yellow modules correlated with EMT, stemness and ICs. Lipid metabolism-based prognostic signatures were generated based on PIK3CG, GGT5 and PTGIS. Patients in the high-risk group had poor prognosis, elevated ESTIMATEScore and StromalScore, a 100% mutation rate and a higher TIDE score. Samples in the low-risk group had more immunogenicity on ICIs. Notably, PIK3CG was expressed in B cells, while GGT5 and PTGIS were expressed in stromal cells. This study generates lipid metabolism-based signatures correlated with EMT, stemness and ICs for predicting prognosis of COAD, and provides potential therapeutic targets for immunotherapy in COAD.
Affiliation(s)
- Yi Wang
- Department of Oncology and Hematology, Suzhou Kowloon Hospital, Shanghai Jiao Tong University School of Medicine, Suzhou, 215127, China
| | - Jun Yao
- Department of General Surgery, The Fourth Affiliated Hospital of Soochow University, Suzhou, 215127, China
| | - Zhe Zhang
- Department of General Surgery, The Fourth Affiliated Hospital of Soochow University, Suzhou, 215127, China
| | - Luxin Wei
- Department of General Surgery, The Fourth Affiliated Hospital of Soochow University, Suzhou, 215127, China
| | - Sheng Wang
- Department of General Surgery, The Fourth Affiliated Hospital of Soochow University, Suzhou, 215127, China.
|
29
|
Lause J, Ziegenhain C, Hartmanis L, Berens P, Kobak D. Compound models and Pearson residuals for single-cell RNA-seq data without UMIs. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.08.02.551637. [PMID: 37577688 PMCID: PMC10418209 DOI: 10.1101/2023.08.02.551637] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2023]
Abstract
Recent work employed Pearson residuals from Poisson or negative binomial models to normalize UMI data. To extend this approach to non-UMI data, we model the additional amplification step with a compound distribution: we assume that sequenced RNA molecules follow a negative binomial distribution, and are then replicated following an amplification distribution. We show how this model leads to compound Pearson residuals, which yield meaningful gene selection and embeddings of Smart-seq2 datasets. Further, we suggest that amplification distributions across several sequencing protocols can be described by a broken power law. The resulting compound model captures previously unexplained overdispersion and zero-inflation patterns in non-UMI data.
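As a rough illustration of how an amplification step changes Pearson residuals, the sketch below uses a generic compound-sum argument rather than the authors' code. If a negative binomial number of molecules (overdispersion `theta`) is each replicated into an independent copy number Z, the law of total variance gives Var = alpha * mu + mu^2 / theta with alpha = E[Z^2] / E[Z]. The function name, the default parameter values, and the estimate of mu from row and column sums are assumptions for illustration; setting `alpha = 1` recovers the usual negative binomial residuals for UMI data.

```python
import numpy as np

def compound_pearson_residuals(counts, theta=100.0, alpha=1.0):
    """Pearson residuals under a compound (amplified) negative binomial model.

    counts : cells x genes array of read counts
    theta  : NB overdispersion of the pre-amplification molecule counts
    alpha  : E[Z^2] / E[Z] of the amplification distribution (assumed known)
    """
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    cell_sums = counts.sum(axis=1, keepdims=True)
    gene_sums = counts.sum(axis=0, keepdims=True)
    mu = cell_sums * gene_sums / total          # expected count per entry
    var = alpha * mu + mu ** 2 / theta          # compound-model variance
    residuals = np.zeros_like(counts)
    # Entries with zero variance (all-zero genes or cells) keep residual 0.
    np.divide(counts - mu, np.sqrt(var), out=residuals, where=var > 0)
    return residuals

demo_counts = np.array([[1, 2], [3, 4]])
demo_umi = compound_pearson_residuals(demo_counts, theta=100.0, alpha=1.0)
demo_amplified = compound_pearson_residuals(demo_counts, theta=100.0, alpha=50.0)
```

A larger `alpha` inflates the expected variance, so the same deviation from `mu` yields a smaller residual; this is the sense in which amplification noise is accounted for before gene selection or embedding.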
|
30
|
Trang KB, Chesi A, Toikumo S, Pippin JA, Pahl MC, O’Brien JM, Amundadottir LT, Brown KM, Yang W, Welles J, Santoleri D, Titchenell PM, Seale P, Zemel BS, Wagley Y, Hankenson KD, Kaestner KH, Anderson SA, Kayser MS, Wells AD, Kranzler HR, Kember RL, Grant SF. Shared and unique 3D genomic features of substance use disorders across multiple cell types. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.07.18.24310649. [PMID: 39072016 PMCID: PMC11275669 DOI: 10.1101/2024.07.18.24310649] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/30/2024]
Abstract
Recent genome-wide association studies (GWAS) have revealed shared genetic components among alcohol, opioid, tobacco and cannabis use disorders. However, the extent of the underlying shared causal variants and effector genes, along with their cellular context, remains unclear. We leveraged our existing 3D genomic datasets comprising high-resolution promoter-focused Capture-C/Hi-C, ATAC-seq and RNA-seq across >50 diverse human cell types to focus on genomic regions that coincide with GWAS loci. Using stratified LD regression, we determined the proportion of genome-wide SNP heritability attributable to the features assayed across our cell types by integrating recent GWAS summary statistics for the relevant traits: alcohol use disorder (AUD), tobacco use disorder (TUD), opioid use disorder (OUD) and cannabis use disorder (CanUD). Statistically significant enrichments (P<0.05) were observed in 14 specific cell types, with heritability enrichment reaching 9.2-fold for iPSC-derived cortical neurons and neural progenitors, confirming that they are crucial cell types for further functional exploration. Additionally, several pancreatic cell types, notably pancreatic beta cells, showed enrichment for TUD, with heritability enrichments up to 4.8-fold, suggesting genomic overlap with metabolic processes. Further investigation revealed significant positive genetic correlations between type 2 diabetes (T2D) and both TUD and CanUD (FDR<0.05) and a significant negative genetic correlation between T2D and AUD. Interestingly, after partitioning the heritability for each cell type's cis-regulatory elements, the correlation between T2D and TUD for pancreatic beta cells was greater (r=0.2) than the global genetic correlation value. Our study provides new genomic insights into substance use disorders and implicates cell types where functional follow-up studies could reveal causal variant-gene mechanisms underpinning these disorders.
Affiliation(s)
- Khanh B. Trang
  - Center for Spatial and Functional Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
  - Division of Human Genetics, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
- Alessandra Chesi
  - Center for Spatial and Functional Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
  - Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Sylvanus Toikumo
  - Mental Illness Research, Education and Clinical Center, Crescenz Veterans Affairs Medical Center, Philadelphia, PA, USA
  - Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- James A. Pippin
  - Center for Spatial and Functional Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
  - Division of Human Genetics, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
- Matthew C. Pahl
  - Center for Spatial and Functional Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
  - Division of Human Genetics, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
- Joan M. O’Brien
  - Scheie Eye Institute, Department of Ophthalmology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, PA, USA
  - Penn Medicine Center for Ophthalmic Genetics in Complex Disease, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, PA, USA
- Laufey T. Amundadottir
  - Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Kevin M. Brown
  - Laboratory of Translational Genomics, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, USA
- Wenli Yang
  - Institute for Diabetes, Obesity and Metabolism, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Cell and Developmental Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Jaclyn Welles
  - Institute for Diabetes, Obesity and Metabolism, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Physiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Dominic Santoleri
  - Institute for Diabetes, Obesity and Metabolism, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Physiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Paul M. Titchenell
  - Institute for Diabetes, Obesity and Metabolism, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Physiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Patrick Seale
  - Institute for Diabetes, Obesity and Metabolism, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Cell and Developmental Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Babette S. Zemel
  - Division of Gastroenterology, Hepatology, and Nutrition, Children’s Hospital of Philadelphia, PA, USA
  - Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Yadav Wagley
  - Department of Orthopedic Surgery, University of Michigan Medical School Ann Arbor, MI, USA
- Kurt D. Hankenson
  - Department of Orthopedic Surgery, University of Michigan Medical School Ann Arbor, MI, USA
- Klaus H. Kaestner
  - Institute for Diabetes, Obesity and Metabolism, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Stewart A. Anderson
  - Department of Child and Adolescent Psychiatry, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
  - Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Matthew S. Kayser
  - Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Neuroscience, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
  - Chronobiology Sleep Institute, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Andrew D. Wells
  - Center for Spatial and Functional Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
  - Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Institute for Immunology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Henry R. Kranzler
  - Mental Illness Research, Education and Clinical Center, Crescenz Veterans Affairs Medical Center, Philadelphia, PA, USA
  - Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Rachel L. Kember
  - Mental Illness Research, Education and Clinical Center, Crescenz Veterans Affairs Medical Center, Philadelphia, PA, USA
  - Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Struan F.A. Grant
  - Center for Spatial and Functional Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
  - Division of Human Genetics, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
  - Institute for Diabetes, Obesity and Metabolism, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Division of Endocrinology and Diabetes, The Children’s Hospital of Philadelphia, Philadelphia, PA, USA
31
Le Saux O, Ardin M, Berthet J, Barrin S, Bourhis M, Cinier J, Lounici Y, Treilleux I, Just PA, Bataillon G, Savoye AM, Mouret-Reynier MA, Coquan E, Derbel O, Jeay L, Bouizaguen S, Labidi-Galy I, Tabone-Eglinger S, Ferrari A, Thomas E, Ménétrier-Caux C, Tartour E, Galy-Fauroux I, Stern MH, Terme M, Caux C, Dubois B, Ray-Coquard I. Immunomic longitudinal profiling of the NeoPembrOv trial identifies drivers of immunoresistance in high-grade ovarian carcinoma. Nat Commun 2024; 15:5932. [PMID: 39013886 PMCID: PMC11252308 DOI: 10.1038/s41467-024-47000-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/25/2023] [Accepted: 03/18/2024] [Indexed: 07/18/2024] Open
Abstract
PD-1/PD-L1 blockade has so far shown limited survival benefit for high-grade ovarian carcinomas. By using paired samples from the NeoPembrOv randomized phase II trial (NCT03275506), for which primary outcomes are published, and by combining RNA-seq and multiplexed immunofluorescence staining, we explore the impact of NeoAdjuvant ChemoTherapy (NACT) ± Pembrolizumab (P) on the tumor environment, and identify parameters that correlate with response to immunotherapy as a pre-planned exploratory analysis. Indeed, i) combination therapy results in a significant increase in intraepithelial CD8+PD-1+ T cells, ii) combining endothelial and monocyte gene signatures with the CD8B/FOXP3 expression ratio is predictive of response to NACT + P with an area under the curve of 0.93 (95% CI 0.85-1.00) and iii) high CD8B/FOXP3 and high CD8B/ENTPD1 ratios are significantly associated with positive response to NACT + P, while KDR and VEGFR2 expression are associated with resistance. These results indicate that targeting regulatory T cells and endothelial cells, especially VEGFR2+ endothelial cells, could overcome immune resistance of ovarian cancers.
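To make the reported area under the curve concrete, the sketch below computes an AUC as the probability that a randomly chosen responder receives a higher predictor score than a randomly chosen non-responder (the Mann-Whitney formulation, with ties counted as one half). The scores and labels are invented for illustration and are not trial data; the function name is an assumption.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # A correctly ordered pair counts 1, a tie counts 0.5, otherwise 0.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical predictor scores; label 1 = responder, 0 = non-responder.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.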
Affiliation(s)
- Olivia Le Saux
  - "Cancer Immune Surveillance and Therapeutic Targeting" Laboratory, Cancer Research Center of Lyon, INSERM 1052-CNRS 5286, Centre Léon Bérard, Université de Lyon, Université Claude Bernard Lyon 1, 69008, Lyon, France
  - Lyon University, Université Claude Bernard Lyon 1, Centre Léon Bérard, 69008, Lyon, France
  - National Investigators Group for Ovarian and Breast Cancer Studies, Paris, France
  - Department of Medical Oncology, Centre Léon Bérard, 69008, Lyon, France
- Maude Ardin
  - "Cancer Immune Surveillance and Therapeutic Targeting" Laboratory, Cancer Research Center of Lyon, INSERM 1052-CNRS 5286, Centre Léon Bérard, Université de Lyon, Université Claude Bernard Lyon 1, 69008, Lyon, France
  - Lyon University, Université Claude Bernard Lyon 1, Centre Léon Bérard, 69008, Lyon, France
- Justine Berthet
  - "Cancer Immune Surveillance and Therapeutic Targeting" Laboratory, Cancer Research Center of Lyon, INSERM 1052-CNRS 5286, Centre Léon Bérard, Université de Lyon, Université Claude Bernard Lyon 1, 69008, Lyon, France
  - Lyon University, Université Claude Bernard Lyon 1, Centre Léon Bérard, 69008, Lyon, France
  - Lyon Immunotherapy for Cancer Laboratory (LICL), Cancer Research Center of Lyon, Centre Léon Bérard, 69008, Lyon, France
- Sarah Barrin
  - Lyon Immunotherapy for Cancer Laboratory (LICL), Cancer Research Center of Lyon, Centre Léon Bérard, 69008, Lyon, France
- Morgane Bourhis
  - Université Paris Cité, Inserm, PARCC, F-75015, Paris, France
- Justine Cinier
  - "Cancer Immune Surveillance and Therapeutic Targeting" Laboratory, Cancer Research Center of Lyon, INSERM 1052-CNRS 5286, Centre Léon Bérard, Université de Lyon, Université Claude Bernard Lyon 1, 69008, Lyon, France
  - Lyon University, Université Claude Bernard Lyon 1, Centre Léon Bérard, 69008, Lyon, France
- Yasmine Lounici
  - "Cancer Immune Surveillance and Therapeutic Targeting" Laboratory, Cancer Research Center of Lyon, INSERM 1052-CNRS 5286, Centre Léon Bérard, Université de Lyon, Université Claude Bernard Lyon 1, 69008, Lyon, France
  - Lyon University, Université Claude Bernard Lyon 1, Centre Léon Bérard, 69008, Lyon, France
- Guillaume Bataillon
  - Department of Anatomopathology, University hospital of Toulouse, Toulouse, France
- Aude-Marie Savoye
  - National Investigators Group for Ovarian and Breast Cancer Studies, Paris, France
  - Department of Medical Oncology, Institut Jean Godinot, Reims, France
- Marie-Ange Mouret-Reynier
  - National Investigators Group for Ovarian and Breast Cancer Studies, Paris, France
  - Department of Medical Oncology, Centre Jean Perrin, Clermont-Ferrand, France
- Elodie Coquan
  - National Investigators Group for Ovarian and Breast Cancer Studies, Paris, France
  - Department of Medical Oncology, Centre François Baclesse, Caen, France
- Olfa Derbel
  - Department of Medical Oncology, Hôpital Privé Jean Mermoz, Lyon, France
- Louis Jeay
  - Keen Eye Technologies-Paris, France, now Tribun Health, Paris, France
- Intidhar Labidi-Galy
  - Department of Oncology, Hôpitaux universitaires de Genève, Faculty of Medecine, Center of Translational Research in Onco-Hematology, Swiss Cancer Center Leman, Geneva, Switzerland
- Anthony Ferrari
  - Synergie Lyon Cancer, Gilles Thomas Bioinformatics Platform, Centre Léon Bérard, CEDEX 08, F-69373, Lyon, France
- Emilie Thomas
  - Synergie Lyon Cancer, Gilles Thomas Bioinformatics Platform, Centre Léon Bérard, CEDEX 08, F-69373, Lyon, France
- Christine Ménétrier-Caux
  - "Cancer Immune Surveillance and Therapeutic Targeting" Laboratory, Cancer Research Center of Lyon, INSERM 1052-CNRS 5286, Centre Léon Bérard, Université de Lyon, Université Claude Bernard Lyon 1, 69008, Lyon, France
  - Lyon University, Université Claude Bernard Lyon 1, Centre Léon Bérard, 69008, Lyon, France
  - Lyon Immunotherapy for Cancer Laboratory (LICL), Cancer Research Center of Lyon, Centre Léon Bérard, 69008, Lyon, France
- Eric Tartour
  - Université Paris Cité, Inserm, PARCC, F-75015, Paris, France
- Marc-Henri Stern
  - Inserm U830, DNA Repair and Uveal Melanoma (D.R.U.M.) Team, Institut Curie, PSL Research University, 75005, Paris, France
- Magali Terme
  - Université Paris Cité, Inserm, PARCC, F-75015, Paris, France
- Christophe Caux
  - "Cancer Immune Surveillance and Therapeutic Targeting" Laboratory, Cancer Research Center of Lyon, INSERM 1052-CNRS 5286, Centre Léon Bérard, Université de Lyon, Université Claude Bernard Lyon 1, 69008, Lyon, France
  - Lyon University, Université Claude Bernard Lyon 1, Centre Léon Bérard, 69008, Lyon, France
  - Lyon Immunotherapy for Cancer Laboratory (LICL), Cancer Research Center of Lyon, Centre Léon Bérard, 69008, Lyon, France
- Bertrand Dubois
  - "Cancer Immune Surveillance and Therapeutic Targeting" Laboratory, Cancer Research Center of Lyon, INSERM 1052-CNRS 5286, Centre Léon Bérard, Université de Lyon, Université Claude Bernard Lyon 1, 69008, Lyon, France
  - Lyon University, Université Claude Bernard Lyon 1, Centre Léon Bérard, 69008, Lyon, France
  - Lyon Immunotherapy for Cancer Laboratory (LICL), Cancer Research Center of Lyon, Centre Léon Bérard, 69008, Lyon, France
- Isabelle Ray-Coquard
  - Lyon University, Université Claude Bernard Lyon 1, Centre Léon Bérard, 69008, Lyon, France
  - National Investigators Group for Ovarian and Breast Cancer Studies, Paris, France
  - Department of Medical Oncology, Centre Léon Bérard, 69008, Lyon, France
32
Qiu Z, Zhu Y, Zhang Q, Qiao X, Mu R, Xu Z, Yan Y, Wang F, Zhang T, Zhuang WQ, Yu K. Unravelling biosynthesis and biodegradation potentials of microbial dark matters in hypersaline lakes. ENVIRONMENTAL SCIENCE AND ECOTECHNOLOGY 2024; 20:100359. [PMID: 39221074 PMCID: PMC11361885 DOI: 10.1016/j.ese.2023.100359] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/12/2023] [Revised: 11/26/2023] [Accepted: 11/26/2023] [Indexed: 09/04/2024]
Abstract
Biosynthesis and biodegradation of microorganisms critically underpin the development of biotechnology, new drugs and therapies, and environmental remediation. However, most uncultured microbial species, along with their metabolic capacities in extreme environments, remain obscured. Here we unravel the metabolic potential of microbial dark matters (MDMs) in four deep-inland hypersaline lakes in Xinjiang, China. Utilizing metagenomic binning, we uncovered a rich diversity of 3030 metagenome-assembled genomes (MAGs) across 82 phyla, revealing a substantial portion, 2363 MAGs, as previously unclassified at the genus level. These unknown MAGs displayed unique distribution patterns across different lakes, indicating a strong correlation with varied physicochemical conditions. Our analysis revealed an extensive array of 9635 biosynthesis gene clusters (BGCs), with a remarkable 9403 being novel, suggesting untapped biotechnological potential. Notably, some MAGs from potentially new phyla exhibited a high density of these BGCs. Beyond biosynthesis, our study also identified novel biodegradation pathways, including dehalogenation, anaerobic ammonium oxidation (Anammox), and degradation of polycyclic aromatic hydrocarbons (PAHs) and plastics, in previously unknown microbial clades. These findings significantly enrich our understanding of biosynthesis and biodegradation processes and open new avenues for biotechnological innovation, emphasizing the untapped potential of microbial diversity in hypersaline environments.
Affiliation(s)
- Zhiguang Qiu
  - School of Environment and Energy, Peking University Shenzhen Graduate School, Shenzhen, 518055, China
  - AI for Science (AI4S)-Preferred Program, Peking University, Shenzhen, 518055, China
- Yuanyuan Zhu
  - School of Environment and Energy, Peking University Shenzhen Graduate School, Shenzhen, 518055, China
- Qing Zhang
  - School of Environment and Energy, Peking University Shenzhen Graduate School, Shenzhen, 518055, China
- Xuejiao Qiao
  - School of Environment and Energy, Peking University Shenzhen Graduate School, Shenzhen, 518055, China
- Rong Mu
  - School of Environment and Energy, Peking University Shenzhen Graduate School, Shenzhen, 518055, China
- Zheng Xu
  - Southern University of Sciences and Technology Yantian Hospital, Shenzhen, 518081, China
  - Institute of Biomedicine and Biotechnology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, China
- Yan Yan
  - State Key Laboratory of Isotope Geochemistry, CAS Center for Excellence in Deep Earth Science, Guangzhou Institute of Geochemistry, Chinese Academy of Sciences, Guangzhou, 510640, China
- Fan Wang
  - School of Atmospheric Sciences, Sun Yat-sen University, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Zhuhai, 519082, China
- Tong Zhang
  - Department of Civil Engineering, University of Hong Kong, 999077, Hong Kong, China
- Wei-Qin Zhuang
  - Department of Civil and Environmental Engineering, Faculty of Engineering, University of Auckland, New Zealand
- Ke Yu
  - School of Environment and Energy, Peking University Shenzhen Graduate School, Shenzhen, 518055, China
  - AI for Science (AI4S)-Preferred Program, Peking University, Shenzhen, 518055, China
33
Schuurmans IK, Smajlagic D, Baltramonaityte V, Malmberg ALK, Neumann A, Creasey N, Felix JF, Tiemeier H, Pingault JB, Czamara D, Raïkkönen K, Page CM, Lyle R, Havdahl A, Lahti J, Walton E, Bekkhus M, Cecil CAM. Genetic susceptibility to neurodevelopmental conditions associates with neonatal DNA methylation patterns in the general population: an individual participant data meta-analysis. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.07.01.24309384. [PMID: 39006433 PMCID: PMC11245083 DOI: 10.1101/2024.07.01.24309384] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/16/2024]
Abstract
Background: Autism spectrum disorder (ASD), attention-deficit/hyperactivity disorder (ADHD), and schizophrenia (SCZ) are highly heritable and linked to disruptions in foetal (neuro)development. While epigenetic processes are considered an important underlying pathway between genetic susceptibility and neurodevelopmental conditions, it is unclear (i) whether genetic susceptibility to these conditions is associated with epigenetic patterns, specifically DNA methylation (DNAm), already at birth; (ii) to what extent DNAm patterns are unique or shared across conditions; and (iii) whether these neonatal DNAm patterns can be leveraged to enhance genetic prediction of (neuro)developmental outcomes.
Methods: We conducted epigenome-wide meta-analyses of genetic susceptibility to ASD, ADHD, and schizophrenia, quantified using polygenic scores (PGSs), on cord blood DNAm, using four population-based cohorts (pooled n=5,802), all North European. Heterogeneity statistics were used to estimate overlap in DNAm patterns between PGSs. Subsequently, DNAm-based measures of PGSs were built in a target sample, and used as predictors to test incremental variance explained over PGS in 130 (neuro)developmental outcomes spanning birth to 14 years.
Outcomes: In probe-level analyses, SCZ-PGS associated with neonatal DNAm at 246 loci (p<9×10⁻⁸), predominantly in the major histocompatibility complex. Functional characterization of these DNAm loci confirmed strong genetic effects, significant blood-brain concordance and enrichment for immune-related pathways. Eight loci were identified for ASD-PGS (mapping to FDFT1 and MFHAS1), and none for ADHD-PGS. Regional analyses indicated a large number of differentially methylated regions for all PGSs (SCZ-PGS: 157, ASD-PGS: 130, ADHD-PGS: 166). DNAm signals showed little overlap between PGSs. We found suggestive evidence that incorporating DNAm-based measures of genetic susceptibility at birth increases explained variance for several child cognitive and motor outcomes over and above PGS.
Interpretation: Genetic susceptibility for neurodevelopmental conditions, particularly schizophrenia, is detectable in cord blood DNAm at birth in a population-based sample, with largely distinct DNAm patterns between PGSs. These findings support an early-origins perspective on schizophrenia.
Funding: HorizonEurope; European Research Council.
Affiliation(s)
- I K Schuurmans
  - Department of Epidemiology, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands
  - The Generation R Study Group, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands
  - Department of Child and Adolescent Psychiatry and Psychology, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands
- D Smajlagic
  - PROMENTA Research Centre, Department of Psychology, University of Oslo, Oslo, Norway
- A L K Malmberg
  - Department of Psychology and Logopedics, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- A Neumann
  - Department of Child and Adolescent Psychiatry and Psychology, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands
- N Creasey
  - The Generation R Study Group, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands
  - Department of Child and Adolescent Psychiatry and Psychology, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands
  - Department of Clinical, Educational, and Health Psychology, Division of Psychology & Language Sciences, University College London, London, UK
- J F Felix
  - The Generation R Study Group, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands
  - Department of Pediatrics, Erasmus MC, University Medical Center Rotterdam, Rotterdam, The Netherlands
- H Tiemeier
  - Department of Child and Adolescent Psychiatry and Psychology, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands
  - Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- J B Pingault
  - Department of Clinical, Educational, and Health Psychology, Division of Psychology & Language Sciences, University College London, London, UK
  - Social, Genetic & Developmental Psychiatry (SGDP) Centre, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK
- D Czamara
  - Max-Planck-Institute of Psychiatry, Department Genes and Environment, Munich, Germany
- K Raïkkönen
  - Department of Psychology and Logopedics, Faculty of Medicine, University of Helsinki, Helsinki, Finland
  - Department of Obstetrics and Gynecology, Helsinki University Hospital and University of Helsinki, Helsinki, Finland
- C M Page
  - Centre for Fertility and Health, Norwegian Institute of Public Health, Oslo, Norway
- R Lyle
  - Centre for Fertility and Health, Norwegian Institute of Public Health, Oslo, Norway
  - Department of Medical Genetics, Oslo University Hospital, Oslo, Norway
- A Havdahl
  - PROMENTA Research Centre, Department of Psychology, University of Oslo, Oslo, Norway
  - PsychGen Centre for Genetic Epidemiology and Mental Health, Norwegian Institute of Public Health, Oslo, Norway
  - Nic Waals Institute, Lovisenberg Diaconal Hospital, Oslo, Norway
- J Lahti
  - Department of Psychology and Logopedics, Faculty of Medicine, University of Helsinki, Helsinki, Finland
- E Walton
  - Department of Psychology, University of Bath, Bath, United Kingdom
- M Bekkhus
  - PROMENTA Research Centre, Department of Psychology, University of Oslo, Oslo, Norway
- C A M Cecil
  - Department of Epidemiology, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands
  - Department of Child and Adolescent Psychiatry and Psychology, Erasmus MC University Medical Center Rotterdam, Rotterdam, the Netherlands
  - Molecular Epidemiology, Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, the Netherlands
34
Arbesfeld JA, Da EY, Stevenson JS, Kuzma K, Paul A, Farris T, Capodanno BJ, Grindstaff SB, Riehle K, Saraiva-Agostinho N, Safer JF, Milosavljevic A, Foreman J, Firth HV, Hunt SE, Iqbal S, Cline MS, Rubin AF, Wagner AH. Mapping MAVE data for use in human genomics applications. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.06.20.545702. [PMID: 38979347 PMCID: PMC11230167 DOI: 10.1101/2023.06.20.545702] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 07/10/2024]
Abstract
The large-scale experimental measures of variant functional assays submitted to MaveDB have the potential to provide key information for resolving variants of uncertain significance, but the reporting of results relative to assayed sequence hinders their downstream utility. The Atlas of Variant Effects Alliance mapped multiplexed assays of variant effect data to human reference sequences, creating a robust set of machine-readable homology mappings. This method processed approximately 2.5 million protein and genomic variants in MaveDB, successfully mapping 98.61% of examined variants and disseminating data to resources such as the UCSC Genome Browser and Ensembl Variant Effect Predictor.
Collapse
Affiliation(s)
- Jeremy A Arbesfeld
  - The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH
- Estelle Y Da
  - Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Australia
- James S Stevenson
  - The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH
- Kori Kuzma
  - The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH
- Anika Paul
  - The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH
- Tierra Farris
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX
- Kevin Riehle
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX
- Nuno Saraiva-Agostinho
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- Jordan F Safer
  - The Center for the Development of Therapeutics, The Broad Institute of MIT and Harvard, Cambridge, MA
- Julia Foreman
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- Helen V Firth
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- Sarah E Hunt
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- Sumaiya Iqbal
  - The Center for the Development of Therapeutics, The Broad Institute of MIT and Harvard, Cambridge, MA
- Melissa S Cline
  - BRCA Exchange, University of California Santa Cruz, Santa Cruz, CA
- Alan F Rubin
  - Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, Australia
- Alex H Wagner
  - The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH
  - Department of Pediatrics and Biomedical Informatics, The Ohio State University, Columbus, OH
35
Schuurmans IK, Mulder RH, Baltramonaityte V, Lahtinen A, Qiuyu F, Rothmann LM, Staginnus M, Tuulari J, Burt SA, Buss C, Craig JM, Donald KA, Felix JF, Freeman TP, Grassi-Oliveira R, Huels A, Hyde LW, Jones SA, Karlsson H, Karlsson L, Koen N, Lawn W, Mitchell C, Monk CS, Mooney MA, Muetzel R, Nigg JT, Belangero SIN, Notterman D, O'Connor T, O'Donnell KJ, Pan PM, Paunio T, Ryabinin P, Saffery R, Salum GA, Seal M, Silk TJ, Stein DJ, Zar H, Walton E, Cecil CAM. Consortium Profile: The Methylation, Imaging and NeuroDevelopment (MIND) Consortium. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.06.23.24309353. [PMID: 38978656 PMCID: PMC11230303 DOI: 10.1101/2024.06.23.24309353] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Epigenetic processes, such as DNA methylation, show potential as biological markers and mechanisms underlying gene-environment interplay in the prediction of mental health and other brain-based phenotypes. However, little is known about how peripheral epigenetic patterns relate to individual differences in the brain itself. An increasingly popular approach to address this is to combine epigenetic and neuroimaging data; yet research in this area consists almost entirely of cross-sectional studies in adults. To bridge this gap, we established the Methylation, Imaging and NeuroDevelopment (MIND) Consortium, which aims to bring a developmental focus to the emerging field of Neuroimaging Epigenetics by (i) promoting collaborative, adequately powered developmental research via multi-cohort analyses; (ii) increasing scientific rigor through the establishment of shared pipelines and open science practices; and (iii) advancing our understanding of DNA methylation-brain dynamics at different developmental periods (from birth to emerging adulthood), by leveraging data from prospective, longitudinal pediatric studies. MIND currently integrates 15 cohorts worldwide, comprising (repeated) measures of DNA methylation in peripheral tissues (blood, buccal cells, and saliva) and neuroimaging by magnetic resonance imaging across up to five time points over a period of up to 21 years (pooled N: DNAm = 11,299; neuroimaging = 10,133; combined = 4,914). By triangulating associations across multiple developmental time points and study types, we hope to generate new insights into the dynamic relationships between peripheral DNA methylation and the brain, and how these ultimately relate to neurodevelopmental and psychiatric phenotypes.
36
Zou J, Li Z, Carleton N, Oesterreich S, Lee AV, Tseng GC. Mutual information for detecting multi-class biomarkers when integrating multiple bulk or single-cell transcriptomic studies. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.06.11.598484. [PMID: 38915481 PMCID: PMC11195192 DOI: 10.1101/2024.06.11.598484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/26/2024]
Abstract
Motivation: Biomarker detection plays a pivotal role in biomedical research. Integrating omics studies from multiple cohorts can enhance statistical power, accuracy and robustness of the detection results. However, existing methods for horizontally combining omics studies are mostly designed for two-class scenarios (e.g., cases versus controls) and are not directly applicable to studies with a multi-class design (e.g., samples from multiple disease subtypes, treatments, tissues, or cell types).
Results: We propose a statistical framework, namely Mutual Information Concordance Analysis (MICA), to detect biomarkers with concordant multi-class expression patterns across multiple omics studies from an information theoretic perspective. Our approach first detects biomarkers with concordant multi-class patterns across some or all of the omics studies using a global test by mutual information. A post hoc analysis is then performed for each detected biomarker to identify the studies with concordant patterns. Extensive simulations demonstrate improved accuracy and successful false discovery rate control of MICA compared to an existing MCC method. The method is then applied to two practical scenarios: four tissues of mouse metabolism-related transcriptomic studies, and three sources of estrogen treatment expression profiles. Biomarkers detected by MICA offer intriguing biological insights and functional annotations. Additionally, we implemented MICA for single-cell RNA-Seq data for tumor progression biomarkers, highlighting critical roles of ribosomal function in the tumor microenvironment of triple-negative breast cancer and underscoring the potential of MICA for detecting novel therapeutic targets.
Availability: https://github.com/jianzou75/MICA.
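For readers unfamiliar with the quantity underlying such a test, the sketch below computes a plug-in (empirical) mutual information between a discretized expression level and a class label for one gene in one study. It is only a generic illustration of mutual information, not the MICA procedure or its test statistic; the binning, labels, and function name are assumptions made for the example.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats from paired discrete samples."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("inputs must be non-empty and of equal length")
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) / (p(x) * p(y)) simplifies to c * n / (count_x * count_y).
        mi += (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
    return mi


# Hypothetical gene whose binned expression tracks the class label exactly.
classes = ["A", "A", "B", "B", "C", "C"]
informative = ["low", "low", "mid", "mid", "high", "high"]
# Hypothetical gene whose binned expression is independent of the class.
uninformative = ["low", "high", "low", "high", "low", "high"]

mi_informative = mutual_information(informative, classes)
mi_uninformative = mutual_information(uninformative, classes)
print(mi_informative, mi_uninformative)
```

Because mutual information makes no assumption about the ordering or number of classes, it applies to multi-class designs directly, which is the property the abstract relies on.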
Affiliation(s)
- Jian Zou
- Department of Statistics, School of Public Health, Chongqing Medical University, Chongqing, 400016, Chongqing, China
| | - Zheqi Li
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, 02215, Massachusetts, USA
- Department of Medicine, Harvard Medical School, Boston, 02215, Massachusetts, USA
| | - Neil Carleton
- Women’s Cancer Research Center, UPMC Hillman Cancer Center (HCC), Pittsburgh, 15232, Pennsylvania, USA
- Magee-Womens Research Institute, Pittsburgh, 15213, Pennsylvania, USA
- Medical Scientist Training Program, School of Medicine, University of Pittsburgh, Pittsburgh, 15213, Pennsylvania, USA
| | - Steffi Oesterreich
- Women’s Cancer Research Center, UPMC Hillman Cancer Center (HCC), Pittsburgh, 15232, Pennsylvania, USA
- Magee-Womens Research Institute, Pittsburgh, 15213, Pennsylvania, USA
- Department of Pharmacology & Chemical Biology, University of Pittsburgh, Pittsburgh, 15213, Pennsylvania, USA
| | - Adrian V. Lee
- Women’s Cancer Research Center, UPMC Hillman Cancer Center (HCC), Pittsburgh, 15232, Pennsylvania, USA
- Magee-Womens Research Institute, Pittsburgh, 15213, Pennsylvania, USA
- Department of Pharmacology & Chemical Biology, University of Pittsburgh, Pittsburgh, 15213, Pennsylvania, USA
| | - George C. Tseng
- Department of Biostatistics, University of Pittsburgh, Pittsburgh, 15213, Pennsylvania, USA
| |
|
37
|
Yu D, Stothard P, Neumann NF. Emergence of potentially disinfection-resistant, naturalized Escherichia coli populations across food- and water-associated engineered environments. Sci Rep 2024; 14:13478. [PMID: 38866876 PMCID: PMC11169474 DOI: 10.1038/s41598-024-64241-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Accepted: 06/06/2024] [Indexed: 06/14/2024] Open
Abstract
The Escherichia coli species is comprised of several 'ecotypes' inhabiting a wide range of host and natural environmental niches. Recent studies have suggested that novel naturalized ecotypes have emerged across wastewater treatment plants and meat processing facilities. Phylogenetic and multilocus sequence typing analyses clustered naturalized wastewater and meat plant E. coli strains into two main monophyletic clusters corresponding to the ST635 and ST399 sequence types, with several serotypes identified by serotyping, potentially representing distinct lineages that have naturalized across wastewater treatment plants and meat processing facilities. This evidence, taken alongside ecotype prediction analyses that distinguished the naturalized strains from their host-associated counterparts, suggests these strains may collectively represent a novel ecotype that has recently emerged across food- and water-associated engineered environments. Interestingly, pan-genomic analyses revealed that the naturalized strains exhibited an abundance of biofilm formation, defense, and disinfection-related stress resistance genes, but lacked various virulence and colonization genes, indicating that their naturalization has come at the cost of fitness in the original host environment.
Affiliation(s)
- Daniel Yu
- School of Public Health, University of Alberta, Edmonton, AB, Canada
- Antimicrobial Resistance-One Health Consortium, Calgary, AB, Canada
| | - Paul Stothard
- Department of Agriculture, Food and Nutritional Sciences, University of Alberta, Edmonton, AB, Canada
| | - Norman F Neumann
- School of Public Health, University of Alberta, Edmonton, AB, Canada
- Antimicrobial Resistance-One Health Consortium, Calgary, AB, Canada
| |
|
38
|
Chen M. Beyond variability: a novel gene expression stability metric to unveil homeostasis and regulation. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.28.596283. [PMID: 38854149 PMCID: PMC11160662 DOI: 10.1101/2024.05.28.596283] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2024]
Abstract
The concept of gene expression stability within a homeostatic cell is explored through the gene homeostasis Z-index, a measure that highlights genes under active regulation in response to internal and external stimuli. This index reveals distinct regulatory activities and patterns in different organs, such as enhanced synaptic transmission in pancreatic islets. The research indicates that traditional mean-based methods may miss these nuances, underlining the significance of new metrics in identifying gene regulation specifics in cellular adaptation.
|
39
|
Zhang Z, Xiao J, Wang H, Yang C, Huang Y, Yue Z, Chen Y, Han L, Yin K, Lyu A, Fang X, Zhang L. Exploring high-quality microbial genomes by assembling short-reads with long-range connectivity. Nat Commun 2024; 15:4631. [PMID: 38821971 PMCID: PMC11143213 DOI: 10.1038/s41467-024-49060-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2023] [Accepted: 05/17/2024] [Indexed: 06/02/2024] Open
Abstract
Although long-read sequencing enables the generation of complete genomes for unculturable microbes, its high cost limits the widespread adoption of long-read sequencing in large-scale metagenomic studies. An alternative method is to assemble short-reads with long-range connectivity, which can be a cost-effective way to generate high-quality microbial genomes. Here, we develop Pangaea, a bioinformatic approach designed to enhance metagenome assembly using short-reads with long-range connectivity. Pangaea leverages connectivity derived from physical barcodes of linked-reads or virtual barcodes by aligning short-reads to long-reads. Pangaea utilizes a deep learning-based read binning algorithm to assemble co-barcoded reads exhibiting similar sequence contexts and abundances, thereby improving the assembly of high- and medium-abundance microbial genomes. Pangaea also leverages a multi-thresholding algorithm strategy to refine assembly for low-abundance microbes. We benchmark Pangaea on linked-reads and a combination of short- and long-reads from simulation data, mock communities and human gut metagenomes. Pangaea achieves significantly higher contig continuity as well as more near-complete metagenome-assembled genomes (NCMAGs) than the existing assemblers. Pangaea also generates three complete and circular NCMAGs on the human gut microbiomes.
Grants
- This research was partially supported by the Young Collaborative Research Grant (C2004-23Y, L.Z.), HMRF (11221026, L.Z.), the open project of BGI-Shenzhen, Shenzhen 518000, China (BGIRSZ20220012, L.Z.), the Hong Kong Research Grant Council Early Career Scheme (HKBU 22201419, L.Z.), HKBU Start-up Grant Tier 2 (RC-SGT2/19-20/SCI/007, L.Z.), HKBU IRCMS (No. IRCMS/19-20/D02, L.Z.).
- This research was partially supported by the open project of BGI-Shenzhen, Shenzhen 518000, China (BGIRSZ20220014, KJ.Y.).
- The study was partially supported by the Science Technology and Innovation Committee of Shenzhen Municipality, China (SGDX20190919142801722, XD.F.).
Affiliation(s)
- Zhenmiao Zhang
- Department of Computer Science, Hong Kong Baptist University, Hong Kong, China
| | - Jin Xiao
- Department of Computer Science, Hong Kong Baptist University, Hong Kong, China
| | - Hongbo Wang
- Department of Computer Science, Hong Kong Baptist University, Hong Kong, China
| | - Chao Yang
- Department of Computer Science, Hong Kong Baptist University, Hong Kong, China
| | | | - Zhen Yue
- BGI Research, Sanya, 572025, China
| | - Yang Chen
- State Key Laboratory of Dampness Syndrome of Chinese Medicine, The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China
| | - Lijuan Han
- Department of Scientific Research, Kangmeihuada GeneTech Co., Ltd (KMHD), Shenzhen, China
| | - Kejing Yin
- Department of Computer Science, Hong Kong Baptist University, Hong Kong, China
- Institute for Research and Continuing Education, Hong Kong Baptist University, Shenzhen, China
| | - Aiping Lyu
- School of Chinese Medicine, Hong Kong Baptist University, Hong Kong, China
| | - Xiaodong Fang
- BGI Research, Shenzhen, 518083, China
- BGI Research, Sanya, 572025, China
- Department of Scientific Research, Kangmeihuada GeneTech Co., Ltd (KMHD), Shenzhen, China
| | - Lu Zhang
- Department of Computer Science, Hong Kong Baptist University, Hong Kong, China
- Institute for Research and Continuing Education, Hong Kong Baptist University, Shenzhen, China
| |
|
40
|
Hera MR, Koslicki D. Cosine Similarity Estimation Using FracMinHash: Theoretical Analysis, Safety Conditions, and Implementation. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.24.595805. [PMID: 38854044 PMCID: PMC11160586 DOI: 10.1101/2024.05.24.595805] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2024]
Abstract
Motivation: The increasing number and volume of genomic and metagenomic data necessitates scalable and robust computational models for precise analysis. Sketching techniques utilizing k-mers from a biological sample have proven to be useful for large-scale analyses. In recent years, FracMinHash has emerged as a popular sketching technique and has been used in several useful applications. Recent studies on FracMinHash proved unbiased estimators for the containment and Jaccard indices. However, theoretical investigations for other metrics, such as the cosine similarity, are still lacking. Theoretical contributions: In this paper, we present a theoretical framework for estimating cosine similarity from FracMinHash sketches. We establish conditions under which this estimation is sound, and recommend a minimum scale factor s for accurate results. Experimental evidence supports our theoretical findings. Practical contributions: We also present frac-kmc, a fast and efficient FracMinHash sketch generator program. frac-kmc is the fastest known FracMinHash sketch generator, delivering accurate and precise results for cosine similarity estimation on real data. We show that by computing FracMinHash sketches using frac-kmc, we can estimate pairwise cosine similarity speedily and accurately on real data. frac-kmc is freely available here: https://github.com/KoslickiLab/frac-kmc/.
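The idea of estimating cosine similarity from FracMinHash sketches can be sketched in a few lines. The example below is a naive plug-in version under stated assumptions: a k-mer is kept when its 64-bit hash falls below `scale` times the hash range, and the cosine is computed directly on the retained k-mer counts. The paper's actual estimator, its safety conditions, and the frac-kmc interface are not described in the abstract, so the function names, the choice of BLAKE2b as the hash, and the parameter `scale` are all assumptions made for illustration.

```python
import hashlib
import math
from collections import Counter

MAX_HASH = 2**64  # size of the 64-bit hash range used below


def kmer_hash(kmer):
    """Deterministic 64-bit hash of a k-mer (BLAKE2b is an arbitrary choice)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def frac_minhash_sketch(seq, k, scale):
    """Keep the counts of those k-mers whose hash is below scale * MAX_HASH."""
    if not 0 < scale <= 1:
        raise ValueError("scale must be in (0, 1]")
    threshold = MAX_HASH * scale
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer: c for kmer, c in counts.items() if kmer_hash(kmer) < threshold}


def cosine_from_sketches(a, b):
    """Cosine similarity of two sketched k-mer count vectors (0.0 if either is empty)."""
    dot = sum(c * b.get(kmer, 0) for kmer, c in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# With scale = 1.0 every k-mer is kept, so the result is the exact cosine.
s1 = frac_minhash_sketch("ACGTACGTAC", k=3, scale=1.0)
s2 = frac_minhash_sketch("ACGTTCGTAC", k=3, scale=1.0)
print(cosine_from_sketches(s1, s2))
```

Because both samples are filtered with the same hash threshold, the retained k-mers are comparable across sketches. Smaller values of `scale` give smaller sketches at the cost of a noisier estimate, which is the trade-off the recommended minimum scale factor addresses.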
Affiliation(s)
- Mahmudur Rahman Hera
- School of Electrical Engineering and Computer Science, Pennsylvania State University, USA
| | - David Koslicki
- School of Electrical Engineering and Computer Science, Pennsylvania State University, USA
- Huck Institutes of the Life Sciences, Pennsylvania State University, USA
- Department of Biology, Pennsylvania State University, USA
| |
|
41
|
Mohd Talkah NS, Aziz NAKA, Rahim MFA, Hanafi NFF, Ahmad Mokhtar MA, Othman AS. The chloroplast genome inheritance pattern of the Deli-Nigerian prospection material (NPM) × Yangambi population of Elaeis guineensis Jacq. PeerJ 2024; 12:e17335. [PMID: 38818457 PMCID: PMC11138521 DOI: 10.7717/peerj.17335] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Accepted: 04/15/2024] [Indexed: 06/01/2024] Open
Abstract
Background: The chloroplast genome has the potential to be genetically engineered to enhance the agronomic value of major crops. Because Elaeis guineensis is a crop plant of major economic value, it is important to understand every aspect of the genetic inheritance pattern among its individuals to ensure the traceability of agronomic traits. Methods: Two parental E. guineensis individuals and 23 of their F1 progenies were collected and sequenced using next-generation sequencing (NGS) on the Illumina platform. Chloroplast genomes were assembled de novo from the cleaned raw reads and aligned to check for variations. The sequences were compared and analyzed with scripting and relevant bioinformatics software. Simple sequence repeat (SSR) loci were determined from the chloroplast genome. Results: The chloroplast genome assemblies had lengths of 156,983 bp, 156,988 bp, 156,982 bp, and 156,984 bp. The gene content and arrangements were consistent with the reference genome published in the GenBank database. Seventy-eight SSRs were detected in the chloroplast genome, with most located in the intergenic spacer region. The chloroplast genomes of 17 F1 progenies were exact copies of the maternal parent, while six individuals showed a single variation in the sequence. Despite the significant variation displayed by the male parent, all the nucleotide variations were synonymous. This study shows highly conserved gene content and sequence in Elaeis guineensis chloroplast genomes. Maternal inheritance of the chloroplast genome among F1 progenies is robust, with a low possibility of mutations over generations. The findings of this study can shed light on the inheritance pattern of the Elaeis guineensis chloroplast genome, especially for crop scientists who consider using the chloroplast genome for agronomic trait modifications.
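The abstract does not say which tool or thresholds were used to determine SSR loci. As a minimal sketch of how perfect SSRs can be located in an assembled sequence, the following uses regular expressions with back-references. The minimum repeat counts in `MIN_REPEATS` and the function names are assumptions chosen for illustration, not the study's settings.

```python
import re

# Assumed thresholds: motif length -> minimum number of tandem repeats.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit (e.g. not 'ATAT')."""
    n = len(motif)
    return not any(n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n))


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A motif of unit_len bases followed by at least (min_rep - 1) copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence (assumed): an (AT)6 repeat and an (A)10 repeat.
toy = "GGC" + "AT" * 6 + "CCG" + "A" * 10 + "T"
print(find_ssrs(toy))
```

Start positions are 0-based. Classifying each hit as genic or intergenic, as reported in the results, would additionally require the genome annotation, which this sketch does not use.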
Affiliation(s)
| | | | | | | | | | - Ahmad Sofiman Othman
- School of Biological Sciences, Universiti Sains Malaysia, Minden, Pulau Pinang, Malaysia
- Centre of Chemical Biology, Universiti Sains Malaysia, Bayan Baru, Pulau Pinang, Malaysia
| |
|
42
|
Chao KH, Heinz JM, Hoh C, Mao A, Shumate A, Pertea M, Salzberg SL. Combining DNA and protein alignments to improve genome annotation with LiftOn. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.16.593026. [PMID: 38798552 PMCID: PMC11118573 DOI: 10.1101/2024.05.16.593026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
As the number and variety of assembled genomes continue to grow, the number of annotated genomes is falling behind, particularly for eukaryotes. DNA-based mapping tools help to address this challenge, but they are only able to transfer annotation between closely related species. Here we introduce LiftOn, a homology-based software tool that integrates DNA and protein alignments to enhance the accuracy of genome-scale annotation and to allow mapping between relatively distant species. LiftOn's protein-centric algorithm considers both types of alignments, chooses optimal open reading frames, resolves overlapping gene loci, and finds additional gene copies where they exist. LiftOn can reliably transfer annotation between genomes representing members of the same species, as we demonstrate on human, mouse, honey bee, rice, and Arabidopsis thaliana. It can further map annotation effectively across species pairs as far apart as mouse and rat or Drosophila melanogaster and D. erecta.
Affiliation(s)
- Kuan-Hao Chao
- Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
| | - Jakob M. Heinz
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
| | - Celine Hoh
- Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
| | - Alan Mao
- Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
| | - Alaina Shumate
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
| | - Mihaela Pertea
- Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
| | - Steven L Salzberg
- Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Department of Biostatistics, Johns Hopkins University, Baltimore, MD 21211, USA
| |
|
43
|
Wang Z, Wang M, Hu L, He G, Nie S. Evolutionary profiles and complex admixture landscape in East Asia: New insights from modern and ancient Y chromosome variation perspectives. Heliyon 2024; 10:e30067. [PMID: 38756579 PMCID: PMC11096704 DOI: 10.1016/j.heliyon.2024.e30067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Revised: 04/08/2024] [Accepted: 04/18/2024] [Indexed: 05/18/2024] Open
Abstract
Human Y chromosomes are characterized by nonrecombination and uniparental inheritance, carrying traces of human evolutionary history and admixture. Large-scale population-specific genomic sources based on advanced sequencing technologies have revolutionized our understanding of human Y chromosome diversity and its anthropological and forensic applications. Here, we reviewed and meta-analyzed the Y chromosome genetic diversity of modern and ancient people from China and summarized the patterns of founding lineages of spatiotemporally different populations associated with their origin, expansion, and admixture. We emphasized the strong association between our identified founding lineages and language-related human dispersal events correlated with the Sino-Tibetan, Altaic, and southern Chinese multiple-language families related to the Hmong-Mien, Tai-Kadai, Austronesian, and Austro-Asiatic languages. We subsequently summarized the recent advances in translational applications in forensic and anthropological science, including paternal biogeographical ancestry inference (PBGAI), surname investigation, and paternal history reconstruction. Whole-Y sequencing or high-resolution panels with high coverage of terminal Y chromosome lineages are essential for capturing the genomic diversity of ethnolinguistically diverse East Asians. Generally, we emphasized the importance of including more ethnolinguistically diverse, underrepresented modern and spatiotemporally different ancient East Asians in human genetic research for a comprehensive understanding of the paternal genetic landscape of East Asians with a detailed time series and for the reconstruction of a reference database in the PBGAI, even including new technology innovations of Telomere-to-Telomere (T2T) for new genetic variation discovery.
Affiliation(s)
- Zhiyong Wang
- School of Forensic Medicine, Kunming Medical University, Kunming, 650500, China
- Institute of Rare Diseases, West China Hospital of Sichuan University, Sichuan University, Chengdu, 610000, China
- Center for Archaeological Science, Sichuan University, Chengdu, 610000, China
| | - Mengge Wang
- Institute of Rare Diseases, West China Hospital of Sichuan University, Sichuan University, Chengdu, 610000, China
- Center for Archaeological Science, Sichuan University, Chengdu, 610000, China
- Faculty of Forensic Medicine, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, 510275, China
| | - Liping Hu
- School of Forensic Medicine, Kunming Medical University, Kunming, 650500, China
| | - Guanglin He
- Institute of Rare Diseases, West China Hospital of Sichuan University, Sichuan University, Chengdu, 610000, China
- Center for Archaeological Science, Sichuan University, Chengdu, 610000, China
| | - Shengjie Nie
- School of Forensic Medicine, Kunming Medical University, Kunming, 650500, China
| |
|
44
|
Supakar T, Herring-Nicholas A, Josephs EA. Compartmentalized CRISPR Reactions (CCR) for High-Throughput Screening of Guide RNA Potency and Specificity. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.07.592954. [PMID: 38766102 PMCID: PMC11100742 DOI: 10.1101/2024.05.07.592954] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024]
Abstract
CRISPR ribonucleoproteins (RNPs) use a variable segment in their guide RNA (gRNA) called a spacer to determine the DNA sequence at which the effector protein will exhibit nuclease activity and generate target-specific genetic mutations. However, nuclease activity with different gRNAs can vary considerably, in a spacer sequence-dependent manner that can be difficult to predict. While computational tools are helpful in predicting a CRISPR effector's activity and/or potential for off-target mutagenesis with different gRNAs, individual gRNAs must still be validated in vitro prior to their use. Here, we present compartmentalized CRISPR reactions (CCR) for screening large numbers of spacer/target/off-target combinations simultaneously in vitro for both CRISPR effector activity and specificity, by confining the complete CRISPR reaction of gRNA transcription, RNP formation, and CRISPR target cleavage within individual water-in-oil microemulsions. With CCR, large numbers of the candidate gRNAs (output by computational design tools) can be immediately validated in parallel, and we show that CCR can be used to screen hundreds of thousands of extended gRNA (x-gRNAs) variants that can completely block cleavage at off-target sequences while maintaining high levels of on-target activity. We expect CCR can help to streamline the gRNA generation and validation processes for applications in biological and biomedical research.
Affiliation(s)
- Tinku Supakar
- Department of Nanoscience, Joint School of Nanoscience and Nanoengineering, University of North Carolina at Greensboro, Greensboro, NC 27401, USA
| | - Ashley Herring-Nicholas
- Department of Nanoscience, Joint School of Nanoscience and Nanoengineering, University of North Carolina at Greensboro, Greensboro, NC 27401, USA
| | - Eric A. Josephs
- Department of Nanoscience, Joint School of Nanoscience and Nanoengineering, University of North Carolina at Greensboro, Greensboro, NC 27401, USA
| |
|
45
|
McDonnell E, Orr SE, Barter MJ, Rux D, Brumwell A, Wrobel N, Murphy L, Overmann LM, Sorial AK, Young DA, Soul J, Rice SJ. Epigenetic mechanisms of osteoarthritis risk in human skeletal development. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.05.05.24306832. [PMID: 38766055 PMCID: PMC11100852 DOI: 10.1101/2024.05.05.24306832] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024]
Abstract
The epigenome, including the methylation of cytosine bases at CG dinucleotides, is intrinsically linked to transcriptional regulation. The tight regulation of gene expression during skeletal development is essential, with ~1/500 individuals born with skeletal abnormalities. Furthermore, increasing evidence is emerging to link age-associated complex genetic musculoskeletal diseases, including osteoarthritis (OA), to developmental factors including joint shape. Multiple studies have shown a functional role for DNA methylation in the genetic mechanisms of OA risk using articular cartilage samples taken from aged patients. Despite this, our knowledge of temporal changes to the methylome during human cartilage development has been limited. We quantified DNA methylation at ~700,000 individual CpGs across the epigenome of developing human articular cartilage in 72 samples ranging from 7-21 post-conception weeks, a time period that includes cavitation of the developing knee joint. We identified significant changes in 8% of all CpGs, and >9400 developmental differentially methylated regions (dDMRs). The largest hypermethylated dDMRs mapped to transcriptional regulators of early skeletal patterning including MEIS1 and IRX1. Conversely, the largest hypomethylated dDMRs mapped to genes encoding extracellular matrix proteins including SPON2 and TNXB and were enriched in chondrocyte enhancers. Significant correlations were identified between the expression of these genes and methylation within the hypomethylated dDMRs. We further identified 811 CpGs at which significant dimorphism was present between the male and female samples, with the majority (68%) being hypermethylated in female samples. Following imputation, we captured the genotype of these samples at >5 million variants and performed epigenome-wide methylation quantitative trait locus (mQTL) analysis. 
Colocalization analysis identified 26 loci at which genetic variants exhibited shared impacts upon methylation and OA genetic risk. This included loci which have been previously reported to harbour OA-mQTLs (including GDF5 and ALDH1A2), yet the majority (73%) were novel (including those mapping to CHST3, FGF1 and TEAD1). To our knowledge, this is the first extensive study of DNA methylation across human articular cartilage development. We identify considerable methylomic plasticity within the development of knee cartilage and report active epigenomic mediators of OA risk operating in prenatal joint tissues.
Affiliation(s)
- Euan McDonnell
- Computational Biology Facility, University of Liverpool, MerseyBio, Crown Street, United Kingdom
| | - Sarah E Orr
- Biosciences Institute, Newcastle University, Central Parkway, Newcastle upon Tyne, United Kingdom
| | - Matthew J Barter
- Biosciences Institute, Newcastle University, Central Parkway, Newcastle upon Tyne, United Kingdom
| | - Danielle Rux
- Orthopaedic Surgery, UConn Health, Farmington, Connecticut, USA
| | - Abby Brumwell
- Biosciences Institute, Newcastle University, Central Parkway, Newcastle upon Tyne, United Kingdom
| | - Nicola Wrobel
- Edinburgh Clinical Research Facility, University of Edinburgh, Edinburgh, United Kingdom
| | - Lee Murphy
- Edinburgh Clinical Research Facility, University of Edinburgh, Edinburgh, United Kingdom
| | - Lynne M Overmann
- Human Developmental Biology Resource, Newcastle University, International Centre for Life, Central Parkway, Newcastle upon Tyne, United Kingdom
| | - Antony K Sorial
- Biosciences Institute, Newcastle University, Central Parkway, Newcastle upon Tyne, United Kingdom
| | - David A Young
- Biosciences Institute, Newcastle University, Central Parkway, Newcastle upon Tyne, United Kingdom
| | - Jamie Soul
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, United Kingdom
| | - Sarah J Rice
- Biosciences Institute, Newcastle University, Central Parkway, Newcastle upon Tyne, United Kingdom
| |
|
46
|
Jin L, Liyanage R, Duan D, Chen SJ. Machine learning-inferred and energy landscape-guided analyses reveal kinetic determinants of CRISPR/Cas9 gene editing. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.30.591525. [PMID: 38746227 PMCID: PMC11092603 DOI: 10.1101/2024.04.30.591525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
The CRISPR/Cas nuclease system is widely considered the most important tool in genome engineering. However, current methods for predicting on/off-target effects and designing guide RNA (gRNA) rely on purely data-driven approaches or focus solely on the system's thermal equilibrium properties, whereas experimental evidence suggests that the process is kinetically controlled rather than being in equilibrium. In this study, we utilized a vast amount of available data and combined random forest, a supervised ensemble learning algorithm, and free energy landscape analysis to investigate the kinetic pathways of R-loop formation in the CRISPR/Cas9 system and the intricate molecular interactions between DNA and the Cas9 RuvC and HNH domains. The study revealed (a) a novel three-state kinetic mechanism, (b) the unfolding of the activation state of the R-loop being the most crucial kinetic determinant and the key predictor for on- and off-target cleavage efficiencies, and (c) the nucleotides from positions +13 to +16 being the kinetically critical nucleotides. The results provide a biophysical rationale for the design of a kinetic strategy for enhancing CRISPR/Cas9 gene editing accuracy and efficiency.
|
47
|
Hadar N, Dolgin V, Oustinov K, Yogev Y, Poleg T, Safran A, Freund O, Agam N, Jean MM, Proskorovski-Ohayon R, Wormser O, Drabkin M, Halperin D, Eskin-Schwartz M, Narkis G, Sued-Hendrickson S, Aminov I, Gombosh M, Aharoni S, Birk OS. VARista: a free web platform for streamlined whole-genome variant analysis across T2T, hg38, and hg19. Hum Genet 2024; 143:695-701. [PMID: 38607411 DOI: 10.1007/s00439-024-02671-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Accepted: 03/24/2024] [Indexed: 04/13/2024]
Abstract
With the increasing importance of genomic data in understanding genetic diseases, there is an essential need for efficient and user-friendly tools that simplify variant analysis. Although multiple tools exist, many present barriers such as steep learning curves, limited reference genome compatibility, or costs. We developed VARista, a free web-based tool, to address these challenges and provide a streamlined solution for researchers, particularly those focusing on rare monogenic diseases. VARista offers a user-centric interface that eliminates much of the technical complexity typically associated with variant analysis. The tool directly supports VCF files generated using reference genomes hg19, hg38, and the emerging T2T, with seamless remapping capabilities between them. Features such as gene summaries and links, tissue and cell-specific gene expression data for both adults and fetuses, as well as automated PCR design and integration with tools such as SpliceAI and AlphaMissense, enable users to focus on the biology and the case itself. As we demonstrate, VARista proved effective in narrowing down potential disease-causing variants, prioritizing them effectively, and providing meaningful biological context, facilitating rapid decision-making. VARista stands out as a freely available and comprehensive tool that consolidates various aspects of variant analysis into a single platform that embraces the forefront of genomic advancements. Its design inherently supports a shift in focus from technicalities to critical thinking, thereby promoting better-informed decisions in genetic disease research. Given its unique capabilities and user-centric design, VARista has the potential to become an essential asset for the genomic research community. https://VARista.link.
Affiliation(s)
- Noam Hadar
- The Morris Kahn Laboratory of Human Genetics at the National Institute of Biotechnology in the Negev and Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
| | - Vadim Dolgin
- The Morris Kahn Laboratory of Human Genetics at the National Institute of Biotechnology in the Negev and Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
| | - Katya Oustinov
- The Morris Kahn Laboratory of Human Genetics at the National Institute of Biotechnology in the Negev and Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Yuval Yogev
- The Morris Kahn Laboratory of Human Genetics at the National Institute of Biotechnology in the Negev and Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Tomer Poleg
- The Morris Kahn Laboratory of Human Genetics at the National Institute of Biotechnology in the Negev and Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Amit Safran
- The Morris Kahn Laboratory of Human Genetics at the National Institute of Biotechnology in the Negev and Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Ofek Freund
- The Morris Kahn Laboratory of Human Genetics at the National Institute of Biotechnology in the Negev and Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Nadav Agam
- The Morris Kahn Laboratory of Human Genetics at the National Institute of Biotechnology in the Negev and Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Matan M Jean
- The Morris Kahn Laboratory of Human Genetics at the National Institute of Biotechnology in the Negev and Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Regina Proskorovski-Ohayon
- The Morris Kahn Laboratory of Human Genetics at the National Institute of Biotechnology in the Negev and Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Ohad Wormser
- The Morris Kahn Laboratory of Human Genetics at the National Institute of Biotechnology in the Negev and Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Max Drabkin
- The Morris Kahn Laboratory of Human Genetics at the National Institute of Biotechnology in the Negev and Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Daniel Halperin
- The Morris Kahn Laboratory of Human Genetics at the National Institute of Biotechnology in the Negev and Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Marina Eskin-Schwartz
- The Morris Kahn Laboratory of Human Genetics at the National Institute of Biotechnology in the Negev and Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Genetics Institute, Soroka University Medical Center, Beer-Sheva, Israel
- Ginat Narkis
- The Morris Kahn Laboratory of Human Genetics at the National Institute of Biotechnology in the Negev and Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Genetics Institute, Soroka University Medical Center, Beer-Sheva, Israel
- Sufa Sued-Hendrickson
- The Morris Kahn Laboratory of Human Genetics at the National Institute of Biotechnology in the Negev and Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Ilana Aminov
- The Morris Kahn Laboratory of Human Genetics at the National Institute of Biotechnology in the Negev and Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Maya Gombosh
- The Morris Kahn Laboratory of Human Genetics at the National Institute of Biotechnology in the Negev and Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Sarit Aharoni
- The Morris Kahn Laboratory of Human Genetics at the National Institute of Biotechnology in the Negev and Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Ohad S Birk
- The Morris Kahn Laboratory of Human Genetics at the National Institute of Biotechnology in the Negev and Faculty of Health Sciences, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Genetics Institute, Soroka University Medical Center, Beer-Sheva, Israel
48
Kisakol B, Matveeva A, Salvucci M, Kel A, McDonough E, Ginty F, Longley DB, Prehn JHM. Identification of unique rectal cancer-specific subtypes. Br J Cancer 2024; 130:1809-1818. [PMID: 38532103 PMCID: PMC11130168 DOI: 10.1038/s41416-024-02656-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Revised: 03/05/2024] [Accepted: 03/08/2024] [Indexed: 03/28/2024]
Abstract
BACKGROUND Existing colorectal cancer subtyping methods were generated without much consideration of potential differences in expression profiles between colon and rectal tissues. Moreover, locally advanced rectal cancers at resection often have received neoadjuvant chemoradiotherapy which likely has a significant impact on gene expression. METHODS We collected mRNA expression profiles for rectal and colon cancer samples (n = 2121). We observed that (i) Consensus Molecular Subtyping (CMS) had a different prognosis in treatment-naïve rectal vs. colon cancers, and (ii) that neoadjuvant chemoradiotherapy exposure produced a strong shift in CMS subtypes in rectal cancers. We therefore clustered 182 untreated rectal cancers to find rectal cancer-specific subtypes (RSSs). RESULTS We identified three robust subtypes. We observed that RSS1 had better, and RSS2 had worse disease-free survival. RSS1 showed high expression of MYC target genes and low activity of angiogenesis genes. RSS2 exhibited low regulatory T cell abundance, strong EMT and angiogenesis signalling, and high activation of TGF-β, NF-κB, and TNF-α signalling. RSS3 was characterised by the deactivation of EGFR, MAPK and WNT pathways. CONCLUSIONS We conclude that RSS subtyping allows for more accurate prognosis predictions in rectal cancers than CMS subtyping and provides new insight into targetable disease pathways within these subtypes.
Affiliation(s)
- Batuhan Kisakol
- Department of Physiology and Medical Physics, Royal College of Surgeons in Ireland, Dublin, 2, Ireland
- Centre for Systems Medicine, Royal College of Surgeons in Ireland, Dublin, 2, Ireland
- Anna Matveeva
- Department of Physiology and Medical Physics, Royal College of Surgeons in Ireland, Dublin, 2, Ireland
- Centre for Systems Medicine, Royal College of Surgeons in Ireland, Dublin, 2, Ireland
- Manuela Salvucci
- Department of Physiology and Medical Physics, Royal College of Surgeons in Ireland, Dublin, 2, Ireland
- Centre for Systems Medicine, Royal College of Surgeons in Ireland, Dublin, 2, Ireland
- Daniel B Longley
- Centre for Cancer Research & Cell Biology, Queen's University Belfast, Belfast, Northern Ireland, UK
- Jochen H M Prehn
- Department of Physiology and Medical Physics, Royal College of Surgeons in Ireland, Dublin, 2, Ireland
- Centre for Systems Medicine, Royal College of Surgeons in Ireland, Dublin, 2, Ireland
49
Christiansen C, Potier L, Martin TC, Villicaña S, Castillo-Fernandez JE, Mangino M, Menni C, Tsai PC, Campbell PJ, Mullin S, Ordoñana JR, Monteagudo O, Sachdev PS, Mather KA, Trollor JN, Pietilainen KH, Ollikainen M, Dalgård C, Kyvik K, Christensen K, van Dongen J, Willemsen G, Boomsma DI, Magnusson PKE, Pedersen NL, Wilson SG, Grundberg E, Spector TD, Bell JT. Enhanced resolution profiling in twins reveals differential methylation signatures of type 2 diabetes with links to its complications. EBioMedicine 2024; 103:105096. [PMID: 38574408 PMCID: PMC11004697 DOI: 10.1016/j.ebiom.2024.105096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Revised: 03/18/2024] [Accepted: 03/18/2024] [Indexed: 04/06/2024]
Abstract
BACKGROUND Type 2 diabetes (T2D) susceptibility is influenced by genetic and environmental factors. Previous findings suggest DNA methylation as a potential mechanism in T2D pathogenesis and progression. METHODS We profiled DNA methylation in 248 blood samples from participants of European ancestry from 7 twin cohorts using a methylation sequencing platform targeting regulatory genomic regions encompassing 2,048,698 CpG sites. FINDINGS We find and replicate 3 previously unreported T2D differentially methylated CpG positions (T2D-DMPs) at FDR 5% in RGL3, NGB and OTX2, and 20 signals at FDR 25%, of which 14 replicated. Integrating genetic variation and T2D-discordant monozygotic twin analyses, we identify both genetic-based and genetic-independent T2D-DMPs. The signals annotate to genes with established GWAS and EWAS links to T2D and its complications, including blood pressure (RGL3) and eye disease (OTX2). INTERPRETATION The results help to improve our understanding of T2D disease pathogenesis and progression and may provide biomarkers for its complications. FUNDING Funding acknowledgements for each cohort can be found in the Supplementary Note.
Affiliation(s)
- Louis Potier
- APHP, Paris Cité University, INSERM, Paris, France
- Pei-Chien Tsai
- King's College London, UK; Department of Biomedical Sciences, Chang Gung University, Taoyuan City, Taiwan; Molecular Infectious Disease Research Center, Chang Gung Memorial Hospital, Taoyuan City, Taiwan
- Purdey J Campbell
- Department of Endocrinology & Diabetes, Sir Charles Gairdner Hospital, Nedlands, WA, Australia
- Shelby Mullin
- Department of Endocrinology & Diabetes, Sir Charles Gairdner Hospital, Nedlands, WA, Australia; School of Biomedical Sciences, University of Western Australia, Crawley, WA, 6009, Australia
- Kirsi H Pietilainen
- Obesity Research Unit, Research Program for Clinical and Molecular Metabolism, Faculty of Medicine, University of Helsinki, Finland; HealthyWeightHub, Abdominal Center, Helsinki University Hospital and University of Helsinki, Finland
- Miina Ollikainen
- Minerva Foundation Institute for Medical Research, Helsinki, Finland; Institute for Molecular Medicine Finland, FIMM, HiLIFE, University of Helsinki, Finland
- Jenny van Dongen
- Department of Biological Psychology, Vrije Universiteit Amsterdam, the Netherlands
- Gonneke Willemsen
- Department of Biological Psychology, Vrije Universiteit Amsterdam, the Netherlands
- Dorret I Boomsma
- Department of Biological Psychology, Vrije Universiteit Amsterdam, the Netherlands
- Scott G Wilson
- King's College London, UK; Department of Endocrinology & Diabetes, Sir Charles Gairdner Hospital, Nedlands, WA, Australia; School of Biomedical Sciences, University of Western Australia, Crawley, WA, 6009, Australia
50
Liaw YC, Matsuda K, Liaw YP. Identification of an novel genetic variant associated with osteoporosis: insights from the Taiwan Biobank Study. JBMR Plus 2024; 8:ziae028. [PMID: 38655459 PMCID: PMC11037432 DOI: 10.1093/jbmrpl/ziae028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Revised: 02/18/2024] [Accepted: 03/01/2024] [Indexed: 04/26/2024]
Abstract
Purpose The purpose of this study was to identify new independent significant SNPs associated with osteoporosis using data from the Taiwan Biobank (TWBB). Material and Methods The dataset was divided into discovery (60%) and replication (40%) subsets. Following data quality control, genome-wide association study (GWAS) analysis was performed, adjusting for sex, age, and the top 5 principal components, employing the Scalable and Accurate Implementation of the Generalized mixed model approach. This was followed by a meta-analysis of TWBB1 and TWBB2. The Functional Mapping and Annotation (FUMA) platform was used to identify osteoporosis-associated loci. Manhattan and quantile-quantile plots were generated using the FUMA platform to visualize the results. Independent significant SNPs were selected based on genome-wide significance (P < 5 × 10⁻⁸) and independence from each other (r² < 0.6) within a 1 Mb window. Positional, expression quantitative trait locus (eQTL), and chromatin interaction mapping were used to map SNPs to genes. Results A total of 29 084 individuals (3154 osteoporosis cases and 25 930 controls) were used for GWAS analysis (TWBB1 data), and 18 918 individuals (1917 cases and 17 001 controls) were utilized for replication studies (TWBB2 data). We identified a new independent significant SNP for osteoporosis in TWBB1, with the lead SNP rs76140829 (minor allele frequency = 0.055, P-value = 1.15 × 10⁻⁸). Replication of the association was performed in TWBB2, yielding a P-value of 6.56 × 10⁻³. The meta-analysis of TWBB1 and TWBB2 data demonstrated a highly significant association for SNP rs76140829 (P-value = 7.52 × 10⁻¹⁰). In the positional mapping of rs76140829, 6 genes (HABP2, RP11-481H12.1, RNU7-165P, RP11-139K1.2, RP11-57H14.3, and RP11-214N15.5) were identified through chromatin interaction mapping in mesenchymal stem cells.
Conclusions Our GWAS analysis using the Taiwan Biobank dataset unveils rs76140829 in the VTI1A gene as a key risk variant associated with osteoporosis. This finding expands our understanding of the genetic basis of osteoporosis and highlights the potential regulatory role of this SNP in mesenchymal stem cells.
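The selection rule stated in the Methods (genome-wide significance at P < 5 × 10⁻⁸, pairwise r² < 0.6, 1 Mb window) can be illustrated with a minimal Python sketch. This is not the FUMA implementation: the greedy, most-significant-first ordering, the reading of "within a 1 Mb window" as a distance limit between two SNPs on the same chromosome, and the `Snp` record, `independent_significant` function and `r2` lookup table are all assumptions introduced here for illustration. The SNP names and values in the example are made up.

```python
from typing import Dict, FrozenSet, Iterable, List, NamedTuple


class Snp(NamedTuple):
    rsid: str    # identifier (toy names below, not real variants)
    chrom: str   # chromosome label
    pos: int     # base-pair position
    p: float     # association P-value


def independent_significant(
    snps: Iterable[Snp],
    r2: Dict[FrozenSet[str], float],
    p_max: float = 5e-8,
    r2_max: float = 0.6,
    window: int = 1_000_000,
) -> List[Snp]:
    """Greedy sketch: keep significant SNPs not in LD with an already kept one.

    `r2` maps an unordered pair of rsids to their LD r^2; missing pairs are
    treated as r^2 = 0 (an assumption of this sketch).
    """
    kept: List[Snp] = []
    significant = sorted((s for s in snps if s.p < p_max), key=lambda s: s.p)
    for s in significant:
        linked = any(
            k.chrom == s.chrom
            and abs(k.pos - s.pos) <= window
            and r2.get(frozenset((k.rsid, s.rsid)), 0.0) >= r2_max
            for k in kept
        )
        if not linked:
            kept.append(s)
    return kept


# Hypothetical toy input: snpB is in strong LD with snpA, snpC lies outside
# the 1 Mb window, and snpD is not genome-wide significant.
toy_snps = [
    Snp("snpA", "10", 1_000, 1e-9),
    Snp("snpB", "10", 5_000, 2e-9),
    Snp("snpC", "10", 3_000_000, 4e-8),
    Snp("snpD", "10", 6_000, 1e-3),
]
toy_r2 = {frozenset(("snpA", "snpB")): 0.9}
selected = [s.rsid for s in independent_significant(toy_snps, toy_r2)]
```

With this toy input, `selected` contains `snpA` and `snpC`. Real pipelines compute r² from a reference panel rather than a hand-written table.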
Affiliation(s)
- Yi-Ching Liaw
- Department of Computational Biology and Medical Sciences, Laboratory of Clinical Genome Sequencing, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo 108-8639, Japan
- Department of Public Health and Institute of Public Health, Chung Shan Medical University, Taichung 40201, Taiwan
- Koichi Matsuda
- Department of Computational Biology and Medical Sciences, Laboratory of Clinical Genome Sequencing, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo 108-8639, Japan
- Institute of Medical Science, The University of Tokyo, Laboratory of Genome Technology, Human Genome Center, Tokyo 108-8639, Japan
- Yung-Po Liaw
- Department of Public Health and Institute of Public Health, Chung Shan Medical University, Taichung 40201, Taiwan
- Institute of Medicine, Chung Shan Medical University, Taichung 40201, Taiwan
- Department of Medical Imaging, Chung Shan Medical University Hospital, Taichung 40201, Taiwan