1. Jia Y, Qi X, Ma M, Cheng S, Cheng B, Liang C, Guo X, Zhang F. Integrating genome-wide association study with regulatory SNP annotations identified novel candidate genes for osteoporosis. Bone Joint Res 2023; 12:147-154. [PMID: 37051837 PMCID: PMC10003063 DOI: 10.1302/2046-3758.122.bjr-2022-0206.r1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/14/2023]
Abstract
Osteoporosis (OP) is a metabolic bone disease characterized by a decrease in bone mineral density (BMD). However, research on regulatory variants for BMD has been limited. In this study, we aimed to explore novel regulatory genetic variants associated with BMD. We conducted an integrative analysis of a BMD genome-wide association study (GWAS) and regulatory single nucleotide polymorphism (rSNP) annotation information. First, the discovery GWAS dataset and the replication GWAS dataset were each integrated with an rSNP annotation database to obtain BMD-associated SNP regulatory elements and SNP regulatory element-target gene (E-G) pairs. Then, the common genes were further subjected to HumanNet v2 to explore their biological effects. Through discovery and replication integrative analysis of the BMD GWAS and the rSNP annotation database, we identified 36 common BMD-associated genes irrespective of regulatory elements, such as FAM3C (p(discovery GWAS) = 1.21 × 10⁻²⁵, p(replication GWAS) = 1.80 × 10⁻¹²), CCDC170 (p(discovery GWAS) = 1.23 × 10⁻¹¹, p(replication GWAS) = 3.22 × 10⁻⁹), and SOX6 (p(discovery GWAS) = 4.41 × 10⁻¹⁵, p(replication GWAS) = 6.57 × 10⁻¹⁴). Then, for the 36 common target genes, multiple gene ontology (GO) terms were detected for BMD, such as positive regulation of cartilage development (p = 9.27 × 10⁻³) and positive regulation of chondrocyte differentiation (p = 9.27 × 10⁻³). We explored the potential roles of rSNPs in the genetic mechanisms of BMD and identified multiple candidate genes. Our results support the involvement of regulatory genetic variants in the development of OP.
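The discovery/replication integration described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the significance threshold, the shape of the annotation table, the function names, and all SNP IDs, gene names, and p-values below are hypothetical placeholders.

```python
# Hypothetical sketch: find genes supported by both a discovery and a
# replication GWAS after mapping SNPs to target genes via rSNP annotation.
# All identifiers, p-values, and the threshold are made-up placeholders.

def target_genes(gwas_pvalues, rsnp_to_genes, threshold=5e-8):
    """Return genes linked (via the annotation table) to SNPs below threshold."""
    genes = set()
    for snp, p_value in gwas_pvalues.items():
        if p_value < threshold:
            # SNPs without a regulatory annotation contribute no genes.
            genes.update(rsnp_to_genes.get(snp, ()))
    return genes


def common_genes(discovery, replication, rsnp_to_genes, threshold=5e-8):
    """Intersect the gene sets obtained from the two GWAS datasets."""
    found_in_discovery = target_genes(discovery, rsnp_to_genes, threshold)
    found_in_replication = target_genes(replication, rsnp_to_genes, threshold)
    return found_in_discovery & found_in_replication


# Toy annotation table: SNP -> set of potentially regulated genes.
rsnp_to_genes = {
    "snp1": {"GENE_A"},
    "snp2": {"GENE_B"},
    "snp3": {"GENE_A", "GENE_C"},
}
discovery = {"snp1": 1e-12, "snp2": 1e-9, "snp4": 1e-20}
replication = {"snp3": 1e-10, "snp2": 0.03}

shared = common_genes(discovery, replication, rsnp_to_genes)
print(sorted(shared))
```

Note that the intersection is taken at the gene level rather than the SNP level, so a gene can be shared even when different SNPs support it in each dataset.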
Affiliation(s)
- Yumeng Jia
  - School of Public Health, Health Science Center, Xi'an Jiaotong University, Xi'an, China
- Xin Qi
  - Precision Medicine Center, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China
- Mei Ma
  - School of Public Health, Health Science Center, Xi'an Jiaotong University, Xi'an, China
- Shiqiang Cheng
  - School of Public Health, Health Science Center, Xi'an Jiaotong University, Xi'an, China
- Bolun Cheng
  - School of Public Health, Health Science Center, Xi'an Jiaotong University, Xi'an, China
- Chujun Liang
  - School of Public Health, Health Science Center, Xi'an Jiaotong University, Xi'an, China
- Xiong Guo
  - School of Public Health, Health Science Center, Xi'an Jiaotong University, Xi'an, China
- Feng Zhang
  - School of Public Health, Health Science Center, Xi'an Jiaotong University, Xi'an, China
2. Jagoda E, Marnetto D, Senevirathne G, Gonzalez V, Baid K, Montinaro F, Richard D, Falzarano D, LeBlanc EV, Colpitts CC, Banerjee A, Pagani L, Capellini TD. Regulatory dissection of the severe COVID-19 risk locus introgressed by Neanderthals. eLife 2023; 12:e71235. [PMID: 36763080 PMCID: PMC9917435 DOI: 10.7554/elife.71235] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 06/12/2021] [Accepted: 01/26/2023] [Indexed: 02/11/2023]
Abstract
Individuals infected with the SARS-CoV-2 virus present with a wide variety of symptoms ranging from asymptomatic to severe and even lethal outcomes. Past research has revealed a genetic haplotype on chromosome 3 that entered the human population via introgression from Neanderthals as the strongest genetic risk factor for the severe response to COVID-19. However, the specific variants along this introgressed haplotype that contribute to this risk and the biological mechanisms that are involved remain unclear. Here, we assess the variants present on the risk haplotype for their likelihood of driving the genetic predisposition to severe COVID-19 outcomes. We do this by first exploring their impact on the regulation of genes involved in COVID-19 infection using a variety of population genetics and functional genomics tools. We then perform a locus-specific massively parallel reporter assay to individually assess the regulatory potential of each allele on the haplotype in a multipotent immune-related cell line. We ultimately reduce the set of over 600 linked genetic variants to identify four introgressed alleles that are strong functional candidates for driving the association between this locus and severe COVID-19. Using reporter assays in the presence/absence of SARS-CoV-2, we find evidence that these variants respond to viral infection. These variants likely drive the locus' impact on severity by modulating the regulation of two critical chemokine receptor genes: CCR1 and CCR5. These alleles are ideal targets for future functional investigations into the interaction between host genomics and COVID-19 outcomes.
Affiliation(s)
- Evelyn Jagoda
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, United States
- Davide Marnetto
  - Estonian Biocentre, Institute of Genomics, University of Tartu, Tartu, Estonia
- Gayani Senevirathne
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, United States
- Victoria Gonzalez
  - Department of Veterinary Microbiology, University of Saskatchewan, Saskatoon, Canada
  - Vaccine and Infectious Disease Organization, University of Saskatchewan, Saskatoon, Canada
- Kaushal Baid
  - Vaccine and Infectious Disease Organization, University of Saskatchewan, Saskatoon, Canada
- Francesco Montinaro
  - Estonian Biocentre, Institute of Genomics, University of Tartu, Tartu, Estonia
  - Department of Biology, University of Bari, Bari, Italy
- Daniel Richard
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, United States
- Darryl Falzarano
  - Department of Veterinary Microbiology, University of Saskatchewan, Saskatoon, Canada
  - Vaccine and Infectious Disease Organization, University of Saskatchewan, Saskatoon, Canada
- Emmanuelle V LeBlanc
  - Department of Biomedical and Molecular Sciences, Queen’s University, Kingston, Canada
- Che C Colpitts
  - Department of Biomedical and Molecular Sciences, Queen’s University, Kingston, Canada
- Arinjay Banerjee
  - Department of Veterinary Microbiology, University of Saskatchewan, Saskatoon, Canada
  - Vaccine and Infectious Disease Organization, University of Saskatchewan, Saskatoon, Canada
  - Department of Biology, University of Waterloo, Waterloo, Canada
  - Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Canada
- Luca Pagani
  - Estonian Biocentre, Institute of Genomics, University of Tartu, Tartu, Estonia
  - Department of Biology, University of Padova, Padova, Italy
- Terence D Capellini
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, United States
  - Broad Institute of MIT and Harvard, Cambridge, United States
3. A computational method for prediction of rSNPs in human genome. Comput Biol Chem 2016; 62:96-103. [PMID: 27107687 DOI: 10.1016/j.compbiolchem.2016.04.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/15/2015] [Revised: 02/27/2016] [Accepted: 04/01/2016] [Indexed: 11/22/2022]
Abstract
Regulatory single nucleotide polymorphisms (rSNPs) in human genomes are thought to be responsible for phenotypic differences, including susceptibility to diseases and treatment outcomes, even though they do not change any gene product. However, a genome-wide search for rSNPs has not been properly addressed so far. In this work, a computational method for rSNP identification is proposed. As background SNPs far outnumber rSNPs, an ensemble method is applied to handle the imbalanced data: it first converts the unbalanced dataset into several balanced ones and then builds a model for each balanced dataset. Two major types of features are extracted: sequence-based features and allele-specific features. A random forest is then applied to build the recognition model for each balanced dataset. Finally, ensemble strategies are adopted to combine the results of the individual models. We tested our method on a set of experimentally verified rSNPs, and leave-one-out cross-validation showed that it achieves a sensitivity of 73.8%, a specificity of 71.8%, and an area under the ROC curve (AUC) of 0.756. In addition, our method is threshold-free and does not rely on regulatory element data, so it should adapt better to different data scenarios. The original data and the MATLAB source code are available at https://sourceforge.net/projects/rsnpdect/.
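The balanced-subset ensemble idea in this abstract can be illustrated with a small sketch. It is not the published MATLAB implementation: the splitting scheme, the majority-vote combination, and all function names are assumptions, and a trivial nearest-centroid learner on one-dimensional toy features stands in for the random forest the paper uses.

```python
import random
from collections import Counter

# Hypothetical sketch of a balanced-subset ensemble for imbalanced data.
# The base learner below is a nearest-centroid stand-in, NOT a random forest.


def balanced_subsets(minority, majority, seed=0):
    """Split the majority class into chunks the size of the minority class
    and pair every chunk with the full minority class."""
    rng = random.Random(seed)
    shuffled = list(majority)
    rng.shuffle(shuffled)
    size = len(minority)
    chunks = [shuffled[i:i + size] for i in range(0, len(shuffled), size)]
    return [(list(minority), chunk) for chunk in chunks]


def train_centroid_model(positives, negatives):
    """Return a classifier that labels x by its nearer class centroid."""
    pos_centroid = sum(positives) / len(positives)
    neg_centroid = sum(negatives) / len(negatives)
    return lambda x: 1 if abs(x - pos_centroid) < abs(x - neg_centroid) else 0


def ensemble_predict(models, x):
    """Combine the per-subset models by simple majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Toy 1-D features: 3 "rSNPs" (minority) versus 9 background SNPs (majority).
rsnp_features = [0.9, 1.0, 1.1]
background_features = [0.0, 0.1, -0.1, 0.2, 0.05, -0.2, 0.15, 0.1, 0.0]

subsets = balanced_subsets(rsnp_features, background_features)
models = [train_centroid_model(pos, neg) for pos, neg in subsets]

print(len(subsets), ensemble_predict(models, 0.95), ensemble_predict(models, 0.05))
```

Each model sees a 1:1 class ratio, so no single model is dominated by the background class, while the vote still uses every background sample once.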
4. Zeng H, Hashimoto T, Kang DD, Gifford DK. GERV: a statistical method for generative evaluation of regulatory variants for transcription factor binding. Bioinformatics 2015; 32:490-6. [PMID: 26476779 DOI: 10.1093/bioinformatics/btv565] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 07/05/2015] [Accepted: 09/22/2015] [Indexed: 01/09/2023]
Abstract
MOTIVATION: The majority of disease-associated variants identified in genome-wide association studies reside in noncoding regions of the genome with regulatory roles. Thus being able to interpret the functional consequence of a variant is essential for identifying causal variants in the analysis of genome-wide association studies. RESULTS: We present GERV (generative evaluation of regulatory variants), a novel computational method for predicting regulatory variants that affect transcription factor binding. GERV learns a k-mer-based generative model of transcription factor binding from ChIP-seq and DNase-seq data, and scores variants by computing the change of predicted ChIP-seq reads between the reference and alternate allele. The k-mers learned by GERV capture more sequence determinants of transcription factor binding than a motif-based approach alone, including both a transcription factor's canonical motif and associated co-factor motifs. We show that GERV outperforms existing methods in predicting single-nucleotide polymorphisms associated with allele-specific binding. GERV correctly predicts a validated causal variant among linked single-nucleotide polymorphisms and prioritizes the variants previously reported to modulate the binding of FOXA1 in breast cancer cell lines. Thus, GERV provides a powerful approach for functionally annotating and prioritizing causal variants for experimental follow-up analysis. AVAILABILITY AND IMPLEMENTATION: The implementation of GERV and related data are available at http://gerv.csail.mit.edu/.
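The general idea of scoring a variant by the difference in a k-mer-based prediction between the reference and alternate allele can be sketched as below. This is a heavily simplified illustration and not GERV itself: the real method learns a generative model of read counts from ChIP-seq and DNase-seq data, whereas here the k-mer weights, the sequences, and the function names are made-up assumptions and the "prediction" is just a sum of weights.

```python
# Hypothetical sketch: score a SNP as (alt prediction - ref prediction),
# where the prediction is a plain sum of per-k-mer weights.
# The weights and sequences are toy placeholders, not learned values.


def kmers(seq, k):
    """All overlapping k-mers of seq, left to right."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def sequence_score(seq, weights, k):
    """Sum the weights of the k-mers in seq (unknown k-mers count as 0)."""
    return sum(weights.get(kmer, 0.0) for kmer in kmers(seq, k))


def variant_score(ref_seq, pos, alt_base, weights, k):
    """Change in score caused by substituting alt_base at 0-based pos."""
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    # Only k-mers overlapping the variant can differ between alleles,
    # so it is enough to compare the window around the variant.
    start = max(0, pos - k + 1)
    end = min(len(ref_seq), pos + k)
    ref_window = ref_seq[start:end]
    alt_window = alt_seq[start:end]
    return sequence_score(alt_window, weights, k) - sequence_score(ref_window, weights, k)


toy_weights = {"TGA": 2.0, "GAC": 1.0}
score = variant_score("ATGACT", pos=3, alt_base="C", weights=toy_weights, k=3)
print(score)
```

A negative score in this toy setup means the alternate allele destroys k-mers that carry positive weight, which would correspond to reduced predicted binding.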
Affiliation(s)
- Haoyang Zeng
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Tatsunori Hashimoto
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- Daniel D Kang
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
- David K Gifford
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02142, USA and Department of Stem Cell and Regenerative Biology, Harvard University and Harvard Medical School, Cambridge, MA 02138, USA
5. Zhenilo S, Khrameeva E, Tsygankova S, Zhigalova N, Mazur A, Prokhortchouk E. Individual genome sequencing identified a novel enhancer element in exon 7 of the CSF1R gene by shift of expressed allele ratios. Gene 2015; 566:223-8. [PMID: 25913741 DOI: 10.1016/j.gene.2015.04.053] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/23/2014] [Revised: 04/16/2015] [Accepted: 04/20/2015] [Indexed: 10/23/2022]
Abstract
The sequencing of individual genetic information may provide a powerful tool for elucidating the mechanism by which individual SNPs affect promoter function. Here, we assessed the genome of a Russian male that was previously sequenced. The RNA-Seq data from blood cells revealed 234 candidate transcripts with shifts of greater than 1.5-fold from equal biallelic transcription. Among these genes, CSF1R had variations in genic regions that affected the association of RORalpha with its target binding site in vivo. The results of a reporter assay confirmed that a single nucleotide substitution, rs2228422, within the RORalpha recognition motif altered the ability of the enhancer to regulate CSF1R gene transcription. Notably, 31% of Europeans and only 3% of Asians are homozygous for the RORalpha-responsive "A" allele, but no association of rs2228422 with disease has been found thus far.
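A filter for transcripts that deviate from equal biallelic expression might look like the sketch below. The abstract does not define exactly how the 1.5-fold shift was computed, so this assumes it is the ratio of read counts between the more and the less expressed allele; the function names and the toy read counts are hypothetical.

```python
# Hypothetical sketch: flag transcripts whose two alleles differ in
# expression by more than a fold threshold (1.5-fold, as in the abstract).
# The ratio definition and the read counts are assumptions.


def allelic_fold_shift(ref_reads, alt_reads):
    """Fold difference between the more and less expressed allele (>= 1)."""
    high, low = max(ref_reads, alt_reads), min(ref_reads, alt_reads)
    if high == 0:
        raise ValueError("no reads cover this site")
    if low == 0:
        # Fully monoallelic expression: treat as an infinite shift.
        return float("inf")
    return high / low


def is_imbalanced(ref_reads, alt_reads, min_fold=1.5):
    """True when the shift from equal biallelic expression exceeds min_fold."""
    return allelic_fold_shift(ref_reads, alt_reads) > min_fold


# Toy (reference reads, alternate reads) per transcript.
toy_counts = {
    "transcript_1": (80, 40),
    "transcript_2": (55, 45),
    "transcript_3": (10, 30),
}
candidates = [
    name for name, (ref, alt) in toy_counts.items() if is_imbalanced(ref, alt)
]
print(candidates)
```

A real analysis would usually add a statistical test on the counts (for example a binomial test against a 0.5 allele fraction), since a raw ratio is unreliable at low coverage.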
Affiliation(s)
- S Zhenilo
  - Center "Bioengineering" Russian Academy of Sciences, 117312, Prospect 60-let Oktyabrya, 7-1, Moscow, Russia
- E Khrameeva
  - Center "Bioengineering" Russian Academy of Sciences, 117312, Prospect 60-let Oktyabrya, 7-1, Moscow, Russia
- S Tsygankova
  - Center "Bioengineering" Russian Academy of Sciences, 117312, Prospect 60-let Oktyabrya, 7-1, Moscow, Russia
- N Zhigalova
  - Center "Bioengineering" Russian Academy of Sciences, 117312, Prospect 60-let Oktyabrya, 7-1, Moscow, Russia
- A Mazur
  - Center "Bioengineering" Russian Academy of Sciences, 117312, Prospect 60-let Oktyabrya, 7-1, Moscow, Russia
- E Prokhortchouk
  - Center "Bioengineering" Russian Academy of Sciences, 117312, Prospect 60-let Oktyabrya, 7-1, Moscow, Russia
6. Regulatory Variants and Disease: The E-Cadherin -160C/A SNP as an Example. Mol Biol Int 2014; 2014:967565. [PMID: 25276428 PMCID: PMC4167656 DOI: 10.1155/2014/967565] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 07/06/2014] [Revised: 08/23/2014] [Accepted: 08/25/2014] [Indexed: 01/04/2023]
Abstract
Single nucleotide polymorphisms (SNPs) occurring in noncoding sequences have largely been ignored in genome-wide association studies (GWAS). Yet mounting evidence suggests that many noncoding SNPs, especially those in the vicinity of protein-coding genes, play important roles in shaping chromatin structure and regulating gene expression and, as such, are implicated in a wide variety of diseases. One such regulatory SNP (rSNP) is the E-cadherin (CDH1) promoter -160C/A SNP (rs16260), which is known to affect E-cadherin promoter transcription by displacing transcription factor binding and has been extensively scrutinized for its association with several diseases, especially malignancies. Findings from studying this SNP highlight the clinical relevance of rSNPs and justify their inclusion in future GWAS to identify novel disease-causing SNPs.
7. Guo L, Du Y, Chang S, Zhang K, Wang J. rSNPBase: a database for curated regulatory SNPs. Nucleic Acids Res 2014; 42:D1033-9. [PMID: 24285297 PMCID: PMC3964952 DOI: 10.1093/nar/gkt1167] [Citation(s) in RCA: 94] [Impact Index Per Article: 9.4] [Received: 08/16/2013] [Accepted: 10/30/2013] [Indexed: 01/20/2023]
Abstract
In recent years, human regulatory SNPs (rSNPs) have been widely studied. Here, we present the database rSNPBase, freely available at http://rsnp.psych.ac.cn/, which provides curated rSNPs by analysing the regulatory features of all SNPs in the human genome with reference to experimentally supported regulatory elements. In contrast with previous SNP functional annotation databases, rSNPBase is characterized by several unique features. (i) To improve reliability, all SNPs in rSNPBase are annotated with reference to experimentally supported regulatory elements. (ii) rSNPBase focuses on rSNPs involved in a wide range of regulation types, including proximal and distal transcriptional regulation and post-transcriptional regulation, and identifies their potentially regulated genes. (iii) Linkage disequilibrium (LD) correlations between SNPs were analysed so that the regulatory feature is annotated to a SNP set rather than a single SNP. (iv) rSNPBase provides spatio-temporal labels and experimental eQTL labels for SNPs. In summary, rSNPBase provides more reliable, comprehensive and user-friendly regulatory annotations on rSNPs and will assist researchers in selecting candidate SNPs for further genetic studies and in exploring causal SNPs for in-depth molecular mechanisms of complex phenotypes.
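Feature (iii), annotating a SNP set rather than a single SNP, can be pictured with the following sketch. It does not reflect how rSNPBase is actually built: the r² cut-off, the data layout, the function name, and all SNP IDs and feature labels are hypothetical.

```python
# Hypothetical sketch: extend regulatory annotations from directly annotated
# SNPs to SNPs in strong linkage disequilibrium (LD) with them.
# The r2 threshold and all identifiers are made-up placeholders.


def annotate_with_ld(direct, ld_pairs, r2_min=0.8):
    """Return SNP -> set of regulatory features, including LD-derived ones.

    direct   : dict mapping SNP -> features supported by regulatory elements
    ld_pairs : iterable of (snp_a, snp_b, r2) tuples
    """
    annotated = {snp: set(features) for snp, features in direct.items()}
    for snp_a, snp_b, r2 in ld_pairs:
        if r2 < r2_min:
            continue
        # Propagate only from directly annotated SNPs, in both directions,
        # so annotations do not spread transitively along LD chains.
        for source, proxy in ((snp_a, snp_b), (snp_b, snp_a)):
            if source in direct:
                annotated.setdefault(proxy, set()).update(direct[source])
    return annotated


direct_annotations = {"snpA": {"enhancer"}}
ld_pairs = [("snpA", "snpB", 0.95), ("snpA", "snpC", 0.40)]

snp_set_annotations = annotate_with_ld(direct_annotations, ld_pairs)
print(snp_set_annotations)
```

Propagating only one LD step from directly annotated SNPs keeps the result independent of the order in which the pairs are listed.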
Affiliation(s)
- Liyuan Guo
  - Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, 16 Lincui Road, Chaoyang District, Beijing 100101, China and University of Chinese Academy of Sciences, 19A Yuquan Road, Beijing, 100049, China
- Yang Du
  - Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, 16 Lincui Road, Chaoyang District, Beijing 100101, China and University of Chinese Academy of Sciences, 19A Yuquan Road, Beijing, 100049, China
- Suhua Chang
  - Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, 16 Lincui Road, Chaoyang District, Beijing 100101, China and University of Chinese Academy of Sciences, 19A Yuquan Road, Beijing, 100049, China
- Kunlin Zhang
  - Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, 16 Lincui Road, Chaoyang District, Beijing 100101, China and University of Chinese Academy of Sciences, 19A Yuquan Road, Beijing, 100049, China
- Jing Wang
  - Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, 16 Lincui Road, Chaoyang District, Beijing 100101, China and University of Chinese Academy of Sciences, 19A Yuquan Road, Beijing, 100049, China