1
Abd Al-Jabar WA, Al-Thuwaini TM. Variation in the AA-NAT gene G203A is associated with Awassi and Hamdani sheep fertility. Anim Biotechnol 2024; 35:2352771. [PMID: 38753969 DOI: 10.1080/10495398.2024.2352771] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/18/2024]
Abstract
Arylalkylamine-N-acetyl-transferase (AA-NAT) is one of several genes that influence sheep reproduction. Thus, the objective of this study was to investigate whether genetic variability within the AA-NAT gene influenced the reproductive performance of Awassi and Hamdani ewes. A total of 99 twin and 101 single-progeny ewes were analyzed for genomic DNA. Polymerase chain reaction (PCR) was used to produce amplicons of 300, 313, and 287 bp from exons 1, 2, and 3 of the AA-NAT gene. A 300-bp amplicon was genotyped, resulting in two genotypes: GG and GA. Through sequence analysis, a mutation 203 G > A was identified in the GA genotype. The statistical analysis revealed a strong correlation between the single nucleotide polymorphism (SNP) 203 G > A and reproductive performance. Ewes carrying this mutation showed significantly increased litter sizes, twinning rates, lambing rates, and fewer days to lambing compared to those carrying GG. These findings demonstrate that the presence of the 203 G > A SNP variant has a significant positive impact on litter sizes and enhances the fertility of Awassi and Hamdani sheep.
Affiliation(s)
- Waleed A Abd Al-Jabar
- Department of Animal Production, College of Agriculture, Al-Qasim Green University, Al-Qasim, Babil, Iraq
- Tahreer M Al-Thuwaini
- Department of Animal Production, College of Agriculture, Al-Qasim Green University, Al-Qasim, Babil, Iraq
2
Imran FS, Al-Thuwaini TM. The Novel PTX3 Variant g.22645332G>T Is Strongly Related to Awassi and Hamdani Sheep Litter Size. Bioinform Biol Insights 2024; 18:11779322241248912. [PMID: 38681096 PMCID: PMC11047254 DOI: 10.1177/11779322241248912] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Accepted: 04/04/2024] [Indexed: 05/01/2024] Open
Abstract
The detection of polymorphisms in genes that control livestock reproduction could be highly beneficial for identifying and enhancing economic traits. One of these genes is pentraxin 3 (PTX3), which affects the reproduction of sheep. Therefore, this study investigated whether the variability of the PTX3 gene was related to the litter size of Awassi and Hamdani ewes. A total of 200 ewes (130 Awassi and 70 Hamdani) were used for genomic DNA extraction. Polymerase chain reaction was used to amplify the sequence fragments of exons 1, 2, 3, and 4 from the PTX3 gene (Oar_v4.0; Chr 1, NC_056054.1), resulting in products of 254, 312, 302, and 253 bp, respectively. Two genotypes, GG and GT, were identified for the 302 bp amplicon. A novel mutation was discovered through sequence analysis in the GT genotype at position g.22645332G>T. The statistical analysis revealed a significant association between the single nucleotide polymorphism (SNP g.22645332G>T; Oar_v4.0; Chr 1, NC_056054.1) and litter size. Ewes carrying the SNP g.22645332G>T (Oar_v4.0; Chr 1, NC_056054.1) differed significantly from ewes with the GG genotype. The difference was apparent in several aspects, including litter sizes, twinning rates, lambing rates, litter weight at birth, and days to lambing. There were fewer lambs born to ewes with the GG genotype than to ewes with the GT genotype. The variant SNP g.22645332G>T (Oar_v4.0; Chr 1, NC_056054.1) has positive effects on the litter size of Awassi and Hamdani sheep. The SNP g.22645332G>T (Oar_v4.0; Chr 1, NC_056054.1) has been associated with an increase in litter size and higher prolificacy in ewes.
Affiliation(s)
- Faris S Imran
- Department of Animal Production, College of Agriculture, Al-Qasim Green University, Babil, Iraq
- Department of Public Health, Faculty of Veterinary Medicine, Kerbala University, Karbala, Iraq
- Tahreer M Al-Thuwaini
- Department of Animal Production, College of Agriculture, Al-Qasim Green University, Babil, Iraq
3
Reshetnikov E, Churnosova M, Reshetnikova Y, Stepanov V, Bocharova A, Serebrova V, Trifonova E, Ponomarenko I, Sorokina I, Efremova O, Orlova V, Batlutskaya I, Ponomarenko M, Churnosov V, Aristova I, Polonikov A, Churnosov M. Maternal Age at Menarche Genes Determines Fetal Growth Restriction Risk. Int J Mol Sci 2024; 25:2647. [PMID: 38473894 DOI: 10.3390/ijms25052647] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2023] [Revised: 02/06/2024] [Accepted: 02/14/2024] [Indexed: 03/14/2024] Open
Abstract
We aimed to explore the potential link between maternal age at menarche (mAAM) gene polymorphisms and the risk of fetal growth restriction (FGR). This case (FGR)-control (FGR free) study included 904 women (273 FGR and 631 control) in the third trimester of gestation examined/treated in the Departments of Obstetrics. For single nucleotide polymorphism (SNP) multiplex genotyping, 50 candidate loci of mAAM were chosen. The relationship between mAAM SNPs and FGR was assessed by regression procedures (logistic/model-based multifactor dimensionality reduction [MB-MDR]) with subsequent in silico assessment of the presumed functionality of FGR-related loci. Three mAAM-related loci were FGR-linked, in genes such as KISS1 (rs7538038) (effect allele G; odds ratio (OR): OR_allelic = 0.63/p_perm = 0.0003; OR_additive = 0.61/p_perm = 0.001; OR_dominant = 0.56/p_perm = 0.001), NKX2-1 (rs999460) (effect allele A; OR_allelic = 1.37/p_perm = 0.003; OR_additive = 1.45/p_perm = 0.002; OR_recessive = 2.41/p_perm = 0.0002), GPRC5B (rs12444979) (effect allele T; OR_allelic = 1.67/p_perm = 0.0003; OR_dominant = 1.59/p_perm = 0.011; OR_additive = 1.56/p_perm = 0.009). The haplotype ACA of the FSHB gene (rs555621*rs11031010*rs1782507) was FGR-correlated (OR = 0.71/p_perm = 0.05). Ten FGR-implicated interaction models were found for 13 SNPs (p_perm ≤ 0.001). The interactions of rs999460 NKX2-1 and rs12444979 GPRC5B significantly influenced the FGR risk (these SNPs were present in 50% of models). The 15 FGR-related mAAM-related polymorphic variants and 350 linked SNPs were functionally significant in relation to 39 genes participating in the regulation of hormone levels, the ovulation cycle process, male gonad development and vitamin D metabolism. Thus, this study showed, for the first time, that the mAAM-related genes determine FGR risk.
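As a rough illustration of what an allelic odds ratio such as those reported in this abstract measures, the following minimal Python sketch computes an OR from a 2x2 allele-count table, with a 95% confidence interval from Woolf's log method. The function name and the counts are hypothetical and are not taken from the study; the study itself reports permutation-based p values, which this sketch does not reproduce.

```python
import math


def allelic_odds_ratio(case_effect, case_other, ctrl_effect, ctrl_other):
    """Return (OR, lower, upper) for a 2x2 allele-count table.

    The 95% interval uses Woolf's method: exp(log(OR) +/- 1.96 * SE),
    where SE is the square root of the sum of reciprocal cell counts.
    """
    odds_ratio = (case_effect * ctrl_other) / (case_other * ctrl_effect)
    se_log_or = math.sqrt(
        1 / case_effect + 1 / case_other + 1 / ctrl_effect + 1 / ctrl_other
    )
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts for illustration only (NOT data from the study):
# effect/other alleles in cases, then effect/other alleles in controls.
or_value, ci_lower, ci_upper = allelic_odds_ratio(100, 400, 300, 900)
print(f"OR = {or_value:.2f} (95% CI {ci_lower:.2f}-{ci_upper:.2f})")
```

An OR below 1 for the effect allele, as reported for KISS1 rs7538038, indicates that the allele is less frequent among cases than among controls.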
Affiliation(s)
- Evgeny Reshetnikov
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia
- Maria Churnosova
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia
- Yuliya Reshetnikova
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia
- Vadim Stepanov
- Research Institute for Medical Genetics, Tomsk National Research Medical Center of the Russian Academy of Sciences, 634050 Tomsk, Russia
- Anna Bocharova
- Research Institute for Medical Genetics, Tomsk National Research Medical Center of the Russian Academy of Sciences, 634050 Tomsk, Russia
- Victoria Serebrova
- Research Institute for Medical Genetics, Tomsk National Research Medical Center of the Russian Academy of Sciences, 634050 Tomsk, Russia
- Ekaterina Trifonova
- Research Institute for Medical Genetics, Tomsk National Research Medical Center of the Russian Academy of Sciences, 634050 Tomsk, Russia
- Irina Ponomarenko
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia
- Inna Sorokina
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia
- Olga Efremova
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia
- Valentina Orlova
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia
- Irina Batlutskaya
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia
- Marina Ponomarenko
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia
- Vladimir Churnosov
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia
- Inna Aristova
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia
- Alexey Polonikov
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia
- Department of Biology, Medical Genetics and Ecology and Research Institute for Genetic and Molecular Epidemiology, Kursk State Medical University, 305041 Kursk, Russia
- Mikhail Churnosov
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia
4
Imran FS, Al-Thuwaini TM. The novel C268A variant of BMP2 is linked to the reproductive performance of Awassi and Hamdani sheep. Mol Biol Rep 2024; 51:267. [PMID: 38302768 DOI: 10.1007/s11033-024-09274-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2023] [Accepted: 01/19/2024] [Indexed: 02/03/2024]
Abstract
BACKGROUND Prolificacy-associated genetic markers can be utilized to enhance litter size in the sheep breeding industry. Sheep reproduction is influenced by a multitude of genes, including bone morphogenetic protein 2 (BMP2). This study aimed to explore the potential relationship between variability in the BMP2 gene and reproductive performance in Awassi and Hamdani ewes. METHODS AND RESULTS The genomic DNA was extracted from 99 single-progeny ewes and 101 twin ewes. Polymerase chain reaction (PCR) was employed to produce an amplicon consisting of four sequence fragments: 277 bp, 251 bp, 331 bp, and 340 bp, from exons 1, 2, 3, and 4 of the BMP2 gene, respectively. Three genotypes were identified for amplicons in exon 4 with 340-bp lengths: CC, CA, and AA. Upon analyzing the sequence of the CA genotype 382 C > A, a novel mutation was discovered in this genotype. A robust association was identified between the single nucleotide polymorphisms (SNP) 382 C > A and reproductive performance through statistical analysis. An important distinction was discovered between ewes carrying SNP 382 C > A and those carrying CC in terms of litter sizes, twinning rates, lambing rates, and days to lambing. An analysis of logistic regression revealed a significant association between litter size and the 382 C > A SNP. There was a decrease in lamb production among ewes with the CC genotype compared to those with the CA and AA genotypes. CONCLUSIONS These results indicate that the SNP variant 382 C > A has a positive influence on the reproductive performance of Awassi and Hamdani sheep. Sheep carrying the 382 C > A SNP exhibit increased litter size and overall productivity compared to those without the SNP.
Affiliation(s)
- Faris S Imran
- Department of Animal Production, College of Agriculture, Al-Qasim Green University, Al-Qasim, Babil, Iraq
- Tahreer M Al-Thuwaini
- Department of Animal Production, College of Agriculture, Al-Qasim Green University, Al-Qasim, Babil, Iraq
5
López-Pérez M, Aguirre-Garrido F, Herrera-Zúñiga L, Fernández FJ. Gene as a dynamical notion: An extensive and integrative vision. Redefining the gene concept, from traditional to genic-interaction, as a new dynamical version. Biosystems 2023; 234:105060. [PMID: 37844827 DOI: 10.1016/j.biosystems.2023.105060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Revised: 09/08/2023] [Accepted: 10/10/2023] [Indexed: 10/18/2023]
Abstract
The current concept of the gene has been very useful during the 20th and 21st centuries. However, recent advances in molecular biology and bioinformatics, which have further diversified the functional and adaptive profile of genetic information and its integration with cell physiology and environmental response, have drawn attention to new gene properties beyond the traditional definition. Gene expression is inherently complex, and its adaptive objective must be referred to the Tortoise-Hare model, in which two tendencies converge: one focused on rapid adaptation to achieve survival, and the other preventing an over-adaptation effect. In this context, a revision of the gene concept must be made, which must include these new mechanisms and approaches. In this paper, we propose a new conception of the idea of a gene that moves from a static and defined version of hereditary information to a dynamic idea that prioritizes gene interaction (circumscribed to that established between protein-protein, protein-nucleic acid, and nucleic acid-nucleic acid) and the selection it exerts, as the irreducible element that works in a coordinated way in a genomic regulatory network (GRN).
Affiliation(s)
- Marcos López-Pérez
- Environmental Sciences Department, Universidad Autónoma Metropolitana (Lerma Unit) Av. de las Garzas N° 10, Col. El Panteón, Municipio de Lerma de Villada, Estado de México, C.P. 52005, Mexico
- Félix Aguirre-Garrido
- Environmental Sciences Department, Universidad Autónoma Metropolitana (Lerma Unit) Av. de las Garzas N° 10, Col. El Panteón, Municipio de Lerma de Villada, Estado de México, C.P. 52005, Mexico
- Leonardo Herrera-Zúñiga
- Chemistry Department, Universidad Autónoma Metropolitana (Iztapalapa Unit), C.P. 09340, Mexico City, Mexico
- Francisco J Fernández
- Biotechnology Department, Universidad Autónoma Metropolitana (Iztapalapa Unit), C.P. 09340, Mexico City, Mexico
6
Sahi N, Haider L, Chung K, Prados Carrasco F, Kanber B, Samson R, Thompson AJ, Gandini Wheeler-Kingshott CAM, Trip SA, Brownlee W, Ciccarelli O, Barkhof F, Tur C, Houlden H, Chard D. Genetic influences on disease course and severity, 30 years after a clinically isolated syndrome. Brain Commun 2023; 5:fcad255. [PMID: 37841069 PMCID: PMC10576246 DOI: 10.1093/braincomms/fcad255] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Revised: 08/31/2023] [Accepted: 10/02/2023] [Indexed: 10/17/2023] Open
Abstract
Multiple sclerosis risk has a well-established polygenic component, yet the genetic contribution to disease course and severity remains unclear and difficult to examine. Accurately measuring disease progression requires long-term study of clinical and radiological outcomes with sufficient follow-up duration to confidently confirm disability accrual and multiple sclerosis phenotypes. In this retrospective study, we explore genetic influences on long-term disease course and severity in a unique cohort of clinically isolated syndrome patients with homogeneous 30-year disease duration, deep clinical phenotyping and advanced MRI metrics. Sixty-one clinically isolated syndrome patients [41 female (67%): 20 male (33%)] underwent clinical and MRI assessment at baseline, 1-, 5-, 10-, 14-, 20- and 30-year follow-up (mean age ± standard deviation: 60.9 ± 6.5 years). After 30 years, 29 patients developed relapsing-remitting multiple sclerosis, 15 developed secondary progressive multiple sclerosis and 17 still had a clinically isolated syndrome. Twenty-seven genes were investigated for associations with clinical outcomes [including disease course and Expanded Disability Status Scale (EDSS)] and brain MRI (including white matter lesions, cortical lesions, and brain tissue volumes) at the 30-year follow-up. Genetic associations with changes in EDSS, relapses, white matter lesions and brain atrophy (third ventricular and medullary measurements) over 30 years were assessed using mixed-effects models. HLA-DRB1*1501-positive (n = 26) patients showed faster white matter lesion accrual [+1.96 lesions/year (0.64-3.29), P = 3.8 × 10⁻³], greater 30-year white matter lesion volumes [+11.60 ml, (5.49-18.29), P = 1.27 × 10⁻³] and higher annualized relapse rates [+0.06 relapses/year (0.005-0.11), P = 0.031] compared with HLA-DRB1*1501-negative patients (n = 35). PVRL2-positive patients (n = 41) had more cortical lesions (+0.83 [0.08-1.66], P = 0.042), faster EDSS worsening [+0.06 points/year (0.02-0.11), P = 0.010], greater 30-year EDSS [+1.72 (0.49-2.93), P = 0.013; multiple sclerosis cases: +2.60 (1.30-3.87), P = 2.02 × 10⁻³], and greater risk of secondary progressive multiple sclerosis [odds ratio (OR) = 12.25 (1.15-23.10), P = 0.031] than PVRL2-negative patients (n = 18). In contrast, IRX1-positive (n = 30) patients had preserved 30-year grey matter fraction [+0.76% (0.28-1.29), P = 8.4 × 10⁻³], lower risk of cortical lesions [OR = 0.22 (0.05-0.99), P = 0.049] and lower 30-year EDSS [-1.35 (-0.87,-3.44), P = 0.026; multiple sclerosis cases: -2.12 (-0.87, -3.44), P = 5.02 × 10⁻³] than IRX1-negative patients (n = 30). In multiple sclerosis cases, IRX1-positive patients also had slower EDSS worsening [-0.07 points/year (-0.01,-0.13), P = 0.015] and lower risk of secondary progressive multiple sclerosis [OR = 0.19 (0.04-0.92), P = 0.042]. These exploratory findings support diverse genetic influences on pathological mechanisms associated with multiple sclerosis disease course. HLA-DRB1*1501 influenced white matter inflammation and relapses, while IRX1 (protective) and PVRL2 (adverse) were associated with grey matter pathology (cortical lesions and atrophy), long-term disability worsening and the risk of developing secondary progressive multiple sclerosis.
Affiliation(s)
- Nitin Sahi
- NMR Research Unit, Queen Square Multiple Sclerosis Centre, University College London Queen Square Institute of Neurology, London WC1N 3BG, UK
- Lukas Haider
- NMR Research Unit, Queen Square Multiple Sclerosis Centre, University College London Queen Square Institute of Neurology, London WC1N 3BG, UK
- Department of Biomedical Imaging and Image Guided Therapy, Medical University Vienna, 1090 Vienna, Austria
- Karen Chung
- NMR Research Unit, Queen Square Multiple Sclerosis Centre, University College London Queen Square Institute of Neurology, London WC1N 3BG, UK
- Ferran Prados Carrasco
- NMR Research Unit, Queen Square Multiple Sclerosis Centre, University College London Queen Square Institute of Neurology, London WC1N 3BG, UK
- Centre for Medical Image Computing (CMIC), Department of Medical Physics and Biomedical Engineering, University College London, London WC1E 6BT, UK
- Universitat Oberta de Catalunya, 08018 Barcelona, Spain
- Baris Kanber
- NMR Research Unit, Queen Square Multiple Sclerosis Centre, University College London Queen Square Institute of Neurology, London WC1N 3BG, UK
- Centre for Medical Image Computing (CMIC), Department of Medical Physics and Biomedical Engineering, University College London, London WC1E 6BT, UK
- Department of Clinical and Experimental Epilepsy, University College London, London WC1N 3BG, UK
- Rebecca Samson
- NMR Research Unit, Queen Square Multiple Sclerosis Centre, University College London Queen Square Institute of Neurology, London WC1N 3BG, UK
- Alan J Thompson
- NMR Research Unit, Queen Square Multiple Sclerosis Centre, University College London Queen Square Institute of Neurology, London WC1N 3BG, UK
- Claudia A M Gandini Wheeler-Kingshott
- NMR Research Unit, Queen Square Multiple Sclerosis Centre, University College London Queen Square Institute of Neurology, London WC1N 3BG, UK
- Department of Brain and Behavioural Sciences, University of Pavia, 27100 Pavia, Italy
- Brain MRI 3T Research Centre, IRCCS Mondino Foundation, 27100 Pavia, Italy
- S Anand Trip
- NMR Research Unit, Queen Square Multiple Sclerosis Centre, University College London Queen Square Institute of Neurology, London WC1N 3BG, UK
- Wallace Brownlee
- NMR Research Unit, Queen Square Multiple Sclerosis Centre, University College London Queen Square Institute of Neurology, London WC1N 3BG, UK
- National Institute for Health and Care Research (NIHR) University College London Hospitals (UCLH) Biomedical Research Centre, London W1T 7DN, UK
- Olga Ciccarelli
- NMR Research Unit, Queen Square Multiple Sclerosis Centre, University College London Queen Square Institute of Neurology, London WC1N 3BG, UK
- National Institute for Health and Care Research (NIHR) University College London Hospitals (UCLH) Biomedical Research Centre, London W1T 7DN, UK
- Frederik Barkhof
- NMR Research Unit, Queen Square Multiple Sclerosis Centre, University College London Queen Square Institute of Neurology, London WC1N 3BG, UK
- Centre for Medical Image Computing (CMIC), Department of Medical Physics and Biomedical Engineering, University College London, London WC1E 6BT, UK
- National Institute for Health and Care Research (NIHR) University College London Hospitals (UCLH) Biomedical Research Centre, London W1T 7DN, UK
- Department of Radiology and Nuclear Medicine, VU University Medical Centre, 1081 HV Amsterdam, The Netherlands
- Carmen Tur
- NMR Research Unit, Queen Square Multiple Sclerosis Centre, University College London Queen Square Institute of Neurology, London WC1N 3BG, UK
- MS Centre of Catalonia (Cemcat), Vall d'Hebron Institute of Research, Vall d'Hebron Barcelona Hospital Campus, 08035 Barcelona, Spain
- Henry Houlden
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, Queen’s Square House, Queen’s Square, London, WC1N 3BG, UK
- Declan Chard
- NMR Research Unit, Queen Square Multiple Sclerosis Centre, University College London Queen Square Institute of Neurology, London WC1N 3BG, UK
- National Institute for Health and Care Research (NIHR) University College London Hospitals (UCLH) Biomedical Research Centre, London W1T 7DN, UK
7
Fallah F, Colagar AH, Saleh HA, Ranjbar M. Variation of the genes encoding antioxidant enzymes SOD2 (rs4880), GPX1 (rs1050450), and CAT (rs1001179) and susceptibility to male infertility: a genetic association study and in silico analysis. Environ Sci Pollut Res Int 2023; 30:86412-86424. [PMID: 37405601 DOI: 10.1007/s11356-023-28474-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/24/2023] [Accepted: 06/23/2023] [Indexed: 07/06/2023]
Abstract
Enzymatic factors including superoxide dismutase (SOD), glutathione peroxidase (GPX), and catalase (CAT) are among the most important protective antioxidant systems in human semen. This study was conducted to investigate the association between the activities of these enzymes in semen and male infertility, as well as the association between the SOD2 rs4880, GPX1 rs1050450, and CAT rs1001179 polymorphisms and male infertility, followed by a bioinformatics analysis. This case-control study included 223 infertile men and 154 healthy fertile men. After extracting genomic DNA from semen samples, the genotypes of the rs1001179, rs1050450, and rs4880 polymorphisms were determined using PCR-RFLP. Next, the activities of the SOD, CAT, and GPX enzymes were measured in semen. Bioinformatics software was used to investigate the effect of the polymorphisms on gene function. Data analysis indicated that the rs1001179 polymorphism was not associated with male infertility. However, the rs1050450 polymorphism was associated with a reduced risk of male infertility as well as asthenozoospermia and teratozoospermia. In addition, the rs4880 polymorphism was associated with an increased risk of male infertility as well as teratozoospermia. Further analysis showed that the activity of the CAT enzyme in the infertile group was significantly higher than in the fertile group, whereas the activity of the GPX and SOD enzymes in the infertile group was significantly lower than in the fertile group. Bioinformatic analysis showed that the rs1001179 polymorphism affects the transcription factor binding site upstream of the gene, while the rs1050450 and rs4880 polymorphisms had an essential role in protein structure and function. The rs1050450 T allele was associated with a reduced risk of male infertility and may be a protective factor, whereas the SOD2 rs4880 C allele was associated with an increased risk of male infertility and is considered a risk factor. To reach accurate results, we recommend further studies of the effects of the SOD2 rs4880 and GPX1 rs1050450 polymorphisms in different populations with larger sample sizes, as well as meta-analyses.
Affiliation(s)
- Fatemeh Fallah
- Department of Molecular and Cell Biology, Faculty of Science, University of Mazandaran, Babolsar, CP:47416-95447, Mazandaran, Iran
- Abasalt Hosseinzadeh Colagar
- Department of Molecular and Cell Biology, Faculty of Science, University of Mazandaran, Babolsar, CP:47416-95447, Mazandaran, Iran
- Hayder Abdulhadi Saleh
- Department of Molecular and Cell Biology, Faculty of Science, University of Mazandaran, Babolsar, CP:47416-95447, Mazandaran, Iran
- Mojtaba Ranjbar
- Faculty of Biotechnology, Amol University of Special Modern Technologies, Amol, Mazandaran, Iran
8
Alkhammas AH, Al-Thuwaini TM, Al-Shuhaib MBS, Khazaal NM. Association of Novel C319T Variant of PITX2 Gene 3'UTR Region With Reproductive Performance in Awassi Sheep. Bioinform Biol Insights 2023; 17:11779322231179018. [PMID: 37313032 PMCID: PMC10259137 DOI: 10.1177/11779322231179018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2022] [Accepted: 05/13/2023] [Indexed: 06/15/2023] Open
Abstract
Several genes influence sheep's reproductive performance, among them the paired-like homeodomain transcription factor 2 (PITX2) gene. Thus, this study aimed to examine whether the variability within the PITX2 gene is associated with the reproductive performance of Awassi ewes. A total of 123 single-progeny ewes and 109 twin ewes were used to extract genomic DNA. An amplicon of 4 sequence fragments from exons 2, 4, 5 (upstream portion), and 5 (downstream portion) of the PITX2 gene was generated by polymerase chain reaction (PCR), 228, 304, 381, and 382 bp, respectively. Three genotypes of 382 bp amplicons were identified: CC, CT, and TT. Sequence analysis revealed a novel mutation in the CT genotype 319C > T. Statistical analysis revealed that single-nucleotide polymorphism (SNP) 319C > T was associated with reproductive performance. Single-nucleotide polymorphism 319C > T-carrying ewes had significantly (P ⩽ .01) lower litter sizes, twinning rates, lambing rates, and more days to lambing than those carrying CT and CC genotypes. Based on a logistic regression analysis, it was confirmed that the 319C > T SNP decreased litter size. Ewes with TT genotype produced fewer lambs than ewes with CT and CC genotypes. According to these results, the variant 319C> T SNP negatively affects the reproductive performance of Awassi sheep. Ewes carrying the 319C > T SNP have a lower litter size and are less prolific than those without the SNP.
Affiliation(s)
- Ahmed H Alkhammas
- Department of Animal Production, College of Agriculture, Al-Qasim Green University, Al-Qasim, Iraq
- Tahreer M Al-Thuwaini
- Department of Animal Production, College of Agriculture, Al-Qasim Green University, Al-Qasim, Iraq
- Neam M Khazaal
- Department of Physiology, Biochemistry and Pharmacology, College of Veterinary Medicine, University of Baghdad, Baghdad, Iraq
9
An autoimmune pleiotropic SNP modulates IRF5 alternative promoter usage through ZBTB3-mediated chromatin looping. Nat Commun 2023; 14:1208. [PMID: 36869052 PMCID: PMC9984425 DOI: 10.1038/s41467-023-36897-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 08/17/2021] [Accepted: 02/22/2023] [Indexed: 03/05/2023] Open
Abstract
Genetic sharing is extensively observed for autoimmune diseases, but the causal variants and their underlying molecular mechanisms remain largely unknown. Through systematic investigation of autoimmune disease pleiotropic loci, we found most of these shared genetic effects are transmitted from regulatory code. We used an evidence-based strategy to functionally prioritize causal pleiotropic variants and identify their target genes. A top-ranked pleiotropic variant, rs4728142, yielded many lines of evidence as being causal. Mechanistically, the rs4728142-containing region interacts with the IRF5 alternative promoter in an allele-specific manner and orchestrates its upstream enhancer to regulate IRF5 alternative promoter usage through chromatin looping. A putative structural regulator, ZBTB3, mediates the allele-specific loop to promote IRF5-short transcript expression at the rs4728142 risk allele, resulting in IRF5 overactivation and M1 macrophage polarization. Together, our findings establish a causal mechanism between the regulatory variant and fine-scale molecular phenotype underlying the dysfunction of pleiotropic genes in human autoimmunity.
10
Al-Thuwaini TM, Albazi WJ, Al-Shuhaib MBS, Merzah LH, Mohammed RG, Rhadi FA, Abd Al-Hadi AB, Alkhammas AH. A Novel c.100C > G Mutation in the FST Gene and Its Relation With the Reproductive Traits of Awassi Ewes. Bioinform Biol Insights 2023; 17:11779322231170988. [PMID: 37153841 PMCID: PMC10159244 DOI: 10.1177/11779322231170988] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2023] [Accepted: 04/04/2023] [Indexed: 05/10/2023] Open
Abstract
Reproductive traits are affected by many factors, including ovarian function, hormones, and genetics. Genetic polymorphisms of candidate genes are associated with reproductive traits. Several candidate genes are associated with economic traits, including the follistatin (FST) gene. Thus, this study aimed to evaluate whether the genetic variations in the FST gene are associated with the reproductive traits in Awassi ewes. The genomic DNA was extracted from 109 twin ewes and 123 single-progeny ewes. Therefore, 4 sequence fragments from the FST gene were amplified using polymerase chain reaction (PCR) (exon 2/240, exon 3/268, exon 4/254, and exon 5/266 bp, respectively). For a 254 bp amplicon, 3 genotypes were identified: CC, CG, and GG. Sequencing revealed a novel mutation in CG genotypes c.100C > G. The statistical analysis of c.100C > G showed an association with reproductive characteristics. Ewes carrying the c.100C > G had significantly (P ⩽ .01) lower litter sizes, twinning rates, lambing rates, and more days to lambing compared with CG and CC genotypes. Logistic regression analysis confirmed that the c.100C > G single-nucleotide polymorphism (SNP) is responsible for decreasing litter size. According to these results, the variant c.100C > G negatively affects the traits of interest and is associated with lower reproductive traits in Awassi sheep. As a result of this study, ewes carrying the c.100C > G SNP have lower litter size and are less prolific.
Affiliation(s)
- Tahreer M Al-Thuwaini
- Department of Animal Production, College of Agriculture, Al-Qasim Green University, Al-Qasim, Iraq
- Wefak J Albazi
- Department of Physiology, College of Veterinary Medicine, University of Kerbala, Kerbala, Iraq
- Layth H Merzah
- Department of Animal Production, College of Agriculture, Al-Qasim Green University, Al-Qasim, Iraq
- Rihab G Mohammed
- Department of Animal Production, College of Agriculture, Al-Qasim Green University, Al-Qasim, Iraq
- Fadhil A Rhadi
- Department of Animal Production, College of Agriculture, Al-Qasim Green University, Al-Qasim, Iraq
- Ali B Abd Al-Hadi
- Department of Animal Production, College of Agriculture, Al-Qasim Green University, Al-Qasim, Iraq
- Ahmed H Alkhammas
- Department of Animal Production, College of Agriculture, Al-Qasim Green University, Al-Qasim, Iraq
|
11
|
McAfee JC, Bell JL, Krupa O, Matoba N, Stein JL, Won H. Focus on your locus with a massively parallel reporter assay. J Neurodev Disord 2022; 14:50. [PMID: 36085003 PMCID: PMC9463819 DOI: 10.1186/s11689-022-09461-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/19/2021] [Accepted: 09/01/2022] [Indexed: 01/01/2023] Open
Abstract
A growing number of variants associated with risk for neurodevelopmental disorders have been identified by genome-wide association and whole genome sequencing studies. As common risk variants often fall within large haplotype blocks covering long stretches of the noncoding genome, the causal variants within an associated locus are often unknown. Similarly, the effect of rare noncoding risk variants identified by whole genome sequencing on molecular traits is seldom known without functional assays. A massively parallel reporter assay (MPRA) is an assay that can functionally validate thousands of regulatory elements simultaneously using high-throughput sequencing and barcode technology. MPRA has been adapted to various experimental designs that measure gene regulatory effects of genetic variants within cis- and trans-regulatory elements as well as posttranscriptional processes. This review discusses different MPRA designs that have been or could be used in the future to experimentally validate genetic variants associated with neurodevelopmental disorders. Although MPRA has limitations, such as not modeling genomic context, this assay can help narrow down the underlying genetic causes of neurodevelopmental disorders by screening thousands of sequences in one experiment. We conclude by describing future directions of this technique such as applications of MPRA for gene-by-environment interactions and pharmacogenetics.
Affiliation(s)
- Jessica C. McAfee
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Jessica L. Bell
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Oleh Krupa
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Nana Matoba
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Jason L. Stein
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Hyejung Won
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
|
12
|
Variants of the SCD gene and their association with fatty acid composition in Awassi sheep. Mol Biol Rep 2022; 49:7807-7813. [PMID: 35652978 DOI: 10.1007/s11033-022-07606-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/21/2022] [Accepted: 05/13/2022] [Indexed: 10/18/2022]
Abstract
BACKGROUND: Genetic factors affect the variability of fatty acid composition in ruminant products. Thus, this study aimed to investigate the association between variations of the SCD gene and fatty acid composition in Awassi sheep. METHODS AND RESULTS: A total of 100 Awassi rams between one and two and a half years of age were used in this study. Blood samples were taken at abattoirs in Babylon, and longissimus dorsi (LD) muscle samples were taken from each animal to measure the fatty acid composition. DNA was isolated from each blood sample, then PCR-single strand conformation polymorphism (PCR-SSCP) experiments were conducted for genotyping, followed by sequencing reactions. The study identified two genotypes (TT and TA) of the SCD gene (exon 3). Several novel variants were discovered in the amplified fragments of the SCD gene. CONCLUSIONS: The TA genotype resulted in increased intramuscular fat and monounsaturated fatty acids compared to the TT genotype. Breeding for the TA genotype could be used to produce meat containing less saturated fatty acids and more monounsaturated fatty acids, making the meat more favorable for human consumption.
|
13
|
Boujemaa M, Mighri N, Chouchane L, Boubaker MS, Abdelhak S, Boussen H, Hamdi Y. Health influenced by genetics: A first comprehensive analysis of breast cancer high and moderate penetrance susceptibility genes in the Tunisian population. PLoS One 2022; 17:e0265638. [PMID: 35333900 PMCID: PMC8956157 DOI: 10.1371/journal.pone.0265638] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/01/2021] [Accepted: 03/04/2022] [Indexed: 12/03/2022] Open
Abstract
Significant advances have been made to understand the genetic basis of breast cancer. High, moderate and low penetrance variants have been identified with inter-ethnic variability in mutation frequency and spectrum. Genome wide association studies (GWAS) are widely used to identify disease-associated SNPs. Understanding the functional impact of these risk-SNPs will help the translation of GWAS findings into clinical interventions. Here we aim to characterize the genetic patterns of high and moderate penetrance breast cancer susceptibility genes and to assess the functional impact of non-coding SNPs. We analyzed BRCA1/2, PTEN, STK11, TP53, ATM, BRIP1, CHEK2 and PALB2 genotype data obtained from 135 healthy participants genotyped using Affymetrix Genome-Wide Human SNP-Array 6.0. Haplotype analysis was performed using Haploview.V4.2 and PHASE.V2.1. Population structure and genetic differentiation were assessed using principal component analysis (PCA) and fixation index (FST). Functional annotation was performed using in silico web-based tools including RegulomeDB and VARAdb. Haplotype analysis showed distinct LD patterns with high levels of recombination and haplotype blocks of moderate to small size. Our findings also revealed that the Tunisian population tends to have a mixed origin with European, South Asian and Mexican footprints. Functional annotation allowed the selection of 28 putative regulatory variants. Of special interest were BRCA1_rs8176318, predicted to alter the binding sites of a tumor suppressor miRNA, hsa-miR-149, and PALB2_rs120963, located in a tumorigenesis-associated enhancer and predicted to strongly affect the binding of P53. Significant differences in allele frequencies were observed with populations of African and European ancestries for rs8176318 and rs120963 respectively. Our findings will help to better understand the genetic basis of breast cancer by guiding upcoming genome wide studies in the Tunisian population. Putative functional SNPs may be used to develop an efficient polygenic risk score to predict breast cancer risk leading to better disease prevention and management.
Affiliation(s)
- Maroua Boujemaa
- Laboratory of Biomedical Genomics and Oncogenetics, Institut Pasteur de Tunis, University of Tunis El Manar, Tunis, Tunisia
- Najah Mighri
- Laboratory of Biomedical Genomics and Oncogenetics, Institut Pasteur de Tunis, University of Tunis El Manar, Tunis, Tunisia
- Lotfi Chouchane
- Department of Genetic Medicine, Weill Cornell Medicine, New York, New York, United States of America
- Department of Microbiology and Immunology, Weill Cornell Medicine, New York, New York, United States of America
- Laboratory of Genetic Medicine and Immunology, Weill Cornell Medicine-Qatar, Doha, Qatar
- Mohamed Samir Boubaker
- Laboratory of Biomedical Genomics and Oncogenetics, Institut Pasteur de Tunis, University of Tunis El Manar, Tunis, Tunisia
- Laboratory of Human and Experimental Pathology, Institut Pasteur de Tunis, Tunis, Tunisia
- Sonia Abdelhak
- Laboratory of Biomedical Genomics and Oncogenetics, Institut Pasteur de Tunis, University of Tunis El Manar, Tunis, Tunisia
- Hamouda Boussen
- Laboratory of Biomedical Genomics and Oncogenetics, Institut Pasteur de Tunis, University of Tunis El Manar, Tunis, Tunisia
- Medical Oncology Department, Abderrahman Mami Hospital, Faculty of Medicine Tunis, University Tunis El Manar, Tunis, Tunisia
- Yosr Hamdi
- Laboratory of Biomedical Genomics and Oncogenetics, Institut Pasteur de Tunis, University of Tunis El Manar, Tunis, Tunisia
- Laboratory of Human and Experimental Pathology, Institut Pasteur de Tunis, Tunis, Tunisia
|
14
|
Khan K, Ahram DF, Liu YP, Westland R, Sampogna RV, Katsanis N, Davis EE, Sanna-Cherchi S. Multidisciplinary approaches for elucidating genetics and molecular pathogenesis of urinary tract malformations. Kidney Int 2022; 101:473-484. [PMID: 34780871 PMCID: PMC8934530 DOI: 10.1016/j.kint.2021.09.034] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 06/16/2021] [Revised: 09/15/2021] [Accepted: 09/30/2021] [Indexed: 12/28/2022]
Abstract
Advances in clinical diagnostics and molecular tools have improved our understanding of the genetically heterogeneous causes underlying congenital anomalies of kidney and urinary tract (CAKUT). However, despite a sharp increase in CAKUT reports in the literature within the past 2 decades, there remains a plateau in the genetic diagnostic yield that is disproportionate to the accelerated ability to generate robust genome-wide data. Explanations for this observation include (i) diverse inheritance patterns with incomplete penetrance and variable expressivity, (ii) rarity of single-gene drivers such that large sample sizes are required to meet the burden of proof, and (iii) multigene interactions that might produce either intra- (e.g., copy number variants) or inter- (e.g., effects in trans) locus effects. These challenges present an opportunity for the community to implement innovative genetic and molecular avenues to explain the missing heritability and to better elucidate the mechanisms that underlie CAKUT. Here, we review recent multidisciplinary approaches at the intersection of genetics, genomics, in vivo modeling, and in vitro systems toward refining a blueprint for overcoming the diagnostic hurdles that are pervasive in urinary tract malformation cohorts. These approaches will not only benefit clinical management by reducing age at molecular diagnosis and prompting early evaluation for comorbid features but will also serve as a springboard for therapeutic development.
Affiliation(s)
- Kamal Khan
- Center for Human Disease Modeling, Duke University, Durham, North Carolina, USA; Stanley Manne Children’s Research Institute, Ann & Robert H. Lurie Children’s Hospital of Chicago, Chicago, Illinois, USA (current address)
- Dina F. Ahram
- Division of Nephrology, Columbia University, New York, USA
- Yangfan P. Liu
- Center for Human Disease Modeling, Duke University, Durham, North Carolina, USA
- Rik Westland
- Division of Nephrology, Columbia University, New York, USA; Department of Pediatric Nephrology, Amsterdam UMC- Emma Children’s Hospital, Amsterdam, NL
- Nicholas Katsanis
- Center for Human Disease Modeling, Duke University, Durham, North Carolina, USA; Stanley Manne Children's Research Institute, Ann & Robert H. Lurie Children's Hospital of Chicago, Chicago, Illinois, USA (current address); Department of Pediatrics, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA; Department of Cell and Developmental Biology, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA
- Erica E. Davis
- Center for Human Disease Modeling, Duke University, Durham, North Carolina, USA; Stanley Manne Children’s Research Institute, Ann & Robert H. Lurie Children’s Hospital of Chicago, Chicago, Illinois, USA (current address); Department of Pediatrics and Department of Cell and Molecular Biology, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, USA. Address correspondence to: Simone Sanna-Cherchi, MD, Division of Nephrology, Columbia University, College of Physicians and Surgeons, New York, NY 10032, USA; Phone: 212-851-4925; Fax: 212-851-5461. Erica E. Davis, PhD, Stanley Manne Children’s Research Institute, Ann & Robert H. Lurie Children’s Hospital of Chicago, Chicago, IL 60611, USA; Phone: 312-503-7662; Fax: 312-503-7343. Nicholas Katsanis, PhD, Stanley Manne Children’s Research Institute, Ann & Robert H. Lurie Children’s Hospital of Chicago, Chicago, IL 60611, USA; Phone: 312-503-7339; Fax: 312-503-7343
- Simone Sanna-Cherchi
- Department of Medicine, Division of Nephrology, Columbia University Irving Medical Center, New York, New York, USA
|
15
|
Cheng X, Shi J, Jia Z, Ha P, Soo C, Ting K, James AW, Shi B, Zhang X. NELL-1 in Genome-Wide Association Studies across Human Diseases. THE AMERICAN JOURNAL OF PATHOLOGY 2022; 192:395-405. [PMID: 34890556 PMCID: PMC8895422 DOI: 10.1016/j.ajpath.2021.11.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/11/2021] [Revised: 11/19/2021] [Accepted: 11/24/2021] [Indexed: 02/08/2023]
Abstract
Neural epidermal growth factor-like (EGFL)-like protein (NELL)-1 is a potent and key osteogenic factor in the development and regeneration of skeletal tissues. Intriguingly, accumulating data from genome-wide association studies (GWASs) have started unveiling potential broader roles of NELL-1 beyond its functions in bone and cartilage. By exploring the genetic variants of the entire genome in large-scale disease cohorts, GWASs have been used to establish connections between specific single-nucleotide polymorphisms of NELL1 and, in addition to osteoporosis, metabolic diseases, inflammatory conditions, neuropsychiatric diseases, neurodegenerative disorders, and malignant tumors. This review summarizes the findings from GWASs on the manifestation, significance level, implications on function, and correlation of specific NELL1 single-nucleotide polymorphisms in various disorders in humans. By offering a unique and comprehensive correlation between genetic variants and plausible functions of NELL1 in GWASs, this review illustrates the wide range of potential effects of a single gene on the pathogenesis of multiple disorders in humans.
Affiliation(s)
- Xu Cheng
- State Key Laboratory of Oral Diseases, National Clinical Research Centre for Oral Diseases, and the Department of Cleft Lip and Palate, West China Hospital of Stomatology, Sichuan University, Chengdu, China; Section of Orthodontics, Division of Growth and Development, School of Dentistry, University of California–Los Angeles, Los Angeles, California
- Jiayu Shi
- Section of Orthodontics, Division of Growth and Development, School of Dentistry, University of California–Los Angeles, Los Angeles, California
- Zhonglin Jia
- State Key Laboratory of Oral Diseases, National Clinical Research Centre for Oral Diseases, and the Department of Cleft Lip and Palate, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Pin Ha
- Section of Orthodontics, Division of Growth and Development, School of Dentistry, University of California–Los Angeles, Los Angeles, California
- Chia Soo
- Division of Plastic and Reconstructive Surgery, Department of Orthopaedic Surgery, Orthopaedic Hospital Research Center, University of California–Los Angeles, Los Angeles, California
- Kang Ting
- Forsyth Institute, affiliate of the Harvard School of Dental Medicine, Boston, Massachusetts
- Aaron W. James
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Bing Shi
- State Key Laboratory of Oral Diseases, National Clinical Research Centre for Oral Diseases, and the Department of Cleft Lip and Palate, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Xinli Zhang
- Section of Orthodontics, Division of Growth and Development, School of Dentistry, University of California–Los Angeles, Los Angeles, California
|
16
|
Osman N, Shawky AEM, Brylinski M. Exploring the effects of genetic variation on gene regulation in cancer in the context of 3D genome structure. BMC Genom Data 2022; 23:13. [PMID: 35176995 PMCID: PMC8851830 DOI: 10.1186/s12863-021-01021-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/17/2021] [Accepted: 12/23/2021] [Indexed: 12/31/2022] Open
Abstract
Background: Numerous genome-wide association studies (GWAS) conducted to date have revealed genetic variants associated with various diseases, including breast and prostate cancers. Despite the availability of these large-scale data, relatively few variants have been functionally characterized, mainly because the majority of single-nucleotide polymorphisms (SNPs) map to the non-coding regions of the human genome. The functional characterization of these non-coding variants and the identification of their target genes remain challenging. Results: In this communication, we explore the potential functional mechanisms of non-coding SNPs by integrating GWAS with the high-resolution chromosome conformation capture (Hi-C) data for breast and prostate cancers. We show that more genetic variants map to regulatory elements through the 3D genome structure than through the 1D linear genome lacking physical chromatin interactions. Importantly, the association of enhancers, transcription factors, and their target genes with breast and prostate cancers tends to be higher when these regulatory elements are mapped to high-risk SNPs through spatial interactions than when linear proximity alone is used. Finally, we demonstrate that topologically associating domains (TADs) carrying high-risk SNPs also contain gene regulatory elements whose association with cancer is generally higher than those belonging to control TADs containing no high-risk variants. Conclusions: Our results suggest that many SNPs may contribute to cancer development by affecting the expression of certain tumor-related genes through long-range chromatin interactions with gene regulatory elements. Integrating large-scale genetic datasets with the 3D genome structure offers an attractive and unique approach to systematically investigate the functional mechanisms of genetic variants in disease risk and progression.
Affiliation(s)
- Noha Osman
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, 70803, USA; Department of Cell Biology, National Research Centre, Giza, 12622, Egypt; Department of Medicine, Baylor College of Medicine, Houston, Texas, 77030, USA
- Abd-El-Monsif Shawky
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, 70803, USA
- Michal Brylinski
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, 70803, USA; Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, 70803, USA
|
17
|
Khan AA, Kim N, Korstanje R, Choi S. Loss-of-function mutation in Pcsk1 increases serum APOA1 level and LCAT activity in mice. Lab Anim Res 2022; 38:1. [PMID: 34996527 PMCID: PMC8739671 DOI: 10.1186/s42826-021-00111-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/27/2021] [Accepted: 12/29/2021] [Indexed: 01/20/2023] Open
Abstract
Background: The convertase subtilisin/kexin family 1 gene (PCSK1) has been associated in various human genetics studies with a wide spectrum of metabolic phenotypes, including early-onset obesity, hyperphagia, diabetes insipidus, and others. Despite the evident influence of PCSK1 on obesity and the known functions of other PCSKs in lipid metabolism, the role of PCSK1 specifically in lipid and cholesterol metabolism remains unclear. This study evaluated the effect of loss of PCSK1 function on high-density lipoprotein (HDL) metabolism in mice. Results: HDL cholesterol, apolipoprotein A1 (APOA1) levels in serum and liver, and the activities of two enzymes (lecithin-cholesterol acyltransferase, LCAT, and phospholipid transfer protein, PLTP) were evaluated in 8-week-old mice with a non-synonymous single nucleotide mutation leading to an amino acid substitution in PCSK1, which results in a loss of the protein's function. Mutant mice had similar serum HDL cholesterol concentrations but increased levels of serum total and mature APOA1, and increased LCAT activity, in comparison to controls. Conclusions: This study presents the first evaluation of the role of PCSK1 in HDL metabolism using a loss-of-function mutant mouse model. Further investigations will be needed to determine the underlying molecular mechanism.
Affiliation(s)
- Nakyung Kim
- Cerebrovascular Haematology-Immunology Priority Research Center, Medical Science Research Institute, Dongguk University Ilsan Hospital, Goyang, 10326, Republic of Korea
- Ron Korstanje
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, 04609, USA
- Seungbum Choi
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, 04609, USA; Cerebrovascular Haematology-Immunology Priority Research Center, Medical Science Research Institute, Dongguk University Ilsan Hospital, Goyang, 10326, Republic of Korea
|
18
|
Tarek MM, Yahia A, El-Nakib MM, Elhefnawi M. Integrative assessment of CIP2A overexpression and mutational effects in human malignancies identifies possible deleterious variants. Comput Biol Med 2021; 139:104986. [PMID: 34739970 DOI: 10.1016/j.compbiomed.2021.104986] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2021] [Revised: 10/23/2021] [Accepted: 10/24/2021] [Indexed: 10/19/2022]
Abstract
KIAA1524 is the gene encoding the human cancerous inhibitor of PP2A (CIP2A) protein, which is regarded as a novel target for cancer therapy. It is overexpressed in 65%-90% of tissues in almost all studied human cancers. CIP2A expression correlates with cancer progression, disease aggressivity in lung cancer, and poor survival and resistance to chemotherapy in breast cancer. Herein, a pan-cancer analysis of public gene expression datasets was conducted, showing significant upregulation of CIP2A in cancerous and metastatic tissues. CIP2A overexpression also correlated with poor survival of cancer patients. To determine the non-coding variants associated with CIP2A overexpression, 5'UTR and 3'UTR variants were annotated and scored using RegulomeDB and the Enformer deep learning model. The 5'UTR variants rs1239349555, rs1576326380, and rs1231839144 were predicted to be potential regulators of CIP2A overexpression, scoring best on RegulomeDB annotations with a high "2a" rank of supporting experimental data. These variants also scored the highest on Enformer predictions. Analysis of the 3'UTR variants of CIP2A predicted rs56255137 and rs58758610 to alter binding sites of hsa-miR-500a-5 and (hsa-miR-3671, hsa-miR-5692a) respectively. Both variants were also found in linkage disequilibrium with rs11709183 and rs147863209 respectively at r2 ≥ 0.8. The aforementioned variants were found to be eQTL hits significantly associated with CIP2A overexpression. Further, analysis of rs11709183 and rs147863209 revealed a high "2b" rank on RegulomeDB annotations, indicating a probable effect on DNAse transcription factors binding. The MuTarget analysis indicated that somatic mutations in TP53 are significantly associated with upregulated CIP2A in human cancers. Analysis of missense SNPs on the CIP2A solved structure predicted seven deleterious effects. Four of these variants were also predicted to be structurally and functionally destabilizing to CIP2A: rs375108755, rs147942716, rs368722879, and rs367941403. Variant rs1193091427 was predicted as a potential intronic splicing mutation that might be responsible for the novel CIP2A variant (NOCIVA) in multiple myeloma. Finally, enrichment of the Wnt/β-catenin pathway within the CIP2A regulatory gene network suggested the potential of therapeutic combinations of FTY720 with Wnt/β-catenin, Plk1 and/or HDAC inhibitors to downregulate CIP2A, which has been shown to be essential for the survival of different cancer cell lines.
Affiliation(s)
- Mohammad M Tarek
- Bioinformatics Department, Armed Forces College of Medicine (AFCM), Cairo, Egypt
- Ahmed Yahia
- Otolaryngology Department, Armed Forces College of Medicine (AFCM), Cairo, Egypt
- Mahmoud Elhefnawi
- Biomedical Informatics and Chemo-Informatics Group, Centre of Excellence for Medical Research, Informatics and Systems Department, National Research Centre, Cairo, Egypt
|
19
|
Henriques D, Lopes AR, Chejanovsky N, Dalmon A, Higes M, Jabal-Uriel C, Le Conte Y, Reyes-Carreño M, Soroker V, Martín-Hernández R, Pinto MA. A SNP assay for assessing diversity in immune genes in the honey bee (Apis mellifera L.). Sci Rep 2021; 11:15317. [PMID: 34321557 PMCID: PMC8319136 DOI: 10.1038/s41598-021-94833-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/31/2021] [Accepted: 07/12/2021] [Indexed: 02/07/2023] Open
Abstract
With a growing number of parasites and pathogens experiencing large-scale range expansions, monitoring diversity in immune genes of host populations has never been so important because it can inform on the adaptive potential to resist the invaders. Population surveys of immune genes are becoming common in many organisms, yet they are missing in the honey bee (Apis mellifera L.), a key managed pollinator species that has been severely affected by biological invasions. To fill the gap, here we identified single nucleotide polymorphisms (SNPs) in a wide range of honey bee immune genes and developed a medium-density assay targeting a subset of these genes. Using a discovery panel of 123 whole-genomes, representing seven A. mellifera subspecies and three evolutionary lineages, 180 immune genes were scanned for SNPs in exons, introns (< 4 bp from exons), 3' and 5' UTR, and < 1 kb upstream of the transcription start site. After application of multiple filtering criteria and validation, the final medium-density assay combines 91 quality-proved functional SNPs marking 89 innate immune genes and these can be readily typed using the high-sample-throughput iPLEX MassARRAY system. This medium-density-SNP assay was applied to 156 samples from four countries and the admixture analysis clustered the samples according to their lineage and subspecies, suggesting that honey bee ancestry can be delineated from functional variation. In addition to allowing analysis of immunogenetic variation, this newly-developed SNP assay can be used for inferring genetic structure and admixture in the honey bee.
Affiliation(s)
- Dora Henriques
- Centro de Investigação de Montanha, Instituto Politécnico de Bragança, Campus de Santa Apolónia, 5300-253, Bragança, Portugal
- Ana R Lopes
- Centro de Investigação de Montanha, Instituto Politécnico de Bragança, Campus de Santa Apolónia, 5300-253, Bragança, Portugal
- Nor Chejanovsky
- Agricultural Research Organization, The Volcani Center, Rishon LeTsiyon, Israel
- Anne Dalmon
- INRAE, Unité Abeilles et Environnement, Avignon, France
- Mariano Higes
- IRIAF, Instituto Regional de Investigación y Desarrollo Agroalimentario y Forestal, Laboratorio de Patología Apícola, Centro de Investigación Apícola y Agroambiental (CIAPA), Consejería de Agricultura de la Junta de Comunidades de Castilla-La Mancha, Marchamalo, Spain
- Clara Jabal-Uriel
- IRIAF, Instituto Regional de Investigación y Desarrollo Agroalimentario y Forestal, Laboratorio de Patología Apícola, Centro de Investigación Apícola y Agroambiental (CIAPA), Consejería de Agricultura de la Junta de Comunidades de Castilla-La Mancha, Marchamalo, Spain
- Yves Le Conte
- INRAE, Unité Abeilles et Environnement, Avignon, France
- Victoria Soroker
- Agricultural Research Organization, The Volcani Center, Rishon LeTsiyon, Israel
- Raquel Martín-Hernández
- IRIAF, Instituto Regional de Investigación y Desarrollo Agroalimentario y Forestal, Laboratorio de Patología Apícola, Centro de Investigación Apícola y Agroambiental (CIAPA), Consejería de Agricultura de la Junta de Comunidades de Castilla-La Mancha, Marchamalo, Spain
- Instituto de Recursos Humanos para la Ciencia y la Tecnología (INCRECYT-FEDER), Fundación Parque Científico y Tecnológico de Castilla-La Mancha, 02006, Albacete, Spain
- M Alice Pinto
- Centro de Investigação de Montanha, Instituto Politécnico de Bragança, Campus de Santa Apolónia, 5300-253, Bragança, Portugal
|
20
|
Role of SNPs in the Biogenesis of Mature miRNAs. BIOMED RESEARCH INTERNATIONAL 2021; 2021:2403418. [PMID: 34239922 PMCID: PMC8233088 DOI: 10.1155/2021/2403418] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/23/2020] [Revised: 04/12/2021] [Accepted: 05/31/2021] [Indexed: 12/16/2022]
Abstract
Single nucleotide polymorphisms (SNPs) play a significant role in microRNA (miRNA) generation, processing, and function and contribute to multiple phenotypes and diseases. Therefore, whole-genome analysis of how SNPs affect miRNA maturation mechanisms is important for precision medicine. The present study established an SNP-associated pre-miRNA (SNP-pre-miRNA) database, named miRSNPBase, and constructed SNP-pre-miRNA sequences. We also identified phenotypes and disease biomarker-associated isoform miRNA (isomiR) based on miRFind, which was developed in our previous study. We identified functional SNPs and isomiRs. We analyzed the biological characteristics of functional SNPs and isomiRs and studied their distribution in different ethnic groups using whole-genome analysis. Notably, we used individuals from Great Britain (GBR) as examples and identified isomiRs and isomiR-associated SNPs (iso-SNPs). We performed sequence alignments of isomiRs and miRNA sequencing data to verify the identified isomiRs and further revealed GBR ethnographic epigenetic dominant biomarkers. The SNP-pre-miRNA database consisted of 886 pre-miRNAs and 2640 SNPs. We analyzed the effects of SNP type, SNP location, and SNP-mediated free energy change during mature miRNA biogenesis and found that these factors were closely associated with mature miRNA biogenesis. Remarkably, 158 isomiRs were verified in the miRNA sequencing data for the 18 GBR samples. Our results indicated that SNPs affected the mature miRNA processing mechanism and contributed to the production of isomiRs. This mechanism may have important significance for epigenetic changes and diseases.
|
21
|
Srivastava K, Fratzscher AS, Lan B, Flegel WA. Cataloguing experimentally confirmed 80.7 kb-long ACKR1 haplotypes from the 1000 Genomes Project database. BMC Bioinformatics 2021; 22:273. [PMID: 34039276 PMCID: PMC8150616 DOI: 10.1186/s12859-021-04169-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/18/2020] [Accepted: 05/04/2021] [Indexed: 12/18/2022] Open
Abstract
Background: Clinically effective and safe genotyping relies on correct reference sequences, often represented by haplotypes. The 1000 Genomes Project recorded individual genotypes across 26 different populations and, using computerized genotype phasing, reported haplotype data. In contrast, we identified long reference sequences by analyzing the homozygous genomic regions in this online database, a concept that has rarely been reported since next generation sequencing data became available.
Study design and methods: Phased genotype data for an 80.6 kb region of chromosome 1 was downloaded for all 2,504 unrelated individuals of the 1000 Genomes Project Phase 3 cohort. The data was centered on the ACKR1 gene and bordered by the CADM3 and FCER1A genes. Individuals with heterozygosity at a single site or with complete homozygosity allowed unambiguous assignment of an ACKR1 haplotype. A computer algorithm was developed for extracting these haplotypes from the 1000 Genomes Project in an automated fashion. A manual analysis validated the data extracted by the algorithm.
Results: We confirmed 902 ACKR1 haplotypes of varying lengths, the longest at 80,584 nucleotides and shortest at 1,901 nucleotides. The combined length of haplotype sequences comprised 19,895,388 nucleotides with a median of 16,014 nucleotides. Based on our approach, all haplotypes can be considered experimentally confirmed and not affected by the known errors of computerized genotype phasing.
Conclusions: Tracts of homozygosity can provide definitive reference sequences for any gene. They are particularly useful when observed in unrelated individuals of large scale sequence databases. As a proof of principle, we explored the 1000 Genomes Project database for ACKR1 gene data and mined long haplotypes. These haplotypes are useful for high throughput analysis with next generation sequencing. Our approach is scalable, using automated bioinformatics tools, and can be applied to any gene.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-021-04169-6.
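The selection rule stated in the abstract (a haplotype is unambiguous when an individual is fully homozygous or heterozygous at exactly one site) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the function name `unambiguous_haplotypes`, the single-character allele encoding, and the toy genotypes are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# One site = the two alleles an individual carries there (hypothetical encoding).
Genotype = Tuple[str, str]


def unambiguous_haplotypes(genotypes: List[Genotype]) -> Optional[Tuple[str, str]]:
    """Return both haplotypes if phase is unambiguous, otherwise None.

    Phase is unambiguous when at most one site is heterozygous: with zero
    heterozygous sites both haplotypes are identical, and with exactly one
    the two haplotypes differ only at that site, so no phasing is needed.
    """
    het_sites = [i for i, (a, b) in enumerate(genotypes) if a != b]
    if len(het_sites) > 1:
        # Two or more heterozygous sites: the allele pairing across sites
        # cannot be read off the genotypes alone.
        return None
    hap1 = "".join(a for a, _ in genotypes)
    hap2 = "".join(b for _, b in genotypes)
    return hap1, hap2


# Toy individuals (made-up data, not from the 1000 Genomes Project).
homozygous = [("A", "A"), ("C", "C"), ("G", "G")]
single_het = [("A", "A"), ("C", "T"), ("G", "G")]
double_het = [("A", "G"), ("C", "T"), ("G", "G")]
```

Applied to a region, such a filter would keep only individuals whose haplotypes can be read directly from their genotypes, which is why the abstract describes them as unaffected by phasing errors.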
Affiliation(s)
- Kshitij Srivastava
- Laboratory Services Section, Department of Transfusion Medicine, NIH Clinical Center, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Anne-Sophie Fratzscher
- Laboratory Services Section, Department of Transfusion Medicine, NIH Clinical Center, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Bo Lan
- Laboratory Services Section, Department of Transfusion Medicine, NIH Clinical Center, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Willy Albert Flegel
- Laboratory Services Section, Department of Transfusion Medicine, NIH Clinical Center, National Institutes of Health, Bethesda, MD, 20892, USA.
| |
|
22
|
Pachganov S, Murtazalieva K, Zarubin A, Taran T, Chartier D, Tatarinova TV. Prediction of Rice Transcription Start Sites Using TransPrise: A Novel Machine Learning Approach. Methods Mol Biol 2021; 2238:261-274. [PMID: 33471337 DOI: 10.1007/978-1-0716-1068-8_17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
As the interest in genetic resequencing increases, so does the need for effective mathematical, computational, and statistical approaches. One of the difficult problems in genome annotation is determination of precise positions of transcription start sites. In this paper, we present TransPrise, an efficient deep learning tool for predicting positions of eukaryotic transcription start sites. TransPrise offers significant improvement over existing promoter-prediction methods. To illustrate this, we compared predictions of TransPrise with the TSSPlant approach for the well-annotated genome of Oryza sativa. Using a computer with a graphics processing unit, the run time of TransPrise is 250 min on a 374 Mb genome. We provide the full basis for the comparison and encourage users to freely access a set of our computational tools to facilitate and streamline their own analyses. The ready-to-use Docker image with all the necessary packages, models, and code as well as the source code of the TransPrise algorithm are available at http://compubioverne.group/. The source code is ready to use and to be customized to predict TSS in any eukaryotic organism.
Affiliation(s)
- Stepan Pachganov
- Ugra Research Institute of Information Technologies, Khanty-Mansiysk, Russia
| | | | - Alexei Zarubin
- Tomsk National Research Medical Center of the Russian Academy of Sciences, Research Institute of Medical Genetics, Tomsk, Russia
| | | | - Duane Chartier
- International Center for Art Intelligence, Inc, Los Angeles, CA, USA
| | - Tatiana V Tatarinova
- Vavilov Institute of General Genetics, Moscow, Russia.
- Department of Biology, University of La Verne, La Verne, CA, USA.
- A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia.
- Siberian Federal University, Krasnoyarsk, Russia.
| |
|
23
|
Mahtani-Williams S, Fulton W, Desvars-Larrive A, Lado S, Elbers JP, Halpern B, Herczeg D, Babocsay G, Lauš B, Nagy ZT, Jablonski D, Kukushkin O, Orozco-terWengel P, Vörös J, Burger PA. Landscape Genomics of a Widely Distributed Snake, Dolichophis caspius (Gmelin, 1789) across Eastern Europe and Western Asia. Genes (Basel) 2020; 11:genes11101218. [PMID: 33080926 PMCID: PMC7603136 DOI: 10.3390/genes11101218] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2020] [Revised: 10/02/2020] [Accepted: 10/15/2020] [Indexed: 11/29/2022] Open
Abstract
Across the distribution of the Caspian whipsnake (Dolichophis caspius), populations have become increasingly disconnected due to habitat alteration. To understand population dynamics and this widespread but locally endangered snake’s adaptive potential, we investigated population structure, admixture, and effective migration patterns. We took a landscape-genomic approach to identify selected genotypes associated with environmental variables relevant to D. caspius. With double-digest restriction-site associated DNA (ddRAD) sequencing of 53 samples resulting in 17,518 single nucleotide polymorphisms (SNPs), we identified 8 clusters within D. caspius reflecting complex evolutionary patterns of the species. Estimated Effective Migration Surfaces (EEMS) revealed higher-than-average gene flow in most of the Balkan Peninsula and lower-than-average gene flow along the middle section of the Danube River. Landscape genomic analysis identified 751 selected genotypes correlated with 7 climatic variables. Isothermality correlated with the highest number of selected genotypes (478) located in 41 genes, followed by annual range (127) and annual mean temperature (87). We conclude that environmental variables, especially the day-to-night temperature oscillation in comparison to the summer-to-winter oscillation, may have an important role in the distribution and adaptation of D. caspius.
Affiliation(s)
- Sarita Mahtani-Williams
- Research Institute of Wildlife Ecology, Vetmeduni Vienna, Savoyenstrasse 1, A-1160 Vienna, Austria; (S.M.-W.); (W.F.); (A.D.-L.); (S.L.); (J.P.E.)
- Cardiff School of Biosciences, Cardiff University, The Sir Martin Evans Building, Museum Ave, Cardiff CF103AX, UK;
- Fundación Charles Darwin, Avenida Charles Darwin s/n, Casilla 200144, Puerto Ayora EC-200350, Ecuador
| | - William Fulton
- Research Institute of Wildlife Ecology, Vetmeduni Vienna, Savoyenstrasse 1, A-1160 Vienna, Austria; (S.M.-W.); (W.F.); (A.D.-L.); (S.L.); (J.P.E.)
- Cardiff School of Biosciences, Cardiff University, The Sir Martin Evans Building, Museum Ave, Cardiff CF103AX, UK;
| | - Amelie Desvars-Larrive
- Research Institute of Wildlife Ecology, Vetmeduni Vienna, Savoyenstrasse 1, A-1160 Vienna, Austria; (S.M.-W.); (W.F.); (A.D.-L.); (S.L.); (J.P.E.)
- Institute of Food Safety, Food Technology and Veterinary Public Health, Vetmeduni Vienna, Veterinaerplatz 1, A-1210 Vienna, Austria
- Complexity Science Hub Vienna, Josefstädter Straße 39, A-1080 Vienna, Austria
| | - Sara Lado
- Research Institute of Wildlife Ecology, Vetmeduni Vienna, Savoyenstrasse 1, A-1160 Vienna, Austria; (S.M.-W.); (W.F.); (A.D.-L.); (S.L.); (J.P.E.)
| | - Jean Pierre Elbers
- Research Institute of Wildlife Ecology, Vetmeduni Vienna, Savoyenstrasse 1, A-1160 Vienna, Austria; (S.M.-W.); (W.F.); (A.D.-L.); (S.L.); (J.P.E.)
| | - Bálint Halpern
- MME Birdlife Hungary, Költő utca 21., H-1121 Budapest, Hungary; (B.H.); (G.B.)
| | - Dávid Herczeg
- Lendület Evolutionary Ecology Research Group, Centre for Agricultural Research, Plant Protection Institute, Herman Ottó út 15., H-1022 Budapest, Hungary;
| | - Gergely Babocsay
- MME Birdlife Hungary, Költő utca 21., H-1121 Budapest, Hungary; (B.H.); (G.B.)
- Mátra Museum of the Hungarian Natural History Museum, Kossuth Lajos utca 40., H-3200 Gyöngyös, Hungary
| | - Boris Lauš
- Association HYLA, Lipocac I., No. 7, C-10000 Zagreb, Croatia;
| | - Zoltán Tamás Nagy
- Independent Researcher, Hielscherstraße 25, D-13158 Berlin, Germany;
| | - Daniel Jablonski
- Department of Zoology, Comenius University in Bratislava, Ilkovičova 6, Mlynská Dolina, S-84215 Bratislava, Slovakia;
| | - Oleg Kukushkin
- Department of Biodiversity Studies and Ecological Monitoring, T. I. Vyazemsky Karadag Scientific Station–Nature Reserve–Branch of Institute of Biology of the Southern Seas of the Russian Academy of Sciences, Nauki Street 24, R-298188 Theodosia, Crimea;
- Department of Herpetology, Zoological Institute of the Russian Academy of Sciences, Universitetskaya Embankment 1, R-199034 Saint Petersburg, Russia
| | - Pablo Orozco-terWengel
- Cardiff School of Biosciences, Cardiff University, The Sir Martin Evans Building, Museum Ave, Cardiff CF103AX, UK;
| | - Judit Vörös
- Department of Zoology, Hungarian Natural History Museum, Baross u. 13., H-1088 Budapest, Hungary
- Molecular Taxonomy Laboratory, Hungarian Natural History Museum, Ludovika tér 2-6., H-1083 Budapest, Hungary
- Correspondence: (J.V.); (P.A.B.)
| | - Pamela Anna Burger
- Research Institute of Wildlife Ecology, Vetmeduni Vienna, Savoyenstrasse 1, A-1160 Vienna, Austria; (S.M.-W.); (W.F.); (A.D.-L.); (S.L.); (J.P.E.)
- Correspondence: (J.V.); (P.A.B.)
| |
|
24
|
Osman N, Shawky A, Brylinski M. Exploring the effects of genetic variation on gene regulation in cancer in the context of 3D genome structure. [DOI: 10.1101/2020.10.06.328567] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/01/2023]
Abstract
Numerous genome-wide association studies (GWAS) conducted to date revealed genetic variants associated with various diseases, including breast and prostate cancers. Despite the availability of these large-scale data, relatively few variants have been functionally characterized, mainly because the majority of single-nucleotide polymorphisms (SNPs) map to the non-coding regions of the human genome. The functional characterization of these non-coding variants and the identification of their target genes remain challenging. In this communication, we explore the potential functional mechanisms of non-coding SNPs by integrating GWAS with the high-resolution chromosome conformation capture (Hi-C) data for breast and prostate cancers. We show that more genetic variants map to regulatory elements through the 3D genome structure than the 1D linear genome lacking physical chromatin interactions. Importantly, the association of enhancers, transcription factors, and their target genes with breast and prostate cancers tends to be higher when these regulatory elements are mapped to high-risk SNPs through spatial interactions compared to simply using a linear proximity. Finally, we demonstrate that topologically associating domains (TADs) carrying high-risk SNPs also contain gene regulatory elements whose association with cancer is generally higher than those belonging to control TADs containing no high-risk variants. Our results suggest that many SNPs may contribute to cancer development by affecting the expression of certain tumor-related genes through long-range chromatin interactions with gene regulatory elements. Integrating large-scale genetic datasets with the 3D genome structure offers an attractive and unique approach to systematically investigate the functional mechanisms of genetic variants in disease risk and progression.
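The contrast the abstract draws between linear proximity and spatial (Hi-C) mapping can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the authors' pipeline: a single chromosome, fixed-size bins, contacts given as pairs of bin indices, and the hypothetical helper names `linear_targets` and `spatial_targets`.

```python
from typing import Dict, Set, Tuple


def linear_targets(snp_pos: int, elements: Dict[str, int], window: int) -> Set[str]:
    """Regulatory elements within `window` bp of the SNP on the 1D genome."""
    return {name for name, pos in elements.items() if abs(pos - snp_pos) <= window}


def spatial_targets(
    snp_pos: int,
    elements: Dict[str, int],
    contacts: Set[Tuple[int, int]],
    bin_size: int,
) -> Set[str]:
    """Regulatory elements whose bin is in contact with the SNP's bin.

    `contacts` holds unordered pairs of bin indices that interact, standing
    in for significant Hi-C interactions (an assumption of this sketch).
    """
    snp_bin = snp_pos // bin_size
    partners = {b for a, b in contacts if a == snp_bin}
    partners |= {a for a, b in contacts if b == snp_bin}
    return {name for name, pos in elements.items() if pos // bin_size in partners}


# Made-up coordinates: one enhancer near the SNP, one 800 kb away.
elements = {"enh_near": 105_000, "enh_far": 905_000}
contacts = {(10, 90)}  # bin 10 (SNP) loops to bin 90 (distal enhancer)

near = linear_targets(100_000, elements, window=10_000)
looped = spatial_targets(100_000, elements, contacts, bin_size=10_000)
```

In this toy case the linear window finds only the nearby enhancer, while the contact map links the SNP to the distal one, which is the kind of additional mapping the abstract attributes to the 3D genome structure.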
|
25
|
Devadasan MJ, Kumar DR, Vineeth MR, Choudhary A, Surya T, Niranjan SK, Verma A, Sivalingam J. Reduced representation approach for identification of genome-wide SNPs and their annotation for economically important traits in Indian Tharparkar cattle. 3 Biotech 2020; 10:309. [PMID: 32582506 DOI: 10.1007/s13205-020-02297-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2020] [Accepted: 06/09/2020] [Indexed: 11/24/2022] Open
Abstract
The present study was carried out in Tharparkar cattle to identify genome-wide SNPs and microsatellites and to annotate the identified high-quality SNPs for economically important traits: milk production, fertility, carcass, adaptability and immune response. A total of 146,011 SNPs were identified with respect to the Bos taurus reference genome, which are indicus specific, of which 10,519 SNPs were found to be novel. Similarly, a total of 87,047 SNPs were identified with respect to the Bos indicus reference genome. After final annotation of the SNPs identified with respect to the Bos indicus reference genome, 2871 SNPs were found in 383 candidate genes related to milk production, fertility, carcass, immune response and adaptability traits. In addition, 2571 microsatellites were identified. The information mined from these data may be important for future breed improvement programs and conservation efforts, and for enhancing the SNP density of existing bovine SNP chips.
Affiliation(s)
| | - D Ravi Kumar
- ICAR-National Dairy Research Institute, Karnal, India
| | - M R Vineeth
- ICAR-National Dairy Research Institute, Karnal, India
| | | | - T Surya
- ICAR-National Dairy Research Institute, Karnal, India
| | - S K Niranjan
- ICAR-National Bureau of Animal Genetic Resources, Karnal, India
| | - Archana Verma
- ICAR-National Dairy Research Institute, Karnal, India
| | | |
|
26
|
Zhang S, He Y, Liu H, Zhai H, Huang D, Yi X, Dong X, Wang Z, Zhao K, Zhou Y, Wang J, Yao H, Xu H, Yang Z, Sham PC, Chen K, Li MJ. regBase: whole genome base-wise aggregation and functional prediction for human non-coding regulatory variants. Nucleic Acids Res 2020; 47:e134. [PMID: 31511901 PMCID: PMC6868349 DOI: 10.1093/nar/gkz774] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2019] [Accepted: 08/29/2019] [Indexed: 12/19/2022] Open
Abstract
Predicting the functional or pathogenic regulatory variants in the human non-coding genome facilitates the interpretation of disease causation. While numerous prediction methods are available, their performance is inconsistent or restricted to specific tasks, which raises the demand for a comprehensive integration of those methods. Here, we compile whole genome base-wise aggregations, regBase, that incorporate largest prediction scores. Building on different assumptions of causality, we train three composite models to score functional, pathogenic and cancer driver non-coding regulatory variants respectively. We demonstrate the superior and stable performance of our models using independent benchmarks and show great success in fine-mapping causal regulatory variants at specific loci or at base-wise resolution. We believe that the regBase database together with the three composite models will be useful in different areas of human genetic studies, such as annotation-based causal variant fine-mapping, pathogenic variant discovery as well as cancer driver mutation identification. regBase is freely available at https://github.com/mulinlab/regBase.
Affiliation(s)
- Shijie Zhang
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Key Laboratory of Inflammation Biology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Yukun He
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Key Laboratory of Inflammation Biology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Huanhuan Liu
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Key Laboratory of Inflammation Biology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Haoyu Zhai
- Department of Computer Science, University of Illinois Urbana-Champaign, IL, USA
| | - Dandan Huang
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Key Laboratory of Inflammation Biology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China; Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Xianfu Yi
- School of Biomedical Engineering, Tianjin Medical University, Tianjin, China
| | - Xiaobao Dong
- Department of Genetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Zhao Wang
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Key Laboratory of Inflammation Biology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Ke Zhao
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Key Laboratory of Inflammation Biology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Yao Zhou
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Key Laboratory of Inflammation Biology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Jianhua Wang
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Key Laboratory of Inflammation Biology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Hongcheng Yao
- School of Biomedical Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
| | - Hang Xu
- School of Biomedical Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
| | - Zhenglu Yang
- College of Computer Science, Nankai University, Tianjin, China
| | - Pak Chung Sham
- Centre of Genomics Sciences, State Key Laboratory of Brain and Cognitive Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - Kexin Chen
- Department of Epidemiology and Biostatistics, Tianjin Key Laboratory of Molecular Cancer Epidemiology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| | - Mulin Jun Li
- Department of Pharmacology, School of Basic Medical Sciences, Tianjin Key Laboratory of Inflammation Biology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China; Department of Epidemiology and Biostatistics, Tianjin Key Laboratory of Molecular Cancer Epidemiology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin, China
| |
|
27
|
Zheng Z, Huang D, Wang J, Zhao K, Zhou Y, Guo Z, Zhai S, Xu H, Cui H, Yao H, Wang Z, Yi X, Zhang S, Sham PC, Li MJ. QTLbase: an integrative resource for quantitative trait loci across multiple human molecular phenotypes. Nucleic Acids Res 2020; 48:D983-D991. [PMID: 31598699 PMCID: PMC6943073 DOI: 10.1093/nar/gkz888] [Citation(s) in RCA: 76] [Impact Index Per Article: 19.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/15/2019] [Revised: 09/24/2019] [Accepted: 10/02/2019] [Indexed: 12/20/2022] Open
Abstract
Recent advances in genome sequencing and functional genomic profiling have promoted many large-scale quantitative trait locus (QTL) studies, which connect genotypes with tissue/cell type-specific cellular functions from transcriptional to post-translational level. However, no comprehensive resource can perform QTL lookup across multiple molecular phenotypes and investigate the potential cascade effect of functional variants. We developed a versatile resource, named QTLbase, for interpreting the possible molecular functions of genetic variants, as well as their tissue/cell-type specificity. Overall, QTLbase has five key functions: (i) curating and compiling genome-wide QTL summary statistics for 13 human molecular traits from 233 independent studies; (ii) mapping QTL-relevant tissue/cell types to 78 unified terms according to a standard anatomogram; (iii) normalizing variant and trait information uniformly, yielding >170 million significant QTLs; (iv) providing a rich web client that enables phenome- and tissue-wise visualization; and (v) integrating the most comprehensive genomic features and functional predictions to annotate the potential QTL mechanisms. QTLbase provides a one-stop shop for QTL retrieval and comparison across multiple tissues and multiple layers of molecular complexity, and will greatly help researchers interrogate the biological mechanism of causal variants and guide the direction of functional validation. QTLbase is freely available at http://mulinlab.org/qtlbase.
Affiliation(s)
- Zhanye Zheng
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, 2011 Collaborative Innovation Center of Tianjin for Medical Epigenetics, School of Basic Medical Sciences, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
| | - Dandan Huang
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, 2011 Collaborative Innovation Center of Tianjin for Medical Epigenetics, School of Basic Medical Sciences, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
| | - Jianhua Wang
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, 2011 Collaborative Innovation Center of Tianjin for Medical Epigenetics, School of Basic Medical Sciences, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
| | - Ke Zhao
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, 2011 Collaborative Innovation Center of Tianjin for Medical Epigenetics, School of Basic Medical Sciences, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
| | - Yao Zhou
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, 2011 Collaborative Innovation Center of Tianjin for Medical Epigenetics, School of Basic Medical Sciences, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
| | - Zhenyang Guo
- School of Biomedical Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR 999077, China
| | - Sinan Zhai
- School of Biomedical Engineering, Tianjin Medical University, Tianjin 300070, China
| | - Hang Xu
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, 2011 Collaborative Innovation Center of Tianjin for Medical Epigenetics, School of Basic Medical Sciences, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
| | - Hui Cui
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, 2011 Collaborative Innovation Center of Tianjin for Medical Epigenetics, School of Basic Medical Sciences, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
| | - Hongcheng Yao
- School of Biomedical Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR 999077, China
| | - Zhao Wang
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, 2011 Collaborative Innovation Center of Tianjin for Medical Epigenetics, School of Basic Medical Sciences, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
| | - Xianfu Yi
- School of Biomedical Engineering, Tianjin Medical University, Tianjin 300070, China
| | - Shijie Zhang
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, 2011 Collaborative Innovation Center of Tianjin for Medical Epigenetics, School of Basic Medical Sciences, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
| | - Pak Chung Sham
- Centre of Genomics Sciences, State Key Laboratory of Brain and Cognitive Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR 999077, China
| | - Mulin Jun Li
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, 2011 Collaborative Innovation Center of Tianjin for Medical Epigenetics, School of Basic Medical Sciences, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
| |
|
28
|
Abstract
Expression quantitative trait locus (eQTL) analysis is a powerful method for understanding the association between genetic variants and gene expression; it also has potential impact on the study of transcription medicine for human complex disease. Over the past two decades, researchers have focused on studying eQTLs, while growing evidence shows that regulatory genetic variants located in noncoding regions have a strong effect on gene expression. Researchers working on eQTL analysis increasingly recognize the importance of other types of QTLs beyond eQTLs. In this chapter, we explore some QTLs beyond eQTLs that show a regulatory association with eQTLs and explain the underlying links among these types of QTLs.
Affiliation(s)
- Jia Wen
- Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte, Charlotte, NC, USA.
| | - Conor Nodzak
- Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte, Charlotte, NC, USA
| | - Xinghua Shi
- Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte, Charlotte, NC, USA
| |
|
29
|
Lin H, Hargreaves KA, Li R, Reiter JL, Wang Y, Mort M, Cooper DN, Zhou Y, Zhang C, Eadon MT, Dolan ME, Ipe J, Skaar TC, Liu Y. RegSNPs-intron: a computational framework for predicting pathogenic impact of intronic single nucleotide variants. Genome Biol 2019; 20:254. [PMID: 31779641 PMCID: PMC6883696 DOI: 10.1186/s13059-019-1847-4] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2019] [Accepted: 10/03/2019] [Indexed: 12/27/2022] Open
Abstract
Single nucleotide variants (SNVs) in intronic regions have yet to be systematically investigated for their disease-causing potential. Using known pathogenic and neutral intronic SNVs (iSNVs) as training data, we develop the RegSNPs-intron algorithm based on a random forest classifier that integrates RNA splicing, protein structure, and evolutionary conservation features. RegSNPs-intron showed excellent performance in evaluating the pathogenic impacts of iSNVs. Using a high-throughput functional reporter assay called ASSET-seq (ASsay for Splicing using ExonTrap and sequencing), we evaluate the impact of RegSNPs-intron predictions on splicing outcome. Together, RegSNPs-intron and ASSET-seq enable effective prioritization of iSNVs for disease pathogenesis.
Affiliation(s)
- Hai Lin
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, 46202, USA
- Department of Medical & Molecular Genetics, Indiana University School of Medicine, 410 West 10th Street, Suite 5000, Indianapolis, IN, 46202, USA
| | - Katherine A Hargreaves
- Division of Clinical Pharmacology, Department of Medicine, Indiana University School of Medicine, 950 W Walnut St, Suite 419, Indianapolis, IN, 46202, USA
| | - Rudong Li
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, 46202, USA
- Department of Medical & Molecular Genetics, Indiana University School of Medicine, 410 West 10th Street, Suite 5000, Indianapolis, IN, 46202, USA
| | - Jill L Reiter
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, 46202, USA
- Department of Medical & Molecular Genetics, Indiana University School of Medicine, 410 West 10th Street, Suite 5000, Indianapolis, IN, 46202, USA
| | - Yue Wang
- Department of Medical & Molecular Genetics, Indiana University School of Medicine, 410 West 10th Street, Suite 5000, Indianapolis, IN, 46202, USA
| | - Matthew Mort
- Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - David N Cooper
- Institute of Medical Genetics, Cardiff University, Heath Park, Cardiff, CF14 4XN, UK
| | - Yaoqi Zhou
- Institute for Glycomics and School of Informatics and Communication Technology, Griffith University, Parklands Dr., Southport, QLD, 4215, Australia
| | - Chi Zhang
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, 46202, USA
- Department of Medical & Molecular Genetics, Indiana University School of Medicine, 410 West 10th Street, Suite 5000, Indianapolis, IN, 46202, USA
| | - Michael T Eadon
- Division of Nephrology, Department of Medicine, Indiana University School of Medicine, Indianapolis, IN, 46202, USA
| | - M Eileen Dolan
- Section of Hematology/Oncology, Department of Medicine, University of Chicago, Chicago, IL, 60637, USA
| | - Joseph Ipe
- Division of Clinical Pharmacology, Department of Medicine, Indiana University School of Medicine, 950 W Walnut St, Suite 419, Indianapolis, IN, 46202, USA
| | - Todd C Skaar
- Division of Clinical Pharmacology, Department of Medicine, Indiana University School of Medicine, 950 W Walnut St, Suite 419, Indianapolis, IN, 46202, USA
| | - Yunlong Liu
- Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, 46202, USA
- Department of Medical & Molecular Genetics, Indiana University School of Medicine, 410 West 10th Street, Suite 5000, Indianapolis, IN, 46202, USA
| |
|
30
|
Pachganov S, Murtazalieva K, Zarubin A, Sokolov D, Chartier DR, Tatarinova TV. TransPrise: a novel machine learning approach for eukaryotic promoter prediction. PeerJ 2019; 7:e7990. [PMID: 31695967 PMCID: PMC6827441 DOI: 10.7717/peerj.7990] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 07/09/2019] [Accepted: 10/04/2019] [Indexed: 02/01/2023] Open
Abstract
As interest in genetic resequencing increases, so does the need for effective mathematical, computational, and statistical approaches. One of the difficult problems in genome annotation is determining the precise positions of transcription start sites (TSSs). In this paper we present TransPrise, an efficient deep learning tool for predicting the positions of eukaryotic transcription start sites. Our pipeline consists of two parts: a binary classifier runs first, and if a sequence is classified as TSS-containing, a regression step follows that identifies the precise location of the TSS. TransPrise offers significant improvement over existing promoter-prediction methods. To illustrate this, we compared predictions of the TransPrise classification and regression models with the TSSPlant approach for the well-annotated genome of Oryza sativa. Using a computer equipped with a graphics processing unit, the run time of TransPrise is 250 minutes on a 374 Mb genome. The Matthews correlation coefficient for TransPrise is 0.79, more than twice the 0.31 obtained for the TSSPlant classification models. This represents a high level of prediction accuracy. Additionally, the mean absolute error of the regression model is 29.19 nt, allowing for accurate prediction of TSS location. TransPrise was also tested in Homo sapiens, where the mean absolute error of the regression model was 47.986 nt. We provide the full basis for the comparison and encourage users to freely access a set of our computational tools to facilitate and streamline their own analyses. The ready-to-use Docker image with all necessary packages, models, and code, as well as the source code of the TransPrise algorithm, are available at http://compubioverne.group/. The source code is ready to use and customizable to predict TSS in any eukaryotic organism.
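As a reading aid for the Matthews correlation coefficient (MCC) values quoted in this abstract, the following is a minimal sketch of how an MCC is computed from the four confusion-matrix counts of a binary classifier. The function name and the example counts are hypothetical and are not taken from the TransPrise paper; the sketch does not reproduce the authors' evaluation code.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention,
    adopted here as an assumption).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for illustration only (not data from the paper).
example = matthews_corrcoef(tp=90, tn=85, fp=15, fn=10)
print(round(example, 3))
```

An MCC of 1 means every prediction is correct, 0 is what a random classifier would be expected to score, and -1 means every prediction is wrong. Because all four counts enter the formula, it is often preferred to plain accuracy when classes are imbalanced.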
Affiliation(s)
- Stepan Pachganov
- Ugra Research Institute of Information Technologies, Khanty-Mansiysk, Russia
| | - Khalimat Murtazalieva
- Vavilov Institute for General Genetics, Moscow, Russia.,Institute of Bioinformatics, Moscow, Russia
| | - Aleksei Zarubin
- Tomsk National Research Medical Center of the Russian Academy of Sciences, Research Institute of Medical Genetics, Tomsk, Russia
| | | | - Duane R Chartier
- International Center for Art Intelligence, Inc., Los Angeles, CA, United States of America
| | - Tatiana V Tatarinova
- Vavilov Institute for General Genetics, Moscow, Russia.,Department of Biology, University of La Verne, La Verne, CA, United States of America.,A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia.,Siberian Federal University, Krasnoyarsk, Russia
| |
|
31
|
Huang D, Yi X, Zhang S, Zheng Z, Wang P, Xuan C, Sham PC, Wang J, Li MJ. GWAS4D: multidimensional analysis of context-specific regulatory variant for human complex diseases and traits. Nucleic Acids Res 2018; 46:W114-W120. [PMID: 29771388 PMCID: PMC6030885 DOI: 10.1093/nar/gky407] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Received: 02/18/2018] [Accepted: 05/03/2018] [Indexed: 01/04/2023] Open
Abstract
Genome-wide association studies have identified thousands of susceptibility loci for many human complex traits, and yet for most of these associations the true causal variants remain unknown. Tissue/cell type-specific prediction and prioritization of non-coding regulatory variants will facilitate the identification of causal variants and underlying pathogenic mechanisms for particular complex diseases and traits. By leveraging recent large-scale functional genomics/epigenomics data, we develop an intuitive web server, GWAS4D (http://mulinlab.tmu.edu.cn/gwas4d or http://mulinlab.org/gwas4d), that systematically evaluates GWAS signals and identifies context-specific regulatory variants. The updated web server has six major features. It (i) updates the regulatory variant prioritization method with our new algorithm; (ii) incorporates 127 tissue/cell type-specific epigenome datasets; (iii) integrates motifs of 1480 transcriptional regulators from 13 public resources; (iv) uniformly processes Hi-C data and generates significant interactions at 5 kb resolution across 60 tissues/cell types; (v) adds comprehensive non-coding variant functional annotations; and (vi) provides a highly interactive visualization function for SNP-target interactions. Using a GWAS fine-mapped set for 161 coronary artery disease risk loci, we demonstrate that GWAS4D is able to efficiently prioritize disease-causal regulatory variants.
Affiliation(s)
- Dandan Huang
- Department of Pharmacology, Collaborative Innovation Center of Tianjin for Medical Epigenetics, Tianjin Key Laboratory of Medical Epigenetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China.,Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Xianfu Yi
- School of Biomedical Engineering, Tianjin Medical University, Tianjin, China
| | - Shijie Zhang
- Department of Pharmacology, Collaborative Innovation Center of Tianjin for Medical Epigenetics, Tianjin Key Laboratory of Medical Epigenetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Zhanye Zheng
- Department of Pharmacology, Collaborative Innovation Center of Tianjin for Medical Epigenetics, Tianjin Key Laboratory of Medical Epigenetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Panwen Wang
- Department of Health Sciences Research & Center for Individualized Medicine, Mayo Clinic, Scottsdale, USA
| | - Chenghao Xuan
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| | - Pak Chung Sham
- Center for Genomic Sciences, The University of Hong Kong, Hong Kong SAR, China.,Departments of Psychiatry, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China.,State Key Laboratory of Brain and Cognitive Sciences, The University of Hong Kong, Hong Kong SAR, China
| | - Junwen Wang
- Department of Health Sciences Research & Center for Individualized Medicine, Mayo Clinic, Scottsdale, USA.,Department of Biomedical Informatics, Arizona State University, Scottsdale, USA
| | - Mulin Jun Li
- Department of Pharmacology, Collaborative Innovation Center of Tianjin for Medical Epigenetics, Tianjin Key Laboratory of Medical Epigenetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin, China
| |
|
32
|
Beiki H, Liu H, Huang J, Manchanda N, Nonneman D, Smith TPL, Reecy JM, Tuggle CK. Improved annotation of the domestic pig genome through integration of Iso-Seq and RNA-seq data. BMC Genomics 2019; 20:344. [PMID: 31064321 PMCID: PMC6505119 DOI: 10.1186/s12864-019-5709-y] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Received: 12/20/2018] [Accepted: 04/17/2019] [Indexed: 01/07/2023] Open
Abstract
BACKGROUND Our understanding of the pig transcriptome is limited. RNA transcript diversity among nine tissues was assessed using poly(A) selected single-molecule long-read isoform sequencing (Iso-seq) and Illumina RNA sequencing (RNA-seq) from a single White cross-bred pig. RESULTS Across tissues, a total of 67,746 unique transcripts were observed, including 60.5% predicted protein-coding, 36.2% long non-coding RNA and 3.3% nonsense-mediated decay transcripts. On average, 90% of the splice junctions were supported by RNA-seq within tissue. A large proportion (80%) represented novel transcripts, mostly produced by known protein-coding genes (70%), while 17% corresponded to novel genes. On average, four transcripts per known gene (tpg) were identified, an increase over current EBI (1.9 tpg) and NCBI (2.9 tpg) annotations and closer to the number reported for the human genome (4.2 tpg). Our new pig genome annotation extended more than 6000 known gene borders (5' end extension, 3' end extension, or both) compared to EBI or NCBI annotations. We validated a large proportion of these extensions by independent pig poly(A) selected 3'-RNA-seq data, or human FANTOM5 Cap Analysis of Gene Expression data. Further, we detected 10,465 novel genes (81% non-coding) not reported in current pig genome annotations. More than 80% of these novel genes had transcripts detected in more than one tissue. In addition, more than 80% of novel intergenic genes with at least one transcript detected in liver tissue had H3K4me3 or H3K36me3 peaks mapping to their promoter and gene body, respectively, in independent liver chromatin immunoprecipitation data. CONCLUSIONS These validated results show significant improvement over current pig genome annotations.
Affiliation(s)
- H Beiki
- Department of Animal Science, Iowa State University, 2255 Kildee Hall, Ames, IA, 50011, USA
| | - H Liu
- Department of Animal Science, Iowa State University, 2255 Kildee Hall, Ames, IA, 50011, USA
| | - J Huang
- Department of Animal Science, Iowa State University, 2255 Kildee Hall, Ames, IA, 50011, USA.,College of Animal Science and Technology, Jiangxi Agricultural University, Nanchang, Jiangxi, People's Republic of China
| | - N Manchanda
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, 819 Wallace Road, Ames, IA, 50011, USA
| | - D Nonneman
- USDA, ARS, U.S. Meat Animal Research Center, Clay Center, NE, 68933, USA
| | - T P L Smith
- USDA, ARS, U.S. Meat Animal Research Center, Clay Center, NE, 68933, USA
| | - J M Reecy
- Department of Animal Science, Iowa State University, 2255 Kildee Hall, Ames, IA, 50011, USA
| | - C K Tuggle
- Department of Animal Science, Iowa State University, 2255 Kildee Hall, Ames, IA, 50011, USA
| |
|
33
|
Mortlock S, Restuadi R, Levien R, Girling JE, Holdsworth-Carson SJ, Healey M, Zhu Z, Qi T, Wu Y, Lukowski SW, Rogers PAW, Yang J, McRae AF, Fung JN, Montgomery GW. Genetic regulation of methylation in human endometrium and blood and gene targets for reproductive diseases. Clin Epigenetics 2019; 11:49. [PMID: 30871624 PMCID: PMC6416889 DOI: 10.1186/s13148-019-0648-7] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 11/01/2018] [Accepted: 03/06/2019] [Indexed: 02/02/2023] Open
Abstract
BACKGROUND Major challenges in understanding the functional consequences of genetic risk factors for human disease are identifying which tissues and cell types are affected and the limited availability of suitable tissue. The aim of this study was to evaluate tissue-specific genotype-epigenetic characteristics in DNA samples from both endometrium and blood collected from women at different stages of the menstrual cycle and relate results to genetic risk factors for reproductive traits and diseases. RESULTS We analysed DNA methylation (DNAm) data from endometrium and blood samples from 66 European women. Methylation profiles were compared between stages of the menstrual cycle, and changes in methylation overlaid with changes in transcription and genotypes. We observed large changes in methylation (27,262 DNAm probes) across the menstrual cycle in endometrium that were not observed in blood. Individual genotype data were tested for association with methylation at 443,016 and 443,101 DNAm probes in endometrium and blood respectively to identify methylation quantitative trait loci (mQTLs). A total of 4546 sentinel cis-mQTLs (P < 1.13 × 10^-10) and 434 sentinel trans-mQTLs (P < 2.29 × 10^-12) were detected in endometrium and 6615 sentinel cis-mQTLs (P < 1.13 × 10^-10) and 590 sentinel trans-mQTLs (P < 2.29 × 10^-12) were detected in blood. Following secondary analyses, conducted to test for overlap between mQTLs in the two tissues, we found that 62% of endometrial cis-mQTLs were also observed in blood and the genetic effects between tissues were highly correlated. A number of mQTL SNPs were associated with reproductive traits and diseases, including one mQTL located in a known risk region for endometriosis (near GREB1). CONCLUSIONS We report novel findings characterising genetic regulation of methylation in endometrium and the association of endometrial mQTLs with endometriosis risk and other reproductive traits and diseases. 
The high correlation of genetic effects between tissues highlights the potential to exploit the power of large mQTL datasets in endometrial research and identify target genes for functional studies. However, tissue-specific methylation profiles and genetic effects also highlight the importance of using disease-relevant tissues when investigating molecular mechanisms of disease risk.
Affiliation(s)
- Sally Mortlock
- Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, Building 80, St Lucia, QLD, 4072, Australia
| | - Restuadi Restuadi
- Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, Building 80, St Lucia, QLD, 4072, Australia
| | - Rupert Levien
- Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, Building 80, St Lucia, QLD, 4072, Australia
| | - Jane E Girling
- Department of Obstetrics and Gynaecology, and Gynaecology Research Centre, University of Melbourne, Royal Women's Hospital, Parkville, VIC, 3052, Australia.,Department of Anatomy, University of Otago, Dunedin, New Zealand
| | - Sarah J Holdsworth-Carson
- Department of Obstetrics and Gynaecology, and Gynaecology Research Centre, University of Melbourne, Royal Women's Hospital, Parkville, VIC, 3052, Australia
| | - Martin Healey
- Department of Obstetrics and Gynaecology, and Gynaecology Research Centre, University of Melbourne, Royal Women's Hospital, Parkville, VIC, 3052, Australia
| | - Zhihong Zhu
- Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, Building 80, St Lucia, QLD, 4072, Australia
| | - Ting Qi
- Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, Building 80, St Lucia, QLD, 4072, Australia
| | - Yang Wu
- Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, Building 80, St Lucia, QLD, 4072, Australia
| | - Samuel W Lukowski
- Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, Building 80, St Lucia, QLD, 4072, Australia
| | - Peter A W Rogers
- Department of Obstetrics and Gynaecology, and Gynaecology Research Centre, University of Melbourne, Royal Women's Hospital, Parkville, VIC, 3052, Australia
| | - Jian Yang
- Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, Building 80, St Lucia, QLD, 4072, Australia
| | - Allan F McRae
- Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, Building 80, St Lucia, QLD, 4072, Australia
| | - Jenny N Fung
- Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, Building 80, St Lucia, QLD, 4072, Australia
| | - Grant W Montgomery
- Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, Building 80, St Lucia, QLD, 4072, Australia
| |
|
34
|
Yao Y, Liu Z, Wei Q, Ramsey SA. CERENKOV2: improved detection of functional noncoding SNPs using data-space geometric features. BMC Bioinformatics 2019; 20:63. [PMID: 30727967 PMCID: PMC6364436 DOI: 10.1186/s12859-019-2637-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 07/05/2018] [Accepted: 01/18/2019] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND We previously reported on CERENKOV, an approach for identifying regulatory single nucleotide polymorphisms (rSNPs) that is based on 246 annotation features. CERENKOV uses the xgboost classifier and is designed to be used to find causal noncoding SNPs in loci identified by genome-wide association studies (GWAS). We reported that CERENKOV has state-of-the-art performance (by two traditional measures and a novel GWAS-oriented measure, AVGRANK) in a comparison to nine other tools for identifying functional noncoding SNPs, using a comprehensive reference SNP set (OSU17, 15,331 SNPs). Given that SNPs are grouped within loci in the reference SNP set and given the importance of the data-space manifold geometry for machine-learning model selection, we hypothesized that within-locus inter-SNP distances would have class-based distributional biases that could be exploited to improve rSNP recognition accuracy. We thus defined an intralocus SNP "radius" as the average data-space distance from a SNP to the other intralocus neighbors, and explored radius likelihoods for five distance measures. RESULTS We expanded the set of reference SNPs to 39,083 (the OSU18 set) and extracted CERENKOV SNP feature data. We computed radius empirical likelihoods and likelihood densities for rSNPs and control SNPs, and found significant likelihood differences between rSNPs and control SNPs. We fit parametric models of likelihood distributions for five different distance measures to obtain ten log-likelihood features that we combined with the 248-dimensional CERENKOV feature matrix. On the OSU18 SNP set, we measured the classification accuracy of CERENKOV with and without the new distance-based features, and found that the addition of distance-based features significantly improves rSNP recognition performance as measured by AUPVR, AUROC, and AVGRANK. 
Along with feature data for the OSU18 set, the software code for extracting the base feature matrix, estimating ten distance-based likelihood ratio features, and scoring candidate causal SNPs is released as the open-source software CERENKOV2. CONCLUSIONS Accounting for the locus-specific geometry of SNPs in data-space significantly improved the accuracy with which noncoding rSNPs can be computationally identified.
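The intralocus "radius" defined in this abstract (the average data-space distance from a SNP to the other SNPs in the same locus) can be illustrated with a short sketch. It assumes Euclidean distance, which is only one possible choice; the abstract mentions five distance measures without naming them. The function name, argument layout, and toy data are hypothetical and do not come from the CERENKOV2 code base.

```python
import math
from collections import defaultdict


def intralocus_radius(features, locus_ids):
    """Mean Euclidean distance from each SNP to the other SNPs in its locus.

    features:  list of equal-length numeric feature vectors, one per SNP.
    locus_ids: list of locus labels, parallel to `features`.
    A SNP that is alone in its locus has no neighbors and gets None.
    """
    by_locus = defaultdict(list)
    for idx, locus in enumerate(locus_ids):
        by_locus[locus].append(idx)

    radii = [None] * len(features)
    for members in by_locus.values():
        if len(members) < 2:
            continue
        for i in members:
            total = sum(
                math.dist(features[i], features[j]) for j in members if j != i
            )
            radii[i] = total / (len(members) - 1)
    return radii


# Toy feature vectors and locus labels, for illustration only.
toy_features = [[0, 0], [3, 4], [6, 8], [1, 1]]
toy_loci = ["A", "A", "A", "B"]
print(intralocus_radius(toy_features, toy_loci))
```

In the approach the abstract describes, such radii are not used directly: their likelihoods under the rSNP and control-SNP distributions are modeled and the resulting log-likelihood features are appended to the feature matrix. That modeling step is not sketched here.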
Affiliation(s)
- Yao Yao
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, 97330 OR USA
- Department of Biomedical Sciences, Oregon State University, 106 Dryden Hall, Corvallis, 97330 OR USA
| | - Zheng Liu
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, 97330 OR USA
- Department of Biomedical Sciences, Oregon State University, 106 Dryden Hall, Corvallis, 97330 OR USA
| | - Qi Wei
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, 97330 OR USA
- Department of Biomedical Sciences, Oregon State University, 106 Dryden Hall, Corvallis, 97330 OR USA
| | - Stephen A. Ramsey
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, 97330 OR USA
- Department of Biomedical Sciences, Oregon State University, 106 Dryden Hall, Corvallis, 97330 OR USA
| |
|
35
|
The Identification and Interpretation of cis-Regulatory Noncoding Mutations in Cancer. High Throughput 2018; 8:ht8010001. [PMID: 30577431 PMCID: PMC6473693 DOI: 10.3390/ht8010001] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 11/06/2018] [Revised: 12/11/2018] [Accepted: 12/14/2018] [Indexed: 12/30/2022] Open
Abstract
To characterise the genomic landscape of cancers and establish novel biomarkers and therapeutic targets, studies have largely focused on the identification of driver mutations within the protein-coding gene regions, where the most pathogenic alterations are known to occur. However, the noncoding genome is significantly larger than its protein-coding counterpart, and evidence reveals that regulatory sequences also harbour functional mutations that significantly affect the regulation of genes and pathways implicated in cancer. Due to the sheer number of noncoding mutations (NCMs) and the limited knowledge of regulatory element functionality in cancer genomes, differentiating pathogenic mutations from background passenger noise is particularly challenging, both technically and computationally. Here we review various up-to-date high-throughput sequencing data/studies and in silico methods that can be employed to interrogate the noncoding genome. We aim to provide an overview of available data resources as well as computational and molecular techniques that can help and guide the search for functional NCMs in cancer genomes.
|
36
|
Harakalova M, Asselbergs FW. Systems analysis of dilated cardiomyopathy in the next generation sequencing era. Wiley Interdiscip Rev Syst Biol Med 2018; 10:e1419. [PMID: 29485202 DOI: 10.1002/wsbm.1419] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 09/12/2017] [Revised: 12/31/2017] [Accepted: 01/17/2018] [Indexed: 12/17/2022]
Abstract
Dilated cardiomyopathy (DCM) is a form of severe failure of cardiac muscle caused by a long list of etiologies ranging from myocardial infarction and DNA mutations in cardiac genes to toxic agents. Systems analysis integrating next-generation sequencing (NGS)-based omics approaches, such as the sequencing of DNA, RNA, and chromatin, provides valuable insights into DCM mechanisms. The outcome and interpretation of NGS methods can be affected by the localization of the cardiac biopsy, the level of tissue degradation, and variable ratios of different cell populations, especially in the presence of fibrosis. Heart tissue composition may even differ between sexes, or between siblings carrying the same disease-causing mutation. Therefore, before planning any experiments, it is important to fully appreciate the complexities of DCM, and the selection of samples suitable for a given research question should be an interdisciplinary effort involving clinicians and biologists. The list of NGS omics datasets in DCM to date is short. More studies have to be performed to contribute to public data repositories and facilitate systems analysis. In addition, proper data integration is a difficult task requiring complex computational approaches. Despite these complications, there are multiple promising implications of systems analysis in DCM. By combining various types of datasets, for example, RNA-seq, ChIP-seq, or 4C, deep insights into cardiac biology, and possible biomarkers and treatment targets, can be gained. Systems analysis can also facilitate the annotation of noncoding mutations in cardiac-specific DNA regulatory regions that play a substantial role in maintaining the tissue- and cell-specific transcriptional programs in the heart. This article is categorized under: Physiology > Mammalian Physiology in Health and Disease; Laboratory Methods and Technologies > Genetic/Genomic Methods; Laboratory Methods and Technologies > RNA Methods.
Affiliation(s)
- Magdalena Harakalova
- Department of Cardiology, Division Heart and Lungs, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
| | - Folkert W Asselbergs
- Department of Cardiology, Division Heart and Lungs, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands.,Durrer Center for Cardiovascular Research, Netherlands Heart Institute, Utrecht, Netherlands.,Institute of Cardiovascular Science, University College London, London, UK
| |
|
37
|
Hashimoto H, Kawabe T, Fukuda T, Kusakabe M. A Novel Ataxic Mutant Mouse Line Having Sensory Neuropathy Shows Heavy Iron Deposition in Kidney. Neurodegener Dis 2017; 17:181-198. [DOI: 10.1159/000457126] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/29/2016] [Accepted: 01/20/2017] [Indexed: 01/11/2023] Open
|
38
|
Iorio A, De Angelis F, Di Girolamo M, Luigetti M, Pradotto LG, Mazzeo A, Frusconi S, My F, Manfellotto D, Fuciarelli M, Polimanti R. Population diversity of the genetically determined TTR expression in human tissues and its implications in TTR amyloidosis. BMC Genomics 2017; 18:254. [PMID: 28335735 PMCID: PMC5364715 DOI: 10.1186/s12864-017-3646-1] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 02/09/2017] [Accepted: 03/18/2017] [Indexed: 01/08/2023] Open
Abstract
BACKGROUND Transthyretin (TTR) amyloidosis is a hereditary disease with a complex genotype-phenotype correlation. We conducted a literature survey to define the clinical landscape of TTR amyloidosis across populations worldwide. Then, we investigated whether the genetically determined TTR expression differs among human populations, contributing to the differences observed in patients. Polygenic scores for genetically determined TTR expression in 14 clinically relevant tissues were constructed using data from the GTEx (Genotype-Tissue Expression) project and tested in the samples from the 1000 Genomes Project. RESULTS We observed differences among the ancestral groups and, to a lesser extent, among the investigated populations within the ancestry groups. Scandinavian populations differed in their genetically determined TTR expression of skeletal muscle tissue with respect to Southern Europeans (p = 6.79 × 10^-6). This is in line with epidemiological data related to Swedish and Portuguese TTR Val30Met endemic areas. Familial amyloidotic cardiomyopathy (TTR deposits occur primarily in heart tissues) presents clinical variability among human populations, a finding that agrees with the among-ancestry diversity of genetically determined TTR expression in heart tissues (i.e., Atrial Appendage p = 4.55 × 10^-28; Left Ventricle p = 6.54 × 10^-35). CONCLUSIONS Genetically determined TTR expression varied across human populations. This might contribute to the genotype-phenotype correlation of TTR amyloidosis.
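The abstract does not spell out how the polygenic scores were constructed. A common form, assumed here purely for illustration, is a weighted sum of allele dosages, with one weight per variant (for example, an eQTL effect size for a given tissue). The function name, variant identifiers, and weights below are hypothetical and are not taken from the study or from GTEx.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages over variants present in both mappings.

    dosages: dict mapping variant ID -> effect-allele count (0, 1, or 2).
    weights: dict mapping variant ID -> per-allele effect on expression.
    Variants missing from either mapping are ignored (an assumption;
    real pipelines may impute or flag them instead).
    """
    shared = dosages.keys() & weights.keys()
    return sum(weights[variant] * dosages[variant] for variant in shared)


# Hypothetical weights for one tissue and one individual's genotypes.
tissue_weights = {"var1": 0.5, "var2": -0.25, "var3": 0.1}
individual = {"var1": 2, "var2": 1, "var4": 0}
print(polygenic_score(individual, tissue_weights))
```

Under this assumed form, scores computed per individual can then be compared between population groups, which is the kind of comparison the RESULTS section reports.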
Affiliation(s)
- Andrea Iorio
- Department of Biology, University of Rome Tor Vergata, Rome, Italy
| | | | - Marco Di Girolamo
- Clinical Pathophysiology Center, AFaR Foundation - "San Giovanni Calibita" Fatebenefratelli Hospital, Isola Tiberina, Rome, Italy
| | - Marco Luigetti
- Departments of Geriatrics, Neurosciences & Orthopedics, Institute of Neurology, Catholic University of the Sacred Heart, Fondazione Policlinico Universitario A. Gemelli, Rome, Italy
| | - Luca G Pradotto
- Division of Neurology and Neurorehabilitation, San Giuseppe Hospital, IRCCS-Istituto Auxologico Italiano, Piancavallo (VB), Italy
| | - Anna Mazzeo
- Department of Clinical and Experimental Medicine, University of Messina, Messina, Italy
| | - Sabrina Frusconi
- Genetic Diagnostics Unit, Laboratory Department, Careggi University Hospital, Florence, Italy
| | - Filomena My
- Division of Neurology, "Vito Fazzi" Hospital, Lecce, Italy
| | - Dario Manfellotto
- Clinical Pathophysiology Center, AFaR Foundation - "San Giovanni Calibita" Fatebenefratelli Hospital, Isola Tiberina, Rome, Italy
| | - Maria Fuciarelli
- Department of Biology, University of Rome Tor Vergata, Rome, Italy
| | - Renato Polimanti
- Department of Psychiatry, Yale University School of Medicine and VA CT Healthcare Center, VA CT 116A2, 950 Campbell Avenue, West Haven, CT, 06516, USA
| |
|
39
|
Wu M, Chen T, Jiang R. Global inference of disease-causing single nucleotide variants from exome sequencing data. BMC Bioinformatics 2016; 17:468. [PMID: 28155632 PMCID: PMC5260102 DOI: 10.1186/s12859-016-1325-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND Whole exome sequencing (WES) has recently emerged as an effective approach for identifying genetic variants underlying human diseases. However, considerable time and labour are needed for careful investigation of candidate variants. Although filtration based on population frequencies and functional prediction scores can effectively remove common and neutral variants, hundreds or even thousands of rare deleterious variants still remain. In addition, current WES platforms also provide variant information in flanking noncoding regions, such as promoters, introns and splice sites. Despite being recognized to harbour causal variants, these regions are usually ignored by current analysis pipelines. RESULTS We present a novel computational method, called Glints, to overcome the above limitations. Glints is capable of identifying disease-causing SNVs in both coding and flanking noncoding regions from exome sequencing data. The principle behind Glints is that disease-causing variants should manifest their effect at both variant and gene levels. Specifically, Glints integrates 14 types of functional scores, including predictions for both coding and noncoding variants, and 9 types of association scores, which help identify disease-relevant genes. We conducted large-scale simulation studies based on 1000 Genomes Project data and demonstrated the effectiveness of our method in both coding and flanking noncoding regions. We also applied Glints to two real exome sequencing studies and demonstrated its effectiveness for uncovering disease-causing SNVs. Both standalone software and a web server are available at our website http://bioinfo.au.tsinghua.edu.cn/jianglab/glints. CONCLUSIONS Glints is effective for uncovering disease-causing SNVs in coding and flanking noncoding regions, which is supported by both simulation and real case studies. Glints is expected to be a useful tool for human genetics research based on exome sequencing data.
Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1325-x) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Mengmeng Wu
- MOE Key Laboratory of Bioinformatics; Bioinformatics Division and Center for Synthetic & Systems Biology, Tsinghua National Laboratory for Information Science and Technology, Beijing, China; Department of Computer Science, Tsinghua University, Beijing, China
| | - Ting Chen
- MOE Key Laboratory of Bioinformatics; Bioinformatics Division and Center for Synthetic & Systems Biology, Tsinghua National Laboratory for Information Science and Technology, Beijing, China; Department of Computer Science, Tsinghua University, Beijing, China
| | - Rui Jiang
- MOE Key Laboratory of Bioinformatics; Bioinformatics Division and Center for Synthetic & Systems Biology, Tsinghua National Laboratory for Information Science and Technology, Beijing, China; Department of Automation, Tsinghua University, Beijing, China
|
40
|
Qin J, Yan B, Hu Y, Wang P, Wang J. Applications of integrative OMICs approaches to gene regulation studies. Quantitative Biology 2016. [DOI: 10.1007/s40484-016-0085-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
|
41
|
Hsiao YHE, Bahn JH, Lin X, Chan TM, Wang R, Xiao X. Alternative splicing modulated by genetic variants demonstrates accelerated evolution regulated by highly conserved proteins. Genome Res 2016; 26:440-50. [PMID: 26888265 PMCID: PMC4817768 DOI: 10.1101/gr.193359.115] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.1] [Received: 04/20/2015] [Accepted: 02/17/2016] [Indexed: 01/23/2023]
Abstract
Identification of functional genetic variants and elucidation of their regulatory mechanisms represent significant challenges of the post-genomic era. A poorly understood topic is the involvement of genetic variants in mediating post-transcriptional RNA processing, including alternative splicing. Thus far, little is known about the genomic, evolutionary, and regulatory features of genetically modulated alternative splicing (GMAS). Here, we systematically identified intronic tag variants for genetic modulation of alternative splicing using RNA-seq data specific to cellular compartments. Combined with our previous method that identifies exonic tags for GMAS, this study yielded 622 GMAS exons. We observed that GMAS events are highly cell type independent, indicating that splicing-altering genetic variants could have widespread function across cell types. Interestingly, GMAS genes, exons, and single-nucleotide variants (SNVs) all demonstrated positive selection or accelerated evolution in primates. We predicted that GMAS SNVs often alter binding of splicing factors, with SRSF1 affecting the most GMAS events and demonstrating global allelic binding bias. However, in contrast to their GMAS targets, the predicted splicing factors are more conserved than expected, suggesting that cis-regulatory variation is the major driving force of splicing evolution. Moreover, GMAS-related splicing factors had stronger consensus motifs than expected, consistent with their susceptibility to SNV disruption. Intriguingly, GMAS SNVs in general do not alter the strongest consensus position of the splicing factor motif, except the more than 100 GMAS SNVs in linkage disequilibrium with polymorphisms reported by genome-wide association studies. Our study reports many GMAS events and enables a better understanding of the evolutionary and regulatory features of this phenomenon.
Affiliation(s)
- Yun-Hua Esther Hsiao
- Department of Integrative Biology and Physiology, University of California Los Angeles, Los Angeles, California 90095, USA; Department of Bioengineering, University of California Los Angeles, Los Angeles, California 90095, USA
| | - Jae Hoon Bahn
- Department of Integrative Biology and Physiology, University of California Los Angeles, Los Angeles, California 90095, USA
| | - Xianzhi Lin
- Department of Integrative Biology and Physiology, University of California Los Angeles, Los Angeles, California 90095, USA
| | - Tak-Ming Chan
- Department of Integrative Biology and Physiology, University of California Los Angeles, Los Angeles, California 90095, USA
| | - Rena Wang
- Department of Integrative Biology and Physiology, University of California Los Angeles, Los Angeles, California 90095, USA
| | - Xinshu Xiao
- Department of Integrative Biology and Physiology, University of California Los Angeles, Los Angeles, California 90095, USA; Department of Bioengineering, University of California Los Angeles, Los Angeles, California 90095, USA; Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, California 90095, USA; Molecular Biology Institute, University of California Los Angeles, Los Angeles, California 90095, USA
|
42
|
Hofmann-Apitius M, Ball G, Gebel S, Bagewadi S, de Bono B, Schneider R, Page M, Kodamullil AT, Younesi E, Ebeling C, Tegnér J, Canard L. Bioinformatics Mining and Modeling Methods for the Identification of Disease Mechanisms in Neurodegenerative Disorders. Int J Mol Sci 2015; 16:29179-206. [PMID: 26690135 PMCID: PMC4691095 DOI: 10.3390/ijms161226148] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Received: 10/16/2015] [Revised: 11/10/2015] [Accepted: 11/12/2015] [Indexed: 12/22/2022] Open
Abstract
Since the decoding of the Human Genome, techniques from bioinformatics, statistics, and machine learning have been instrumental in uncovering patterns in the increasing amounts and types of data produced by profiling technologies applied to clinical samples, animal models, and cellular systems. Yet progress on unravelling the biological mechanisms that causally drive diseases has been limited, in part due to the inherent complexity of biological systems. Whereas we have witnessed progress in the areas of cancer, cardiovascular and metabolic diseases, the area of neurodegenerative diseases has proved to be very challenging. This is in part because the aetiology of neurodegenerative diseases such as Alzheimer's disease or Parkinson's disease is unknown, rendering it very difficult to discern early causal events. Here we describe a panel of bioinformatics and modeling approaches that have recently been developed to identify candidate mechanisms of neurodegenerative diseases based on publicly available data and knowledge. We identify two complementary strategies: data mining techniques that use genetic data as a starting point to be further enriched with other data types, or alternatively encoding prior knowledge about disease mechanisms in a model-based framework that supports reasoning and enrichment analysis. Our review illustrates the challenges entailed in integrating heterogeneous, multiscale and multimodal information in the area of neurology in general and neurodegeneration in particular. We conclude that progress would be accelerated by increasing efforts to systematically collect multiple data types over time from each individual suffering from neurodegenerative disease.
The work presented here has been driven by the AETIONOMY project, which is funded in the course of the Innovative Medicines Initiative (IMI), a public-private partnership of the European Federation of Pharmaceutical Industry Associations (EFPIA) and the European Commission (EC).
Affiliation(s)
- Martin Hofmann-Apitius
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Institutszentrum Birlinghoven, Sankt Augustin D-53754, Germany.
- Rheinische Friedrich-Wilhelms-Universitaet Bonn, University of Bonn, Bonn 53113, Germany.
| | - Gordon Ball
- Unit of Computational Medicine, Center for Molecular Medicine, Department of Medicine, and Unit of Clinical Epidemiology, Karolinska University Hospital, Stockholm SE-171 77, Sweden.
- Science for Life Laboratories, Karolinska Institutet, Stockholm SE-171 77, Sweden.
| | - Stephan Gebel
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 7, avenue des Hauts-Fourneaux, Esch-sur-Alzette L-4362, Luxembourg.
| | - Shweta Bagewadi
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Institutszentrum Birlinghoven, Sankt Augustin D-53754, Germany.
| | - Bernard de Bono
- Institute of Health Informatics, University College London, London NW1 2DA, UK.
- Auckland Bioengineering Institute, University of Auckland, Symmonds Street, Auckland 1142, New Zealand.
| | - Reinhard Schneider
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 7, avenue des Hauts-Fourneaux, Esch-sur-Alzette L-4362, Luxembourg.
| | - Matt Page
- Translational Bioinformatics, UCB Pharma, 216 Bath Rd, Slough SL1 3WE, UK.
| | - Alpha Tom Kodamullil
- Rheinische Friedrich-Wilhelms-Universitaet Bonn, University of Bonn, Bonn 53113, Germany.
| | - Erfan Younesi
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Institutszentrum Birlinghoven, Sankt Augustin D-53754, Germany.
| | - Christian Ebeling
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Institutszentrum Birlinghoven, Sankt Augustin D-53754, Germany.
| | - Jesper Tegnér
- Unit of Computational Medicine, Center for Molecular Medicine, Department of Medicine, and Unit of Clinical Epidemiology, Karolinska University Hospital, Stockholm SE-171 77, Sweden.
- Science for Life Laboratories, Karolinska Institutet, Stockholm SE-171 77, Sweden.
| | - Luc Canard
- Translational Science Unit, SANOFI Recherche & Développement, 1 Avenue Pierre Brossolette, Chilly-Mazarin Cedex 91385, France.
|
43
|
Li MJ, Liu Z, Wang P, Wong MP, Nelson MR, Kocher JPA, Yeager M, Sham PC, Chanock SJ, Xia Z, Wang J. GWASdb v2: an update database for human genetic variants identified by genome-wide association studies. Nucleic Acids Res 2015; 44:D869-76. [PMID: 26615194 PMCID: PMC4702921 DOI: 10.1093/nar/gkv1317] [Citation(s) in RCA: 139] [Impact Index Per Article: 15.4] [Received: 08/21/2015] [Accepted: 11/10/2015] [Indexed: 12/19/2022] Open
Abstract
Genome-wide association studies (GWASs), now a routine approach to studying single-nucleotide polymorphism (SNP)-trait associations, have uncovered over ten thousand significant trait/disease-associated SNPs (TASs). Here, we updated GWASdb (GWASdb v2, http://jjwanglab.org/gwasdb), which provides comprehensive data curation and knowledge integration for GWAS TASs. These updates include: (i) Up to August 2015, we collected 2479 unique publications from PubMed and other resources; (ii) We further curated moderate SNP-trait associations (P-value < 1.0 × 10⁻³) from each original publication, and generated a total of 252 530 unique TASs in all GWASdb v2 collected studies; (iii) We manually mapped 1610 GWAS traits to 501 Human Phenotype Ontology (HPO) terms, 435 Disease Ontology (DO) terms and 228 Disease Ontology Lite (DOLite) terms. For each ontology term, we also predicted the putative causal genes; (iv) We curated the detailed sub-populations and related sample size for each study; (v) Importantly, we performed extensive functional annotation for each TAS by incorporating gene-based information, ENCODE ChIP-seq assays, eQTL, population haplotype, functional prediction across multiple biological domains, evolutionary signals and disease-related annotation; (vi) Additionally, we compiled a SNP-drug response association dataset for 650 pharmacogenetic studies involving 257 drugs in this update; (vii) Last, we improved the user interface of the website.
Affiliation(s)
- Mulin Jun Li
- Centre for Genomic Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China; School of Biomedical Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - Zipeng Liu
- Centre for Genomic Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China; Department of Anaesthesiology, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - Panwen Wang
- Centre for Genomic Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China; School of Biomedical Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - Maria P Wong
- Department of Pathology, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - Matthew R Nelson
- Quantitative Sciences, GlaxoSmithKline, Research Triangle Park, NC, USA
| | - Jean-Pierre A Kocher
- Division of Biomedical Statistics and Informatics, Mayo Clinic College of Medicine, Rochester, MN, USA
| | - Meredith Yeager
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, Bethesda, MD, USA
| | - Pak Chung Sham
- Centre for Genomic Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China; State Key Laboratory of Brain and Cognitive Sciences and Department of Psychiatry, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - Stephen J Chanock
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, Bethesda, MD, USA
| | - Zhengyuan Xia
- Department of Anaesthesiology, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - Junwen Wang
- Centre for Genomic Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China; School of Biomedical Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
|
44
|
Flores Saiffe Farías A, Jaime Herrera López E, Moreno Vázquez CJ, Li W, Prado Montes de Oca E. Predicting functional regulatory SNPs in the human antimicrobial peptide genes DEFB1 and CAMP in tuberculosis and HIV/AIDS. Comput Biol Chem 2015; 59 Pt A:117-25. [PMID: 26447748 DOI: 10.1016/j.compbiolchem.2015.09.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 09/24/2014] [Revised: 09/03/2015] [Accepted: 09/04/2015] [Indexed: 01/04/2023]
Abstract
Single nucleotide polymorphisms (SNPs) in transcription factor binding sites (TFBSs) within gene promoter regions or enhancers can modify the transcription rate of genes related to complex diseases. These SNPs can be called regulatory SNPs (rSNPs). Data compiled from recent projects, such as the 1000 Genomes Project and ENCODE, have revealed essential information used to perform in silico prediction of the molecular and biological repercussions of SNPs within TFBSs. However, most of these studies are very limited, as they only analyze SNPs in coding regions or, when applied to promoters, do not integrate essential biological data such as TFBSs, expression profiles, pathway analysis, homotypic redundancy (number of TFBSs for the same TF in a region), chromatin accessibility and others, which could lead to a more accurate prediction. Our aim was to integrate different data in a biologically coherent method to analyze the proximal promoter regions of two antimicrobial peptide genes, DEFB1 and CAMP, that are associated with tuberculosis (TB) and HIV/AIDS. We predicted SNPs within the promoter regions that are more likely to interact with transcription factors (TFs). We also assessed the impact of homotypic redundancy using a novel approach called the homotypic redundancy weight factor (HWF). Our results identified 10 SNPs, which putatively modify the binding affinity of 24 TFs previously identified as related to TB and HIV/AIDS expression profiles (e.g. KLF5, CEBPA and NFKB1 for TB; FOXP2, BRCA1, CEBPB, CREB1, EBF1 and ZNF354C for HIV/AIDS; and RUNX2, HIF1A, JUN/AP-1, NR4A2, EGR1 for both diseases). Validating with the OregAnno database and cell-specific functional/non-functional SNPs from 13 additional genes, our algorithm achieved 53% sensitivity and 84.6% specificity in detecting functional rSNPs using the DNAseI-HUP database. We are proposing our algorithm as a novel in silico method to detect true functional rSNPs in antimicrobial peptide genes.
With further improvement, this novel method could be applied to other promoters in order to design probes and to discover new drug targets for complex diseases.
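As a reading aid for the sensitivity and specificity figures quoted in the abstract above, the following Python sketch shows how the two metrics are derived from confusion-matrix counts. It is only an illustration: the counts are hypothetical and are not the study's data, and `sensitivity_specificity` is an illustrative helper name, not part of the authors' software.

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    sensitivity = tp / (tp + fn)  # share of truly functional SNPs that were detected
    specificity = tn / (tn + fp)  # share of non-functional SNPs that were rejected
    return sensitivity, specificity


# Hypothetical counts for illustration only (not taken from the paper).
sens, spec = sensitivity_specificity(tp=40, fn=10, tn=45, fp=5)
print(f"sensitivity={sens:.1%}, specificity={spec:.1%}")
```

A classifier with moderate sensitivity and high specificity, as reported above, misses a sizeable share of true functional rSNPs but rarely flags non-functional ones.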
Affiliation(s)
- Adolfo Flores Saiffe Farías
- Personalized Medicine Laboratory (LAMPER), Medical and Pharmaceutical Biotechnology, Guadalajara Unit, Research Center of Technology and Design Assistance of Jalisco State, National Council of Science and Technology (CIATEJ AC, CONACYT), Av. Normalistas 800, Col. Colinas de la Normal, CP 44270 Guadalajara, Jalisco, Mexico.
| | - Enrique Jaime Herrera López
- Industrial Biotechnology, CIATEJ AC, Zapopan Unit, CONACYT, Camino Arenero 1227, Col. El Bajío del Arenal, CP 45019 Zapopan, Jalisco, Mexico.
| | - Cristopher Jorge Moreno Vázquez
- Personalized Medicine Laboratory (LAMPER), Medical and Pharmaceutical Biotechnology, Guadalajara Unit, Research Center of Technology and Design Assistance of Jalisco State, National Council of Science and Technology (CIATEJ AC, CONACYT), Av. Normalistas 800, Col. Colinas de la Normal, CP 44270 Guadalajara, Jalisco, Mexico.
| | - Wentian Li
- The Robert S. Boas Center for Genomics and Human Genetics, Feinstein Institute for Medical Research, 350 Community Dr. Manhasset, NY 11030, USA.
| | - Ernesto Prado Montes de Oca
- Personalized Medicine Laboratory (LAMPER), Medical and Pharmaceutical Biotechnology, Guadalajara Unit, Research Center of Technology and Design Assistance of Jalisco State, National Council of Science and Technology (CIATEJ AC, CONACYT), Av. Normalistas 800, Col. Colinas de la Normal, CP 44270 Guadalajara, Jalisco, Mexico; Molecular Biology Laboratory, Biosafety Area, Medical and Pharmaceutical Biotechnology, Guadalajara Unit, CIATEJ AC, CONACYT, Av. Normalistas 800, Col. Colinas de la Normal, CP 44270 Guadalajara, Jalisco, Mexico.
|
45
|
Ma M, Ru Y, Chuang LS, Hsu NY, Shi LS, Hakenberg J, Cheng WY, Uzilov A, Ding W, Glicksberg BS, Chen R. Disease-associated variants in different categories of disease located in distinct regulatory elements. BMC Genomics 2015; 16 Suppl 8:S3. [PMID: 26110593 PMCID: PMC4480828 DOI: 10.1186/1471-2164-16-s8-s3] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Indexed: 12/16/2022] Open
Abstract
Background: The invention of high-throughput sequencing technologies has led to the discovery of hundreds of thousands of genetic variants associated with thousands of human diseases. Many of these genetic variants are located outside the protein-coding regions, and as such, it is challenging to interpret the function of these genetic variants by traditional genetic approaches. Recent genome-wide functional genomics studies, such as FANTOM5 and ENCODE, have uncovered a large number of regulatory elements across hundreds of different tissues or cell lines in the human genome. These findings provide an opportunity to study the interaction between regulatory elements and disease-associated genetic variants. Identifying these disease-related regulatory elements will shed light on the mechanisms by which these variants regulate gene expression and ultimately result in disease formation and progression. Results: In this study, we curated and categorized 27,558 Mendelian disease variants, 20,964 complex disease variants, 5,809 cancer predisposing germline variants, and 43,364 recurrent cancer somatic mutations. Compared against nine different types of regulatory regions from the FANTOM5 and ENCODE projects, we found that different types of disease variants show distinctive propensity for particular regulatory elements. Mendelian disease variants and recurrent cancer somatic mutations are 22-fold and 10-fold significantly enriched in promoter regions, respectively (q<0.001), compared with allele-frequency-matched genomic background. Separate from these two categories, cancer predisposing germline variants are 27-fold enriched in histone modification regions (q<0.001), 10-fold enriched in chromatin physical interaction regions (q<0.001), and 6-fold enriched in transcription promoters (q<0.001). Furthermore, Mendelian disease variants and recurrent cancer somatic mutations share very similar distributions across types of functional effects.
We further found that regulatory regions are located within over 50% of coding exon regions. Transcription promoters, methylation regions, and transcription insulators have the highest density of disease variants, with 472, 239, and 72 disease variants per one million base pairs, respectively. Conclusions: Disease-associated variants in different disease categories are preferentially located in particular regulatory elements. These results will be useful for an overall understanding of the differences among the pathogenic mechanisms of various disease-associated variants.
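To make the two kinds of figures in the abstract above concrete (variants per one million base pairs, and fold enrichment against a background), here is a minimal Python sketch. It assumes the simplest textbook definitions, a ratio of in-region fractions for enrichment and a length-normalised count for density; the authors' exact statistics and their allele-frequency matching are not reproduced here. The function names and all numbers are hypothetical.

```python
def variants_per_megabase(n_variants: int, region_length_bp: int) -> float:
    """Variant density expressed per one million base pairs."""
    if region_length_bp <= 0:
        raise ValueError("region length must be positive")
    return n_variants * 1_000_000 / region_length_bp


def fold_enrichment(in_region: int, total: int, bg_in_region: int, bg_total: int) -> float:
    """Ratio of the observed in-region fraction to the background in-region fraction."""
    if total <= 0 or bg_total <= 0 or bg_in_region <= 0:
        raise ValueError("totals and background hits must be positive")
    return (in_region / total) / (bg_in_region / bg_total)


# Hypothetical numbers for illustration only (not taken from the paper).
density = variants_per_megabase(n_variants=150, region_length_bp=500_000)
fold = fold_enrichment(in_region=30, total=200, bg_in_region=50, bg_total=1000)
print(f"{density:.0f} variants per Mb, {fold:.1f}-fold enrichment")
```

Normalising by region length matters because regulatory element classes differ widely in total genomic footprint, so raw variant counts alone would not be comparable across them.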
|