1
Reshetnikov E, Churnosova M, Reshetnikova Y, Stepanov V, Bocharova A, Serebrova V, Trifonova E, Ponomarenko I, Sorokina I, Efremova O, Orlova V, Batlutskaya I, Ponomarenko M, Churnosov V, Aristova I, Polonikov A, Churnosov M. Maternal Age at Menarche Genes Determines Fetal Growth Restriction Risk. Int J Mol Sci 2024; 25:2647. [PMID: 38473894 DOI: 10.3390/ijms25052647] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2023] [Revised: 02/06/2024] [Accepted: 02/14/2024] [Indexed: 03/14/2024]
Abstract
We aimed to explore the potential link between maternal age at menarche (mAAM) gene polymorphisms and the risk of fetal growth restriction (FGR). This case-control study (FGR vs. FGR-free) included 904 women (273 with FGR and 631 controls) in the third trimester of gestation who were examined or treated in the Departments of Obstetrics. For multiplex genotyping of single nucleotide polymorphisms (SNPs), 50 candidate mAAM loci were chosen. The relationship between mAAM SNPs and FGR was assessed by regression procedures (logistic regression and model-based multifactor dimensionality reduction [MB-MDR]), followed by in silico assessment of the presumed functional significance of FGR-related loci. Three mAAM-associated loci were linked to FGR: KISS1 (rs7538038) (effect allele G; allelic odds ratio (OR) = 0.63, pperm = 0.0003; additive OR = 0.61, pperm = 0.001; dominant OR = 0.56, pperm = 0.001), NKX2-1 (rs999460) (effect allele A; allelic OR = 1.37, pperm = 0.003; additive OR = 1.45, pperm = 0.002; recessive OR = 2.41, pperm = 0.0002), and GPRC5B (rs12444979) (effect allele T; allelic OR = 1.67, pperm = 0.0003; dominant OR = 1.59, pperm = 0.011; additive OR = 1.56, pperm = 0.009). The ACA haplotype of the FSHB gene (rs555621*rs11031010*rs1782507) was correlated with FGR (OR = 0.71, pperm = 0.05). Ten FGR-associated interaction models were found for 13 SNPs (pperm ≤ 0.001). Interactions involving rs999460 NKX2-1 and rs12444979 GPRC5B significantly influenced FGR risk (these SNPs were present in 50% of the models). The 15 FGR-related mAAM-associated polymorphic variants and 350 linked SNPs were functionally significant in relation to 39 genes participating in the regulation of hormone levels, the ovulation cycle process, male gonad development and vitamin D metabolism. Thus, this study showed, for the first time, that mAAM-associated genes determine FGR risk.
Affiliation(s)
- Evgeny Reshetnikov
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia
- Maria Churnosova
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia
- Yuliya Reshetnikova
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia
- Vadim Stepanov
- Research Institute for Medical Genetics, Tomsk National Research Medical Center of the Russian Academy of Sciences, 634050 Tomsk, Russia
- Anna Bocharova
- Research Institute for Medical Genetics, Tomsk National Research Medical Center of the Russian Academy of Sciences, 634050 Tomsk, Russia
- Victoria Serebrova
- Research Institute for Medical Genetics, Tomsk National Research Medical Center of the Russian Academy of Sciences, 634050 Tomsk, Russia
- Ekaterina Trifonova
- Research Institute for Medical Genetics, Tomsk National Research Medical Center of the Russian Academy of Sciences, 634050 Tomsk, Russia
- Irina Ponomarenko
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia
- Inna Sorokina
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia
- Olga Efremova
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia
- Valentina Orlova
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia
- Irina Batlutskaya
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia
- Marina Ponomarenko
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia
- Vladimir Churnosov
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia
- Inna Aristova
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia
- Alexey Polonikov
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia
- Department of Biology, Medical Genetics and Ecology and Research Institute for Genetic and Molecular Epidemiology, Kursk State Medical University, 305041 Kursk, Russia
- Mikhail Churnosov
- Department of Medical Biological Disciplines, Belgorod State National Research University, 308015 Belgorod, Russia
2
San Valentin EMD, Do KA, Yeung SCJ, Reyes-Gibby CC. Attempts to Understand Oral Mucositis in Head and Neck Cancer Patients through Omics Studies: A Narrative Review. Int J Mol Sci 2023; 24:16995. [PMID: 38069314 PMCID: PMC10706892 DOI: 10.3390/ijms242316995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Revised: 11/27/2023] [Accepted: 11/29/2023] [Indexed: 12/18/2023]
Abstract
Oral mucositis (OM) is a common and clinically impactful side effect of cytotoxic cancer treatment, particularly in patients with head and neck squamous cell carcinoma (HNSCC) who undergo radiotherapy with or without concomitant chemotherapy. The etiology and pathogenic mechanisms of OM are complex, multifaceted and elicit both direct and indirect damage to the mucosa. In this narrative review, we describe studies that use various omics methodologies (genomics, transcriptomics, microbiomics and metabolomics) in attempts to elucidate the biological pathways associated with the development or severity of OM. Integrating different omics into multi-omics approaches carries the potential to discover links among host factors (genomics), host responses (transcriptomics, metabolomics), and the local environment (microbiomics).
Affiliation(s)
- Erin Marie D. San Valentin
- Department of Emergency Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Department of Interventional Radiology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Kim-Anh Do
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Sai-Ching J. Yeung
- Department of Emergency Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Cielito C. Reyes-Gibby
- Department of Emergency Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
3
Ayesiga SB, Rubaihayo P, Oloka BM, Dramadri IO, Sserumaga JP. Genome-wide association study and pathway analysis to decipher loci associated with Fusarium ear rot resistance in tropical maize germplasm. Genetic Resources and Crop Evolution 2023; 71:2435-2448. [PMID: 39026943 PMCID: PMC11252232 DOI: 10.1007/s10722-023-01793-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2023] [Accepted: 10/25/2023] [Indexed: 07/20/2024]
Abstract
Breeding for host resistance is the most efficient and environmentally safe method to curb the spread of Fusarium ear rot (FER). However, conventional breeding for resistance to FER is hampered by the complex polygenic nature of this trait, which is highly influenced by environmental conditions. This study aimed to identify genomic regions, single nucleotide polymorphisms (SNPs), and putative candidate genes associated with FER resistance as well as candidate metabolic pathways and pathway genes involved in it. A panel of 151 tropical inbred maize lines was used to assess the genetic architecture of FER resistance over two seasons. During the study period, seven SNPs associated with FER resistance were identified on chromosomes 1, 2, 4, 5, and 9, accounting for 4-11% of the phenotypic variance. These significant markers were annotated into four genes. Seven significant metabolic pathways involved in FER resistance were identified using the Pathway Association Study Tool, the most significant being the superpathway of the glyoxylate cycle. Overall, this study confirmed that resistance to FER is indeed a complex mechanism controlled by several small to medium-effect loci. Our findings may contribute to fast-tracking the efforts to develop disease-resistant maize lines through marker-assisted selection. Supplementary Information: The online version contains supplementary material available at 10.1007/s10722-023-01793-4.
Affiliation(s)
- Stella Bigirwa Ayesiga
- Department of Agricultural Production, College of Agriculture and Environmental Sciences, Makerere University, P. O. Box 7062, Kampala, Uganda
- National Livestock Resources Research Institute, National Agricultural Research Organization, PO Box 5704, Kampala, Uganda
- Patrick Rubaihayo
- Department of Agricultural Production, College of Agriculture and Environmental Sciences, Makerere University, P. O. Box 7062, Kampala, Uganda
- Bonny Michael Oloka
- Department of Horticultural Sciences, North Carolina State University, Raleigh, NC USA
- Isaac Ozinga Dramadri
- Department of Agricultural Production, College of Agriculture and Environmental Sciences, Makerere University, P. O. Box 7062, Kampala, Uganda
- Julius Pyton Sserumaga
- National Livestock Resources Research Institute, National Agricultural Research Organization, PO Box 5704, Kampala, Uganda
4
Wu MJ, Yu DD, Du YQ, Zhang J, Su MZ, Jiang CS, Guo YW. Further undescribed cembranoids from South China Sea soft coral Sarcophyton ehrenbergi: Structural elucidation and biological evaluation. Phytochemistry 2023; 206:113549. [PMID: 36481314 DOI: 10.1016/j.phytochem.2022.113549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2022] [Revised: 11/25/2022] [Accepted: 12/03/2022] [Indexed: 06/17/2023]
Abstract
A detailed chemical investigation of the South China Sea soft coral Sarcophyton ehrenbergi has yielded seven undescribed cembranoids, namely isoehrenbergol D and sarcoehrenolides F-K, embodying a rare α,β-unsaturated-lactone moiety at C-6 to C-19, along with two known related compounds, ehrenbergol D and sarcoehrenolide A. Their structures and absolute configurations were unambiguously established by extensive spectroscopic data analysis, the modified Mosher's method, X-ray diffraction analysis, and quantum chemical computation. In a bioassay for α-glucosidase inhibition, ehrenbergol D was identified as an α-glucosidase inhibitor with an IC50 value of 13.57 μM.
Affiliation(s)
- Meng-Jun Wu
- Collaborative Innovation Center of Yangtze River Delta Region Green Pharmaceuticals and College of Pharmaceutical Science, Zhejiang University of Technology, Hangzhou, 310014, China; State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zu Chong Zhi Road, Zhangjiang Hi-Tech Park, Shanghai, 201203, China
- Dan-Dan Yu
- College of Pharmacy, Guangdong Medical University, Zhanjiang, Guangdong, 524023, PR China; Shandong Laboratory of Yantai Drug Discovery, Bohai Rim Advanced Research Institute for Drug Discovery, Yantai, Shandong, 264117, China
- Ye-Qing Du
- State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zu Chong Zhi Road, Zhangjiang Hi-Tech Park, Shanghai, 201203, China
- Juan Zhang
- School of Biological Science and Technology, University of Jinan, Jinan, 250022, China
- Ming-Zhi Su
- Shandong Laboratory of Yantai Drug Discovery, Bohai Rim Advanced Research Institute for Drug Discovery, Yantai, Shandong, 264117, China.
- Cheng-Shi Jiang
- School of Biological Science and Technology, University of Jinan, Jinan, 250022, China.
- Yue-Wei Guo
- Collaborative Innovation Center of Yangtze River Delta Region Green Pharmaceuticals and College of Pharmaceutical Science, Zhejiang University of Technology, Hangzhou, 310014, China; State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zu Chong Zhi Road, Zhangjiang Hi-Tech Park, Shanghai, 201203, China; College of Pharmacy, Guangdong Medical University, Zhanjiang, Guangdong, 524023, PR China; Shandong Laboratory of Yantai Drug Discovery, Bohai Rim Advanced Research Institute for Drug Discovery, Yantai, Shandong, 264117, China.
5
Kravatsky YV, Chechetkin VR, Tchurikov NA, Kravatskaya GI. Genome-Wide Study of Colocalization between Genomic Stretches: A Method and Applications to the Regulation of Gene Expression. Biology 2022; 11:1422. [PMID: 36290327 PMCID: PMC9598420 DOI: 10.3390/biology11101422] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2022] [Revised: 09/25/2022] [Accepted: 09/26/2022] [Indexed: 06/16/2023]
Abstract
In this paper, we describe a method for the study of colocalization effects between stretch-stretch and stretch-point genome tracks based on a set of indices varying within the (-1, +1) interval. The indices combine the distances between the centers of neighboring stretches and their lengths. The extreme boundaries of the interval correspond to the complete colocalization of the genome tracks or its complete absence. We also obtained the relevant criteria of statistical significance for such indices using the complete permutation test. The method is robust with respect to strongly inhomogeneous positioning and length distribution of the genome tracks. On the basis of this approach, we created command-line software, the Genome Track Colocalization Analyzer (GTCA). The software was tested, compared with other available packages, and applied to particular problems related to gene expression. GTCA is freely available to users. It complements our previous software, the Genome Track Analyzer, intended for the search for pairwise correlations between point-like genome tracks (also freely available). The corresponding details are provided in the Data Availability Statement at the end of the text.
Affiliation(s)
- Yuri V. Kravatsky
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Vavilov Str., 32, 119991 Moscow, Russia
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119991 Moscow, Russia
- Vladimir R. Chechetkin
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Vavilov Str., 32, 119991 Moscow, Russia
- Nickolai A. Tchurikov
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Vavilov Str., 32, 119991 Moscow, Russia
- Galina I. Kravatskaya
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Vavilov Str., 32, 119991 Moscow, Russia
6
Chimusa ER, Dalvie S, Dandara C, Wonkam A, Mazandu GK. Post genome-wide association analysis: dissecting computational pathway/network-based approaches. Brief Bioinform 2020; 20:690-700. [PMID: 29701762 DOI: 10.1093/bib/bby035] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 01/02/2018] [Revised: 04/04/2018] [Indexed: 02/02/2023]
Abstract
Thousands of genetic associations with diseases have been identified by genome-wide association studies (GWASs), which are conceptually a single-marker-based approach. There are potentially many uses of these identified variants, including a better understanding of the pathogenesis of diseases, new leads for studying underlying risk prediction and clinical prediction of treatment. However, because of inadequate power, GWAS might miss disease genes and/or pathways with weak genetic or strong epistatic effects. Driven by the need to extract useful information from GWAS summary statistics, post-GWAS approaches (PGAs) were introduced. Here, we dissect and discuss advances made in pathway/network-based PGAs, with a particular focus on protein-protein interaction networks that leverage GWAS summary statistics by combining effects of multiple loci, subnetworks or pathways to detect genetic signals associated with complex diseases. We conclude with a discussion of research areas where further work on summary statistic-based methods is needed.
Affiliation(s)
- Emile R Chimusa
- Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Level 3, Wernher and Beit North, Private Bag, Rondebosch, 7700, Anzio road, Observatory Cape Town, South Africa
- Shareefa Dalvie
- Department of Psychiatry and Mental Health, University of Cape Town, Observatory, 7925, Cape Town, South Africa
- Collet Dandara
- Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Private Bag, Rondebosch, 7700, Cape Town, South Africa
- Ambroise Wonkam
- Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Private Bag, Rondebosch, 7700, Cape Town, South Africa
- Gaston K Mazandu
- Division of Human Genetics, Department of Pathology, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Private Bag, Rondebosch, 7700, Cape Town, South Africa; African Institute for Mathematical Sciences, 7945 Muizenberg, Cape Town, South Africa and Computational Biology Division, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, University of Cape Town, Medical School, Anzio Road, Observatory, 7925, Cape Town, South Africa
7
Identification of additional loci associated with antibody response to Mycobacterium avium ssp. Paratuberculosis in cattle by GSEA-SNP analysis. Mamm Genome 2017; 28:520-527. [PMID: 28864882 DOI: 10.1007/s00335-017-9714-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 01/21/2017] [Accepted: 08/27/2017] [Indexed: 10/18/2022]
Abstract
Mycobacterium avium subsp. paratuberculosis (MAP) causes a contagious chronic infection that results in Johne's disease in a wide range of animal species, including cattle. Several genome-wide association studies (GWAS) have been carried out to identify loci putatively associated with MAP susceptibility by testing each marker separately and identifying SNPs that show a significant association with the phenotype, while SNPs with modest effects are usually ignored. The objective of this study was to identify modest-effect genes associated with MAP susceptibility using a pathway-based approach. The Illumina BovineSNP50 BeadChip was used to genotype 966 Holstein cows, 483 positive and 483 negative for antibody response to MAP. Data were then analyzed using a novel SNP-based Gene Set Enrichment Analysis (GSEA-SNP) and validated with the Adaptive Rank Truncated Product methodology. An allele-based test was carried out to estimate the statistical association of each marker with the phenotype. Subsequently, SNPs were mapped to the closest genes, considering for each gene the single variant with the highest value within a window of 50 kb, and pathway statistics were then tested using the GSEA-SNP method. The GO biological process "embryogenesis and morphogenesis" was most highly associated with antibody response to MAP. Within this pathway, five genes code for proteins that play a role in the immune defense relevant to the response to bacterial infection. The immune response genes identified would not have been considered using a standard GWAS, thus demonstrating that the pathway approach can extend the interpretation of genome-wide association analyses and identify additional candidate genes for target traits.
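The SNP-to-gene mapping step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' GSEA-SNP pipeline: the `Gene` and `Snp` records, the function names, and the toy coordinates are hypothetical, and the sketch assumes that "closest gene" means the smallest distance to the gene boundaries (zero inside the gene) and that "highest value" means the largest association statistic.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

WINDOW_BP = 50_000  # 50 kb window, the only parameter taken from the abstract


@dataclass(frozen=True)
class Gene:  # hypothetical record for illustration
    name: str
    chrom: str
    start: int
    end: int


@dataclass(frozen=True)
class Snp:  # hypothetical record for illustration
    rsid: str
    chrom: str
    pos: int
    stat: float  # association statistic from the single-marker test


def distance_to_gene(snp: Snp, gene: Gene) -> Optional[int]:
    """Distance in bp from a SNP to a gene; 0 inside the gene, None on another chromosome."""
    if snp.chrom != gene.chrom:
        return None
    if gene.start <= snp.pos <= gene.end:
        return 0
    return min(abs(snp.pos - gene.start), abs(snp.pos - gene.end))


def best_snp_per_gene(snps: List[Snp], genes: List[Gene],
                      window: int = WINDOW_BP) -> Dict[str, Snp]:
    """Assign each SNP to its closest gene within the window, keeping the top SNP per gene."""
    best: Dict[str, Snp] = {}
    for snp in snps:
        closest: Optional[Gene] = None
        closest_dist: Optional[int] = None
        for gene in genes:
            dist = distance_to_gene(snp, gene)
            if dist is None or dist > window:
                continue  # different chromosome or outside the window
            if closest_dist is None or dist < closest_dist:
                closest, closest_dist = gene, dist
        if closest is None:
            continue  # SNP is not within the window of any gene
        current = best.get(closest.name)
        if current is None or snp.stat > current.stat:
            best[closest.name] = snp
    return best


# Toy data (made up) to show the call shape.
toy_genes = [Gene("GENE_A", "1", 100_000, 120_000),
             Gene("GENE_B", "1", 300_000, 310_000)]
toy_snps = [Snp("rs1", "1", 110_000, 2.0), Snp("rs2", "1", 160_000, 5.0),
            Snp("rs3", "1", 250_000, 3.0), Snp("rs4", "1", 400_000, 9.0),
            Snp("rs5", "2", 110_000, 7.0)]
toy_result = best_snp_per_gene(toy_snps, toy_genes)
```

Keeping one representative SNP per gene means each gene contributes a single value to the downstream pathway statistic, whatever the number of markers near it.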
8
Brodie A, Azaria JR, Ofran Y. How far from the SNP may the causative genes be? Nucleic Acids Res 2016; 44:6046-54. [PMID: 27269582 PMCID: PMC5291268 DOI: 10.1093/nar/gkw500] [Citation(s) in RCA: 109] [Impact Index Per Article: 13.6] [Received: 09/03/2015] [Revised: 05/20/2016] [Accepted: 05/22/2016] [Indexed: 02/03/2023]
Abstract
While GWAS identify many disease-associated SNPs, using them to decipher disease mechanisms is hindered by the difficulty in mapping SNPs to genes. Most SNPs are in non-coding regions and it is often hard to identify the genes they implicate. To explore how far the SNP may be from the affected genes we used a pathway-based approach. We found that affected genes are often up to 2 Mbps away from the associated SNP, and are not necessarily the closest genes to the SNP. Existing approaches for mapping SNPs to genes leave many SNPs unmapped to genes and reveal only 86 significant phenotype-pathway associations for all known GWAS hits combined. Using the pathway-based approach we propose here allows mapping of virtually all SNPs to genes and reveals 435 statistically significant phenotype-pathway associations. In search for mechanisms that may explain the relationships between SNPs and distant genes, we found that SNPs that are mapped to distant genes have significantly more large insertions/deletions around them than other SNPs, suggesting that these SNPs may sometimes be markers for large insertions/deletions that may affect large genomic regions.
Affiliation(s)
- Aharon Brodie
- The Goodman faculty of life sciences, Nanotechnology building, Bar Ilan University, Ramat Gan 52900, Israel
- Johnathan Roy Azaria
- The Goodman faculty of life sciences, Nanotechnology building, Bar Ilan University, Ramat Gan 52900, Israel
- Yanay Ofran
- The Goodman faculty of life sciences, Nanotechnology building, Bar Ilan University, Ramat Gan 52900, Israel
9
Brossard M, Fang S, Vaysse A, Wei Q, Chen WV, Mohamdi H, Maubec E, Lavielle N, Galan P, Lathrop M, Avril MF, Lee JE, Amos CI, Demenais F. Integrated pathway and epistasis analysis reveals interactive effect of genetic variants at TERF1 and AFAP1L2 loci on melanoma risk. Int J Cancer 2015; 137:1901-1909. [PMID: 25892537 DOI: 10.1002/ijc.29570] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 12/17/2014] [Revised: 03/12/2015] [Accepted: 03/30/2015] [Indexed: 12/18/2022]
Abstract
Genome-wide association studies (GWASs) have characterized 13 loci associated with melanoma, which only account for a small part of melanoma risk. To identify new genes with too small an effect to be detected individually but which collectively influence melanoma risk and/or show interactive effects, we used a two-step analysis strategy including pathway analysis of genome-wide SNP data, in a first step, and epistasis analysis within significant pathways, in a second step. Pathway analysis, using the gene-set enrichment analysis (GSEA) approach and the gene ontology (GO) database, was applied to the outcomes of MELARISK (3,976 subjects) and MDACC (2,827 subjects) GWASs. Cross-gene SNP-SNP interaction analysis within melanoma-associated GOs was performed using the INTERSNP software. Five GO categories were significantly enriched in genes associated with melanoma (false discovery rate ≤ 5% in both studies): response to light stimulus, regulation of mitotic cell cycle, induction of programmed cell death, cytokine activity and oxidative phosphorylation. Epistasis analysis, within each of the five significant GOs, showed significant evidence for interaction for one SNP pair at TERF1 and AFAP1L2 loci (pmeta-int = 2.0 × 10⁻⁷, which met both the pathway and overall multiple-testing corrected thresholds that are equal to 9.8 × 10⁻⁷ and 2.0 × 10⁻⁷, respectively) and suggestive evidence for another pair involving correlated SNPs at the same loci (pmeta-int = 3.6 × 10⁻⁶). This interaction has important biological relevance given the key role of TERF1 in telomere biology and the reported physical interaction between TERF1 and AFAP1L2 proteins. This finding brings a novel piece of evidence for the emerging role of telomere dysfunction in melanoma development.
Affiliation(s)
- Myriam Brossard
- INSERM, Genetic Variation and Human Diseases Unit, UMR-946, Paris, France.,Université Paris Diderot, Sorbonne Paris Cité, Institut Universitaire d'Hématologie, Paris, France
- Shenying Fang
- Department of Surgical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Amaury Vaysse
- INSERM, Genetic Variation and Human Diseases Unit, UMR-946, Paris, France.,Université Paris Diderot, Sorbonne Paris Cité, Institut Universitaire d'Hématologie, Paris, France
- Qingyi Wei
- Duke Cancer Institute, Duke University Medical center and Department of Medicine, Duke University School of Medicine, Durham, NC, USA
- Wei V Chen
- Laboratory Informatics System, Department of Clinical Applications & Support, The University of Texas M. D. Anderson Cancer Center, Houston, TX, USA
- Hamida Mohamdi
- INSERM, Genetic Variation and Human Diseases Unit, UMR-946, Paris, France.,Université Paris Diderot, Sorbonne Paris Cité, Institut Universitaire d'Hématologie, Paris, France
- Eve Maubec
- INSERM, Genetic Variation and Human Diseases Unit, UMR-946, Paris, France.,Université Paris Diderot, Sorbonne Paris Cité, Institut Universitaire d'Hématologie, Paris, France.,AP-HP (Assistance Publique-Hôpitaux de Paris), Hôpital Bichat, Service de Dermatologie, Université Paris Diderot, Paris, France
- Nolwenn Lavielle
- INSERM, Genetic Variation and Human Diseases Unit, UMR-946, Paris, France.,Université Paris Diderot, Sorbonne Paris Cité, Institut Universitaire d'Hématologie, Paris, France
- Pilar Galan
- INSERM, UMR U557; Institut national de la Recherche Agronomique,U1125; Conservatoire national des arts et métiers, Centre de Recherche en Nutrition Humaine, Ile de France, Bobigny, France
- Mark Lathrop
- McGill University and Genome Quebec Innovation Centre, Montreal, Quebec, Canada
- Jeffrey E Lee
- Department of Surgical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Christopher I Amos
- Department of Community and Family Medicine, Geisel College of Medicine, Dartmouth College, Hanover, New Hampshire, USA
- Florence Demenais
- INSERM, Genetic Variation and Human Diseases Unit, UMR-946, Paris, France.,Université Paris Diderot, Sorbonne Paris Cité, Institut Universitaire d'Hématologie, Paris, France
10
JAG: A Computational Tool to Evaluate the Role of Gene-Sets in Complex Traits. Genes (Basel) 2015; 6:238-51. [PMID: 26110313 PMCID: PMC4488663 DOI: 10.3390/genes6020238] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 11/20/2014] [Accepted: 04/27/2015] [Indexed: 12/25/2022]
Abstract
Gene-set analysis has been proposed as a powerful tool to deal with the highly polygenic architecture of complex traits, as well as with the small effect sizes typically found in GWAS studies for complex traits. We developed a tool, Joint Association of Genetic variants (JAG), which can be applied to Genome Wide Association (GWA) data and tests for the joint effect of all single nucleotide polymorphisms (SNPs) located in a user-specified set of genes or biological pathway. JAG assigns SNPs to genes and incorporates self-contained and/or competitive tests for gene-set analysis. JAG uses permutation to evaluate gene-set significance, which implicitly controls for linkage disequilibrium, sample size, gene size, the number of SNPs per gene and the number of genes in the gene-set. We conducted a power analysis using the Wellcome Trust Case Control Consortium (WTCCC) Crohn’s disease data set and show that JAG correctly identifies validated gene-sets for Crohn’s disease and has more power than currently available tools for gene-set analysis. JAG is a powerful, novel tool for gene-set analysis, and can be freely downloaded from the CTG Lab website.
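A self-contained permutation test of the kind this abstract describes can be sketched as follows. This is an illustrative assumption, not JAG's actual implementation: the set statistic (the sum over SNPs of the absolute case-control difference in mean allele dosage), the function names, and the toy data are all hypothetical. Shuffling phenotype labels while leaving each individual's genotypes intact lets the null distribution retain the correlation between SNPs.

```python
import random
from typing import List, Sequence


def set_statistic(genotypes: Sequence[Sequence[float]],
                  phenotype: Sequence[int]) -> float:
    """Sum over SNPs of |mean dosage in cases - mean dosage in controls|.

    genotypes[s][i] is the allele dosage of SNP s in individual i;
    phenotype[i] is 1 for a case and 0 for a control.
    Assumes at least one case and one control.
    """
    cases = [i for i, y in enumerate(phenotype) if y == 1]
    controls = [i for i, y in enumerate(phenotype) if y == 0]
    total = 0.0
    for snp in genotypes:
        mean_case = sum(snp[i] for i in cases) / len(cases)
        mean_ctrl = sum(snp[i] for i in controls) / len(controls)
        total += abs(mean_case - mean_ctrl)
    return total


def permutation_p_value(genotypes: Sequence[Sequence[float]],
                        phenotype: Sequence[int],
                        n_perm: int = 1000,
                        seed: int = 0) -> float:
    """Empirical p-value for the gene-set statistic under label permutation."""
    rng = random.Random(seed)
    observed = set_statistic(genotypes, phenotype)
    shuffled: List[int] = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # permute labels; genotype columns stay intact
        if set_statistic(genotypes, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Toy gene set (made up): two SNPs, ten cases followed by ten controls.
toy_phenotype = [1] * 10 + [0] * 10
toy_genotypes = [[2] * 10 + [0] * 10,
                 [2] * 10 + [0] * 10]
toy_p = permutation_p_value(toy_genotypes, toy_phenotype)
```

Because the same permuted labels are applied to every SNP in the set, the number of SNPs and their mutual correlation affect the observed and the null statistics alike.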
11
Benavides MV, Sonstegard TS, Kemp S, Mugambi JM, Gibson JP, Baker RL, Hanotte O, Marshall K, Van Tassell C. Identification of novel loci associated with gastrointestinal parasite resistance in a Red Maasai x Dorper backcross population. PLoS One 2015; 10:e0122797. [PMID: 25867089 PMCID: PMC4395112 DOI: 10.1371/journal.pone.0122797] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.7] [Received: 07/16/2014] [Accepted: 02/21/2015] [Indexed: 12/14/2022]
Abstract
Gastrointestinal (GI) parasitic infection is the main health constraint for small ruminant production, causing loss of weight and/or death. Red Maasai sheep have adapted to a tropical environment where extreme parasite exposure is a constant, especially with highly pathogenic Haemonchus contortus. This breed has been reported to be resistant to gastrointestinal parasite infection; hence, it is considered an invaluable resource to study associations between host genetics and resistance. The aim of this study was to identify polymorphisms strongly associated with host resistance in a double backcross population derived from Red Maasai and Dorper sheep using a SNP-based GWAS analysis. The animals that were genotyped represented the most resistant and susceptible individuals based on the tails of the phenotypic distribution (10% each) for average faecal egg counts (AVFEC). AVFEC, packed cell volume (AVPCV), and live weight (AVLWT) were adjusted for fixed effects and co-variables, and an association analysis was run using EMMAX. Revised significance levels were calculated using 100,000 permutation tests. The top five significant SNP markers with -log10 p-values > 3.794 were observed on five different chromosomes for AVFEC, and BLUPPf90/PostGSf90 results confirmed EMMAX significant regions for this trait. One of these regions included a cluster of significant SNPs on chromosome (Chr) 6 not in linkage disequilibrium with each other. This genomic location contains annotated genes involved in cytokine signalling, haemostasis and mucus biosynthesis. Only one association detected on Chr 7 was significant for both AVPCV and AVLWT. The results generated here reveal candidate immune variants for genes involved in differential response to infection and provide additional SNP marker information that has potential to aid selection for resistance to gastrointestinal parasites in sheep of a similar genetic background to the double backcross population.
Affiliation(s)
- Tad S. Sonstegard
- Animal Genomics & Improvement Laboratory, USDA/ARS/Beltsville Agricultural Research Center, Beltsville, MD, United States of America
- Stephen Kemp
- Animal Biosciences, The International Livestock Research Institute (ILRI), Nairobi, Kenya
- John M. Mugambi
- National Veterinary Research Centre, Kenya Agricultural Research Institute (KARI), Muguga, Kenya
- John P. Gibson
- Centre for Genetic Analysis and Applications, University of New England, Armidale, NSW, Australia
- Olivier Hanotte
- Medicine & Health Sciences, The University of Nottingham, Nottingham, United Kingdom
- Karen Marshall
- Animal Biosciences, The International Livestock Research Institute (ILRI), Nairobi, Kenya
- Curtis Van Tassell
- Animal Genomics & Improvement Laboratory, USDA/ARS/Beltsville Agricultural Research Center, Beltsville, MD, United States of America
|
12
|
Jin L, Zuo XY, Su WY, Zhao XL, Yuan MQ, Han LZ, Zhao X, Chen YD, Rao SQ. Pathway-based analysis tools for complex diseases: a review. GENOMICS PROTEOMICS & BIOINFORMATICS 2014; 12:210-20. [PMID: 25462153 PMCID: PMC4411419 DOI: 10.1016/j.gpb.2014.10.002] [Citation(s) in RCA: 90] [Impact Index Per Article: 9.0] [Received: 06/21/2014] [Revised: 08/30/2014] [Accepted: 09/04/2014] [Indexed: 11/23/2022]
Abstract
Genetic studies are traditionally based on single-gene analysis. The use of these analyses can pose tremendous challenges for elucidating complicated genetic interplays involved in complex human diseases. Modern pathway-based analysis provides a technique, which allows a comprehensive understanding of the molecular mechanisms underlying complex diseases. Extensive studies utilizing the methods and applications for pathway-based analysis have significantly advanced our capacity to explore large-scale omics data, which has rapidly accumulated in biomedical fields. This article is a comprehensive review of the pathway-based analysis methods—the powerful methods with the potential to uncover the biological depths of the complex diseases. The general concepts and procedures for the pathway-based analysis methods are introduced and then, a comprehensive review of the major approaches for this analysis is presented. In addition, a list of available pathway-based analysis software and databases is provided. Finally, future directions and challenges for the methodological development and applications of pathway-based analysis techniques are discussed. This review will provide a useful guide to dissect complex diseases.
Affiliation(s)
- Lv Jin
- Institute for Medical Systems Biology, and Department of Medical Statistics and Epidemiology, School of Public Health, Guangdong Medical College, Dongguan 523808, China
- Xiao-Yu Zuo
- Department of Medical Statistics and Epidemiology, School of Public Health, Sun Yat-Sen University, Guangzhou 510080, China
- Wei-Yang Su
- Community Health Service Management Center of Panyu District, Guangzhou 511400, China
- Xiao-Lei Zhao
- Institute for Medical Systems Biology, and Department of Medical Statistics and Epidemiology, School of Public Health, Guangdong Medical College, Dongguan 523808, China
- Man-Qiong Yuan
- Department of Statistical Sciences, School of Mathematics and Computational Science, Sun Yat-Sen University, Guangzhou 510275, China
- Li-Zhen Han
- Department of Medical Statistics and Epidemiology, School of Public Health, Sun Yat-Sen University, Guangzhou 510080, China
- Xiang Zhao
- Institute for Medical Systems Biology, and Department of Medical Statistics and Epidemiology, School of Public Health, Guangdong Medical College, Dongguan 523808, China
- Ye-Da Chen
- Institute for Medical Systems Biology, and Department of Medical Statistics and Epidemiology, School of Public Health, Guangdong Medical College, Dongguan 523808, China
- Shao-Qi Rao
- Institute for Medical Systems Biology, and Department of Medical Statistics and Epidemiology, School of Public Health, Guangdong Medical College, Dongguan 523808, China; Department of Medical Statistics and Epidemiology, School of Public Health, Sun Yat-Sen University, Guangzhou 510080, China; Department of Statistical Sciences, School of Mathematics and Computational Science, Sun Yat-Sen University, Guangzhou 510275, China.
|
13
|
Zhao J, Zhu Y, Boerwinkle E, Xiong M. Pathway analysis with next-generation sequencing data. Eur J Hum Genet 2014; 23:507-15. [PMID: 24986826 DOI: 10.1038/ejhg.2014.121] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 06/14/2013] [Revised: 03/29/2014] [Accepted: 04/26/2014] [Indexed: 12/21/2022]
Abstract
Although pathway analysis methods have been developed and successfully applied to association studies of common variants, the statistical methods for pathway-based association analysis of rare variants have not been well developed. Many investigators observed highly inflated false-positive rates and low power in pathway-based tests of association of rare variants. The inflated false-positive rates and low true-positive rates of the current methods are mainly due to their lack of ability to account for gametic phase disequilibrium. To overcome these serious limitations, we develop a novel statistic that is based on the smoothed functional principal component analysis (SFPCA) for pathway association tests with next-generation sequencing data. The developed statistic has the ability to capture position-level variant information and account for gametic phase disequilibrium. By intensive simulations, we demonstrate that the SFPCA-based statistic for testing pathway association with either rare or common or both rare and common variants has the correct type 1 error rates. Also the power of the SFPCA-based statistic and 22 additional existing statistics are evaluated. We found that the SFPCA-based statistic has a much higher power than other existing statistics in all the scenarios considered. To further evaluate its performance, the SFPCA-based statistic is applied to pathway analysis of exome sequencing data in the early-onset myocardial infarction (EOMI) project. We identify three pathways significantly associated with EOMI after the Bonferroni correction. In addition, our preliminary results show that the SFPCA-based statistic has much smaller P-values to identify pathway association than other existing methods.
Affiliation(s)
- Jinying Zhao
- Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, USA
- Yun Zhu
- Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, USA
- Eric Boerwinkle
- Human Genetics Center, Division of Biostatistics, University of Texas Health Science Center at Houston, Houston, TX, USA
- Momiao Xiong
- Human Genetics Center, Division of Biostatistics, University of Texas Health Science Center at Houston, Houston, TX, USA
|
14
|
Abstract
Genome-wide association studies (GWAS) are designed to identify the portion of single-nucleotide polymorphisms (SNPs) in genome sequences associated with a complex trait. Strategies based on the gene list enrichment concept are currently applied for the functional analysis of GWAS, according to which a significant overrepresentation of candidate genes associated with a biological pathway is used as a proxy to infer overrepresentation of candidate SNPs in the pathway. Here we show that such inference is not always valid and introduce the program SNP2GO, which implements a new method to properly test for the overrepresentation of candidate SNPs in biological pathways.
|
15
|
Pan Q, Hu T, Malley JD, Andrew AS, Karagas MR, Moore JH. A system-level pathway-phenotype association analysis using synthetic feature random forest. Genet Epidemiol 2014; 38:209-19. [PMID: 24535726 DOI: 10.1002/gepi.21794] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 08/30/2013] [Revised: 11/21/2013] [Accepted: 01/02/2014] [Indexed: 11/07/2022]
Abstract
As the cost of genome-wide genotyping decreases, the number of genome-wide association studies (GWAS) has increased considerably. However, the transition from GWAS findings to the underlying biology of various phenotypes remains challenging. As a result, due to its system-level interpretability, pathway analysis has become a popular tool for gaining insights on the underlying biology from high-throughput genetic association data. In pathway analyses, gene sets representing particular biological processes are tested for significant associations with a given phenotype. Most existing pathway analysis approaches rely on single-marker statistics and assume that pathways are independent of each other. As biological systems are driven by complex biomolecular interactions, embracing the complex relationships between single-nucleotide polymorphisms (SNPs) and pathways needs to be addressed. To incorporate the complexity of gene-gene interactions and pathway-pathway relationships, we propose a system-level pathway analysis approach, synthetic feature random forest (SF-RF), which is designed to detect pathway-phenotype associations without making assumptions about the relationships among SNPs or pathways. In our approach, the genotypes of SNPs in a particular pathway are aggregated into a synthetic feature representing that pathway via Random Forest (RF). Multiple synthetic features are analyzed using RF simultaneously and the significance of a synthetic feature indicates the significance of the corresponding pathway. We further complement SF-RF with pathway-based Statistical Epistasis Network (SEN) analysis that evaluates interactions among pathways. By investigating the pathway SEN, we hope to gain additional insights into the genetic mechanisms contributing to the pathway-phenotype association. We apply SF-RF to a population-based genetic study of bladder cancer and further investigate the mechanisms that help explain the pathway-phenotype associations using SEN. The bladder cancer-associated pathways we found are both consistent with existing biological knowledge and reveal novel and plausible hypotheses for future biological validations.
Affiliation(s)
- Qinxin Pan
- Department of Genetics, Geisel School of Medicine, Dartmouth College, Hanover, New Hampshire, United States of America
|
16
|
Incorporating prior knowledge to increase the power of genome-wide association studies. Methods Mol Biol 2014; 1019:519-41. [PMID: 23756909 DOI: 10.1007/978-1-62703-447-0_25] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 02/10/2023]
Abstract
Typical methods of analyzing genome-wide single nucleotide variant (SNV) data in cases and controls involve testing each variant's genotypes separately for phenotype association, and then using a substantial multiple-testing penalty to minimize the rate of false positives. This approach, however, can result in low power for modestly associated SNVs. Furthermore, simply looking at the most associated SNVs may not directly yield biological insights about disease etiology. SNVset methods attempt to address both limitations of the traditional approach by testing biologically meaningful sets of SNVs (e.g., genes or pathways). The number of tests run in a SNVset analysis is typically much lower (hundreds or thousands instead of millions) than in a traditional analysis, so the false-positive rate is lower. Additionally, by testing SNVsets that are biologically meaningful, finding a significant set may more quickly yield insights into disease etiology. In this chapter we summarize the short history of SNVset testing and provide an overview of the many recently proposed methods. Furthermore, we provide detailed step-by-step instructions on how to perform a SNVset analysis, including a substantial number of practical tips and questions that researchers should consider before undertaking a SNVset analysis. Lastly, we describe a companion R package (snvset) that implements recently proposed SNVset methods. While SNVset testing is a new approach, with many new methods still being developed and many open questions, the promise of the approach is worth serious consideration when considering analytic methods for GWAS.
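The multiple-testing point in this abstract can be made concrete with a small arithmetic sketch. It assumes a Bonferroni-style correction at a family-wise level of 0.05 and uses illustrative test counts (one million single-variant tests versus one thousand set-level tests). The function name, the correction method, and the counts are assumptions chosen for illustration and are not taken from the chapter or from the snvset package.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Illustrative counts only (assumptions, not values reported in the chapter).
single_variant = bonferroni_threshold(0.05, 1_000_000)  # one test per SNV
snv_set = bonferroni_threshold(0.05, 1_000)             # one test per gene or pathway

print(f"single-variant threshold: {single_variant:.1e}")
print(f"SNV-set threshold:        {snv_set:.1e}")
```

Under these assumed counts the per-test threshold for set-level testing is a thousand times less stringent, which is one way to read the abstract's argument about power for modestly associated variants.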
|
17
|
Lee D, Lee GK, Yoon KA, Lee JS. Pathway-based analysis using genome-wide association data from a Korean non-small cell lung cancer study. PLoS One 2013; 8:e65396. [PMID: 23762359 PMCID: PMC3675130 DOI: 10.1371/journal.pone.0065396] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 02/05/2013] [Accepted: 04/24/2013] [Indexed: 11/18/2022]
Abstract
Pathway-based analysis, used in conjunction with genome-wide association study (GWAS) techniques, is a powerful tool to detect subtle but systematic patterns in genome that can help elucidate complex diseases, like cancers. Here, we stepped back from genetic polymorphisms at a single locus and examined how multiple association signals can be orchestrated to find pathways related to lung cancer susceptibility. We used single-nucleotide polymorphism (SNP) array data from 869 non-small cell lung cancer (NSCLC) cases from a previous GWAS at the National Cancer Center and 1,533 controls from the Korean Association Resource project for the pathway-based analysis. After mapping single-nucleotide polymorphisms to genes, considering their coding region and regulatory elements (±20 kbp), multivariate logistic regression of additive and dominant genetic models were fitted against disease status, with adjustments for age, gender, and smoking status. Pathway statistics were evaluated using Gene Set Enrichment Analysis (GSEA) and Adaptive Rank Truncated Product (ARTP) methods. Among 880 pathways, 11 showed relatively significant statistics compared to our positive controls (PGSEA≤0.025, false discovery rate≤0.25). Candidate pathways were validated using the ARTP method and similarities between pathways were computed against each other. The top-ranked pathways were ABC Transporters (PGSEA<0.001, PARTP = 0.001), VEGF Signaling Pathway (PGSEA<0.001, PARTP = 0.008), G1/S Check Point (PGSEA = 0.004, PARTP = 0.013), and NRAGE Signals Death through JNK (PGSEA = 0.006, PARTP = 0.001). Our results demonstrate that pathway analysis can shed light on post-GWAS research and help identify potential targets for cancer susceptibility.
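The SNP-to-gene mapping step described in this abstract (gene region plus ±20 kbp) could look roughly like the sketch below. Only the 20 kbp flank comes from the abstract. The gene names, coordinates, SNP identifiers, and the function `map_snps_to_genes` are hypothetical and exist purely for illustration; the authors' actual mapping procedure is not specified beyond the window size.

```python
from typing import Dict, List, Tuple

FLANK_BP = 20_000  # +/-20 kbp flank, as stated in the abstract

Gene = Tuple[str, int, int]  # (chromosome, start, end)
Snp = Tuple[str, int]        # (chromosome, position)


def map_snps_to_genes(
    snps: Dict[str, Snp], genes: Dict[str, Gene], flank: int = FLANK_BP
) -> Dict[str, List[str]]:
    """For each SNP, list the genes whose flanked interval contains its position."""
    mapping: Dict[str, List[str]] = {}
    for snp_id, (chrom, pos) in snps.items():
        hits = [
            name
            for name, (g_chrom, start, end) in genes.items()
            if g_chrom == chrom and start - flank <= pos <= end + flank
        ]
        mapping[snp_id] = sorted(hits)
    return mapping


# Hypothetical coordinates for illustration only.
genes = {"GENE_A": ("1", 100_000, 150_000), "GENE_B": ("1", 165_000, 200_000)}
snps = {
    "rs_demo1": ("1", 85_000),   # upstream of GENE_A, inside its flank
    "rs_demo2": ("1", 155_000),  # between the genes, inside both flanks
    "rs_demo3": ("2", 120_000),  # different chromosome, no gene
}
snp_to_genes = map_snps_to_genes(snps, genes)
print(snp_to_genes)
```

A SNP can fall within the flank of more than one gene, so the sketch returns a list per SNP; how such overlaps are resolved is a design choice that the abstract does not describe.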
MESH Headings
- Adult
- Aged
- Aged, 80 and over
- Asian People
- Carcinoma, Non-Small-Cell Lung/diagnosis
- Carcinoma, Non-Small-Cell Lung/ethnology
- Carcinoma, Non-Small-Cell Lung/genetics
- Carcinoma, Non-Small-Cell Lung/metabolism
- Case-Control Studies
- Databases, Genetic
- Female
- Gene Expression Regulation, Neoplastic
- Genetic Predisposition to Disease
- Genome, Human
- Genome-Wide Association Study
- Humans
- Logistic Models
- Lung Neoplasms/diagnosis
- Lung Neoplasms/ethnology
- Lung Neoplasms/genetics
- Lung Neoplasms/metabolism
- Male
- Metabolic Networks and Pathways/genetics
- Middle Aged
- Models, Genetic
- Polymorphism, Single Nucleotide
- Signal Transduction
Affiliation(s)
- Donghoon Lee
- Lung Cancer Branch, Research Institute and Hospital, National Cancer Center, Gyeonggi, Republic of Korea
- Geon Kook Lee
- Lung Cancer Branch, Research Institute and Hospital, National Cancer Center, Gyeonggi, Republic of Korea
- Kyong-Ah Yoon
- Lung Cancer Branch, Research Institute and Hospital, National Cancer Center, Gyeonggi, Republic of Korea
- Jin Soo Lee
- Lung Cancer Branch, Research Institute and Hospital, National Cancer Center, Gyeonggi, Republic of Korea
|
18
|
Ried JS, Döring A, Oexle K, Meisinger C, Winkelmann J, Klopp N, Meitinger T, Peters A, Suhre K, Wichmann HE, Gieger C. PSEA: Phenotype Set Enrichment Analysis--a new method for analysis of multiple phenotypes. Genet Epidemiol 2012; 36:244-52. [PMID: 22714936 DOI: 10.1002/gepi.21617] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Indexed: 11/06/2022]
Abstract
Most genome-wide association studies (GWAS) are restricted to one phenotype, even if multiple related or unrelated phenotypes are available. However, an integrated analysis of multiple phenotypes can provide insight into their shared genetic basis and may improve the power of association studies. We present a new method, called "phenotype set enrichment analysis" (PSEA), which uses ideas of gene set enrichment analysis for the investigation of phenotype sets. PSEA combines statistics of univariate phenotype analyses and tests by permutation. It not only allows the analysis of predefined phenotype sets but also the identification of new phenotype sets. Apart from the application to situations where phenotypes and genotypes are available for each person, the method was adjusted to the analysis of GWAS summary statistics. PSEA was applied to data from the population-based cohort KORA F4 (N = 1,814) using iron-related and blood count traits. By confirming associations previously found in large meta-analyses on these traits, PSEA was shown to be a reliable tool. Many of these associations were not detectable by GWAS on single phenotypes in KORA F4. Therefore, the results suggest that PSEA can be more powerful than a single phenotype GWAS for the identification of association with multiple phenotypes. PSEA is a valuable method for analysis of multiple phenotypes, which can help to understand phenotype networks. Its flexible design enables both the use of prior knowledge and the generation of new knowledge on connection of multiple phenotypes. A software program for PSEA based on GWAS results is available upon request.
Affiliation(s)
- Janina S Ried
- Institute of Genetic Epidemiology, Helmholtz Zentrum München-German Research Center for Environmental Health, Ingolstadter Landstraße 1, Neuherberg, Germany
|
19
|
Kwon JS, Kim J, Nam D, Kim S. Performance Comparison of Two Gene Set Analysis Methods for Genome-wide Association Study Results: GSA-SNP vs i-GSEA4GWAS. Genomics Inform 2012; 10:123-7. [PMID: 23105940 PMCID: PMC3480679 DOI: 10.5808/gi.2012.10.2.123] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 05/04/2012] [Revised: 05/21/2012] [Accepted: 05/22/2012] [Indexed: 11/23/2022]
Abstract
Gene set analysis (GSA) is useful in interpreting a genome-wide association study (GWAS) result in terms of biological mechanism. We compared the performance of two different GSA implementations that accept GWAS p-values of single nucleotide polymorphisms (SNPs) or gene-by-gene summaries thereof, GSA-SNP and i-GSEA4GWAS, under the same settings of inputs and parameters. GSA runs were made with two sets of p-values from a Korean type 2 diabetes mellitus GWAS study: 259,188 and 1,152,947 SNPs of the original and imputed genotype datasets, respectively. When Gene Ontology terms were used as gene sets, i-GSEA4GWAS produced 283 and 1,070 hits for the unimputed and imputed datasets, respectively. On the other hand, GSA-SNP reported 94 and 38 hits for the same two datasets, respectively. Similar trends, though to a lesser degree, were observed with Kyoto Encyclopedia of Genes and Genomes (KEGG) gene sets as well. The huge number of hits by i-GSEA4GWAS for the imputed dataset was probably an artifact due to the scaling step in the algorithm. The decrease in hits by GSA-SNP for the imputed dataset may be due to the fact that it relies on Z-statistics, which are sensitive to variations in the background level of associations. Judicious evaluation of the GSA outcomes, perhaps based on multiple programs, is recommended.
Affiliation(s)
- Ji-Sun Kwon
- Department of Bioinformatics and Life Science, Soongsil University, Seoul 156-743, Korea
|
20
|
Wang JY, Luo YR, Fu WX, Lu X, Zhou JP, Ding XD, Liu JF, Zhang Q. Genome-wide association studies for hematological traits in swine. Anim Genet 2012; 44:34-43. [PMID: 22548415 DOI: 10.1111/j.1365-2052.2012.02366.x] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Accepted: 02/05/2012] [Indexed: 12/30/2022]
Abstract
Improving immune capacity may increase the profitability of animal production if it enables animals to better cope with infections. Hematological traits play pivotal roles in animal immune capacity and disease resistance. Thus far, few studies have been conducted using a high-density swine SNP chip panel to unravel the genetic mechanism of the immune capability in domestic animals. In this study, using mixed model-based single-locus regression analyses, we carried out genome-wide association studies, using the Porcine SNP60 BeadChip, for immune responses in piglets for 18 hematological traits (seven leukocyte traits, seven erythrocyte traits, and four platelet traits) after being immunized with classical swine fever vaccine. After adjusting for multiple testing based on permutations, 10, 24, and 77 chromosome-wise significant SNPs were identified for the leukocyte traits, erythrocyte traits, and platelet traits respectively, of which 10 reached genome-wise significance level. Among the 53 SNPs for mean platelet volume, 29 are located in a linkage disequilibrium block between 32.77 and 40.59 Mb on SSC6. Four genes of interest are located within the block, providing genetic evidence that this genomic segment may be considered a candidate region relevant to the platelet traits. Other candidate genes of interest for red blood cell, hemoglobin, and red blood cell volume distribution width also have been found near the significant SNPs. Our genome-wide association study provides a list of significant SNPs and candidate genes that offer valuable information for future dissection of molecular mechanisms regulating hematological traits.
Affiliation(s)
- J Y Wang
- Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Agriculture, College of Animal Science and Technology, China
|
21
|
Shahbaba B, Shachaf CM, Yu Z. A pathway analysis method for genome-wide association studies. Stat Med 2012; 31:988-1000. [PMID: 22302470 DOI: 10.1002/sim.4477] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 03/22/2011] [Revised: 10/20/2011] [Accepted: 11/02/2011] [Indexed: 12/20/2022]
Abstract
For genome-wide association studies, we propose a new method for identifying significant biological pathways. In this approach, we aggregate data across single-nucleotide polymorphisms to obtain summary measures at the gene level. We then use a hierarchical Bayesian model, which takes the gene-level summary measures as data, in order to evaluate the relevance of each pathway to an outcome of interest (e.g., disease status). Although shifting the focus of analysis from individual genes to pathways has proven to improve the statistical power and provide more robust results, such methods tend to eliminate a large number of genes whose pathways are unknown. For these genes, we propose to use a Bayesian multinomial logit model to predict the associated pathways by using the genes with known pathways as the training data. Our hierarchical Bayesian model takes the uncertainty regarding the pathway predictions into account while assessing the significance of pathways. We apply our method to two independent studies on type 2 diabetes and show that the overlap between the results from the two studies is statistically significant. We also evaluate our approach on the basis of simulated data.
Affiliation(s)
- Babak Shahbaba
- Department of Statistics, University of California, Irvine, CA, USA
|
22
|
Sykes J, Cheng L, Xu W, Tsao MS, Liu G, Pintilie M. Addition of multiple rare SNPs to known common variants improves the association between disease and gene in the Genetic Analysis Workshop 17 data. BMC Proc 2011; 5 Suppl 9:S97. [PMID: 22373301 PMCID: PMC3287939 DOI: 10.1186/1753-6561-5-s9-s97] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/23/2022]
Abstract
The upcoming release of new whole-genome genotyping technologies will shed new light on whether there is an associative effect of previously immeasurable rare variants on incidence of disease. For Genetic Analysis Workshop 17, our team focused on a statistical method to detect associations between gene-based multiple rare variants and disease status. We added a combination of rare SNPs to a common variant shown to have an influence on disease status. This method provides us with an enhanced ability to detect the effect of these rare variants, which, modeled alone, would normally be undetectable. Adjusting for significant clinical parameters, several genes were found to have multiple rare variants that were significantly associated with disease outcome.
Affiliation(s)
- Jenna Sykes
- Department of Biostatistics, Princess Margaret Hospital, 610 University Avenue, Toronto, ON M5G 2M9, Canada.
|
23
|
Gui H, Li M, Sham PC, Cherny SS. Comparisons of seven algorithms for pathway analysis using the WTCCC Crohn's Disease dataset. BMC Res Notes 2011; 4:386. [PMID: 21981765 PMCID: PMC3199264 DOI: 10.1186/1756-0500-4-386] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Received: 06/10/2011] [Accepted: 10/07/2011] [Indexed: 12/20/2022]
Abstract
BACKGROUND: Though rooted in genomic expression studies, pathway analysis for genome-wide association studies (GWAS) has gained increasing popularity, since it has the potential to discover hidden disease pathogenic mechanisms by combining statistical methods with biological knowledge. Generally, algorithms or programs proposed recently can be categorized by different types of input data, null hypothesis or counts of analysis stages. Due to complexity caused by SNP, gene and pathway relationships, re-sampling strategies like permutation are always utilized to derive an empirical distribution for test statistics for evaluating the significance of candidate pathways. However, evaluation of these algorithms on real GWAS datasets and real biological pathway databases needs to be addressed before we apply them widely with confidence. FINDINGS: Two algorithms which use summary statistics from GWAS as input were implemented in KGG, a novel and user-friendly software tool for GWAS pathway analysis. Comparisons of these two algorithms as well as the other five selected algorithms were conducted by analyzing the WTCCC Crohn's Disease dataset utilizing the MsigDB canonical pathways. As a result of using permutation to obtain empirical p-value, most of these methods could control Type I error rate well, although some are conservative. However, the methods varied greatly in terms of power and running time, with the PLINK truncated set-based test being the most powerful and KGG being the fastest. CONCLUSIONS: Raw data-based algorithms, such as those implemented in PLINK, are preferable for GWAS pathway analysis as long as computational capacity is available. It may be worthwhile to apply two or more pathway analysis algorithms on the same GWAS dataset, since the methods differ greatly in their outputs and might provide complementary findings for the studied complex disease.
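The permutation-based empirical p-value mentioned in this abstract can be sketched generically as follows. This is a minimal illustration under stated assumptions, not the procedure of KGG, PLINK, or any other tool compared in the paper: the test statistic (absolute difference in group means), the toy data, and all function names are chosen here for illustration. The `+1` in numerator and denominator is a common convention that keeps the estimate above zero.

```python
import random
from typing import List, Sequence


def empirical_p_value(observed: float, null_stats: Sequence[float]) -> float:
    """Share of null statistics at least as extreme as the observed one, with +1 smoothing."""
    exceed = sum(1 for s in null_stats if s >= observed)
    return (exceed + 1) / (len(null_stats) + 1)


def mean_difference(values: Sequence[float], labels: Sequence[int]) -> float:
    """Absolute difference in mean value between cases (1) and controls (0)."""
    cases = [v for v, lab in zip(values, labels) if lab == 1]
    controls = [v for v, lab in zip(values, labels) if lab == 0]
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


def permutation_p_value(
    values: Sequence[float], labels: Sequence[int], n_perm: int = 999, seed: int = 0
) -> float:
    """Build a null distribution by shuffling labels, then compare the observed statistic."""
    rng = random.Random(seed)
    observed = mean_difference(values, labels)
    shuffled: List[int] = list(labels)
    null_stats = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any link between values and case status
        null_stats.append(mean_difference(values, shuffled))
    return empirical_p_value(observed, null_stats)


# Toy data (assumption): four controls with low scores, four cases with high scores.
values = [1.0, 2.0, 3.0, 4.0, 11.0, 12.0, 13.0, 14.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
p = permutation_p_value(values, labels)
print(f"empirical p-value: {p:.3f}")
```

Shuffling labels preserves the number of cases and controls in every permutation, so the group means are always defined; in a pathway setting the statistic would be a pathway-level score rather than a mean difference.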
Affiliation(s)
- Hongsheng Gui
- Department of Psychiatry, The University of Hong Kong, Hong Kong, SAR, China.
|
24
|
Wang L, Jia P, Wolfinger RD, Chen X, Zhao Z. Gene set analysis of genome-wide association studies: methodological issues and perspectives. Genomics 2011; 98:1-8. [PMID: 21565265 PMCID: PMC3852939 DOI: 10.1016/j.ygeno.2011.04.006] [Citation(s) in RCA: 164] [Impact Index Per Article: 12.6] [Received: 12/06/2010] [Revised: 03/02/2011] [Accepted: 04/15/2011] [Indexed: 12/25/2022]
Abstract
Recent studies have demonstrated that gene set analysis, which tests disease association with genetic variants in a group of functionally related genes, is a promising approach for analyzing and interpreting genome-wide association studies (GWAS) data. These approaches aim to increase power by combining association signals from multiple genes in the same gene set. In addition, gene set analysis can also shed more light on the biological processes underlying complex diseases. However, current approaches for gene set analysis are still in an early stage of development in that analysis results are often prone to sources of bias, including gene set size and gene length, linkage disequilibrium patterns and the presence of overlapping genes. In this paper, we provide an in-depth review of the gene set analysis procedures, along with parameter choices and the particular methodology challenges at each stage. In addition to providing a survey of recently developed tools, we also classify the analysis methods into larger categories and discuss their strengths and limitations. In the last section, we outline several important areas for improving the analytical strategies in gene set analysis.
Affiliation(s)
- Lily Wang
- Department of Biostatistics, Vanderbilt University, Nashville, TN 37232, USA
- Peilin Jia
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
- Department of Psychiatry, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
- Xi Chen
- Division of Cancer Biostatistics, Department of Biostatistics, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
- Zhongming Zhao
- Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
- Department of Psychiatry, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
- Department of Cancer Biology, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
|
25
|
Ritchie MD. Using biological knowledge to uncover the mystery in the search for epistasis in genome-wide association studies. Ann Hum Genet 2011; 75:172-82. [PMID: 21158748 DOI: 10.1111/j.1469-1809.2010.00630.x] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.6] [Indexed: 01/11/2023]
Abstract
The search for the missing heritability in genome-wide association studies (GWAS) has become an important focus for the human genetics community. One suspected location of these genetic effects is in gene-gene interactions, or epistasis. The computational burden of exploring gene-gene interactions in the wealth of data generated in GWAS, along with small to moderate sample sizes, has led to epistasis being an afterthought, rather than a primary focus of GWAS analyses. In this review, I discuss some potential approaches to filter a GWAS dataset to a smaller, more manageable dataset where searching for epistasis is considerably more feasible. I describe a number of alternative approaches, but primarily focus on the use of prior biological knowledge from databases in the public domain to guide the search for epistasis. Prior knowledge can be incorporated into a GWA study in many ways, and these data can be extracted from a variety of database sources. I discuss a number of these approaches and propose that a comprehensive approach will likely be most fruitful for searching for epistasis in large-scale genomic studies of the current state-of-the-art and into the future.
Affiliation(s)
- Marylyn D Ritchie
- Department of Molecular Physiology, Center for Human Genetics Research, Vanderbilt University, Nashville, TN 37232-0700, USA.
|
26
|
Wang K, Li M, Hakonarson H. Analysing biological pathways in genome-wide association studies. Nat Rev Genet 2010; 11:843-54. [PMID: 21085203 DOI: 10.1038/nrg2884] [Citation(s) in RCA: 581] [Impact Index Per Article: 41.5] [Indexed: 12/14/2022]
Abstract
Genome-wide association (GWA) studies have typically focused on the analysis of single markers, which often lacks the power to uncover the relatively small effect sizes conferred by most genetic variants. Recently, pathway-based approaches have been developed, which use prior biological knowledge on gene function to facilitate more powerful analysis of GWA study data sets. These approaches typically examine whether a group of related genes in the same functional pathway are jointly associated with a trait of interest. Here we review the development of pathway-based approaches for GWA studies, discuss their practical use and caveats, and suggest that pathway-based approaches may also be useful for future GWA studies with sequencing data.
Affiliation(s)
- Kai Wang
- Center for Applied Genomics, The Children's Hospital of Philadelphia, Pennsylvania 19104, USA
|
27
|
Li J, Humphreys K, Darabi H, Rosin G, Hannelius U, Heikkinen T, Aittomäki K, Blomqvist C, Pharoah PD, Dunning AM, Ahmed S, Hooning MJ, Hollestelle A, Oldenburg RA, Alfredsson L, Palotie A, Peltonen-Palotie L, Irwanto A, Low HQ, Teoh GH, Thalamuthu A, Kere J, D'Amato M, Easton DF, Nevanlinna H, Liu J, Czene K, Hall P. A genome-wide association scan on estrogen receptor-negative breast cancer. Breast Cancer Res 2010; 12:R93. [PMID: 21062454 PMCID: PMC3046434 DOI: 10.1186/bcr2772] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Received: 08/05/2010] [Revised: 10/06/2010] [Accepted: 11/09/2010] [Indexed: 12/20/2022] Open
Abstract
Introduction: Breast cancer is a heterogeneous disease and may be characterized on the basis of whether estrogen receptors (ER) are expressed in the tumour cells. ER status of breast cancer is important clinically, and is used both as a prognostic indicator and treatment predictor. In this study, we focused on identifying genetic markers associated with ER-negative breast cancer risk. Methods: We conducted a genome-wide association analysis of 285,984 single nucleotide polymorphisms (SNPs) genotyped in 617 ER-negative breast cancer cases and 4,583 controls. We also conducted a genome-wide pathway analysis on the discovery dataset using permutation-based tests on pre-defined pathways. The extent of shared polygenic variation between ER-negative and ER-positive breast cancers was assessed by relating risk scores, derived using ER-positive breast cancer samples, to disease state in independent, ER-negative breast cancer cases. Results: Association with ER-negative breast cancer was not validated for any of the five most strongly associated SNPs followed up in independent studies (1,011 ER-negative breast cancer cases, 7,604 controls). However, an excess of small P-values for SNPs with known regulatory functions in cancer-related pathways was found (global P = 0.052). We found no evidence to suggest that ER-negative breast cancer shares a polygenic basis to disease with ER-positive breast cancer. Conclusions: ER-negative breast cancer is a distinct breast cancer subtype that merits independent analyses. Given the clinical importance of this phenotype and the likelihood that genetic effect sizes are small, greater sample sizes and further studies are required to understand the etiology of ER-negative breast cancers.
Affiliation(s)
- Jingmei Li
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm 17177, Sweden.
|
28
|
Pan F, Liu XG, Guo YF, Chen Y, Dong SS, Qiu C, Zhang ZX, Zhou Q, Yang TL, Guo Y, Zhu XZ, Deng HW. The regulation-of-autophagy pathway may influence Chinese stature variation: evidence from elder adults. J Hum Genet 2010; 55:441-7. [PMID: 20448653 DOI: 10.1038/jhg.2010.44] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Indexed: 11/09/2022]
Abstract
Recent success of genome-wide association studies (GWASs) on human height variation emphasized the effects of individual loci or genes. In this study, we used a developed pathway-based approach to further test biological pathways for potential association with stature, by examining approximately 370,000 single-nucleotide polymorphisms (SNPs) across the human genome in 618 unrelated elder Han Chinese. A total of 626 biological pathways annotated by any of the three major public pathway databases (KEGG, BioCarta and Ambion GeneAssist Pathway Atlas) were tested. The regulation-of-autophagy (ROA) (nominal P=0.012) pathway was marginally significantly associated with human stature after our family wise error rate multiple-testing correction. We also used 1000 random recruited US whites for further replication. Interestingly, the ROA pathway presented the strongest signals in whites for height variation (nominal P=0.002). The results correspond to biological roles of the ROA pathway in human long bone development and growth. Our findings also implied that multiple-genetic factors may work jointly as a functional unit (pathway), and the traditional GWASs could have missed important genetic information imbedded in those less significant markers.
Affiliation(s)
- Feng Pan
- The Key Laboratory of Biomedical Information Engineering of Ministry of Education, and Institute of Molecular Genetics, School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi, China
|