1. Luo M, Wang T, Huang P, Zhang S, Song X, Sun M, Liu Y, Wei J, Shu J, Zhong T, Chen Q, Zhu P, Qin J. Association of Maternal Betaine-Homocysteine Methyltransferase (BHMT) and BHMT2 Genes Polymorphisms with Congenital Heart Disease in Offspring. Reprod Sci 2023; 30:309-325. [PMID: 35835902 DOI: 10.1007/s43032-022-01029-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2022] [Accepted: 06/25/2022] [Indexed: 01/11/2023]
Abstract
This study systematically explored the association of single nucleotide polymorphisms (SNPs) of the maternal BHMT and BHMT2 genes with the risk of congenital heart disease (CHD) and three of its subtypes, atrial septal defect (ASD), ventricular septal defect (VSD), and patent ductus arteriosus (PDA), in offspring. A hospital-based case-control study involving 683 mothers of CHD children and 740 controls was performed. Necessary exposure information was captured through epidemiological investigation. A total of twelve SNPs of the maternal BHMT and BHMT2 genes were detected and analyzed. Maternal BHMT gene polymorphisms at rs1316753 (CG vs. CC: OR = 1.96 [95% CI 1.41-2.71]; GG vs. CC: OR = 1.99 [95% CI 1.32-3.00]; dominant model: OR = 1.97 [95% CI 1.44-2.68]) and rs1915706 (TC vs. TT: OR = 1.93 [95% CI 1.44-2.59]; CC vs. TT: OR = 2.55 [95% CI 1.38-4.72]; additive model: OR = 1.77 [95% CI 1.40-2.24]) were significantly associated with increased risk of total CHD in offspring. Two haplotypes were also significantly associated with risk of total CHD: the C-C haplotype involving rs1915706 and rs3829809 in the BHMT gene (OR = 1.30 [95% CI 1.07-1.58]) and the C-A-A-C haplotype involving rs642431, rs592052, rs626105, and rs682985 in the BHMT2 gene (OR = 0.71 [95% CI 0.58-0.88]). In addition, a three-locus model involving rs1316753 (BHMT), rs1915706 (BHMT), and rs642431 (BHMT2) was identified through gene-gene interaction analyses (P < 0.01). Significant SNPs and haplotypes were also identified for the three subtypes (ASD, VSD, and PDA). The results indicate that maternal BHMT gene polymorphisms at rs1316753 and rs1915706 are significantly associated with increased risk of total CHD and its three subtypes in offspring, and that significant interactions between different SNPs exist on the risk of CHD. Nevertheless, studies with larger sample sizes, in different ethnic populations, and involving more SNPs in more genes are needed to further define the genetic contribution underlying CHD and its subtypes.
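The odds ratios and 95% confidence intervals quoted in abstracts like the one above can be illustrated with a minimal sketch. It assumes an unadjusted 2x2 table and the Wald (log-scale) interval; the counts are hypothetical, the function name `odds_ratio_wald_ci` is invented for illustration, and the study itself may well have used covariate-adjusted logistic regression instead.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: cases carrying the genotype of interest
    b: cases carrying the reference genotype
    c: controls carrying the genotype of interest
    d: controls carrying the reference genotype
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from any study listed here.
or_, lo, hi = odds_ratio_wald_ci(20, 10, 10, 20)
print(f"OR = {or_:.2f} [95% CI {lo:.2f}-{hi:.2f}]")
```

An interval that excludes 1 corresponds to a nominally significant association at the matching level, which is how the bracketed intervals in these abstracts are usually read.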
Affiliation(s)
- Manjun Luo
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, China
- Tingting Wang
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, China.
- NHC Key Laboratory of Birth Defect for Research and Prevention, Hunan Provincial Maternal and Child Health Care Hospital, Changsha, China.
- Peng Huang
- Department of Cardiothoracic Surgery, Hunan Children's Hospital, Changsha, China
- Senmao Zhang
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, China
- Xinli Song
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, China
- Mengting Sun
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, China
- Yiping Liu
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, China
- Jianhui Wei
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, China
- Jing Shu
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, China
- Taowei Zhong
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, China
- Qian Chen
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, China
- Ping Zhu
- Guangdong Cardiovascular Institute, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China.
- Jiabi Qin
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, China.
- NHC Key Laboratory of Birth Defect for Research and Prevention, Hunan Provincial Maternal and Child Health Care Hospital, Changsha, China.
- Guangdong Cardiovascular Institute, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China.
- Hunan Provincial Key Laboratory of Clinical Epidemiology, Changsha, China.
2. Hu X, Meng Z. Using potential variable to study gene-gene and gene-environment interaction effects with genetic model uncertainty. Ann Hum Genet 2022; 86:257-267. [PMID: 35582845 DOI: 10.1111/ahg.12470] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2021] [Revised: 03/02/2022] [Accepted: 04/08/2022] [Indexed: 11/28/2022]
Abstract
One of the critical issues in genetic association studies is to evaluate the risk of a disease associated with gene-gene or gene-environment interactions. The commonly employed procedures are derived by assigning a particular set of scores to genotypes. However, the underlying genetic models of inheritance are rarely known in practice, and misspecifying a genetic model may result in power loss. By using potential genetic variables to separate the genotype coding from the genetic model parameter, we construct a model-embedded score test (MEST). Our test does not assume gene-environment independence and allows for covariates in the model. An effective sequential optimization algorithm is developed. Extensive simulations show that the proposed MEST is robust and powerful in most scenarios. Finally, we apply the proposed method to rheumatoid arthritis data from the Genetic Analysis Workshop 16 to further investigate the potential interaction effects.
Affiliation(s)
- Xiaonan Hu
- NCMIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- Zhen Meng
- School of Statistics, Capital University of Economics and Business, Beijing, China
3. Song X, Wei J, Shu J, Liu Y, Sun M, Zhu P, Qin J. Association of polymorphisms of FOLR1 gene and FOLR2 gene and maternal folic acid supplementation with risk of ventricular septal defect: a case-control study. Eur J Clin Nutr 2022; 76:1273-1280. [PMID: 35273364 DOI: 10.1038/s41430-022-01110-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/04/2021] [Revised: 02/18/2022] [Accepted: 02/22/2022] [Indexed: 11/09/2022]
Abstract
OBJECTIVES This study is the first to examine the role of maternal polymorphisms of the FOLR1 and FOLR2 genes, as well as their interactions with maternal folic acid supplementation (FAS), in the risk of ventricular septal defect (VSD). METHODS A case-control study was conducted with 385 mothers of VSD infants and 652 controls. The exposures of interest were FAS and FOLR1 and FOLR2 gene polymorphisms. A logistic regression model was used to assess the strength of association. RESULTS After controlling for potential confounders, women who did not use folic acid had a substantially higher risk of VSD (aOR = 2.25; 95% CI: 1.48 to 3.43) than those who did. Genetic polymorphisms of the FOLR1 gene at rs2071010 (GA vs. GG: aOR = 0.63, 95% CI: 0.45 to 0.88) and rs11235462 (AA vs. TT: aOR = 0.53, 95% CI: 0.33 to 0.84), as well as of the FOLR2 gene at rs651646 (AA vs. TT: aOR = 0.46, 95% CI: 0.30 to 0.70), rs2298444 (CC vs. TT: aOR = 0.58, 95% CI: 0.36 to 0.91) and rs514933 (TC vs. TT: aOR = 0.57, 95% CI: 0.41 to 0.78), were associated with a lower risk of VSD. Furthermore, there was a statistically significant interaction between maternal FAS and genetic polymorphisms at rs514933 on the risk of VSD (FDR_P = 0.015). CONCLUSIONS Maternal genetic polymorphisms of the FOLR1 and FOLR2 genes, as well as FAS and their interactions, were significantly associated with the risk of VSD in offspring.
Affiliation(s)
- Xinli Song
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, Hunan, China
- Jianhui Wei
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, Hunan, China
- Jing Shu
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, Hunan, China
- Yiping Liu
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, Hunan, China
- Mengting Sun
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, Hunan, China
- Ping Zhu
- Guangdong Cardiovascular Institute, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, Guangdong, China.
- Jiabi Qin
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, Hunan, China.
- Guangdong Cardiovascular Institute, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, Guangdong, China.
- NHC Key Laboratory of Birth Defect for Research and Prevention, Hunan Provincial Maternal and Child Health Care Hospital, Changsha, Hunan, China.
- Hunan Provincial Key Laboratory of Clinical Epidemiology, Changsha, Hunan, China.
4. Song X, Li Q, Diao J, Li J, Li Y, Zhang S, Zhao L, Chen L, Wei J, Shu J, Liu Y, Sun M, Huang P, Wang T, Qin J. Association of MTHFD1 gene polymorphisms and maternal smoking with risk of congenital heart disease: a hospital-based case-control study. BMC Pregnancy Childbirth 2022; 22:88. [PMID: 35100977 PMCID: PMC8805321 DOI: 10.1186/s12884-022-04419-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2021] [Accepted: 01/20/2022] [Indexed: 11/16/2022] Open
Abstract
Background The MTHFD1 gene may affect embryonic development through elevated homocysteine levels, DNA synthesis, and DNA methylation, but only a limited number of MTHFD1 genetic variants have been studied for association with congenital heart disease (CHD). This study examined the role of the MTHFD1 gene and maternal smoking in infant CHD risk, and investigated their interaction effects in Chinese populations. Methods A case-control study of 464 mothers of CHD infants and 504 mothers of healthy controls was performed. The exposures of interest were maternal tobacco exposure and single nucleotide polymorphisms (SNPs) of the maternal MTHFD1 gene. A logistic regression model was used to assess the strength of association. Results Maternal exposure to secondhand smoke during the 3 months before pregnancy (adjusted odds ratio [aOR] = 1.56; 95% confidence interval [CI]: 1.13–2.15) and in the first trimester of pregnancy (aOR = 2.24; 95% CI: 1.57–3.20) was associated with an increased risk of CHD. Polymorphisms of the maternal MTHFD1 gene at rs1950902 (AA vs. GG: aOR = 1.73, 95% CI: 1.01–2.97), rs2236222 (GG vs. AA: aOR = 2.38, 95% CI: 1.38–4.12), rs1256142 (GA vs. GG: aOR = 1.57, 95% CI: 1.01–2.45) and rs11849530 (GG vs. AA: aOR = 1.68, 95% CI: 1.02–2.77) were significantly associated with a higher risk of CHD. However, no significant association was observed between maternal MTHFD1 rs2236225 and offspring CHD risk. Furthermore, varying degrees of interaction were found between maternal tobacco exposure and polymorphisms of the MTHFD1 gene, including rs1950902, rs2236222, rs1256142, rs11849530 and rs2236225. Conclusions Maternal polymorphisms of the MTHFD1 gene, maternal tobacco exposure, and their interactions are significantly associated with the risk of CHD in offspring in Han Chinese populations. However, more studies in different ethnic populations with larger samples and prospective designs are required to confirm these findings.
Trial registration Registration number: ChiCTR1800016635. Supplementary Information The online version contains supplementary material available at 10.1186/s12884-022-04419-2.
Affiliation(s)
- Xinli Song
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, 110 Xiangya Road, Changsha, 410078, Hunan, China
- Qiongxuan Li
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, 110 Xiangya Road, Changsha, 410078, Hunan, China
- Jingyi Diao
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, 110 Xiangya Road, Changsha, 410078, Hunan, China
- Jinqi Li
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, 110 Xiangya Road, Changsha, 410078, Hunan, China
- Yihuan Li
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, 110 Xiangya Road, Changsha, 410078, Hunan, China
- Senmao Zhang
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, 110 Xiangya Road, Changsha, 410078, Hunan, China
- Lijuan Zhao
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, 110 Xiangya Road, Changsha, 410078, Hunan, China
- Letao Chen
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, 110 Xiangya Road, Changsha, 410078, Hunan, China
- Jianhui Wei
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, 110 Xiangya Road, Changsha, 410078, Hunan, China
- Jing Shu
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, 110 Xiangya Road, Changsha, 410078, Hunan, China
- Yiping Liu
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, 110 Xiangya Road, Changsha, 410078, Hunan, China
- Mengting Sun
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, 110 Xiangya Road, Changsha, 410078, Hunan, China
- Peng Huang
- Department of Cardiothoracic Surgery, Hunan Children's Hospital, Changsha, Hunan, China
- Tingting Wang
- NHC Key Laboratory of Birth Defect for Research and Prevention, Hunan Provincial Maternal and Child Health Care Hospital, 53 Xiangchun Road, Changsha, 410028, Hunan, China.
- Jiabi Qin
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, 110 Xiangya Road, Changsha, 410078, Hunan, China.
- Guangdong Cardiovascular Institute, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, Guangdong, China.
- NHC Key Laboratory of Birth Defect for Research and Prevention, Hunan Provincial Maternal and Child Health Care Hospital, 53 Xiangchun Road, Changsha, 410028, Hunan, China.
- Hunan Provincial Key Laboratory of Clinical Epidemiology, Changsha, Hunan, China.
5. Zhong T, Song X, Liu Y, Sun M, Zhang S, Chen L, Diao J, Li J, Li Y, Shu J, Wei J, Zhu P, Wang T, Qin J. Association of methylenetetrahydrofolate reductase gene polymorphisms and maternal folic acid use with the risk of congenital heart disease. Front Pediatr 2022; 10:939119. [PMID: 36160803 PMCID: PMC9492935 DOI: 10.3389/fped.2022.939119] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2022] [Accepted: 08/16/2022] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND This study systematically evaluated the association of MTHFR genetic polymorphisms, maternal folic acid intake, and the time when folic acid intake was started with the risk of congenital heart disease (CHD), and investigated the role of their interaction in infant CHD risk in Chinese populations. METHODS A case-control study involving 592 CHD cases, 617 healthy controls, and their mothers was performed. The exposures of interest were single nucleotide polymorphisms (SNPs) of the MTHFR gene, maternal folic acid use, and the time when folic acid use was started. A logistic regression model was applied to explore the strength of association. RESULTS Mothers lacking folic acid intake had a significantly higher risk of CHD in offspring (aOR = 2.00; 95% CI: 1.34-2.98). Mothers who started to use folic acid in the first trimester of pregnancy (aOR = 1.65; 95% CI: 1.22-2.23) or in the second trimester of pregnancy (aOR = 7.77; 95% CI: 2.52-23.96), compared with those who started 3 months before conception, were at a significantly higher risk of CHD in offspring. Genetic variants at rs2066470 (AA vs. GG: aOR = 5.09, 95% CI: 1.99-13.03), rs1801133 (AA vs. GG: aOR = 2.49, 95% CI: 1.58-3.93), and rs1801131 (TG vs. TT: aOR = 1.84, 95% CI: 1.36-2.50; GG vs. TT: aOR = 3.58, 95% CI: 1.68-7.63) were significantly associated with the risk of CHD in the multivariate analysis. Additionally, statistically significant interactions between maternal folic acid intake and genetic variants of the MTHFR gene at rs1801133 and rs1801131 were observed. CONCLUSION Maternal folic acid intake and the time when intake was started were associated with the risk of CHD in offspring. Moreover, maternal folic acid fortification may help counteract part of the risk of CHD in offspring attributable to MTHFR genetic mutations.
REGISTRATION NUMBER http://www.chictr.org.cn/edit.aspx?pid=28300&htm=4, identifier: ChiCTR1800016635.
Affiliation(s)
- Taowei Zhong
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, China
- Xinli Song
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, China
- Yiping Liu
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, China
- Mengting Sun
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, China
- Senmao Zhang
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, China
- Letao Chen
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, China
- Jingyi Diao
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, China
- Jinqi Li
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, China
- Yihuan Li
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, China
- Jing Shu
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, China
- Jianhui Wei
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, China
- Ping Zhu
- Guangdong Cardiovascular Institute, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China
- Tingting Wang
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, China.
- National Health Council (NHC) Key Laboratory of Birth Defect for Research and Prevention, Hunan Provincial Maternal and Child Health Care Hospital, Changsha, China
- Jiabi Qin
- Department of Epidemiology and Health Statistics, Xiangya School of Public Health, Central South University, Changsha, China.
- Guangdong Cardiovascular Institute, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, China.
- National Health Council (NHC) Key Laboratory of Birth Defect for Research and Prevention, Hunan Provincial Maternal and Child Health Care Hospital, Changsha, China
6. CMAX3: A Robust Statistical Test for Genetic Association Accounting for Covariates. Genes (Basel) 2021; 12:genes12111723. [PMID: 34828328 PMCID: PMC8622598 DOI: 10.3390/genes12111723] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/28/2021] [Revised: 10/21/2021] [Accepted: 10/26/2021] [Indexed: 12/13/2022] Open
Abstract
The additive genetic model as implemented in logistic regression has been widely used in genome-wide association studies (GWASs) for binary outcomes. Unfortunately, for many complex diseases, the underlying genetic models are generally unknown and a mis-specification of the genetic model can result in a substantial loss of power. To address this issue, the MAX3 test (the maximum of three separate test statistics) has been proposed as a robust test that performs plausibly regardless of the underlying genetic model. However, the original implementation of MAX3 utilizes the trend test so it cannot adjust for any covariates such as age and gender. This drawback has significantly limited the application of the MAX3 in GWASs, as covariates account for a considerable amount of variability in these disorders. In this paper, we extended the MAX3 and proposed the CMAX3 (covariate-adjusted MAX3) based on logistic regression. The proposed test yielded a similar robust efficiency as the original MAX3 while easily adjusting for any covariate based on the likelihood framework. The asymptotic formula to calculate the p-value of the proposed test was also developed in this paper. The simulation results showed that the proposed test performed desirably under both the null and alternative hypotheses. For the purpose of illustration, we applied the proposed test to re-analyze a case-control GWAS dataset from the Collaborative Studies on Genetics of Alcoholism (COGA). The R code to implement the proposed test is also introduced in this paper and is available for free download.
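The idea of taking the maximum of three test statistics can be sketched as follows. This is a minimal illustration of the original, covariate-free MAX3 built on the Cochran-Armitage trend test with the usual recessive, additive, and dominant genotype scores; it is not the covariate-adjusted CMAX3 from the paper, the function names are invented here, and the genotype counts are hypothetical. The p-value of the maximum needs its own null distribution (asymptotic formula or permutation), which this sketch does not compute.

```python
import math

# Genotype scores for 0, 1, and 2 copies of the risk allele.
SCORES = {
    "recessive": (0.0, 0.0, 1.0),
    "additive": (0.0, 0.5, 1.0),
    "dominant": (0.0, 1.0, 1.0),
}


def trend_z(cases, controls, scores):
    """Cochran-Armitage trend statistic for 3 genotype counts per group."""
    n_cases, n_controls = sum(cases), sum(controls)
    n_total = n_cases + n_controls
    totals = [r + s for r, s in zip(cases, controls)]
    numerator = sum(
        x * (n_controls * r - n_cases * s)
        for x, r, s in zip(scores, cases, controls)
    )
    spread = n_total * sum(x * x * n for x, n in zip(scores, totals)) - sum(
        x * n for x, n in zip(scores, totals)
    ) ** 2
    return math.sqrt(n_total) * numerator / math.sqrt(n_cases * n_controls * spread)


def max3(cases, controls):
    """Return the MAX3 statistic and the three underlying Z statistics."""
    z_by_model = {
        model: trend_z(cases, controls, scores) for model, scores in SCORES.items()
    }
    return max(abs(z) for z in z_by_model.values()), z_by_model


# Hypothetical genotype counts (0, 1, 2 risk alleles).
statistic, z_by_model = max3(cases=(10, 20, 30), controls=(30, 20, 10))
print(statistic, z_by_model)
```

Because each single-model statistic is most powerful only when its scores match the true mode of inheritance, taking the maximum trades a little power under the correct model for protection against a wrong one.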
7. Two GWAS-identified variants are associated with lumbar spinal stenosis and Gasdermin-C expression in Chinese population. Sci Rep 2020; 10:21069. [PMID: 33273635 PMCID: PMC7713291 DOI: 10.1038/s41598-020-78249-7] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 03/03/2020] [Accepted: 11/17/2020] [Indexed: 12/13/2022] Open
Abstract
The aim of this study is to investigate the expression levels of genome-wide association study (GWAS)-identified variants near Gasdermin-C (GSDMC) and their association with lumbar disc degeneration (LDD) in a Chinese population. In accordance with previously reported findings, our study involved the top 4 variants: rs6651255, rs7833174, rs4130415, and rs7816342. A total of 800 participants, 400 LDD patients and 400 controls, were involved in the study. The LDD patients were divided into two mutually exclusive subgroups: subgroup 1, lumbar disc herniation; subgroup 2, lumbar spinal stenosis. Genotyping was performed using the TaqMan assay, enzyme-linked immunosorbent assay (ELISA) was used to measure plasma GSDMC levels, and quantitative reverse-transcription (qRT)-PCR and immunohistochemistry (IHC) were used to evaluate GSDMC expression levels. Among the studied variants, there were no statistically significant differences in allelic and genotypic frequencies between LDD patients and their controls (all P > 0.05). However, the subgroup analysis revealed a significant association of rs6651255 and rs7833174 with lumbar spinal stenosis (subgroup 2). Furthermore, the max-statistic test revealed that the inheritance model of these two variants in lumbar spinal stenosis was represented by the recessive model. The plasma and mRNA expression levels of GSDMC were significantly higher in patients with lumbar spinal stenosis compared with the control group (P < 0.05). Furthermore, the CC genotypes of rs6651255 and rs7833174 were significantly associated with increased plasma expression levels of GSDMC in patients with lumbar spinal stenosis (P < 0.01). Two GWAS-identified variants (rs6651255 and rs7833174) near GSDMC were associated with a predisposition to lumbar spinal stenosis. GSDMC protein and mRNA expression levels may have prognostic value as biomarkers for the existence, occurrence, or development of lumbar spinal stenosis.
8. Xue Y, Ding J, Wang J, Zhang S, Pan D. Two-phase SSU and SKAT in genetic association studies. J Genet 2020. [DOI: 10.1007/s12041-019-1166-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
9. Moore CM, Jacobson SA, Fingerlin TE. Power and Sample Size Calculations for Genetic Association Studies in the Presence of Genetic Model Misspecification. Hum Hered 2020; 84:256-271. [PMID: 32721961 DOI: 10.1159/000508558] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Received: 01/09/2020] [Accepted: 05/11/2020] [Indexed: 01/02/2023] Open
Abstract
INTRODUCTION When analyzing data from large-scale genetic association studies, such as targeted or genome-wide resequencing studies, it is common to assume a single genetic model, such as dominant or additive, for all tests of association between a given genetic variant and the phenotype. However, for many variants, the chosen model will result in poor model fit and may lack statistical power due to model misspecification. OBJECTIVE We develop power and sample size calculations for tests of gene and gene × environment interaction, allowing for misspecification of the true mode of genetic susceptibility. METHODS The power calculations are based on a likelihood ratio test framework and are implemented in an open-source R package ("genpwr"). RESULTS We use these methods to develop an analysis plan for a resequencing study in idiopathic pulmonary fibrosis and show that using a 2-degree of freedom test can increase power to detect recessive genetic effects while maintaining power to detect dominant and additive effects. CONCLUSIONS Understanding the impact of model misspecification can aid in study design and developing analysis plans that maximize power to detect a range of true underlying genetic effects. In particular, these calculations help identify when a multiple degree of freedom test or other robust test of association may be advantageous.
Affiliation(s)
- Camille M Moore
- Center for Genes, Environment, and Health, National Jewish Health, Denver, Colorado, USA
- Sean A Jacobson
- Center for Genes, Environment, and Health, National Jewish Health, Denver, Colorado, USA
- Tasha E Fingerlin
- Center for Genes, Environment, and Health, National Jewish Health, Denver, Colorado, USA
10. Xue Y, Ding J, Wang J, Zhang S, Pan D. Two-phase SSU and SKAT in genetic association studies. J Genet 2020; 99:9. [PMID: 32089528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
The sum of squared score (SSU) test and the sequence kernel association test (SKAT) are two good alternative tests for genetic association studies in case-control data. Both SSU and SKAT are derived by assuming a dose-response model between the risk of disease and genotypes. However, in practice, the real genetic mode of inheritance is impossible to know. Thus, these two tests might lose power substantially, as shown in simulation results, when the genetic model is misspecified. Here, to make both tests suitable in broad situations, we propose two-phase SSU (tpSSU) and two-phase SKAT (tpSKAT), where the Hardy-Weinberg equilibrium test is adopted to choose the genetic model in the first phase and the SSU and SKAT are constructed corresponding to the selected genetic model in the second phase. We found that both tpSSU and tpSKAT outperformed the original SSU and SKAT in most of our simulation scenarios. By applying tpSSU and tpSKAT to the study of type 2 diabetes data, we successfully identified some genes that have direct effects on obesity. In addition, we detected the significant chromosomal region 10q21.22 in the GAW16 rheumatoid arthritis dataset, with P < 10^-6. These findings suggest that tpSSU and tpSKAT can be effective in identifying genetic variants for complex diseases in case-control association studies.
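The first phase described above relies on a Hardy-Weinberg equilibrium (HWE) test. The following is a minimal sketch of the standard one-degree-of-freedom chi-square HWE test for a biallelic marker; the rule that maps its result to a genetic model is not given in the abstract and is not reproduced here. The function name `hwe_chi_square` and the genotype counts are assumptions made for illustration.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg equilibrium (1 degree of freedom).

    n_aa, n_ab, n_bb: observed counts of the three genotypes.
    Returns the chi-square statistic and its p-value.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p <= 0.0 or q <= 0.0:
        raise ValueError("both alleles must be observed at least once")
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts.
print(hwe_chi_square(25, 50, 25))  # counts exactly at HWE proportions
print(hwe_chi_square(50, 0, 50))   # strong heterozygote deficit
```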
Affiliation(s)
- Yuan Xue
- School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, People's Republic of China.
11. Dimou NL, Pantavou KG, Braliou GG, Bagos PG. Multivariate Methods for Meta-Analysis of Genetic Association Studies. Methods Mol Biol 2019; 1793:157-182. [PMID: 29876897 DOI: 10.1007/978-1-4939-7868-7_11] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 01/02/2023]
Abstract
Multivariate meta-analysis of genetic association studies and genome-wide association studies has received remarkable attention as it improves the precision of the analysis. Here, we review, summarize, and present in a unified framework methods for multivariate meta-analysis of genetic association studies and genome-wide association studies. Starting with the statistical methods used for robust analysis and genetic model selection, we briefly present univariate methods for meta-analysis and then scrutinize multivariate methodologies. Multivariate models of meta-analysis for single gene-disease association studies, including models for haplotype association studies, multiple linked polymorphisms, and multiple outcomes, are discussed. The popular Mendelian randomization approach and special cases of meta-analysis addressing issues such as the assumption of the mode of inheritance, deviation from Hardy-Weinberg equilibrium, and gene-environment interactions are also presented. All available methods are illustrated with practical applications, and methodologies that could be developed in the future are discussed. Links to all available software implementing multivariate meta-analysis methods are also provided.
Affiliation(s)
- Niki L Dimou
- Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, Greece.
- Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, Ioannina, Greece
| | - Katerina G Pantavou
- Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, Greece
| | - Georgia G Braliou
- Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, Greece
| | - Pantelis G Bagos
- Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, Greece.
| |
Collapse
|
12
|
Zang Y, Fung WK, Cao S, Ng HKT, Zhang C. Robust tests for gene–environment interaction in case-control and case-only designs. Comput Stat Data Anal 2019. [DOI: 10.1016/j.csda.2018.08.014] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/27/2022]
|
13
|
Chen Z, Liu Q, Wang K. A genetic association test through combining two independent tests. Genomics 2018; 111:1152-1159. [PMID: 30009923 DOI: 10.1016/j.ygeno.2018.07.010] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 01/11/2018] [Revised: 06/25/2018] [Accepted: 07/11/2018] [Indexed: 12/21/2022]
Abstract
Gene- and pathway-based variant association tests are important tools in finding genetic variants that are associated with phenotypes of interest. Although some methods have been proposed in the literature, powerful and robust statistical tests are still desirable in this area. In this study, we propose a statistical test based on decomposing the genotype data into orthogonal parts from which powerful and robust independent p-value combination approaches can be utilized. Through a comprehensive simulation study, we compare the proposed test with some existing popular ones. Our simulation results show that the new test has great performance in terms of controlling type I error rate and statistical power. Real data applications are also conducted to illustrate the performance and usefulness of the proposed test.
Affiliation(s)
- Zhongxue Chen
  - Department of Epidemiology and Biostatistics, School of Public Health, Indiana University Bloomington, 1025 E. 7th Street, Bloomington, IN 47405, USA
- Qingzhong Liu
  - Department of Computer Science, Sam Houston State University, 1803 Avenue I, Huntsville, TX 77341, USA
- Kai Wang
  - Department of Biostatistics, College of Public Health, University of Iowa, 145 N. Riverside Drive, Iowa City, IA 52242, USA
|
14
|
Chen Z, Lu Y, Lin T, Liu Q, Wang K. Gene-based genetic association test with adaptive optimal weights. Genet Epidemiol 2017; 42:95-103. [DOI: 10.1002/gepi.22098] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 06/19/2017] [Accepted: 10/22/2017] [Indexed: 12/13/2022]
Affiliation(s)
- Zhongxue Chen
  - Department of Epidemiology and Biostatistics; School of Public Health; Indiana University Bloomington; Bloomington Indiana United States of America
- Yan Lu
  - Department of Mathematics and Statistics; University of New Mexico; Albuquerque New Mexico United States of America
- Tong Lin
  - The Key Laboratory of Machine Perception (Ministry of Education); School of EECS; Peking University; Beijing China
- Qingzhong Liu
  - Department of Computer Science; Sam Houston State University; Huntsville Texas United States of America
- Kai Wang
  - Department of Biostatistics; College of Public Health; University of Iowa; Iowa City Iowa United States of America
|
15
|
Gaye A, Davis SK. Genetic model misspecification in genetic association studies. BMC Res Notes 2017; 10:569. [PMID: 29115983 PMCID: PMC5678796 DOI: 10.1186/s13104-017-2911-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 06/01/2017] [Accepted: 11/01/2017] [Indexed: 02/08/2023] Open
Abstract
Objective: The underlying model of the genetic determinant of a trait is generally not known with certainty a priori. Hence, in genetic association studies, a dominant model might be erroneously modelled as additive, an error investigated previously. We explored this question, for candidate gene studies, by evaluating the sample size required to compensate for the misspecification and improve inference at the analysis stage. Power calculations were carried out with (1) the true dominant model and (2) the incorrect additive model. Empirical power, sample size and effect size were compared between scenarios (1) and (2). In each of the scenarios the estimates were evaluated for a rare (minor allele frequency < 0.01), low frequency (0.01 ≤ minor allele frequency < 0.05) and common (minor allele frequency ≥ 0.05) single nucleotide polymorphism. Results: The results confirm the detrimental effect of the misspecification error on power and effect size for any minor allele frequency. The implications of the error are not negligible; therefore, candidate gene studies should consider the more conservative sample size to compensate for the effect of error. When it is not possible to extend the sample size, methods that help mitigate the impact of the error should be systematically used.
Affiliation(s)
- Amadou Gaye
  - Metabolic, Cardiovascular and Inflammatory Disease Genomics Branch, Social Epidemiology Research Unit, National Institutes of Health, National Human Genome Research Institute, Bethesda, USA
- Sharon K Davis
  - Metabolic, Cardiovascular and Inflammatory Disease Genomics Branch, Social Epidemiology Research Unit, National Institutes of Health, National Human Genome Research Institute, Bethesda, USA
|
16
|
Chen Z, Han S, Wang K. Genetic association test based on principal component analysis. Stat Appl Genet Mol Biol 2017; 16:189-198. [DOI: 10.1515/sagmb-2016-0061] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 12/14/2022]
Abstract
Many gene- and pathway-based association tests have been proposed in the literature. Among them, the SKAT is widely used, especially for rare variants association studies. In this paper, we investigate the connection between SKAT and a principal component analysis. This investigation leads to a procedure that encompasses SKAT as a special case. Through simulation studies and real data applications, we compare the proposed method with some existing tests.
|
17
|
Angiotensin-Converting Enzyme Insertion/Deletion Polymorphism and Susceptibility to Osteoarthritis of the Knee: A Case-Control Study and Meta-Analysis. PLoS One 2016; 11:e0161754. [PMID: 27657933 PMCID: PMC5033346 DOI: 10.1371/journal.pone.0161754] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 02/02/2016] [Accepted: 08/11/2016] [Indexed: 02/07/2023] Open
Abstract
Background: Studies of angiotensin-converting enzyme insertion/deletion (ACE I/D) polymorphisms and the risks of knee osteoarthritis (OA) have yielded conflicting results. Objective: To determine the association between ACE I/D and knee OA, we conducted a combined case-control study and meta-analysis. Methods: For the case-control study, 447 knee OA cases and 423 healthy controls were recruited between March 2010 and July 2011. Knee OA cases were defined using the Kellgren-Lawrence grading system, and the ACE I/D genotype was determined using a standard polymerase chain reaction. The association between ACE I/D and knee OA was detected using allele, genotype, dominant, and recessive models. For the meta-analysis, PubMed and Embase databases were systematically searched for prospective observational studies published up until August 2015. Studies of ACE I/D and knee OA with sufficient data were selected. Pooled results were expressed as odds ratios (ORs) with corresponding 95% confidence intervals (CI) for the D versus I allele with regard to knee OA risk. Results: We found no significant association between the D allele and knee OA [OR: 1.09 (95% CI: 0.76–1.89)] in the present case-control study, and the results of other genetic models were also nonsignificant. Five current studies were included, and there were a total of six study populations after including our case-control study (1165 cases and 1029 controls). In the meta-analysis, the allele model also yielded nonsignificant results [OR: 1.37 (95% CI: 0.95–1.99)] and a high heterogeneity (I²: 87.2%). Conclusions: The association between ACE I/D and knee OA tended to yield negative results. High heterogeneity suggests a complex, multifactorial mechanism, and an epistasis analysis of ACE I/D and knee OA should therefore be conducted.
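To make the pooled odds ratio and the I² heterogeneity figure in this abstract concrete, here is a minimal sketch of inverse-variance pooling on the log-odds scale. It assumes a fixed-effect model and a normal-approximation 95% CI; the abstract does not say which pooling model the authors used, and the function names `se_from_ci` and `pooled_log_or_and_i2` are hypothetical.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(OR) recovered from a reported 95% CI (assumes a Wald interval)."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def pooled_log_or_and_i2(log_ors, ses):
    """Fixed-effect inverse-variance pooling with Cochran's Q and Higgins' I^2 (in percent).

    This is an illustrative assumption, not the authors' stated procedure.
    """
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_ors))
    df = len(log_ors) - 1
    # I^2 is truncated at zero; it is defined as 0 when Q is 0.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2
```

The pooled odds ratio is `math.exp(pooled)`. An I² near 87%, as reported above, means most of the observed variation between study estimates exceeds what sampling error alone would be expected to produce.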
|
18
|
Zheng G, Zhang W, Xu J, Yuan A, Li Q, Gastwirth JL. Genetic risks and genetic model specification. J Theor Biol 2016; 403:68-74. [PMID: 27181372 DOI: 10.1016/j.jtbi.2016.05.016] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 12/01/2015] [Revised: 04/12/2016] [Accepted: 05/06/2016] [Indexed: 11/17/2022]
Abstract
Genetic risks and genetic models are often used in design and analysis of genetic epidemiology studies. A genetic model is defined in terms of two genetic risk measures: genotype relative risk and odds ratio. The impacts of choosing a risk measure on the resulting genetic models are studied in the power to detect association and deviation from Hardy-Weinberg equilibrium in cases using genetic relative risk. Extensive simulations demonstrate that the power of a study to detect associations using odds ratio is lower than that using relative risk with the same value when other parameters are fixed. When the Hardy-Weinberg equilibrium holds in the general population, the genetic model can be inferred by the deviation from Hardy-Weinberg equilibrium in only cases. Furthermore, it is more efficient than that based on the deviation from Hardy-Weinberg equilibrium in all cases and controls.
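The abstract refers to deviation from Hardy-Weinberg equilibrium measured in cases only, without naming a statistic. As a rough illustration, the sketch below computes the familiar 1-degree-of-freedom goodness-of-fit chi-square and the disequilibrium coefficient from case genotype counts. The function names are hypothetical, and how the authors map the deviation onto a genetic model is not stated in the abstract.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Goodness-of-fit chi-square for HWE from genotype counts (aa, ab, bb).

    Assumes both alleles are observed; a monomorphic sample would divide by zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_bb + n_ab) / (2 * n)  # estimated frequency of allele b
    expected = (n * (1 - p) ** 2, 2 * n * p * (1 - p), n * p ** 2)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def hwd_coefficient(n_aa, n_ab, n_bb):
    """Disequilibrium coefficient: observed bb frequency minus its HWE expectation p^2."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_bb + n_ab) / (2 * n)
    return n_bb / n - p ** 2
```

Applied to case counts only, a coefficient near zero is consistent with equilibrium among cases, while its sign shows whether homozygotes are over- or under-represented relative to the HWE expectation.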
Affiliation(s)
- Gang Zheng
  - National Heart, Lung and Blood Institute, Bethesda, MD, USA
- Wei Zhang
  - Key Laboratory of Systems and Control, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- Ao Yuan
  - Georgetown University, Washington DC, USA
- Qizhai Li
  - Key Laboratory of Systems and Control, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
|
19
|
Pan DD, Li ZB, Li QZ, Kam Fung W. A Novel Powerful Joint Analysis with Data Fusion in Two-stage Case–Control Genome-wide Association Studies. COMMUN STAT-SIMUL C 2016. [DOI: 10.1080/03610918.2014.901360] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
|
20
|
Zhang W, Zhang Z, Li X, Li Q. Fitting Proportional Odds Model to Case-Control data with Incorporating Hardy-Weinberg Equilibrium. Sci Rep 2015; 5:17286. [PMID: 26607176 PMCID: PMC4660314 DOI: 10.1038/srep17286] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 07/14/2015] [Accepted: 10/28/2015] [Indexed: 01/07/2023] Open
Abstract
Genetic association studies have been proved to be an efficient tool to reveal the aetiology of many human complex diseases and traits. When the phenotype is binary, the logistic regression model is commonly employed to evaluate the association strength of the genetic variants predispose to human diseases because the maximum likelihood estimator of the odds ratio based on case-control data is equivalent to that from the same model by taking the data as being arisen prospectively. This equivalence does not hold for the proportional odds model and using it to analyze the case-control data directly often results in a substantial bias. Through putting a parameter of the minor allele frequency in the modified likelihood function under the condition that the Hardy-Weinberg equilibrium law holds within controls, a consistent estimator is obtained. On the basis of it, we construct a score test statistic to test whether the genetic variant is associated with the diseases. Simulation studies show that the proposed estimator has smaller mean squared error than the existing methods when the genetic effect size is away from zero and the proposed test statistic has a good control of type I error rate and is more powerful than the existing procedures. Application to 45 single nucleotide polymorphisms located in the region of TRAF1-C5 genes for the association with four-level anticyclic citrullinated protein antibody from Genetic Analysis Workshop 16 further demonstrates its performance.
Affiliation(s)
- Wei Zhang
  - Key Laboratory of Systems Control, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- Zehui Zhang
  - Central China Normal University, Wuhan, China
- Qizhai Li
  - Key Laboratory of Systems Control, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
|
21
|
Robust Association Tests for the Replication of Genome-Wide Association Studies. BIOMED RESEARCH INTERNATIONAL 2015; 2015:461593. [PMID: 26345547 PMCID: PMC4539975 DOI: 10.1155/2015/461593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2014] [Revised: 02/14/2015] [Accepted: 02/14/2015] [Indexed: 11/18/2022]
Abstract
In genome-wide association study (GWAS), robust genetic association tests such as maximum of three CATTs (MAX3), each corresponding to recessive, additive, and dominant genetic models, the minimum p value of Pearson's Chi-square test with 2 degrees of freedom, and CATT based on additive genetic model (MIN2), genetic model selection (GMS), and genetic model exclusion (GME) methods have been shown to provide better power performance under wide range of underlying genetic models. In this paper, we demonstrate how these robust tests can be applied to the replication study of GWAS and how the overall statistical significance can be evaluated using the combined test formed by p values of the discovery and replication studies.
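The MAX3 statistic named in this abstract can be sketched as follows. This is a minimal illustration built on the standard Cochran-Armitage trend test (CATT) formula for a 2x3 genotype table, with heterozygote scores 0, 0.5 and 1 standing for the recessive, additive and dominant models. The function names `catt_z` and `max3` are hypothetical, and only the statistic is computed: because the three CATTs are correlated, a p-value would need their joint distribution or a permutation procedure, neither of which is shown here.

```python
import math

def catt_z(cases, controls, x1):
    """Cochran-Armitage trend statistic for a 2x3 genotype table.

    cases, controls: counts for genotypes carrying 0, 1, 2 copies of the risk allele.
    x1: score assigned to the heterozygote (genotype scores are 0, x1, 1).
    """
    scores = (0.0, x1, 1.0)
    R, S = sum(cases), sum(controls)
    N = R + S
    n = [r + s for r, s in zip(cases, controls)]
    num = sum(x * (S * r - R * s) for x, r, s in zip(scores, cases, controls))
    var = R * S * (N * sum(x * x * ni for x, ni in zip(scores, n))
                   - sum(x * ni for x, ni in zip(scores, n)) ** 2)
    # var is zero if every subject falls in one score group; no guard in this sketch.
    return math.sqrt(N) * num / math.sqrt(var)

def max3(cases, controls):
    """Maximum absolute CATT over recessive (0), additive (0.5) and dominant (1) scores."""
    return max(abs(catt_z(cases, controls, x1)) for x1 in (0.0, 0.5, 1.0))
```

With the dominant score, the squared statistic coincides with the Pearson chi-square for the 2x2 table of carriers versus non-carriers, which gives a quick sanity check on the formula.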
|
22
|
Qu L. Combining dependent F-tests for robust association of quantitative traits under genetic model uncertainty. Stat Appl Genet Mol Biol 2014; 13:123-39. [PMID: 24603842 DOI: 10.1515/sagmb-2013-0001] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
Abstract
In association mapping of quantitative traits, the F-test based on an assumed genetic model is a basic statistical tool for testing association of each candidate locus with the trait of interest. However, the true underlying genetic model is often unknown, and using an incorrect model may cause serious loss of power. For case-control studies, it is known that the combination of several tests that are optimal for different models is robust to model misspecification. In this paper, we extend the test combination approach to quantitative trait association. We first derive the exact correlations among transformed test statistics and discuss interesting special cases. We then propose and evaluate a multivariate normality based approximation to the joint distribution of test statistics, such that the marginal distributions and pairwise correlations among test statistics are accounted for. Through simulations, we show that the sizes of the resulting approximate combined tests are accurate for practical purposes under a variety of situations. We find that the combination of the tests from the additive model and the genotypic model performs well, because it demonstrates both robustness to incorrect models and satisfactory power. A mouse lipoprotein data set is used to demonstrate the method.
|
24
|
Kang G, Bi W, Zhao Y, Zhang JF, Yang JJ, Xu H, Loh ML, Hunger SP, Relling MV, Pounds S, Cheng C. A new system identification approach to identify genetic variants in sequencing studies for a binary phenotype. Hum Hered 2014; 78:104-16. [PMID: 25096228 DOI: 10.1159/000363660] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 12/13/2013] [Accepted: 05/16/2014] [Indexed: 12/24/2022] Open
Abstract
We propose in this paper a set-valued (SV) system model, which is a generalized form of logistic (LG) and Probit (Probit) regression, to be considered as a method for discovering genetic variants, especially rare genetic variants in next-generation sequencing studies, for a binary phenotype. We propose a new SV system identification method to estimate all underlying key system parameters for the Probit model and compare it with the LG model in the setting of genetic association studies. Across an extensive series of simulation studies, the Probit method maintained type I error control and had similar or greater power than the LG method, which is robust to different distributions of noise: logistic, normal, or t distributions. Additionally, the Probit association parameter estimate was 2.7-46.8-fold less variable than the LG log-odds ratio association parameter estimate. Less variability in the association parameter estimate translates to greater power and robustness across the spectrum of minor allele frequencies (MAFs), and these advantages are the most pronounced for rare variants. For instance, in a simulation that generated data from an additive logistic model with an odds ratio of 7.4 for a rare single nucleotide polymorphism with a MAF of 0.005 and a sample size of 2,300, the Probit method had 60% power whereas the LG method had 25% power at the α = 10⁻⁶ level. Consistent with these simulation results, the set of variants identified by the LG method was a subset of those identified by the Probit method in two example analyses. Thus, we suggest the Probit method may be a competitive alternative to the LG method in genetic association studies such as candidate gene, genome-wide, or next-generation sequencing studies for a binary phenotype.
Affiliation(s)
- Guolian Kang
  - Department of Biostatistics and Pharmaceutical Sciences, St. Jude Children's Research Hospital, Memphis, Tenn., USA
|
25
|
Chen Z. A new association test based on disease allele selection for case-control genome-wide association studies. BMC Genomics 2014; 15:358. [PMID: 24886381 PMCID: PMC4059871 DOI: 10.1186/1471-2164-15-358] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 02/05/2014] [Accepted: 05/06/2014] [Indexed: 12/20/2022] Open
Abstract
Background: Current robust association tests for case–control genome-wide association study (GWAS) data are mainly based on the assumption of some specific genetic models. Due to the richness of the genetic models, this assumption may not be appropriate. Therefore, robust but powerful association approaches are desirable. Results: In this paper, we propose a new approach to testing for the association between the genotype and phenotype for case–control GWAS. This method assumes a generalized genetic model and is based on the selected disease allele to obtain a p-value from the more powerful one-sided test. Through a comprehensive simulation study we assess the performance of the new test by comparing it with existing methods. Some real data applications are also used to illustrate the use of the proposed test. Conclusions: Based on the simulation results and real data application, the proposed test is powerful and robust.
Affiliation(s)
- Zhongxue Chen
  - Department of Epidemiology and Biostatistics, School of Public Health, Indiana University Bloomington, 1025 E. 7th Street, PH C104, Bloomington, IN 47405, USA
|
26
|
Chen Z, Nadarajah S. On the optimally weighted z-test for combining probabilities from independent studies. Comput Stat Data Anal 2014. [DOI: 10.1016/j.csda.2013.09.005] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 12/22/2022]
|
27
|
Lee D, Bacanu SA. Association testing strategy for data from dense marker panels. PLoS One 2013; 8:e80540. [PMID: 24265830 PMCID: PMC3827222 DOI: 10.1371/journal.pone.0080540] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 05/27/2013] [Accepted: 10/14/2013] [Indexed: 01/31/2023] Open
Abstract
Genome wide association studies have been usually analyzed in a univariate manner. The commonly used univariate tests have one degree of freedom and assume an additive mode of inheritance. The experiment-wise significance of these univariate statistics is obtained by adjusting for multiple testing. Next generation sequencing studies, which assay 10-20 million variants, are beginning to come online. For these studies, the strategy of additive univariate testing and multiple testing adjustment is likely to result in a loss of power due to (1) the substantial multiple testing burden and (2) the possibility of a non-additive causal mode of inheritance. To reduce the power loss we propose: a new method (1) to summarize in a single statistic the strength of the association signals coming from all not-very-rare variants in a linkage disequilibrium block and (2) to incorporate, in any linkage disequilibrium block statistic, the strength of the association signals under multiple modes of inheritance. The proposed linkage disequilibrium block test consists of the sum of squares of nominally significant univariate statistics. We compare the performance of this method to the performance of existing linkage disequilibrium block/gene-based methods. Simulations show that (1) extending methods to combine testing for multiple modes of inheritance leads to substantial power gains, especially for a recessive mode of inheritance, and (2) the proposed method has a good overall performance. Based on simulation results, we provide practical advice on choosing suitable methods for applied analyses.
Affiliation(s)
- Donghyung Lee
  - Department of Psychiatry, Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, Virginia, United States of America
- Silviu-Alin Bacanu
  - Department of Psychiatry, Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, Virginia, United States of America
|
29
|
Chen Z, Huang H, Ng HKT. Testing for association in case-control genome-wide association studies with shared controls. Stat Methods Med Res 2013; 25:954-67. [DOI: 10.1177/0962280212474061] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Indexed: 11/15/2022]
Abstract
The statistical analysis of genome-wide association studies (GWASs) with multiple diseases and shared controls (SCs) is discussed. The usual method for analyzing data from these studies is to compare each individual disease with either the SCs or the pooled controls which include other diseases. We observed that applying individual association tests can be problematic because these tests may suffer from power loss in detecting significant associations between diseases and single-nucleotide polymorphism or copy number variant. We propose here a two-stage procedure wherein we first apply an overall chi-square test for multiple diseases with SCs; if the overall test is rejected, then individual tests using the chi-square partition method will be applied to each disease against SCs. A real GWAS data set with SCs and a Monte Carlo simulation study are used to demonstrate that the proposed method is more effective and preferable than other existing methods for analyzing data from GWASs with multiple diseases and SCs.
Affiliation(s)
- Zhongxue Chen
  - Department of Epidemiology and Biostatistics, School of Public Health, Indiana University Bloomington, Bloomington, IN, USA
- Hanwen Huang
  - Center for Clinical and Translational Sciences, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Hon Keung Tony Ng
  - Department of Statistical Science, Southern Methodist University, Dallas, TX, USA
|
30
|
Xu J, Zheng G, Yuan A. Case-Control Genome-wide Joint Association Study Using Semiparametric Empirical Model and Approximate Bayes Factor. J STAT COMPUT SIM 2013; 83:1191-1209. [PMID: 24532860 DOI: 10.1080/00949655.2011.654119] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/14/2022]
Abstract
We propose a semiparametric approach for the analysis of case-control genome-wide association study. Parametric components are used to model both the conditional distribution of the case status given the covariates and the distribution of genotype counts, whereas the distribution of the covariates are modeled nonparametrically. This yields a direct and joint modeling of the case status, covariates and genotype counts, and gives better understanding of the disease mechanism and results in more reliable conclusions. Side information, such as the disease prevalence, can be conveniently incorporated into the model by empirical likelihood approach and leads to more efficient estimates and powerful test in the detection of disease-associated SNPs. Profiling is used to eliminate a nuisance nonparametric component, and the resulting profile empirical likelihood estimates are shown to be consistent and asymptotically normal. For the hypothesis test on disease association, we apply the approximate Bayes factor (ABF) which is computationally simple and most desirable in genome-wide association studies where hundreds of thousands to a million genetic markers are tested. We treat the approximate Bayes factor as a hybrid Bayes factor which replaces the full data by the maximum likelihood estimates of the parameters of interest in the full model and derive it under a general setting. The deviation from Hardy-Weinberg Equilibrium (HWE) is also taken into account and the ABF for HWE using cases is shown to provide evidence of association between a disease and a genetic marker. Simulation studies and an application are further provided to illustrate the utility of the proposed methodology.
Affiliation(s)
- Jinfeng Xu
  - Department of Statistics and Applied Probability, National University of Singapore, Singapore 117546
- Gang Zheng
  - Office of Biostatistics Research, DPPS, National Heart, Lung and Blood Institute, 6701 Rockledge Drive, Bethesda, MD 20892, USA
- Ao Yuan
  - National Human Genome Center, Howard University, 2216 Sixth Street N.W., Washington, DC 20059
|
31
|
Yu Z, Gillen D, Li CF, Demetriou M. Incorporating parental information into family-based association tests. Biostatistics 2012; 14:556-72. [PMID: 23266418 DOI: 10.1093/biostatistics/kxs048] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 12/21/2022] Open
Abstract
Assumptions regarding the true underlying genetic model, or mode of inheritance, are necessary when quantifying genetic associations with disease phenotypes. Here we propose new methods to ascertain the underlying genetic model from parental data in family-based association studies. Specifically, for parental mating-type data, we propose a novel statistic to test whether the underlying genetic model is additive, dominant, or recessive; for parental genotype-phenotype data, we propose three strategies to determine the true mode of inheritance. We illustrate how to incorporate the information gleaned from these strategies into family-based association tests. Because family-based association tests are conducted conditional on parental genotypes, the type I error rate of these procedures is not inflated by the information learned from parental data. This result holds even if such information is weak or when the assumption of Hardy-Weinberg equilibrium is violated. Our simulations demonstrate that incorporating parental data into family-based association tests can improve power under common inheritance models. The application of our proposed methods to a candidate-gene study of type 1 diabetes successfully detects a recessive effect in MGAT5 that would otherwise be missed by conventional family-based association tests.
Affiliation(s)
- Zhaoxia Yu
  - Department of Statistics, University of California at Irvine, Irvine, CA 92697, USA
|
32
|
Chen Z, Huang H, Ng HKT. Design and analysis of multiple diseases genome-wide association studies without controls. Gene 2012; 510:87-92. [PMID: 22951808 PMCID: PMC3463729 DOI: 10.1016/j.gene.2012.07.089] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 01/21/2012] [Revised: 07/19/2012] [Accepted: 07/30/2012] [Indexed: 12/31/2022]
Abstract
In genome-wide association studies (GWAS), multiple diseases with shared controls is one of the case-control study designs. If data obtained from these studies are appropriately analyzed, this design can have several advantages such as improving statistical power in detecting associations and reducing the time and cost in the data collection process. In this paper, we propose a study design for GWAS which involves multiple diseases but without controls. We also propose corresponding statistical data analysis strategy for GWAS with multiple diseases but no controls. Through a simulation study, we show that the statistical association test with the proposed study design is more powerful than the test with single disease sharing common controls, and it has comparable power to the overall test based on the whole dataset including the controls. We also apply the proposed method to a real GWAS dataset to illustrate the methodologies and the advantages of the proposed design. Some possible limitations of this study design and testing method and their solutions are also discussed. Our findings indicate that the proposed study design and statistical analysis strategy could be more efficient than the usual case-control GWAS as well as those with shared controls.
Affiliation(s)
- Zhongxue Chen
- Department of Epidemiology and Biostatistics, School of Public Health, Indiana University Bloomington, 1025 E. 7th Street, Bloomington, IN 47405-7109, USA.
|
33
|
Ro M, Park J, Nam M, Bang HJ, Yang J, Choi KS, Kim SK, Chung JH, Kwack K. Association between peroxisomal biogenesis factor 7 and autism spectrum disorders in a Korean population. J Child Neurol 2012; 27:1270-5. [PMID: 22378669 DOI: 10.1177/0883073811435507] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 12/25/2022]
Abstract
Autism spectrum disorder is a neurodevelopmental disorder characterized by deficits in social communication, impaired reciprocal social interaction, and repetitive patterns of behaviors or interests. Although the cause of autism spectrum disorder remains elusive, the present study identified peroxisomal biogenesis factor 7 (PEX7) as a gene associated with autism spectrum disorder, and this association was examined in a Korean population. PEX7 encodes a cytosolic receptor for peroxisome targeting signal 2 of peroxisomal matrix enzymes that are targeted to and translocated into the peroxisome. PEX7 defects are associated with rhizomelic chondrodysplasia punctata type 1 and Refsum disease. Mutations in PEX7 are related to a variety of mild to severe clinical symptoms, including mental retardation. The analysis of 9 intronic single nucleotide polymorphisms in 214 patients with autism spectrum disorder and 258 controls revealed the association of 2 single nucleotide polymorphisms and 1 haplotype with autism spectrum disorder (P < .05).
Affiliation(s)
- MyungJa Ro
- Department of Biomedical Science, College of Life Science, CHA University, Seongnam, Korea
|
34
|
Xu J, Yuan A, Zheng G. Bayes factor based on the trend test incorporating Hardy-Weinberg disequilibrium: more power to detect genetic association. Ann Hum Genet 2012; 76:301-11. [PMID: 22607017 DOI: 10.1111/j.1469-1809.2012.00714.x] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 01/04/2023]
Abstract
In the analysis of case-control genetic association, the trend test and Pearson's test are the two most commonly used tests. In genome-wide association studies (GWAS), Bayes factor (BF) is a useful tool to support significant P-values, and a better measure than P-value when results are compared across studies with different sample sizes. When reporting the P-value of the trend test, we propose a BF directly based on the trend test. To improve the power to detect association under recessive or dominant genetic models, we propose a BF based on the trend test and incorporating Hardy-Weinberg disequilibrium in cases. When the true model is unknown, or both the trend test and Pearson's test or other robust tests are applied in genome-wide scans, we propose a joint BF, combining the previous two BFs. All three BFs studied in this paper have closed forms and are easy to compute without integrations, so they can be reported along with P-values, especially in GWAS. We discuss how to use each of them and how to specify priors. Simulation studies and applications to three GWAS are provided to illustrate their usefulness to detect nonadditive gene susceptibility in practice.
Affiliation(s)
- Jinfeng Xu
- Department of Statistics and Applied Probability, National University of Singapore, Singapore
|
35
|
Pyun JA, Kim S, Cha DH, Ko JJ, Kwack K. Epistasis between the HSD17B4 and TG polymorphisms is associated with premature ovarian failure. Fertil Steril 2012; 97:968-73. [PMID: 22265031 DOI: 10.1016/j.fertnstert.2011.12.044] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 11/03/2011] [Revised: 12/19/2011] [Accepted: 12/21/2011] [Indexed: 01/22/2023]
Abstract
OBJECTIVE: To identify whether epistasis between TG and HSD17B4, and whether polymorphisms in HSD17B4, are associated with premature ovarian failure (POF). DESIGN: Case-control genetic association study. SETTING: Research laboratory of a university. PATIENT(S): Female patients with POF (98) and controls (218) of Korean ethnicity participated in this study. INTERVENTION(S): None. MAIN OUTCOME MEASURE(S): Genotype distribution, haplotype (HT) inference, and gene-gene interaction. RESULT(S): Distribution of one haplotype (A-G-A-A-G-G) on the HSD17B4 gene was significantly different between the POF group and the control group in a dominant model. In addition, the combined effect of the single nucleotide polymorphisms (SNPs) HSD17B4 rs28943592 and TG rs2076740 was significantly associated with POF (odds ratio = 7.74, 95% confidence interval = 1.67-35.94), although a significant association was not observed in the single SNP model. CONCLUSION(S): A haplotype in the HSD17B4 gene was identified that was significantly associated with resistance to POF. In addition, epistasis between two missense SNPs (rs28943592, rs2076740) located in HSD17B4 and TG was significantly associated with susceptibility to POF.
Affiliation(s)
- Jung-A Pyun
- Department of Biomedical Science, College of Life Science, CHA University, Seongnam, South Korea
|
36
|
Gao G, Kang G, Wang J, Chen W, Qin H, Jiang B, Li Q, Sun C, Liu N, Archer KJ, Allison DB. A generalized sequential Bonferroni procedure using smoothed weights for genome-wide association studies incorporating information on Hardy-Weinberg disequilibrium among cases. Hum Hered 2011; 73:1-13. [PMID: 22212195 DOI: 10.1159/000332916] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 06/15/2011] [Accepted: 09/07/2011] [Indexed: 01/14/2023]
Abstract
BACKGROUND/OBJECTIVES: For genome-wide association studies (GWAS) with case-control designs, one of the most widely used association tests is the Cochran-Armitage (CA) trend test assuming an additive mode of inheritance. The CA trend test often has higher power than other association tests under additive and multiplicative disease models. However, it can have very low power under a recessive disease model in GWAS. Although tests (such as MAX3) robust to different genetic models have been developed, they often have relatively lower power than the CA trend test under additive and multiplicative models. The goal of this study is to propose an efficient method that not only has higher power than the CA trend test under dominant and recessive models but also maintains the power of the CA trend test under additive and multiplicative models. METHODS: We employed the generalized sequential Bonferroni (GSB) procedure of Holm to incorporate information from a Hardy-Weinberg disequilibrium (HWD) test into the CA trend test based on estimating weights from the p values of the HWD test. We proposed to smooth the weights to reduce possible noise. RESULTS AND CONCLUSIONS: Results from extensive simulation studies showed that the proposed GSB procedure can achieve the goal described above.
Affiliation(s)
- Guimin Gao
- Department of Biostatistics, Virginia Commonwealth University, Richmond, Va. 23298-0032, USA.
|
37
|
Chen Z, Ng HKT. A robust method for testing association in genome-wide association studies. Hum Hered 2011; 73:26-34. [PMID: 22212363 DOI: 10.1159/000334719] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Received: 08/23/2011] [Accepted: 10/25/2011] [Indexed: 01/30/2023]
Abstract
In genetic association studies, due to the varying underlying genetic models, no single statistical test can be the most powerful test under all situations. Current studies show that if the underlying genetic models are known, trend-based tests, which outperform the classical Pearson χ² test, can be constructed. However, when the underlying genetic models are unknown, the χ² test is usually more robust than trend-based tests. In this paper, we propose a new association test based on a generalized genetic model, namely the generalized order-restricted relative risks model. Through a Monte Carlo simulation study, we show that the proposed association test is generally more powerful than the χ² test, and more robust than those trend-based tests. The proposed methodologies are also illustrated by some real SNP datasets.
Affiliation(s)
- Zhongxue Chen
- Biostatistics Epidemiology Research Design Core, Center for Clinical and Translational Sciences, The University of Texas Health Science Center at Houston, Houston, Tex., USA
|
38
|
Yu Z, Deng L. Pseudosibship methods in the case-parents design. Stat Med 2011; 30:3236-51. [PMID: 21953439 DOI: 10.1002/sim.4397] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 03/02/2011] [Revised: 06/06/2011] [Accepted: 08/10/2011] [Indexed: 01/21/2023]
Abstract
Recent evidence suggests that complex traits are likely determined by multiple loci, each of which contributes a weak to moderate individual effect. Although extensive literature exists on multilocus analysis of unrelated subjects, there are relatively fewer strategies for jointly analyzing multiple loci using family data. Here we address this issue by evaluating two pseudosibship methods: the 1:1 matching, which matches each affected offspring to the pseudosibling formed by the alleles not transmitted to the affected offspring, and the exhaustive matching, which matches each affected offspring to the pseudosiblings formed by all the other possible combinations of parental alleles. We prove that the two matching strategies use exactly and approximately the same amount of information from data under additive and multiplicative genetic models, respectively. Using numerical calculations under a variety of models and testing assumptions, we show that compared with the exhaustive matching, the 1:1 matching has comparable asymptotic power in detecting multiplicative/additive effects in single-locus analysis and main effects in multilocus analysis, and it allows association testing of multiple linked loci. These results pave the way for many existing multilocus analysis methods developed for the case-control (or matched case-control) design to be applied to case-parents data with minor modifications. As an example, with the 1:1 matching, we applied an L1 regularized regression to a Crohn's disease dataset. Using the multiple loci selected in our approach, we obtained an order-of-magnitude decrease in p-value and an 18.9% increase in prediction accuracy when compared with using the most significant individual locus.
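As a rough single-locus illustration of the two matching schemes described in this abstract, the sketch below builds the 1:1 pseudosibling from the untransmitted parental alleles and the exhaustive set from the other possible combinations of parental alleles. The function names and the tuple encoding of genotypes are assumptions made for illustration only; this is not the authors' implementation, and multilocus phase is ignored.

```python
from itertools import product


def untransmitted_pseudosib(father, mother, child):
    """1:1 matching: genotype formed by the two parental alleles
    that were NOT transmitted to the affected child."""
    pool = list(father) + list(mother)
    for allele in child:
        pool.remove(allele)  # raises ValueError if a child allele is not parental
    return tuple(sorted(pool))


def exhaustive_pseudosibs(father, mother, child):
    """Exhaustive matching: the other three of the four possible
    father-allele x mother-allele combinations."""
    combos = [tuple(sorted(pair)) for pair in product(father, mother)]
    combos.remove(tuple(sorted(child)))  # drop the observed child genotype once
    return combos


if __name__ == "__main__":
    # Hypothetical trio: heterozygous father, homozygous mother, AA child.
    father, mother, child = ("A", "a"), ("A", "A"), ("A", "A")
    print(untransmitted_pseudosib(father, mother, child))
    print(exhaustive_pseudosibs(father, mother, child))
```

Each affected child paired with its pseudosibling(s) can then be treated as a matched set, which is what would let matched case-control methods be reused on case-parents data.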
Affiliation(s)
- Zhaoxia Yu
- Department of Statistics, University of California, Irvine, CA 92697, USA.
|
39
|
Chen Z. A new association test based on Chi-square partition for case-control GWA studies. Genet Epidemiol 2011; 35:658-63. [PMID: 22009790 DOI: 10.1002/gepi.20615] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Received: 04/29/2011] [Revised: 06/13/2011] [Accepted: 06/15/2011] [Indexed: 01/03/2023]
Abstract
In case-control genetic association studies, Pearson's Chi-square test, a robust procedure, is commonly used for testing association between disease status and genetic markers. However, this test does not take into account the possible trend of relative risks due to genotype. In contrast, although the Cochran-Armitage trend test with optimal scores is more powerful, it is usually difficult to assign the correct scores in advance since the true genetic model is rarely known in practice. If the unknown underlying genetic model is misspecified, the trend test may lose power dramatically. Therefore, it is desirable to find a powerful yet robust statistical test for genome-wide association studies. In this paper, we propose a new test based on the partition of Pearson's Chi-square test statistic. The new test utilizes the information of the monotonic (increasing or decreasing) trend of relative risks and therefore is in general more powerful than the Chi-square test; furthermore, it preserves the robustness. Using simulated and real single nucleotide polymorphism data, we compare the performance of the proposed test with existing methods.
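To make the contrast between the two classical tests concrete, here is a minimal sketch that computes Pearson's Chi-square test (2 df) and the Cochran-Armitage trend test on a 2x3 genotype table. It does not implement the partition-based test proposed in the paper. The function names, the genotype ordering (aa, Aa, AA), and the example counts are assumptions, and every genotype column is assumed to be non-empty.

```python
import math


def pearson_chi2(cases, controls):
    """Pearson chi-square statistic and p-value (2 df) for a 2x3 table."""
    r, s = sum(cases), sum(controls)
    n = r + s
    stat = 0.0
    for ri, si in zip(cases, controls):
        col = ri + si
        for observed, row_total in ((ri, r), (si, s)):
            expected = row_total * col / n
            stat += (observed - expected) ** 2 / expected
    return stat, math.exp(-stat / 2.0)  # chi-square survival function for 2 df


def trend_test(cases, controls, scores=(0.0, 1.0, 2.0)):
    """Cochran-Armitage trend statistic z and two-sided normal p-value."""
    r, s = sum(cases), sum(controls)
    n = r + s
    cols = [ri + si for ri, si in zip(cases, controls)]
    u = sum(x * (s * ri - r * si) for x, ri, si in zip(scores, cases, controls))
    sx = sum(x * c for x, c in zip(scores, cols))
    sxx = sum(x * x * c for x, c in zip(scores, cols))
    var = r * s * (n * sxx - sx * sx) / n  # null variance of u
    z = u / math.sqrt(var)
    return z, math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    cases, controls = (20, 50, 30), (30, 50, 20)  # made-up counts
    print(pearson_chi2(cases, controls))
    print(trend_test(cases, controls))                   # additive scores
    print(trend_test(cases, controls, (0.0, 0.0, 1.0)))  # recessive scores
```

Changing `scores` changes the trend statistic for the same table, which is the sensitivity to score misspecification that the abstract describes.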
Affiliation(s)
- Zhongxue Chen
- Biostatistics/Epidemiology/Research Design Core, Center for Clinical and Translational Sciences, The University of Texas Health Science Center at Houston, Houston, Texas 77030, USA.
|
40
|
Robust association tests under different genetic models, allowing for binary or quantitative traits and covariates. Behav Genet 2011; 41:768-75. [PMID: 21305351 PMCID: PMC3162964 DOI: 10.1007/s10519-011-9450-9] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.2] [Received: 07/17/2010] [Accepted: 01/18/2011] [Indexed: 12/02/2022]
Abstract
The association of genetic variants with outcomes is usually assessed under an additive model, for example by the trend test. However, misspecification of the genetic model will lead to a reduction in power. More robust tests for association might therefore be preferred. A useful approach is to consider the maximum of the three test statistics under additive, dominant and recessive models (MAX3). The p-value however has to be adjusted to maintain the type I error rate. Previous studies and software on robust association tests have focused on binary traits without covariates. In this study we developed an analytic approach to robust association tests using MAX3, allowing for quantitative or binary traits as well as covariates. The p-values from our theoretical calculations match very well with those from a bootstrap resampling procedure. The methodology is implemented in the R package RobustSNP which is able to handle both small-scale studies and GWAS. The package and documentation are available at http://sites.google.com/site/honcheongso/software/robustsnp.
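The MAX3 idea can be sketched for the simplest setting, a binary trait without covariates. The code below takes the maximum absolute trend statistic over recessive, additive, and dominant scores and adjusts the p-value by permuting case/control labels. Permutation is a stand-in chosen for brevity: it is neither the analytic approach developed in the paper nor the RobustSNP interface, and all function names here are hypothetical.

```python
import math
import random

# Genotype scores (aa, Aa, AA) for the three models named in the abstract.
SCORES = {
    "recessive": (0.0, 0.0, 1.0),
    "additive": (0.0, 0.5, 1.0),
    "dominant": (0.0, 1.0, 1.0),
}


def trend_z(cases, controls, scores):
    """Cochran-Armitage trend statistic for a 2x3 genotype table."""
    r, s = sum(cases), sum(controls)
    n = r + s
    cols = [ri + si for ri, si in zip(cases, controls)]
    u = sum(x * (s * ri - r * si) for x, ri, si in zip(scores, cases, controls))
    sx = sum(x * c for x, c in zip(scores, cols))
    sxx = sum(x * x * c for x, c in zip(scores, cols))
    var = r * s * (n * sxx - sx * sx) / n
    return u / math.sqrt(var) if var > 0 else 0.0


def max3(cases, controls):
    """Maximum absolute trend statistic over the three genetic models."""
    return max(abs(trend_z(cases, controls, sc)) for sc in SCORES.values())


def max3_permutation_p(cases, controls, n_perm=2000, seed=0):
    """Permutation p-value for MAX3 (shuffles genotypes across case/control labels)."""
    rng = random.Random(seed)
    genotypes = [g for g in range(3) for _ in range(cases[g] + controls[g])]
    n_cases = sum(cases)
    observed = max3(cases, controls)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(genotypes)
        perm_cases = [0, 0, 0]
        for g in genotypes[:n_cases]:
            perm_cases[g] += 1
        perm_controls = [cases[g] + controls[g] - perm_cases[g] for g in range(3)]
        if max3(perm_cases, perm_controls) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Taking the maximum over three correlated statistics inflates the type I error if the single-test reference distribution is used, which is why some adjustment of this kind is needed.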
|
41
|
Pan D, Li Q, Jiang N, Liu A, Yu K. Robust joint analysis allowing for model uncertainty in two-stage genetic association studies. BMC Bioinformatics 2011; 12:9. [PMID: 21211060 PMCID: PMC3027114 DOI: 10.1186/1471-2105-12-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 06/29/2010] [Accepted: 01/07/2011] [Indexed: 01/10/2023]
Abstract
Background: The cost-efficient two-stage design is often used in genome-wide association studies (GWASs) in searching for genetic loci underlying the susceptibility for complex diseases. Replication-based analysis, which considers data from each stage separately, often suffers from loss of efficiency. A joint test that combines data from both stages has been proposed and widely used to improve efficiency. However, existing joint analyses are based on test statistics derived under an assumed genetic model, and thus might not have robust performance when the assumed genetic model is not appropriate. Results: In this paper, we propose joint analyses based on two robust tests, MERT and MAX3, for GWASs under a two-stage design. We developed computationally efficient procedures and formulas for significance level evaluation and power calculation. The performances of the proposed approaches are investigated through extensive simulation studies and a real example. Numerical results show that the joint analysis based on the MAX3 test statistic has the best overall performance. Conclusions: MAX3 joint analysis is the most robust procedure among the considered joint analyses, and we recommend using it in a two-stage genome-wide association study.
Affiliation(s)
- Dongdong Pan
- Department of Statistics, Yunnan University, Kunming 650091, PR China
|
42
|
Zang Y, Fung WK. Robust tests for matched case-control genetic association studies. BMC Genet 2010; 11:91. [PMID: 20937159 PMCID: PMC2964553 DOI: 10.1186/1471-2156-11-91] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 03/30/2010] [Accepted: 10/12/2010] [Indexed: 01/08/2023]
Abstract
BACKGROUND: The Cochran-Armitage trend test (CATT) is powerful in detecting association between a susceptible marker and a disease. This test, however, may suffer from a substantial loss of power when the underlying genetic model is unknown and incorrectly specified. Thus, it is useful to derive tests obtaining the plausible power against all common genetic models. For this purpose, the genetic model selection (GMS) and genetic model exclusion (GME) methods were proposed recently. Simulation results showed that GMS and GME can obtain the plausible power against three common genetic models while the overall type I error is well controlled. RESULTS: Although GMS and GME are powerful statistically, they could be seriously affected by known confounding factors such as gender, age and race. Therefore, in this paper, via comparing the difference of Hardy-Weinberg disequilibrium coefficients between the cases and the controls within each sub-population, we propose the stratified genetic model selection (SGMS) and exclusion (SGME) methods which could eliminate the effect of confounding factors by adopting a matching framework. Our goal in this paper is to investigate the robustness of the proposed statistics and compare them with other commonly used efficiency robust tests such as MAX3 and χ² with 2 degrees of freedom (df) test in matched case-control association designs through simulation studies. CONCLUSION: Simulation results showed that if the mean genetic effect of the heterozygous genotype is between those of the two homozygous genotypes, then the proposed tests and MAX3 are preferred. Otherwise, χ² with 2 df test may be used. To illustrate the robust procedures, the proposed tests are applied to a real matched pair case-control etiologic study of sarcoidosis.
Affiliation(s)
- Yong Zang
- Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong, China
|
43
|
Gupta V, Khadgawat R, Ng HKT, Kumar S, Aggarwal A, Rao VR, Sachdeva MP. A validation study of type 2 diabetes-related variants of the TCF7L2, HHEX, KCNJ11, and ADIPOQ genes in one endogamous ethnic group of north India. Ann Hum Genet 2010; 74:361-8. [PMID: 20597906 DOI: 10.1111/j.1469-1809.2010.00580.x] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.5] [Indexed: 12/20/2022]
Abstract
The aim of this study was to validate the single nucleotide polymorphisms (SNPs) of four candidate genes (TCF7L2, HHEX, KCNJ11, and ADIPOQ) related to type 2 diabetes (T2D) in an endogamous population of north India, the Aggarwal population, which has 18 clans. This endogamous population model was heavily supported by recent landmark work, and we also verified the homogeneity of this population by clan-based stratification analysis. Two SNPs (rs4506565 and rs7903146) in TCF7L2 were found to be significant (p-value = 0.00191 and p-value = 0.00179, respectively), with odds ratios of 2.1 (dominant model) and 2.0 (recessive model), respectively, in this population. The TTT haplotype in the TCF7L2 gene was significantly associated with T2D. Waist-hip ratio (WHR), systolic blood pressure (SBP), and age were significant covariates for increasing risk of T2D. Single-SNP, combined-SNP, and haplotype analyses provide clear evidence that the causal mutation is near to or within the significant haplotype (TTT) of the TCF7L2 gene. In spite of a culturally learned sedentary lifestyle and fat-enriched dietary habits, WHR rather than body mass index emerged as a robust predictor of risk for T2D in this population.
Affiliation(s)
- Vipin Gupta
- South Asia Network for Chronic Disease, Public Health Foundation of India, Delhi-110016
|
44
|
Abstract
A two-stage design is cost-effective for genome-wide association studies (GWAS) testing hundreds of thousands of single nucleotide polymorphisms (SNPs). In this design, each SNP is genotyped in stage 1 using a fraction of case-control samples. Top-ranked SNPs are selected and genotyped in stage 2 using additional samples. A joint analysis, combining statistics from both stages, is applied in the second stage. Follow-up studies can be regarded as a two-stage design. Once some potential SNPs are identified, independent samples are further genotyped and analyzed separately or jointly with previous data to confirm the findings. When the underlying genetic model is known, an asymptotically optimal trend test (TT) can be used at each analysis. In practice, however, genetic models for SNPs with true associations are usually unknown. In this case, the existing methods for analysis of the two-stage design and follow-up studies are not robust across different genetic models. We propose a simple robust procedure with genetic model selection to the two-stage GWAS. Our results show that, if the optimal TT has about 80% power when the genetic model is known, then the existing methods for analysis of the two-stage design have minimum powers about 20% across the four common genetic models (when the true model is unknown), while our robust procedure has minimum powers about 70% across the same genetic models. The results can be also applied to follow-up and replication studies with a joint analysis.
Affiliation(s)
- Minjung Kwak
- Office of Biostatistics Research, National Heart, Lung and Blood Institute, 6701 Rockledge Drive, MSC 7913, Bethesda, Maryland 20892-7913, USA
|
45
|
Joo J, Kwak M, Chen Z, Zheng G. Efficiency robust statistics for genetic linkage and association studies under genetic model uncertainty. Stat Med 2010; 29:158-80. [PMID: 19918942 DOI: 10.1002/sim.3759] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Indexed: 01/09/2023]
Abstract
When testing genetic linkage and association, test statistics that follow normal or Chi-square distributions are often used. These statistics are usually derived under a specific mode of inheritance (genetic model). Common genetic models include, but are not limited to, the recessive, additive, multiplicative, and dominant models. For many diseases, the underlying genetic models are often unknown. Instead, a family of scientifically plausible genetic models may be available, which includes the four commonly used models. Hence, the optimal test is not available. Employing a single test statistic which is optimal for one model may result in a substantial loss of power when the model is misspecified. In this situation efficient robust tests are useful. In this tutorial, we first review several commonly used robust statistics, including maximum efficiency robust tests, maximal tests, and constrained likelihood ratio tests for three common designs in genetic studies: (i) linkage analysis using affected sib-pairs, (ii) association studies using parents-offspring trios, and (iii) case-control association studies (unmatched and matched). Codes in the R statistical language for applying these robust statistics to test for linkage and association are presented with examples. We also provide some comparisons of the performance of the various robust tests via simulation studies. Guidelines for applications are also given for each study design. Finally, applications of robust tests to genome-wide association studies and meta-analysis are discussed.
Affiliation(s)
- Jungnam Joo
- Office of Biostatistics Research, National Heart, Lung and Blood Institute, Bethesda, MD 20892, USA
|
46
|
Pan W, Han F, Shen X. Test selection with application to detecting disease association with multiple SNPs. Hum Hered 2009; 69:120-30. [PMID: 19996609 DOI: 10.1159/000264449] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 04/27/2009] [Accepted: 07/08/2009] [Indexed: 11/19/2022]
Abstract
We consider the motivating problem of testing for association between a phenotype and multiple single nucleotide polymorphisms (SNPs) within a candidate gene or region. Various statistical approaches have been proposed, including those based on either (combining univariate) single-locus analyses or (multivariate) multilocus analyses. However, it is known in theory that there is no single uniformly most powerful test to detect association with multiple SNPs. On the other hand, several tests have been shown to be among frequent winners across a range of practical situations, but the identity of the most powerful one changes with the situation in an unknown way. Here we propose a novel test selection procedure to select from five such tests: a so-called UminP test that combines multiple univariate/single-locus score tests by taking the minimum of their p values as its test statistic, a multivariate score test and its two modifications, and a so-called sum test. We also illustrate its application to selecting genotype codings for the sum test since the performance of the sum test depends on its genotype coding in an unknown way. Our major contributions include the methodology of estimating the power of a given test with a given dataset and the idea of using the estimated power as the criterion for test selection. We also propose a fast simulation-based method to calculate p values for the test selection procedure and for any method of combining p values. Our numerical results indicated that the proposed test selection procedure always yielded power close to the most powerful test among the candidate tests at any given situation, and in particular, our proposed test selection performed either better than or as well as the popular combining method of taking the minimum p value of the candidate tests.
Affiliation(s)
- Wei Pan
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minn. 55455-0392, USA.
|
47
|
Zheng G, Joo J, Zaykin D, Wu C, Geller N. Robust Tests in Genome-Wide Scans under Incomplete Linkage Disequilibrium. Stat Sci 2009. [DOI: 10.1214/09-sts314] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 11/19/2022]
|
48
|
Abstract
The possible evidence for association comprises three types of information: differences between cases and controls in allele frequencies, in parameters for Hardy-Weinberg disequilibrium (HWD), and in parameters for linkage disequilibrium (LD). LD between marker and disease alleles results in a difference in at least one of the three types of parameters [Won and Elston, 2008]. However, the parameters for LD require knowledge about phase, which is usually unknown, making the LD contrast test without modification infeasible in practice. Methods for handling phase uncertainty are: (1) the most probable haplotype pair for each individual can be considered as the true phase; (2) a weighted average of haplotypes can be used; (3) we can consider the composite LD, which does not require any information about phase. We compare these methods to handle phase uncertainty in terms of validity and efficiency, and the effect on them of HWD in the population, at the same time confirming results for the three types of information. When the LD between markers is high, the LD contrast test that uses a weighted average of haplotypes or the most probable haplotypes to calculate the LD is recommended, but otherwise the LD contrast test that uses the composite LD is recommended. We conclude that, even though the difference in allele frequencies is usually the most informative test except in the case of a recessive disease, the LD contrast test can be more powerful if the markers are dense enough.
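Two of the three contrasts named above need only unphased genotype counts, so they are easy to illustrate. The sketch below estimates the allele frequency and the Hardy-Weinberg disequilibrium coefficient D = P(AA) - p_A^2 in cases and controls and reports their differences. The function names and the (aa, Aa, AA) count layout are assumptions. The LD contrast is omitted because it depends on phase (or on a composite LD estimate), and no test statistic or variance is computed here.

```python
def allele_frequency(counts):
    """Frequency of allele A from genotype counts (aa, Aa, AA)."""
    n = sum(counts)
    return (counts[1] + 2 * counts[2]) / (2 * n)


def hwd_coefficient(counts):
    """Hardy-Weinberg disequilibrium coefficient D = P(AA) - p_A**2."""
    n = sum(counts)
    p = allele_frequency(counts)
    return counts[2] / n - p * p


def contrasts(cases, controls):
    """Case-minus-control differences in allele frequency and HWD coefficient."""
    return {
        "allele_frequency": allele_frequency(cases) - allele_frequency(controls),
        "hwd": hwd_coefficient(cases) - hwd_coefficient(controls),
    }


if __name__ == "__main__":
    # Made-up counts: cases lack heterozygotes, controls are in exact HWE.
    print(contrasts((50, 0, 50), (25, 50, 25)))
```

In this made-up example the allele frequencies are identical in the two groups, so only the HWD contrast carries a signal.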
Affiliation(s)
- Sungho Won
- Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, USA
|
49
|
Abstract
Fisher (1925) was the first to suggest a method of combining the p-values obtained from several statistics and many other methods have been proposed since then. However, there is no agreement about what is the best method. Motivated by a situation that now often arises in genetic epidemiology, we consider the problem when it is possible to define a simple alternative hypothesis of interest for which the expected effect size of each test statistic is known and we determine the most powerful test for this simple alternative hypothesis. Based on the proposed method, we show that information about the effect sizes can be used to obtain the best weights for Liptak's method of combining p-values. We present extensive simulation results comparing methods of combining p-values and illustrate for a real example in genetic epidemiology how information about effect sizes can be deduced.
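For reference, the two combination rules mentioned in the abstract can be sketched as follows, assuming independent tests and one-sided p-values strictly between 0 and 1. The `weights` argument is a placeholder: the abstract says that effect-size information can be used to choose the best weights, but the rule for doing so is not reproduced here, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def fisher_combined(pvalues):
    """Fisher's method: -2 * sum(log p) is chi-square with 2k df under H0."""
    k = len(pvalues)
    half = -sum(math.log(p) for p in pvalues)  # half of the chi-square statistic
    # Closed-form survival function of a chi-square with an even number of df.
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


def liptak_combined(pvalues, weights=None):
    """Liptak's (weighted Z) method for independent one-sided p-values."""
    nd = NormalDist()
    if weights is None:
        weights = [1.0] * len(pvalues)
    z = sum(w * nd.inv_cdf(1.0 - p) for w, p in zip(weights, pvalues))
    z /= math.sqrt(sum(w * w for w in weights))
    return 1.0 - nd.cdf(z)


if __name__ == "__main__":
    ps = [0.05, 0.05]
    print(fisher_combined(ps))
    print(liptak_combined(ps))
    print(liptak_combined(ps, weights=[2.0, 1.0]))  # illustrative weights only
```

With a single p-value both functions return that p-value unchanged, which is a quick sanity check on the implementation.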
Affiliation(s)
- Sungho Won
- Department of Biostatistics, Harvard School of Public Health, MA, U.S.A
|
50
|
Joo J, Kwak M, Zheng G. Improving Power for Testing Genetic Association in Case-Control Studies by Reducing the Alternative Space. Biometrics 2009; 66:266-76. [DOI: 10.1111/j.1541-0420.2009.01241.x] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 01/03/2023]
|