1
|
Sun X, Bulekova K, Yang J, Lai M, Pitsillides AN, Liu X, Zhang Y, Guo X, Yong Q, Raffield LM, Rotter JI, Rich SS, Abecasis G, Carson AP, Vasan RS, Bis JC, Psaty BM, Boerwinkle E, Fitzpatrick AL, Satizabal CL, Arking DE, Ding J, Levy D, Liu C. Association analysis of mitochondrial DNA heteroplasmic variants: Methods and application. Mitochondrion 2024; 79:101954. [PMID: 39245194 PMCID: PMC11568909 DOI: 10.1016/j.mito.2024.101954] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2023] [Revised: 08/26/2024] [Accepted: 08/31/2024] [Indexed: 09/10/2024]
Abstract
We rigorously assessed a comprehensive association testing framework for heteroplasmy, employing both simulated and real-world data. This framework employed a variant allele fraction (VAF) threshold and harnessed multiple gene-based tests for robust identification and association testing of heteroplasmy. Our simulation studies demonstrated that gene-based tests maintained an appropriate type I error rate at α = 0.001. Notably, when 5% or more of the heteroplasmic variants within a target region were linked to an outcome, burden-extension tests (including the adaptive burden test, variable threshold burden test, and z-score weighting burden test) outperformed the sequence kernel association test (SKAT) and the original burden test. Applying this framework, we conducted association analyses on whole-blood derived heteroplasmy in 17,507 individuals of African and European ancestries (31% of African ancestry, mean age of 62, with 58% women) with whole genome sequencing data. We performed both cohort- and ancestry-specific association analyses, followed by meta-analysis on both pooled samples and within each ancestry group. Our results suggest that mtDNA-encoded genes/regions are likely to exhibit varying rates in somatic aging, with notably strong associations observed between heteroplasmy in the RNR1 and RNR2 genes (p < 0.001) and advancing age by the original burden test. In contrast, SKAT identified significant associations (p < 0.001) between diabetes and the aggregated effects of heteroplasmy in several protein-coding genes. Further research is warranted to validate these findings. In summary, our proposed statistical framework represents a valuable tool for facilitating association testing of heteroplasmy with disease traits in large human populations.
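As a rough illustration of the two ingredients the abstract names (a VAF threshold to call heteroplasmy, then a gene-level burden score), the following Python sketch may help. It is not the authors' implementation: the 3% and 97% cut-offs, the function names, and the covariate-free regression slope are all assumptions made for the example.

```python
import numpy as np

def call_heteroplasmy(vaf, lower=0.03, upper=0.97):
    """Return a 0/1 matrix with 1 where lower <= VAF <= upper.

    The default cut-offs are assumed for illustration only.
    """
    vaf = np.asarray(vaf, dtype=float)
    return ((vaf >= lower) & (vaf <= upper)).astype(int)

def burden_score(vaf, lower=0.03, upper=0.97):
    """Count heteroplasmic sites per individual within one gene or region.

    Rows are individuals, columns are mtDNA sites in the region.
    """
    return call_heteroplasmy(vaf, lower, upper).sum(axis=1)

def burden_slope(score, outcome):
    """Ordinary least squares slope of the outcome on the burden score.

    No covariates are adjusted for here; a real analysis would include them.
    """
    x = np.asarray(score, dtype=float)
    y = np.asarray(outcome, dtype=float)
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))

# Toy data: 3 individuals, 3 sites in one hypothetical region.
vaf = [[0.0, 0.5, 0.99],
       [0.1, 0.2, 0.0],
       [0.0, 0.0, 0.0]]
scores = burden_score(vaf)
```

The burden-extension tests and SKAT mentioned in the abstract weight or combine the same per-site calls differently; this sketch covers only the unweighted count.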
Affiliation(s)
- Xianbang Sun
- Department of Biostatistics, Boston University School of Public Health, Boston, MA 02118, USA
| | - Katia Bulekova
- Research Computing Services, Boston University, Boston, MA 02215, USA
| | - Jian Yang
- Department of Biostatistics, Boston University School of Public Health, Boston, MA 02118, USA
| | - Meng Lai
- Department of Biostatistics, Boston University School of Public Health, Boston, MA 02118, USA
| | - Achilleas N Pitsillides
- Department of Biostatistics, Boston University School of Public Health, Boston, MA 02118, USA
| | - Xue Liu
- Department of Biostatistics, Boston University School of Public Health, Boston, MA 02118, USA
| | - Yuankai Zhang
- Department of Biostatistics, Boston University School of Public Health, Boston, MA 02118, USA
| | - Xiuqing Guo
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA 90502, USA
| | - Qian Yong
- Longitudinal Studies Section, Translational Gerontology Branch, NIA/NIH, Baltimore, MD 21224, USA
| | - Laura M Raffield
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27514, USA
| | - Jerome I Rotter
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA 90502, USA
| | - Stephen S Rich
- Department of Public Health Services, Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908, USA
| | - Goncalo Abecasis
- TOPMed Informatics Research Center, University of Michigan, Ann Arbor, MI 48109, USA
| | - April P Carson
- Department of Medicine, University of Mississippi Medical Center, Jackson, MS 39216, USA
| | - Ramachandran S Vasan
- Sections of Preventive Medicine and Epidemiology, and Cardiovascular Medicine, Boston University School of Medicine, Boston, MA, 02118, USA; Framingham Heart Study, NHLBI/NIH, Framingham, MA 01702, USA
| | - Joshua C Bis
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA 98101, USA
| | - Bruce M Psaty
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA 98101, USA; Departments of Epidemiology, and Health Services, University of Washington, Seattle, WA 98101, USA
| | - Eric Boerwinkle
- Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA; Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Annette L Fitzpatrick
- Departments of Family Medicine, Epidemiology, and Global Health, University of Washington, Seattle, WA 98195, USA
| | - Claudia L Satizabal
- Framingham Heart Study, NHLBI/NIH, Framingham, MA 01702, USA; Glenn Biggs Institute for Alzheimer's and Neurodegenerative Diseases, University of Texas Health Science Center at San Antonio, San Antonio, TX 78229, USA
| | - Dan E Arking
- McKusick-Nathans Institute, Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
| | - Jun Ding
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27514, USA
| | - Daniel Levy
- Framingham Heart Study, NHLBI/NIH, Framingham, MA 01702, USA; Population Sciences Branch, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Chunyu Liu
- Department of Biostatistics, Boston University School of Public Health, Boston, MA 02118, USA; Framingham Heart Study, NHLBI/NIH, Framingham, MA 01702, USA.
| |
|
2
|
Sun X, Bulekova K, Yang J, Lai M, Pitsillides AN, Liu X, Zhang Y, Guo X, Yong Q, Raffield LM, Rotter JI, Rich SS, Abecasis G, Carson AP, Vasan RS, Bis JC, Psaty BM, Boerwinkle E, Fitzpatrick AL, Satizabal CL, Arking DE, Ding J, Levy D, Liu C. Association analysis of mitochondrial DNA heteroplasmic variants: methods and application. medRxiv 2024:2024.01.12.24301233. [PMID: 38260412 PMCID: PMC10802757 DOI: 10.1101/2024.01.12.24301233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
We rigorously assessed a comprehensive association testing framework for heteroplasmy, employing both simulated and real-world data. This framework employed a variant allele fraction (VAF) threshold and harnessed multiple gene-based tests for robust identification and association testing of heteroplasmy. Our simulation studies demonstrated that gene-based tests maintained an appropriate type I error rate at α = 0.001. Notably, when 5% or more of the heteroplasmic variants within a target region were linked to an outcome, burden-extension tests (including the adaptive burden test, variable threshold burden test, and z-score weighting burden test) outperformed the sequence kernel association test (SKAT) and the original burden test. Applying this framework, we conducted association analyses on whole-blood derived heteroplasmy in 17,507 individuals of African and European ancestries (31% of African ancestry, mean age of 62, with 58% women) with whole genome sequencing data. We performed both cohort- and ancestry-specific association analyses, followed by meta-analysis on both pooled samples and within each ancestry group. Our results suggest that mtDNA-encoded genes/regions are likely to exhibit varying rates in somatic aging, with notably strong associations observed between heteroplasmy in the RNR1 and RNR2 genes (p < 0.001) and advancing age by the original burden test. In contrast, SKAT identified significant associations (p < 0.001) between diabetes and the aggregated effects of heteroplasmy in several protein-coding genes. Further research is warranted to validate these findings. In summary, our proposed statistical framework represents a valuable tool for facilitating association testing of heteroplasmy with disease traits in large human populations.
Affiliation(s)
- Xianbang Sun
- Department of Biostatistics, School of Public Health, Boston University, Boston, MA 02118, USA
| | - Katia Bulekova
- Research Computing Services, Boston University, Boston, MA 02215, USA
| | - Jian Yang
- Department of Biostatistics, School of Public Health, Boston University, Boston, MA 02118, USA
| | - Meng Lai
- Department of Biostatistics, School of Public Health, Boston University, Boston, MA 02118, USA
| | | | - Xue Liu
- Department of Biostatistics, School of Public Health, Boston University, Boston, MA 02118, USA
| | - Yuankai Zhang
- Department of Biostatistics, School of Public Health, Boston University, Boston, MA 02118, USA
| | - Xiuqing Guo
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA 90502, USA
| | - Qian Yong
- Longitudinal Studies Section, Translational Gerontology Branch, NIA/NIH, Baltimore, MD 21224, USA
| | - Laura M. Raffield
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27514, USA
| | - Jerome I. Rotter
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA 90502, USA
| | - Stephen S. Rich
- Department of Public Health Services, Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908, USA
| | - Goncalo Abecasis
- TOPMed Informatics Research Center, University of Michigan, Ann Arbor, MI 48109, USA
| | - April P. Carson
- Department of Medicine, University of Mississippi Medical Center, Jackson, MS 39216, USA
| | - Ramachandran S. Vasan
- Sections of Preventive Medicine and Epidemiology, and Cardiovascular Medicine, Boston University School of Medicine, Boston, MA, 02118, USA
- Framingham Heart Study, NHLBI/NIH, Framingham, MA 01702, USA
| | - Joshua C. Bis
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA 98101, USA
| | - Bruce M. Psaty
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA 98101, USA
- Departments of Epidemiology, and Health Services, University of Washington, Seattle, WA 98101, USA
| | - Eric Boerwinkle
- Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Annette L. Fitzpatrick
- Departments of Family Medicine, Epidemiology, and Global Health, University of Washington, Seattle, WA 98195, USA
| | - Claudia L. Satizabal
- Framingham Heart Study, NHLBI/NIH, Framingham, MA 01702, USA
- Glenn Biggs Institute for Alzheimer’s and Neurodegenerative Diseases, University of Texas Health Science Center at San Antonio, San Antonio, TX 78229, USA
| | - Dan E. Arking
- McKusick-Nathans Institute, Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
| | - Jun Ding
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27514, USA
| | - Daniel Levy
- Framingham Heart Study, NHLBI/NIH, Framingham, MA 01702, USA
- Population Sciences Branch, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | | | - Chunyu Liu
- Department of Biostatistics, School of Public Health, Boston University, Boston, MA 02118, USA
- Framingham Heart Study, NHLBI/NIH, Framingham, MA 01702, USA
| |
|
3
|
Yang T, Kim J, Wu C, Ma Y, Wei P, Pan W. An adaptive test for meta-analysis of rare variant association studies. Genet Epidemiol 2020; 44:104-116. [PMID: 31830326 PMCID: PMC6980317 DOI: 10.1002/gepi.22273] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/15/2019] [Revised: 11/12/2019] [Accepted: 11/25/2019] [Indexed: 01/02/2023]
Abstract
Single genome-wide studies may be underpowered to detect trait-associated rare variants with moderate or weak effect sizes. As a viable alternative, meta-analysis is widely used to increase power by combining different studies. The power of meta-analysis critically depends on the underlying association patterns and heterogeneity levels, which are unknown and vary from locus to locus. However, existing methods mainly focus on one or only a few combinations of the association pattern and heterogeneity level, and thus may lose power in many situations. To address this issue, we propose a general and unified framework by combining a class of tests including and beyond some existing ones, leading to high power across a wide range of scenarios. We demonstrate that the proposed test is more powerful than some existing methods in simulation studies, then show its performance with the NHLBI Exome-Sequencing Project (ESP) data. One gene (B4GALNT2) was found by our proposed test, but not by others, to be statistically significantly associated with plasma triglyceride. The signal was driven by African-ancestry subjects but it was previously reported to be associated with coronary artery disease among European-ancestry subjects. We implemented our method in an R package aSPUmeta, publicly available at https://github.com/ytzhong/metaRV and will be on CRAN soon.
Affiliation(s)
- Tianzhong Yang
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
| | - Junghi Kim
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
| | - Chong Wu
- Department of Statistics, Florida State University, Tallahassee, FL, USA
| | - Yiding Ma
- Department of Biostatistics and Data Science, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Peng Wei
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Wei Pan
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
| |
|
4
|
Zhang J, Wu B, Sha Q, Zhang S, Wang X. A general statistic to test an optimally weighted combination of common and/or rare variants. Genet Epidemiol 2019; 43:966-979. [PMID: 31498476 DOI: 10.1002/gepi.22255] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/30/2019] [Revised: 06/17/2019] [Accepted: 07/30/2019] [Indexed: 11/10/2022]
Abstract
Both genome-wide association study and next-generation sequencing data analyses are widely employed to identify common and/or rare genetic variants that confer disease susceptibility. Rare variants generally have large effects, though they are hard to detect due to their low frequencies. Currently, many existing statistical methods for rare variant association studies employ a weighted combination scheme, which usually uses subjective or suboptimal weights based on ad hoc assumptions (e.g., ignoring dependence between rare variants). In this study, we analytically derived optimal weights for both common and rare variants and proposed a general and novel approach to test association between an optimally weighted combination of variants (G-TOW) in a gene or pathway for a continuous or dichotomous trait while easily adjusting for covariates. Results of the simulation studies show that G-TOW has properly controlled type I error rates and it is the most powerful test among the methods we compared when testing effects of either both rare and common variants or rare variants only. We also illustrate the effectiveness of G-TOW using the Genetic Analysis Workshop 17 (GAW17) data. Additionally, we applied G-TOW and other competitive methods to test disease-associated genes in real data of schizophrenia. G-TOW successfully verified the genes FYN and VPS39, which have been reported in existing publications to be associated with schizophrenia. Both of these genes are missed by the weighted sum statistic and the sequence kernel association test. Simulation study and real data analysis indicate that G-TOW is a powerful test.
Affiliation(s)
- Jianjun Zhang
- Department of Mathematics, University of North Texas, Denton, Texas
| | - Baolin Wu
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota
| | - Qiuying Sha
- Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan
| | - Shuanglin Zhang
- Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan
| | - Xuexia Wang
- Department of Mathematics, University of North Texas, Denton, Texas
| |
|
5
|
Bocher O, Marenne G, Saint Pierre A, Ludwig TE, Guey S, Tournier-Lasserve E, Perdry H, Génin E. Rare variant association testing for multicategory phenotype. Genet Epidemiol 2019; 43:646-656. [PMID: 31087445 DOI: 10.1002/gepi.22210] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 11/26/2018] [Revised: 04/03/2019] [Accepted: 04/17/2019] [Indexed: 01/09/2023]
Abstract
Genetic association studies have provided new insights into the genetic variability of human complex traits with a focus mainly on continuous or binary traits. Methods have been proposed to take into account disease heterogeneity between subgroups of patients when studying common variants but none was specifically designed for rare variants. Because rare variants are expected to have stronger effects and to be more heterogeneously distributed among cases than common ones, subgroup analyses might be particularly attractive in this context. To address this issue, we propose an extension of burden tests by using a multinomial regression model, which enables association tests between rare variants and multicategory phenotypes. We evaluated the type I error and the power of two burden tests, CAST and WSS, by simulating data under different scenarios. In the case of genetic heterogeneity between case subgroups, we showed an advantage of multinomial regression over logistic regression, which considers all the cases against the controls. We replicated these results on real data from Moyamoya disease where the burden tests performed better when cases were stratified according to age-of-onset. We implemented the functions for association tests in the R package "Ravages" available on Github.
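The CAST burden score mentioned in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the Ravages package: the function names, the 1% MAF cut-off, and the group labels are hypothetical, and the multinomial regression step the authors describe is not shown.

```python
import numpy as np

def cast_indicator(genotypes, maf, maf_threshold=0.01):
    """CAST-style collapsing: 1 if an individual carries at least one
    minor allele at any rare variant in the region, else 0.

    genotypes: individuals x variants matrix of minor-allele counts (0/1/2).
    maf: minor allele frequency per variant; the cut-off is an assumption.
    """
    g = np.asarray(genotypes)
    rare = np.asarray(maf, dtype=float) <= maf_threshold
    return (g[:, rare].sum(axis=1) > 0).astype(int)

def carriers_by_group(indicator, groups):
    """Tabulate (carriers, total) for each phenotype category."""
    table = {}
    for carried, group in zip(indicator, groups):
        carriers, total = table.get(group, (0, 0))
        table[group] = (carriers + int(carried), total + 1)
    return table

# Toy data: 4 individuals, 3 variants (the first one is common).
genotypes = [[0, 1, 0],
             [0, 0, 2],
             [0, 0, 0],
             [1, 0, 0]]
maf = [0.2, 0.005, 0.001]
groups = ["early", "late", "control", "control"]
table = carriers_by_group(cast_indicator(genotypes, maf), groups)
```

Keeping the case subgroups separate in the table, rather than pooling all cases against controls, is the kind of input a multicategory model would start from.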
Affiliation(s)
- Ozvan Bocher
- Univ Brest, Inserm, EFS, UMR 1078, GGB, Brest, France
| | | | | | - Thomas E Ludwig
- Univ Brest, Inserm, EFS, UMR 1078, GGB, Brest, France; CHU Brest, Brest, France
| | - Stéphanie Guey
- Inserm UMR-S1161, Génétique et Physiopathologie des Maladies Cérébro-vasculaires, Université Paris Diderot, Sorbonne Paris Cité, Paris, France
| | - Elisabeth Tournier-Lasserve
- Inserm UMR-S1161, Génétique et Physiopathologie des Maladies Cérébro-vasculaires, Université Paris Diderot, Sorbonne Paris Cité, Paris, France
| | - Hervé Perdry
- CESP Inserm, U1018, UFR Médecine, Univ Paris-Sud, Université Paris-Saclay, Villejuif, France
| | | |
|
6
|
Yang X, Wang S, Zhang S, Sha Q. Detecting association of rare and common variants based on cross-validation prediction error. Genet Epidemiol 2017; 41:233-243. [PMID: 28176359 DOI: 10.1002/gepi.22034] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 03/15/2016] [Revised: 11/22/2016] [Accepted: 11/26/2016] [Indexed: 12/13/2022]
Abstract
Despite the extensive discovery of disease-associated common variants, much of the genetic contribution to complex traits remains unexplained. Rare variants may explain additional disease risk or trait variability. Although sequencing technology provides a supreme opportunity to investigate the roles of rare variants in complex diseases, detection of these variants in sequencing-based association studies presents substantial challenges. In this article, we propose novel statistical tests to test the association between rare and common variants in a genomic region and a complex trait of interest based on cross-validation prediction error (PE). We first propose a PE method based on Ridge regression. Based on PE, we also propose another two tests PE-WS and PE-TOW by testing a weighted combination of variants with two different weighting schemes. PE-WS is the PE version of the test based on the weighted sum statistic (WS) and PE-TOW is the PE version of the test based on the optimally weighted combination of variants (TOW). Using extensive simulation studies, we are able to show that (1) PE-TOW and PE-WS are consistently more powerful than TOW and WS, respectively, and (2) PE is the most powerful test when causal variants contain both common and rare variants.
Affiliation(s)
- Xinlan Yang
- Department of Mathematical Sciences, Michigan Technological University, Houghton, MI, USA
| | | | - Shuanglin Zhang
- Department of Mathematical Sciences, Michigan Technological University, Houghton, MI, USA
| | - Qiuying Sha
- Department of Mathematical Sciences, Michigan Technological University, Houghton, MI, USA
| |
|
7
|
Richardson TG, Shihab HA, Hemani G, Zheng J, Hannon E, Mill J, Carnero-Montoro E, Bell JT, Lyttleton O, McArdle WL, Ring SM, Rodriguez S, Campbell C, Smith GD, Relton CL, Timpson NJ, Gaunt TR. Collapsed methylation quantitative trait loci analysis for low frequency and rare variants. Hum Mol Genet 2016; 25:4339-4349. [PMID: 27559110 PMCID: PMC5291201 DOI: 10.1093/hmg/ddw283] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 06/30/2016] [Revised: 06/30/2016] [Accepted: 08/12/2016] [Indexed: 12/15/2022]
Abstract
BACKGROUND Single variant approaches have been successful in identifying DNA methylation quantitative trait loci (mQTL), although as with complex traits they lack the statistical power to identify the effects from rare genetic variants. We have undertaken extensive analyses to identify regions of low frequency and rare variants that are associated with DNA methylation levels. METHODS We used repeated measurements of DNA methylation from five different life stages in human blood, taken from the Avon Longitudinal Study of Parents and Children (ALSPAC) cohort. Variants were collapsed across CpG islands and their flanking regions to identify variants collectively associated with methylation, where no single variant was individually responsible for the observed signal. All analyses were undertaken using the sequence kernel association test. RESULTS For loci where no individual variant mQTL was observed based on a single variant analysis, we identified 95 unique regions where the combined effect of low frequency variants (MAF ≤ 5%) provided strong evidence of association with methylation. For loci where there was previous evidence of an individual variant mQTL, a further 3 regions provided evidence of association between multiple low frequency variants and methylation levels. Effects were observed consistently across 5 different time points in the lifecourse and evidence of replication in the TwinsUK and Exeter cohorts was also identified. CONCLUSION We have demonstrated the potential of this novel approach to mQTL analysis by analysing the combined effect of multiple low frequency or rare variants. Future studies should benefit from applying this approach as a complementary follow up to single variant analyses.
Affiliation(s)
- Tom G Richardson
- MRC Integrative Epidemiology Unit (IEU), School of Social and Community Medicine, University of Bristol, Oakfield House, Oakfield Grove, Bristol, UK
| | - Hashem A Shihab
- MRC Integrative Epidemiology Unit (IEU), School of Social and Community Medicine, University of Bristol, Oakfield House, Oakfield Grove, Bristol, UK
| | - Gibran Hemani
- MRC Integrative Epidemiology Unit (IEU), School of Social and Community Medicine, University of Bristol, Oakfield House, Oakfield Grove, Bristol, UK
| | - Jie Zheng
- MRC Integrative Epidemiology Unit (IEU), School of Social and Community Medicine, University of Bristol, Oakfield House, Oakfield Grove, Bristol, UK
| | - Eilis Hannon
- University of Exeter Medical School, University of Exeter, Exeter, UK
| | - Jonathan Mill
- University of Exeter Medical School, University of Exeter, Exeter, UK
- Institute of Psychiatry, King's College London, London, UK
| | - Elena Carnero-Montoro
- Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
| | - Jordana T Bell
- Department of Twin Research and Genetic Epidemiology, King's College London, London, UK
| | - Oliver Lyttleton
- Avon Longitudinal Study of Parents and Children (ALSPAC) & School of Social and Community Medicine, University of Bristol, Bristol, UK
| | - Wendy L McArdle
- Avon Longitudinal Study of Parents and Children (ALSPAC) & School of Social and Community Medicine, University of Bristol, Bristol, UK
| | - Susan M Ring
- MRC Integrative Epidemiology Unit (IEU), School of Social and Community Medicine, University of Bristol, Oakfield House, Oakfield Grove, Bristol, UK
- Avon Longitudinal Study of Parents and Children (ALSPAC) & School of Social and Community Medicine, University of Bristol, Bristol, UK
| | - Santiago Rodriguez
- Bristol Genetic Epidemiology Laboratories, School of Social and Community Medicine, University of Bristol, Bristol, UK
| | - Colin Campbell
- Intelligent Systems Laboratory, University of Bristol, Bristol, UK
| | - George Davey Smith
- MRC Integrative Epidemiology Unit (IEU), School of Social and Community Medicine, University of Bristol, Oakfield House, Oakfield Grove, Bristol, UK
| | - Caroline L Relton
- MRC Integrative Epidemiology Unit (IEU), School of Social and Community Medicine, University of Bristol, Oakfield House, Oakfield Grove, Bristol, UK
| | - Nicholas J Timpson
- MRC Integrative Epidemiology Unit (IEU), School of Social and Community Medicine, University of Bristol, Oakfield House, Oakfield Grove, Bristol, UK
| | - Tom R Gaunt
- MRC Integrative Epidemiology Unit (IEU), School of Social and Community Medicine, University of Bristol, Oakfield House, Oakfield Grove, Bristol, UK
| |
|
8
|
Detecting association of rare and common variants by adaptive combination of P-values. Genet Res (Camb) 2015; 97:e20. [PMID: 26440553 DOI: 10.1017/s0016672315000208] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 12/15/2022]
Abstract
Genome-wide association studies (GWAS) can detect common variants associated with diseases. Next generation sequencing technology has made it possible to detect rare variants. Most association tests, including burden tests and nonburden tests, mainly target rare variants by upweighting rare variant effects and downweighting common variant effects. But there is increasing evidence that complex diseases are caused by both common and rare variants. In this paper, we extend the ADA method (adaptive combination of P-values; Lin et al., 2014), which handles rare variants only, and propose the RC-ADA method (common and rare variants by adaptive combination of P-values). Our proposed method combines the per-site P-values with weights based on minor allele frequencies (MAFs). The RC-ADA is robust to directions of effects of causal variants and inclusion of a high proportion of neutral variants. The performance of the RC-ADA method is compared with several other association methods. Extensive simulation studies show that the RC-ADA method is more powerful than other association methods over a wide range of models.
|
9
|
Wang X, Zhang S, Li Y, Li M, Sha Q. A powerful approach to test an optimally weighted combination of rare variants in admixed populations. Genet Epidemiol 2015; 39:294-305. [PMID: 25758547 DOI: 10.1002/gepi.21894] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/04/2014] [Revised: 01/09/2015] [Accepted: 01/26/2015] [Indexed: 11/09/2022]
Abstract
Population stratification has long been recognized as an issue in genetic association studies because unrecognized population stratification can lead to both false-positive and false-negative findings and can obscure true association signals if not appropriately corrected. This issue can be even worse in rare variant association analyses because rare variants often demonstrate stronger and potentially different patterns of stratification than common variants. To correct for population stratification in genetic association studies, we proposed a novel method to Test the effect of an Optimally Weighted combination of variants in Admixed populations (TOWA) in which the analytically derived optimal weights can be calculated from existing phenotype and genotype data. TOWA upweights rare variants and those variants that have strong associations with the phenotype. Additionally, it can adjust for the direction of the association, and allows for local ancestry differences among study subjects. Extensive simulations show that the type I error rate of TOWA is under control in the presence of population stratification and it is more powerful than existing methods. We have also applied TOWA to a real sequencing data set. Our simulation studies as well as real data analysis results indicate that TOWA is a useful tool for rare variant association analyses in admixed populations.
Affiliation(s)
- Xuexia Wang
- Joseph J. Zilber School of Public Health, University of Wisconsin-Milwaukee, Milwaukee, Wisconsin, United States of America
| | | | | | | | | |
|
10
|
Sha Q, Zhang S. A rare variant association test based on combinations of single-variant tests. Genet Epidemiol 2014; 38:494-501. [PMID: 25065727 DOI: 10.1002/gepi.21834] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 02/17/2014] [Revised: 04/17/2014] [Accepted: 05/19/2014] [Indexed: 01/22/2023]
Abstract
Next generation sequencing technologies make direct testing of rare variant associations possible. However, the development of powerful statistical methods for rare variant association studies is still underway. Most existing methods are burden and quadratic tests. Recent studies show that the performance of each of the burden and quadratic tests depends strongly upon the underlying assumption and no test demonstrates consistently acceptable power. Thus, combined tests that combine information from the burden and quadratic tests have been proposed recently. However, results from recent studies (including this study) show that there exist tests that can outperform both burden and quadratic tests. In this article, we propose three classes of tests that include tests outperforming both burden and quadratic tests. Then, we propose the optimal combination of single-variant tests (OCST) by combining information from tests of the three classes. We use extensive simulation studies to compare the performance of OCST with that of burden, quadratic and optimal single-variant tests. Our results show that OCST either is the most powerful test or has power similar to that of the most powerful test. We also compare the performance of OCST with that of the two existing combined tests. Our results show that OCST has better power than the two combined tests.
Affiliation(s)
- Qiuying Sha
- Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan, United States of America
|
11
|
Test of rare variant association based on affected sib-pairs. Eur J Hum Genet 2014; 23:229-37. [PMID: 24667785 DOI: 10.1038/ejhg.2014.43] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/07/2013] [Revised: 11/06/2013] [Accepted: 12/30/2013] [Indexed: 11/08/2022]
Abstract
With the development of sequencing techniques, there is increasing interest in detecting associations between rare variants and complex traits. Quite a few statistical methods for detecting such associations have been developed for unrelated individuals. Statistical methods for detecting rare variant associations under family-based designs have not received as much attention. Recent studies show that rare disease variants are enriched in family data, and thus family-based designs may improve power to detect rare variant associations. In this article, we propose a novel test of association between the optimally weighted combination of variants and the trait of interest for affected sib-pairs. The optimal weights are analytically derived and can be calculated from sampled genotypes and phenotypes. Because it is based on the optimal weights, the proposed method is robust to the directions of the effects of causal variants and is less affected by neutral variants than existing methods are. Our simulation results show that, in all the cases, the proposed method is substantially more powerful than existing methods based on unrelated individuals and existing methods based on affected sib-pairs.
|
12
|
Sha Q, Zhang S. A novel test for testing the optimally weighted combination of rare and common variants based on data of parents and affected children. Genet Epidemiol 2014; 38:135-43. [PMID: 24382753 PMCID: PMC4162402 DOI: 10.1002/gepi.21787] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 05/23/2013] [Revised: 10/28/2013] [Accepted: 12/02/2013] [Indexed: 11/10/2022]
Abstract
With the development of sequencing technologies, the direct testing of rare variant associations has become possible. Many statistical methods for detecting associations between rare variants and complex diseases have recently been developed, most of which are population-based methods for unrelated individuals. A limitation of population-based methods is that spurious associations can occur when there is population structure. For rare variants, this problem can be more serious, both because the spectrum of rare variation can be very different in diverse populations and because there are currently no methods to control for population stratification in population-based rare variant association studies. A solution to the problem of population stratification is to use family-based association tests, which use family members to control for population stratification. In this article, we propose a novel test for Testing the Optimally Weighted combination of variants based on data of Parents and Affected Children (TOW-PAC). TOW-PAC is a family-based association test that tests the combined effect of rare and common variants in a genomic region, and is robust to the directions of the effects of causal variants. Simulation studies confirm that, for rare variant associations, family-based association tests are robust to population stratification, whereas population-based association tests can be seriously confounded by it. The results of power comparisons show that the power of TOW-PAC increases with the number of affected children in each family, and that TOW-PAC based on multiple affected children per family is more powerful than TOW based on unrelated individuals.
Affiliation(s)
- Qiuying Sha
- Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan, United States of America
|