1
Hecker J, Townes FW, Kachroo P, Laurie C, Lasky-Su J, Ziniti J, Cho MH, Weiss ST, Laird NM, Lange C. A unifying framework for rare variant association testing in family-based designs, including higher criticism approaches, SKATs, and burden tests. Bioinformatics 2021; 36:5432-5438. [PMID: 33367522 PMCID: PMC8016468 DOI: 10.1093/bioinformatics/btaa1055] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/01/2020] [Revised: 11/20/2020] [Accepted: 12/10/2020] [Indexed: 12/13/2022]
Abstract
MOTIVATION: Analysis of rare variants in family-based studies remains a challenge. Transmission-based approaches provide robustness against population stratification, but the evaluation of the significance of test statistics based on asymptotic theory can be imprecise. Also, power will depend heavily on the choice of the test statistic and on the underlying genetic architecture of the locus, which will be generally unknown.
RESULTS: In our proposed framework, we utilize the FBAT haplotype algorithm to obtain the conditional offspring genotype distribution under the null hypothesis given the sufficient statistic. Based on this conditional offspring genotype distribution, the significance of virtually any association test statistic can be evaluated based on simulations or exact computations, without the need for asymptotic approximations. Besides standard linear burden-type statistics, this enables our approach to also evaluate other test statistics such as variance components statistics, higher criticism approaches, and maximum-single-variant-statistics, where asymptotic theory might be involved or does not provide accurate approximations for rare variant data. Based on these P-values, combined test statistics such as the aggregated Cauchy association test (ACAT) can also be utilized. In simulation studies, we show that our framework outperforms existing approaches for family-based studies in several scenarios. We also applied our methodology to a TOPMed whole-genome sequencing dataset with 897 asthmatic trios from Costa Rica.
AVAILABILITY AND IMPLEMENTATION: FBAT software is available at https://sites.google.com/view/fbatwebpage. Simulation code is available at https://github.com/julianhecker/FBAT_rare_variant_test_simulations. Whole-genome sequencing data for 'NHLBI TOPMed: The Genetic Epidemiology of Asthma in Costa Rica' is available at https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000988.v4.p1.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
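The ACAT combination step named in this abstract can be illustrated with a minimal sketch. It uses the standard Cauchy combination formula and assumes equal weights unless others are supplied; the function name `acat_pvalue` is hypothetical and is not part of the FBAT software.

```python
import math


def acat_pvalue(pvalues, weights=None):
    """Combine p-values with the Cauchy combination (ACAT) formula.

    Assumption: every p-value lies strictly between 0 and 1.
    """
    if not pvalues:
        raise ValueError("at least one p-value is required")
    if weights is None:
        weights = [1.0] * len(pvalues)
    if len(weights) != len(pvalues):
        raise ValueError("weights and pvalues must have the same length")
    if any(p <= 0.0 or p >= 1.0 for p in pvalues):
        raise ValueError("p-values must lie strictly between 0 and 1")

    total = sum(weights)
    # Map each p-value to a Cauchy-distributed score and take the weighted mean.
    stat = sum(w * math.tan((0.5 - p) * math.pi)
               for w, p in zip(weights, pvalues)) / total
    # Convert the combined score back to a p-value via the Cauchy tail.
    return 0.5 - math.atan(stat) / math.pi
```

In the setting of the abstract, the inputs would be the simulation-based or exact P-values of the individual test statistics (for example burden, variance-component, and higher criticism tests) for one region.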
Affiliation(s)
- Julian Hecker
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- F William Townes
  - Department of Computer Science, Princeton University, Princeton, NJ 08540-5233, USA
- Priyadarshini Kachroo
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Cecelia Laurie
  - Department of Biostatistics, University of Washington, Seattle, WA 98195-1617, USA
- Jessica Lasky-Su
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- John Ziniti
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Michael H Cho
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Scott T Weiss
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Nan M Laird
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Christoph Lange
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
2
TNFSF10/TRAIL regulates human T4 effector memory lymphocyte radiosensitivity and predicts radiation-induced acute and subacute dermatitis. Oncotarget 2017; 7:21416-27. [PMID: 26982083 PMCID: PMC5008295 DOI: 10.18632/oncotarget.7893] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 11/13/2015] [Accepted: 02/18/2016] [Indexed: 12/31/2022]
Abstract
Sensitivity of T4 effector-memory (T4EM) lymphocytes to radiation-induced apoptosis shows heritability compatible with a Mendelian mode of transmission. Using gene expression studies and flow cytometry, we show a higher TNF-Related Apoptosis Inducing Ligand (TRAIL/TNFSF10) mRNA level and a higher level of membrane bound TRAIL (mTRAIL) on radiosensitive compared to radioresistant T4EM lymphocytes. Functionally, we show that mTRAIL mediates a pro-apoptotic autocrine signaling after irradiation of T4EM lymphocytes linking mTRAIL expression to T4EM radiosensitivity. Using single marker and multimarker Family-Based Association Testing, we identified 3 SNPs in the TRAIL gene that are significantly associated with T4EM lymphocytes radiosensitivity. Among these 3 SNPs, two are also associated with acute and subacute dermatitis after radiotherapy in breast cancer indicating that T4EM lymphocytes radiosensitivity may be used to predict response to radiotherapy. Altogether, these results show that mTRAIL level regulates the response of T4EM lymphocytes to ionizing radiation and suggest that TRAIL/TNFSF10 genetic variants hold promise as markers of individual radiosensitivity.
3
Loucoubar C, Grant AV, Bureau JF, Casademont I, Bar NA, Bar-Hen A, Diop M, Faye J, Sarr FD, Badiane A, Tall A, Trape JF, Cliquet F, Schwikowski B, Lathrop M, Paul RE, Sakuntabhai A. Detecting multi-way epistasis in family-based association studies. Brief Bioinform 2017; 18:394-402. [PMID: 27178992 DOI: 10.1093/bib/bbw039] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 12/28/2015] [Indexed: 11/13/2022]
Abstract
The era of genome-wide association studies (GWAS) has led to the discovery of numerous genetic variants associated with disease. Better understanding of whether these or other variants interact leading to differential risk compared with individual marker effects will increase our understanding of the genetic architecture of disease, which may be investigated using the family-based study design. We present M-TDT (the multi-locus transmission disequilibrium test), a tool for detecting family-based multi-locus multi-allelic effects for qualitative or quantitative traits, extended from the original transmission disequilibrium test (TDT). Tests to handle the comparison between additive and epistatic models, lack of independence between markers and multiple offspring are described. Performance of M-TDT is compared with a multifactor dimensionality reduction (MDR) approach designed for investigating families in the hypothesis-free genome-wide setting (the multifactor dimensionality reduction pedigree disequilibrium test, MDR-PDT). Other methods derived from the TDT or MDR to investigate genetic interaction in the family-based design are also discussed. The case of three independent biallelic loci is illustrated using simulations for one- to three-locus alternative hypotheses. M-TDT identified joint-locus effects and distinguished effectively between additive and epistatic models. We showed a practical example of M-TDT based on three genes already known to be implicated in malaria susceptibility. Our findings demonstrate the value of M-TDT in a hypothesis-driven context to test for multi-way epistasis underlying common disease etiology, whereas MDR-PDT-based methods are more appropriate in a hypothesis-free genome-wide setting.
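For orientation, the original single-locus, biallelic TDT that M-TDT extends can be sketched as follows. This is the classic McNemar-type statistic, not the M-TDT itself; the function name `tdt` and its arguments are hypothetical. Here `b` counts heterozygous parents who transmit the allele of interest to an affected offspring and `c` counts those who transmit the other allele.

```python
import math


def tdt(b, c):
    """Classic transmission disequilibrium test for one biallelic marker.

    Returns the chi-square statistic (1 degree of freedom) and its
    asymptotic p-value.
    """
    if b < 0 or c < 0 or b + c == 0:
        raise ValueError("need non-negative counts with b + c > 0")
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

For example, `tdt(60, 40)` gives a statistic of 4.0 with a p-value of about 0.046.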
4
Villamil-Ramírez H, León-Mimila P, Macias-Kauffer LR, Canizalez-Román A, Villalobos-Comparán M, León-Sicairos N, Vega-Badillo J, Sánchez-Muñoz F, López-Contreras B, Morán-Ramos S, Villarreal-Molina T, Zurita LC, Campos-Pérez F, Huertas-Vazquez A, Bojalil R, Romero-Hidalgo S, Aguilar-Salinas CA, Canizales-Quinteros S. A combined linkage and association strategy identifies a variant near the GSTP1 gene associated with BMI in the Mexican population. J Hum Genet 2016; 62:413-418. [PMID: 27881840 DOI: 10.1038/jhg.2016.145] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/14/2016] [Revised: 10/24/2016] [Accepted: 10/25/2016] [Indexed: 12/27/2022]
Abstract
Obesity is a major public health concern in Mexico and worldwide. Although the estimated heritability is high, common variants identified by genome-wide association studies explain only a small proportion of this heritability. A combination of linkage and association strategies could be a more robust and powerful approach to identify other obesity-susceptibility variants. We thus sought to identify novel genetic variants associated with obesity-related traits in the Mexican population by combining these methods. We performed a genome-wide linkage scan for body mass index (BMI) and other obesity-related phenotypes in 16 Mexican families using the Sequential Oligogenic Linkage Analysis Routines Program. Associated single-nucleotide polymorphisms (SNPs) were tested for associations in an independent cohort. Two suggestive BMI-linkage peaks (logarithm of odds ⩾1.5) were observed at chromosomal regions 11q13 and 13q22. Only rs614080 in the 11q13 region was significantly associated with BMI and related traits in these families. This association was also significant in an independent cohort of Mexican adults. Moreover, this variant was significantly associated with GSTP1 gene expression levels in adipose tissue. In conclusion, the rs614080 SNP near the GSTP1 gene was significantly associated with BMI and GSTP1 expression levels in the Mexican population.
Affiliation(s)
- Hugo Villamil-Ramírez
  - Programa de Doctorado en Ciencias Biológicas y de la Salud, Universidad Autónoma Metropolitana, México City, México; Unidad de Genómica de Poblaciones Aplicada a la Salud, Facultad de Química, UNAM/Instituto Nacional de Medicina Genómica (INMEGEN), México City, México
- Paola León-Mimila
  - Unidad de Genómica de Poblaciones Aplicada a la Salud, Facultad de Química, UNAM/Instituto Nacional de Medicina Genómica (INMEGEN), México City, México
- Luis R Macias-Kauffer
  - Unidad de Genómica de Poblaciones Aplicada a la Salud, Facultad de Química, UNAM/Instituto Nacional de Medicina Genómica (INMEGEN), México City, México
- Joel Vega-Badillo
  - Unidad de Genómica de Poblaciones Aplicada a la Salud, Facultad de Química, UNAM/Instituto Nacional de Medicina Genómica (INMEGEN), México City, México
- Fausto Sánchez-Muñoz
  - Departamento de Inmunología, Instituto Nacional de Cardiología Ignacio Chávez (INCICh), México City, México
- Blanca López-Contreras
  - Unidad de Genómica de Poblaciones Aplicada a la Salud, Facultad de Química, UNAM/Instituto Nacional de Medicina Genómica (INMEGEN), México City, México
- Sofía Morán-Ramos
  - Unidad de Genómica de Poblaciones Aplicada a la Salud, Facultad de Química, UNAM/Instituto Nacional de Medicina Genómica (INMEGEN), México City, México
- Luis C Zurita
  - Clínica Integral de Cirugía para la Obesidad y Enfermedades Metabólicas, Hospital General 'Dr Rubén Leñero', México City, México
- Francisco Campos-Pérez
  - Clínica Integral de Cirugía para la Obesidad y Enfermedades Metabólicas, Hospital General 'Dr Rubén Leñero', México City, México
- Rafael Bojalil
  - Departamento de Inmunología, Instituto Nacional de Cardiología Ignacio Chávez (INCICh), México City, México; Departmento de Atención a la salud, Universidad Autónoma Metropolitana-Xochimilco, México City, México
- Carlos A Aguilar-Salinas
  - Departamento de Endocrinología y Metabolismo, Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán, México City, México
- Samuel Canizales-Quinteros
  - Unidad de Genómica de Poblaciones Aplicada a la Salud, Facultad de Química, UNAM/Instituto Nacional de Medicina Genómica (INMEGEN), México City, México
5
Darst BF, Engelman CD. Transmission and decorrelation methods for detecting rare variants using sequencing data from related individuals. BMC Proc 2016; 10:203-207. [PMID: 27980637 PMCID: PMC5133523 DOI: 10.1186/s12919-016-0031-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/03/2023]
Abstract
BACKGROUND: Advances in whole genome sequencing have enabled the investigation of rare variants, which could explain some of the missing heritability that genome-wide association studies are unable to detect. Most methods to detect associations with rare variants are developed for unrelated individuals; however, several methods exist that utilize family studies and could have better power to detect such associations.
METHODS: Using whole genome sequencing data and simulated phenotypes provided by the organizers of the Genetic Analysis Workshop 19 (GAW19), we compared family-based methods that test for associations between rare and common variants with a quantitative trait. This was done using 2 fairly novel methods: family-based association test for rare variants (FBAT-RV), which is a transmission-based method that utilizes the transmission of genetic information from parent to offspring; and Minimum p value Optimized Nuisance parameter Score Test Extended to Relatives (MONSTER), which is a decorrelation method that instead attempts to adjust for relatedness using a regression-based method. We also considered family-based association test linear combination (FBAT-LC) and FBAT-Min P, which are slightly older methods that do not allow for the weighting of rare or common variants, but contrast some of the limitations of FBAT-RV.
RESULTS: MONSTER had much higher overall power than FBAT-RV and FBAT-Min P. Interestingly, FBAT-LC had similar overall power as MONSTER. MONSTER had the highest power for a gene accounting for a larger percent of the phenotypic variance, whereas MONSTER and FBAT-LC both had the highest power for a gene accounting for moderate variance. FBAT-LC had the highest power for a gene accounting for the least variance.
CONCLUSIONS: Based on the simulated data from GAW19, MONSTER and FBAT-LC were the most powerful of the methods assessed. However, there are limitations to each of these methods that should be carefully considered when conducting an analysis of rare variants in related individuals. This emphasizes the need for methods that can incorporate the advantages of each of these methods into 1 family-based association test for rare variants.
Affiliation(s)
- Burcu F. Darst
  - University of Wisconsin, Madison, WI USA
  - Department of Population Health Sciences, University of Wisconsin School of Medicine and Public Health, Madison, WI USA
- Corinne D. Engelman
  - University of Wisconsin, Madison, WI USA
  - Department of Population Health Sciences, University of Wisconsin School of Medicine and Public Health, Madison, WI USA
6
Kim W. Transmission Disequilibrium Tests Based on Read Counts for Low-Coverage Next-Generation Sequence Data. Hum Hered 2015; 80:36-49. [PMID: 26278553 DOI: 10.1159/000434645] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 03/06/2015] [Accepted: 05/30/2015] [Indexed: 11/19/2022]
Abstract
The purpose of this paper is the introduction of new statistical methods for case-parent trio association studies based on the read counts that can be obtained from next-generation sequencing (NGS) experiments. This work focuses on the inclusion of low-coverage data into the case-parent trio design without genotype classification or imputation. Two different approaches are considered: (1) a likelihood-based approach implementing a 15-component parametric mixture model and (2) a model-free approach that applies non-parametric statistical methods to the ratios of the read counts to coverage. Simulation studies are conducted to evaluate the performances of the proposed tests. In addition, the non-centrality parameters of the mixture likelihood-based tests are derived to determine sample sizes and coverage for a NGS experimental design. As an example, the sample sizes to maintain specified powers of a published adolescent idiopathic scoliosis (AIS) study are presented. The simulation results show that the tests using the genotypes classified by the maximum Bayesian posterior probability have significantly inflated type I error rates for low-coverage data. The tests using the posterior probabilities instead of the classified genotypes show lower power than the proposed tests. Generally, power for the likelihood-based approach is higher than that for the non-parametric ratio-based approach. For the AIS example, approximately 654 trios with 4× coverage are necessary to maintain 90% power when detecting an association of odds ratio 2 at a locus with a minor allele frequency of 0.35 at the level of significance α = 5 × 10⁻⁸. By comparison, approximately 416 trios with 25× coverage are required to maintain the same power with the same settings. The R and C source codes to calculate the proposed test statistics, the sample sizes and power can be obtained by contacting the author (wkim@cau.ac.kr).
Affiliation(s)
- Wonkuk Kim
  - Department of Applied Statistics, Chung-Ang University, Seoul, South Korea
7
Association test with the principal component analysis in case-parents studies. J Genet 2015; 94:347-9. [PMID: 26174687 DOI: 10.1007/s12041-015-0522-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
8
Wang YT, Sung PY, Lin PL, Yu YW, Chung RH. A multi-SNP association test for complex diseases incorporating an optimal P-value threshold algorithm in nuclear families. BMC Genomics 2015; 16:381. [PMID: 25975968 PMCID: PMC4433014 DOI: 10.1186/s12864-015-1620-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 11/13/2014] [Accepted: 05/05/2015] [Indexed: 01/22/2023]
Abstract
Background: Genome-wide association studies (GWAS) have become a common approach to identifying single nucleotide polymorphisms (SNPs) associated with complex diseases. As complex diseases are caused by the joint effects of multiple genes, while the effect of individual gene or SNP is modest, a method considering the joint effects of multiple SNPs can be more powerful than testing individual SNPs. The multi-SNP analysis aims to test association based on a SNP set, usually defined based on biological knowledge such as gene or pathway, which may contain only a portion of SNPs with effects on the disease. Therefore, a challenge for the multi-SNP analysis is how to effectively select a subset of SNPs with promising association signals from the SNP set.
Results: We developed the Optimal P-value Threshold Pedigree Disequilibrium Test (OPTPDT). The OPTPDT uses general nuclear families. A variable p-value threshold algorithm is used to determine an optimal p-value threshold for selecting a subset of SNPs. A permutation procedure is used to assess the significance of the test. We used simulations to verify that the OPTPDT has correct type I error rates. Our power studies showed that the OPTPDT can be more powerful than the set-based test in PLINK, the multi-SNP FBAT test, and the p-value based test GATES. We applied the OPTPDT to a family-based autism GWAS dataset for gene-based association analysis and identified MACROD2-AS1 with genome-wide significance (p-value = 2.5 × 10⁻⁶).
Conclusions: Our simulation results suggested that the OPTPDT is a valid and powerful test. The OPTPDT will be helpful for gene-based or pathway association analysis. The method is ideal for the secondary analysis of existing GWAS datasets, which may identify a set of SNPs with joint effects on the disease.
Electronic supplementary material: The online version of this article (doi:10.1186/s12864-015-1620-3) contains supplementary material, which is available to authorized users.
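How a permutation procedure can assess a multi-SNP statistic in a family-based design may be sketched as follows. This is an illustration for case-parent trios under stated assumptions, not the OPTPDT algorithm: it assumes that, under the null hypothesis, transmitted and untransmitted parental alleles are exchangeable within a family, so the sign of each family's transmitted-minus-untransmitted difference vector can be flipped at random (flipping the whole vector keeps the correlation between SNPs intact). The statistic used here, the maximum squared TDT-like score across SNPs, is a stand-in for the variable p-value threshold statistic, and all names are hypothetical.

```python
import random


def max_squared_score(diffs, signs):
    """Largest squared, standardized transmission score across SNPs.

    diffs[i][j] is the transmitted-minus-untransmitted allele count for
    family i at SNP j; signs[i] is +1 or -1 for family i.
    """
    best = 0.0
    for j in range(len(diffs[0])):
        num = sum(s * d[j] for s, d in zip(signs, diffs))
        den = sum(d[j] ** 2 for d in diffs)
        if den > 0:  # skip SNPs with no informative transmissions
            best = max(best, num * num / den)
    return best


def signflip_pvalue(diffs, n_perm=999, seed=0):
    """Empirical p-value from random within-family sign flips."""
    rng = random.Random(seed)
    observed = max_squared_score(diffs, [1] * len(diffs))
    hits = 0
    for _ in range(n_perm):
        signs = [rng.choice((-1, 1)) for _ in diffs]
        if max_squared_score(diffs, signs) >= observed:
            hits += 1
    # Count the observed data as one permutation so the p-value is never 0.
    return (hits + 1) / (n_perm + 1)
```

Because the same sign flips are applied to every SNP in a family, the null distribution of the maximum accounts for linkage disequilibrium within the SNP set without further correction.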
Affiliation(s)
- Yi-Ting Wang
  - Institute of Statistics, National Tsing Hua University, Hsin-Chu, Taiwan
- Pei-Yuan Sung
  - Institute of Statistics, National Tsing Hua University, Hsin-Chu, Taiwan
- Peng-Lin Lin
  - Department of Medical Science, National Tsing Hua University, Hsin-Chu, Taiwan
- Ya-Wen Yu
  - Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan, Taiwan
- Ren-Hua Chung
  - Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan, Taiwan
9
Wade M, Hoffmann TJ, Jenkins JM. Association between the arginine vasopressin receptor 1A (AVPR1A) gene and preschoolers’ executive functioning. Brain Cogn 2014; 90:116-23. [DOI: 10.1016/j.bandc.2014.06.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 11/06/2013] [Revised: 06/02/2014] [Accepted: 06/03/2014] [Indexed: 01/08/2023]
10
Li D, Zhou J, Thomas DC, Fardo DW. Complex pedigrees in the sequencing era: to track transmissions or decorrelate? Genet Epidemiol 2014; 38 Suppl 1:S29-36. [PMID: 25112185 DOI: 10.1002/gepi.21822] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/06/2022]
Abstract
Next-generation sequencing (NGS) studies are becoming commonplace, and the NGS field is continuing to develop rapidly. Analytic methods aimed at testing for the various roles that genetic susceptibility plays in disease are also rapidly being developed and optimized. Studies that incorporate large, complex pedigrees are of particular importance because they provide detailed information about inheritance patterns and can be analyzed in a variety of complementary ways. The nine contributions from our Genetic Analysis Workshop 18 working group on family-based tests of association for rare variants using simulated data examined analytic methods for testing genetic association using whole-genome sequencing data from 20 large pedigrees with 200 phenotype simulation replicates. What distinguishes the approaches explored is how the complexities of analyzing familial genetic data were handled. Here, we explore the methods that either harness inheritance patterns and transmission information or attempt to adjust for the correlation between family members in order to utilize computationally and conceptually simpler statistical testing procedures. Although directly comparing these two classes of approaches across contributions is difficult, we note that the two classes balance robustness to population stratification and computational complexity (the transmission-based approaches) with simplicity and increased power, assuming no population stratification or proper adjustment for it (decorrelation approaches).
Affiliation(s)
- Dalin Li
  - Medical Genetics Institute, Cedars-Sinai Medical Center, Los Angeles, California, United States of America; David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California, United States of America
11
Wade M, Hoffmann TJ, Wigg K, Jenkins JM. Association between the oxytocin receptor (OXTR) gene and children's social cognition at 18 months. Genes Brain Behav 2014; 13:603-10. [DOI: 10.1111/gbb.12148] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Received: 03/02/2014] [Revised: 06/05/2014] [Accepted: 06/09/2014] [Indexed: 11/26/2022]
Affiliation(s)
- M. Wade
  - Department of Applied Psychology and Human Development; University of Toronto; Toronto ON Canada
- T. J. Hoffmann
  - Department of Epidemiology and Biostatistics and Institute for Human Genetics; University of California at San Francisco; San Francisco CA USA
- K. Wigg
  - Genetics and Development Division; Toronto Western Research Institute; Toronto ON Canada
- J. M. Jenkins
  - Department of Applied Psychology and Human Development; University of Toronto; Toronto ON Canada
12
Girardi A, Martinelli M, Cura F, Palmieri A, Carinci F, Sesenna E, Scapoli L. RFC1 and non-syndromic cleft lip with or without cleft palate: an association based study in Italy. J Craniomaxillofac Surg 2014; 42:1503-5. [PMID: 24942095 DOI: 10.1016/j.jcms.2014.04.021] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 12/02/2013] [Revised: 03/07/2014] [Accepted: 04/22/2014] [Indexed: 11/27/2022]
Abstract
The molecular basis of orofacial development is largely unknown and needs to be unravelled. Non-syndromic cleft lip with or without cleft palate (NSCL/P) is the most common craniofacial malformation, with an incidence of about 1/700 live births, although variable according to ethnicity. Being a multifactorial disease, it arises as a result of an interplay between genetic and environmental factors. Several approaches have been developed to identify susceptibility genes. Genes belonging to the folate/homocysteine pathway are attracting increasing interest because folate supplementation before and during early pregnancy can reduce the risk of NSCL/P. We performed a family based association study in order to assess if a genetic variant of RFC1 could be involved in NSCL/P onset. We genotyped 404 unrelated probands and their relatives for three biallelic polymorphic variants (rs1051266, rs4818789 and rs3788205), that were selected because they produced conflicting results on previous investigations. Evidence of association was found between the investigated polymorphisms and NSCL/P in our sample of the Italian population, albeit with weak significance levels. Results from this investigation provided a support of previous studies suggesting a role of RFC1 in NSCL/P aetiology, reinforcing the concept that genetic predisposition to NSCL/P varies enormously within different ethnic groups.
Affiliation(s)
- Ambra Girardi
  - Department of Experimental, Diagnostic and Specialty Medicine, University di Bologna, Via Belmeloro 8, 40126 Bologna, Italy
- Marcella Martinelli
  - Department of Experimental, Diagnostic and Specialty Medicine, University di Bologna, Via Belmeloro 8, 40126 Bologna, Italy
- Francesca Cura
  - Department of Experimental, Diagnostic and Specialty Medicine, University di Bologna, Via Belmeloro 8, 40126 Bologna, Italy
- Annalisa Palmieri
  - Department of Experimental, Diagnostic and Specialty Medicine, University di Bologna, Via Belmeloro 8, 40126 Bologna, Italy
- Francesco Carinci
  - Department of Morphology, Surgery and Experimental Medicine, University of Ferrara, Via Luigi Borsari 46, 44121 Ferrara, Italy
- Enrico Sesenna
  - Head and Neck Department, University Hospital of Parma, Via Gramsci 14, 43100 Parma, Italy
- Luca Scapoli
  - Department of Experimental, Diagnostic and Specialty Medicine, University di Bologna, Via Belmeloro 8, 40126 Bologna, Italy
13
Zhou JJ, Yip WK, Cho MH, Qiao D, McDonald MLN, Laird NM. A comparative analysis of family-based and population-based association tests using whole genome sequence data. BMC Proc 2014; 8:S33. [PMID: 25519381 PMCID: PMC4143682 DOI: 10.1186/1753-6561-8-s1-s33] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 12/26/2022]
Abstract
The revolution in next-generation sequencing has made obtaining both common and rare high-quality sequence variants across the entire genome feasible. Because researchers are now faced with the analytical challenges of handling a massive amount of genetic variant information from sequencing studies, numerous methods have been developed to assess the impact of both common and rare variants on disease traits. In this report, whole genome sequencing data from Genetic Analysis Workshop 18 was used to compare the power of several methods, considering both family-based and population-based designs, to detect association with variants in the MAP4 gene region and on chromosome 3 with blood pressure. To prioritize variants across the genome for testing, variants were first functionally assessed using prediction algorithms and expression quantitative trait loci (eQTLs) data. Four set-based tests in the family-based association tests (FBAT) framework--FBAT-v, FBAT-lmm, FBAT-m, and FBAT-l--were used to analyze 20 pedigrees, and 2 variance component tests, sequence kernel association test (SKAT) and genome-wide complex trait analysis (GCTA), were used with 142 unrelated individuals in the sample. Both set-based and variance-component-based tests had high power and an adequate type I error rate. Of the various FBATs, FBAT-l demonstrated superior performance, indicating the potential for it to be used in rare-variant analysis. The updated FBAT package is available at: http://www.hsph.harvard.edu/fbat/.
Affiliation(s)
- Jin J Zhou
  - Biostatistics Department, Harvard School of Public Health, Boston, MA 02115, USA; Division of Epidemiology and Biostatistics, College of Public Health, University of Arizona, Tucson, AZ 85724, USA
- Wai-Ki Yip
  - Biostatistics Department, Harvard School of Public Health, Boston, MA 02115, USA
- Michael H Cho
  - Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Division of Pulmonary and Critical Care Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Dandi Qiao
  - Biostatistics Department, Harvard School of Public Health, Boston, MA 02115, USA
- Merry-Lynn N McDonald
  - Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Nan M Laird
  - Biostatistics Department, Harvard School of Public Health, Boston, MA 02115, USA
14
Turkmen AS, Lin S. Blocking approach for identification of rare variants in family-based association studies. PLoS One 2014; 9:e86126. [PMID: 24465912 PMCID: PMC3900483 DOI: 10.1371/journal.pone.0086126] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 07/30/2013] [Accepted: 12/09/2013] [Indexed: 01/14/2023]
Abstract
With the advent of next-generation sequencing technology, rare variant association analysis is increasingly being conducted to identify genetic variants associated with complex traits. In recent years, significant effort has been devoted to develop powerful statistical methods to test such associations for population-based designs. However, there has been relatively little development for family-based designs although family data have been shown to be more powerful to detect rare variants. This study introduces a blocking approach that extends two popular family-based common variant association tests to rare variants association studies. Several options are considered to partition a genomic region (gene) into "independent" blocks by which information from SNVs is aggregated within a block and an overall test statistic for the entire genomic region is calculated by combining information across these blocks. The proposed methodology allows different variants to have different directions (risk or protective) and specification of minor allele frequency threshold is not needed. We carried out a simulation to verify the validity of the method by showing that type I error is well under control when the underlying null hypothesis and the assumption of independence across blocks are satisfied. Further, data from the Genetic Analysis Workshop [Formula: see text] are utilized to illustrate the feasibility and performance of the proposed methodology in a realistic setting.
Affiliation(s)
- Asuman S Turkmen
- Statistics Department, The Ohio State University, Columbus, Ohio, United States of America; Statistics Department, The Ohio State University, Newark, Ohio, United States of America
- Shili Lin
- Statistics Department, The Ohio State University, Columbus, Ohio, United States of America
|
15
|
Family-based association tests for sequence data, and comparisons with population-based association tests. Eur J Hum Genet 2013; 21:1158-62. [PMID: 23386037 DOI: 10.1038/ejhg.2012.308] [Citation(s) in RCA: 72] [Impact Index Per Article: 6.5] [Received: 07/19/2012] [Revised: 10/22/2012] [Accepted: 11/21/2012] [Indexed: 11/08/2022]
Abstract
Recent advances in high-throughput sequencing technologies make it increasingly more efficient to sequence large cohorts for many complex traits. We discuss here a class of sequence-based association tests for family-based designs that corresponds naturally to previously proposed population-based tests, including the classical Burden and variance-component tests. This framework allows for a direct comparison between the powers of sequence-based association tests with family- vs population-based designs. We show that for dichotomous traits using family-based controls results in similar power levels as the population-based design (although at an increased sequencing cost for the family-based design), while for continuous traits (in random samples, no ascertainment) the population-based design can be substantially more powerful. A possible disadvantage of population-based designs is that they can lead to increased false-positive rates in the presence of population stratification, while the family-based designs are robust to population stratification. We show also an application to a small exome-sequencing family-based study on autism spectrum disorders. The tests are implemented in publicly available software.
|
16
|
Kamens HM, Corley RP, McQueen MB, Stallings MC, Hopfer CJ, Crowley TJ, Brown SA, Hewitt JK, Ehringer MA. Nominal association with CHRNA4 variants and nicotine dependence. Genes, Brain and Behavior 2013; 12:297-304. [PMID: 23350800 DOI: 10.1111/gbb.12021] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 09/07/2012] [Revised: 11/06/2012] [Accepted: 01/10/2013] [Indexed: 01/05/2023]
Abstract
Nicotine binds to nicotinic acetylcholine receptors and studies in animal models have shown that α4β2 receptors mediate many behavioral effects of nicotine. Human genetics studies have provided support that variation in the gene that codes for the α4 subunit influences nicotine dependence (ND), but the evidence for the involvement of the β2 subunit gene is less convincing. In this study, we examined the genetic association between variation in the genes that code for the α4 (CHRNA4) and β2 (CHRNB2) subunits of the nicotinic acetylcholine receptor and a quantitative measure of lifetime DSM-IV ND symptom counts. We performed this analysis in two longitudinal family-based studies focused on adolescent antisocial drug abuse: the Center on Antisocial Drug Dependence (CADD, N = 313 families) and Genetics of Antisocial Drug Dependence (GADD, N = 111 families). Family-based association tests were used to examine associations between 14 single nucleotide polymorphisms (SNPs) in CHRNA4 and CHRNB2 and ND symptoms. Symptom counts were corrected for age, sex and clinical status prior to the association analysis. Results, when the samples were combined, provided modest evidence that SNPs in CHRNA4 are associated with ND. The minor allele at both rs1044394 (A; Z = 1.988, P = 0.047, unadjusted P-value) and rs1044396 (G; Z = 2.398, P = 0.017, unadjusted P-value) was associated with increased risk of ND symptoms. These data provide suggestive evidence that variation in the α4 subunit of the nicotinic acetylcholine receptor may influence ND liability.
Affiliation(s)
- H M Kamens
- Institute for Behavioral Genetics, University of Colorado Boulder, Boulder, CO, USA
|
17
|
De G, Yip WK, Ionita-Laza I, Laird N. Rare variant analysis for family-based design. PLoS One 2013; 8:e48495. [PMID: 23341868 PMCID: PMC3546113 DOI: 10.1371/journal.pone.0048495] [Citation(s) in RCA: 76] [Impact Index Per Article: 6.9] [Received: 11/29/2011] [Accepted: 10/01/2012] [Indexed: 12/21/2022]
Abstract
Genome-wide association studies have been able to identify disease associations with many common variants; however most of the estimated genetic contribution explained by these variants appears to be very modest. Rare variants are thought to have larger effect sizes compared to common SNPs but effects of rare variants cannot be tested in the GWAS setting. Here we propose a novel method to test for association of rare variants obtained by sequencing in family-based samples by collapsing the standard family-based association test (FBAT) statistic over a region of interest. We also propose a suitable weighting scheme so that low frequency SNPs that may be enriched in functional variants can be upweighted compared to common variants. Using simulations we show that the family-based methods perform at par with the population-based methods under no population stratification. By construction, family-based tests are completely robust to population stratification; we show that our proposed methods remain valid even when population stratification is present.
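As a rough illustration of the collapsing idea described in this abstract, the sketch below sums weighted transmission residuals over the variants of a region for parent-offspring trios. It is only a sketch under stated assumptions, not the authors' implementation: the function name `burden_fbat_z`, the input layout, the use of an empirical (sum-of-squares) variance, and the requirement that weights be supplied by the caller are all choices made for this example.

```python
import math


def burden_fbat_z(trios, weights):
    """Collapsed (burden-type) family-based Z statistic for trios.

    trios   : list of (father, mother, child, trait) tuples; each genotype
              is a list of minor-allele counts (0, 1 or 2), one per variant;
              trait is the (offset-coded) phenotype of the child.
    weights : one weight per variant (hypothetical; chosen by the caller,
              e.g. larger for rarer variants).
    """
    u = 0.0  # collapsed score statistic
    v = 0.0  # empirical variance (assumption of this sketch)
    for father, mother, child, trait in trios:
        s = 0.0
        for w, gf, gm, gc in zip(weights, father, mother, child):
            # Under Mendelian transmission the expected child genotype,
            # given both parents, is the parental average.
            expected = (gf + gm) / 2.0
            s += w * (gc - expected)
        contribution = trait * s
        u += contribution
        v += contribution ** 2
    # No informative family: return 0.0 rather than dividing by zero.
    return u / math.sqrt(v) if v > 0 else 0.0


# Toy data: three affected trios, two rare variants, equal weights.
example_trios = [
    ([1, 0], [0, 0], [1, 0], 1.0),  # variant 1 transmitted
    ([0, 1], [0, 0], [0, 1], 1.0),  # variant 2 transmitted
    ([1, 0], [0, 0], [0, 0], 1.0),  # variant 1 not transmitted
]
z = burden_fbat_z(example_trios, [1.0, 1.0])
```

Because each family contributes only the deviation of the child's genotype from its expectation given the parents, the statistic does not depend on population allele frequencies, which is the source of the robustness to population stratification mentioned in the abstract.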
Affiliation(s)
- Gourab De
- Department of Biostatistics, Harvard University, Boston, MA, USA.
|
18
|
Schifano ED, Epstein MP, Bielak LF, Jhun MA, Kardia SLR, Peyser PA, Lin X. SNP set association analysis for familial data. Genet Epidemiol 2012; 36:797-810. [PMID: 22968922 DOI: 10.1002/gepi.21676] [Citation(s) in RCA: 72] [Impact Index Per Article: 6.0] [Received: 03/12/2012] [Revised: 07/06/2012] [Accepted: 07/30/2012] [Indexed: 11/06/2022]
Abstract
Genome-wide association studies (GWAS) are a popular approach for identifying common genetic variants and epistatic effects associated with a disease phenotype. The traditional statistical analysis of such GWAS attempts to assess the association between each individual single-nucleotide polymorphism (SNP) and the observed phenotype. Recently, kernel machine-based tests for association between a SNP set (e.g., SNPs in a gene) and the disease phenotype have been proposed as a useful alternative to the traditional individual-SNP approach, and allow for flexible modeling of the potentially complicated joint SNP effects in a SNP set while adjusting for covariates. We extend the kernel machine framework to accommodate related subjects from multiple independent families, and provide a score-based variance component test for assessing the association of a given SNP set with a continuous phenotype, while adjusting for additional covariates and accounting for within-family correlation. We illustrate the proposed method using simulation studies and an application to genetic data from the Genetic Epidemiology Network of Arteriopathy (GENOA) study.
Affiliation(s)
- Elizabeth D Schifano
- Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts
|
19
|
Guo W, Shugart YY. Detecting rare variants for quantitative traits using nuclear families. Hum Hered 2012; 73:148-58. [PMID: 22699804 DOI: 10.1159/000338439] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 12/08/2011] [Accepted: 03/30/2012] [Indexed: 01/17/2023]
Abstract
With the advent of sequencing technology opening up a new era of personal genome sequencing, huge amounts of rare variant data have suddenly become available to researchers seeking genetic variants related to human complex disorders. There is an urgent need for the development of novel statistical methods to analyze rare variants in a statistically powerful manner. While a number of statistical tests have already been developed to analyze collapsed rare variants identified by association tests in case-control studies, to date, only two FBAT tests-for-rare (described in the updated FBAT version v2.0.4) have applied collapsing methods analogously in family-based designs. For further research in this area, this study aims to introduce three new beta-determined weight tests for detecting rare variants for quantitative traits in nuclear families. In addition to evaluating the performance of these new methods, it also evaluates that of the two FBAT tests-for-rare, using extensive simulations of situations with and without linkage disequilibrium. Results from these simulations suggest that the four tests using beta-determined weights outperform the two collapsing methods used in FBAT (-v0 and -v1). In addition, both the linear combination method (detailed in the FBAT menu v2.0.4) and the multiple regression method (mixing LASSO and Ridge penalties) performed better than the other two beta-determined weight tests we proposed. Following testing and evaluation, we submitted four new beta-determined weight methods of statistical analysis in a computer program to the Comprehensive R Archive Network (CRAN) for general use.
Affiliation(s)
- Wei Guo
- Division of Intramural Division Program, National Institute of Mental Health, National Institutes of Health, Bethesda, MD 20892, USA
|
20
|
Ionita-Laza I, Makarov V, Yoon S, Raby B, Buxbaum J, Nicolae DL, Lin X. Finding disease variants in Mendelian disorders by using sequence data: methods and applications. Am J Hum Genet 2011; 89:701-12. [PMID: 22137099 DOI: 10.1016/j.ajhg.2011.11.003] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Received: 07/14/2011] [Revised: 09/19/2011] [Accepted: 11/03/2011] [Indexed: 12/11/2022]
Abstract
Many sequencing studies are now underway to identify the genetic causes for both Mendelian and complex traits. Via exome-sequencing, genes harboring variants implicated in several Mendelian traits have already been identified. The underlying methodology in these studies is a multistep algorithm based on filtering variants identified in a small number of affected individuals and depends on whether they are novel (not yet seen in public resources such as dbSNP), shared among affected individuals, and other external functional information on the variants. Although intuitive, these filter-based methods are nonoptimal and do not provide any measure of statistical uncertainty. We describe here a formal statistical approach that has several distinct advantages: (1) it provides fast computation of approximate p values for individual genes, (2) it adjusts for the background variation in each gene, (3) it allows for incorporation of functional or linkage-based information, and (4) it accommodates designs based on both affected relative pairs and unrelated affected individuals. We show via simulations that the proposed approach can be used in conjunction with the existing filter-based methods to achieve a substantially better ranking of a gene relevant for disease when compared to currently used filter-based approaches; this is especially so in the presence of disease locus heterogeneity. We revisit recent studies on three Mendelian diseases and show that the proposed approach results in the implicated gene being ranked first in all studies, and approximate p values of 10⁻⁶ for the Miller Syndrome gene, 1.0 × 10⁻⁴ for the Freeman-Sheldon Syndrome gene, and 3.5 × 10⁻⁵ for the Kabuki Syndrome gene.
|
21
|
Yu Z, Wang S. Contrasting linkage disequilibrium as a multilocus family-based association test. Genet Epidemiol 2011; 35:487-98. [PMID: 21769928 DOI: 10.1002/gepi.20598] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 11/14/2010] [Revised: 04/20/2011] [Accepted: 04/24/2011] [Indexed: 02/04/2023]
Abstract
Linkage disequilibrium (LD) of genetic loci is routinely estimated and graphically illustrated in genetic association studies. It has been suggested that the information in LD is also useful for association mapping and genetic association can be detected by comparing LD patterns between cases and controls. Here, we extend this idea to analyze case-parents data by comparing LD patterns between transmitted and nontransmitted genotypes. We provide the condition when contrasting LD is valid for testing gene-gene interactions. A permutation procedure is given to assess statistical significance. One advantage of our proposed methods is that haplotype information is not required. Thus, the implementation of our methods is straightforward and the resulted tests are free from potential bias caused by assumptions made to estimate haplotypes in silico. Since our test statistics use pairwise LD measurements, they are less affected by missing data than many other multilocus methods. With simulated data, we demonstrate that examining LD patterns of case-parents data is a useful multilocus association mapping strategy and it complements existing association mapping methods. The application of our methods to a Crohn's disease data set shows that our methods can detect multilocus association that might be missed by other association methods. Our permutation procedure can also be modified to allow multiple offspring from a family to be analyzed.
Affiliation(s)
- Zhaoxia Yu
- Department of Statistics, University of California-Irvine, CA 92697, USA.
|
22
|
Van Steen K. Perspectives on genome-wide multi-stage family-based association studies. Stat Med 2011; 30:2201-21. [DOI: 10.1002/sim.4259] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 02/16/2010] [Accepted: 03/07/2011] [Indexed: 01/03/2023]
|
23
|
Bureau A, Croteau J, Tayeb A, Mérette C, Labbe A. Latent class model with familial dependence to address heterogeneity in complex diseases: adapting the approach to family-based association studies. Genet Epidemiol 2011; 35:182-9. [PMID: 21308764 DOI: 10.1002/gepi.20566] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 08/31/2010] [Revised: 11/29/2010] [Accepted: 01/04/2011] [Indexed: 11/10/2022]
Abstract
Clinical diagnoses of complex diseases may often encompass multiple genetically heterogeneous disorders. One way of dissecting this heterogeneity is to apply latent class (LC) analysis to measurements related to the diagnosis, such as detailed symptoms, to define more homogeneous disease sub-types, influenced by a smaller number of genes that will thus be more easily detectable. We have previously developed a LC model allowing dependence between the latent disease class status of relatives within families. We have also proposed a strategy to incorporate the posterior probability of class membership of each subject in parametric linkage analysis, which is not directly transferable to genetic association methods. Under the framework of family-based association tests (FBAT), we now propose to make the contribution of an affected subject to the FBAT statistic proportional to his or her posterior class membership probability. Simulations showed a modest but robust power advantage compared to simply assigning each subject to his or her most probable class, and important power gains over the analysis of the disease diagnosis without LC modeling under certain scenarios. The use of LC analysis with FBAT is illustrated using autism spectrum disorder (ASD) symptoms on families from the Autism Genetics Research Exchange, where we examined eight regions previously associated to autism in this sample. The analysis using the posterior probability of membership to an LC detected an association in the JARID2 gene as significant as that for ASD (P = 3 × 10⁻⁵) but with a larger effect size (odds ratio = 2.17 vs. 1.55).
Affiliation(s)
- Alexandre Bureau
- Centre de recherche Université Laval Robert-Giffard, Quebec City, Quebec, Canada.
|
24
|
Hussman JP, Chung RH, Griswold AJ, Jaworski JM, Salyakina D, Ma D, Konidari I, Whitehead PL, Vance JM, Martin ER, Cuccaro ML, Gilbert JR, Haines JL, Pericak-Vance MA. A noise-reduction GWAS analysis implicates altered regulation of neurite outgrowth and guidance in autism. Mol Autism 2011; 2:1. [PMID: 21247446 PMCID: PMC3035032 DOI: 10.1186/2040-2392-2-1] [Citation(s) in RCA: 130] [Impact Index Per Article: 10.0] [Received: 07/23/2010] [Accepted: 01/19/2011] [Indexed: 12/22/2022]
Abstract
BACKGROUND Genome-wide Association Studies (GWAS) have proved invaluable for the identification of disease susceptibility genes. However, the prioritization of candidate genes and regions for follow-up studies often proves difficult due to false-positive associations caused by statistical noise and multiple-testing. In order to address this issue, we propose the novel GWAS noise reduction (GWAS-NR) method as a way to increase the power to detect true associations in GWAS, particularly in complex diseases such as autism. METHODS GWAS-NR utilizes a linear filter to identify genomic regions demonstrating correlation among association signals in multiple datasets. We used computer simulations to assess the ability of GWAS-NR to detect association against the commonly used joint analysis and Fisher's methods. Furthermore, we applied GWAS-NR to a family-based autism GWAS of 597 families and a second existing autism GWAS of 696 families from the Autism Genetic Resource Exchange (AGRE) to arrive at a compendium of autism candidate genes. These genes were manually annotated and classified by a literature review and functional grouping in order to reveal biological pathways which might contribute to autism aetiology. RESULTS Computer simulations indicate that GWAS-NR achieves a significantly higher classification rate for true positive association signals than either the joint analysis or Fisher's methods and that it can also achieve this when there is imperfect marker overlap across datasets or when the closest disease-related polymorphism is not directly typed. In two autism datasets, GWAS-NR analysis resulted in 1535 significant linkage disequilibrium (LD) blocks overlapping 431 unique reference sequencing (RefSeq) genes. Moreover, we identified the nearest RefSeq gene to the non-gene overlapping LD blocks, producing a final candidate set of 860 genes. 
Functional categorization of these implicated genes indicates that a significant proportion of them cooperate in a coherent pathway that regulates the directional protrusion of axons and dendrites to their appropriate synaptic targets. CONCLUSIONS As statistical noise is likely to particularly affect studies of complex disorders, where genetic heterogeneity or interaction between genes may confound the ability to detect association, GWAS-NR offers a powerful method for prioritizing regions for follow-up studies. Applying this method to autism datasets, GWAS-NR analysis indicates that a large subset of genes involved in the outgrowth and guidance of axons and dendrites is implicated in the aetiology of autism.
Affiliation(s)
- Ren-Hua Chung
- John P. Hussman Institute for Human Genomics, University of Miami, 1501 NW 10th Avenue, Miami, FL 33136, USA
- Anthony J Griswold
- John P. Hussman Institute for Human Genomics, University of Miami, 1501 NW 10th Avenue, Miami, FL 33136, USA
- James M Jaworski
- John P. Hussman Institute for Human Genomics, University of Miami, 1501 NW 10th Avenue, Miami, FL 33136, USA
- Daria Salyakina
- John P. Hussman Institute for Human Genomics, University of Miami, 1501 NW 10th Avenue, Miami, FL 33136, USA
- Deqiong Ma
- John P. Hussman Institute for Human Genomics, University of Miami, 1501 NW 10th Avenue, Miami, FL 33136, USA
- Ioanna Konidari
- John P. Hussman Institute for Human Genomics, University of Miami, 1501 NW 10th Avenue, Miami, FL 33136, USA
- Patrice L Whitehead
- John P. Hussman Institute for Human Genomics, University of Miami, 1501 NW 10th Avenue, Miami, FL 33136, USA
- Jeffery M Vance
- John P. Hussman Institute for Human Genomics, University of Miami, 1501 NW 10th Avenue, Miami, FL 33136, USA
- Eden R Martin
- John P. Hussman Institute for Human Genomics, University of Miami, 1501 NW 10th Avenue, Miami, FL 33136, USA
- Michael L Cuccaro
- John P. Hussman Institute for Human Genomics, University of Miami, 1501 NW 10th Avenue, Miami, FL 33136, USA
- John R Gilbert
- John P. Hussman Institute for Human Genomics, University of Miami, 1501 NW 10th Avenue, Miami, FL 33136, USA
- Jonathan L Haines
- Vanderbilt Center for Human Genetics Research, Vanderbilt University, Nashville, TN, USA
- Margaret A Pericak-Vance
- John P. Hussman Institute for Human Genomics, University of Miami, 1501 NW 10th Avenue, Miami, FL 33136, USA
|
25
|
Wang WC, Hsiung CA, Wang LC, Chuang LM, Quertermous T, Chang IS. Distribution of the number of false discoveries in large-scale family-based association testing with application to the association between PTPN1 and hypertension and obesity. Hum Genet 2010; 129:425-32. [PMID: 21188419 DOI: 10.1007/s00439-010-0936-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/25/2010] [Accepted: 12/19/2010] [Indexed: 01/20/2023]
Abstract
We present a model-free approach to the study of the number of false discoveries for large-scale simultaneous family-based association tests (FBATs) in which the set of discoveries is decided by applying a threshold to the test statistics. When the association between a set of markers in a candidate gene and a group of phenotypes is studied by a class of FBATs, we indicate that a joint null hypothesis distribution for these statistics can be obtained by the fundamental statistical method of conditioning on sufficient statistics for the null hypothesis. Based on the joint null distribution of these statistics, we can obtain the distribution of the number of false discoveries for the set of discoveries defined by a threshold; the size of this set is referred to as its tail count. Simulation studies are presented to demonstrate that the conditional, not the unconditional, distribution of the tail count is appropriate for the study of false discoveries. The usefulness of this approach is illustrated by re-examining the association between PTPN1 and a group of blood-pressure-related phenotypes reported by Olivier et al. (Hum Mol Genet 13:1885-1892, 2004); our results refine and reinforce this association.
Affiliation(s)
- Wen-Chang Wang
- Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan, Taiwan
|
26
|
Statistical challenges for genome-wide association studies of suicidality using family data. Eur Psychiatry 2010; 25:307-9. [PMID: 20447807 DOI: 10.1016/j.eurpsy.2009.12.019] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 12/23/2009] [Accepted: 12/26/2009] [Indexed: 12/26/2022]
Abstract
The etiology of suicide is complex in nature with both environmental and genetic causes that are extremely diverse. This extensive heterogeneity weakens the relationship between genotype and phenotype and as a result, we face many challenges when studying the genetic etiology of suicide. We are now in the midst of a genetics revolution, where genotyping costs are decreasing and genotyping speed is increasing at a fast rate, allowing genetic association studies to genotype thousands to millions of SNPs that cover the entire human genome. As such, genome-wide association studies (GWAS) are now the norm. In this article we address several statistical challenges that occur when studying the genetic etiology of suicidality in the age of the genetics revolution. These challenges include: (1) the large number of statistical tests; (2) complex phenotypes that are difficult to quantify; and (3) modest genetic effect sizes. We address these statistical issues in the context of family-based study designs. Specifically, we discuss several statistical extensions of family-based association tests (FBATs) that work to alleviate these challenges. As our intention is to describe how statistical methodology may work to identify disease variants for suicidality, we avoid the mathematical details of the methodologies presented.
|
27
|
Lee HJ, Kim MJ, Park MR. A Review of Genetic Association Analyses in Population and Family Based Data: Methods and Software. Korean Journal of Applied Statistics 2010. [DOI: 10.5351/kjas.2010.23.1.095] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/11/2022]
|
28
|
Lasky-Su J, Murphy A, McQueen MB, Weiss S, Lange C. An omnibus test for family-based association studies with multiple SNPs and multiple phenotypes. Eur J Hum Genet 2010; 18:720-5. [PMID: 20087406 DOI: 10.1038/ejhg.2009.221] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
Abstract
We propose an omnibus family-based association test (MFBAT) that can be applied to multiple markers and multiple phenotypes and that has only one degree of freedom. The proposed test statistic extends current FBAT methodology to incorporate multiple markers as well as multiple phenotypes. Using simulation studies, power estimates for the proposed methodology are compared with the standard methodologies. On the basis of these simulations, we find that MFBAT substantially outperforms other methods, including haplotypic approaches and doing multiple tests with single single-nucleotide polymorphisms (SNPs) and single phenotypes. The practical relevance of the approach is illustrated by an application to asthma in which SNP/phenotype combinations are identified and reach overall significance that would not have been identified using other approaches. This methodology is directly applicable to cases in which there are multiple SNPs, such as candidate gene studies, cases in which there are multiple phenotypes, such as expression data, and cases in which there are multiple phenotypes and genotypes, such as genome-wide association studies that incorporate expression profiles as phenotypes. This program is available in the PBAT analysis package.
Affiliation(s)
- Jessica Lasky-Su
- Channing Laboratory, Brigham and Women's Hospital, Boston, MA, USA
|
29
|
Hoffmann TJ, Lange C, Vansteelandt S, Raby BA, DeMeo DL, Silverman EK, Weiss ST, Laird NM. Parsing the effects of individual SNPs in candidate genes with family data. Hum Hered 2009; 69:91-103. [PMID: 19996607 DOI: 10.1159/000264447] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/10/2009] [Accepted: 07/17/2009] [Indexed: 11/19/2022]
Abstract
We introduce a stepwise approach for family-based designs for selecting a set of markers in a gene that are independently associated with the disease. The approach is based on testing the effect of a set of markers conditional on another set of markers. Several likelihood-based approaches have been proposed for special cases, but no model-free based tests have been proposed. We propose two types of tests in a family-based framework that are applicable to arbitrary family structures and completely robust to population stratification. We propose methods for ascertained dichotomous traits and unascertained quantitative traits. We first propose a completely model-free extension of the FBAT main genetic effect test. Then, for power issues, we introduce two model-based tests, one for dichotomous traits and one for continuous traits. Lastly, we utilize these tests to analyze a continuous lung function phenotype as a proxy for asthma in the Childhood Asthma Management Program. The methods are implemented in the free R package fbati.
Affiliation(s)
- Thomas J Hoffmann
- Department of Biostatistics, Harvard School of Public Health, Boston, Mass. 02115, USA.
|
30
|
|
31
|
Sex-specific effect of IL9 polymorphisms on lung function and polysensitization. Genes Immun 2009; 10:559-65. [PMID: 19536153 DOI: 10.1038/gene.2009.46] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Indexed: 11/08/2022]
Abstract
Sex differences in asthma-associated phenotypes are well known but the genetic factors that may account for these differences have received little attention. This study aimed to characterize sex-specific and pleiotropic genetic factors underlying four quantitative phenotypes involved in the main asthma physiopathological pathways: immunoglobulin E levels, a measure of polysensitization (SPTQ), eosinophil counts and a measure of lung function, FEV₁/H² (forced expiratory volume in one second divided by height squared). Sex-stratified univariate and bivariate linkage analyses were conducted in 295 families from the Epidemiological study on the Genetics and Environment of Asthma study. We found genome-wide significant evidence for a male-specific pleiotropic QTL (quantitative trait locus) on 5q31 (P = 7 × 10⁻⁹) influencing both FEV₁/H² and SPTQ and for a female-specific pleiotropic QTL on 11q23 underlying SPTQ and immunoglobulin E (P = 2 × 10⁻⁵). Three other sex-specific regions of linkage were detected for eosinophil counts: 4q24 and 22q13 in females, and 3p25 in males. Further, bivariate association analysis of FEV₁/H² and SPTQ with 5q31 candidate genes in males showed a significant association with two single-nucleotide polymorphisms within the IL9 gene, rs2069885 and rs2069882 (P = 0.02 and P = 0.002, respectively, after Bonferroni's correction). This study underlines the importance of taking into account complex mechanisms, such as heterogeneity according to sex and pleiotropy, to unravel the genes involved in asthma phenotypes.
|
32
|
Chung RH, Schmidt S, Martin ER, Hauser ER. Ordered-subset analysis (OSA) for family-based association mapping of complex traits. Genet Epidemiol 2009; 32:627-37. [PMID: 18473393 DOI: 10.1002/gepi.20340] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/09/2022]
Abstract
Association analysis provides a powerful tool for complex disease gene mapping. However, in the presence of genetic heterogeneity, the power for association analysis can be low since only a fraction of the collected families may carry a specific disease susceptibility allele. Ordered-subset analysis (OSA) is a linkage test that can be powerful in the presence of genetic heterogeneity. OSA uses trait-related covariates to identify a subset of families that provide the most evidence for linkage. A similar strategy applied to genetic association analysis would likely result in increased power to detect association. Association in the presence of linkage (APL) is a family-based association test (FBAT) for nuclear families with multiple affected siblings that properly infers missing parental genotypes when linkage is present. We propose here APL-OSA, which applies the OSA method to the APL statistic to identify a subset of families that provide the most evidence for association. A permutation procedure is used to approximate the distribution of the APL-OSA statistic under the null hypothesis that there is no relationship between the family-specific covariate and the family-specific evidence for allelic association. We performed a comprehensive simulation study to verify that APL-OSA has the correct type I error rate under the null hypothesis. This simulation study also showed that APL-OSA can increase power relative to other commonly used association tests (APL, FBAT and FBAT with covariate adjustment) in the presence of genetic heterogeneity. Finally, we applied APL-OSA to a family study of age-related macular degeneration, where cigarette smoking was used as a covariate.
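The ordered-subset idea with a permutation null described in this abstract can be sketched as follows. This is a simplified illustration, not the APL-OSA implementation: the per-family scores, the sum-over-root-k subset statistic, and the function names are all assumptions made for the example.

```python
import random


def max_subset_stat(scores_in_order):
    """Scan nested subsets (first k families) and return the best statistic.

    The statistic sum(scores[:k]) / sqrt(k) is an assumed stand-in for a
    family-based association statistic computed on the first k families.
    """
    best, best_k, running = float("-inf"), 0, 0.0
    for k, score in enumerate(scores_in_order, start=1):
        running += score
        stat = running / k ** 0.5
        if stat > best:
            best, best_k = stat, k
    return best, best_k


def osa_permutation_test(covariates, scores, n_perm=2000, seed=0):
    """Order families by covariate, maximize over nested subsets, and
    approximate the null distribution by permuting the family ordering."""
    rng = random.Random(seed)
    order = sorted(range(len(scores)), key=lambda i: covariates[i])
    ordered = [scores[i] for i in order]
    observed, best_k = max_subset_stat(ordered)

    hits = 0
    shuffled = ordered[:]
    for _ in range(n_perm):
        # Under the null, the covariate ordering carries no information
        # about the family-specific evidence, so any ordering is as likely.
        rng.shuffle(shuffled)
        if max_subset_stat(shuffled)[0] >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (hits + 1) / (n_perm + 1)
    return observed, best_k, p_value
```

Because the maximization over subsets is repeated inside every permutation, the resulting p-value already accounts for having searched over all subset sizes.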
Affiliation(s)
- Ren-Hua Chung
- Center for Human Genetics, Duke University Medical Center, Durham, North Carolina 27710, USA
|
33
|
Laird NM, Lange C. Family-based methods for linkage and association analysis. Adv Genet 2008; 60:219-52. [PMID: 18358323 DOI: 10.1016/s0065-2660(07)00410-5] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.4] [Indexed: 11/30/2022]
Abstract
Traditional epidemiological study concepts such as case-control or cohort designs can be used in the design of genetic association studies, giving them a prominent role in genetic association analysis. A different class of designs based on related individuals, typically families, uses the concept of Mendelian transmission to achieve design-independent randomization, which permits the testing of linkage and association. Family-based designs require specialized analytic methods but they have distinct advantages: They are robust to confounding and variance inflation, which can arise in standard designs in the presence of population substructure; they test for both linkage and association; and they offer a natural solution to the multiple comparison problem. This chapter focuses on family-based designs. We describe some basic study designs as well as general approaches to analysis for qualitative, quantitative, and complex traits. Finally, we review available software.
Affiliation(s)
- Nan M Laird
- Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA
|
34
|
Rakovski CS, Weiss ST, Laird NM, Lange C. FBAT-SNP-PC: an approach for multiple markers and single trait in family-based association tests. Hum Hered 2008; 66:122-6. [PMID: 18382091 DOI: 10.1159/000119111] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 10/22/2022] Open
Abstract
OBJECTIVE To develop a new test for family-based association studies of continuous traits that incorporates power-enhancing techniques from two existing testing strategies. METHODS The new procedure begins by extracting the relevant information from the variability of the genotypes and assessing the approximate effects of the individual markers and their directions. This information is incorporated into the construction of the test statistic through the selection of a data-determined number of optimal linear combinations of the offspring genotypes which, in a power-enhancing step, are subsequently combined into a single-degree-of-freedom test. We conduct a comparative simulation study in which the performance of the new test is contrasted with the test that is currently known to offer the highest overall power, FBAT-LC. RESULTS The new test has an overall performance very similar to that of FBAT-LC but attains higher power in candidate genes with lower average pairwise correlations and moderate to high allele frequencies, with large gains (up to 80%) for some of the analyzed genes possessing these characteristics. CONCLUSION The new test is a promising tool for candidate gene studies, with substantial power gains for genes that are characterized by SNPs with low mean pairwise correlation.
Affiliation(s)
- Cyril S Rakovski
- Department of Biostatistics, Harvard School of Public Health, Harvard Medical School, Boston, MA 02115, USA.
|
35
|
Lee JH, Cheng R, Rogaeva E, Meng Y, Stern Y, Santana V, Lantigua R, Medrano M, Jimenez-Velazquez IZ, Farrer LA, St George-Hyslop P, Mayeux R. Further examination of the candidate genes in chromosome 12p13 locus for late-onset Alzheimer disease. Neurogenetics 2008; 9:127-38. [PMID: 18340469 DOI: 10.1007/s10048-008-0122-8] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Received: 09/18/2007] [Accepted: 02/07/2008] [Indexed: 12/29/2022]
Abstract
A broad region on chromosome 12p13 has been intensely investigated for novel genetic variants associated with Alzheimer disease (AD). We examined this region with 23 microsatellite markers using 124 North European (NE) families and 209 Caribbean Hispanic families with late-onset AD (FAD). Significant evidence for linkage was present in a 5-cM interval near 20 cM in both the NE FAD (LOD = 3.5) and the Caribbean Hispanic FAD (LOD = 2.2) datasets. We further investigated these families and an independent NE case-control dataset using 14 single nucleotide polymorphisms (SNPs). The initial screening of the region at approximately 20 cM in the NE case-control dataset revealed significant association between AD and seven SNPs in several genes, with the strongest result for rs2532500 in TAPBPL (p = 0.006). For rs3741916 in GAPDH, the C allele, rather than the G allele as was observed by Li et al. (Proc Natl Acad Sci U S A 101(44):15688-15693, 2004), was the risk allele. When the two family datasets were examined, none of the SNPs were significant in NE families, but two SNPs were associated with AD in Caribbean Hispanics: rs740850 in NCAPD2 (p = 0.0097) and rs1060620 in GAPDH (p = 0.042). In a separate analysis combining the Caribbean Hispanic families and NE cases and controls, rs740850 was significant after correcting for multiple testing (empirical p = 0.0048). Subsequent haplotype analyses revealed that two haplotype sets (haplotype C-A at SNPs 6-7 within NCAPD2 in Caribbean Hispanics, and haplotypes containing C-A-T at SNPs 8-10 within GAPDH in the Caribbean Hispanic family and NE case-control datasets) were associated with AD. Taken together, these SNPs may be in linkage disequilibrium with a pathogenic variant(s) on or near NCAPD2 and GAPDH.
Affiliation(s)
- Joseph H Lee
- Taub Institute for Research of Alzheimer's Disease and the Aging Brain, Columbia University, New York, NY 10032, USA
|
36
|
Al-Kateb H, Boright AP, Mirea L, Xie X, Sutradhar R, Mowjoodi A, Bharaj B, Liu M, Bucksa JM, Arends VL, Steffes MW, Cleary PA, Sun W, Lachin JM, Thorner PS, Ho M, McKnight AJ, Maxwell AP, Savage DA, Kidd KK, Kidd JR, Speed WC, Orchard TJ, Miller RG, Sun L, Bull SB, Paterson AD. Multiple superoxide dismutase 1/splicing factor serine alanine 15 variants are associated with the development and progression of diabetic nephropathy: the Diabetes Control and Complications Trial/Epidemiology of Diabetes Interventions and Complications Genetics study. Diabetes 2008; 57:218-28. [PMID: 17914031 PMCID: PMC2655325 DOI: 10.2337/db07-1059] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.5] [Indexed: 11/13/2022]
Abstract
BACKGROUND Despite familial clustering of nephropathy and retinopathy severity in type 1 diabetes, few gene variants have been consistently associated with these outcomes. RESEARCH DESIGN AND METHODS We performed an individual-based genetic association study with time to renal and retinal outcomes in 1,362 white probands with type 1 diabetes from the Diabetes Control and Complications Trial/Epidemiology of Diabetes Interventions and Complications (DCCT/EDIC) study. Specifically, we genotyped 1,411 SNPs that capture common variations in 212 candidate genes for long-term complications and analyzed them for association with the time from DCCT baseline to event for renal and retinal outcomes using multivariate Cox proportional hazards models. To address multiple testing and assist interpretation of the results, false discovery rate q values were calculated separately for each outcome. RESULTS We observed association between rs17880135 in the 3' region of superoxide dismutase 1 (SOD1) and the incidence of both severe nephropathy (hazard ratio [HR] 2.62 [95% CI 1.64-4.18], P = 5.6 × 10⁻⁵, q = 0.06) and persistent microalbuminuria (1.82 [1.29-2.57], P = 6.4 × 10⁻⁴, q = 0.46). Sequencing and fine-mapping identified additional SOD1 variants, including rs202446, rs9974610, and rs204732, which were also associated (P < 10⁻³) with persistent microalbuminuria, whereas rs17880135 and rs17881180 were similarly associated with the development of severe nephropathy. Attempts to replicate the findings in three cross-sectional case-control studies produced equivocal results. We observed no striking differences between risk genotypes in serum SOD activity, serum SOD1 mass, or SOD1 mRNA expression in lymphoblastoid cell lines. CONCLUSIONS Multiple variations in SOD1 are significantly associated with persistent microalbuminuria and severe nephropathy in the DCCT/EDIC study.
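One common way to obtain false discovery rate q values from a set of per-SNP P values is the Benjamini-Hochberg step-up adjustment, sketched below. The abstract does not say which q-value procedure the authors used, so the choice of Benjamini-Hochberg, the function name, and the example P values are assumptions for illustration only.

```python
def bh_q_values(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Assumed stand-in for the q-value calculation mentioned in the abstract.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so that each q value is the
    # minimum of p * m / rank over all ranks at or above its own.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q


# Illustrative (made-up) P values for one outcome.
example_q = bh_q_values([0.01, 0.04, 0.03, 0.005])
```

Running the adjustment once per outcome, on that outcome's P values only, mirrors the "calculated separately for each outcome" step in the abstract.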
Affiliation(s)
- Hussam Al-Kateb
- Program in Genetics and Genome Biology, The Hospital for Sick Children, TMDT Building East Tower, Rm. 15-707, 101 College St., Toronto, Ontario, Canada
|
37
|
Al-Kateb H, Mirea L, Xie X, Sun L, Liu M, Chen H, Bull SB, Boright AP, Paterson AD. Multiple variants in vascular endothelial growth factor (VEGFA) are risk factors for time to severe retinopathy in type 1 diabetes: the DCCT/EDIC genetics study. Diabetes 2007; 56:2161-8. [PMID: 17513698 DOI: 10.2337/db07-0376] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.1] [Indexed: 11/13/2022]
Abstract
OBJECTIVE We sought to determine if any common variants in the gene for vascular endothelial growth factor (VEGFA) are associated with long-term renal and retinal complications in type 1 diabetes. RESEARCH DESIGN AND METHODS A total of 1,369 Caucasian subjects with type 1 diabetes from the Diabetes Control and Complications Trial (DCCT)/Epidemiology of Diabetes Interventions and Complications (EDIC) Study had an average of 17 retinal photographs and 10 renal measures over 15 years. In the DCCT/EDIC, we studied 18 single nucleotide polymorphisms (SNPs) in VEGFA that represent all linkage disequilibrium bins (pairwise r² ≥ 0.64) and tested them for association with time to development of severe retinopathy, three or more step progression of retinopathy, clinically significant macular edema, persistent microalbuminuria, and severe nephropathy. RESULTS In a global multi-SNP test, there was a highly significant association of VEGFA SNPs with severe retinopathy (P = 6.8 × 10⁻⁵); the four other outcomes were all nonsignificant. In survival analyses controlling for covariate risk factors, eight SNPs showed significant association with severe retinopathy (P < 0.05). The most significant single SNP association was rs3025021 (hazard ratio 1.37 [95% CI 1.13-1.66], P = 0.0017). Family-based analyses of severe retinopathy provide evidence of excess transmission of C at rs699947 (P = 0.029), T at rs3025021 (P = 0.013), and the C-T haplotype from both SNPs (P = 0.035). Multi-SNP regression analysis including 15 SNPs, and allowing for pairwise interactions, independently selected 6 significant SNPs (P < 0.05). CONCLUSIONS These data demonstrate that multiple VEGFA variants are associated with the development of severe retinopathy in type 1 diabetes.
Affiliation(s)
- Hussam Al-Kateb
- Program in Genetics and Genome Biology, Hospital of Sick Children Research Institute, Toronto, ON, Canada
|