1451
Li A, Meyre D. Jumping on the Train of Personalized Medicine: A Primer for Non-Geneticist Clinicians: Part 2. Fundamental Concepts in Genetic Epidemiology. Curr Psychiatry Rev 2014; 10:101-117. [PMID: 25598767 PMCID: PMC4287874 DOI: 10.2174/1573400510666140319235334] [Received: 08/20/2013] [Revised: 02/07/2014] [Accepted: 04/18/2014] [Indexed: 12/12/2022]
Abstract
With the decrease in sequencing costs, personalized genome sequencing will eventually become common in medical practice. We therefore write this series of three reviews to help non-geneticist clinicians to jump into the fast-moving field of personalized medicine. In the first article of this series, we reviewed the fundamental concepts in molecular genetics. In this second article, we cover the key concepts and methods in genetic epidemiology including the classification of genetic disorders, study designs and their implementation, genetic marker selection, genotyping and sequencing technologies, gene identification strategies, data analyses and data interpretation. This review will help the reader critically appraise a genetic association study. In the next article, we will discuss the clinical applications of genetic epidemiology in the personalized medicine area.
Affiliation(s)
- Aihua Li
- Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, ON L8N 3Z5, Canada
- David Meyre
- Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, ON L8N 3Z5, Canada
1452
Lippert C, Xiang J, Horta D, Widmer C, Kadie C, Heckerman D, Listgarten J. Greater power and computational efficiency for kernel-based association testing of sets of genetic variants. Bioinformatics 2014; 30:3206-14. [PMID: 25075117 PMCID: PMC4221116 DOI: 10.1093/bioinformatics/btu504] [Indexed: 11/12/2022]
Abstract
Motivation: Set-based variance component tests have been identified as a way to increase power in association studies by aggregating weak individual effects. However, the choice of test statistic has been largely ignored even though it may play an important role in obtaining optimal power. We compared a standard statistical test, a score test, with a recently developed likelihood ratio (LR) test. Further, when correction for hidden structure is needed, or gene–gene interactions are sought, state-of-the-art algorithms for both the score and LR tests can be computationally impractical. We therefore develop new computationally efficient methods. Results: After reviewing theoretical differences in performance between the score and LR tests, we find empirically on real data that the LR test generally has more power. In particular, on 15 of 17 real datasets, the LR test yielded at least as many associations as the score test (up to 23 more associations), whereas the score test yielded at most one more association than the LR test in the two remaining datasets. On synthetic data, we find that the LR test yielded up to 12% more associations, consistent with our results on real data, but we also observe a regime of extremely small signal where the score test yielded up to 25% more associations than the LR test, consistent with theory. Finally, our computational speedups now enable (i) efficient LR testing when the background kernel is full rank, and (ii) efficient score testing when the background kernel changes with each test, as for gene–gene interaction tests. The latter yielded a factor of 2000 speedup on a cohort of size 13 500. Availability: Software available at http://research.microsoft.com/en-us/um/redmond/projects/MSCompBio/Fastlmm/. Contact: heckerma@microsoft.com. Supplementary information: Supplementary data are available at Bioinformatics online.
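As a reading aid for the abstract above, the two test statistics can be written down for a generic variance component model. This is a textbook-style sketch and an assumption on our part: the symbols (X for covariates, K for the kernel built from the variant set, the variance parameters) are introduced here for illustration, and the paper's actual model also includes a background kernel that is omitted below.

```latex
\begin{aligned}
y &= X\beta + g + \varepsilon, \qquad
  g \sim \mathcal{N}(0, \sigma_g^2 K), \quad
  \varepsilon \sim \mathcal{N}(0, \sigma_e^2 I), \\
H_0 &: \sigma_g^2 = 0 \quad \text{versus} \quad H_1 : \sigma_g^2 > 0, \\
\Lambda_{\mathrm{LR}} &= 2\,\bigl[\ell(\hat\beta_1, \hat\sigma_g^2, \hat\sigma_{e,1}^2)
  - \ell(\hat\beta_0, 0, \hat\sigma_{e,0}^2)\bigr], \\
Q_{\mathrm{score}} &= (y - X\hat\beta_0)^{\top} K \,(y - X\hat\beta_0)
  \quad \text{(up to a scaling constant)}.
\end{aligned}
```

The LR statistic requires fitting the model under both hypotheses, whereas the score statistic needs only the null fit, which is one reason the two can differ in both cost and power.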
Affiliation(s)
- Christoph Lippert
- eScience Research Group, Microsoft Research, Los Angeles, CA, 90024 and eScience Research Group, Microsoft Research, Redmond, WA, 98052, USA
- Jing Xiang
- eScience Research Group, Microsoft Research, Los Angeles, CA, 90024 and eScience Research Group, Microsoft Research, Redmond, WA, 98052, USA
- Danilo Horta
- eScience Research Group, Microsoft Research, Los Angeles, CA, 90024 and eScience Research Group, Microsoft Research, Redmond, WA, 98052, USA
- Christian Widmer
- eScience Research Group, Microsoft Research, Los Angeles, CA, 90024 and eScience Research Group, Microsoft Research, Redmond, WA, 98052, USA
- Carl Kadie
- eScience Research Group, Microsoft Research, Los Angeles, CA, 90024 and eScience Research Group, Microsoft Research, Redmond, WA, 98052, USA
- David Heckerman
- eScience Research Group, Microsoft Research, Los Angeles, CA, 90024 and eScience Research Group, Microsoft Research, Redmond, WA, 98052, USA
- Jennifer Listgarten
- eScience Research Group, Microsoft Research, Los Angeles, CA, 90024 and eScience Research Group, Microsoft Research, Redmond, WA, 98052, USA
1453
Choi S, Lee S, Cichon S, Nöthen MM, Lange C, Park T, Won S. FARVAT: a family-based rare variant association test. Bioinformatics 2014; 30:3197-205. [PMID: 25075118 DOI: 10.1093/bioinformatics/btu496] [Indexed: 01/09/2023]
Abstract
MOTIVATION: Individuals in each family are genetically more homogeneous than unrelated individuals, and family-based designs are often recommended for the analysis of rare variants. However, despite the importance of analyzing family-based samples, few statistical methods for rare variant association analysis are available. RESULTS: In this report, we propose a FAmily-based Rare Variant Association Test (FARVAT). FARVAT is based on the quasi-likelihood of whole families, and is statistically and computationally efficient for extended families. FARVAT assumes that families were ascertained by the disease status of family members, and incorporating the estimated genetic relationship matrix into the proposed method provides robustness in the presence of population substructure. Depending on the choice of working matrix, our method can be a burden test or a variance component test, and can be extended to a SKAT-O-type statistic. FARVAT was implemented in C++, and application of the proposed method to schizophrenia data and simulated data for GAW17 illustrated its practical importance. AVAILABILITY: The software calculates various statistics for the analysis of related samples, and it is freely downloadable from http://healthstats.snu.ac.kr/software/farvat. CONTACT: won1@snu.ac.kr or tspark@stats.snu.ac.kr. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Sungkyoung Choi
- Interdisciplinary Program in bioinformatics, Seoul National University, 1 Kwanak-ro Kwanak-gu, Seoul 151-742, Korea, Institute of Human Genetics, University of Bonn, D-53127 Bonn, Germany, Department of Biostatistics, Harvard School of Public Health, 677 Huntington Avenue. Boston, MA 02115, USA, Harvard Medical School, 25 Shattuck St, Boston, MA 02115, USA, Center for Genomic Medicine, Brigham and Women's Hospital, 75 Francis Street, Boston MA 02115, USA, Department of Biostatistics, Harvard School of Public Health, 667 Huntington Ave, Boston, MA 02115, USA, Institute for Genomic Mathematics, University of Bonn, D-53127 Bonn, Germany, German Center for Neurodegenerative Diseases, D-53127 Bonn, Germany, Department of Statistics, Seoul National University 1 Kwanak-ro Kwanak-gu, Seoul 151-742, Korea and Department of Public Health Science, Seoul National University, 1 Kwanak-ro Kwanak-gu, Seoul 151-742, Korea
- Sungyoung Lee
- Interdisciplinary Program in bioinformatics, Seoul National University, 1 Kwanak-ro Kwanak-gu, Seoul 151-742, Korea, Institute of Human Genetics, University of Bonn, D-53127 Bonn, Germany, Department of Biostatistics, Harvard School of Public Health, 677 Huntington Avenue. Boston, MA 02115, USA, Harvard Medical School, 25 Shattuck St, Boston, MA 02115, USA, Center for Genomic Medicine, Brigham and Women's Hospital, 75 Francis Street, Boston MA 02115, USA, Department of Biostatistics, Harvard School of Public Health, 667 Huntington Ave, Boston, MA 02115, USA, Institute for Genomic Mathematics, University of Bonn, D-53127 Bonn, Germany, German Center for Neurodegenerative Diseases, D-53127 Bonn, Germany, Department of Statistics, Seoul National University 1 Kwanak-ro Kwanak-gu, Seoul 151-742, Korea and Department of Public Health Science, Seoul National University, 1 Kwanak-ro Kwanak-gu, Seoul 151-742, Korea
- Sven Cichon
- Interdisciplinary Program in bioinformatics, Seoul National University, 1 Kwanak-ro Kwanak-gu, Seoul 151-742, Korea, Institute of Human Genetics, University of Bonn, D-53127 Bonn, Germany, Department of Biostatistics, Harvard School of Public Health, 677 Huntington Avenue. Boston, MA 02115, USA, Harvard Medical School, 25 Shattuck St, Boston, MA 02115, USA, Center for Genomic Medicine, Brigham and Women's Hospital, 75 Francis Street, Boston MA 02115, USA, Department of Biostatistics, Harvard School of Public Health, 667 Huntington Ave, Boston, MA 02115, USA, Institute for Genomic Mathematics, University of Bonn, D-53127 Bonn, Germany, German Center for Neurodegenerative Diseases, D-53127 Bonn, Germany, Department of Statistics, Seoul National University 1 Kwanak-ro Kwanak-gu, Seoul 151-742, Korea and Department of Public Health Science, Seoul National University, 1 Kwanak-ro Kwanak-gu, Seoul 151-742, Korea
- Markus M Nöthen
- Interdisciplinary Program in bioinformatics, Seoul National University, 1 Kwanak-ro Kwanak-gu, Seoul 151-742, Korea, Institute of Human Genetics, University of Bonn, D-53127 Bonn, Germany, Department of Biostatistics, Harvard School of Public Health, 677 Huntington Avenue. Boston, MA 02115, USA, Harvard Medical School, 25 Shattuck St, Boston, MA 02115, USA, Center for Genomic Medicine, Brigham and Women's Hospital, 75 Francis Street, Boston MA 02115, USA, Department of Biostatistics, Harvard School of Public Health, 667 Huntington Ave, Boston, MA 02115, USA, Institute for Genomic Mathematics, University of Bonn, D-53127 Bonn, Germany, German Center for Neurodegenerative Diseases, D-53127 Bonn, Germany, Department of Statistics, Seoul National University 1 Kwanak-ro Kwanak-gu, Seoul 151-742, Korea and Department of Public Health Science, Seoul National University, 1 Kwanak-ro Kwanak-gu, Seoul 151-742, Korea
- Christoph Lange
- Interdisciplinary Program in bioinformatics, Seoul National University, 1 Kwanak-ro Kwanak-gu, Seoul 151-742, Korea, Institute of Human Genetics, University of Bonn, D-53127 Bonn, Germany, Department of Biostatistics, Harvard School of Public Health, 677 Huntington Avenue. Boston, MA 02115, USA, Harvard Medical School, 25 Shattuck St, Boston, MA 02115, USA, Center for Genomic Medicine, Brigham and Women's Hospital, 75 Francis Street, Boston MA 02115, USA, Department of Biostatistics, Harvard School of Public Health, 667 Huntington Ave, Boston, MA 02115, USA, Institute for Genomic Mathematics, University of Bonn, D-53127 Bonn, Germany, German Center for Neurodegenerative Diseases, D-53127 Bonn, Germany, Department of Statistics, Seoul National University 1 Kwanak-ro Kwanak-gu, Seoul 151-742, Korea and Department of Public Health Science, Seoul National University, 1 Kwanak-ro Kwanak-gu, Seoul 151-742, Korea
- Taesung Park
- Interdisciplinary Program in bioinformatics, Seoul National University, 1 Kwanak-ro Kwanak-gu, Seoul 151-742, Korea, Institute of Human Genetics, University of Bonn, D-53127 Bonn, Germany, Department of Biostatistics, Harvard School of Public Health, 677 Huntington Avenue. Boston, MA 02115, USA, Harvard Medical School, 25 Shattuck St, Boston, MA 02115, USA, Center for Genomic Medicine, Brigham and Women's Hospital, 75 Francis Street, Boston MA 02115, USA, Department of Biostatistics, Harvard School of Public Health, 667 Huntington Ave, Boston, MA 02115, USA, Institute for Genomic Mathematics, University of Bonn, D-53127 Bonn, Germany, German Center for Neurodegenerative Diseases, D-53127 Bonn, Germany, Department of Statistics, Seoul National University 1 Kwanak-ro Kwanak-gu, Seoul 151-742, Korea and Department of Public Health Science, Seoul National University, 1 Kwanak-ro Kwanak-gu, Seoul 151-742, Korea
- Sungho Won
- Interdisciplinary Program in bioinformatics, Seoul National University, 1 Kwanak-ro Kwanak-gu, Seoul 151-742, Korea, Institute of Human Genetics, University of Bonn, D-53127 Bonn, Germany, Department of Biostatistics, Harvard School of Public Health, 677 Huntington Avenue. Boston, MA 02115, USA, Harvard Medical School, 25 Shattuck St, Boston, MA 02115, USA, Center for Genomic Medicine, Brigham and Women's Hospital, 75 Francis Street, Boston MA 02115, USA, Department of Biostatistics, Harvard School of Public Health, 667 Huntington Ave, Boston, MA 02115, USA, Institute for Genomic Mathematics, University of Bonn, D-53127 Bonn, Germany, German Center for Neurodegenerative Diseases, D-53127 Bonn, Germany, Department of Statistics, Seoul National University 1 Kwanak-ro Kwanak-gu, Seoul 151-742, Korea and Department of Public Health Science, Seoul National University, 1 Kwanak-ro Kwanak-gu, Seoul 151-742, Korea
1454
Guo W, Shugart YY. The power comparison of the haplotype-based collapsing tests and the variant-based collapsing tests for detecting rare variants in pedigrees. BMC Genomics 2014; 15:632. [PMID: 25070353 PMCID: PMC4131059 DOI: 10.1186/1471-2164-15-632] [Received: 10/18/2013] [Accepted: 07/18/2014] [Indexed: 11/20/2022]
Abstract
Background: Both common and rare genetic variants have been shown to contribute to the etiology of complex diseases. Recent genome-wide association studies (GWAS) have successfully investigated how common variants contribute to the genetic factors associated with common human diseases. However, understanding the impact of rare variants, which are abundant in the human population (one in every 17 bases), remains challenging. A number of statistical tests have been developed to analyze collapsed rare variants identified by association tests. Here, we propose a haplotype-based approach. This work is inspired by an existing statistical framework, the pedigree disequilibrium test (PDT), which uses genetic data to assess the effects of variants in general pedigrees. We aim to compare the performance of the haplotype-based approach and the rare variant-based approach for detecting rare causal variants in pedigrees. Results: Extensive simulations in the sequencing setting were carried out to evaluate and compare the haplotype-based approach with the rare variant methods that drew on a more conventional collapsing strategy. As assessed through a variety of scenarios, the haplotype-based pedigree tests had enhanced statistical power compared with the rare variant-based pedigree tests when the disease of interest was mainly caused by rare haplotypes (with multiple rare alleles), and vice versa when disease was caused by rare variants acting independently. In most other situations, when disease was caused both by haplotypes with multiple rare alleles and by rare variants with similar effects, the two approaches provided similar power in testing for association. Conclusions: The haplotype-based approach was designed to assess the role of rare and potentially causal haplotypes. The proposed rare variant-based pedigree tests were designed to assess the role of rare and potentially causal variants. This study clearly documented the situations under which either method performs better than the other. All tests have been implemented in software named rvHPDT, which was submitted to the Comprehensive R Archive Network (CRAN) for general use.
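To make the "conventional collapsing strategy" mentioned in the abstract concrete, here is a minimal sketch of variant-based collapsing for unrelated individuals. It is an illustration only, not the pedigree-based tests of the paper or the rvHPDT package: the function name `collapse_rare_variants`, the frequency threshold, and the toy genotypes are all hypothetical, and genotypes are assumed to be coded as minor allele counts (0, 1, or 2).

```python
def collapse_rare_variants(genotypes, maf_threshold=0.05):
    """Collapse rare variants into one carrier indicator per individual.

    genotypes: list of rows (individuals), each a list of minor allele
    counts (0, 1, or 2), one entry per variant.
    Returns (indices of variants treated as rare, carrier indicators).
    """
    n_individuals = len(genotypes)
    n_variants = len(genotypes[0])
    rare = []
    for j in range(n_variants):
        allele_count = sum(row[j] for row in genotypes)
        frequency = allele_count / (2 * n_individuals)
        # Keep variants that are observed but below the threshold.
        if 0 < frequency < maf_threshold:
            rare.append(j)
    # An individual is a carrier if any rare variant has a minor allele.
    indicators = [int(any(row[j] > 0 for j in rare)) for row in genotypes]
    return rare, indicators


# Toy data (hypothetical): 5 individuals, 3 variants.
toy_genotypes = [
    [0, 1, 0],
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 0],
    [0, 0, 1],
]
rare_variants, carriers = collapse_rare_variants(toy_genotypes, maf_threshold=0.2)
print(rare_variants, carriers)
```

The carrier indicator can then be tested for association like a single marker; a haplotype-based variant of this idea would collapse over rare haplotypes instead of individual rare alleles.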
Affiliation(s)
- Yin Yao Shugart
- Division of Intramural Division Program, National Institute of Mental Health, National Institute of Health, 35 Convent Drive, Bethesda, MD 20892, USA.
1455
Sha Q, Zhang S. A rare variant association test based on combinations of single-variant tests. Genet Epidemiol 2014; 38:494-501. [PMID: 25065727 DOI: 10.1002/gepi.21834] [Received: 02/17/2014] [Revised: 04/17/2014] [Accepted: 05/19/2014] [Indexed: 01/22/2023]
Abstract
Next-generation sequencing technologies make direct testing of rare variant associations possible. However, the development of powerful statistical methods for rare variant association studies is still underway. Most existing methods are burden tests or quadratic tests. Recent studies show that the performance of both burden and quadratic tests depends strongly on the underlying assumptions, and no test demonstrates consistently acceptable power. Thus, combined tests that pool information from the burden and quadratic tests have been proposed recently. However, results from recent studies (including this study) show that there exist tests that can outperform both burden and quadratic tests. In this article, we propose three classes of tests that include tests outperforming both burden and quadratic tests. Then, we propose the optimal combination of single-variant tests (OCST), which combines information from tests of the three classes. We use extensive simulation studies to compare the performance of OCST with that of burden, quadratic and optimal single-variant tests. Our results show that OCST either is the most powerful test or has power similar to that of the most powerful test. We also compare the performance of OCST with that of the two existing combined tests. Our results show that OCST has better power than the two combined tests.
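The distinction between burden and quadratic tests in the abstract above can be illustrated with a small sketch. It is not the OCST method: it uses unweighted, unnormalized statistics built from per-variant scores, and the function names and toy data are hypothetical. Obtaining p-values would additionally require the null distribution of each statistic, which is omitted here.

```python
def score_statistics(genotypes, phenotype):
    """Return one score S_j = sum_i g_ij * (y_i - mean(y)) per variant."""
    mean_y = sum(phenotype) / len(phenotype)
    residuals = [y - mean_y for y in phenotype]
    n_variants = len(genotypes[0])
    return [
        sum(row[j] * r for row, r in zip(genotypes, residuals))
        for j in range(n_variants)
    ]


def burden_statistic(scores):
    """Burden-type statistic: square of the summed scores (unit weights)."""
    return sum(scores) ** 2


def quadratic_statistic(scores):
    """Quadratic-type statistic: sum of squared scores (unit weights)."""
    return sum(s ** 2 for s in scores)


# Toy data (hypothetical): rows are individuals, columns are two rare variants.
genotypes = [[1, 0], [1, 0], [0, 1], [0, 1], [0, 0], [0, 0]]
# Carriers of variant 1 have high trait values, carriers of variant 2 low ones.
phenotype = [3.0, 3.0, -3.0, -3.0, 0.0, 0.0]

scores = score_statistics(genotypes, phenotype)
print(scores)
print(burden_statistic(scores))
print(quadratic_statistic(scores))
```

In this toy example the two variants act in opposite directions, so their scores cancel in the burden statistic while the quadratic statistic stays large. When all variants act in the same direction the burden statistic tends to do better, which is the assumption dependence the abstract refers to.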
Affiliation(s)
- Qiuying Sha
- Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan, United States of America
1456
Chen H, Meigs JB, Dupuis J. Incorporating gene-environment interaction in testing for association with rare genetic variants. Hum Hered 2014; 78:81-90. [PMID: 25060534 PMCID: PMC4169076 DOI: 10.1159/000363347] [Received: 02/04/2014] [Accepted: 05/03/2014] [Indexed: 11/19/2022]
Abstract
OBJECTIVES: The incorporation of gene-environment interactions could improve the ability to detect genetic associations with complex traits. For common genetic variants, single-marker interaction tests and joint tests of genetic main effects and gene-environment interaction are well established and have been used to identify novel association loci for complex diseases and continuous traits. For rare genetic variants, however, single-marker tests are severely underpowered due to the low minor allele frequency, and only a few gene-environment interaction tests have been developed. We aimed to develop powerful and computationally efficient tests for gene-environment interaction with rare variants. METHODS: In this paper, we propose interaction and joint tests for testing gene-environment interaction of rare genetic variants. Our approach is a generalization of existing gene-environment interaction tests for multiple genetic variants under certain conditions. RESULTS: We show in our simulation studies that our interaction and joint tests have correct type I errors, and that the joint test is a powerful approach for testing genetic association, allowing for gene-environment interaction. We also illustrate our approach in a real data example from the Framingham Heart Study. CONCLUSION: Our approach can be applied to both binary and continuous traits, and it is powerful and computationally efficient.
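For readers unfamiliar with the two hypotheses named in the abstract, the standard single-variant formulation is sketched below. This is an assumed generic model, not the rare-variant extension proposed in the paper: the link function g, genotype G, exposure E, and the coefficient names are introduced here for illustration.

```latex
\begin{aligned}
g\bigl(\mathbb{E}[Y_i]\bigr) &= \alpha + \beta_G G_i + \beta_E E_i + \beta_{GE}\, G_i E_i, \\
\text{interaction test:}\quad H_0 &: \beta_{GE} = 0, \\
\text{joint test:}\quad H_0 &: \beta_G = \beta_{GE} = 0.
\end{aligned}
```

The interaction test asks only whether the genetic effect differs across exposure levels, while the joint test asks whether the variant is associated with the trait at all, through either its main effect or the interaction.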
Affiliation(s)
- Han Chen
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Department of Biostatistics, Harvard School of Public Health, Boston, MA, USA
- James B Meigs
- General Medicine Division, Massachusetts General Hospital, Boston, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
- Josée Dupuis
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- The National Heart, Lung and Blood Institute’s Framingham Heart Study, Framingham, MA, USA
1457
Marttinen P, Pirinen M, Sarin AP, Gillberg J, Kettunen J, Surakka I, Kangas AJ, Soininen P, O'Reilly P, Kaakinen M, Kähönen M, Lehtimäki T, Ala-Korpela M, Raitakari OT, Salomaa V, Järvelin MR, Ripatti S, Kaski S. Assessing multivariate gene-metabolome associations with rare variants using Bayesian reduced rank regression. Bioinformatics 2014; 30:2026-34. [PMID: 24665129 PMCID: PMC4080737 DOI: 10.1093/bioinformatics/btu140] [Received: 11/07/2013] [Revised: 02/27/2014] [Accepted: 03/04/2014] [Indexed: 01/31/2023]
Abstract
MOTIVATION: A typical genome-wide association study searches for associations between single nucleotide polymorphisms (SNPs) and a univariate phenotype. However, there is growing interest in investigating associations between genomics data and multivariate phenotypes, for example, in gene expression or metabolomics studies. A common approach is to perform a univariate test between each genotype-phenotype pair, and then to apply a stringent significance cutoff to account for the large number of tests performed. However, this approach has limited ability to uncover dependencies involving multiple variables. Another trend in current genetics is the investigation of the impact of rare variants on the phenotype, where the standard methods often fail owing to lack of power when the minor allele is present in only a limited number of individuals. RESULTS: We propose a new statistical approach based on Bayesian reduced rank regression to assess the impact of multiple SNPs on a high-dimensional phenotype. Because of the method's ability to combine information over multiple SNPs and phenotypes, it is particularly suitable for detecting associations involving rare variants. We demonstrate the potential of our method and compare it with alternatives using the Northern Finland Birth Cohort with 4702 individuals, for whom genome-wide SNP data along with lipoprotein profiles comprising 74 traits are available. We discovered two genes (XRCC4 and MTHFD2L) without previously reported associations, which replicated in a combined analysis of two additional cohorts: 2390 individuals from the Cardiovascular Risk in Young Finns study and 3659 individuals from the FINRISK study. AVAILABILITY AND IMPLEMENTATION: R-code freely available for download at http://users.ics.aalto.fi/pemartti/gene_metabolome/.
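The "stringent significance cutoff" for pairwise testing mentioned in the abstract can be sketched with a Bonferroni correction over all SNP-trait pairs. This illustrates the baseline approach the abstract contrasts with, not the Bayesian reduced rank regression method itself. The trait count of 74 comes from the abstract; the SNP count, the p-values, and the function names are hypothetical.

```python
def bonferroni_threshold(alpha, n_snps, n_traits):
    """Per-test significance cutoff when every SNP-trait pair is tested."""
    n_tests = n_snps * n_traits
    return alpha / n_tests


def significant_pairs(p_values, alpha, n_snps, n_traits):
    """Return the (snp, trait) pairs whose p-value passes the cutoff."""
    threshold = bonferroni_threshold(alpha, n_snps, n_traits)
    return sorted(pair for pair, p in p_values.items() if p < threshold)


# Hypothetical example: 1000 SNPs tested against 74 traits.
example_p_values = {
    ("snp1", "trait1"): 1e-9,   # passes the corrected cutoff
    ("snp2", "trait5"): 1e-4,   # nominally small, but fails after correction
}
cutoff = bonferroni_threshold(0.05, n_snps=1000, n_traits=74)
hits = significant_pairs(example_p_values, 0.05, n_snps=1000, n_traits=74)
print(cutoff, hits)
```

Because the cutoff shrinks with the product of the number of SNPs and traits, weak effects spread across several variables are easily missed, which is the limitation that motivates a multivariate approach.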
Affiliation(s)
- Pekka Marttinen
- Department of Information and Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Esbo, Finland, Center for Communicable Disease Dynamics, Harvard School of Public Health, Boston, MA, USA Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Unit of Public Health Genomics, National Institute for Health and Welfare, Helsinki, Computational Medicine, Institute of Health Sciences, University of Oulu and Oulu University Hospital, Oulu, NMR Metabolomics Laboratory, School of Pharmacy, University of Eastern Finland, Kuopio, Finland, Department of Epidemiology and Biostatistics, MRC Health Protection, Agency (HPA) Centre for Environment and Health, School of Public Health, Imperial College, London, UK, Institute of Health Sciences, Biocenter Oulu, University of Oulu, Oulu, Department of Clinical Physiology, Tampere University Hospital and University of Tampere, Department of Clinical Chemistry, Fimlab Laboratories, University of Tampere School of Medicine, Tampere, Finland, Computational Medicine, School of Social and Community Medicine and the Medical Research Council Integrative Epidemiology Unit, University of Bristol, Bristol, UK, Department of Clinical Physiology and Nuclear Medicine, Research Centre of Applied and Preventive Cardiovascular Medicine, University of Turku and Turku University Hospital, Turku, Department of Chronic Disease Prevention, National Institute for Health and Welfare, Helsinki, Unit of Primary Care, Oulu University Hospital, Department of Children and Young People and Families, National Institute for Health and Welfare, Oulu, Finland, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK, Hjelt Institute and Department of Computer Science, Helsinki Institute for Information Technology HIIT, University of Helsinki, Helsinki, Finland
- Matti Pirinen
- Department of Information and Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Esbo, Finland, Center for Communicable Disease Dynamics, Harvard School of Public Health, Boston, MA, USA Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Unit of Public Health Genomics, National Institute for Health and Welfare, Helsinki, Computational Medicine, Institute of Health Sciences, University of Oulu and Oulu University Hospital, Oulu, NMR Metabolomics Laboratory, School of Pharmacy, University of Eastern Finland, Kuopio, Finland, Department of Epidemiology and Biostatistics, MRC Health Protection, Agency (HPA) Centre for Environment and Health, School of Public Health, Imperial College, London, UK, Institute of Health Sciences, Biocenter Oulu, University of Oulu, Oulu, Department of Clinical Physiology, Tampere University Hospital and University of Tampere, Department of Clinical Chemistry, Fimlab Laboratories, University of Tampere School of Medicine, Tampere, Finland, Computational Medicine, School of Social and Community Medicine and the Medical Research Council Integrative Epidemiology Unit, University of Bristol, Bristol, UK, Department of Clinical Physiology and Nuclear Medicine, Research Centre of Applied and Preventive Cardiovascular Medicine, University of Turku and Turku University Hospital, Turku, Department of Chronic Disease Prevention, National Institute for Health and Welfare, Helsinki, Unit of Primary Care, Oulu University Hospital, Department of Children and Young People and Families, National Institute for Health and Welfare, Oulu, Finland, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK, Hjelt Institute and Department of Computer Science, Helsinki Institute for Information Technology HIIT, University of Helsinki, Helsinki, Finland
- Antti-Pekka Sarin
- Department of Information and Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Esbo, Finland, Center for Communicable Disease Dynamics, Harvard School of Public Health, Boston, MA, USA Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Unit of Public Health Genomics, National Institute for Health and Welfare, Helsinki, Computational Medicine, Institute of Health Sciences, University of Oulu and Oulu University Hospital, Oulu, NMR Metabolomics Laboratory, School of Pharmacy, University of Eastern Finland, Kuopio, Finland, Department of Epidemiology and Biostatistics, MRC Health Protection, Agency (HPA) Centre for Environment and Health, School of Public Health, Imperial College, London, UK, Institute of Health Sciences, Biocenter Oulu, University of Oulu, Oulu, Department of Clinical Physiology, Tampere University Hospital and University of Tampere, Department of Clinical Chemistry, Fimlab Laboratories, University of Tampere School of Medicine, Tampere, Finland, Computational Medicine, School of Social and Community Medicine and the Medical Research Council Integrative Epidemiology Unit, University of Bristol, Bristol, UK, Department of Clinical Physiology and Nuclear Medicine, Research Centre of Applied and Preventive Cardiovascular Medicine, University of Turku and Turku University Hospital, Turku, Department of Chronic Disease Prevention, National Institute for Health and Welfare, Helsinki, Unit of Primary Care, Oulu University Hospital, Department of Children and Young People and Families, National Institute for Health and Welfare, Oulu, Finland, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK, Hjelt Institute and Department of Computer Science, Helsinki Institute for Information Technology HIIT, University of Helsinki, Helsinki, Finland
- Jussi Gillberg
- Department of Information and Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Esbo, Finland, Center for Communicable Disease Dynamics, Harvard School of Public Health, Boston, MA, USA Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Unit of Public Health Genomics, National Institute for Health and Welfare, Helsinki, Computational Medicine, Institute of Health Sciences, University of Oulu and Oulu University Hospital, Oulu, NMR Metabolomics Laboratory, School of Pharmacy, University of Eastern Finland, Kuopio, Finland, Department of Epidemiology and Biostatistics, MRC Health Protection, Agency (HPA) Centre for Environment and Health, School of Public Health, Imperial College, London, UK, Institute of Health Sciences, Biocenter Oulu, University of Oulu, Oulu, Department of Clinical Physiology, Tampere University Hospital and University of Tampere, Department of Clinical Chemistry, Fimlab Laboratories, University of Tampere School of Medicine, Tampere, Finland, Computational Medicine, School of Social and Community Medicine and the Medical Research Council Integrative Epidemiology Unit, University of Bristol, Bristol, UK, Department of Clinical Physiology and Nuclear Medicine, Research Centre of Applied and Preventive Cardiovascular Medicine, University of Turku and Turku University Hospital, Turku, Department of Chronic Disease Prevention, National Institute for Health and Welfare, Helsinki, Unit of Primary Care, Oulu University Hospital, Department of Children and Young People and Families, National Institute for Health and Welfare, Oulu, Finland, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK, Hjelt Institute and Department of Computer Science, Helsinki Institute for Information Technology HIIT, University of Helsinki, Helsinki, Finland
| | - Johannes Kettunen
| | - Ida Surakka
| | - Antti J Kangas
| | - Pasi Soininen
| | - Paul O'Reilly
| | - Marika Kaakinen
| | - Mika Kähönen
| | - Terho Lehtimäki
| | - Mika Ala-Korpela
| | - Olli T Raitakari
| | - Veikko Salomaa
| | - Marjo-Riitta Järvelin
| | - Samuli Ripatti
| | - Samuel Kaski
| |
|
1458
|
Jiang Y, Conneely KN, Epstein MP. Flexible and robust methods for rare-variant testing of quantitative traits in trios and nuclear families. Genet Epidemiol 2014; 38:542-51. [PMID: 25044337 DOI: 10.1002/gepi.21839] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 03/24/2014] [Revised: 05/21/2014] [Accepted: 05/29/2014] [Indexed: 11/07/2022]
Abstract
Most rare-variant association tests for complex traits are applicable only to population-based or case-control resequencing studies. There are fewer rare-variant association tests for family-based resequencing studies, which is unfortunate because pedigrees possess many attractive characteristics for such analyses. Family-based studies can be more powerful than their population-based counterparts due to increased genetic load and further enable the implementation of rare-variant association tests that, by design, are robust to confounding due to population stratification. With this in mind, we propose a rare-variant association test for quantitative traits in families; this test integrates the QTDT approach of Abecasis et al. [Abecasis et al., ] into the kernel-based SNP association test KMFAM of Schifano et al. [Schifano et al., ]. The resulting within-family test enjoys the many benefits of the kernel framework for rare-variant association testing, including rapid evaluation of P-values and preservation of power when a region harbors rare causal variation that acts in different directions on phenotype. Additionally, by design, this within-family test is robust to confounding due to population stratification. Although within-family association tests are generally less powerful than their counterparts that use all genetic information, we show that we can recover much of this power (although still ensuring robustness to population stratification) using a straightforward screening procedure. Our method accommodates covariates and allows for missing parental genotype data, and we have written software implementing the approach in R for public use.
Affiliation(s)
- Yunxuan Jiang
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, Georgia, United States of America
| | | | | |
|
1459
|
Courtenay MD, Cade W, Schwartz SG, Kovach JL, Agarwal A, Wang G, Haines JL, Pericak-Vance MA, Scott WK. Set-based joint test of interaction between SNPs in the VEGF pathway and exogenous estrogen finds association with age-related macular degeneration. Invest Ophthalmol Vis Sci 2014; 55:IOVS-14-14494. [PMID: 25015356 PMCID: PMC4126792 DOI: 10.1167/iovs.14-14494] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 04/01/2014] [Accepted: 06/27/2014] [Indexed: 11/24/2022] Open
Abstract
Purpose: Age-Related Macular Degeneration (AMD) is the leading cause of irreversible visual loss in developed countries. Its etiology includes genetic and environmental factors. Although VEGFA variants are associated with AMD, the joint action of variants within the VEGF pathway and their interaction with non-genetic factors has not been investigated. Methods: Affymetrix 6.0 chipsets were used to genotype 668,238 SNPs in 1,207 AMD cases and 686 controls. Environmental exposures were collected by questionnaire. A set-based test was conducted using the chi-square statistic at each SNP derived from Kraft's 2df joint test. Pathway and gene-based test statistics were calculated as the mean of all independent SNP statistics. Phenotype labels were permuted 10,000 times to generate an empirical p-value. Results: While a main effect of the VEGF pathway was not identified, the pathway was associated with neovascular AMD in women when accounting for birth control pill (BCP) use (P = 0.017). Analysis of VEGF's subpathways found that SNPs in the Proliferation subpathway were associated with neovascular AMD (P = 0.029) when accounting for BCP use. Nominally significant genes within this subpathway were also observed. Stratification by BCP use revealed novel significant genetic effects in women who had taken BCPs. Conclusions: These results illustrate that some AMD genetic risk factors may only be revealed when considering complex relationships among risk factors. This shows the utility of exploring pathways of previously associated genes to find novel effects. It also demonstrates the importance of incorporating environmental exposures in tests of genetic association at the SNP, gene, or pathway level.
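The set-based procedure described in the Methods (mean of per-SNP statistics, with phenotype labels permuted to obtain an empirical p-value) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the per-SNP statistic here is a simple stand-in (n times the squared genotype-phenotype correlation) rather than Kraft's 2df joint test, the function names `per_snp_stats` and `set_based_pvalue` are hypothetical, and the add-one correction in the p-value is an assumed convention that the abstract does not specify.

```python
import numpy as np


def per_snp_stats(genotypes, phenotype):
    """Stand-in per-SNP statistic: n * r^2 between dosage and phenotype.

    Assumption: every SNP column is polymorphic and the phenotype is not
    constant, so no denominator is zero.
    """
    n = genotypes.shape[0]
    g = genotypes - genotypes.mean(axis=0)
    y = phenotype - phenotype.mean()
    denom = np.sqrt((g ** 2).sum(axis=0) * (y ** 2).sum())
    r = (g * y[:, None]).sum(axis=0) / denom
    return n * r ** 2


def set_based_pvalue(genotypes, phenotype, n_perm=10_000, seed=0):
    """Mean of per-SNP statistics, with a label-permutation p-value."""
    rng = np.random.default_rng(seed)
    observed = per_snp_stats(genotypes, phenotype).mean()
    exceed = 0
    for _ in range(n_perm):
        # Shuffle phenotype labels; genotypes (and their LD) stay intact.
        permuted = rng.permutation(phenotype)
        if per_snp_stats(genotypes, permuted).mean() >= observed:
            exceed += 1
    # Add-one correction (assumed convention) keeps the p-value above zero.
    return observed, (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    geno = rng.binomial(2, 0.3, size=(300, 4)).astype(float)  # toy dosages
    pheno = rng.integers(0, 2, size=300).astype(float)        # toy labels
    stat, p = set_based_pvalue(geno, pheno, n_perm=999)
    print(f"set statistic = {stat:.3f}, empirical p = {p:.3f}")
```

Permuting the labels rather than the genotypes preserves the correlation among SNPs in the set, which is why an empirical null is used in place of an analytic one for the averaged statistic.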
Affiliation(s)
- Monique D Courtenay
- Human Genetics and Genomics, University of Miami Miller School Medicine, 1501 NW 10th Ave, Miami, FL, 33136, United States
| | - William Cade
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, 1501 NW 10 Ave, BRB-314 (M860), Miami, Florida, 33136, United States
| | - Stephen G Schwartz
- Ophthalmology, Bascom Palmer Eye Institute, Retina Center of Naples, 311 9th Street North, Naples, Florida, 34102, United States of America
| | - Jaclyn L Kovach
- Bascom Palmer Eye Institute, University of Miami Miller School of Medicine, 311 9th St N, Naples, FL, 34102, United States of America
| | - Anita Agarwal
- VEI, Vanderbilt University, 2311 Pierce avenue, Nashville, Tennessee, 37232-8808, United States of America
| | - Gaofeng Wang
- Human Genetics, University of Miami Miller School of Medicine, 1501 NW 10th Avenue; BRB 525, Miami, Florida, 33136, United States
| | - Jonathan L Haines
- Department of Epidemiology & Biostatistics, Case Western Reserve University, 2-529 Wolstein Research Building, 2103 Cornell Road, Cleveland, Ohio, 44106, United States
| | - Margaret A Pericak-Vance
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, 1501 NW 10th Avenue, BRB-314 (M860), Miami, Florida, 33136, United States of America
| | - William K Scott
- Dr. John T. Macdonald Foundation Department of Human Genetics, John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, 1501 NW 10 Ave., Biomedical Research Building (BRB) # 414, Miami, Florida, 33136, United States
|
1460
|
Lee S, Abecasis G, Boehnke M, Lin X. Rare-variant association analysis: study designs and statistical tests. Am J Hum Genet 2014; 95:5-23. [PMID: 24995866 DOI: 10.1016/j.ajhg.2014.06.009] [Citation(s) in RCA: 689] [Impact Index Per Article: 62.6] [Received: 01/02/2014] [Indexed: 12/30/2022]
Abstract
Despite the extensive discovery of trait- and disease-associated common variants, much of the genetic contribution to complex traits remains unexplained. Rare variants can explain additional disease risk or trait variability. An increasing number of studies are underway to identify trait- and disease-associated rare variants. In this review, we provide an overview of statistical issues in rare-variant association studies with a focus on study designs and statistical tests. We present the design and analysis pipeline of rare-variant studies and review cost-effective sequencing designs and genotyping platforms. We compare various gene- or region-based association tests, including burden tests, variance-component tests, and combined omnibus tests, in terms of their assumptions and performance. Also discussed are the related topics of meta-analysis, population-stratification adjustment, genotype imputation, follow-up studies, and heritability due to rare variants. We provide guidelines for analysis and discuss some of the challenges inherent in these studies and future research directions.
|
1461
|
Schulte EC, Kousi M, Tan PL, Tilch E, Knauf F, Lichtner P, Trenkwalder C, Högl B, Frauscher B, Berger K, Fietze I, Hornyak M, Oertel WH, Bachmann CG, Zimprich A, Peters A, Gieger C, Meitinger T, Müller-Myhsok B, Katsanis N, Winkelmann J. Targeted resequencing and systematic in vivo functional testing identifies rare variants in MEIS1 as significant contributors to restless legs syndrome. Am J Hum Genet 2014; 95:85-95. [PMID: 24995868 DOI: 10.1016/j.ajhg.2014.06.005] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Received: 12/09/2013] [Accepted: 06/10/2014] [Indexed: 11/19/2022]
Abstract
Restless legs syndrome (RLS) is a common neurologic condition characterized by nocturnal dysesthesias and an urge to move, affecting the legs. RLS is a complex trait, for which genome-wide association studies (GWASs) have identified common susceptibility alleles of modest (OR 1.2-1.7) risk at six genomic loci. Among these, variants in MEIS1 have emerged as the largest risk factors for RLS, suggesting that perturbations in this transcription factor might be causally related to RLS susceptibility. To establish this causality, direction of effect, and total genetic burden of MEIS1, we interrogated 188 case subjects and 182 control subjects for rare alleles not captured by previous GWASs, followed by genotyping of ∼3,000 case subjects and 3,000 control subjects, and concluded with systematic functionalization of all discovered variants using a previously established in vivo model of neurogenesis. We observed a significant excess of rare MEIS1 variants in individuals with RLS. Subsequent assessment of all nonsynonymous variants by in vivo complementation revealed an excess of loss-of-function alleles in individuals with RLS. Strikingly, these alleles compromised the function of the canonical MEIS1 splice isoform but were irrelevant to an isoform known to utilize an alternative 3' sequence. Our data link MEIS1 loss of function to the etiopathology of RLS, highlight how combined sequencing and systematic functional annotation of rare variation at GWAS loci can detect risk burden, and offer a plausible explanation for the specificity of phenotypic expressivity of loss-of-function alleles at a locus broadly necessary for neurogenesis and neurodevelopment.
Affiliation(s)
- Eva C Schulte
- Neurologische Klinik und Poliklinik, Klinikum rechts der Isar, Technische Universität München, 81675 Munich, Germany; Institut für Humangenetik, Helmholtz Zentrum München, 85764 Munich, Germany
| | - Maria Kousi
- Center for Human Disease Modeling, Department of Cell Biology, Duke University, Durham, NC 27710, USA
| | - Perciliz L Tan
- Center for Human Disease Modeling, Department of Cell Biology, Duke University, Durham, NC 27710, USA
| | - Erik Tilch
- Institut für Humangenetik, Helmholtz Zentrum München, 85764 Munich, Germany; Institut für Humangenetik, Technische Universität München, 81675 Munich, Germany
| | - Franziska Knauf
- Institut für Humangenetik, Helmholtz Zentrum München, 85764 Munich, Germany
| | - Peter Lichtner
- Institut für Humangenetik, Helmholtz Zentrum München, 85764 Munich, Germany; Institut für Humangenetik, Technische Universität München, 81675 Munich, Germany
| | - Claudia Trenkwalder
- Paracelsus Elena Klinik, 34128 Kassel, Germany; Klinik für Neurochirurgie, Georg August Universität, 37075 Göttingen, Germany
| | - Birgit Högl
- Department of Neurology, Medizinische Universität Innsbruck, 6020 Innsbruck, Austria
| | - Birgit Frauscher
- Department of Neurology, Medizinische Universität Innsbruck, 6020 Innsbruck, Austria
| | - Klaus Berger
- Institut für Epidemiologie und Sozialmedizin, Westfälische Wilhelms Universität Münster, 48149 Münster, Germany
| | - Ingo Fietze
- Zentrum für Schlafmedizin, Charité Universitätsmedizin, 10117 Berlin, Germany
| | - Magdolna Hornyak
- Neurologische Klinik und Poliklinik, Klinikum rechts der Isar, Technische Universität München, 81675 Munich, Germany; Interdisziplinäres Schmerzzentrum, Albert-Ludwigs Universität Freiburg, 79106 Freiburg, Germany; Diakoniewerk München-Maxvorstadt, 80799 Munich, Germany
| | - Wolfgang H Oertel
- Klinik für Neurologie, Philipps Universität Marburg, 35039 Marburg, Germany
| | - Cornelius G Bachmann
- Abteilung für Neurologie, Paracelsus Klinikum Osnabrück, 49076 Osnabrück, Germany; Klinische Neurophysiologie, Georg August Universität, 37075 Göttingen, Germany
| | - Alexander Zimprich
- Department of Neurology, Medizinische Universität Wien, 1090 Vienna, Austria
| | - Annette Peters
- Institute of Epidemiology II, Helmholtz Zentrum München, 85764 Munich, Germany
| | - Christian Gieger
- Institute of Genetic Epidemiology, Helmholtz Zentrum München, 85764 Munich, Germany
| | - Thomas Meitinger
- Institut für Humangenetik, Helmholtz Zentrum München, 85764 Munich, Germany; Institut für Humangenetik, Technische Universität München, 81675 Munich, Germany; Munich Cluster for Systems Neurology (SyNergy), Munich, Germany
| | - Bertram Müller-Myhsok
- Munich Cluster for Systems Neurology (SyNergy), Munich, Germany; Max-Planck Institut für Psychiatrie München, 80804 Munich, Germany; Institute of Translational Medicine, University of Liverpool, Liverpool L69 3BX, UK
| | - Nicholas Katsanis
- Center for Human Disease Modeling, Department of Cell Biology, Duke University, Durham, NC 27710, USA
| | - Juliane Winkelmann
- Neurologische Klinik und Poliklinik, Klinikum rechts der Isar, Technische Universität München, 81675 Munich, Germany; Institut für Humangenetik, Helmholtz Zentrum München, 85764 Munich, Germany; Institut für Humangenetik, Technische Universität München, 81675 Munich, Germany; Munich Cluster for Systems Neurology (SyNergy), Munich, Germany; Department of Neurology and Neurosciences, Center for Sleep Sciences and Medicine, Stanford University, Palo Alto, CA 94304, USA.
|
1462
|
Zhao J, Zhu Y, Boerwinkle E, Xiong M. Pathway analysis with next-generation sequencing data. Eur J Hum Genet 2014; 23:507-15. [PMID: 24986826 DOI: 10.1038/ejhg.2014.121] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 06/14/2013] [Revised: 03/29/2014] [Accepted: 04/26/2014] [Indexed: 12/21/2022]
Abstract
Although pathway analysis methods have been developed and successfully applied to association studies of common variants, the statistical methods for pathway-based association analysis of rare variants have not been well developed. Many investigators observed highly inflated false-positive rates and low power in pathway-based tests of association of rare variants. The inflated false-positive rates and low true-positive rates of the current methods are mainly due to their lack of ability to account for gametic phase disequilibrium. To overcome these serious limitations, we develop a novel statistic that is based on the smoothed functional principal component analysis (SFPCA) for pathway association tests with next-generation sequencing data. The developed statistic has the ability to capture position-level variant information and account for gametic phase disequilibrium. By intensive simulations, we demonstrate that the SFPCA-based statistic for testing pathway association with either rare or common or both rare and common variants has the correct type 1 error rates. Also the power of the SFPCA-based statistic and 22 additional existing statistics are evaluated. We found that the SFPCA-based statistic has a much higher power than other existing statistics in all the scenarios considered. To further evaluate its performance, the SFPCA-based statistic is applied to pathway analysis of exome sequencing data in the early-onset myocardial infarction (EOMI) project. We identify three pathways significantly associated with EOMI after the Bonferroni correction. In addition, our preliminary results show that the SFPCA-based statistic has much smaller P-values to identify pathway association than other existing methods.
Affiliation(s)
- Jinying Zhao
- Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, USA
| | - Yun Zhu
- Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, USA
| | - Eric Boerwinkle
- Human Genetics Center, Division of Biostatistics, University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Momiao Xiong
- Human Genetics Center, Division of Biostatistics, University of Texas Health Science Center at Houston, Houston, TX, USA
|
1463
|
Babron MC, Kazma R, Gaborieau V, McKay J, Brennan P, Sarasin A, Benhamou S. Genetic variants in DNA repair pathways and risk of upper aerodigestive tract cancers: combined analysis of data from two genome-wide association studies in European populations. Carcinogenesis 2014; 35:1523-7. [PMID: 24658182 DOI: 10.1093/carcin/bgu075] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 12/16/2022]
Abstract
DNA repair pathways are good candidates for upper aerodigestive tract cancer susceptibility because of their critical role in maintaining genome integrity. We have selected 13 pathways involved in DNA repair representing 212 autosomal genes. To assess the role of these pathways and their associated genes, two European data sets from the International Head and Neck Cancer Epidemiology consortium were pooled, totaling 1954 cases and 3121 controls, with documented demographic, lifetime alcohol and tobacco consumption information. We applied an innovative approach that tests single nucleotide polymorphism (SNP)-sets within DNA repair pathways and then within genes belonging to the significant pathways. We showed an association between the polymerase pathway and oral cavity/pharynx cancers (P-corrected = 4.45 × 10^-2), explained entirely by the association with one SNP, rs1494961 (P = 2.65 × 10^-4), a missense mutation V306I in the second exon of HELQ gene. We also found an association between the cell cycle regulation pathway and esophagus cancer (P-corrected = 1.48 × 10^-2), explained by three SNPs located within or near CSNK1E gene: rs1534891 (P = 1.27 × 10^-4), rs7289981 (P = 3.37 × 10^-3) and rs13054361 (P = 4.09 × 10^-3). As a first attempt to investigate pathway-level associations, our results suggest a role of specific DNA repair genes/pathways in specific upper aerodigestive tract cancer sites.
Affiliation(s)
- Marie-Claude Babron
- Inserm, U946, Genetic Variation and Human Diseases, and Université Paris-Diderot, Sorbonne Paris-Cité, UMRS-946, Paris, F-75010, France
| | - Rémi Kazma
- Department of Epidemiology and Biostatistics, Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
| | - Valérie Gaborieau
- Department of Genetic Epidemiology, International Agency for Research on Cancer, Lyon, F-69008, France
| | - James McKay
- Department of Genetic Epidemiology, International Agency for Research on Cancer, Lyon, F-69008, France
| | - Paul Brennan
- Department of Genetic Epidemiology, International Agency for Research on Cancer, Lyon, F-69008, France
| | - Alain Sarasin
- Université Paris-Sud, Faculty of Medicine, Villejuif, F-94805, France, CNRS, UMR8200, Genomes and Cancers and Gustave Roussy, Villejuif, F-94805, France
| | - Simone Benhamou
- Inserm, U946, Genetic Variation and Human Diseases, and Université Paris-Diderot, Sorbonne Paris-Cité, UMRS-946, Paris, F-75010, France, Gustave Roussy, Villejuif, F-94805, France
|
1464
|
Huang W, Massouras A, Inoue Y, Peiffer J, Ràmia M, Tarone AM, Turlapati L, Zichner T, Zhu D, Lyman RF, Magwire MM, Blankenburg K, Carbone MA, Chang K, Ellis LL, Fernandez S, Han Y, Highnam G, Hjelmen CE, Jack JR, Javaid M, Jayaseelan J, Kalra D, Lee S, Lewis L, Munidasa M, Ongeri F, Patel S, Perales L, Perez A, Pu L, Rollmann SM, Ruth R, Saada N, Warner C, Williams A, Wu YQ, Yamamoto A, Zhang Y, Zhu Y, Anholt RR, Korbel JO, Mittelman D, Muzny DM, Gibbs RA, Barbadilla A, Johnston JS, Stone EA, Richards S, Deplancke B, Mackay TF. Natural variation in genome architecture among 205 Drosophila melanogaster Genetic Reference Panel lines. Genome Res 2014; 24:1193-208. [PMID: 24714809 PMCID: PMC4079974 DOI: 10.1101/gr.171546.113] [Citation(s) in RCA: 428] [Impact Index Per Article: 38.9] [Received: 12/20/2013] [Accepted: 04/01/2014] [Indexed: 12/30/2022]
Abstract
The Drosophila melanogaster Genetic Reference Panel (DGRP) is a community resource of 205 sequenced inbred lines, derived to improve our understanding of the effects of naturally occurring genetic variation on molecular and organismal phenotypes. We used an integrated genotyping strategy to identify 4,853,802 single nucleotide polymorphisms (SNPs) and 1,296,080 non-SNP variants. Our molecular population genomic analyses show higher deletion than insertion mutation rates and stronger purifying selection on deletions. Weaker selection on insertions than deletions is consistent with our observed distribution of genome size determined by flow cytometry, which is skewed toward larger genomes. Insertion/deletion and single nucleotide polymorphisms are positively correlated with each other and with local recombination, suggesting that their nonrandom distributions are due to hitchhiking and background selection. Our cytogenetic analysis identified 16 polymorphic inversions in the DGRP. Common inverted and standard karyotypes are genetically divergent and account for most of the variation in relatedness among the DGRP lines. Intriguingly, variation in genome size and many quantitative traits are significantly associated with inversions. Approximately 50% of the DGRP lines are infected with Wolbachia, and four lines have germline insertions of Wolbachia sequences, but effects of Wolbachia infection on quantitative traits are rarely significant. The DGRP complements ongoing efforts to functionally annotate the Drosophila genome. Indeed, 15% of all D. melanogaster genes segregate for potentially damaged proteins in the DGRP, and genome-wide analyses of quantitative traits identify novel candidate genes. The DGRP lines, sequence data, genotypes, quality scores, phenotypes, and analysis and visualization tools are publicly available.
Affiliation(s)
- Wen Huang
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27595, USA
| | - Andreas Massouras
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Yutaka Inoue
- Center for Education in Liberal Arts and Sciences, Osaka University, Osaka-fu, 560-0043 Japan
| | - Jason Peiffer
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27595, USA
| | - Miquel Ràmia
- Genomics, Bioinformatics and Evolution Group, Institut de Biotecnologia i de Biomedicina (IBB), Department of Genetics and Microbiology, Campus Universitat Autònoma de Barcelona, 08193 Bellaterra, Spain
| | - Aaron M. Tarone
- Department of Entomology, Texas A&M University, College Station, Texas 77843, USA
| | - Lavanya Turlapati
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27595, USA
| | - Thomas Zichner
- Genome Biology Unit, European Molecular Biology Laboratory (EMBL), 69117 Heidelberg, Germany
| | - Dianhui Zhu
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
| | - Richard F. Lyman
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27595, USA
| | - Michael M. Magwire
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27595, USA
| | - Kerstin Blankenburg
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
| | - Mary Anna Carbone
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27595, USA
| | - Kyle Chang
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
| | - Lisa L. Ellis
- Department of Entomology, Texas A&M University, College Station, Texas 77843, USA
| | - Sonia Fernandez
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
| | - Yi Han
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
| | - Gareth Highnam
- Virginia Tech Virginia Bioinformatics Institute and Department of Biological Sciences, Virginia Tech, Blacksburg, Virginia 24061, USA
| | - Carl E. Hjelmen
- Department of Entomology, Texas A&M University, College Station, Texas 77843, USA
| | - John R. Jack
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27595, USA
| | - Mehwish Javaid
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
| | - Joy Jayaseelan
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
| | - Divya Kalra
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
| | - Sandy Lee
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
| | - Lora Lewis
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
| | - Mala Munidasa
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
| | - Fiona Ongeri
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
| | - Shohba Patel
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
| | - Lora Perales
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
| | - Agapito Perez
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
| | - LingLing Pu
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
| | - Stephanie M. Rollmann
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27595, USA
| | - Robert Ruth
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
| | - Nehad Saada
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
| | - Crystal Warner
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
| | - Aneisa Williams
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
| | - Yuan-Qing Wu
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
| | - Akihiko Yamamoto
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27595, USA
| | - Yiqing Zhang
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
| | - Yiming Zhu
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
| | - Robert R.H. Anholt
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27595, USA
| | - Jan O. Korbel
- Genome Biology Unit, European Molecular Biology Laboratory (EMBL), 69117 Heidelberg, Germany
| | - David Mittelman
- Virginia Tech Virginia Bioinformatics Institute and Department of Biological Sciences, Virginia Tech, Blacksburg, Virginia 24061, USA
| | - Donna M. Muzny
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
| | - Richard A. Gibbs
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
| | - Antonio Barbadilla
- Genomics, Bioinformatics and Evolution Group, Institut de Biotecnologia i de Biomedicina (IBB), Department of Genetics and Microbiology, Campus Universitat Autònoma de Barcelona, 08193 Bellaterra, Spain
| | - J. Spencer Johnston
- Department of Entomology, Texas A&M University, College Station, Texas 77843, USA
| | - Eric A. Stone
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27595, USA
| | - Stephen Richards
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
| | - Bart Deplancke
- Laboratory of Systems Biology and Genetics, Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland
- Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
| | - Trudy F.C. Mackay
- Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27595, USA
|
1465
|
Zhao SD, Cai TT, Li H. More powerful genetic association testing via a new statistical framework for integrative genomics. Biometrics 2014; 70:881-90. [PMID: 24975802 DOI: 10.1111/biom.12206] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 05/01/2013] [Revised: 05/01/2014] [Accepted: 05/01/2014] [Indexed: 11/30/2022]
Abstract
Integrative genomics offers a promising approach to more powerful genetic association studies. The hope is that combining outcome and genotype data with other types of genomic information can lead to more powerful SNP detection. We present a new association test based on a statistical model that explicitly assumes that genetic variations affect the outcome through perturbing gene expression levels. It is shown analytically that the proposed approach can have more power to detect SNPs that are associated with the outcome through transcriptional regulation, compared to tests using the outcome and genotype data alone, and simulations show that our method is relatively robust to misspecification. We also provide a strategy for applying our approach to high-dimensional genomic data. We use this strategy to identify a potentially new association between a SNP and a yeast cell's response to the natural product tomatidine, which standard association analysis did not detect.
Affiliation(s)
- Sihai D Zhao
- Department of Statistics, University of Illinois at Urbana-Champaign, Champaign, Illinois 61820, U.S.A
|
1466
|
Bis JC, DeStefano A, Liu X, Brody JA, Choi SH, Verhaaren BFJ, Debette S, Ikram MA, Shahar E, Butler KR, Gottesman RF, Muzny D, Kovar CL, Psaty BM, Hofman A, Lumley T, Gupta M, Wolf PA, van Duijn C, Gibbs RA, Mosley TH, Longstreth WT, Boerwinkle E, Seshadri S, Fornage M. Associations of NINJ2 sequence variants with incident ischemic stroke in the Cohorts for Heart and Aging in Genomic Epidemiology (CHARGE) consortium. PLoS One 2014; 9:e99798. [PMID: 24959832 PMCID: PMC4069013 DOI: 10.1371/journal.pone.0099798] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 01/31/2014] [Accepted: 05/16/2014] [Indexed: 01/09/2023]
Abstract
Background: Stroke, the leading neurologic cause of death and disability, has a substantial genetic component. We previously conducted a genome-wide association study (GWAS) in four prospective studies from the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium and demonstrated that sequence variants near the NINJ2 gene are associated with incident ischemic stroke. Here, we sought to fine-map functional variants in the region and evaluate the contribution of rare variants to ischemic stroke risk. Methods and Results: We sequenced 196 kb around NINJ2 on chromosome 12p13 among 3,986 European ancestry participants, including 475 ischemic stroke cases, from the Atherosclerosis Risk in Communities Study, Cardiovascular Health Study, and Framingham Heart Study. Meta-analyses of single-variant tests for 425 common variants (minor allele frequency [MAF] ≥ 1%) confirmed the original GWAS results and identified an independent intronic variant, rs34166160 (MAF = 0.012), most significantly associated with incident ischemic stroke (HR = 1.80, p = 0.0003). Aggregating 278 putatively-functional variants with MAF ≤ 1% using count statistics, we observed a nominally statistically significant association, with the burden of rare NINJ2 variants contributing to decreased ischemic stroke incidence (HR = 0.81; p = 0.026). Conclusion: Common and rare variants in the NINJ2 region were nominally associated with incident ischemic stroke among a subset of CHARGE participants. Allelic heterogeneity at this locus, caused by multiple rare, low frequency, and common variants with disparate effects on risk, may explain the difficulties in replicating the original GWAS results. Additional studies that take into account the complex allelic architecture at this locus are needed to confirm these findings.
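The aggregation of rare variants "using count statistics" can be pictured with a short Python sketch of a burden score. This is an illustration under stated assumptions, not the study's pipeline: genotypes are assumed to be coded as minor-allele counts (0, 1, 2), the helper names `minor_allele_freq` and `burden_scores` are hypothetical, and the downstream survival model that produced the reported hazard ratio is not shown.

```python
def minor_allele_freq(genotypes):
    """Allele frequency from 0/1/2 minor-allele counts in diploid individuals."""
    return sum(genotypes) / (2 * len(genotypes))


def burden_scores(variant_columns, maf_max=0.01):
    """Per-individual count of minor alleles across variants with 0 < MAF <= maf_max.

    variant_columns: one list of genotype codes per variant, all the same length.
    """
    rare = [col for col in variant_columns
            if 0 < minor_allele_freq(col) <= maf_max]
    n_individuals = len(variant_columns[0])
    # Each individual's score is the total number of rare alleles carried.
    return [sum(col[i] for col in rare) for i in range(n_individuals)]
```

The resulting score is a single covariate per participant, which is what allows many individually uninformative rare variants to be tested jointly in one regression.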
Affiliation(s)
- Joshua C. Bis
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, Washington, United States of America
| | - Anita DeStefano
- Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts, United States of America
| | - Xiaoming Liu
- Human Genetics Center, University of Texas Health Science Center at Houston, Houston, Texas, United States of America
| | - Jennifer A. Brody
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, Washington, United States of America
| | - Seung Hoan Choi
- Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts, United States of America
| | - Benjamin F. J. Verhaaren
- Department of Radiology, Erasmus MC, Rotterdam, The Netherlands
- Department of Epidemiology, Erasmus MC, Rotterdam, The Netherlands
| | - Stéphanie Debette
- Institut National de la Santé et de la Recherche Médicale (INSERM), U708, Neuroepidemiology, Paris, France
- Department of Epidemiology, University of Versailles Saint-Quentin-en-Yvelines, Paris, France
| | - M. Arfan Ikram
- Department of Radiology, Erasmus MC, Rotterdam, The Netherlands
- Department of Epidemiology, Erasmus MC, Rotterdam, The Netherlands
- Department of Neurology, Erasmus MC, Rotterdam, The Netherlands
- Netherlands Consortium for Healthy Aging, Leiden, The Netherlands
| | - Eyal Shahar
- Mel and Enid Zuckerman College of Public Health, University of Arizona, Tucson, Arizona, United States of America
| | - Kenneth R. Butler
- Department of Medicine (Geriatrics), University of Mississippi Medical Center, Jackson, Mississippi, United States of America
| | - Rebecca F. Gottesman
- Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
| | - Donna Muzny
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States of America
| | - Christie L. Kovar
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States of America
| | - Bruce M. Psaty
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, Washington, United States of America
- Department of Epidemiology, University of Washington, Seattle, Washington, United States of America
- Group Health Research Institute, Group Health, Seattle, Washington, United States of America
| | - Albert Hofman
- Department of Epidemiology, Erasmus MC, Rotterdam, The Netherlands
| | - Thomas Lumley
- Department of Statistics, University of Auckland, Auckland, New Zealand
| | - Mayetri Gupta
- Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts, United States of America
| | - Philip A. Wolf
- Department of Neurology, Boston University School of Medicine, Boston, Massachusetts, United States of America
| | | | - Richard A. Gibbs
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States of America
| | - Thomas H. Mosley
- Department of Medicine (Geriatrics), University of Mississippi Medical Center, Jackson, Mississippi, United States of America
| | - W. T. Longstreth
- Department of Epidemiology, University of Washington, Seattle, Washington, United States of America
- Department of Neurology, University of Washington, Seattle, Washington, United States of America
| | - Eric Boerwinkle
- Human Genetics Center, University of Texas Health Science Center at Houston, Houston, Texas, United States of America
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, United States of America
- Brown Foundation Institute of Molecular Medicine, University of Texas Health Science Center at Houston, Texas, United States of America
| | - Sudha Seshadri
- Department of Neurology, Boston University School of Medicine, Boston, Massachusetts, United States of America
| | - Myriam Fornage
- Human Genetics Center, University of Texas Health Science Center at Houston, Houston, Texas, United States of America
- Brown Foundation Institute of Molecular Medicine, University of Texas Health Science Center at Houston, Texas, United States of America
| |
|
1467
|
Gui H, Bao JY, Tang CSM, So MT, Ngo DN, Tran AQ, Bui DH, Pham DH, Nguyen TL, Tong A, Lok S, Sham PC, Tam PKH, Cherny SS, Garcia-Barcelo MM. Targeted next-generation sequencing on Hirschsprung disease: a pilot study exploits DNA pooling. Ann Hum Genet 2014; 78:381-7. [PMID: 24947032 DOI: 10.1111/ahg.12076] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 02/06/2014] [Accepted: 05/07/2014] [Indexed: 12/11/2022]
Abstract
To adopt an efficient approach for identifying rare variants possibly related to Hirschsprung disease (HSCR), a pilot study was set up to evaluate the performance of a newly designed protocol for next-generation targeted resequencing. In total, 20 Chinese HSCR patients and 20 Chinese sex-matched individuals with no HSCR were included, for which coding sequences (CDS) of 62 genes known to be in signaling pathways relevant to enteric nervous system development were selected for capture and sequencing. Blood DNAs from eight pools of five cases or controls were enriched by PCR-based RainDance technology (RDT) and then sequenced on a 454 FLX platform. As technical validation, five patients from case Pool-3 were also independently enriched by RDT, indexed with barcodes and sequenced with sufficient coverage. Assessment for CDS single nucleotide variants showed that DNA pooling performed well (specificity/sensitivity at 98.4%/83.7%) at the common variant level, but relatively worse (specificity/sensitivity at 65.5%/61.3%) at the rare variant level. Further Sanger sequencing validated only five out of 12 rare damaging variants likely involved in HSCR. Hence, more improvement in variant detection and sequencing technology is needed to realize the potential of DNA pooling for large-scale resequencing projects.
Affiliation(s)
- Hongsheng Gui
- Department of Surgery, The University of Hong Kong, Hong Kong, SAR, China; Department of Psychiatry, The University of Hong Kong, Hong Kong, SAR, China
| |
|
1468
|
Bonomo JA, Ng MCY, Palmer ND, Keaton JM, Larsen CP, Hicks PJ, Langefeld CD, Freedman BI, Bowden DW. Coding variants in nephrin (NPHS1) and susceptibility to nephropathy in African Americans. Clin J Am Soc Nephrol 2014; 9:1434-40. [PMID: 24948143 DOI: 10.2215/cjn.00290114] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 01/13/2023]
Abstract
BACKGROUND AND OBJECTIVES Presumed genetic risk for diabetic and nondiabetic end stage renal disease is strong in African Americans. DESIGN, SETTING, PARTICIPANTS, & MEASUREMENTS Exome sequencing data from African Americans with type 2 diabetic end stage renal disease and nondiabetic, non-nephropathy controls in the T2D-GENES study (Discovery, n=529 patients and n=535 controls) were evaluated, focusing on missense variants in NPHS1. Associated variants were then evaluated in independent type 2 diabetic end stage renal disease (Replication, n=1305 patients and n=760 controls), nondiabetic end stage renal disease (n=1705), and type 2 diabetes-only, non-nephropathy samples (n=503). All participants were recruited from dialysis facilities and internal medicine clinics across the southeastern United States from 1991 to present. Additional NPHS1 missense variants were identified from exome sequencing resources, genotyped, and sequence kernel association testing was then performed. RESULTS Initial analysis identified rs35238405 (T233A; minor allele frequency=0.0096) as associated with type 2 diabetic end stage renal disease (adjustment for admixture P=0.042; adjustment for admixture+APOL1 P=0.080; odds ratio, 2.89 and 2.36, respectively); with replication in independent type 2 diabetic end stage renal disease samples (P=0.018; odds ratio, 4.30) and nondiabetic end stage renal disease samples (P=0.016; odds ratio, 4.48). In a combined analysis (all patients with end stage renal disease versus all controls), T233A was associated with all-cause end stage renal disease (P=0.0038; odds ratio, 2.82; n=3270 patients and n=1187 controls). A P-value of <0.001 was obtained after adjustment for admixture and APOL1 in sequence kernel association testing. 
Two additional variants (H800R and Y1174H) were nominally associated with protection from end stage renal disease (P=0.036; odds ratio, 0.44; P=0.0084; odds ratio, 0.040, respectively) in the locus-wide single-variant association tests. CONCLUSIONS Coding variants in NPHS1 are associated with both risk for and protection from common forms of nephropathy in African Americans.
Affiliation(s)
- Jason A Bonomo
- Departments of Molecular Medicine and Translational Science, Center for Human Genomics and Personalized Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina
| | - Maggie C Y Ng
- Center for Human Genomics and Personalized Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina; and Biochemistry
| | - Nicholette D Palmer
- Center for Human Genomics and Personalized Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina; and Biochemistry
| | - Jacob M Keaton
- Center for Human Genomics and Personalized Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina
| | | | - Pamela J Hicks
- Center for Human Genomics and Personalized Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina
| | | | - Carl D Langefeld
- Center for Human Genomics and Personalized Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina; and Biostatistical Sciences, and
| | | | - Donald W Bowden
- Center for Human Genomics and Personalized Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina; and Biochemistry,
| |
|
1469
|
Genome-wide association analysis demonstrates the highly polygenic character of age-related hearing impairment. Eur J Hum Genet 2014; 23:110-5. [PMID: 24939585 DOI: 10.1038/ejhg.2014.56] [Citation(s) in RCA: 75] [Impact Index Per Article: 6.8] [Received: 06/27/2013] [Revised: 01/17/2014] [Accepted: 03/05/2014] [Indexed: 11/08/2022] Open
Abstract
We performed a genome-wide association study (GWAS) to identify the genes responsible for age-related hearing impairment (ARHI), the most common form of hearing impairment in the elderly. Analysis of common variants, with and without adjustment for stratification and environmental covariates, rare variants and interactions, as well as gene-set enrichment analysis, showed no variants with genome-wide significance. No evidence for replication of any previously reported genes was found. A study of the genetic architecture indicates for the first time that ARHI is highly polygenic in nature, with probably no major genes involved. The phenotype depends on the aggregated effect of a large number of SNPs, of which the individual effects are undetectable in a modestly powered GWAS. We estimated that 22% of the variance in our data set can be explained by the collective effect of all genotyped SNPs. A score analysis showed a modest enrichment in causative SNPs among the SNPs with a P-value below 0.01.
|
1470
|
Mallaney C, Sung YJ. Rare variant analysis of blood pressure phenotypes in the Genetic Analysis Workshop 18 whole genome sequencing data using sequence kernel association test. BMC Proc 2014; 8:S10. [PMID: 25519353 PMCID: PMC4143707 DOI: 10.1186/1753-6561-8-s1-s10] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022] Open
Abstract
Sequence kernel association test (SKAT) has become one of the most commonly used nonburden tests for analyzing rare variants. Performance of burden tests depends on the weighting of rare and common variants when collapsing them in a genomic region. Using the systolic and diastolic blood pressure phenotypes of 142 unrelated individuals in the Genetic Analysis Workshop 18 data, we investigated whether performance of SKAT also depends on the weighting scheme. We analyzed the entire sequencing data for all 200 replications using 3 weighting schemes: equal weighting, Madsen-Browning weighting, and SKAT default linear weighting. We considered two options: all single-nucleotide polymorphisms (SNPs) and only low-frequency SNPs. A SKAT default weighting scheme (which heavily downweights common variants) performed better for the genes in which causal SNPs are mostly rare. This SKAT default weighting scheme behaved similarly to other weighting schemes after eliminating all common SNPs. In contrast, the equal weighting scheme performed the best for MAP4 and FLT3, both of which included a common variant with a large effect. However, SKAT with all 3 weighting schemes performed poorly. Overall power across all causal genes was about 0.05, which was almost identical to the type I error rate. This poor performance is partly due to a small sample size because of the need to analyze only unrelated individuals. Because a half of causal SNPs were not found in the annotation file based on the 1000 Genomes Project, we suspect that performance was also affected by our use of incomplete annotation information.
Affiliation(s)
- Cates Mallaney
- Division of Biostatistics, Washington University in St. Louis, School of Medicine, St. Louis, MO 63110, USA
| | - Yun Ju Sung
- Division of Biostatistics, Washington University in St. Louis, School of Medicine, St. Louis, MO 63110, USA
| |
|
1471
|
Chen H, Choi SH, Hong J, Lu C, Milton JN, Allard C, Lacey SM, Lin H, Dupuis J. Rare genetic variant analysis on blood pressure in related samples. BMC Proc 2014; 8:S35. [PMID: 25519320 PMCID: PMC4143757 DOI: 10.1186/1753-6561-8-s1-s35] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 01/11/2023] Open
Abstract
The genetic variants associated with blood pressure identified so far explain only a small proportion of the total heritability of this trait. With recent advances in sequencing technology and statistical methodology, it becomes feasible to study the association between blood pressure and rare genetic variants. Using real baseline phenotype data and imputed dosage data from Genetic Analysis Workshop 18, we performed a candidate gene association analysis. We focused on 8 genes shown to be associated with either systolic or diastolic blood pressure to identify the association with both common and rare genetic variants, and then did a genome-wide rare-variant analysis on blood pressure. We performed association analysis for rare coding and splicing variants within each gene region and all rare variants in each sliding window, using either burden tests or sequence kernel association tests accounting for familial correlation. With a sample size of only 747, we failed to find any novel associated genetic loci. Consequently, we performed analyses on simulated data, with knowledge of the underlying simulating model, to evaluate the type I error rate and power for the methods used in real data analysis.
Affiliation(s)
- Han Chen
- Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Avenue, Boston, MA 02118, USA
| | - Seung Hoan Choi
- Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Avenue, Boston, MA 02118, USA
| | - Jaeyoung Hong
- Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Avenue, Boston, MA 02118, USA
| | - Chen Lu
- Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Avenue, Boston, MA 02118, USA
| | - Jacqueline N Milton
- Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Avenue, Boston, MA 02118, USA
| | - Catherine Allard
- Département de Mathématiques, Université de Sherbrooke, 2500 Boulevard de l'Université, Sherbrooke, QC J1K 2R1, Canada
| | - Sean M Lacey
- Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Avenue, Boston, MA 02118, USA
| | - Honghuang Lin
- Department of Medicine, Boston University School of Medicine, 72 East Concord Street, Boston, MA 02118, USA
| | - Josée Dupuis
- Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Avenue, Boston, MA 02118, USA
| |
|
1472
|
Liu Y, Huang C, Hu I, Lo SH, Zheng T. A dual-clustering framework for association screening with whole genome sequencing data and longitudinal traits. BMC Proc 2014; 8:S47. [PMID: 25519328 PMCID: PMC4143709 DOI: 10.1186/1753-6561-8-s1-s47] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/23/2022] Open
Abstract
Current sequencing technology enables generation of whole genome sequencing data sets that contain a high density of rare variants, each of which is carried by, at most, 5% of the sampled subjects. Such variants are involved in the etiology of most common diseases in humans. These diseases can be studied by relevant longitudinal phenotype traits. Tests for association between such genotype information and longitudinal traits allow the study of the function of rare variants in complex human disorders. In this paper, we propose an association-screening framework that highlights the genotypic differences observed on rare variants and the longitudinal nature of phenotypes. In particular, both variants within a gene and longitudinal phenotypes are used to create partitions of subjects. Association between the 2 sets of constructed partitions is then evaluated. We apply the proposed strategy to the simulated data from the Genetic Analysis Workshop 18 and compare the obtained results with those from sequence kernel association test using the receiver operating characteristic curves.
Affiliation(s)
- Ying Liu
- Department of Statistics, Columbia University, 1255 Amsterdam Avenue, New York, NY 10027, USA
| | - ChienHsun Huang
- Department of Statistics, Columbia University, 1255 Amsterdam Avenue, New York, NY 10027, USA
| | - Inchi Hu
- ISOM, Hong Kong University of Science and Technology, Kowloon, Hong Kong
| | - Shaw-Hwa Lo
- Department of Statistics, Columbia University, 1255 Amsterdam Avenue, New York, NY 10027, USA
| | - Tian Zheng
- Department of Statistics, Columbia University, 1255 Amsterdam Avenue, New York, NY 10027, USA
| |
|
1473
|
Biswas S, Papachristou C. Evaluation of logistic Bayesian LASSO for identifying association with rare haplotypes. BMC Proc 2014; 8:S54. [PMID: 25519334 PMCID: PMC4144467 DOI: 10.1186/1753-6561-8-s1-s54] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022] Open
Abstract
It has been hypothesized that rare variants may hold the key to unraveling the genetic transmission mechanism of many common complex traits. Currently, there is a dearth of statistical methods that are powerful enough to detect association with rare haplotypes. One of the recently proposed methods is logistic Bayesian LASSO for case-control data. By penalizing the regression coefficients through appropriate priors, logistic Bayesian LASSO weeds out the unassociated haplotypes, making it possible for the associated rare haplotypes to be detected with higher powers. We used the Genetic Analysis Workshop 18 simulated data to evaluate the behavior of logistic Bayesian LASSO in terms of its power and type I error under a complex disease model. We obtained knowledge of the simulation model, including the locations of the functional variants, and we chose to focus on two genomic regions in the MAP4 gene on chromosome 3. The sample size was 142 individuals and there were 200 replicates. Despite the small sample size, logistic Bayesian LASSO showed high power to detect two haplotypes containing functional variants in these regions while maintaining low type I errors. At the same time, a commonly used approach for haplotype association implemented in the software hapassoc failed to converge because of the presence of rare haplotypes. Thus, we conclude that logistic Bayesian LASSO can play an important role in the search for rare haplotypes.
Affiliation(s)
- Swati Biswas
- Department of Mathematical Sciences, FO 35, University of Texas at Dallas, 800 West Campbell Road, Richardson, TX 75080, USA
| | - Charalampos Papachristou
- Department of Mathematics, Physics, and Statistics, University of the Sciences in Philadelphia, 600 South 43rd Street, Philadelphia, PA 19104, USA
| |
|
1474
|
Zhou JJ, Yip WK, Cho MH, Qiao D, McDonald MLN, Laird NM. A comparative analysis of family-based and population-based association tests using whole genome sequence data. BMC Proc 2014; 8:S33. [PMID: 25519381 PMCID: PMC4143682 DOI: 10.1186/1753-6561-8-s1-s33] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 12/26/2022] Open
Abstract
The revolution in next-generation sequencing has made obtaining both common and rare high-quality sequence variants across the entire genome feasible. Because researchers are now faced with the analytical challenges of handling a massive amount of genetic variant information from sequencing studies, numerous methods have been developed to assess the impact of both common and rare variants on disease traits. In this report, whole genome sequencing data from Genetic Analysis Workshop 18 was used to compare the power of several methods, considering both family-based and population-based designs, to detect association with variants in the MAP4 gene region and on chromosome 3 with blood pressure. To prioritize variants across the genome for testing, variants were first functionally assessed using prediction algorithms and expression quantitative trait loci (eQTLs) data. Four set-based tests in the family-based association tests (FBAT) framework--FBAT-v, FBAT-lmm, FBAT-m, and FBAT-l--were used to analyze 20 pedigrees, and 2 variance component tests, sequence kernel association test (SKAT) and genome-wide complex trait analysis (GCTA), were used with 142 unrelated individuals in the sample. Both set-based and variance-component-based tests had high power and an adequate type I error rate. Of the various FBATs, FBAT-l demonstrated superior performance, indicating the potential for it to be used in rare-variant analysis. The updated FBAT package is available at: http://www.hsph.harvard.edu/fbat/.
Affiliation(s)
- Jin J Zhou
- Biostatistics Department, Harvard School of Public Health, Boston, MA 02115, USA; Division of Epidemiology and Biostatistics, College of Public Health, University of Arizona, Tucson, AZ 85724, USA
| | - Wai-Ki Yip
- Biostatistics Department, Harvard School of Public Health, Boston, MA 02115 USA
| | - Michael H Cho
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Division of Pulmonary and Critical Care Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
| | - Dandi Qiao
- Biostatistics Department, Harvard School of Public Health, Boston, MA 02115 USA
| | - Merry-Lynn N McDonald
- Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
| | - Nan M Laird
- Biostatistics Department, Harvard School of Public Health, Boston, MA 02115 USA
| |
|
1475
|
Zhang TX, Xie YR, Rice JP. Application of noncollapsing methods to the gene-based association test: a comparison study using Genetic Analysis Workshop 18 data. BMC Proc 2014; 8:S53. [PMID: 25519333 PMCID: PMC4143635 DOI: 10.1186/1753-6561-8-s1-s53] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/10/2022] Open
Abstract
Rare variants have been proposed to play a significant role in the onset and development of common diseases. However, traditional analysis methods have difficulties in detecting association signals for rare causal variants because of a lack of statistical power. We propose a two-stage, gene-based method for association mapping of rare variants by applying four different noncollapsing algorithms. Using the Genetic Analysis Workshop 18 whole genome sequencing data set of simulated blood pressure phenotypes, we studied and contrasted the false-positive rate of each algorithm using receiver operating characteristic curves. The statistical power of these methods was also evaluated and compared through the analysis of 200 simulated replications in a smaller genotype data set. We showed that Fisher's method was superior to the other 3 noncollapsing methods, but was no better than the standard method implemented with famSKAT. Further investigation is needed to explore the potential statistical properties of these approaches.
Affiliation(s)
- Tian-Xiao Zhang
- Department of Psychiatry, Washington University, 660 S. Euclid Ave., St. Louis, MO 63110, USA
| | - Yi-Ran Xie
- Department of Psychiatry, Washington University, 660 S. Euclid Ave., St. Louis, MO 63110, USA
| | - John P Rice
- Department of Psychiatry, Washington University, 660 S. Euclid Ave., St. Louis, MO 63110, USA
| |
|
1476
|
Abstract
Although many genetic factors have been successfully identified for human diseases in genome-wide association studies, genes discovered to date account for only a small proportion of overall genetic contributions to many complex traits. Association studies have difficulty in detecting the remaining true genetic variants that are either common variants with weak allelic effects, or rare variants that have strong allelic effects but are weakly associated at the population level. In this work, we applied a goodness-of-fit test for detecting sets of common and rare variants associated with quantitative or binary traits by using whole genome sequencing data. This test has been proved optimal for detecting weak and sparse signals in the literature, which fits the requirements for targeting the genetic components of missing heritability. Furthermore, this p value-combining method allows one to incorporate different data and/or research results for meta-analysis. The method was used to simultaneously analyse the whole genome sequencing and genome-wide association studies data of Genetic Analysis Workshop 18 for detecting true genetic variants. The results show that the goodness-of-fit test is comparable to or better than the influential sequence kernel association test in many cases.
Affiliation(s)
- Li Yang
- Department of Mathematical Sciences, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA 01609-2280, USA
| | - Jing Xuan
- Department of Mathematical Sciences, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA 01609-2280, USA
| | - Zheyang Wu
- Department of Mathematical Sciences, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA 01609-2280, USA
| |
|
1477
|
Abstract
Under the premise that multiple causal variants exist within a disease gene and that we are underpowered to detect these variants individually, a variety of methods have been developed that attempt to cluster rare variants within a gene so that the variants may gather strength from one another. These methods group variants by gene or proximity, and test one gene or marker window at a time. We propose analyzing all genes simultaneously with a penalized regression method that enables grouping of all (rare and common) variants within a gene while subgrouping rare variants, thus borrowing strength from both rare and common variants within the same gene. We apply this approach using a burden based weighting of the rare variants to the Genetic Analysis Workshop 18 data.
Affiliation(s)
- Kristin L Ayers
- Institute of Genetic Medicine, Newcastle University, Central Parkway, Newcastle Upon Tyne, NE1 3BZ, UK
| | - Heather J Cordell
- Institute of Genetic Medicine, Newcastle University, Central Parkway, Newcastle Upon Tyne, NE1 3BZ, UK
| |
|
1478
|
Feng T, Zhu X. Whole genome sequencing data from pedigrees suggests linkage disequilibrium among rare variants created by population admixture. BMC Proc 2014; 8:S44. [PMID: 25519326 PMCID: PMC4143626 DOI: 10.1186/1753-6561-8-s1-s44] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/21/2022] Open
Abstract
Next-generation sequencing technologies have been designed to discover rare and de novo variants and are an important tool for identifying rare disease variants. Many statistical methods have been developed to test, using next-generation sequencing data, for rare variants that are associated with a trait. However, many of these methods make assumptions that rare variants are in linkage equilibrium in a gene. In this report, we studied whether transmitted or untransmitted haplotypes carry an excess of rare variants using the whole genome sequencing data of 15 large Mexican American pedigrees provided by the Genetic Analysis Workshop 18. We observed that an excess of rare variants are carried on either transmitted or nontransmitted haplotypes from parents to offspring. Further analyses suggest that such nonrandom associations among rare variants can be attributed to population admixture and single-nucleotide variant calling errors. Our results have significant implications for rare variant association studies, especially those conducted in admixed populations.
Affiliation(s)
- Tao Feng
- Department of Epidemiology and Biostatistics, Case Western Reserve University, Wolstein Research Building, 2103 Cornell Road, Cleveland, OH 44106, USA
| | - Xiaofeng Zhu
- Department of Epidemiology and Biostatistics, Case Western Reserve University, Wolstein Research Building, 2103 Cornell Road, Cleveland, OH 44106, USA
| |
|
1479
|
Abstract
Advances in next-generation sequencing technology have made it possible to comprehensively interrogate the entire spectrum of genomic variations including rare variants. They may help capture the remaining genetic heritability which has not been fully explained by previous genome-wide association studies. Here we performed a gene-based genome-wide scan to identify hypertension susceptibility loci in an analysis of a whole genome sequencing cohort of 103 unrelated individuals. We found that collapsing singletons may boost signals for associating rare variants and identified SETX as statistically significant at a genome-wide gene-based threshold (p value < 5.0 × 10⁻⁶). The function of SETX in hypertension may be worthy of further investigation.
Affiliation(s)
- Wei Wang
- Department of Computer Science, New Jersey Institute of Technology, University Heights Newark, New Jersey 07102, USA
| | - Zhi Wei
- Department of Computer Science, New Jersey Institute of Technology, University Heights Newark, New Jersey 07102, USA
| |
|
1480
|
Abstract
The kernel score statistic is a global covariance component test over a set of genetic markers. It provides a flexible modeling framework and does not collapse marker information. We generalize the kernel score statistic to allow for familial dependencies and to adjust for random confounder effects. With this extension, we adjust our analysis of real and simulated baseline systolic blood pressure for polygenic familial background. We find that the kernel score test gains appreciably in power through the use of sequencing compared to tag-single-nucleotide polymorphisms for very rare single nucleotide polymorphisms with <1% minor allele frequency.
Affiliation(s)
- Dörthe Malzahn
- Department of Genetic Epidemiology, University Medical Center, Georg-August University Göttingen, Humboldtallee 32, 37073 Göttingen, Germany
| | - Stefanie Friedrichs
- Department of Genetic Epidemiology, University Medical Center, Georg-August University Göttingen, Humboldtallee 32, 37073 Göttingen, Germany
| | - Albert Rosenberger
- Department of Genetic Epidemiology, University Medical Center, Georg-August University Göttingen, Humboldtallee 32, 37073 Göttingen, Germany
| | - Heike Bickeböller
- Department of Genetic Epidemiology, University Medical Center, Georg-August University Göttingen, Humboldtallee 32, 37073 Göttingen, Germany
| |
|
1481
|
Turkmen AS, Lin S. Identifying rare variant associations in population-based and family-based designs. BMC Proc 2014; 8:S58. [PMID: 25519393 PMCID: PMC4143803 DOI: 10.1186/1753-6561-8-s1-s58] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/18/2022] Open
Abstract
For almost all complex traits studied in humans, the identified genetic variants discovered to date have accounted for only a small portion of the estimated trait heritability. Consequently, several methods have been developed to identify rare single-nucleotide variants associated with complex traits for population-based designs. Because rare disease variants tend to be enriched in families containing multiple affected individuals, family-based designs can play an important role in the identification of rare causal variants. In this study, we utilize Genetic Analysis Workshop 18 simulated data to examine the performance of some existing rare variant identification methods for unrelated individuals, including our recent method (rPLS). The simulated data is used to investigate whether there is an advantage to using family data compared to case-control data. The results indicate that population-based methods suffer from power loss, especially when the sample size is small. The family-based method employed in this paper results in higher power but fails to control type I error. Our study also highlights the importance of the phenotype choice, which can affect the power of detecting causal genes substantially.
Affiliation(s)
- Asuman S Turkmen
- Department of Statistics, The Ohio State University, 1179 University Drive, Newark, OH 43055, USA
| | - Shili Lin
- Department of Statistics, The Ohio State University, 1958 Neil Avenue, Columbus, OH 43210, USA
| |
|
1482
|
Almeida M, Peralta JM, Farook V, Puppala S, Kent JW, Duggirala R, Blangero J. Pedigree-based random effect tests to screen gene pathways. BMC Proc 2014; 8:S100. [PMID: 25519354 PMCID: PMC4143680 DOI: 10.1186/1753-6561-8-s1-s100] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 12/02/2022]
Abstract
The new generation of sequencing platforms opens new horizons in the genetics field. It is possible to exhaustively assay all genetic variants in an individual and search for phenotypic associations. The whole genome sequencing approach, when applied to a large human sample like the San Antonio Family Study, detects a very large number (>25 million) of single nucleotide variants along with other more complex variants. The analytical challenges imposed by this number of variants are formidable, suggesting that methods are needed to reduce the overall number of statistical tests. In this study, we develop a single degree-of-freedom test of variants in a gene pathway employing a random effect model that uses an empirical pathway-specific genetic relationship matrix as the focal covariance kernel. The empirical pathway-specific genetic relationship uses all variants (or a chosen subset) from gene members of a given biological pathway. Using SOLAR's pedigree-based variance components modeling, which also allows for arbitrary fixed effects, such as principal components, to deal with latent population structure, we employ a likelihood ratio test of the pathway-specific genetic relationship matrix model. We examine all gene pathways in the KEGG database using our method in the first replicate of the Genetic Analysis Workshop 18 simulation of systolic blood pressure. Our random effect approach was able to detect true association signals in causal gene pathways. Those pathways could easily be further dissected by independent analysis of all markers.
Affiliation(s)
- Marcio Almeida
- Department of Genetics, Texas Biomedical Research Institute. 7620 NW Loop 410, San Antonio, TX 78245, USA
| | - Juan M Peralta
- Department of Genetics, Texas Biomedical Research Institute. 7620 NW Loop 410, San Antonio, TX 78245, USA; Centre for Genetic Epidemiology and Biostatistics, University of Western Australia, WA, Australia
| | - Vidya Farook
- Department of Genetics, Texas Biomedical Research Institute. 7620 NW Loop 410, San Antonio, TX 78245, USA
| | - Sobha Puppala
- Department of Genetics, Texas Biomedical Research Institute. 7620 NW Loop 410, San Antonio, TX 78245, USA
| | - John W Kent
- Department of Genetics, Texas Biomedical Research Institute. 7620 NW Loop 410, San Antonio, TX 78245, USA
| | - Ravindranath Duggirala
- Department of Genetics, Texas Biomedical Research Institute. 7620 NW Loop 410, San Antonio, TX 78245, USA
| | - John Blangero
- Department of Genetics, Texas Biomedical Research Institute. 7620 NW Loop 410, San Antonio, TX 78245, USA
| |
|
1483
|
Johnston I, Carvalho LE. A Bayesian hierarchical gene model on latent genotypes for genome-wide association studies. BMC Proc 2014; 8:S45. [PMID: 25519327 PMCID: PMC4143727 DOI: 10.1186/1753-6561-8-s1-s45] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022]
Abstract
The primary goal of genome-wide association studies is to determine which genetic markers are associated with genetic traits, most commonly human diseases. As a result of the "large p, small n" nature of genome-wide association study data sets, and especially because of the collinearity due to linkage disequilibrium, multivariate regression results in an ill-posed problem. To overcome these obstacles, we propose preprocessing single-nucleotide polymorphisms to adjust for linkage disequilibrium, and a novel Bayesian statistical model that exploits a hierarchical structure between single-nucleotide polymorphisms and genes. We obtain posterior samples using a hybrid Metropolis-within-Gibbs sampler, and further conduct inference on single-nucleotide polymorphism and gene associations using centroid estimation. Finally, we illustrate the proposed model and estimation procedure and discuss results obtained on the data provided for the Genetic Analysis Workshop 18.
Affiliation(s)
- Ian Johnston
- Mathematics and Statistics Department, Boston University, 111 Cummington Mall, Boston, MA 02215, USA
| | - Luis E Carvalho
- Mathematics and Statistics Department, Boston University, 111 Cummington Mall, Boston, MA 02215, USA
| |
|
1484
|
Fan R, Huang CH, Hu I, Wang H, Zheng T, Lo SH. A partition-based approach to identify gene-environment interactions in genome wide association studies. BMC Proc 2014; 8:S60. [PMID: 25519395 PMCID: PMC4143762 DOI: 10.1186/1753-6561-8-s1-s60] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022]
Abstract
It is believed that almost all common diseases are the consequence of complex interactions between genetic markers and environmental factors. However, few such interactions have been documented to date. Conventional statistical methods for detecting gene and environmental interactions are often based on the linear regression model, which assumes a linear interaction effect. In this study, we propose a nonparametric partition-based approach that is able to capture complex interaction patterns. We apply this method to the real data set of hypertension provided by Genetic Analysis Workshop 18. Compared with the linear regression model, the proposed approach is able to identify many additional variants with significant gene-environmental interaction effects. We further investigate one single-nucleotide polymorphism identified by our method and show that its gene-environmental interaction effect is, indeed, nonlinear. To adjust for the family dependence of phenotypes, we apply different permutation strategies and investigate their effects on the outcomes.
Affiliation(s)
- Ruixue Fan
- Department of Statistics, Columbia University, 1255 Amsterdam Avenue, 10th Floor, New York, NY 10027, USA
| | - Chien-Hsun Huang
- Department of Statistics, Columbia University, 1255 Amsterdam Avenue, 10th Floor, New York, NY 10027, USA
| | - Inchi Hu
- ISOM, Hong Kong University of Science and Technology, Hong Kong
| | - Haitian Wang
- Division of Biostatistics, School of Public Health and Primary Care, The Chinese University of Hong Kong, Hong Kong
| | - Tian Zheng
- Department of Statistics, Columbia University, 1255 Amsterdam Avenue, 10th Floor, New York, NY 10027, USA
| | - Shaw-Hwa Lo
- Department of Statistics, Columbia University, 1255 Amsterdam Avenue, 10th Floor, New York, NY 10027, USA
| |
|
1485
|
Hu Y, Hui Q, Sun YV. Association analysis of whole genome sequencing data accounting for longitudinal and family designs. BMC Proc 2014; 8:S89. [PMID: 25519416 PMCID: PMC4143808 DOI: 10.1186/1753-6561-8-s1-s89] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/10/2022]
Abstract
Using the whole genome sequencing data and the simulated longitudinal phenotypes for 849 pedigree-based individuals from Genetic Analysis Workshop 18, we investigated various approaches to detecting the association of rare and common variants with blood pressure traits. We compared three strategies for longitudinal data: (a) using the baseline measurement only, (b) using the average from multiple visits, and (c) using all individual measurements. We also compared the power of using all of the pedigree-based data and the unrelated subset. The analyses were performed without knowledge of the underlying simulating model.
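The three longitudinal strategies named in this abstract can be illustrated with a minimal Python sketch. The function name, the input layout (a dictionary of per-individual visit lists) and the blood pressure values are hypothetical and are not taken from the paper; the sketch only shows how the phenotype would be prepared under each strategy, not the association model itself.

```python
from statistics import mean


def derive_phenotypes(visits):
    """Prepare phenotypes for strategies (a), (b) and (c).

    visits: dict mapping an individual ID to a list of measurements
    ordered by visit, so the first element is the baseline value.
    """
    # (a) baseline measurement only
    baseline = {pid: vals[0] for pid, vals in visits.items() if vals}
    # (b) average over all available visits
    average = {pid: mean(vals) for pid, vals in visits.items() if vals}
    # (c) all individual measurements, one (id, visit number, value) row each
    long_format = [
        (pid, visit_no, value)
        for pid, vals in visits.items()
        for visit_no, value in enumerate(vals, start=1)
    ]
    return baseline, average, long_format


# Hypothetical systolic blood pressure readings, for illustration only.
example_visits = {"ind1": [120.0, 126.0, 123.0], "ind2": [135.0, 131.0]}
baseline, average, long_format = derive_phenotypes(example_visits)
```

Strategy (c) yields repeated rows per individual, so any downstream test would need a model that accounts for within-individual correlation (and, for pedigree data, relatedness); that step is not shown here.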
Affiliation(s)
- Yijuan Hu
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, USA
| | - Qin Hui
- Department of Epidemiology, Emory University, Atlanta, GA, USA
| | - Yan V Sun
- Department of Epidemiology, Emory University, Atlanta, GA, USA; Department of Biomedical Informatics, Emory University, Atlanta, GA, USA; Center for Health Research, Kaiser Permanente Georgia, Atlanta, GA, USA
| |
|
1486
|
Zhou J, Tantoso E, Wong LP, Ong RTH, Bei JX, Li Y, Liu J, Khor CC, Teo YY. iCall: a genotype-calling algorithm for rare, low-frequency and common variants on the Illumina exome array. Bioinformatics 2014; 30:1714-20. [PMID: 24567545 DOI: 10.1093/bioinformatics/btu107] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/14/2022]
Abstract
MOTIVATION: Next-generation genotyping microarrays have been designed with insights from the 1000 Genomes Project and whole-exome sequencing studies. These arrays additionally include variants that are typically present at lower frequencies. Determining the genotypes of these variants from hybridization intensities is challenging because there is less support to locate the presence of the minor alleles when the allele counts are low. Existing algorithms are mainly designed for calling common variants and are notorious for failing to generate accurate calls for low-frequency and rare variants. Here, we introduce a new calling algorithm, iCall, to call genotypes for variants across the whole spectrum of allele frequencies. RESULTS: We benchmarked iCall against four of the most commonly used algorithms, GenCall, optiCall, illuminus and GenoSNP, as well as a post-processing caller zCall that adopted a two-stage calling design. Normalized hybridization intensities for 12 370 individuals genotyped on the Illumina HumanExome BeadChip were considered, of which 81 individuals were also whole-genome sequenced. The sequence calls were used to benchmark the accuracy of the genotype calling, and our comparisons indicated that iCall outperforms all four single-stage calling algorithms in terms of call rates and concordance, particularly in the calling accuracy of minor alleles, which is the principal concern for rare and low-frequency variants. The application of zCall to post-process the output from iCall also produced marginally improved performance compared with the combination of zCall and GenCall. AVAILABILITY AND IMPLEMENTATION: iCall is implemented in C++ for use on Linux operating systems and is available for download at http://www.statgen.nus.edu.sg/~software/icall.html.
Affiliation(s)
- Jin Zhou
- Department of Statistics and Applied Probability, Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Genome Institute of Singapore, Singapore, NUS Graduate School for Integrative Science and Engineering, National University of Singapore, Singapore and Life Sciences Institute, National University of Singapore, Singapore
| | - Erwin Tantoso
- Department of Statistics and Applied Probability, Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Genome Institute of Singapore, Singapore, NUS Graduate School for Integrative Science and Engineering, National University of Singapore, Singapore and Life Sciences Institute, National University of Singapore, Singapore
| | - Lai-Ping Wong
- Department of Statistics and Applied Probability, Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Genome Institute of Singapore, Singapore, NUS Graduate School for Integrative Science and Engineering, National University of Singapore, Singapore and Life Sciences Institute, National University of Singapore, Singapore
| | - Rick Twee-Hee Ong
- Department of Statistics and Applied Probability, Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Genome Institute of Singapore, Singapore, NUS Graduate School for Integrative Science and Engineering, National University of Singapore, Singapore and Life Sciences Institute, National University of Singapore, Singapore
| | - Jin-Xin Bei
- Department of Statistics and Applied Probability, Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Genome Institute of Singapore, Singapore, NUS Graduate School for Integrative Science and Engineering, National University of Singapore, Singapore and Life Sciences Institute, National University of Singapore, Singapore
| | - Yi Li
- Department of Statistics and Applied Probability, Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Genome Institute of Singapore, Singapore, NUS Graduate School for Integrative Science and Engineering, National University of Singapore, Singapore and Life Sciences Institute, National University of Singapore, Singapore
| | - Jianjun Liu
- Department of Statistics and Applied Probability, Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Genome Institute of Singapore, Singapore, NUS Graduate School for Integrative Science and Engineering, National University of Singapore, Singapore and Life Sciences Institute, National University of Singapore, Singapore
| | - Chiea-Chuen Khor
- Department of Statistics and Applied Probability, Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Genome Institute of Singapore, Singapore, NUS Graduate School for Integrative Science and Engineering, National University of Singapore, Singapore and Life Sciences Institute, National University of Singapore, Singapore
| | - Yik-Ying Teo
- Department of Statistics and Applied Probability, Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Genome Institute of Singapore, Singapore, NUS Graduate School for Integrative Science and Engineering, National University of Singapore, Singapore and Life Sciences Institute, National University of Singapore, Singapore
| |
|
1487
|
Lee S, Kim JY, Hwang J, Kim S, Lee JH, Han DH. Investigation of pathogenic genes in peri-implantitis from implant clustering failure patients: a whole-exome sequencing pilot study. PLoS One 2014; 9:e99360. [PMID: 24921256 PMCID: PMC4055653 DOI: 10.1371/journal.pone.0099360] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 02/25/2014] [Accepted: 05/13/2014] [Indexed: 01/21/2023]
Abstract
Peri-implantitis is a frequently occurring gum disease linked to multi-factorial traits with various environmental and genetic causalities and no known concrete pathogenesis. The varying severity of peri-implantitis among patients with relatively similar environments suggests a genetic aspect which needs to be investigated to understand and regulate the pathogenesis of the disease. Six unrelated individuals with clustered implant failures due to severe peri-implantitis were chosen for this study. These six individuals had relatively healthy lifestyles, with minimal environmental causalities affecting peri-implantitis. Research was undertaken to investigate pathogenic genes in peri-implantitis albeit with a small number of subjects and incomplete elimination of environmental causalities. Whole-exome sequencing was performed on saliva samples collected with a self-collection DNA kit. Common variants with minor allele frequencies (MAF) ≥ 0.05 from all control datasets were eliminated and variants having high and moderate impact and loss of function were used for comparison. Gene set enrichment analysis was performed to reveal functional groups associated with the genetic variants. After filtering against dbSNP, the 1000 Genomes East Asian population, and healthy Korean randomized subsample data (GSK project), 2,022 genes remained. Of 927 gene sets, 175 (p-value < 0.05) were obtained via GSEA (DAVID). The top 10 were chosen (p-value < 0.05) from cluster enrichment, showing significance of cytoskeleton, cell adhesion, and metal ion binding. Network analysis was applied to find relationships between functional clusters. Among the functional groups, metal ion binding was located in the center of all clusters, indicating that dysfunction in the regulation of metal ion concentration might affect cell morphology or cell adhesion, resulting in implant failure.
This result may demonstrate the feasibility of and provide pilot data for a larger research project aimed at discovering biomarkers for early diagnosis of peri-implantitis.
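The common-variant filtering step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name, the variant IDs and the MAF values are hypothetical, and the sketch assumes that a variant is dropped when its MAF is at or above the threshold in any control dataset that reports it (the abstract does not say whether "all control datasets" means any or every dataset).

```python
def drop_common_variants(candidate_ids, control_mafs, threshold=0.05):
    """Keep candidate variants that are not common in any control dataset.

    candidate_ids: iterable of variant identifiers found in the patients.
    control_mafs: dict mapping a variant ID to {dataset name: MAF}.
    A variant missing from every control dataset is kept (assumption).
    """
    kept = []
    for variant_id in candidate_ids:
        mafs = control_mafs.get(variant_id, {})
        # all() over an empty dict is True, so unreported variants are kept.
        if all(maf < threshold for maf in mafs.values()):
            kept.append(variant_id)
    return kept


# Hypothetical variant IDs and frequencies, for illustration only.
candidates = ["var1", "var2", "var3", "var4"]
controls = {
    "var1": {"dbSNP": 0.20, "1000G_EAS": 0.18},   # common everywhere
    "var2": {"dbSNP": 0.01, "1000G_EAS": 0.07},   # common in one dataset
    "var3": {"dbSNP": 0.002, "1000G_EAS": 0.004}, # rare in both
}
rare_candidates = drop_common_variants(candidates, controls)
```

Choosing the stricter "any dataset" rule removes more variants; switching `all` to a rule that requires a variant to be common in every dataset would retain `var2` in this example.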
Affiliation(s)
- Soohyung Lee
- Department of Prosthodontics, Oral Science Research Center, College of Dentistry, Yonsei University, Seoul, Korea
| | - Ji-Young Kim
- Department of Prosthodontics, Oral Science Research Center, College of Dentistry, Yonsei University, Seoul, Korea
| | - Jihye Hwang
- Department of IT Convergence and Engineering, Pohang University of Science and Technology, Pohang, Korea
| | - Sanguk Kim
- Department of IT Convergence and Engineering, Pohang University of Science and Technology, Pohang, Korea
| | - Jae-Hoon Lee
- Department of Prosthodontics, Oral Science Research Center, College of Dentistry, Yonsei University, Seoul, Korea
- * E-mail: (JHL); (DHH)
| | - Dong-Hoo Han
- Department of Prosthodontics, Oral Science Research Center, College of Dentistry, Yonsei University, Seoul, Korea
- * E-mail: (JHL); (DHH)
| |
|
1488
|
Moutsianas L, Morris AP. Methodology for the analysis of rare genetic variation in genome-wide association and re-sequencing studies of complex human traits. Brief Funct Genomics 2014; 13:362-70. [PMID: 24916163 PMCID: PMC4168660 DOI: 10.1093/bfgp/elu012] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 12/14/2022]
Abstract
Genome-wide association studies have been successful in identifying common variants that impact complex human traits and diseases. However, despite this success, the joint effects of these variants explain only a small proportion of the genetic variance in these phenotypes, leading to speculation that rare genetic variation might account for much of the ‘missing heritability’. Consequently, there has been an exciting period of research and development into the methodology for the analysis of rare genetic variants, typically by considering their joint effects on complex traits within the same functional unit or genomic region. In this review, we describe a general framework for modelling the joint effects of rare genetic variants on complex traits in association studies of unrelated individuals. We summarise a range of widely used association tests that have been developed from this model and provide an overview of the relative performance of these approaches from published simulation studies.
|
1489
|
Li MJ, Yan B, Sham PC, Wang J. Exploring the function of genetic variants in the non-coding genomic regions: approaches for identifying human regulatory variants affecting gene expression. Brief Bioinform 2014; 16:393-412. [PMID: 24916300 DOI: 10.1093/bib/bbu018] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.1] [Received: 01/22/2014] [Accepted: 04/23/2014] [Indexed: 12/13/2022]
Abstract
Understanding the genetic basis of human traits/diseases and the underlying mechanisms of how these traits/diseases are affected by genetic variations is critical for public health. Current genome-wide functional genomics data have uncovered a large number of functional elements in the noncoding regions of the human genome, providing new opportunities to study regulatory variants (RVs). RVs play important roles in transcription factor binding, chromatin states and epigenetic modifications. Here, we systematically review an array of methods currently used to map RVs as well as the computational approaches for annotating and interpreting their regulatory effects, with emphasis on regulatory single-nucleotide polymorphisms. We also briefly introduce experimental methods to validate these functional RVs.
|
1490
|
Svishcheva GR, Belonogova NM, Axenovich TI. FFBSKAT: fast family-based sequence kernel association test. PLoS One 2014; 9:e99407. [PMID: 24905468 PMCID: PMC4048315 DOI: 10.1371/journal.pone.0099407] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 02/16/2014] [Accepted: 05/14/2014] [Indexed: 11/28/2022]
Abstract
The kernel machine-based regression is an efficient approach to region-based association analysis aimed at identification of rare genetic variants. However, this method is computationally complex. The running time of kernel-based association analysis becomes especially long for samples with genetic (sub)structures, thus increasing the need to develop new and effective methods, algorithms, and software packages. We have developed a new R package called fast family-based sequence kernel association test (FFBSKAT) for analysis of quantitative traits in samples of related individuals. This software implements a score-based variance component test to assess the association of a given set of single nucleotide polymorphisms with a continuous phenotype. We compared the performance of our software with that of two existing software packages for family-based sequence kernel association testing, namely, ASKAT and famSKAT, using the Genetic Analysis Workshop 17 family sample. Results demonstrate that FFBSKAT is several times faster than other available programs. In addition, the calculations of the three compared packages were similarly accurate. With respect to the available analysis modes, we combined the advantages of both ASKAT and famSKAT and added new options to empower FFBSKAT users. The FFBSKAT package is fast, user-friendly, and provides an easy-to-use method to perform whole-exome kernel machine-based regression association analysis of quantitative traits in samples of related individuals. The FFBSKAT package, along with its manual, is available for free download at http://mga.bionet.nsc.ru/soft/FFBSKAT/.
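To make the idea of a score-based variance component statistic concrete, the sketch below computes a SKAT-type statistic Q = Σ_j w_j S_j², where S_j = Σ_i g_ij (y_i − ȳ), for a continuous trait. It is a simplified illustration in Python, not the FFBSKAT implementation (which is an R package): it assumes unrelated individuals and no covariates, so it ignores the relatedness correction that is the point of family-based tests, and it does not compute a p-value (the null distribution of Q is a mixture of chi-square variables). The function name and the toy data are hypothetical.

```python
def skat_type_q(genotypes, phenotype, weights):
    """Weighted sum of squared per-variant score statistics.

    genotypes: list of rows, one per individual, each a list of minor-allele
               counts (0, 1 or 2) for the variants in the region.
    phenotype: list of continuous trait values, one per individual.
    weights:   list of non-negative per-variant weights w_j.
    """
    n = len(phenotype)
    trait_mean = sum(phenotype) / n
    # Residuals under the null model with an intercept only.
    residuals = [y - trait_mean for y in phenotype]
    q = 0.0
    for j, w in enumerate(weights):
        # Score for variant j: genotype-weighted sum of null residuals.
        score_j = sum(genotypes[i][j] * residuals[i] for i in range(n))
        q += w * score_j ** 2
    return q


# Toy data: 4 individuals, 2 variants (illustration only).
G = [[0, 1], [1, 0], [2, 0], [0, 0]]
y = [1.0, 2.0, 3.0, 2.0]
q_equal = skat_type_q(G, y, [1.0, 1.0])
q_weighted = skat_type_q(G, y, [1.0, 2.0])
```

Because each score is squared before summing, variants with opposite directions of effect do not cancel, which is the usual motivation for this family of tests over simple burden sums.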
Affiliation(s)
- Gulnara R. Svishcheva
- Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
| | - Nadezhda M. Belonogova
- Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
| | - Tatiana I. Axenovich
- Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- Novosibirsk State University, Novosibirsk, Russia
| |
|
1491
|
Giorgi EE, Stram DO, Taverna D, Turner SD, Schumacher F, Haiman CA, Lum-Jones A, Tirikainen M, Caberto C, Duggan D, Henderson BE, Le Marchand L, Cheng I. Fine-mapping IGF1 and prostate cancer risk in African Americans: the multiethnic cohort study. Cancer Epidemiol Biomarkers Prev 2014; 23:1928-32. [PMID: 24904019 DOI: 10.1158/1055-9965.epi-14-0333] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]
Abstract
Genetic variation at insulin-like growth factor 1 (IGF1) has been linked to prostate cancer risk. However, the specific predisposing variants have not been identified. In this study, we fine-mapped the IGF1 locus for prostate cancer risk in African Americans. We conducted targeted Roche GS-Junior 454 resequencing of a 156-kb region of IGF1 in 80 African American aggressive prostate cancer cases. Three hundred and thirty-four IGF1 SNPs were examined for their association with prostate cancer risk in 1,000 African American prostate cancer cases and 991 controls. The top associated SNP in African Americans, rs148371593, was examined in an additional 3,465 prostate cancer cases and 3,425 controls of non-African American ancestry (European Americans, Japanese Americans, Latinos, and Native Hawaiians). The overall association of 334 IGF1 SNPs and prostate cancer risk was assessed using logistic kernel-machine methods. The association between each SNP and prostate cancer risk was evaluated through unconditional logistic regression. A false discovery rate threshold of q < 0.1 was used to determine statistical significance of associations. We identified 8 novel IGF1 SNPs. The cumulative effect of the 334 IGF1 SNPs was not associated with prostate cancer risk (P = 0.13) in African Americans. Twenty SNPs were nominally associated with prostate cancer at P < 0.05. The top associated SNP among African Americans, rs148371593 [minor allele frequency (MAF) = 0.03; P = 0.0014; q > 0.1], did not reach our criterion of statistical significance. This polymorphism was rare in non-African Americans (MAF < 0.003) and was not associated with prostate cancer risk (P = 0.98). Our findings do not support a role for IGF1 variants in prostate cancer risk among African Americans.
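The q < 0.1 criterion mentioned in this abstract can be illustrated with a short Python sketch. The abstract does not state which false discovery rate procedure was used; the sketch assumes the Benjamini-Hochberg step-up adjustment, and the function name and p-values are hypothetical rather than the study's data.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum of
    # p * m / rank over its own rank and all larger ranks (monotonicity).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


# Hypothetical p-values for five SNPs, for illustration only.
p = [0.0014, 0.02, 0.03, 0.2, 0.5]
q = benjamini_hochberg(p)
significant = [q_i < 0.1 for q_i in q]
```

The adjustment depends on the total number of tests, so a p-value that passes with five tests may fail when several hundred SNPs are tested together.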
Affiliation(s)
- Elena E Giorgi
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico
| | - Daniel O Stram
- Department of Preventive Medicine, Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, California
| | - Darin Taverna
- Division of Genetic Basis of Human Disease, Translational Genomics Research Institute, Phoenix, Arizona. Systems Imagination Inc., Phoenix, Arizona
| | - Stephen D Turner
- School of Medicine, University of Virginia, Charlottesville, Virginia
| | - Fredrick Schumacher
- Department of Preventive Medicine, Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, California
| | - Christopher A Haiman
- Department of Preventive Medicine, Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, California
| | - Annette Lum-Jones
- Epidemiology Program, University of Hawaii Cancer Center, University of Hawaii, Honolulu, Hawaii
| | - Maarit Tirikainen
- Epidemiology Program, University of Hawaii Cancer Center, University of Hawaii, Honolulu, Hawaii
| | - Christian Caberto
- Epidemiology Program, University of Hawaii Cancer Center, University of Hawaii, Honolulu, Hawaii
| | - David Duggan
- Division of Genetic Basis of Human Disease, Translational Genomics Research Institute, Phoenix, Arizona
| | - Brian E Henderson
- Department of Preventive Medicine, Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, California
| | - Loic Le Marchand
- Epidemiology Program, University of Hawaii Cancer Center, University of Hawaii, Honolulu, Hawaii
| | - Iona Cheng
- Cancer Prevention Institute of California, Fremont, California
| |
|
1492
|
Utilizing population controls in rare-variant case-parent association tests. Am J Hum Genet 2014; 94:845-53. [PMID: 24836453 DOI: 10.1016/j.ajhg.2014.04.014] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 12/31/2013] [Accepted: 04/24/2014] [Indexed: 01/10/2023]
Abstract
There is great interest in detecting associations between human traits and rare genetic variation. To address the low power implicit in single-locus tests of rare genetic variants, many rare-variant association approaches attempt to accumulate information across a gene, often by taking linear combinations of single-locus contributions to a statistic. Using the right linear combination is key: an optimal test will up-weight true causal variants, down-weight neutral variants, and correctly assign the direction of effect for causal variants. Here, we propose a procedure that exploits data from population controls to estimate the linear combination to be used in a case-parent trio rare-variant association test. Specifically, we estimate the linear combination by comparing population control allele frequencies with allele frequencies in the parents of affected offspring. These estimates are then used to construct a rare-variant transmission disequilibrium test (rvTDT) in the case-parent data. Because the rvTDT is conditional on the parents' data, using parental data in estimating the linear combination does not affect the validity or asymptotic distribution of the rvTDT. By using simulation, we show that our new population-control-based rvTDT can dramatically improve power over rvTDTs that do not use population control information across a wide variety of genetic architectures. It also remains valid under population stratification. We apply the approach to a cohort of epileptic encephalopathy (EE) trios and find that dominant (or additive) inherited rare variants are unlikely to play a substantial role within EE genes previously identified through de novo mutation studies.
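The role of the linear combination can be illustrated with a generic weighted transmission statistic. The Python sketch below is not the authors' rvTDT: it assumes the variants are independent, takes the weights as given (in the paper they are estimated from population controls and parental allele frequencies, which is not reproduced here), and uses hypothetical counts. For each variant, b_j and c_j are the numbers of minor-allele transmissions and non-transmissions from heterozygous parents; under the null each transmission is a fair coin flip, so Var(b_j − c_j) = b_j + c_j.

```python
import math


def weighted_tdt_z(transmitted, untransmitted, weights):
    """Z-score for a weighted sum of per-variant (b_j - c_j) differences.

    transmitted / untransmitted: per-variant counts of minor alleles that
    heterozygous parents did / did not pass to affected offspring.
    weights: per-variant weights; the sign encodes the assumed direction
    of effect. Variants are assumed independent (simplification).
    """
    rows = list(zip(weights, transmitted, untransmitted))
    numerator = sum(w * (b - c) for w, b, c in rows)
    # Null variance of the weighted sum, conditional on b_j + c_j.
    variance = sum(w * w * (b + c) for w, b, c in rows)
    if variance == 0:
        raise ValueError("no informative transmissions")
    return numerator / math.sqrt(variance)


# Hypothetical counts for three variants; the third is under-transmitted.
b = [6, 4, 1]
c = [2, 1, 3]
z_unsigned = weighted_tdt_z(b, c, [1.0, 1.0, 1.0])
z_signed = weighted_tdt_z(b, c, [1.0, 1.0, -1.0])
```

In this toy example, assigning a negative weight to the under-transmitted variant raises the statistic from about 1.21 to about 2.18, which is the effect of getting the direction of each variant right. The weights must not be chosen from the transmission counts themselves, or the null distribution no longer holds.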
|
1493
|
Jin SC, Benitez BA, Karch CM, Cooper B, Skorupa T, Carrell D, Norton JB, Hsu S, Harari O, Cai Y, Bertelsen S, Goate AM, Cruchaga C. Coding variants in TREM2 increase risk for Alzheimer's disease. Hum Mol Genet 2014; 23:5838-46. [PMID: 24899047 DOI: 10.1093/hmg/ddu277] [Citation(s) in RCA: 241] [Impact Index Per Article: 21.9] [Indexed: 11/14/2022]
Abstract
The triggering receptor expressed on myeloid cells 2 (TREM2) is an immune phagocytic receptor expressed on brain microglia known to trigger phagocytosis and regulate the inflammatory response. Homozygous mutations in TREM2 cause Nasu-Hakola disease, a rare recessive form of dementia. A heterozygous TREM2 variant, p.R47H, was recently shown to increase Alzheimer's disease (AD) risk. We hypothesized that if TREM2 is truly an AD risk gene, there would be additional rare variants in TREM2 that substantially affect AD risk. To test this hypothesis, we performed pooled sequencing of TREM2 coding regions in 2082 AD cases and 1648 cognitively normal elderly controls of European American descent. We identified 16 non-synonymous variants, six of which were not identified in previous AD studies. Two variants, p.R47H [P = 9.17 × 10^-4, odds ratio (OR) = 2.63 (1.44-4.81)] and p.R62H [P = 2.36 × 10^-4, OR = 2.36 (1.47-3.80)], were significantly associated with disease risk in single-variant analyses. Gene-based tests demonstrate that variants in TREM2 are genome-wide significantly associated with AD [P_SKAT-O = 5.37 × 10^-7; OR = 2.55 (1.80-3.67)]. The association of TREM2 variants with AD is still highly significant after excluding p.R47H [P_SKAT-O = 7.72 × 10^-5; OR = 2.47 (1.62-3.87)], indicating that additional TREM2 variants affect AD risk. Genotyping in available family members of probands suggested that p.R47H (P = 4.65 × 10^-2) and p.R62H (P = 6.87 × 10^-3) were more frequently seen in AD cases versus controls within these families. Gel electrophoresis analysis confirms that at least three TREM2 transcripts are expressed in human brains, including one encoding a soluble form of TREM2.
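When appraising the odds ratios reported above, one quick consistency check is available if the bracketed ranges are 95% Wald intervals, symmetric on the log-odds scale. The abstract does not say how the intervals were computed, so that is an assumption here. Under it, the geometric mean of the two bounds should land close to the point estimate, and the standard error of log(OR) can be backed out from the interval width. The function name `wald_summary` is hypothetical; the numbers are the ones quoted in the abstract.

```python
import math


def wald_summary(ci_low, ci_high, z_crit=1.96):
    """Back out log-scale quantities from an odds-ratio interval.

    Assumption (not stated in the abstract): the interval is a 95% Wald
    interval, i.e. exp(log(OR) +/- z_crit * SE).
    """
    # Midpoint on the log scale, mapped back to the odds-ratio scale.
    geo_mean = math.sqrt(ci_low * ci_high)
    # Half-width on the log scale divided by the critical value gives the SE.
    se_log_or = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return geo_mean, se_log_or


# Odds ratios and interval bounds as quoted in the abstract.
reported = {
    "p.R47H": (2.63, 1.44, 4.81),
    "p.R62H": (2.36, 1.47, 3.80),
}

for variant, (or_hat, low, high) in reported.items():
    geo_mean, se = wald_summary(low, high)
    print(f"{variant}: reported OR={or_hat}, "
          f"sqrt(low*high)={geo_mean:.2f}, SE(log OR)={se:.2f}")
```

A large gap between the reported odds ratio and the geometric mean of its bounds would suggest a different interval method or a transcription error; it would not by itself say anything about the reported P values, which may come from a different test.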
Affiliation(s)
- Sheng Chih Jin
- Department of Psychiatry, Washington University School of Medicine, 660 S. Euclid Ave. B8134, St. Louis, MO 63110, USA
| | - Bruno A Benitez
- Department of Psychiatry, Washington University School of Medicine, 660 S. Euclid Ave. B8134, St. Louis, MO 63110, USA
| | - Celeste M Karch
- Department of Psychiatry, Washington University School of Medicine, 660 S. Euclid Ave. B8134, St. Louis, MO 63110, USA, Hope Center Program on Protein Aggregation and Neurodegeneration and
| | - Breanna Cooper
- Department of Psychiatry, Washington University School of Medicine, 660 S. Euclid Ave. B8134, St. Louis, MO 63110, USA
| | - Tara Skorupa
- Department of Psychiatry, Washington University School of Medicine, 660 S. Euclid Ave. B8134, St. Louis, MO 63110, USA
| | - David Carrell
- Department of Psychiatry, Washington University School of Medicine, 660 S. Euclid Ave. B8134, St. Louis, MO 63110, USA
| | - Joanne B Norton
- Department of Psychiatry, Washington University School of Medicine, 660 S. Euclid Ave. B8134, St. Louis, MO 63110, USA
| | - Simon Hsu
- Department of Psychiatry, Washington University School of Medicine, 660 S. Euclid Ave. B8134, St. Louis, MO 63110, USA
| | - Oscar Harari
- Department of Psychiatry, Washington University School of Medicine, 660 S. Euclid Ave. B8134, St. Louis, MO 63110, USA
| | - Yefei Cai
- Department of Psychiatry, Washington University School of Medicine, 660 S. Euclid Ave. B8134, St. Louis, MO 63110, USA
| | - Sarah Bertelsen
- Department of Psychiatry, Washington University School of Medicine, 660 S. Euclid Ave. B8134, St. Louis, MO 63110, USA
| | - Alison M Goate
- Department of Psychiatry, Washington University School of Medicine, 660 S. Euclid Ave. B8134, St. Louis, MO 63110, USA, Hope Center Program on Protein Aggregation and Neurodegeneration and Department of Neurology, Washington University School of Medicine, 660 S. Euclid Ave. B8111, St. Louis, MO 63110, USA, Joanne Knight Alzheimer's Disease Research Center, Washington University School of Medicine, 4488 Forest Park Ave., St. Louis, MO 63108, USA and Department of Genetics, Washington University School of Medicine, 660 S. Euclid Ave., St. Louis, MO 63110, USA
| | - Carlos Cruchaga
- Department of Psychiatry, Washington University School of Medicine, 660 S. Euclid Ave. B8134, St. Louis, MO 63110, USA, Hope Center Program on Protein Aggregation and Neurodegeneration and
| |
|
1494
|
Sham PC, Purcell SM. Statistical power and significance testing in large-scale genetic studies. Nat Rev Genet 2014; 15:335-46. [PMID: 24739678 DOI: 10.1038/nrg3706] [Citation(s) in RCA: 383] [Impact Index Per Article: 34.8] [Indexed: 12/13/2022]
Abstract
Significance testing was developed as an objective method for summarizing statistical evidence for a hypothesis. It has been widely adopted in genetic studies, including genome-wide association studies and, more recently, exome sequencing studies. However, significance testing in both genome-wide and exome-wide studies must adopt stringent significance thresholds to allow for multiple testing, and it is useful only when studies have adequate statistical power, which depends on the characteristics of the phenotype and the putative genetic variant, as well as the study design. Here, we review the principles and applications of significance testing and power calculation, including recently proposed gene-based tests for rare variants.
Affiliation(s)
- Pak C Sham
- Centre for Genomic Sciences, Jockey Club Building for Interdisciplinary Research; State Key Laboratory of Brain and Cognitive Sciences, and Department of Psychiatry, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - Shaun M Purcell
- [1] Center for Statistical Genetics, Icahn School of Medicine at Mount Sinai, New York 10029-6574, USA. [2] Center for Human Genetic Research, Massachusetts General Hospital and Harvard Medical School, Boston, Massachusetts 02114, USA
| |
|
1495
|
Feng S, Liu D, Zhan X, Wing MK, Abecasis GR. RAREMETAL: fast and powerful meta-analysis for rare variants. Bioinformatics 2014; 30:2828-9. [PMID: 24894501 PMCID: PMC4173011 DOI: 10.1093/bioinformatics/btu367] [Citation(s) in RCA: 91] [Impact Index Per Article: 8.3] [Indexed: 01/09/2023]
Abstract
Summary: RAREMETAL is a computationally efficient tool for meta-analysis of rare variants genotyped using sequencing or arrays. RAREMETAL facilitates analyses of individual studies, accommodates a variety of input file formats, handles related and unrelated individuals, executes both single variant and burden tests and performs conditional association analyses. Availability and implementation: http://genome.sph.umich.edu/wiki/RAREMETAL for executables, source code, documentation and tutorial. Contact: sfengsph@umich.edu or goncalo@umich.edu
Affiliation(s)
- Shuang Feng
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA
| | - Dajiang Liu
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA
| | - Xiaowei Zhan
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA
| | - Mary Kate Wing
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA
| | - Gonçalo R Abecasis
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, MI 48109, USA
| |
|
1496
|
Tzeng JY, Lu W, Hsu FC. Gene-level pharmacogenetic analysis on survival outcomes using gene-trait similarity regression. Ann Appl Stat 2014; 8:1232-1255. [PMID: 25018788 DOI: 10.1214/14-aoas735] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/19/2022]
Abstract
Gene/pathway-based methods are drawing significant attention due to their usefulness in detecting rare and common variants that affect disease susceptibility. The biological mechanism of drug responses indicates that a gene-based analysis has even greater potential in pharmacogenetics. Motivated by a study from the Vitamin Intervention for Stroke Prevention (VISP) trial, we develop a gene-trait similarity regression for survival analysis to assess the effect of a gene or pathway on time-to-event outcomes. The similarity regression has a general framework that covers a range of survival models, such as the proportional hazards model and the proportional odds model. The inference procedure developed under the proportional hazards model is robust against model misspecification. We derive the equivalence between the similarity survival regression and a random effects model, which further unifies the current variance-component based methods. We demonstrate the effectiveness of the proposed method through simulation studies. In addition, we apply the method to the VISP trial data to identify the genes that exhibit an association with the risk of a recurrent stroke. TCN2 gene was found to be associated with the recurrent stroke risk in the low-dose arm. This gene may impact recurrent stroke risk in response to cofactor therapy.
Affiliation(s)
- Jung-Ying Tzeng
- North Carolina State University ; National Cheng-Kung University
| | | | | |
|
1497
|
Lin H, Wang M, Brody JA, Bis JC, Dupuis J, Lumley T, McKnight B, Rice KM, Sitlani CM, Reid JG, Bressler J, Liu X, Davis BC, Johnson AD, O'Donnell CJ, Kovar CL, Dinh H, Wu Y, Newsham I, Chen H, Broka A, DeStefano AL, Gupta M, Lunetta KL, Liu CT, White CC, Xing C, Zhou Y, Benjamin EJ, Schnabel RB, Heckbert SR, Psaty BM, Muzny DM, Cupples LA, Morrison AC, Boerwinkle E. Strategies to design and analyze targeted sequencing data: Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium Targeted Sequencing Study. Circ Cardiovasc Genet 2014; 7:335-43. [PMID: 24951659 PMCID: PMC4176824 DOI: 10.1161/circgenetics.113.000350] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 12/28/2022]
Abstract
BACKGROUND: Genome-wide association studies have identified thousands of genetic variants that influence a variety of diseases and health-related quantitative traits. However, the causal variants underlying the majority of genetic associations remain unknown. The Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium Targeted Sequencing Study aims to follow up genome-wide association study signals and identify novel associations of the allelic spectrum of identified variants with cardiovascular-related traits. METHODS AND RESULTS: The study included 4231 participants from 3 CHARGE cohorts: the Atherosclerosis Risk in Communities Study, the Cardiovascular Health Study, and the Framingham Heart Study. We used a case-cohort design in which we selected both a random sample of participants and participants with extreme phenotypes for each of 14 traits. We sequenced and analyzed 77 genomic loci, which had previously been associated with ≥1 of 14 phenotypes. A total of 52 736 variants were characterized by sequencing and passed our stringent quality control criteria. For common variants (minor allele frequency ≥1%), we performed unweighted regression analyses to obtain P values for associations and weighted regression analyses to obtain effect estimates that accounted for the sampling design. For rare variants, we applied 2 approaches: collapsed aggregate statistics and joint analysis of variants using the sequence kernel association test. CONCLUSIONS: We sequenced 77 genomic loci in participants from 3 cohorts. We established a set of filters to identify high-quality variants and implemented statistical and bioinformatics strategies to analyze the sequence data and identify potentially functional variants within genome-wide association study loci.
Affiliation(s)
- Honghuang Lin
- Department of Medicine, Boston University School of Medicine, Boston
- The NHLBI’s Framingham Heart Study, Framingham, MA
| | - Min Wang
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX
| | - Jennifer A. Brody
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA
| | - Joshua C. Bis
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA
| | - Josée Dupuis
- The NHLBI’s Framingham Heart Study, Framingham, MA
- Department of Biostatistics, Boston University School of Public Health, Boston, MA
| | - Thomas Lumley
- Department of Statistics, University of Auckland, New Zealand
| | - Barbara McKnight
- Department of Biostatistics, University of Washington, Seattle, WA
| | - Kenneth M. Rice
- Department of Biostatistics, University of Washington, Seattle, WA
| | - Colleen M. Sitlani
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA
| | - Jeffrey G. Reid
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX
| | - Jan Bressler
- Human Genetics Center, University of Texas Health Science Center at Houston, Houston, TX
| | - Xiaoming Liu
- Human Genetics Center, University of Texas Health Science Center at Houston, Houston, TX
| | - Brian C. Davis
- Human Genetics Center, University of Texas Health Science Center at Houston, Houston, TX
| | | | | | - Christie L. Kovar
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX
| | - Huyen Dinh
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX
| | - Yuanqing Wu
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX
| | - Irene Newsham
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX
| | - Han Chen
- Department of Biostatistics, Boston University School of Public Health, Boston, MA
| | - Andi Broka
- LinGA Computing Resource, Boston University, Boston, MA
| | - Anita L. DeStefano
- The NHLBI’s Framingham Heart Study, Framingham, MA
- Department of Biostatistics, Boston University School of Public Health, Boston, MA
| | - Mayetri Gupta
- Department of Biostatistics, Boston University School of Public Health, Boston, MA
| | - Kathryn L. Lunetta
- The NHLBI’s Framingham Heart Study, Framingham, MA
- Department of Biostatistics, Boston University School of Public Health, Boston, MA
| | - Ching-Ti Liu
- Department of Biostatistics, Boston University School of Public Health, Boston, MA
| | - Charles C. White
- Department of Biostatistics, Boston University School of Public Health, Boston, MA
| | - Chuanhua Xing
- Department of Biostatistics, Boston University School of Public Health, Boston, MA
| | - Yanhua Zhou
- Department of Biostatistics, Boston University School of Public Health, Boston, MA
| | - Emelia J. Benjamin
- Department of Medicine, Boston University School of Medicine, Boston
- The NHLBI’s Framingham Heart Study, Framingham, MA
| | - Renate B. Schnabel
- Department of General and Interventional Cardiology, University Heart Center, Hamburg, Hamburg, Germany
| | - Susan R. Heckbert
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA
- Group Health Research Institute, Group Health Cooperative, Seattle, WA
- Department of Epidemiology, University of Washington, Seattle, WA
| | - Bruce M. Psaty
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA
- Group Health Research Institute, Group Health Cooperative, Seattle, WA
- Department of Epidemiology, University of Washington, Seattle, WA
- Department of Health Services, University of Washington, Seattle, WA
| | - Donna M. Muzny
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX
| | - L. Adrienne Cupples
- The NHLBI’s Framingham Heart Study, Framingham, MA
- Department of Biostatistics, Boston University School of Public Health, Boston, MA
| | - Alanna C. Morrison
- Human Genetics Center, University of Texas Health Science Center at Houston, Houston, TX
| | - Eric Boerwinkle
- Human Genetics Center, University of Texas Health Science Center at Houston, Houston, TX
| |
|
1498
|
Liu CT, Young KL, Brody J, Olden M, Wojczynski MK, Heard-Costa N, Li G, Morrison AC, Muzny D, Gibbs RA, Reid JG, Shao Y, Zhou Y, Boerwinkle E, Heiss G, Wagenknecht L, McKnight B, Borecki IB, Fox CS, North KE, Cupples LA. Sequence variation in TMEM18 in association with body mass index: Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium Targeted Sequencing Study. Circ Cardiovasc Genet 2014; 7:344-9. [PMID: 24951660 PMCID: PMC4135723 DOI: 10.1161/circgenetics.13.000067] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 01/10/2023]
Abstract
BACKGROUND: Genome-wide association studies for body mass index (BMI) previously identified a locus near TMEM18. We conducted targeted sequencing of this region to investigate the role of common, low-frequency, and rare variants influencing BMI. METHODS AND RESULTS: We sequenced TMEM18 and regions downstream of TMEM18 on chromosome 2 in 3976 individuals of European ancestry from 3 community-based cohorts (Atherosclerosis Risk in Communities, Cardiovascular Health Study, and Framingham Heart Study), including 200 adults selected for high BMI. We examined the association between BMI and variants identified in the region from nucleotide position 586 432 to 677 539 (hg18). Rare variants (minor allele frequency, <1%) were analyzed using a burden test and the sequence kernel association test. Results from the 3 cohort studies were meta-analyzed. We estimate that mean BMI is 0.43 kg/m^2 higher for each copy of the G allele of single-nucleotide polymorphism rs7596758 (minor allele frequency, 29%; P=3.46×10^-4) using a Bonferroni threshold of P<4.6×10^-4. Analyses conditional on previous genome-wide association study single-nucleotide polymorphisms associated with BMI in the region led to attenuation of this signal and uncovered another independent (r^2<0.2), statistically significant association, rs186019316 (P=2.11×10^-4). Both rs186019316 and rs7596758 or proxies are located in transcription factor binding regions. No significant association with rare variants was found in either the exons of TMEM18 or the 3' genome-wide association study region. CONCLUSIONS: Targeted sequencing around TMEM18 identified 2 novel BMI variants with possible regulatory function.
Affiliation(s)
- Ching-Ti Liu
- Department of Biostatistics, Boston University School of Public Health, Boston, MA
| | - Kristin L. Young
- Department of Epidemiology, University of North Carolina, Chapel Hill, NC
- Carolina Population Center, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC
| | - Jennifer Brody
- Cardiovascular Health Research Unit, University of Washington, Seattle, WA
| | | | - Mary K. Wojczynski
- Division of Statistical Genomics, Department of Genetics, Washington University, St. Louis, MO
| | - Nancy Heard-Costa
- NHLBI Framingham Heart Study, Framingham, MA
- Department of Neurology, Boston University School of Medicine, Boston, MA
| | - Guo Li
- Cardiovascular Health Research Unit, University of Washington, Seattle, WA
| | - Alanna C. Morrison
- Division of Epidemiology, School of Public Health, University of Texas Health Science Center, Houston, TX
| | - Donna Muzny
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX
| | - Richard A. Gibbs
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX
| | - Jeffrey G. Reid
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX
| | - Yaming Shao
- Department of Epidemiology, University of North Carolina, Chapel Hill, NC
| | - Yanhua Zhou
- Department of Biostatistics, Boston University School of Public Health, Boston, MA
| | - Eric Boerwinkle
- Division of Epidemiology, School of Public Health, University of Texas Health Science Center, Houston, TX
| | - Geraldo Heiss
- Department of Epidemiology, University of North Carolina, Chapel Hill, NC
| | - Lynne Wagenknecht
- Department of Epidemiology and Prevention, Wake Forest Baptist Medical Center, Winston-Salem, NC
| | - Barbara McKnight
- Cardiovascular Health Research Unit, University of Washington, Seattle, WA
- Department of Biostatistics, University of Washington School of Public Health, Seattle, WA
| | - Ingrid B. Borecki
- Division of Statistical Genomics, Department of Genetics, Washington University, St. Louis, MO
| | | | - Kari E. North
- Department of Epidemiology, University of North Carolina, Chapel Hill, NC
| | - L. Adrienne Cupples
- Department of Biostatistics, Boston University School of Public Health, Boston, MA
- NHLBI Framingham Heart Study, Framingham, MA
| |
|
1499
|
Cornes BK, Brody JA, Nikpoor N, Morrison AC, Chu H, Ahn BS, Wang S, Dauriz M, Barzilay JI, Dupuis J, Florez JC, Coresh J, Gibbs RA, Kao WL, Liu CT, McKnight B, Muzny D, Pankow JS, Reid JG, White CC, Johnson AD, Wong TY, Psaty BM, Boerwinkle E, Rotter JI, Siscovick DS, Sladek R, Meigs JB. Association of levels of fasting glucose and insulin with rare variants at the chromosome 11p11.2-MADD locus: Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium Targeted Sequencing Study. Circ Cardiovasc Genet 2014; 7:374-382. [PMID: 24951664 PMCID: PMC4066205 DOI: 10.1161/circgenetics.113.000169] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 01/14/2023]
Abstract
BACKGROUND: Common variation at the 11p11.2 locus, encompassing MADD, ACP2, NR1H3, MYBPC3, and SPI1, has been associated in genome-wide association studies with fasting glucose and fasting insulin (FI). In the Cohorts for Heart and Aging Research in Genomic Epidemiology Targeted Sequencing Study, we sequenced 5 gene regions at 11p11.2 to identify rare, potentially functional variants influencing fasting glucose or FI levels. METHODS AND RESULTS: Sequencing (mean depth, 38×) across 16.1 kb in 3566 individuals without diabetes mellitus identified 653 variants, 79.9% of which were rare (minor allele frequency <1%) and novel. We analyzed rare variants in 5 gene regions with FI or fasting glucose using the sequence kernel association test. At NR1H3, 53 rare variants were jointly associated with FI (P=2.73×10^-3); of these, 7 were predicted to have regulatory function and showed association with FI (P=1.28×10^-3). Conditioning on 2 previously associated variants at MADD (rs7944584, rs10838687) did not attenuate this association, suggesting that there are >2 independent signals at 11p11.2. One predicted regulatory variant, chr11:47227430 (hg18; minor allele frequency=0.00068), contributed 20.6% to the overall sequence kernel association test score at NR1H3, lies in intron 2 of NR1H3, and is a predicted binding site for forkhead box A1 (FOXA1), a transcription factor associated with insulin regulation. In human HepG2 hepatoma cells, the rare chr11:47227430 A allele disrupted FOXA1 binding and reduced FOXA1-dependent transcriptional activity. CONCLUSIONS: Sequencing at 11p11.2-NR1H3 identified rare variation associated with FI. One variant, chr11:47227430, seems to be functional, with the rare A allele reducing transcription factor FOXA1 binding and FOXA1-dependent transcriptional activity.
Affiliation(s)
- Belinda K. Cornes
- General Medicine Division, Massachusetts General Hospital, Boston, Massachusetts, USA
- Department of Medicine, Harvard Medical School, Boston, Massachusetts, USA
| | - Jennifer A. Brody
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
| | - Naghmeh Nikpoor
- Department of Human Genetics, Faculty of Medicine, McGill University, Montreal, Quebec, Canada
| | - Alanna C. Morrison
- Human Genetics Center, School of Public Health, University of Texas Health Science Center at Houston, Houston, TX, USA
| | - Huan Chu
- Department of Human Genetics, Faculty of Medicine, McGill University, Montreal, Quebec, Canada
| | - Byung Soo Ahn
- Department of Human Genetics, Faculty of Medicine, McGill University, Montreal, Quebec, Canada
| | - Shuai Wang
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
| | - Marco Dauriz
- General Medicine Division, Massachusetts General Hospital, Boston, Massachusetts, USA
- Department of Medicine, Harvard Medical School, Boston, Massachusetts, USA
- Division of Endocrinology, Diabetes and Metabolism, Department of Medicine, University of Verona Medical School and Hospital Trust of Verona, Verona, Italy
| | - Joshua I. Barzilay
- Division of Endocrinology, Kaiser Permanente of Georgia and Emory University School of Medicine
| | - Josée Dupuis
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- National Heart, Lung, and Blood Institute’s The Framingham Heart Study, Cardiovascular Epidemiology and Human Genomics Center, Framingham, MA, USA
| | - Jose C. Florez
- Department of Medicine, Harvard Medical School, Boston, Massachusetts, USA
- Center for Human Genetic Research, Diabetes Unit, Massachusetts General Hospital, Boston, Massachusetts, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Josef Coresh
- Department of Medicine, The Johns Hopkins Medical Institutions, Baltimore, Maryland, USA
- Department of Epidemiology, Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland, USA
| | - Richard A. Gibbs
- Human Genome Sequencing Center, Baylor College of Medicine, University of Texas Health Science Center, Houston, TX
| | - W.H. Linda Kao
- Department of Epidemiology, Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland, USA
| | - Ching-Ti Liu
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
| | - Barbara McKnight
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Donna Muzny
- Human Genome Sequencing Center, Baylor College of Medicine, University of Texas Health Science Center, Houston, TX
| | - James S. Pankow
- Division of Epidemiology and Community Health, University of Minnesota, MN, USA
| | - Jeffrey G. Reid
- Department of Epidemiology, Johns Hopkins University Bloomberg School of Public Health, Baltimore, Maryland, USA
| | - Charles C. White
- Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
| | - Andrew D. Johnson
- National Heart, Lung, and Blood Institute’s The Framingham Heart Study, Cardiovascular Epidemiology and Human Genomics Center, Framingham, MA, USA
| | - Tien Y. Wong
- Singapore Eye Research Institute, Singapore National Eye Centre, Duke-NUS Graduate Medical School, Singapore
- Department of Ophthalmology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
| | - Bruce M. Psaty
- Cardiovascular Health Research Unit, Departments of Medicine, Epidemiology, and Health Services, University of Washington, Seattle, WA; Group Health Research Institute, Group Health Cooperative, Seattle, WA
| | - Eric Boerwinkle
- Human Genetics Center, School of Public Health, University of Texas Health Science Center at Houston, Houston, TX, USA
- Human Genome Sequencing Center, Baylor College of Medicine, University of Texas Health Science Center, Houston, TX
| | - Jerome I Rotter
- Institute for Translational Genomics and Population Sciences, Los Angeles Biomedical Research Institute and Department of Pediatrics, Harbor-UCLA Medical Center, Torrance, California, USA
| | - David S. Siscovick
- Cardiovascular Health Research Unit, Department of Medicine, University of Washington, Seattle, WA, USA
- Cardiovascular Health Research Unit, Department of Epidemiology, University of Washington, Seattle, WA, USA
| | - Robert Sladek
- Department of Human Genetics, Faculty of Medicine, McGill University, Montreal, Quebec, Canada
| | - James B. Meigs
- General Medicine Division, Massachusetts General Hospital, Boston, Massachusetts, USA
- Department of Medicine, Harvard Medical School, Boston, Massachusetts, USA
| |
|
1500
|
London SJ, Gao W, Gharib SA, Hancock DB, Wilk JB, House JS, Gibbs RA, Muzny DM, Lumley T, Franceschini N, North KE, Psaty BM, Kovar CL, Coresh J, Zhou Y, Heckbert SR, Brody JA, Morrison AC, Dupuis J. ADAM19 and HTR4 variants and pulmonary function: Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium Targeted Sequencing Study. Circ Cardiovasc Genet 2014; 7:350-8. [PMID: 24951661 PMCID: PMC4136502 DOI: 10.1161/circgenetics.113.000066] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]
Abstract
BACKGROUND: The pulmonary function measures of forced expiratory volume in 1 second (FEV1) and its ratio to forced vital capacity (FVC) are used in the diagnosis and monitoring of lung diseases and predict cardiovascular mortality in the general population. Genome-wide association studies (GWASs) have identified numerous loci associated with FEV1 and FEV1/FVC, but the causal variants remain uncertain. We hypothesized that novel or rare variants poorly tagged by GWASs may explain the significant associations between FEV1/FVC and 2 genes: ADAM19 and HTR4. METHODS AND RESULTS: We sequenced ADAM19 and its promoter region along with the ≈21-kb portion of HTR4 harboring GWAS single-nucleotide polymorphisms for pulmonary function and analyzed associations with FEV1/FVC among 3983 participants of European ancestry from the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium. Meta-analysis of common variants in each region identified statistically significant associations (316 tests; P<1.58×10^-4) with FEV1/FVC for 14 ADAM19 single-nucleotide polymorphisms and 24 HTR4 single-nucleotide polymorphisms. After conditioning on the sentinel GWAS hit in each gene (ADAM19 rs1422795, minor allele frequency=0.33, and HTR4 rs11168048, minor allele frequency=0.40), 1 single-nucleotide polymorphism remained statistically significant (ADAM19 rs13155908, minor allele frequency=0.12; P=1.56×10^-4). Analysis of rare variants (minor allele frequency <1%) using the sequence kernel association test did not identify associations with either region. CONCLUSIONS: Sequencing identified 1 common variant associated with FEV1/FVC independent of the sentinel ADAM19 GWAS hit and supports the original HTR4 GWAS findings. Rare variants do not seem to underlie GWAS associations with pulmonary function for common variants in ADAM19 and HTR4.
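The significance threshold quoted in this abstract is consistent with a Bonferroni correction over the 316 tests it mentions, if the family-wise error rate was 0.05. The abstract does not state that rate, so it is an assumption in the minimal sketch below, and the function name `bonferroni_threshold` is hypothetical. The sketch recomputes the threshold and compares the reported conditional P value for rs13155908 against it.

```python
def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold under a Bonferroni correction.

    alpha is the family-wise error rate; 0.05 is assumed here because
    the abstract reports only the resulting threshold.
    """
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


# Number of tests and conditional P value as quoted in the abstract.
threshold = bonferroni_threshold(316)
conditional_p = 1.56e-4  # ADAM19 rs13155908 after conditioning

print(f"Bonferroni threshold for 316 tests: {threshold:.2e}")
print("rs13155908 below threshold:", conditional_p < threshold)
```

The reported P value sits only slightly below the recomputed threshold, which is worth keeping in mind when weighing how robust the independent signal is.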
Affiliation(s)
- Stephanie J. London
- Epidemiology Branch, National Institute of Environmental Health Sciences, National Institutes of Health, Dept of Health and Human Services, Research Triangle Park, NC
- Laboratory of Respiratory Biology, Division of Intramural Research, National Institute of Environmental Health Sciences, National Institutes of Health, Dept of Health and Human Services, Research Triangle Park, NC
| | - Wei Gao
- Dept of Biostatistics, Boston University School of Public Health, Boston, MA
| | - Sina A. Gharib
- Center for Lung Biology, Division of Pulmonary & Critical Care Medicine, Dept of Medicine, University of Washington, Seattle, WA
| | - Dana B. Hancock
- Epidemiology Branch, National Institute of Environmental Health Sciences, National Institutes of Health, Dept of Health and Human Services, Research Triangle Park, NC
- Behavioral Health Epidemiology Program, Research Triangle Institute, Research Triangle Park, NC
| | - Jemma B. Wilk
- Precision Medicine, Pfizer Global Research & Development, Cambridge, MA
| | - John S. House
- Laboratory of Respiratory Biology, Division of Intramural Research, National Institute of Environmental Health Sciences, National Institutes of Health, Dept of Health and Human Services, Research Triangle Park, NC
| | - Richard A. Gibbs
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX
| | - Donna M. Muzny
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX
| | - Thomas Lumley
- Dept of Statistics, University of Auckland, Auckland, New Zealand
| | - Kari E. North
- Dept of Epidemiology, University of North Carolina, Chapel Hill, NC
- Carolina Center for Genome Sciences, University of North Carolina, Chapel Hill, NC
| | - Bruce M. Psaty
- Cardiovascular Health Research Unit, Dept of Medicine, University of Washington, Seattle, WA
- Dept of Epidemiology, University of Washington, Seattle, WA
- Dept of Health Services, University of Washington, Seattle, WA
- Group Health Research Institute, Group Health Cooperative, Seattle, WA
| | - Christie L. Kovar
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX
| | - Josef Coresh
- Dept of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD
| | - Yanhua Zhou
- Dept of Biostatistics, Boston University School of Public Health, Boston, MA
| | - Susan R. Heckbert
- Cardiovascular Health Research Unit, Dept of Medicine, University of Washington, Seattle, WA
- Dept of Epidemiology, University of Washington, Seattle, WA
- Group Health Research Institute, Group Health Cooperative, Seattle, WA
| | - Jennifer A. Brody
- Cardiovascular Health Research Unit, Dept of Medicine, University of Washington, Seattle, WA
| | - Alanna C. Morrison
- Human Genetics Center, School of Public Health, University of Texas Health Science Center at Houston, Houston, TX
| | - Josée Dupuis
- Dept of Biostatistics, Boston University School of Public Health, Boston, MA
| |
|