1
Venkatesh SS, Ganjgahi H, Palmer DS, Coley K, Linchangco GV, Hui Q, Wilson P, Ho YL, Cho K, Arumäe K, Wittemans LBL, Nellåker C, Vainik U, Sun YV, Holmes C, Lindgren CM, Nicholson G. Characterising the genetic architecture of changes in adiposity during adulthood using electronic health records. Nat Commun 2024; 15:5801. [PMID: 38987242 PMCID: PMC11237142 DOI: 10.1038/s41467-024-49998-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2023] [Accepted: 06/25/2024] [Indexed: 07/12/2024] Open
Abstract
Obesity is a heritable disease, characterised by excess adiposity that is measured by body mass index (BMI). While over 1,000 genetic loci are associated with BMI, less is known about the genetic contribution to adiposity trajectories over adulthood. We derive adiposity-change phenotypes from 24.5 million primary-care health records in over 740,000 individuals in the UK Biobank, Million Veteran Program USA, and Estonian Biobank, to discover and validate the genetic architecture of adiposity trajectories. Using multiple BMI measurements over time increases power to identify genetic factors affecting baseline BMI by 14%. In the largest reported genome-wide study of adiposity-change in adulthood, we identify novel associations with BMI-change at six independent loci, including rs429358 (APOE missense variant). The SNP-based heritability of BMI-change (1.98%) is 9-fold lower than that of BMI. The modest genetic correlation between BMI-change and BMI (45.2%) indicates that genetic studies of longitudinal trajectories could uncover novel biology of quantitative traits in adulthood.
Affiliation(s)
- Samvida S Venkatesh
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK.
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK.
- Habib Ganjgahi
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK
- Department of Statistics, University of Oxford, Oxford, UK
- Duncan S Palmer
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK
- Nuffield Department of Population Health, Medical Sciences Division, University of Oxford, Oxford, UK
- Kayesha Coley
- Department of Population Health Sciences, University of Leicester, Leicester, UK
- Gregorio V Linchangco
- Department of Epidemiology, Emory University Rollins School of Public Health, Atlanta, GA, USA
- Atlanta VA Health Care System, Decatur, GA, USA
- Qin Hui
- Department of Epidemiology, Emory University Rollins School of Public Health, Atlanta, GA, USA
- Atlanta VA Health Care System, Decatur, GA, USA
- Peter Wilson
- Atlanta VA Health Care System, Decatur, GA, USA
- Department of Medicine, Emory University School of Medicine, Atlanta, GA, USA
- Yuk-Lam Ho
- Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), Veterans Affairs Boston Healthcare System, Boston, MA, USA
- Kelly Cho
- Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), Veterans Affairs Boston Healthcare System, Boston, MA, USA
- Division of Aging, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Kadri Arumäe
- Institute of Psychology, Faculty of Social Sciences, University of Tartu, Tartu, Estonia
- Laura B L Wittemans
- Novo Nordisk Research Centre Oxford, Oxford, UK
- Nuffield Department of Women's and Reproductive Health, Medical Sciences Division, University of Oxford, Oxford, UK
- Christoffer Nellåker
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK
- Nuffield Department of Women's and Reproductive Health, Medical Sciences Division, University of Oxford, Oxford, UK
- Uku Vainik
- Institute of Psychology, Faculty of Social Sciences, University of Tartu, Tartu, Estonia
- Estonian Genome Centre, Institute of Genomics, Faculty of Science and Technology, University of Tartu, Tartu, Estonia
- Department of Neurology and Neurosurgery, Faculty of Medicine and Health Sciences, University of McGill, Montreal, Canada
- Yan V Sun
- Department of Epidemiology, Emory University Rollins School of Public Health, Atlanta, GA, USA
- Atlanta VA Health Care System, Decatur, GA, USA
- Chris Holmes
- Department of Statistics, University of Oxford, Oxford, UK
- Nuffield Department of Medicine, Medical Sciences Division, University of Oxford, Oxford, UK
- The Alan Turing Institute, London, UK
- Cecilia M Lindgren
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford, UK.
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK.
- Nuffield Department of Women's and Reproductive Health, Medical Sciences Division, University of Oxford, Oxford, UK.
- Broad Institute of Harvard and MIT, Cambridge, MA, USA.
2
Venkatesh SS, Ganjgahi H, Palmer DS, Coley K, Wittemans LBL, Nellaker C, Holmes C, Lindgren CM, Nicholson G. The genetic architecture of changes in adiposity during adulthood. medRxiv 2023:2023.01.09.23284364. [PMID: 36711652 PMCID: PMC9882550 DOI: 10.1101/2023.01.09.23284364] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/13/2023]
Abstract
Obesity is a heritable disease, characterised by excess adiposity that is measured by body mass index (BMI). While over 1,000 genetic loci are associated with BMI, less is known about the genetic contribution to adiposity trajectories over adulthood. We derive adiposity-change phenotypes from 1.5 million primary-care health records in over 177,000 individuals in UK Biobank to study the genetic architecture of weight-change. Using multiple BMI measurements over time increases power to identify genetic factors affecting baseline BMI. In the largest reported genome-wide study of adiposity-change in adulthood, we identify novel associations with BMI-change at six independent loci, including rs429358 (a missense variant in APOE). The SNP-based heritability of BMI-change (1.98%) is 9-fold lower than that of BMI, and higher in women than in men. The modest genetic correlation between BMI-change and BMI (45.2%) indicates that genetic studies of longitudinal trajectories could uncover novel biology driving quantitative trait values in adulthood.
Affiliation(s)
- Samvida S. Venkatesh
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, UK
- Big Data Institute at the Li Ka Shing Centre for Health Information and Discovery, University of Oxford, UK
- Duncan S. Palmer
- Big Data Institute at the Li Ka Shing Centre for Health Information and Discovery, University of Oxford, UK
- Nuffield Department of Women’s and Reproductive Health, Medical Sciences Division, University of Oxford, UK
- Kayesha Coley
- Department of Population Health Sciences, University of Leicester, UK
- Laura B. L. Wittemans
- Big Data Institute at the Li Ka Shing Centre for Health Information and Discovery, University of Oxford, UK
- Nuffield Department of Women’s and Reproductive Health, Medical Sciences Division, University of Oxford, UK
- Christoffer Nellaker
- Big Data Institute at the Li Ka Shing Centre for Health Information and Discovery, University of Oxford, UK
- Nuffield Department of Women’s and Reproductive Health, Medical Sciences Division, University of Oxford, UK
- Chris Holmes
- Department of Statistics, University of Oxford, UK
- Nuffield Department of Medicine, Medical Sciences Division, University of Oxford, UK
- The Alan Turing Institute, London, UK
- Cecilia M. Lindgren
- Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, UK
- Big Data Institute at the Li Ka Shing Centre for Health Information and Discovery, University of Oxford, UK
- Nuffield Department of Women’s and Reproductive Health, Medical Sciences Division, University of Oxford, UK
- Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
3
Eleftheriou A, Petry CJ, Hughes IA, Ong KK, Dunger DB. The High-Risk Type 1 Diabetes HLA-DR and HLA-DQ Polymorphisms Are Differentially Associated With Growth and IGF-I Levels in Infancy: The Cambridge Baby Growth Study. Diabetes Care 2021; 44:1852-1859. [PMID: 34172490 DOI: 10.2337/dc20-2820] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2020] [Accepted: 05/05/2021] [Indexed: 02/03/2023]
Abstract
OBJECTIVE This study explored the link between HLA polymorphisms that predispose to type 1 diabetes and birth size, infancy growth, and/or circulating IGF-I in a general population-based birth cohort. RESEARCH DESIGN AND METHODS The Cambridge Baby Growth Study is a prospective observational birth cohort study that recruited 2,229 newborns for follow-up in infancy. Of these, 612 children had DNA available for genotyping single nucleotide polymorphisms in the HLA region that capture the highest risk of type 1 diabetes: rs17426593 for DR4, rs2187668 for DR3, and rs7454108 for DQ8. Multivariate linear regression models at critical ages (cross-sectional) and mixed-effects models (longitudinal) were fitted under additive genetic effects to test for associations between HLA polymorphisms and infancy weight, length, skinfold thickness (indicator of adiposity), and concentrations of IGF-I and IGF-binding protein-3 (IGFBP-3). RESULTS In longitudinal models, the minor allele of rs2187668 tagging DR3 was associated with faster linear growth (P = 0.007), which was more pronounced in boys (P = 3 × 10⁻⁷) than girls (P = 0.07), and was also associated with increasing IGF-I (P = 0.002) and IGFBP-3 (P = 0.003) concentrations in infancy. Cross-sectionally, the minor alleles of rs7454108 tagging DQ8 and rs17426593 tagging DR4 were associated with lower IGF-I concentrations at age 12 months (P = 0.003) and greater skinfold thickness at age 24 months (P = 0.003), respectively. CONCLUSIONS The variable associations of DR4, DR3, and DQ8 alleles with growth measures and IGF-I levels in infants from the general population could explain the heterogeneous growth trajectories observed in genetically at-risk cohorts. These findings could suggest distinct mechanisms involving endocrine pathways related to the HLA-conferred type 1 diabetes risk.
Affiliation(s)
- Clive J Petry
- Department of Paediatrics, University of Cambridge, Cambridge, U.K.
- Ieuan A Hughes
- Department of Paediatrics, University of Cambridge, Cambridge, U.K.
- Ken K Ong
- Department of Paediatrics, University of Cambridge, Cambridge, U.K.
- MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, U.K.
- Institute of Metabolic Science, University of Cambridge, Cambridge, U.K.
- David B Dunger
- Department of Paediatrics, University of Cambridge, Cambridge, U.K.
- Institute of Metabolic Science, University of Cambridge, Cambridge, U.K.
4
Taporoski TP, Duarte NE, Pompéia S, Sterr A, Gómez LM, Alvim RO, Horimoto ARVR, Krieger JE, Vallada H, Pereira AC, von Schantz M, Negrão AB. Heritability of semantic verbal fluency task using time-interval analysis. PLoS One 2019; 14:e0217814. [PMID: 31185027 PMCID: PMC6559646 DOI: 10.1371/journal.pone.0217814] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 03/07/2018] [Accepted: 05/20/2019] [Indexed: 01/27/2023] Open
Abstract
Individual variability in word generation is a product of genetic and environmental influences. The genetic effects on semantic verbal fluency were estimated in 1,735 participants from the Brazilian Baependi Heart Study. The numbers of exemplars produced in 60 s were broken down into time quartiles because of the involvement of different cognitive processes—predominantly automatic at the beginning, controlled/executive at the end. Heritability in the unadjusted model for the 60-s measure was 0.32. The best-fit model contained age, sex, years of schooling, and time of day as covariates, giving a heritability of 0.21. Schooling had the highest moderating effect. The highest heritability (0.17) was observed in the first quartile, decreasing to 0.09, 0.12, and 0.0003 in the following ones. Heritability for average production starting point (intercept) was 0.18, indicating genetic influences for automatic cognitive processes. Production decay (slope), indicative of controlled processes, was not significant. The genetic influence on different quartiles of the semantic verbal fluency test could potentially be exploited in clinical practice and genome-wide association studies.
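One way to picture the intercept and slope phenotypes mentioned in this abstract is a per-participant straight-line fit through the four quartile counts. The sketch below illustrates that general idea only and is not the authors' pipeline: the function name `intercept_and_slope`, the coding of the quartiles as 0 to 3, and the example counts are all hypothetical.

```python
def intercept_and_slope(counts):
    """Fit an ordinary least-squares line through per-quartile word counts.

    `counts` holds the number of exemplars produced in each time quartile.
    Quartiles are coded 0, 1, 2, ... (an assumption made for illustration),
    so the intercept is the fitted count in the first quartile and the slope
    is the fitted change in count per quartile.
    """
    n = len(counts)
    xs = list(range(n))
    mean_x = sum(xs) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical participant: production is highest early and decays later.
example_intercept, example_slope = intercept_and_slope([8, 5, 4, 3])
```

In a family study, the resulting intercept and slope for each participant would then be treated as two separate quantitative traits for heritability estimation; that second step is not shown here.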
Affiliation(s)
- T. P. Taporoski
- Institute of Psychiatry (LIM-23), Faculdade de Medicina FMUSP, Universidade de Sao Paulo, Sao Paulo, SP, Brazil
- Department of Biochemical Sciences, Faculty of Health and Medical Sciences, University of Surrey, Guildford, Surrey, United Kingdom
- Laboratory of Genetics and Molecular Cardiology, Heart Institute, Faculty of Medicine, University of São Paulo, São Paulo, SP, Brazil
- N. E. Duarte
- Laboratory of Genetics and Molecular Cardiology, Heart Institute, Faculty of Medicine, University of São Paulo, São Paulo, SP, Brazil
- Departmento de Matemáticas, Universidad Nacional de Colombia, Manizales, Colombia
- S. Pompéia
- Department of Psychobiology, Universidade Federal de São Paulo–Escola Paulista de Medicina, São Paulo, SP, Brazil
- A. Sterr
- Department of Psychological Sciences, University of Surrey, Guildford, Surrey, United Kingdom
- L. M. Gómez
- Laboratory of Genetics and Molecular Cardiology, Heart Institute, Faculty of Medicine, University of São Paulo, São Paulo, SP, Brazil
- R. O. Alvim
- Laboratory of Genetics and Molecular Cardiology, Heart Institute, Faculty of Medicine, University of São Paulo, São Paulo, SP, Brazil
- Postgraduate Program in Public Health, Federal University of Espírito Santo, Vitória, ES, Brazil
- A. R. V. R. Horimoto
- Laboratory of Genetics and Molecular Cardiology, Heart Institute, Faculty of Medicine, University of São Paulo, São Paulo, SP, Brazil
- J. E. Krieger
- Laboratory of Genetics and Molecular Cardiology, Heart Institute, Faculty of Medicine, University of São Paulo, São Paulo, SP, Brazil
- H. Vallada
- Institute of Psychiatry (LIM-23), Faculdade de Medicina FMUSP, Universidade de Sao Paulo, Sao Paulo, SP, Brazil
- A. C. Pereira
- Laboratory of Genetics and Molecular Cardiology, Heart Institute, Faculty of Medicine, University of São Paulo, São Paulo, SP, Brazil
- M. von Schantz
- Institute of Psychiatry (LIM-23), Faculdade de Medicina FMUSP, Universidade de Sao Paulo, Sao Paulo, SP, Brazil
- Department of Biochemical Sciences, Faculty of Health and Medical Sciences, University of Surrey, Guildford, Surrey, United Kingdom
- A. B. Negrão
- Institute of Psychiatry (LIM-23), Faculdade de Medicina FMUSP, Universidade de Sao Paulo, Sao Paulo, SP, Brazil
- Laboratory of Genetics and Molecular Cardiology, Heart Institute, Faculty of Medicine, University of São Paulo, São Paulo, SP, Brazil
5
Chiu YF, Lee CY, Hsu FC. Multipoint association mapping for longitudinal family data: an application to hypertension phenotypes. BMC Proc 2016; 10:315-320. [PMID: 27980655 PMCID: PMC5133529 DOI: 10.1186/s12919-016-0049-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/23/2022] Open
Abstract
It is essential to develop adequate statistical methods to fully utilize information from longitudinal family studies. We extend our previous multipoint linkage disequilibrium approach—simultaneously accounting for correlations between markers and repeat measurements within subjects, and the correlations between subjects in families—to detect loci relevant to disease through gene-based analysis. Estimates of disease loci and their genetic effects along with their 95 % confidence intervals (or significance levels) are reported. Four different phenotypes—ever having hypertension at 4 visits, incidence of hypertension, hypertension status at baseline only, and hypertension status at 4 visits—are studied using the proposed approach. The efficiency of estimates of disease locus positions (inverse of standard error) improves when using the phenotypes from 4 visits rather than using baseline only.
Affiliation(s)
- Yen-Feng Chiu
- Institute of Population Health Sciences, National Health Research Institutes, Miaoli, 35053 Taiwan Republic of China
- Chun-Yi Lee
- Institute of Population Health Sciences, National Health Research Institutes, Miaoli, 35053 Taiwan Republic of China
- Fang-Chi Hsu
- Department of Biostatistical Sciences, Division of Public Health Sciences, Wake Forest School of Medicine, Winston-Salem, 27157 USA
6
de Farias Pires T, Azambuja AP, Horimoto ARVR, Nakamura MS, de Oliveira Alvim R, Krieger JE, Pereira AC. A population-based study of the stratum corneum moisture. Clin Cosmet Investig Dermatol 2016; 9:79-87. [PMID: 27143945 PMCID: PMC4845893 DOI: 10.2147/ccid.s88485] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Indexed: 11/23/2022]
Abstract
BACKGROUND The stratum corneum (SC) has important functions as a bound-water modulator and a primary barrier of the human skin from the external environment. However, no large epidemiological study has quantified the relative importance of different exposures with regard to these functional properties. In this study, we have studied a large sample of individuals from the Brazilian population in order to understand the different relationships between the properties of SC and a number of demographic and self-perceived variables. METHODS One thousand three hundred and thirty-nine individuals from a rural Brazilian population, who were participants of a family-based study, were submitted to a cross-sectional examination of the SC moisture by capacitance using the Corneometer® CM820 and investigated regarding environmental exposures, cosmetic use, and other physiological and epidemiological measurements. Self-perception-scaled questions about skin conditions were also applied. RESULTS We found significant associations between SC moisture and sex, age, high sun exposure, and sunscreen use frequency (P<0.025). In specific studied sites, self-reported race and obesity were also found to show significant effects. Dry skin self-perception was also found to be highly correlated with the objective measurement of the skin. Other environmental effects on SC moisture are also reported.
Affiliation(s)
- Thiago de Farias Pires
- Laboratory of Genetics and Molecular Cardiology, Heart Institute, University of São Paulo Medical School, Cajamar, SP, Brazil
- Rafael de Oliveira Alvim
- Laboratory of Genetics and Molecular Cardiology, Heart Institute, University of São Paulo Medical School, Cajamar, SP, Brazil
- José Eduardo Krieger
- Laboratory of Genetics and Molecular Cardiology, Heart Institute, University of São Paulo Medical School, Cajamar, SP, Brazil
- Alexandre Costa Pereira
- Laboratory of Genetics and Molecular Cardiology, Heart Institute, University of São Paulo Medical School, Cajamar, SP, Brazil
7
Abstract
BACKGROUND Longitudinal phenotypic data provide a rich potential resource for genetic studies and may allow for greater understanding of variants and their covariates over time. Herein, we review 3 longitudinal analytical approaches from the Genetic Analysis Workshop 19 (GAW19). These contributions investigated both genome-wide association (GWA) and whole genome sequence (WGS) data from odd-numbered chromosomes on up to 4 time points for blood pressure-related phenotypes. The statistical models used included generalized estimating equations (GEEs), latent class growth modeling (LCGM), linear mixed-effect (LME), and variance components (VC). The goal of these analyses was to test statistical approaches that use repeat measurements to increase genetic signal for variant identification. RESULTS Two analytical approaches were applied to the GAW19 GWA data using real phenotypic data, and one approach was applied to the WGS data using 200 simulated replicates. The first GWA approach applied a GEE-based model to identify gene-based associations with 4 derived hypertension phenotypes. This GEE model identified 1 significant locus, GRM7, which passed multiple test corrections for 2 hypertension-derived traits. The second GWA approach employed the LME to estimate genetic associations with systolic blood pressure (SBP) change trajectories identified using LCGM. This LCGM method identified 5 SBP trajectories, and association analyses identified a genome-wide significant locus near ATOX1 (p = 1.0 × 10⁻⁸). Finally, a third, VC-based model using WGS and simulated SBP phenotypes, which constrained the β coefficient for a genetic variant across time points, was fitted and compared to an unconstrained approach. This constrained VC approach demonstrated increased power for WGS variants of moderate effect, but when larger genetic effects were present, averaging across time points was as effective.
CONCLUSION In this paper, we summarize 3 GAW19 contributions applying novel statistical methods and testing previously proposed techniques under alternative conditions for longitudinal genetic association. We conclude that these approaches when appropriately applied have the potential to: (a) increase statistical power; (b) decrease trait heterogeneity and standard error;
Affiliation(s)
- Yen-Feng Chiu
- Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Miaoli, Taiwan, ROC.
- Anne E Justice
- Department of Epidemiology, University of North Carolina, Chapel Hill, NC, 27514, USA.
- Phillip E Melton
- Centre for Genetic Origins of Health and Disease, University of Western Australia, Perth, WA, Australia.
8
Genome-wide gene-environment interactions on quantitative traits using family data. Eur J Hum Genet 2015; 24:1022-8. [PMID: 26626313 DOI: 10.1038/ejhg.2015.253] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 08/03/2015] [Revised: 10/09/2015] [Accepted: 10/27/2015] [Indexed: 12/15/2022] Open
Abstract
Gene-environment interactions may provide a mechanism for targeting interventions to those individuals who would gain the most benefit from them. Searching for interactions agnostically on a genome-wide scale requires large sample sizes, often achieved through collaboration among multiple studies in a consortium. Family studies can contribute to consortia, but to do so they must account for correlation within families by using specialized analytic methods. In this paper, we investigate the performance of methods that account for within-family correlation, in the context of gene-environment interactions with binary exposures and quantitative outcomes. We simulate both cross-sectional and longitudinal measurements, and analyze the simulated data taking family structure into account, via generalized estimating equations (GEE) and linear mixed-effects models. With sufficient exposure prevalence and correct model specification, all methods perform well. However, when models are misspecified, mixed modeling approaches have seriously inflated type I error rates. GEE methods with robust variance estimates are less sensitive to model misspecification; however, when exposures are infrequent, GEE methods require modifications to preserve type I error rate. We illustrate the practical use of these methods by evaluating gene-drug interactions on fasting glucose levels in data from the Framingham Heart Study, a cohort that includes related individuals.
9
Burkett KM, Roy-Gagnon MH, Lefebvre JF, Wang C, Fontaine-Bisson B, Dubois L. A Comparison of Statistical Methods for the Discovery of Genetic Risk Factors Using Longitudinal Family Study Designs. Front Immunol 2015; 6:589. [PMID: 26635803 PMCID: PMC4652172 DOI: 10.3389/fimmu.2015.00589] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 05/02/2015] [Accepted: 11/02/2015] [Indexed: 01/04/2023] Open
Abstract
The etiology of immune-related diseases or traits is often complex, involving many genetic and environmental factors and their interactions. While methodological approaches focusing on an outcome measured at one time point have succeeded in identifying genetic factors involved in immune-related traits, they fail to capture complex disease mechanisms that fluctuate over time. It is increasingly recognized that longitudinal studies, where an outcome is measured at multiple time points, have great potential to shed light on complex disease mechanisms involving genetic factors. However, longitudinal data require specialized statistical methods, especially in family studies where multiple sources of correlation in the data must be modeled. Using simulated data with known genetic effects, we examined the performance of different analytical methods for investigating associations between genetic factors and longitudinal phenotypes in twin data. The simulations were modeled on data from the Québec Newborn Twin Study, an ongoing population-based longitudinal study of twin births with multiple phenotypes, such as cortisol levels and body mass index, collected multiple times in infancy and early childhood and with sequencing data on immune-related genes and pathways. We compared approaches that we classify as (1) family-based methods applied to summaries of the observations over time, (2) longitudinal-based methods with simplifications of the familial correlation, and (3) Bayesian family-based method with simplifications of the temporal correlation. We found that for estimation of the genetic main and interaction effects, all methods gave estimates close to the true values and had similar power. If heritability estimation is desired, approaches of type (1) also provide heritability estimates close to the true value. Our work shows that the simpler approaches are likely adequate to detect genetic effects; however, interpretation of these effects is more challenging.
Affiliation(s)
- Kelly M Burkett
- Department of Mathematics and Statistics, University of Ottawa, Ottawa, ON, Canada
- Marie-Hélène Roy-Gagnon
- School of Epidemiology, Public Health and Preventive Medicine, University of Ottawa, Ottawa, ON, Canada
- Jean-François Lefebvre
- School of Epidemiology, Public Health and Preventive Medicine, University of Ottawa, Ottawa, ON, Canada
- Cheng Wang
- School of Epidemiology, Public Health and Preventive Medicine, University of Ottawa, Ottawa, ON, Canada
- Lise Dubois
- School of Epidemiology, Public Health and Preventive Medicine, University of Ottawa, Ottawa, ON, Canada
10
Wu Z, Hu Y, Melton PE. Longitudinal data analysis for genetic studies in the whole-genome sequencing era. Genet Epidemiol 2014; 38 Suppl 1:S74-80. [PMID: 25112193 DOI: 10.1002/gepi.21829] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 01/09/2023]
Abstract
The analysis of whole-genome sequence (WGS) data using longitudinal phenotypes offers a potentially rich resource for the examination of the genetic variants and their covariates that affect complex phenotypes over time. We summarize eight contributions to the Genetic Analysis Workshop 18, which applied a diverse array of statistical genetic methods to analyze WGS data in combination with data from genome-wide association studies (GWAS) from up to four different time points on blood pressure phenotypes. The common goal of these analyses was to develop and apply appropriate methods that utilize longitudinal repeated measures to potentially increase the analytic efficiency of WGS and GWAS data. These diverse methods can be grouped into two categories, based on the way they model dependence structures: (1) linear mixed-effects (LME) models, where the random effect terms in the linear models are used to capture the dependence structures; and (2) variance-components models, where the dependence structures are constructed directly based on multiple components of variance-covariance matrices for the multivariate Gaussian responses. Despite the heterogeneous nature of these analytical methods, the group came to the following conclusions: (1) the use of repeat measurements can gain power to identify variants associated with the phenotype; (2) the inclusion of family data may correct genotyping errors and allow for more accurate detection of rare variants than using unrelated individuals only; and (3) fitting mixed-effects and variance-components models for longitudinal data presents computational challenges. The challenges and computational burden demanded by WGS data were addressed in the eight contributions.
Affiliation(s)
- Zheyang Wu
- Department of Mathematical Sciences, Worcester Polytechnic Institute, Worcester, Massachusetts, United States of America
11
Wang W, Feng Z, Bull SB, Wang Z. A 2-step strategy for detecting pleiotropic effects on multiple longitudinal traits. Front Genet 2014; 5:357. [PMID: 25368629 PMCID: PMC4202779 DOI: 10.3389/fgene.2014.00357] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 05/30/2014] [Accepted: 09/25/2014] [Indexed: 12/13/2022] Open
Abstract
Genetic pleiotropy refers to the situation in which a single gene influences multiple traits, and it is therefore considered a major factor underlying genetic correlation among traits. To identify pleiotropy, an important focus in genome-wide association studies (GWAS) is on finding genetic variants that are simultaneously associated with multiple traits. Longitudinal designs are also often employed in complex disease studies, so that traits are measured repeatedly over time within the same subject. Performing genetic association analysis simultaneously on multiple longitudinal traits for detecting pleiotropic effects is interesting but challenging. In this paper, we propose a 2-step method for simultaneously testing the genetic association with multiple longitudinal traits. In the first step, a mixed effects model is used to analyze each longitudinal trait. We focus on estimation of the random effect that accounts for the subject-specific genetic contribution to the trait; fixed effects of other confounding covariates are also estimated. This first step enables separation of the genetic effect from other confounding effects for each subject and for each longitudinal trait. Then in the second step, we perform a simultaneous association test on multiple estimated random effects arising from multiple longitudinal traits. The proposed method can efficiently detect pleiotropic effects on multiple longitudinal traits and can flexibly handle traits of different data types such as quantitative, binary, or count data. We apply this method to analyze the 16th Genetic Analysis Workshop (GAW16) Framingham Heart Study (FHS) data. A simulation study is also conducted to validate this 2-step method and evaluate its performance.
Affiliation(s)
- Weiqiang Wang
- Department of Mathematics and Statistics, University of Guelph, Guelph, ON, Canada
- Zeny Feng
- Department of Mathematics and Statistics, University of Guelph, Guelph, ON, Canada
- Shelley B Bull
- Lunenfeld-Tanenbaum Research Institute of Mount Sinai Hospital, Prosserman Centre for Health Research, Toronto, ON, Canada; Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
- Zuoheng Wang
- Division of Biostatistics, Yale School of Public Health, New Haven, CT, USA
|
12
|
Wang S, Gao W, Ngwa J, Allard C, Liu CT, Cupples LA. Comparing baseline and longitudinal measures in association studies. BMC Proc 2014; 8:S84. [PMID: 25519412 PMCID: PMC4143666 DOI: 10.1186/1753-6561-8-s1-s84]
Abstract
In recent years, longitudinal family-based studies have had success in identifying genetic variants that influence complex traits in genome-wide association studies. In this paper, we suggest that longitudinal analyses may contain valuable information that can enable identification of additional associations compared to baseline analyses. Using Genetic Analysis Workshop 18 data, consisting of whole genome sequence data in a pedigree-based sample, we compared 3 methods for the genetic analysis of longitudinal data to an analysis that used baseline data only. These longitudinal methods were (a) longitudinal mixed-effects model; (b) analysis of the mean trait over time; and (c) a 2-stage analysis, with estimation of a random intercept in the first stage and regression of the random intercept on a single-nucleotide polymorphism at the second stage. All methods accounted for the familial correlation among subjects within a pedigree. The analyses considered common variants with minor allele frequency above 5% on chromosome 3. Analyses were performed without knowledge of the simulation model. The 3 longitudinal methods showed consistent results, which were generally different from those found by using only the baseline observation. The gene CACNA2D3, identified by both longitudinal and baseline approaches, had a stronger signal in the longitudinal analysis (p = 2.65 × 10⁻⁷) compared to that in the baseline analysis (p = 2.48 × 10⁻⁵). The effect size of the longitudinal mixed-effects model and mean trait were higher compared to the 2-stage approach. The longitudinal results provided stable results different from that using 1 observation at baseline and generally had lower p values.
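The contrast between a baseline-only analysis and method (b), analysis of the mean trait over time, can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' analysis: the data are simulated unrelated subjects (no pedigree structure or familial correlation), and the function names `simulate` and `slope_and_se`, the effect size, the allele frequency, and the variance components are all hypothetical choices made for illustration.

```python
import numpy as np


def slope_and_se(x, y):
    """Ordinary least-squares slope of y on x and its standard error."""
    x_c = x - x.mean()
    y_c = y - y.mean()
    sxx = np.sum(x_c ** 2)
    beta = np.sum(x_c * y_c) / sxx
    resid = y_c - beta * x_c
    sigma2 = np.sum(resid ** 2) / (len(x) - 2)
    return beta, np.sqrt(sigma2 / sxx)


def simulate(n_subjects=2000, n_visits=5, beta=0.2, maf=0.3, seed=0):
    """Simulate a repeatedly measured trait with an additive SNP effect.

    All parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    # Additive genotype coding: 0, 1, or 2 copies of the minor allele.
    genotype = rng.binomial(2, maf, size=n_subjects).astype(float)
    # Stable subject-specific deviation shared across visits.
    subject_effect = rng.normal(0.0, 1.0, size=n_subjects)
    # Visit-specific noise, independent across visits.
    noise = rng.normal(0.0, 1.0, size=(n_subjects, n_visits))
    trait = beta * genotype[:, None] + subject_effect[:, None] + noise
    return genotype, trait


genotype, trait = simulate()

# Baseline analysis: regress only the first visit on genotype.
beta_base, se_base = slope_and_se(genotype, trait[:, 0])

# Mean-trait analysis: regress each subject's average over visits on genotype.
beta_mean, se_mean = slope_and_se(genotype, trait.mean(axis=1))

print(f"baseline:   beta={beta_base:.3f}, se={se_base:.3f}")
print(f"mean trait: beta={beta_mean:.3f}, se={se_mean:.3f}")
```

Averaging over visits shrinks the visit-specific noise variance by the number of visits while leaving the genetic effect unchanged, so in this simulated setting the standard error of the estimated SNP effect is smaller for the mean trait than for the baseline measure.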
Affiliation(s)
- Shuai Wang
- Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Avenue 3rd floor, Boston, MA 02118, USA
- Wei Gao
- Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Avenue 3rd floor, Boston, MA 02118, USA
- Julius Ngwa
- Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Avenue 3rd floor, Boston, MA 02118, USA
- Catherine Allard
- Département de Mathématiques, Université de Sherbrooke, Québec, Canada J1K 2R1
- Ching-Ti Liu
- Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Avenue 3rd floor, Boston, MA 02118, USA
- L Adrienne Cupples
- Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Avenue 3rd floor, Boston, MA 02118, USA
|
13
|
Abstract
The cost of next-generation sequencing is now approaching that of the first generation of genome-wide single-nucleotide genotyping panels, but this is still out of reach for large-scale epidemiologic studies with tens of thousands of subjects. Furthermore, the anticipated yield of millions of rare variants poses serious challenges for distinguishing causal from noncausal variants for disease. We explore the merits of using family-based designs for sequencing substudies to identify novel variants and prioritize them for their likelihood of causality. While the sharing of variants within families means that family-based designs may be less efficient for discovery than sequencing of a comparable number of unrelated individuals, the ability to exploit cosegregation of variants with disease within families helps distinguish causal from noncausal ones. We introduce a score test criterion for prioritizing discovered variants in terms of their likelihood of being functional. We compare the relative statistical efficiency of 2-stage versus 1-stage family-based designs by application to the Genetic Analysis Workshop 18 simulated sequence data.
Affiliation(s)
- Zhao Yang
- Department of Preventive Medicine, University of Southern California, Los Angeles, CA 90089-9234, USA
- Duncan C Thomas
- Department of Preventive Medicine, University of Southern California, Los Angeles, CA 90089-9234, USA
|
14
|
Sung YJ, Simino J, Kume R, Basson J, Schwander K, Rao DC. Comparison of two methods for analysis of gene-environment interactions in longitudinal family data: the Framingham heart study. Front Genet 2014; 5:9. [PMID: 24523728 PMCID: PMC3906599 DOI: 10.3389/fgene.2014.00009] [Received: 08/30/2013] [Accepted: 01/09/2014]
Abstract
Gene–environment interaction (GEI) analysis can potentially enhance gene discovery for common complex traits. However, genome-wide interaction analysis is computationally intensive. Moreover, analysis of longitudinal data in families is much more challenging due to the two sources of correlations arising from longitudinal measurements and family relationships. GWIS of longitudinal family data can be a computational bottleneck. Therefore, we compared two methods for analysis of longitudinal family data: a methodologically sound but computationally demanding method using the Kronecker model (KRC) and a computationally more forgiving method using the hierarchical linear model (HLM). The KRC model uses a Kronecker product of an unstructured matrix for correlations among repeated measures (longitudinal) and a compound symmetry matrix for correlations within families at a given visit. The HLM uses an autoregressive covariance matrix for correlations among repeated measures and a random intercept for familial correlations. We compared the two methods using the longitudinal Framingham heart study (FHS) SHARe data. Specifically, we evaluated SNP–alcohol (amount of alcohol consumption) interaction effects on high density lipoprotein cholesterol (HDLC). Keeping the prohibitive computational burden of KRC in mind, we limited the analysis to chromosome 16, where preliminary cross-sectional analysis yielded some interesting results. Our first important finding was that the HLM provided very comparable results but was remarkably faster than the KRC, making HLM the method of choice. Our second finding was that longitudinal analysis provided smaller P-values, thus leading to more significant results, than cross-sectional analysis. This was particularly pronounced in identifying GEIs. We conclude that longitudinal analysis of GEIs is more powerful and that the HLM method is an optimal method of choice as compared to the computationally (prohibitively) intensive KRC method.
Affiliation(s)
- Yun Ju Sung
- Division of Biostatistics, Washington University School of Medicine in St. Louis, St. Louis, MO, USA
- Jeannette Simino
- Division of Biostatistics, Washington University School of Medicine in St. Louis, St. Louis, MO, USA
- Rezart Kume
- Division of Biostatistics, Washington University School of Medicine in St. Louis, St. Louis, MO, USA
- Jacob Basson
- Division of Biostatistics, Washington University School of Medicine in St. Louis, St. Louis, MO, USA
- Karen Schwander
- Division of Biostatistics, Washington University School of Medicine in St. Louis, St. Louis, MO, USA
- D C Rao
- Division of Biostatistics, Washington University School of Medicine in St. Louis, St. Louis, MO, USA
|
15
|
Malzahn D, Müller-Nurasyid M, Heid IM, Wichmann HE, Bickeböller H. Controversial association results for INSIG2 on body mass index may be explained by interactions with age and with MC4R. Eur J Hum Genet 2014; 22:1217-24. [PMID: 24518831 DOI: 10.1038/ejhg.2014.3] [Received: 05/08/2013] [Revised: 12/17/2013] [Accepted: 12/30/2013]
Abstract
Among the single-nucleotide polymorphisms (SNPs) previously reported to be associated with body mass index (BMI) and obesity, we focus on a common risk variant rs7566605 upstream of the insulin-induced gene 2 (INSIG2) gene and a rare protective variant rs2229616 on the melanocortin-4 receptor (MC4R) gene. INSIG2 is involved in adipogenesis and MC4R effects hormonal appetite control in response to the amount of adipose tissue. The influence of rs2229616 (MC4R) on BMI and obesity has been confirmed repeatedly and insight into the underlying mechanism provided. However, a main effect of rs7566605 (INSIG2) is under debate because of inconsistent replications of association. Interaction of rs7566605 with age may offer an explanation. SNP-age and SNP-SNP interaction models were tested on independent individuals from three population-based longitudinal cohorts, restricting the analysis to an observed age of 25-74 years. KORA S3/F3, KORA S4/F4 (Augsburg, Germany, 1994-2005, 1999-2008), and Framingham-Offspring data (Framingham, USA, 1971-2001) were analysed, with a total sample size of N=6926 in the joint analysis. The effect of interaction between rs7566605 and age on BMI and obesity status is significant and consistent across studies. This new evidence for rs7566605 (INSIG2) complements previous research. In addition, the interaction effect of rs7566605 with the MC4R variant rs2229616 on BMI was observed. This effect size was three times larger than that in a previously reported single-locus main effect of rs2229616. This leads to the conclusion that SNP-age or SNP-SNP interactions can mask genetic effects for complex diseases if left unaccounted for.
Affiliation(s)
- Dörthe Malzahn
- Department of Genetic Epidemiology, University Medical Center, Georg-August-University, Göttingen, Germany
- Martina Müller-Nurasyid
- [1] Department of Medicine I, University Hospital Grosshadern, Ludwig-Maximilians-University, Munich, Germany [2] Institute of Medical Informatics, Biometry and Epidemiology, Chair of Epidemiology and Chair of Genetic Epidemiology, Ludwig-Maximilians-University, Neuherberg, Germany [3] Institute of Genetic Epidemiology, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
- Iris M Heid
- [1] Institute of Genetic Epidemiology, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany [2] Department of Genetic Epidemiology, University of Regensburg, Regensburg, Germany
- H-Erich Wichmann
- [1] Institute of Epidemiology I, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany [2] Institute of Medical Informatics, Biometry and Epidemiology, Chair of Epidemiology, Ludwig-Maximilians-University, Munich, Germany [3] Klinikum Großhadern, Munich, Germany
- Heike Bickeböller
- Department of Genetic Epidemiology, University Medical Center, Georg-August-University, Göttingen, Germany
|
16
|
Malzahn D, Schillert A, Müller M, Bickeböller H. The longitudinal nonparametric test as a new tool to explore gene-gene and gene-time effects in cohorts. Genet Epidemiol 2010; 34:469-78. [PMID: 20568282 DOI: 10.1002/gepi.20500]
Abstract
Current approaches for analysis of longitudinal genetic epidemiological data of quantitative traits are typically restricted to normality assumptions of the trait. We introduce the longitudinal nonparametric test (LNPT) for cohorts with quantitative follow-up data to test for overall main effects of genes and for gene-gene and gene-time interactions. The LNPT is a rank procedure and does not depend on normality assumptions of the trait. We demonstrate by simulations that the LNPT is powerful, keeps the type-1 error level, and has very good small sample size behavior. For phenotypes with normal residuals, loss of power compared to parametric approaches (linear mixed models) was small for the quite general scenarios, which we simulated. For phenotypes with non-normal residuals, gain in power by the LNPT can be substantial. In contrast to parametric approaches, the LNPT is invariant with respect to monotone transformations of the trait. It is mathematically valid for arbitrary trait distribution.
Affiliation(s)
- D Malzahn
- Department of Genetic Epidemiology, University Medical Center, University of Goettingen, Goettingen, Germany
|
17
|
Abstract
Epidemiologic studies have clearly shown that air pollution is associated with a range of respiratory effects. Recent research has identified oxidative stress as a major biologic pathway underlying the toxic effect of air pollutants. Genetic susceptibility is likely to play a role in response to air pollution. Genes involved in oxidative stress and inflammatory pathways are logical candidates for the study of the interaction with air pollutants. In this article we use the example of asthma, a genetically complex disease, to address the issue of gene by environment interaction with air pollution. The majority of studies have focused on the genes GSTM1, GSTP1, NQO1, and TNF, but the inconsistency of the results prevents the drawing of firm conclusions. The limited sample size of most studies to date make them underpowered for the study of gene by gene interactions. Large consortia of studies with repeated measurements of environmental exposures and clear phenotypic assessments may help determine special environmental triggers and the window of susceptibility in the development of atopy and asthma. The role of gene by gene interactions and epigenetic mechanisms needs to be considered along with gene by environment interactions.
|
18
|
Moreno-Macias H, Romieu I, London SJ, Laird NM. Gene-environment interaction tests for family studies with quantitative phenotypes: A review and extension to longitudinal measures. Hum Genomics 2010; 4:302-26. [PMID: 20650819 PMCID: PMC2952941 DOI: 10.1186/1479-7364-4-5-302]
Abstract
Longitudinal studies are an important tool for analysing traits that change over time, depending on individual characteristics and environmental exposures. Complex quantitative traits, such as lung function, may change over time and appear to depend on genetic and environmental factors, as well as on potential gene-environment interactions. There is a growing interest in modelling both marginal genetic effects and gene-environment interactions. In an admixed population, the use of traditional statistical models may fail to adjust for confounding by ethnicity, leading to bias in the genetic effect estimates. A variety of methods have been developed to account for the genetic substructure of human populations. Family-based designs provide an important resource for avoiding confounding due to admixture. To date, however, most genetic analyses have been applied to cross-sectional designs. In this paper, we propose a methodology which aims to improve the assessment of main genetic effect and gene-environment interaction effects by combining the advantages of both longitudinal studies for continuous phenotypes, and the family-based designs. This approach is based on an extension of ordinary linear mixed models for quantitative phenotypes, which incorporates information from a case-parent design. Our results indicate that use of this method allows both main genetic and gene-environment interaction effects to be estimated without bias, even in the presence of population substructure.
|
19
|
Salem RM, O'Connor DT, Schork NJ. Curve-based multivariate distance matrix regression analysis: application to genetic association analyses involving repeated measures. Physiol Genomics 2010; 42:236-47. [PMID: 20423962 PMCID: PMC3032281 DOI: 10.1152/physiolgenomics.00118.2009] [Received: 07/21/2009] [Accepted: 04/21/2010]
Abstract
Most, if not all, human phenotypes exhibit a temporal, dosage-dependent, or age effect. Despite this fact, it is rare that data are collected over time or in sequence in relevant studies of the determinants of these phenotypes. The costs and organizational sophistication necessary to collect repeated measurements or longitudinal data for a given phenotype are clearly impediments to this, but greater efforts in this area are needed if insights into human phenotypic expression are to be obtained. Appropriate data analysis methods for genetic association studies involving repeated or longitudinal measures are also needed. We consider the use of longitudinal profiles obtained from fitted functions on repeated data collections from a set of individuals whose similarities are contrasted between sets of individuals with different genotypes to test hypotheses about genetic influences on time-dependent phenotype expression. The proposed approach can accommodate uncertainty of the fitted functions, as well as weighting factors across the time points, and is easily extended to a wide variety of complex analysis settings. We showcase the proposed approach with data from a clinical study investigating human blood vessel response to tyramine. We also compare the proposed approach with standard analytic procedures and investigate its robustness and power via simulation studies. The proposed approach is found to be quite flexible and performs either as well or better than traditional statistical methods.
|
20
|
Kerner B, North KE, Fallin MD. Use of longitudinal data in genetic studies in the genome-wide association studies era: summary of Group 14. Genet Epidemiol 2010; 33 Suppl 1:S93-8. [PMID: 19924713 DOI: 10.1002/gepi.20479]
Abstract
Participants analyzed actual and simulated longitudinal data from the Framingham Heart Study for various metabolic and cardiovascular traits. The genetic information incorporated into these investigations ranged from selected single-nucleotide polymorphisms to genome-wide association arrays. Genotypes were incorporated using a broad range of methodological approaches including conditional logistic regression, linear mixed models, generalized estimating equations, linear growth curve estimation, growth modeling, growth mixture modeling, population attributable risk fraction based on survival functions under the proportional hazards models, and multivariate adaptive splines for the analysis of longitudinal data. The specific scientific questions addressed by these different approaches also varied, ranging from a more precise definition of the phenotype, bias reduction in control selection, estimation of effect sizes and genotype associated risk, to direct incorporation of genetic data into longitudinal modeling approaches and the exploration of population heterogeneity with regard to longitudinal trajectories. The group reached several overall conclusions: (1) The additional information provided by longitudinal data may be useful in genetic analyses. (2) The precision of the phenotype definition as well as control selection in nested designs may be improved, especially if traits demonstrate a trend over time or have strong age-of-onset effects. (3) Analyzing genetic data stratified for high-risk subgroups defined by a unique development over time could be useful for the detection of rare mutations in common multifactorial diseases. (4) Estimation of the population impact of genomic risk variants could be more precise. The challenges and computational complexity demanded by genome-wide single-nucleotide polymorphism data were also discussed.
Affiliation(s)
- Berit Kerner
- Department of Psychiatry, University of California, Los Angeles, California, USA
|
21
|
Bennett DA, De Jager PL, Leurgans SE, Schneider JA. Neuropathologic intermediate phenotypes enhance association to Alzheimer susceptibility alleles. Neurology 2009; 72:1495-503. [PMID: 19398704 PMCID: PMC2677477 DOI: 10.1212/wnl.0b013e3181a2e87d]
Abstract
OBJECTIVE The identification of susceptibility alleles to risk of Alzheimer disease (AD) is a major public health priority. Using apolipoprotein E genotype (APOE), we examined whether neuropathologic intermediate phenotypes, the pathology underlying clinical AD that presumably lies intermediate in the causal chain, would increase power for genetic associations. METHODS More than 700 older persons underwent annual evaluation and organ donation as part of the Religious Orders Study or Rush Memory and Aging Project. A total of 536 autopsied persons with clinical AD or without dementia with APOE genotyping and a quantitative measure of AD pathology were analyzed. Regression analyses were used to examine the relation of APOE to clinical AD, to the level of cognitive function proximate to death, and to measures of AD neuropathology. RESULTS APOE ε4 was associated with increased odds of clinical AD (p = 3 × 10⁻⁷), and its association with level of cognition was stronger (p = 8 × 10⁻¹²). However, the use of quantitative measures of AD pathology markedly enhanced the association (p = 9 × 10⁻²⁴). The APOE ε2 was not associated with either AD (p = 0.69) or level of cognition (p = 0.82). However, its association with AD pathology (p = 1 × 10⁻⁵) was sufficiently strong that it would have warranted follow-up if discovered in a genome-wide association study. Power calculations demonstrate that a sample size of 625 subjects with our measure of AD pathology would be required to meet genome-wide significance of p = 5 × 10⁻⁸ for ε2. CONCLUSION Discovery efforts for susceptibility loci for Alzheimer disease could benefit from the use of neuropathologic intermediate phenotypes as a complement to other approaches.
Affiliation(s)
- David A. Bennett
- From Rush Alzheimer's Disease Center and the Department of Neurological Sciences (D.A.B., S.E.L., J.A.S.) and Department of Pathology (Neuropathology) (J.A.S.), Rush University Medical Center, Chicago, IL; Center for Neurologic Diseases (P.L.D.), Department of Neurology, Brigham & Women's Hospital and Harvard Medical School, Boston; Partners Center for Personalized Genetic Medicine Boston (P.L.D.); and Program in Medical & Population Genetics (P.L.D.), Broad Institute of Harvard University and Massachusetts Institute of Technology, Cambridge, MA.
- Philip L. De Jager
- From Rush Alzheimer's Disease Center and the Department of Neurological Sciences (D.A.B., S.E.L., J.A.S.) and Department of Pathology (Neuropathology) (J.A.S.), Rush University Medical Center, Chicago, IL; Center for Neurologic Diseases (P.L.D.), Department of Neurology, Brigham & Women's Hospital and Harvard Medical School, Boston; Partners Center for Personalized Genetic Medicine Boston (P.L.D.); and Program in Medical & Population Genetics (P.L.D.), Broad Institute of Harvard University and Massachusetts Institute of Technology, Cambridge, MA.
- Sue E. Leurgans
- From Rush Alzheimer's Disease Center and the Department of Neurological Sciences (D.A.B., S.E.L., J.A.S.) and Department of Pathology (Neuropathology) (J.A.S.), Rush University Medical Center, Chicago, IL; Center for Neurologic Diseases (P.L.D.), Department of Neurology, Brigham & Women's Hospital and Harvard Medical School, Boston; Partners Center for Personalized Genetic Medicine Boston (P.L.D.); and Program in Medical & Population Genetics (P.L.D.), Broad Institute of Harvard University and Massachusetts Institute of Technology, Cambridge, MA.
- Julie A. Schneider
- From Rush Alzheimer's Disease Center and the Department of Neurological Sciences (D.A.B., S.E.L., J.A.S.) and Department of Pathology (Neuropathology) (J.A.S.), Rush University Medical Center, Chicago, IL; Center for Neurologic Diseases (P.L.D.), Department of Neurology, Brigham & Women's Hospital and Harvard Medical School, Boston; Partners Center for Personalized Genetic Medicine Boston (P.L.D.); and Program in Medical & Population Genetics (P.L.D.), Broad Institute of Harvard University and Massachusetts Institute of Technology, Cambridge, MA.
|
22
|
Liang L, Chen WM, Sham PC, Abecasis GR. Variance components linkage analysis with repeated measurements. Hum Hered 2009; 67:237-47. [PMID: 19172083 DOI: 10.1159/000194977] [Received: 03/10/2008] [Accepted: 08/05/2008]
Abstract
BACKGROUND When subjects are measured multiple times, linkage analysis needs to appropriately model these repeated measures. A number of methods have been proposed to model repeated measures in linkage analysis. Here, we focus on assessing the impact of repeated measures on the power and cost of a linkage study. METHODS We describe three alternative extensions of the variance components approach to accommodate repeated measures in a quantitative trait linkage study. We explicitly relate power and cost through the number of measures for different designs. Based on these models, we derive general formulas for optimal number of repeated measures for a given power or cost and use analytical calculations and simulations to compare power for different numbers of repeated measures across several scenarios. We give rigorous proof for the results under the balanced design. RESULTS Repeated measures substantially improve power and the proportional increase in LOD score depends mostly on measurement error and total heritability but not much on marker map, the number of alleles per marker or family structure. When measurement error takes up 20% of the trait variability and 4 measures/subject are taken, the proportional increase in LOD score ranges from 38% for traits with heritability of approximately 20% to 63% for traits with heritability of approximately 80%. An R package is provided to determine optimal number of repeated measures for given measurement error and cost. Variance component and regression based implementations of our methods are included in the MERLIN package to facilitate their use in practical studies.
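As a rough illustration of why averaging repeated measures helps, consider the following worked calculation. It is a simplified sketch, not the authors' derivation (which concerns LOD scores under a variance components linkage model), and it does not reproduce the 38% to 63% gains reported above. It assumes that the total single-measure trait variance is scaled to 1, that a fraction $e$ of it is measurement error independent across occasions, and that $h^2$ is heritability relative to that total; the symbols $e$, $k$, and $\bar{y}_k$ are introduced here for illustration only.

```latex
% Variance of the mean of k measures, and heritability of that mean
\operatorname{Var}(\bar{y}_k) = (1 - e) + \frac{e}{k},
\qquad
h^2_{\bar{y}_k} = \frac{h^2}{(1 - e) + e/k}

% Example with e = 0.2 and k = 4:
%   Var(\bar{y}_4) = 0.8 + 0.05 = 0.85
%   h^2 = 0.20  ->  0.20 / 0.85 \approx 0.235
%   h^2 = 0.80  ->  0.80 / 0.85 \approx 0.941
```

Under these assumptions, taking 4 measures removes three quarters of the measurement-error variance, so the genetic component makes up a larger share of the averaged phenotype.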
Affiliation(s)
- Liming Liang
- Department of Biostatistics, University of Michigan, Ann Arbor, Mich., USA.
|
23
|
Influence of interleukin 1alpha (IL-1alpha), IL-4, and IL-6 polymorphisms on genetic susceptibility to chronic osteomyelitis. Clin Vaccine Immunol 2008; 15:1888-90. [PMID: 18971305 DOI: 10.1128/cvi.00209-08]
Abstract
The association between cytokine gene polymorphisms and chronic osteomyelitis was investigated in order to determine whether genetic variability in cytokine genes predisposes to osteomyelitis susceptibility. Significant genotypic and allelic associations were observed between interleukin 1alpha (IL-1alpha) -889-C/T, IL-4 -1098-G/T and -590-C/T, and IL-6 -174-G/C polymorphisms and osteomyelitis in the Greek population, pointing towards their potential involvement in osteomyelitis pathogenesis.
|
24
|
Yang Q, Kim SK, Sun F, Cui J, Larson MG, Vasan RS, Levy D, Schwartz F. Maternal influence on blood pressure suggests involvement of mitochondrial DNA in the pathogenesis of hypertension: the Framingham Heart Study. J Hypertens 2007; 25:2067-73. [PMID: 17885549 DOI: 10.1097/hjh.0b013e328285a36e]
Abstract
OBJECTIVE To investigate the contribution of the mitochondrial genome to hypertension and quantitative blood pressure (BP) phenotypes in the Framingham Heart Study cohort, a randomly ascertained, community-based sample. METHODS Longitudinal BP values of 6421 participants (mean age, 53 years; 46% men) from 1593 extended families were used for analyses. In analyses of BP as a continuous trait, a variance components model with a variance component for maternal effects was used to estimate the mitochondrial heritability of the long-term average BP adjusted for age, sex, body mass index, and hypertension treatment. For analyses of BP as a categorical trait, a nonparametric test sensitive to excessive maternal inheritance was used to test for mitochondrial effect on long-term hypertension, defined as systolic BP of at least 140 mmHg or diastolic BP of at least 90 mmHg or use of antihypertensive medication in one-half or more of qualifying examinations. This test was based on 353 pedigrees comprised of 403 individuals informative for mitochondrial DNA contribution. RESULTS The estimated fraction of hypertensive pedigrees potentially due to mitochondrial effects was 35.2% (95% confidence interval, 27-43%, P < 10). The mitochondrial heritabilities for multivariable-adjusted long-term average systolic BP and diastolic BP were, respectively, 5% (P < 0.02) and 4% (P = 0.11). CONCLUSION Our data provide support for a maternal effect on hypertension status and quantitative systolic BP, consistent with mitochondrial influence. Additional studies are warranted to identify mitochondrial DNA variant(s) affecting BP.
Affiliation(s)
- Qiong Yang
- Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts, USA
|
25
|
Zhang H, Zhong X. Linkage analysis of longitudinal data and design consideration. BMC Genet 2006; 7:37. [PMID: 16768806 PMCID: PMC1550417 DOI: 10.1186/1471-2156-7-37] [Received: 04/25/2006] [Accepted: 06/12/2006]
Abstract
BACKGROUND Statistical methods have been proposed recently to analyze longitudinal data in genetic studies. So far, little attention has been paid to examine the relationship among key factors in genetic longitudinal studies including power, the number of families or sibships, and the number of repeated measures per individual subjects. RESULTS We proposed a variance component model that extends classic variance component models for a single quantitative trait to mapping longitudinal traits. Our model includes covariate effects and allows genetic effects to vary over time. Using our proposed model, we examined the power, pedigree structures, and sample size through simulation experiments. CONCLUSION Our simulation results provide useful insights into the study design for genetic, longitudinal studies. For example, collecting a small number of large sibships is much more powerful than collecting a large number of small sibships or increasing the number of repeated measures, when the total number of measurements is comparable.
Affiliation(s)
- Heping Zhang
- Department of Epidemiology and Public Health, Yale University School of Medicine, 60 College Street, New Haven, CT 06520-8034, USA
- Xiaoyun Zhong
- Department of Medicine, University of Massachusetts Medical School, 55 Lake Avenue North, Worcester, MA 01655, USA
|
26
|
Won S, Elston RC, Park T. Extension of the Haseman-Elston regression model to longitudinal data. Hum Hered 2006; 61:111-9. [PMID: 16733364 DOI: 10.1159/000093519] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/17/2006] [Accepted: 03/16/2006] [Indexed: 11/19/2022] Open
Abstract
We propose an extension to longitudinal data of the Haseman and Elston regression method for linkage analysis. The proposed model is a mixed model having several random effects. As response variable, we investigate the sibship sample mean corrected cross-product (smHE) and the BLUP-mean corrected cross product (pmHE), comparing them with the original squared difference (oHE), the overall mean corrected cross-product (rHE), and the weighted average of the squared difference and the squared mean-corrected sum (wHE). The proposed model allows for the correlation structure of longitudinal data. Also, the model can test for gene x time interaction to discover genetic variation over time. The model was applied in an analysis of the Genetic Analysis Workshop 13 (GAW13) simulated dataset for a quantitative trait simulating systolic blood pressure. Independence models did not preserve the test sizes, while the mixed models with both family and sibpair random effects tended to preserve size well.
Affiliation(s)
- Sungho Won
- Department of Epidemiology and Biostatistics, Case Western Reserve University, USA
|
27
|
|
28
|
Macgregor S, Knott SA, White I, Visscher PM. Quantitative trait locus analysis of longitudinal quantitative trait data in complex pedigrees. Genetics 2005; 171:1365-76. [PMID: 16020786 PMCID: PMC1456837 DOI: 10.1534/genetics.105.043828] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
There is currently considerable interest in genetic analysis of quantitative traits such as blood pressure and body mass index. Despite the fact that these traits change throughout life they are commonly analyzed only at a single time point. The genetic basis of such traits can be better understood by collecting and effectively analyzing longitudinal data. Analyses of these data are complicated by the need to incorporate information from complex pedigree structures and genetic markers. We propose conducting longitudinal quantitative trait locus (QTL) analyses on such data sets by using a flexible random regression estimation technique. The relationship between genetic effects at different ages is efficiently modeled using covariance functions (CFs). Using simulated data we show that the change in genetic effects over time can be well characterized using CFs and that including parameters to model the change in effect with age can provide substantial increases in power to detect QTL compared with repeated measure or univariate techniques. The asymptotic distributions of the methods used are investigated and methods for overcoming the practical difficulties in fitting CFs are discussed. The CF-based techniques should allow efficient multivariate analyses of many data sets in human and natural population genetics.
Affiliation(s)
- Stuart Macgregor
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom.
|
29
|
Yang Q, Chazaro I, Cui J, Guo CY, Demissie S, Larson M, Atwood LD, Cupples LA, DeStefano AL. Genetic analyses of longitudinal phenotype data: a comparison of univariate methods and a multivariate approach. BMC Genet 2003; 4 Suppl 1:S29. [PMID: 14975097 PMCID: PMC1866464 DOI: 10.1186/1471-2156-4-s1-s29] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/03/2022] Open
Abstract
Background We explored three approaches to heritability and linkage analyses of longitudinal total cholesterol levels (CHOL) in the Genetic Analysis Workshop 13 simulated data without knowing the answers. The first two were univariate approaches and used 1) baseline measure at exam one or 2) summary measures such as mean and slope from multiple exams. The third method was a multivariate approach that directly models multiple measurements on a subject. A variance components model (SOLAR) was employed in the univariate approaches. A mixed regression model with polynomials was employed in the multivariate approach and implemented in SAS/IML. Results Using the baseline measure at exam 1, we detected all baseline or slope genes contributing a substantial amount (0.08) of variance (LOD > 3). Compared to the baseline measure, the mean measures yielded slightly higher LOD at the slope genes, and a lower LOD at the baseline genes. The slope measure produced a somewhat lower LOD for the slope gene than did the mean measure. Descriptive information on the pattern of changes in gene effects with age was estimated for three linked loci by the third approach. Conclusion We found simple univariate methods may be effective to detect genes affecting longitudinal phenotypes but may not fully reveal temporal trends in gene effects. The relative efficiency of the univariate methods to detect genes depends heavily on the underlying model. Compared with the univariate approaches, the multivariate approach provided more information on temporal trends in gene effects at the cost of more complicated modelling and more intense computations.
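The "summary measures such as mean and slope from multiple exams" can be illustrated with a small sketch. This is not the authors' SOLAR or SAS/IML code; it assumes the slope is an ordinary least-squares slope of the trait on age for one subject, and the function name `summary_phenotypes` is hypothetical.

```python
from typing import Sequence, Tuple


def summary_phenotypes(ages: Sequence[float],
                       values: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, slope) of one subject's repeated measurements.

    The slope is the least-squares slope of the trait on age
    (an assumption; the abstract does not specify the estimator).
    """
    n = len(ages)
    if n < 2 or n != len(values):
        raise ValueError("need at least two paired (age, value) observations")
    mean_age = sum(ages) / n
    mean_val = sum(values) / n
    sxx = sum((a - mean_age) ** 2 for a in ages)
    if sxx == 0:
        raise ValueError("ages must not all be identical")
    sxy = sum((a - mean_age) * (v - mean_val) for a, v in zip(ages, values))
    return mean_val, sxy / sxx


# Example: cholesterol measured at three exams ten years apart.
mean_chol, slope_chol = summary_phenotypes([40, 50, 60], [200, 210, 220])
```

Each subject's mean and slope would then serve as a single quantitative phenotype in a univariate heritability or linkage analysis.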
Affiliation(s)
- Qiong Yang
- Departments of Biostatistics, Boston University, Boston, Massachusetts, USA
- Departments of Neurology, Boston University, Boston, Massachusetts, USA
- Irmarie Chazaro
- Departments of Biostatistics, Boston University, Boston, Massachusetts, USA
- Departments of Mathematics and Statistics, Boston University, Boston, Massachusetts, USA
- Jing Cui
- Departments of Medicine, Boston University, Boston, Massachusetts, USA
- Chao-Yu Guo
- Departments of Biostatistics, Boston University, Boston, Massachusetts, USA
- Serkalem Demissie
- Departments of Biostatistics, Boston University, Boston, Massachusetts, USA
- Martin Larson
- Departments of Mathematics and Statistics, Boston University, Boston, Massachusetts, USA
- Larry D Atwood
- Departments of Biostatistics, Boston University, Boston, Massachusetts, USA
- Departments of Neurology, Boston University, Boston, Massachusetts, USA
- L Adrienne Cupples
- Departments of Biostatistics, Boston University, Boston, Massachusetts, USA
- Anita L DeStefano
- Departments of Biostatistics, Boston University, Boston, Massachusetts, USA
- Departments of Neurology, Boston University, Boston, Massachusetts, USA
|
30
|
Scurrah KJ, Tobin MD, Burton PR. Longitudinal variance components models for systolic blood pressure, fitted using Gibbs sampling. BMC Genet 2003; 4 Suppl 1:S25. [PMID: 14975093 PMCID: PMC1866460 DOI: 10.1186/1471-2156-4-s1-s25] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
This paper describes an analysis of systolic blood pressure (SBP) in the Genetic Analysis Workshop 13 (GAW13) simulated data. The main aim was to assess evidence for both general and specific genetic effects on the baseline blood pressure and on the rate of change (slope) of blood pressure with time. Generalized linear mixed models were fitted using Gibbs sampling in WinBUGS, and the additive polygenic random effects estimated using these models were then used as continuous phenotypes in a variance components linkage analysis. The first-stage analysis provided evidence for general genetic effects on both the baseline and slope of blood pressure, and the linkage analysis found evidence of several genes, again for both baseline and slope.
Affiliation(s)
- Katrina J Scurrah
- Institute of Genetics and Department of Epidemiology and Public Health, University of Leicester, 22-28 Princess Road, Leicester, LE1 6TP, United Kingdom
- Martin D Tobin
- Institute of Genetics and Department of Epidemiology and Public Health, University of Leicester, 22-28 Princess Road, Leicester, LE1 6TP, United Kingdom
- Paul R Burton
- Institute of Genetics and Department of Epidemiology and Public Health, University of Leicester, 22-28 Princess Road, Leicester, LE1 6TP, United Kingdom
|
31
|
Mirea L, Bull SB, Stafford J. Comparison of Haseman-Elston regression analyses using single, summary, and longitudinal measures of systolic blood pressure. BMC Genet 2003; 4 Suppl 1:S23. [PMID: 14975091 PMCID: PMC1866458 DOI: 10.1186/1471-2156-4-s1-s23] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open
Abstract
To compare different strategies for linkage analyses of longitudinal quantitative trait measures, we applied the "revisited" Haseman-Elston (RHE) regression model (the cross product of centered sib-pair trait values is regressed on expected identical-by-descent allele sharing) to cross-sectional, summary, and repeated measurements of systolic blood pressure (SBP) values in replicate 34, randomly selected from the Genetic Analysis Workshop 13 simulated data. RHE linkage scans were performed without knowledge of the generating model using the following phenotypes derived from untreated SBP measurements: the first, the last, the mean, the ratio of the change between the first and last over time, and the estimated linear regression slope coefficient. Estimates of allele sharing in sibling pairs were obtained from the complete genotype data of Cohorts 1 and 2, but linkage analyses were restricted to the five visits of Cohort 2 siblings. Evidence for linkage was suggestive (p < 0.001) at markers neighboring SBP genes Gb35, Gs10, and Gs12, but weaker signals (p < 0.01) were obtained at markers mapping close to Gb34 and Gs11. Linkage to baseline genes Gb34 and Gb35 was best detected using the first SBP measurement, whereas linkage to slope genes Gs10-12 was best detected using the last or mean SBP value. At markers on chromosomes 13 and 21 displaying strongest linkage signals, marginal RHE-type models including repeated SBP measures were fit to test for overall and time-dependent genetic effects. These analyses assumed independent sib pairs and employed generalized estimating equations (GEE) with a first-order autoregressive working correlation structure to adjust for serial correlation present among repeated observations from the same sibling pair.
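The parenthetical description of the RHE model (cross product of centered sib-pair trait values regressed on expected identical-by-descent sharing) can be sketched as a simple least-squares fit. This is a minimal illustration, not the authors' implementation: it assumes centering on the overall sample mean, a single trait value per sibling, and independent sib pairs, and it omits the GEE machinery used for repeated measures. The name `rhe_regression` is hypothetical.

```python
from typing import List, Sequence, Tuple


def rhe_regression(sib_pairs: Sequence[Tuple[float, float]],
                   ibd_sharing: Sequence[float]) -> Tuple[float, float]:
    """Regress centered sib-pair cross products on expected IBD sharing.

    sib_pairs   : (trait of sib 1, trait of sib 2) for each pair
    ibd_sharing : expected proportion of alleles shared IBD at the marker
    Returns (intercept, slope); a positive slope is the linkage signal.
    """
    if len(sib_pairs) != len(ibd_sharing) or len(sib_pairs) < 2:
        raise ValueError("need matching inputs with at least two sib pairs")
    # Center trait values on the overall sample mean (an assumption).
    all_values: List[float] = [v for pair in sib_pairs for v in pair]
    grand_mean = sum(all_values) / len(all_values)
    cross = [(a - grand_mean) * (b - grand_mean) for a, b in sib_pairs]

    n = len(cross)
    mean_pi = sum(ibd_sharing) / n
    mean_cp = sum(cross) / n
    sxx = sum((p - mean_pi) ** 2 for p in ibd_sharing)
    if sxx == 0:
        raise ValueError("IBD sharing must vary across sib pairs")
    sxy = sum((p - mean_pi) * (c - mean_cp)
              for p, c in zip(ibd_sharing, cross))
    slope = sxy / sxx
    intercept = mean_cp - slope * mean_pi
    return intercept, slope
```

In a linkage scan this fit would be repeated at each marker, with the slope tested one-sided against zero.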
Affiliation(s)
- Lucia Mirea
- Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Avenue, Toronto, Ontario, M5G 1X5, Canada
- Shelley B Bull
- Samuel Lunenfeld Research Institute, Mount Sinai Hospital, 600 University Avenue, Toronto, Ontario, M5G 1X5, Canada
- Department of Public Health Sciences, University of Toronto, Toronto, Ontario, M5S 1A8, Canada
- James Stafford
- Department of Public Health Sciences, University of Toronto, Toronto, Ontario, M5S 1A8, Canada
|