1. Chamberlin KW, Li C, Luo Z, D'Aloisio AA, Pinto JM, Sandler DP, Chen H. Use of nonsteroidal anti-inflammatory drugs and poor olfaction in women. Int Forum Allergy Rhinol 2024; 14:639-650. [PMID: 37548119 DOI: 10.1002/alr.23241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2023] [Revised: 07/06/2023] [Accepted: 07/17/2023] [Indexed: 08/08/2023]
Abstract
BACKGROUND It is unclear whether regular use of nonsteroidal anti-inflammatory drugs (NSAIDs) is associated with poor olfaction in older adults. METHODS We selected 4020 participants, aged 50 to 79 years in 2018, from 36,492 eligible participants in the National Institute of Environmental Health Sciences Sister Study, according to their self-reported olfaction status. Of these, 3406 women completed the 12-item Brief Smell Identification Test. We defined poor olfaction as a test score ≤9 in the primary analysis. We then estimated odds ratios (ORs) and 95% confidence intervals (CIs) from weighted logistic models, accounting for the study design, missing exposures/outcomes, and covariates. RESULTS Overall, NSAID use was not associated with poor olfaction. However, we found evidence for potential multiplicative interactions. Specifically, the OR comparing regular versus never use of aspirin was 1.8 (95% CI, 1.1-3.2) among women who had not regularly used nonaspirin NSAIDs, while the corresponding OR was 0.8 (95% CI, 0.5-1.2) among nonaspirin NSAID users (P for interaction = 0.016). Similar results were seen for ibuprofen alone versus ibuprofen with other NSAID use (P for interaction = 0.010). Among women using either drug alone, associations with poor olfaction increased with increasing duration and cumulative dose. Post hoc analyses showed that the interactions could not be readily explained by potential biases. Other NSAIDs were not associated with olfaction. CONCLUSION Long-term regular use of aspirin or ibuprofen was associated with poor olfaction among women who never regularly used other types of NSAIDs. These preliminary findings warrant independent confirmation.
Affiliation(s)
- Keran W Chamberlin
  - Department of Epidemiology and Biostatistics, Michigan State University College of Human Medicine, East Lansing, Michigan, USA
- Chenxi Li
  - Department of Epidemiology and Biostatistics, Michigan State University College of Human Medicine, East Lansing, Michigan, USA
- Zhehui Luo
  - Department of Epidemiology and Biostatistics, Michigan State University College of Human Medicine, East Lansing, Michigan, USA
- Aimee A D'Aloisio
  - Social & Scientific Systems, DLH Holdings Corporation, Durham, North Carolina, USA
- Jayant M Pinto
  - Department of Surgery, The University of Chicago, Chicago, Illinois, USA
- Dale P Sandler
  - Epidemiology Branch, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina, USA
- Honglei Chen
  - Department of Epidemiology and Biostatistics, Michigan State University College of Human Medicine, East Lansing, Michigan, USA
2. Zhong Y, Cook RJ, Yu A. Analysis of secondary failure time responses in studies with response-dependent sampling schemes. Stat Med 2023; 42:4763-4775. [PMID: 37643587 DOI: 10.1002/sim.9887] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2022] [Revised: 06/20/2023] [Accepted: 08/15/2023] [Indexed: 08/31/2023]
Abstract
Response-dependent sampling is routinely used as an enrichment strategy in the design of family studies investigating the heritable nature of disease. In addition to the response of primary interest, investigators often wish to investigate the association between biomarkers and secondary responses related to possible comorbidities. Statistical analysis regarding genetic biomarkers and their association with the secondary outcome must address the biased sampling scheme involving the primary response. In this article, we develop composite likelihoods and two-stage estimation procedures for such secondary analyses in which the within-family dependence structure for the primary and secondary outcomes is modeled via a Gaussian copula. The dependence among responses within family members is modeled based on kinship coefficients. Auxiliary data from independent individuals are exploited by augmenting the composite likelihoods to increase precision of marginal parameter estimates and enhance the efficiency of estimators of the dependence parameters. Simulation studies are carried out to evaluate the finite sample performance of the proposed method, and an application to a motivating family study in psoriatic arthritis is given for illustration.
Affiliation(s)
- Yujie Zhong
  - School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China
  - Oncology Statistics, R&D China AstraZeneca, Shanghai, China
- Richard J Cook
  - Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada
- Aiai Yu
  - School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China
3. Dapas M, Lee YL, Wentworth-Sheilds W, Im HK, Ober C, Schoettler N. Revealing polygenic pleiotropy using genetic risk scores for asthma. HGG Advances 2023; 4:100233. [PMID: 37663543 PMCID: PMC10474095 DOI: 10.1016/j.xhgg.2023.100233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Accepted: 08/11/2023] [Indexed: 09/05/2023] Open
Abstract
In this study we examined how genetic risk for asthma associates with different features of the disease and with other medical conditions and traits. Using summary statistics from two multi-ancestry genome-wide association studies of asthma, we modeled polygenic risk scores (PRSs) and validated their predictive performance in the UK Biobank. We then performed phenome-wide association studies of the asthma PRSs with 371 heritable traits in the UK Biobank. We identified 228 total significant associations across a variety of organ systems, including associations that varied by PRS model, sex, age of asthma onset, ancestry, and human leukocyte antigen region alleles. Our results highlight pervasive pleiotropy between asthma and numerous other traits and conditions and elucidate pathways that contribute to asthma and its comorbidities.
Affiliation(s)
- Matthew Dapas
  - Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Yu Lin Lee
  - Department of Human Genetics, University of Chicago, Chicago, IL, USA
  - Biological Sciences Collegiate Division, University of Chicago, Chicago, IL, USA
- Hae Kyung Im
  - Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
- Carole Ober
  - Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Nathan Schoettler
  - Section of Pulmonary and Critical Care Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
4. Mignogna G, Carey CE, Wedow R, Baya N, Cordioli M, Pirastu N, Bellocco R, Malerbi KF, Nivard MG, Neale BM, Walters RK, Ganna A. Patterns of item nonresponse behaviour to survey questionnaires are systematic and associated with genetic loci. Nat Hum Behav 2023; 7:1371-1387. [PMID: 37386106 PMCID: PMC10444625 DOI: 10.1038/s41562-023-01632-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/05/2022] [Accepted: 05/17/2023] [Indexed: 07/01/2023]
Abstract
Response to survey questionnaires is vital for social and behavioural research, and most analyses assume full and accurate response by participants. However, nonresponse is common and impedes proper interpretation and generalizability of results. We examined item nonresponse behaviour across 109 questionnaire items in the UK Biobank (N = 360,628). Phenotypic factor scores for two participant-selected nonresponse answers, 'Prefer not to answer' (PNA) and 'I don't know' (IDK), each predicted participant nonresponse in follow-up surveys (incremental pseudo-R² = 0.056), even when controlling for education and self-reported health (incremental pseudo-R² = 0.046). After performing genome-wide association studies of our factors, PNA and IDK were highly genetically correlated with one another (rg = 0.73 (s.e. = 0.03)) and with education (rg,PNA = -0.51 (s.e. = 0.03); rg,IDK = -0.38 (s.e. = 0.02)), health (rg,PNA = 0.51 (s.e. = 0.03); rg,IDK = 0.49 (s.e. = 0.02)) and income (rg,PNA = -0.57 (s.e. = 0.04); rg,IDK = -0.46 (s.e. = 0.02)), with additional unique genetic associations observed for both PNA and IDK (P < 5 × 10⁻⁸). We discuss how these associations may bias studies of traits correlated with item nonresponse and demonstrate how this bias may substantially affect genome-wide association studies. While the UK Biobank data are deidentified, we further protected participant privacy by avoiding exploring non-response behaviour to single questions, assuring that no information can be used to associate results with any particular respondents.
Affiliation(s)
- Gianmarco Mignogna
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
  - Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
  - Department of Statistics and Quantitative Methods, University of Milano-Bicocca, Milan, Italy
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Caitlin E Carey
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Robbee Wedow
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Sociology, Purdue University, West Lafayette, IN, USA
  - Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA
  - AnalytiXIN (Analytics Indiana), Indianapolis, IN, USA
  - Department of Statistics, Purdue University, West Lafayette, IN, USA
- Nikolas Baya
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Mattia Cordioli
  - Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
- Nicola Pirastu
  - Centre for Global Health Research, Usher Institute, University of Edinburgh, Edinburgh, Scotland
  - Fondazione Human Technopole, Viale Rita Levi-Montalcini, Milan, Italy
- Rino Bellocco
  - Department of Statistics and Quantitative Methods, University of Milano-Bicocca, Milan, Italy
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Michel G Nivard
  - Department of Biological Psychiatry, Faculty of Behavioural and Movement Sciences, Vrije Universiteit, Amsterdam, the Netherlands
  - Methodology Program, Amsterdam Public Health, Amsterdam, the Netherlands
  - Amsterdam Neuroscience - Mood, Anxiety, Psychosis, Stress and Sleep, Amsterdam, the Netherlands
- Benjamin M Neale
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Novo Nordisk Foundation for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Raymond K Walters
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Andrea Ganna
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
  - Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
5. Ratanatharathorn A, Roberts AL, Chibnik LB, Choi KW, De Vivo I, Kim Y, Nishimi K, Rimm EB, Sumner JA, Kubzansky LD, Koenen KC. Posttraumatic Stress Disorder, Depression, and Accelerated Aging: Leukocyte Telomere Length in the Nurses' Health Study II. Biological Psychiatry Global Open Science 2023; 3:510-518. [PMID: 37519465 PMCID: PMC10382693 DOI: 10.1016/j.bpsgos.2022.05.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2021] [Revised: 05/10/2022] [Accepted: 05/14/2022] [Indexed: 10/18/2022] Open
Abstract
Background Exposure to trauma, posttraumatic stress disorder (PTSD), and depression have been independently associated with leukocyte telomere length (LTL), a cellular marker of aging associated with mortality and age-related diseases. However, the joint contributions of trauma and its psychological sequelae to LTL have not been examined. Methods We conducted an analysis of LTL in a subset of women from the Nurses' Health Study II (N = 1868). Lifetime exposure to traumatic events, PTSD, and depression was assessed with validated measures. DNA was extracted from peripheral blood leukocytes, and the ratio of telomere repeat copy number to single gene copy number was determined by quantitative real-time polymerase chain reaction telomere assay. Linear regression models assessed the association of trauma, PTSD, and depression with LTL after adjustment for health behaviors and medical conditions. Results Trauma, PTSD, and depression were not independently associated with LTL in mutually adjusted models. However, individuals with severe psychological distress (characterized by comorbid PTSD and depression) had shorter LTL, equivalent to being 7.62 years older (95% CI, 0.02 to 17.97), than participants who had never experienced a traumatic event and were not depressed. Further examination found an association only among individuals with the highest number of PTSD symptoms and comorbid depression, equivalent to 9.71 additional years of aging (95% CI, 1.36 to 20.49). No effect was found among individuals meeting the minimum threshold for probable PTSD with comorbid depression. Conclusions Severe psychological distress, as indicated by the presence of comorbid PTSD and depression, may be associated with shorter LTL.
Affiliation(s)
- Andrew Ratanatharathorn
  - Department of Epidemiology, Columbia University Mailman School of Public Health, New York, New York
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
- Andrea L. Roberts
  - Environmental Health, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
- Lori B. Chibnik
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
  - Department of Neurology, Massachusetts General Hospital, Boston, Massachusetts
- Karmel W. Choi
  - Department of Psychiatry, Massachusetts General Hospital, Boston, Massachusetts
- Immaculata De Vivo
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
  - Channing Division of Network Medicine, Brigham and Women's Hospital - Harvard Medical School, Boston, Massachusetts
- Yongjoo Kim
  - College of Korean Medicine, Sangji University, Wonju, Republic of Korea
- Kristen Nishimi
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
  - Mental Health Service, San Francisco Veterans Affairs Health Care System, San Francisco, California
  - Department of Psychiatry, University of California San Francisco, San Francisco, California
- Eric B. Rimm
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
- Jennifer A. Sumner
  - Department of Psychology, University of California, Los Angeles, Los Angeles, California
- Laura D. Kubzansky
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
  - Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
- Karestan C. Koenen
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
6. Yoo S, Garg E, Elliott LT, Hung RJ, Halevy AR, Brooks JD, Bull SB, Gagnon F, Greenwood C, Lawless JF, Paterson AD, Sun L, Zawati MH, Lerner-Ellis J, Abraham R, Birol I, Bourque G, Garant JM, Gosselin C, Li J, Whitney J, Thiruvahindrapuram B, Herbrick JA, Lorenti M, Reuter MS, Adeoye OO, Liu S, Allen U, Bernier FP, Biggs CM, Cheung AM, Cowan J, Herridge M, Maslove DM, Modi BP, Mooser V, Morris SK, Ostrowski M, Parekh RS, Pfeffer G, Suchowersky O, Taher J, Upton J, Warren RL, Yeung R, Aziz N, Turvey SE, Knoppers BM, Lathrop M, Jones S, Scherer SW, Strug LJ. HostSeq: a Canadian whole genome sequencing and clinical data resource. BMC Genom Data 2023; 24:26. [PMID: 37131148 PMCID: PMC10152008 DOI: 10.1186/s12863-023-01128-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Accepted: 02/22/2023] [Indexed: 05/04/2023] Open
Abstract
HostSeq was launched in April 2020 as a national initiative to integrate whole genome sequencing data from 10,000 Canadians infected with SARS-CoV-2 with clinical information related to their disease experience. The mandate of HostSeq is to support the Canadian and international research communities in their efforts to understand the risk factors for disease and associated health outcomes and support the development of interventions such as vaccines and therapeutics. HostSeq is a collaboration among 13 independent epidemiological studies of SARS-CoV-2 across five provinces in Canada. Aggregated data collected by HostSeq are made available to the public through two data portals: a phenotype portal showing summaries of major variables and their distributions, and a variant search portal enabling queries in a genomic region. Individual-level data is available to the global research community for health research through a Data Access Agreement and Data Access Compliance Office approval. Here we provide an overview of the collective project design along with summary level information for HostSeq. We highlight several statistical considerations for researchers using the HostSeq platform regarding data aggregation, sampling mechanism, covariate adjustment, and X chromosome analysis. In addition to serving as a rich data source, the diversity of study designs, sample sizes, and research objectives among the participating studies provides unique opportunities for the research community.
Affiliation(s)
- S Yoo
  - The Hospital for Sick Children, Toronto, ON, Canada
  - University of Ottawa, Ottawa, ON, Canada
- E Garg
  - Simon Fraser University, Burnaby, BC, Canada
- L T Elliott
  - Simon Fraser University, Burnaby, BC, Canada
- R J Hung
  - University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- A R Halevy
  - The Hospital for Sick Children, Toronto, ON, Canada
- J D Brooks
  - University of Toronto, Toronto, ON, Canada
- S B Bull
  - University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- F Gagnon
  - University of Toronto, Toronto, ON, Canada
- C M T Greenwood
  - McGill University, Montreal, QC, Canada
  - Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, QC, Canada
- J F Lawless
  - University of Waterloo, Waterloo, ON, Canada
- A D Paterson
  - The Hospital for Sick Children, Toronto, ON, Canada
  - University of Toronto, Toronto, ON, Canada
- L Sun
  - University of Toronto, Toronto, ON, Canada
- J Lerner-Ellis
  - University of Toronto, Toronto, ON, Canada
  - Sinai Health System, Toronto, ON, Canada
- R J S Abraham
  - Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, Canada
- I Birol
  - Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, Canada
- G Bourque
  - McGill University, Montreal, QC, Canada
- J-M Garant
  - Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, Canada
- C Gosselin
  - Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, Canada
- J Li
  - Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, Canada
- J Whitney
  - The Hospital for Sick Children, Toronto, ON, Canada
- J-A Herbrick
  - The Hospital for Sick Children, Toronto, ON, Canada
- M Lorenti
  - The Hospital for Sick Children, Toronto, ON, Canada
- M S Reuter
  - The Hospital for Sick Children, Toronto, ON, Canada
- O O Adeoye
  - The Hospital for Sick Children, Toronto, ON, Canada
- S Liu
  - The Hospital for Sick Children, Toronto, ON, Canada
- U Allen
  - The Hospital for Sick Children, Toronto, ON, Canada
  - University of Toronto, Toronto, ON, Canada
- F P Bernier
  - University of Calgary, Calgary, AB, Canada
  - Alberta Children's Hospital, Calgary, AB, Canada
- C M Biggs
  - University of British Columbia, Vancouver, BC, Canada
  - BC Children's Hospital, Vancouver, BC, Canada
  - St. Paul's Hospital, Vancouver, BC, Canada
- A M Cheung
  - University Health Network, Toronto, ON, Canada
- J Cowan
  - University of Ottawa, Ottawa, ON, Canada
  - The Ottawa Hospital Research Institute, Ottawa, ON, Canada
- M Herridge
  - University Health Network, Toronto, ON, Canada
- B P Modi
  - BC Children's Hospital, Vancouver, BC, Canada
- V Mooser
  - McGill University, Montreal, QC, Canada
- S K Morris
  - The Hospital for Sick Children, Toronto, ON, Canada
  - University of Toronto, Toronto, ON, Canada
- M Ostrowski
  - University of Toronto, Toronto, ON, Canada
  - St. Michael's Hospital, Unity Health, Toronto, ON, Canada
- R S Parekh
  - The Hospital for Sick Children, Toronto, ON, Canada
  - University of Toronto, Toronto, ON, Canada
  - Women's College Hospital, Toronto, ON, Canada
- G Pfeffer
  - University of Calgary, Calgary, AB, Canada
- J Taher
  - University of Toronto, Toronto, ON, Canada
  - Sinai Health System, Toronto, ON, Canada
- J Upton
  - The Hospital for Sick Children, Toronto, ON, Canada
  - University of Toronto, Toronto, ON, Canada
- R L Warren
  - Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, Canada
- R S M Yeung
  - The Hospital for Sick Children, Toronto, ON, Canada
  - University of Toronto, Toronto, ON, Canada
- N Aziz
  - The Hospital for Sick Children, Toronto, ON, Canada
- S E Turvey
  - University of British Columbia, Vancouver, BC, Canada
  - BC Children's Hospital, Vancouver, BC, Canada
- M Lathrop
  - McGill University, Montreal, QC, Canada
- S J M Jones
  - Canada's Michael Smith Genome Sciences Centre, Vancouver, BC, Canada
- S W Scherer
  - The Hospital for Sick Children, Toronto, ON, Canada
  - University of Toronto, Toronto, ON, Canada
- L J Strug
  - The Hospital for Sick Children, Toronto, ON, Canada
  - University of Toronto, Toronto, ON, Canada
7. Nudel R, Allesøe RL, Werge T, Thompson WK, Rasmussen S, Benros ME. An immunogenetic investigation of 30 autoimmune and autoinflammatory diseases and their links to psychiatric disorders in a nationwide sample. Immunology 2023; 168:622-639. [PMID: 36273265 DOI: 10.1111/imm.13597] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 06/02/2022] [Accepted: 10/20/2022] [Indexed: 11/05/2022] Open
Abstract
Autoimmune and autoinflammatory diseases (AIIDs) involve a deficit in an individual's immune system function, whereby the immune reaction is directed against self-antigens. Many AIIDs have a strong genetic component, but they can also be triggered by environmental factors. AIIDs often have a highly negative impact on the individual's physical and mental wellbeing. Understanding the genetic underpinning of AIIDs is thus crucial both for diagnosis and for identifying individuals at high risk of an AIID and mental illness as a result thereof. The aim of the present study was to perform systematic statistical and genetic analyses to assess the role of human leukocyte antigen (HLA) alleles in 30 AIIDs and to study the links between AIIDs and psychiatric disorders. We leveraged the Danish iPSYCH Consortium sample comprising 65 534 individuals diagnosed with psychiatric disorders or selected as part of a random population sample, for whom we also had genetic data and diagnoses of AIIDs. We employed regression analysis to examine comorbidities between AIIDs and psychiatric disorders and associations between AIIDs and HLA alleles across seven HLA genes. Our comorbidity analyses showed that overall AIID and five specific AIIDs were associated with having a psychiatric diagnosis. Our genetic analyses found 81 significant associations between HLA alleles and AIIDs. Lastly, we show connections across AIIDs, psychiatric disorders and infection susceptibility through network analysis of significant HLA associations in these disease classes. Combined, our results include both novel associations as well as replications of previously reported associations in a large sample, and highlight the genetic and epidemiological links between AIIDs and psychiatric disorders.
Affiliation(s)
- Ron Nudel
  - CORE-Copenhagen Research Centre for Mental Health, Mental Health Centre Copenhagen, Copenhagen University Hospital, Copenhagen, Denmark
  - iPSYCH, The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Aarhus, Denmark
- Rosa Lundbye Allesøe
  - CORE-Copenhagen Research Centre for Mental Health, Mental Health Centre Copenhagen, Copenhagen University Hospital, Copenhagen, Denmark
  - iPSYCH, The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Aarhus, Denmark
  - Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Thomas Werge
  - iPSYCH, The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Aarhus, Denmark
  - Institute of Biological Psychiatry, Mental Health Centre Sct. Hans, Mental Health Services Copenhagen, Roskilde, Denmark
  - Department of Clinical Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Wesley K Thompson
  - iPSYCH, The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Aarhus, Denmark
  - Institute of Biological Psychiatry, Mental Health Centre Sct. Hans, Mental Health Services Copenhagen, Roskilde, Denmark
  - Division of Biostatistics, Herbert Wertheim School of Public Health and Human Longevity Science, University of California, San Diego, California, USA
- Simon Rasmussen
  - Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Michael E Benros
  - CORE-Copenhagen Research Centre for Mental Health, Mental Health Centre Copenhagen, Copenhagen University Hospital, Copenhagen, Denmark
  - iPSYCH, The Lundbeck Foundation Initiative for Integrative Psychiatric Research, Aarhus, Denmark
  - Department of Immunology and Microbiology, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
8. Khorshid Shamshiri A, Alidoust M, Hemmati Nokandei M, Pasdar A, Afzaljavan F. Genetic architecture of mammographic density as a risk factor for breast cancer: a systematic review. Clinical & Translational Oncology: Official Publication of the Federation of Spanish Oncology Societies and of the National Cancer Institute of Mexico 2023; 25:1729-1747. [PMID: 36639603 DOI: 10.1007/s12094-022-03071-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Accepted: 12/30/2022] [Indexed: 01/15/2023]
Abstract
BACKGROUND Mammographic density (MD) is a potential risk marker that is influenced by genetic polymorphisms and can subsequently modulate the risk of breast cancer. This qualitative systematic review summarizes the genes and biological pathways involved in breast density and discusses the potential clinical implications in view of the genetic risk profile for breast density. METHODS Terms related to "Common genetic variations" and "Breast density" were searched in the Scopus, PubMed, and Web of Science databases. Gene pathway analysis and assessment of protein interactions were also performed. RESULTS Eighty-six studies, covering 111 genes, reported significant associations with mammographic density in different populations. ESR1, IGF1, IGFBP3, and ZNF365 were the most prevalent genes. Moreover, estrogen metabolism, signal transduction, and prolactin signaling pathways were significantly related to the associated genes. Mammographic density was an associated phenotype, and eight of the 111 genes, including COMT, CYP19A1, CYP1B1, ESR1, IGF1, IGFBP1, IGFBP3, and LSP1, were modifiers of this trait. CONCLUSION Genes involved in developmental processes and the evolution of secondary sexual traits play an important role in determining mammographic density. Because breast tissue density affects the risk of breast cancer, these genes may also be associated with breast cancer risk.
Affiliation(s)
- Asma Khorshid Shamshiri
  - Department of Medical Genetics and Molecular Medicine, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
  - Student Research Committee, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Maryam Alidoust
  - Department of Medical Genetics and Molecular Medicine, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
  - Student Research Committee, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Mahboubeh Hemmati Nokandei
  - Department of Medical Genetics and Molecular Medicine, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
  - Student Research Committee, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Alireza Pasdar
  - Department of Medical Genetics and Molecular Medicine, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
  - Division of Applied Medicine, Medical School, University of Aberdeen, Foresterhill, Aberdeen, AB25 2ZD, UK
- Fahimeh Afzaljavan
  - Clinical Research Development Unit, Faculty of Medicine, Imam Reza Hospital, Mashhad University of Medical Sciences, Mashhad, 917794-8564, Iran
9. Kartsonaki C, Cox DR. Regression Reconstruction from a Retrospective Sample. Econometrics and Statistics 2023; 25:87-92. [PMID: 36726747 PMCID: PMC9872473 DOI: 10.1016/j.ecosta.2020.10.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/14/2020] [Revised: 10/27/2020] [Accepted: 10/29/2020] [Indexed: 06/18/2023]
Abstract
The simplest form of retrospective study allows the reconstruction of the dependence between a binary outcome, Y, representing the contrast between cases and controls, and one or more explanatory variables. A different objective for such situations is considered, in which there are distinct explanatory variables, say (W, X), determining Y. Reconstruction of the originating distribution of (W, X) from the case-control data is considered for both continuous and binary variables. Emphasis is on the linear regression coefficient of W on X. That coefficient, but not the relevant intercept, shows considerable stability, as shown by theory and simulations. An approximation to the value of the coefficient not conditioning on Y is given.
Affiliation(s)
- Christiana Kartsonaki
- MRC Population Health Research Unit, Nuffield Department of Population Health, University of Oxford, Oxford OX3 7LF, UK
| | - D R Cox
- Nuffield College, Oxford OX1 1NF, UK
| |
|
10
|
Discerning asthma endotypes through comorbidity mapping. Nat Commun 2022; 13:6712. [PMID: 36344522 PMCID: PMC9640644 DOI: 10.1038/s41467-022-33628-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 09/01/2021] [Accepted: 09/27/2022] [Indexed: 11/09/2022] Open
Abstract
Asthma is a heterogeneous, complex syndrome, and identifying asthma endotypes has been challenging. We hypothesize that distinct endotypes of asthma arise in disparate genetic variation and life-time environmental exposure backgrounds, and that disease comorbidity patterns serve as a surrogate for such genetic and exposure variations. Here, we computationally discover 22 distinct comorbid disease patterns among individuals with asthma (asthma comorbidity subgroups) using diagnosis records for >151 M US residents, and re-identify 11 of the 22 subgroups in the much smaller UK Biobank. GWASs to discern asthma risk loci for individuals within each subgroup and in all subgroups combined reveal 109 independent risk loci, of which 52 are replicated in multi-ancestry meta-analysis across different ethnicity subsamples in UK Biobank, US BioVU, and BioBank Japan. Fourteen loci confer asthma risk in multiple subgroups and in all subgroups combined. Importantly, another six loci confer asthma risk in only one subgroup. The strength of association between asthma and each of 44 health-related phenotypes also varies dramatically across subgroups. This work reveals subpopulations of asthma patients distinguished by comorbidity patterns, asthma risk loci, gene expression, and health-related phenotypes, and so reveals different asthma endotypes.
|
11
|
Vasbinder A, Thompson H, Zaslavksy O, Heckbert SR, Saquib N, Shadyab AH, Chlebowski RT, Warsinger Martin L, Paskett ED, Reding KW. Inflammatory, Oxidative Stress, and Cardiac Damage Biomarkers and Radiation-Induced Fatigue in Breast Cancer Survivors. Biol Res Nurs 2022; 24:472-483. [PMID: 35527686 PMCID: PMC9630726 DOI: 10.1177/10998004221098113] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 10/03/2023]
Abstract
PURPOSE Studies examining biomarkers associated with fatigue in breast cancer survivors treated with radiation are limited. Therefore, we examined the longitudinal association between serum biomarkers and post-breast cancer fatigue in survivors treated with radiation: [oxidative stress] 8-hydroxyguanosine, myeloperoxidase; [inflammation] interleukin-6 (IL-6), C-reactive protein, growth differentiation factor-15 (GDF-15), placental growth factor, transforming growth factor-beta; [cardiac damage] cystatin-C, troponin-I. METHODS In a secondary analysis, we included participants from the Women's Health Initiative if they had: a previous breast cancer diagnosis (stages I-III), no prior cardiovascular diseases, pre- and post-breast cancer serum samples drawn approximately 3 years apart, and fatigue measured using the Short-Form 36 vitality subscale at both serum collections. Biomarkers were measured using ELISA or RT-qPCR and modeled as the log2 post- to pre-breast cancer ratio. RESULTS Overall, 180 women with a mean (SD) age of 67.0 (5.5) years were included. The mean (SD) vitality scores were 66.2 (17.2) and 59.7 (19.7) pre- and post-breast cancer, respectively. Using multivariable weighted linear regression, higher biomarker ratios of cystatin-C, IL-6, and GDF-15 were associated with a lower vitality score (i.e., higher fatigue). For example, for each 2-fold difference in cystatin-C biomarker ratio, the vitality score was lower by 7.31 points (95% CI: -14.2, -0.45). CONCLUSION Inflammatory and cardiac damage biomarkers are associated with fatigue in breast cancer survivors treated with radiation; however, these findings should be replicated in a larger sample. Biomarkers could be measured in clinical practice or assessed in risk prediction models to help identify patients at high risk for fatigue.
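The "per 2-fold difference" interpretation follows from modeling the biomarker as a log2 ratio: a one-unit change in log2(post/pre) corresponds to a doubling. A minimal sketch of that arithmetic is given below. The function names and the concentrations are hypothetical illustrations, not study data, and the calculation assumes the linear relationship holds across the range shown; only the coefficient of -7.31 comes from the abstract.

```python
import math


def log2_ratio(pre: float, post: float) -> float:
    """Return log2(post / pre); a value of +1 means the biomarker doubled."""
    if pre <= 0 or post <= 0:
        raise ValueError("biomarker concentrations must be positive")
    return math.log2(post / pre)


def expected_vitality_difference(pre: float, post: float, beta_per_doubling: float) -> float:
    """Linear-model difference in vitality score implied by a given pre/post ratio."""
    return beta_per_doubling * log2_ratio(pre, post)


# Hypothetical concentrations in arbitrary units (assumption, not study data).
doubling = log2_ratio(pre=1.0, post=2.0)

# -7.31 points per 2-fold difference is the cystatin-C estimate from the abstract.
# A 4-fold ratio is two doublings, so the implied difference is twice the coefficient.
four_fold_difference = expected_vitality_difference(pre=1.0, post=4.0, beta_per_doubling=-7.31)

print(doubling, four_fold_difference)
```

A ratio below 1 (the biomarker fell) gives a negative log2 value and therefore, under the same assumed linear model, a higher expected vitality score.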
Affiliation(s)
- Alexi Vasbinder
- Department of Biobehavioral Nursing and Health Informatics, School of Nursing, University of Washington, Seattle, WA, USA
| | - Hilaire Thompson
- Department of Biobehavioral Nursing and Health Informatics, School of Nursing, University of Washington, Seattle, WA, USA
| | - Oleg Zaslavksy
- Department of Biobehavioral Nursing and Health Informatics, School of Nursing, University of Washington, Seattle, WA, USA
| | - Susan R. Heckbert
- Department of Epidemiology, School of Public Health, University of Washington, Seattle, WA, USA
| | - Nazmus Saquib
- Research Unit, College of Medicine, Sulaiman AlRajhi University, Al Bukayriyah, Saudi Arabia
| | - Aladdin H. Shadyab
- Herbert Wertheim School of Public Health and Human Longevity Science, University of California, San Diego, CA, USA
| | - Rowan T. Chlebowski
- Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
| | - Lisa Warsinger Martin
- Division of Cardiology, School of Medicine and Health Sciences, George Washington University, Washington, DC, USA
| | - Electra D. Paskett
- Department of Medicine, Comprehensive Cancer Center, The Ohio State University, Columbus, OH, USA
| | - Kerryn W. Reding
- Department of Biobehavioral Nursing and Health Informatics, School of Nursing, University of Washington, Seattle, WA, USA
| |
|
12
|
Satten GA, Curtis SW, Solis-Lemus C, Leslie EJ, Epstein MP. Efficient estimation of indirect effects in case-control studies using a unified likelihood framework. Stat Med 2022; 41:2879-2893. [PMID: 35352841 PMCID: PMC9232910 DOI: 10.1002/sim.9390] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/26/2021] [Revised: 03/07/2022] [Accepted: 03/08/2022] [Indexed: 06/01/2024]
Abstract
Mediation models are a set of statistical techniques that investigate the mechanisms that produce an observed relationship between an exposure variable and an outcome variable in order to deduce the extent to which the relationship is influenced by intermediate mediator variables. For a case-control study, the most common mediation analysis strategy employs a counterfactual framework that permits estimation of indirect and direct effects on the odds ratio scale for dichotomous outcomes, assuming either binary or continuous mediators. While this framework has become an important tool for mediation analysis, we demonstrate that we can embed this approach in a unified likelihood framework for mediation analysis in case-control studies that leverages more features of the data (in particular, the relationship between exposure and mediator) to improve efficiency of indirect effect estimates. One important feature of our likelihood approach is that it naturally incorporates cases within the exposure-mediator model to improve efficiency. Our approach does not require knowledge of disease prevalence and can model confounders and exposure-mediator interactions, and is straightforward to implement in standard statistical software. We illustrate our approach using both simulated data and real data from a case-control genetic study of lung cancer.
Affiliation(s)
- Glen A. Satten
- Department of Gynecology and Obstetrics, Emory University, Atlanta, GA
| | | | - Claudia Solis-Lemus
- Department of Plant Pathology, Wisconsin Institute for Discovery, University of Wisconsin, Madison, WI
| | | | | |
|
13
|
Chen S, Zhang H. Analysis of parent‐of‐origin effects for secondary phenotypes using case–control mother–child pair data. Genet Epidemiol 2022; 46:430-445. [DOI: 10.1002/gepi.22463] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2022] [Revised: 03/28/2022] [Accepted: 04/20/2022] [Indexed: 11/10/2022]
Affiliation(s)
- Shuyue Chen
- School of Data Science University of Science and Technology of China Hefei Anhui P.R. China
| | - Hong Zhang
- Department of Statistics and Finance, School of Management University of Science and Technology of China Hefei Anhui P.R. China
| |
|
14
|
Cai S, Hartley A, Mahmoud O, Tilling K, Dudbridge F. Adjusting for collider bias in genetic association studies using instrumental variable methods. Genet Epidemiol 2022; 46:303-316. [PMID: 35583096 PMCID: PMC9544531 DOI: 10.1002/gepi.22455] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 02/01/2022] [Revised: 04/12/2022] [Accepted: 04/20/2022] [Indexed: 11/16/2022]
Abstract
Genome‐wide association studies have provided many genetic markers that can be used as instrumental variables to adjust for confounding in epidemiological studies. Recently, the principle has been applied to other forms of bias in observational studies, especially collider bias that arises when conditioning or stratifying on a variable that is associated with the outcome of interest. An important case is in studies of disease progression and survival. Here, we clarify the links between the genetic instrumental variable methods proposed for this problem and the established methods of Mendelian randomisation developed to account for confounding. We highlight the critical importance of weak instrument bias in this context and describe a corrected weighted least‐squares procedure as a simple approach to reduce this bias. We illustrate the range of available methods on two data examples. The first, waist–hip ratio adjusted for body‐mass index, entails statistical adjustment for a quantitative trait. The second, smoking cessation, is a stratified analysis conditional on having initiated smoking. In both cases, we find little effect of collider bias on the primary association results, but this may propagate into more substantial effects on further analyses such as polygenic risk scoring and Mendelian randomisation.
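Collider bias from stratification can be illustrated with a small simulation. The sketch below is not the correction procedure described in the abstract; it only shows how the bias arises. The variables `g` (a genetic score), `u` (an unmeasured factor), and `x` (the trait that is stratified on) are hypothetical, as are the effect sizes and the selection threshold. `g` and `u` are generated independently, both influence `x`, and restricting to individuals with high `x` induces a correlation between them.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate(n=20000, threshold=1.0, seed=1):
    """Return (correlation in everyone, correlation among those with x > threshold)."""
    rng = random.Random(seed)
    g = [rng.gauss(0, 1) for _ in range(n)]  # hypothetical genetic score
    u = [rng.gauss(0, 1) for _ in range(n)]  # hypothetical unmeasured factor
    # x is a collider: it depends on both g and u, plus independent noise.
    x = [gi + ui + rng.gauss(0, 1) for gi, ui in zip(g, u)]
    r_all = pearson(g, u)
    kept = [(gi, ui) for gi, ui, xi in zip(g, u, x) if xi > threshold]
    r_selected = pearson([k[0] for k in kept], [k[1] for k in kept])
    return r_all, r_selected


r_all, r_selected = simulate()
print(f"all individuals: r = {r_all:.3f}; selected on x: r = {r_selected:.3f}")
```

In the full sample the two variables are close to uncorrelated, while in the selected stratum they are negatively correlated, so any outcome influenced by `u` would appear associated with `g` in a stratified analysis.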
Affiliation(s)
- Siyang Cai
- Department of Health Sciences, University of Leicester, Leicester, UK
| | - April Hartley
- MRC Integrative Epidemiology Unit, University of Bristol, Bristol, UK
| | - Osama Mahmoud
- Department of Mathematical Sciences, University of Essex, Colchester, UK
| | - Kate Tilling
- MRC Integrative Epidemiology Unit, University of Bristol, Bristol, UK.,Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
| | - Frank Dudbridge
- Department of Health Sciences, University of Leicester, Leicester, UK
| |
|
15
|
Modeling Secondary Phenotypes Conditional on Genotypes in Case–Control Studies. STATS 2022. [DOI: 10.3390/stats5010014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] Open
Abstract
Traditional case–control genetic association studies examine relationships between case–control status and one or more covariates. It is becoming increasingly common to study secondary phenotypes and their association with the original covariates. The Orofacial Pain: Prospective Evaluation and Risk Assessment (OPPERA) project, a study of temporomandibular disorders (TMD), motivates this work. Numerous measures of interest are collected at enrollment, such as the number of comorbid pain conditions from which a participant suffers. Examining the potential genetic basis of these measures is of secondary interest. Assessing these associations is statistically challenging, as participants do not form a random sample from the population of interest. Standard methods may be biased and lack coverage and power. We propose a general method for the analysis of arbitrary phenotypes utilizing inverse probability weighting and bootstrapping for standard error estimation. The method may be applied to the complicated association tests used in next-generation sequencing studies, such as analyses of haplotypes with ambiguous phase. Simulation studies show that our method performs as well as competing methods when they are applicable and yields promising results for outcome types, such as time-to-event, to which other methods may not apply. The method is applied to the OPPERA baseline case–control genetic study.
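The general idea of inverse probability weighting with a bootstrap standard error can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the phenotype values and sampling probabilities are invented, the estimand is a simple weighted mean rather than a genetic association test, and the bootstrap resamples all participants together (a stratified resampling of cases and controls would be an equally reasonable choice).

```python
import random


def ipw_mean(values, weights):
    """Weighted mean, where each weight is 1 / (probability of being sampled)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def bootstrap_se(values, weights, n_boot=500, seed=0):
    """Standard error of the weighted mean from a nonparametric bootstrap."""
    rng = random.Random(seed)
    n = len(values)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample participants with replacement
        estimates.append(ipw_mean([values[i] for i in idx], [weights[i] for i in idx]))
    centre = sum(estimates) / n_boot
    variance = sum((e - centre) ** 2 for e in estimates) / (n_boot - 1)
    return variance ** 0.5


# Hypothetical secondary phenotype (e.g. a count of comorbid conditions).
case_values = [3, 4, 5, 4]      # cases, assumed sampled with probability 0.8
control_values = [1, 2, 1, 2]   # controls, assumed sampled with probability 0.05
values = case_values + control_values
weights = [1 / 0.8] * len(case_values) + [1 / 0.05] * len(control_values)

naive = sum(values) / len(values)   # ignores the oversampling of cases
weighted = ipw_mean(values, weights)
se = bootstrap_se(values, weights)

print(f"naive mean = {naive:.3f}, IPW mean = {weighted:.3f}, bootstrap SE = {se:.3f}")
```

Because cases are heavily oversampled in this toy design, the naive mean is pulled toward the case values, while the weighted mean moves back toward the control values that dominate the underlying population.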
|
16
|
Liu Y, Chen H, Heine J, Lindstrom S, Turman C, Warner ET, Winham SJ, Vachon CM, Tamimi RM, Kraft P, Jiang X. A genome-wide association study of mammographic texture variation. Breast Cancer Res 2022; 24:76. [PMCID: PMC9639267 DOI: 10.1186/s13058-022-01570-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/13/2022] [Accepted: 10/26/2022] [Indexed: 11/09/2022] Open
Abstract
Background Breast parenchymal texture features, including grayscale variation (V), capture the patterns of texture variation on a mammogram and are associated with breast cancer risk, independent of mammographic density (MD). However, our knowledge of the genetic basis of these texture features is limited. Methods We conducted a genome-wide association study of V in 7040 European-ancestry women. V assessments were generated from digitized film mammograms. We used linear regression to test the single-nucleotide polymorphism (SNP)-phenotype associations adjusting for age, body mass index (BMI), MD phenotypes, and the top four genetic principal components. We further calculated genetic correlations and performed SNP-set tests of V with MD, breast cancer risk, and other breast cancer risk factors. Results We identified three genome-wide significant loci associated with V: rs138141444 (6q24.1) in ECT2L, rs79670367 (8q24.22) in LINC01591, and rs113174754 (12q22) near PGAM1P5. The 6q24.1 and 8q24.22 loci have not previously been associated with MD phenotypes or breast cancer risk, while 12q22 is a known locus for both MD and breast cancer risk. Among known MD and breast cancer risk SNPs, we identified four variants that were associated with V at the Bonferroni-corrected thresholds accounting for the number of SNPs tested: rs335189 (5q23.2) in PRDM6, rs13256025 (8p21.2) in EBF2, rs11836164 (12p12.1) near SSPN, and rs17817449 (16q12.2) in FTO. We observed significant genetic correlations between V and mammographic dense area (rg = 0.79, P = 5.91 × 10⁻⁵), percent density (rg = 0.73, P = 1.00 × 10⁻⁴), and adult BMI (rg = −0.36, P = 3.88 × 10⁻⁷). Additional significant relationships were observed for non-dense area (z = −4.14, P = 3.42 × 10⁻⁵), estrogen receptor-positive breast cancer (z = 3.41, P = 6.41 × 10⁻⁴), and childhood body fatness (z = −4.91, P = 9.05 × 10⁻⁷) from the SNP-set tests.
Conclusions These findings provide new insights into the genetic basis of mammographic texture variation and their associations with MD, breast cancer risk, and other breast cancer risk factors. Supplementary Information The online version contains supplementary material available at 10.1186/s13058-022-01570-8.
Affiliation(s)
- Yuxi Liu
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Program in Genetic Epidemiology and Statistical Genetics, Harvard T.H. Chan School of Public Health, 655 Huntington Avenue, Building 2-249A, Boston, MA 02115, USA
| | - Hongjie Chen
- Department of Epidemiology, University of Washington, Seattle, WA, USA
| | - John Heine
- Division of Population Sciences, H. Lee Moffitt Cancer Center & Research Institute, Tampa, FL, USA
| | - Sara Lindstrom
- Department of Epidemiology, University of Washington, Seattle, WA, USA
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Constance Turman
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Erica T. Warner
- Clinical and Translational Epidemiology Unit, Department of Medicine, Mongan Institute, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
| | - Stacey J. Winham
- Biomedical Statistics and Informatics, Mayo Clinic, Rochester, MN, USA
| | - Celine M. Vachon
- Division of Epidemiology, Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA
| | - Rulla M. Tamimi
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA
| | - Peter Kraft
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Program in Genetic Epidemiology and Statistical Genetics, Harvard T.H. Chan School of Public Health, 655 Huntington Avenue, Building 2-249A, Boston, MA 02115, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Xia Jiang
- Department of Clinical Neuroscience, Center for Molecular Medicine, Karolinska Institutet, Visionsgatan 18, 171 77 Solna, Stockholm, Sweden
- West China School of Public Health and West China Fourth Hospital, Sichuan University, Chengdu, China
| |
|
17
|
Pirastu N, Cordioli M, Nandakumar P, Mignogna G, Abdellaoui A, Hollis B, Kanai M, Rajagopal VM, Parolo PDB, Baya N, Carey CE, Karjalainen J, Als TD, Van der Zee MD, Day FR, Ong KK, Morisaki T, de Geus E, Bellocco R, Okada Y, Børglum AD, Joshi P, Auton A, Hinds D, Neale BM, Walters RK, Nivard MG, Perry JRB, Ganna A. Genetic analyses identify widespread sex-differential participation bias. Nat Genet 2021; 53:663-671. [PMID: 33888908 DOI: 10.1101/2020.03.22.001453v1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/24/2020] [Accepted: 03/16/2021] [Indexed: 05/25/2023]
Abstract
Genetic association results are often interpreted with the assumption that study participation does not affect downstream analyses. Understanding the genetic basis of participation bias is challenging since it requires the genotypes of unseen individuals. Here we demonstrate that it is possible to estimate comparative biases by performing a genome-wide association study contrasting one subgroup versus another. For example, we showed that sex exhibits artifactual autosomal heritability in the presence of sex-differential participation bias. By performing a genome-wide association study of sex in approximately 3.3 million males and females, we identified over 158 autosomal loci spuriously associated with sex and highlighted complex traits underpinning differences in study participation between the sexes. For example, the body mass index-increasing allele at FTO was observed at higher frequency in males compared to females (odds ratio = 1.02, P = 4.4 × 10⁻³⁶). Finally, we demonstrated how these biases can potentially lead to incorrect inferences in downstream analyses and propose a conceptual framework for addressing such biases. Our findings highlight a new challenge that genetic studies may face as sample sizes continue to grow.
Affiliation(s)
- Nicola Pirastu
- Centre for Global Health Research, Usher Institute, University of Edinburgh, Edinburgh, UK
| | - Mattia Cordioli
- Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
| | | | - Gianmarco Mignogna
- Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
- Department of Statistics and Quantitative Methods, University of Milano Bicocca, Milan, Italy
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
| | - Abdel Abdellaoui
- Department of Psychiatry, Amsterdam Academic Medical Center, University of Amsterdam, Amsterdam, the Netherlands
| | - Benjamin Hollis
- MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK
- The Kennedy Institute of Rheumatology, University of Oxford, Oxford, UK
| | - Masahiro Kanai
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan
| | - Veera M Rajagopal
- Department of Biomedicine, Aarhus University, Aarhus, Denmark
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- Centre for Genomics and Personalized Medicine, Aarhus University, Aarhus, Denmark
- Centre for Integrative Sequencing, iSEQ, Aarhus University, Aarhus, Denmark
| | | | - Nikolas Baya
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Stanley Center for Psychiatric Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Caitlin E Carey
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Stanley Center for Psychiatric Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Juha Karjalainen
- Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Thomas D Als
- Department of Biomedicine, Aarhus University, Aarhus, Denmark
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- Centre for Genomics and Personalized Medicine, Aarhus University, Aarhus, Denmark
- Centre for Integrative Sequencing, iSEQ, Aarhus University, Aarhus, Denmark
| | - Matthijs D Van der Zee
- Faculty of Behavioural and Movement Sciences, Biological Psychology, Vrije Universiteit, Amsterdam, the Netherlands
| | - Felix R Day
- MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK
| | - Ken K Ong
- MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK
- Department of Paediatrics, University of Cambridge, Cambridge, UK
| | - Takayuki Morisaki
- Division of Molecular Pathology, Institute of Medical Sciences, University of Tokyo, Tokyo, Japan
- BioBank Japan, Institute of Medical Science, University of Tokyo, Tokyo, Japan
- Department of Internal Medicine, Institute of Medical Science, University of Tokyo Hospital, Tokyo, Japan
| | - Eco de Geus
- Faculty of Behavioural and Movement Sciences, Biological Psychology, Vrije Universiteit, Amsterdam, the Netherlands
- Amsterdam Public Health Research institute, Amsterdam, the Netherlands
| | - Rino Bellocco
- Department of Statistics and Quantitative Methods, University of Milano Bicocca, Milan, Italy
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Yukinori Okada
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan
- Laboratory of Statistical Immunology, World Premier International Immunology Frontier Research Center, Osaka University, Suita, Japan
- Integrated Frontier Research for Medical Science Division, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Suita, Japan
| | - Anders D Børglum
- Department of Biomedicine, Aarhus University, Aarhus, Denmark
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- Centre for Genomics and Personalized Medicine, Aarhus University, Aarhus, Denmark
- Centre for Integrative Sequencing, iSEQ, Aarhus University, Aarhus, Denmark
| | - Peter Joshi
- Centre for Global Health Research, Usher Institute, University of Edinburgh, Edinburgh, UK
| | | | | | - Benjamin M Neale
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Stanley Center for Psychiatric Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Raymond K Walters
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Stanley Center for Psychiatric Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Michel G Nivard
- Faculty of Behavioural and Movement Sciences, Biological Psychology, Vrije Universiteit, Amsterdam, the Netherlands
- Amsterdam Public Health, Methodology Program, Amsterdam, the Netherlands
- Amsterdam Neuroscience-Mood, Anxiety, Psychosis, Stress & Sleep, Amsterdam, the Netherlands
| | - John R B Perry
- MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK.
| | - Andrea Ganna
- Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland.
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA.
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
| |
|
18
|
Pirastu N, Cordioli M, Nandakumar P, Mignogna G, Abdellaoui A, Hollis B, Kanai M, Rajagopal VM, Parolo PDB, Baya N, Carey CE, Karjalainen J, Als TD, Van der Zee MD, Day FR, Ong KK, Morisaki T, de Geus E, Bellocco R, Okada Y, Børglum AD, Joshi P, Auton A, Hinds D, Neale BM, Walters RK, Nivard MG, Perry JRB, Ganna A. Genetic analyses identify widespread sex-differential participation bias. Nat Genet 2021; 53:663-671. [PMID: 33888908 DOI: 10.1038/s41588-021-00846-7] [Citation(s) in RCA: 107] [Impact Index Per Article: 35.7] [Received: 03/24/2020] [Accepted: 03/16/2021] [Indexed: 01/22/2023]
Abstract
Genetic association results are often interpreted with the assumption that study participation does not affect downstream analyses. Understanding the genetic basis of participation bias is challenging since it requires the genotypes of unseen individuals. Here we demonstrate that it is possible to estimate comparative biases by performing a genome-wide association study contrasting one subgroup versus another. For example, we showed that sex exhibits artifactual autosomal heritability in the presence of sex-differential participation bias. By performing a genome-wide association study of sex in approximately 3.3 million males and females, we identified over 158 autosomal loci spuriously associated with sex and highlighted complex traits underpinning differences in study participation between the sexes. For example, the body mass index-increasing allele at FTO was observed at higher frequency in males compared to females (odds ratio = 1.02, P = 4.4 × 10⁻³⁶). Finally, we demonstrated how these biases can potentially lead to incorrect inferences in downstream analyses and propose a conceptual framework for addressing such biases. Our findings highlight a new challenge that genetic studies may face as sample sizes continue to grow.
Affiliation(s)
- Nicola Pirastu
- Centre for Global Health Research, Usher Institute, University of Edinburgh, Edinburgh, UK
| | - Mattia Cordioli
- Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
| | | | - Gianmarco Mignogna
- Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland.,Department of Statistics and Quantitative Methods, University of Milano Bicocca, Milan, Italy.,Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
| | - Abdel Abdellaoui
- Department of Psychiatry, Amsterdam Academic Medical Center, University of Amsterdam, Amsterdam, the Netherlands
| | - Benjamin Hollis
- MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK.,The Kennedy Institute of Rheumatology, University of Oxford, Oxford, UK
| | - Masahiro Kanai
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA.,Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.,Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.,Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan
| | - Veera M Rajagopal
- Department of Biomedicine, Aarhus University, Aarhus, Denmark.,The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark.,Centre for Genomics and Personalized Medicine, Aarhus University, Aarhus, Denmark.,Centre for Integrative Sequencing, iSEQ, Aarhus University, Aarhus, Denmark
| | | | - Nikolas Baya
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA.,Stanley Center for Psychiatric Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Caitlin E Carey
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA.,Stanley Center for Psychiatric Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Juha Karjalainen
- Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland.,Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA.,Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Thomas D Als
- Department of Biomedicine, Aarhus University, Aarhus, Denmark.,The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark.,Centre for Genomics and Personalized Medicine, Aarhus University, Aarhus, Denmark.,Centre for Integrative Sequencing, iSEQ, Aarhus University, Aarhus, Denmark
| | - Matthijs D Van der Zee
- Faculty of Behavioural and Movement Sciences, Biological Psychology, Vrije Universiteit, Amsterdam, the Netherlands
| | - Felix R Day
- MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK
| | - Ken K Ong
- MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK.,Department of Paediatrics, University of Cambridge, Cambridge, UK
| | | | | | | | - Takayuki Morisaki
- Division of Molecular Pathology, Institute of Medical Sciences, University of Tokyo, Tokyo, Japan.,BioBank Japan, Institute of Medical Science, University of Tokyo, Tokyo, Japan.,Department of Internal Medicine, Institute of Medical Science, University of Tokyo Hospital, Tokyo, Japan
| | - Eco de Geus
- Faculty of Behavioural and Movement Sciences, Biological Psychology, Vrije Universiteit, Amsterdam, the Netherlands.,Amsterdam Public Health Research Institute, Amsterdam, the Netherlands
| | - Rino Bellocco
- Department of Statistics and Quantitative Methods, University of Milano Bicocca, Milan, Italy.,Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
| | - Yukinori Okada
- Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan.,Laboratory of Statistical Immunology, World Premier International Immunology Frontier Research Center, Osaka University, Suita, Japan.,Integrated Frontier Research for Medical Science Division, Institute for Open and Transdisciplinary Research Initiatives, Osaka University, Suita, Japan
| | - Anders D Børglum
- Department of Biomedicine, Aarhus University, Aarhus, Denmark.,The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark.,Centre for Genomics and Personalized Medicine, Aarhus University, Aarhus, Denmark.,Centre for Integrative Sequencing, iSEQ, Aarhus University, Aarhus, Denmark
| | - Peter Joshi
- Centre for Global Health Research, Usher Institute, University of Edinburgh, Edinburgh, UK
| | | | | | - Benjamin M Neale
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA.,Stanley Center for Psychiatric Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Raymond K Walters
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA.,Stanley Center for Psychiatric Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Michel G Nivard
- Faculty of Behavioural and Movement Sciences, Biological Psychology, Vrije Universiteit, Amsterdam, the Netherlands.,Amsterdam Public Health, Methodology Program, Amsterdam, the Netherlands.,Amsterdam Neuroscience-Mood, Anxiety, Psychosis, Stress & Sleep, Amsterdam, the Netherlands
| | - John R B Perry
- MRC Epidemiology Unit, Institute of Metabolic Science, University of Cambridge, Cambridge, UK.
| | - Andrea Ganna
- Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland.,Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA.,Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
|
19
|
Li F, Allen AS. Secondary analysis of case-control association studies: Insights on weighting-based inference motivate a new specification test. Stat Med 2020; 39:2869-2882. [PMID: 32501597 DOI: 10.1002/sim.8579] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/08/2018] [Revised: 03/02/2020] [Accepted: 04/24/2020] [Indexed: 12/31/2022]
Abstract
Case-control sampling is frequently used in genetic association studies to examine the relationship between disease and genetic exposures. Such designs usually collect extensive information on phenotypes beyond the primary disease, whose associations with the genetic exposures are also of great interest. Because the cases are over-sampled, appropriate analysis of secondary phenotypes should take into account this biased sampling design. We previously introduced a weighting-based estimator for appropriate secondary analysis, but have not thoroughly explored its statistical properties. In this article, we revisit our previous estimator to offer new insights and methodological extensions. Specifically, we extend our previous estimator and construct its more general form based on generalized least squares (GLS). Such an extension allows us to connect the GLS estimator with the generalized method of moments and motivates a new specification test designed to assess the adequacy of the disease model or the weights. The specification test statistic measures the weighted discrepancy between the case and control subsample estimators, and asymptotically follows a central Chi-squared distribution under correct disease model specification. We illustrate the GLS estimator and specification test using a case-control sample of peripheral arterial disease, and use simulations to further shed light on the operating characteristics of the specification test.
Affiliation(s)
- Fan Li
- Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, USA
| | - Andrew S Allen
- Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina, USA.,Center for Statistical Genetics and Genomics, Duke University, Durham, North Carolina, USA
|
20
|
Yajnik P, Boehnke M. Power loss due to testing association between covariate-adjusted traits and genetic variants. Genet Epidemiol 2020; 44:579-588. [PMID: 32511788 DOI: 10.1002/gepi.22325] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/18/2019] [Revised: 03/27/2020] [Accepted: 05/19/2020] [Indexed: 11/07/2022]
Abstract
Multiple linear regression is commonly used to test for association between genetic variants and continuous traits and estimate genetic effect sizes. Confounding variables are controlled for by including them as additional covariates. An alternative technique that is increasingly used is to regress out covariates from the raw trait and then perform regression analysis with only the genetic variants included as predictors. In the case of single-variant analysis, this adjusted trait regression (ATR) technique is known to be less powerful than the traditional technique when the genetic variant is correlated with the covariates. We extend previous results for single-variant tests by deriving exact relationships between the single-variant score, Wald, likelihood-ratio, and F test statistics and their ATR analogs. We also derive the asymptotic power of ATR analogs of the multiple-variant score and burden tests. We show that the maximum power loss of the ATR analog of the multiple-variant score test is completely characterized by the canonical correlations between the set of genetic variants and the set of covariates. Further, we show that for both single- and multiple-variant tests, the power loss for ATR analogs increases with increasing stringency of Type 1 error control (α) and increasing correlation (or canonical correlations) between the genetic variant (or multiple variants) and covariates. We recommend using ATR only when maximum canonical correlation between variants and covariates is low, as is typically true.
Affiliation(s)
- Pranav Yajnik
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, Michigan
| | - Michael Boehnke
- Department of Biostatistics, University of Michigan School of Public Health, Ann Arbor, Michigan
|
21
|
Zhou F, Zhou H, Li T, Zhu H. Analysis of secondary phenotypes in multigroup association studies. Biometrics 2020; 76:606-618. [PMID: 31544963 PMCID: PMC7085961 DOI: 10.1111/biom.13157] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/17/2018] [Accepted: 08/30/2019] [Indexed: 11/28/2022]
Abstract
Although case-control association studies have been widely used, they are insufficient for many complex diseases, such as Alzheimer's disease and breast cancer, since these diseases may have multiple subtypes with distinct morphologies and clinical implications. Many multigroup studies, such as the Alzheimer's Disease Neuroimaging Initiative (ADNI), have been undertaken by recruiting subjects based on their multiclass primary disease status, while extensive secondary outcomes have been collected. The aim of this paper is to develop a general regression framework for the analysis of secondary phenotypes collected in multigroup association studies. Our regression framework is built on a conditional model for the secondary outcome given the multigroup status and covariates and its relationship with the population regression of interest of the secondary outcome given the covariates. Then, we develop generalized estimation equations to estimate the parameters of interest. We use both simulations and a large-scale imaging genetic data analysis from the ADNI to evaluate the effect of the multigroup sampling scheme on standard genome-wide association analyses based on linear regression methods, while comparing it with our statistical methods that appropriately adjust for the multigroup sampling scheme. Data used in preparation of this article were obtained from the ADNI database.
Affiliation(s)
- Fan Zhou
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
| | - Haibo Zhou
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
| | - Tengfei Li
- Department of Radiology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Biomedical Research Imaging Center, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
| | - Hongtu Zhu
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Biomedical Research Imaging Center, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
|
22
|
Tounkara F, Lefebvre G, Greenwood C, Oualkacha K. A flexible copula-based approach for the analysis of secondary phenotypes in ascertained samples. Stat Med 2020; 39:517-543. [PMID: 31868965 DOI: 10.1002/sim.8416] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/06/2018] [Revised: 04/30/2019] [Accepted: 09/04/2019] [Indexed: 12/20/2022]
Abstract
Data collected for a genome-wide association study of a primary phenotype are often used for additional genome-wide association analyses of secondary phenotypes. However, when the primary and secondary traits are dependent, naïve analyses of secondary phenotypes may induce spurious associations in non-randomly ascertained samples. Previously, retrospective likelihood-based methods have been proposed to correct for sampling biases arising in secondary trait association analyses. However, most methods have been introduced to handle studies featuring a case-control design based on a binary primary phenotype. As such, these methods are not directly applicable to more complicated study designs such as multiple-trait studies, where the sampling mechanism also depends on the secondary phenotype, or extreme-trait studies, where individuals with extreme primary phenotype values are selected. To accommodate these more complicated sampling mechanisms, only a few prospective likelihood approaches have been proposed. These approaches assume a normal distribution for the secondary phenotype (or the latent secondary phenotype) and a bivariate normal distribution for the primary-secondary phenotype dependence. In this paper, we propose a unified copula-based approach to appropriately detect genetic variant/secondary phenotype association in the presence of selected samples. The primary phenotype is either binary or continuous, and the secondary phenotype is continuous although not necessarily normal. We use both prospective and retrospective likelihoods to account for the sampling mechanism and use a copula model to allow for potentially different dependence structures between the primary and secondary phenotypes. We demonstrate the effectiveness of our approach through simulation studies and by analyzing data from the Avon Longitudinal Study of Parents and Children cohort.
Affiliation(s)
- Fodé Tounkara
- Lunenfeld-Tenenbaum Research Institute, Toronto, Canada
| | - Geneviève Lefebvre
- Department of Mathematics, Université du Québec à Montréal, Montreal, Canada
| | - Celia Greenwood
- Lady Davis Research Institute, Centre for Clinical Epidemiology, Jewish General Hospital, Montreal, Canada.,Gerald Bronfman Department of Oncology, McGill University, Montreal, Canada.,Department of Epidemiology, Biostatistics & Occupational Health, McGill University, Montreal, Canada.,Department of Human Genetics, McGill University, Montreal, Canada
| | - Karim Oualkacha
- Department of Mathematics, Université du Québec à Montréal, Montreal, Canada
|
23
|
Bi W, Li Y, Smeltzer MP, Gao G, Zhao S, Kang G. STEPS: an efficient prospective likelihood approach to genetic association analyses of secondary traits in extreme phenotype sequencing. Biostatistics 2020; 21:33-49. [PMID: 30007308 PMCID: PMC8559722 DOI: 10.1093/biostatistics/kxy030] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/20/2017] [Revised: 05/16/2018] [Accepted: 06/02/2018] [Indexed: 11/13/2022] Open
Abstract
It has been well acknowledged that methods for secondary trait (ST) association analyses under a case-control design (ST$_{\text{CC}}$) should carefully consider the sampling process to avoid biased risk estimates. A similar situation also exists in extreme phenotype sequencing (EPS) designs, which select subjects with extreme values of a continuous primary phenotype for sequencing. EPS designs are commonly used in modern epidemiological and clinical studies such as the well-known National Heart, Lung, and Blood Institute Exome Sequencing Project. Although naïve generalized regression or ST$_{\text{CC}}$ methods could be applied, their validity is questionable due to differences in statistical designs. Herein, we propose a general prospective likelihood framework to perform association testing for binary and continuous STs under EPS designs (STEPS), which can also incorporate covariates and interaction terms. We provide a computationally efficient and robust algorithm to obtain the maximum likelihood estimates. We also present two empirical mathematical formulas for power/sample size calculations to facilitate planning of binary/continuous ST association analyses under EPS designs. Extensive simulations and application to a genome-wide association study of benign ethnic neutropenia under an EPS design demonstrate the superiority of STEPS over all its alternatives above.
Affiliation(s)
- Wenjian Bi
- Department of Biostatistics, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
| | - Yun Li
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599, USA
- Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599, USA
| | - Matthew P Smeltzer
- Division of Epidemiology, Biostatistics, and Environmental Health, School of Public Health, University of Memphis, Memphis, TN 38152, USA
| | - Guimin Gao
- Department of Public Health Sciences, University of Chicago, Chicago, IL 60637, USA
| | - Shengli Zhao
- School of Statistics, Qufu Normal University, Qufu 273165, PR China
| | - Guolian Kang
- Department of Biostatistics, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
|
24
|
Sanders AE, Greenspan JD, Fillingim RB, Rathnayaka N, Ohrbach R, Slade GD. Associations of Sleep Disturbance, Atopy, and Other Health Measures with Chronic Overlapping Pain Conditions. J Oral Facial Pain Headache 2020; 34:s73-s84. [PMID: 32975542 PMCID: PMC9879298 DOI: 10.11607/ofph.2577] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 01/28/2023]
Abstract
AIMS To quantify the contributions of atopic disorders, sleep disturbance, and other health conditions to five common pain conditions. METHODS This cross-sectional analysis used data from 655 participants in the OPPERA study. The authors investigated the individual and collective associations of five chronic overlapping pain conditions (COPCs) with medically diagnosed atopic disorders and self-reported sleep disturbance, fatigue, and symptoms of obstructive sleep apnea. Atopic disorders were allergies, allergic rhinitis, atopic dermatitis, allergic asthma, urticaria, allergic conjunctivitis, and food allergy. Logistic regression models estimated odds ratios as measures of association with temporomandibular disorders, headache, irritable bowel syndrome, low back pain, and fibromyalgia. Measures of sleep and atopy disorders were standardized to z scores to determine the relative strength of their associations with each COPC. Sociodemographic characteristics and body mass index were covariates. Random forest regression analyzed all variables simultaneously, computing importance metrics to determine which variables best differentiated pain cases from controls. RESULTS Fatigue and sleep disturbance were strongly associated with each COPC and with the total number of COPCs. An increase of one standard deviation in fatigue or sleep disturbance score was associated with approximately two-fold greater odds of having a COPC. In random forest models, atopic disorders contributed more than other health measures to differentiating between cases and controls of headache, whereas other COPCs were best differentiated by measures of fatigue or sleep. CONCLUSION Atopic disorders, previously recognized as predictors of poor sleep, are associated with COPCs after accounting for sleep problems.
Affiliation(s)
- Anne E. Sanders
- Division of Pediatric and Public Health, Adams School of Dentistry, University of North Carolina, Chapel Hill, North Carolina, USA
| | - Joel D. Greenspan
- Department of Neural and Pain Sciences, Brotman Facial Pain Clinic, School of Dentistry, University of Maryland, Baltimore, Maryland, USA
| | - Roger B. Fillingim
- Department of Community Dentistry & Behavioral Science, Pain Research and Intervention Center of Excellence, College of Dentistry, University of Florida, Gainesville, Florida, USA
| | - Nuvan Rathnayaka
- Department of Biostatistics, Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, North Carolina, USA
| | - Richard Ohrbach
- Department of Oral Diagnostic Sciences, University at Buffalo School of Dental Medicine, Buffalo, New York, USA; Department of Orofacial Pain and Jaw Function, Faculty of Odontology, Malmö University, Malmö, Sweden
| | - Gary D. Slade
- Division of Pediatric and Public Health, Adams School of Dentistry, Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, North Carolina, USA
|
25
|
Wang CY, Dai J. Best linear inverse probability weighted estimation for two-phase designs and missing covariate regression. Stat Med 2019; 38:2783-2796. [PMID: 30908669 DOI: 10.1002/sim.8141] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/24/2017] [Revised: 02/08/2019] [Accepted: 02/20/2019] [Indexed: 11/07/2022]
Abstract
The inverse probability weighted estimator is often applied to two-phase designs and regression with missing covariates. Inverse probability weighted estimators typically are less efficient than likelihood-based estimators but, in general, are more robust against model misspecification. In this paper, we propose a best linear inverse probability weighted estimator for two-phase designs and missing covariate regression. Our proposed estimator is the projection of the SIPW onto the orthogonal complement of the score space based on a working regression model of the observed covariate data. The efficiency gain is from the use of the association between the outcome variable and the available covariates, which is the working regression model. One advantage of the proposed estimator is that there is no need to calculate the augmented term of the augmented weighted estimator. The estimator can be applied to general missing data problems or two-phase design studies in which the second phase data are obtained in a subcohort. The method can also be applied to secondary trait case-control genetic association studies. The asymptotic distribution is derived, and the finite sample performance of the proposed estimator is examined via extensive simulation studies. The methods are applied to a bladder cancer case-control study.
Affiliation(s)
- Ching-Yun Wang
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - James Dai
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington
|
26
|
Adjustment for index event bias in genome-wide association studies of subsequent events. Nat Commun 2019; 10:1561. [PMID: 30952951 PMCID: PMC6450903 DOI: 10.1038/s41467-019-09381-w] [Citation(s) in RCA: 74] [Impact Index Per Article: 14.8] [Received: 09/11/2018] [Accepted: 03/05/2019] [Indexed: 12/12/2022] Open
Abstract
Following numerous genome-wide association studies of disease susceptibility, there is increasing interest in genetic associations with prognosis, survival or other subsequent events. Such associations are vulnerable to index event bias, by which selection of subjects according to disease status creates biased associations if common causes of incidence and prognosis are not accounted for. We propose an adjustment for index event bias using the residuals from the regression of genetic effects on prognosis on genetic effects on incidence. Our approach eliminates this bias when direct genetic effects on incidence and prognosis are independent, and otherwise reduces bias in realistic situations. In a study of idiopathic pulmonary fibrosis, we reverse a paradoxical association of the strong susceptibility gene MUC5B with increased survival, suggesting instead a significant association with decreased survival. In re-analysis of a study of Crohn’s disease prognosis, four regions remain associated at genome-wide significance but with increased standard errors. Different from GWAS for susceptibility to disease, GWAS for prognosis or survival may be vulnerable to selection bias. Here, Dudbridge et al present an approach to reduce index event bias in simulated and realistic situations, and apply it to GWAS of survival with idiopathic pulmonary fibrosis and Crohn’s disease prognosis.
|
27
|
A review of analysis methods for secondary outcomes in case-control studies. Communications for Statistical Applications and Methods 2019. [DOI: 10.29220/csam.2019.26.2.103] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/17/2022]
|
28
|
Boone SC, le Cessie S, van Dijk KW, de Mutsert R, Mook-Kanamori DO. Avoiding selection bias in metabolomics studies: a tutorial. Metabolomics 2019; 15:7. [PMID: 30830435 DOI: 10.1007/s11306-018-1463-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 06/11/2018] [Accepted: 12/18/2018] [Indexed: 11/24/2022]
Abstract
BACKGROUND Metabolomics techniques are increasingly applied in epidemiologic research. Many available assays are still relatively expensive and therefore measurements are often performed in small patient population studies such as case series or case-control designs with strong participant selection criteria. Subsequently, metabolomics data are frequently used to assess secondary associations for which the original study was not explicitly designed. Especially in these secondary analyses, there is a risk that the original selection criteria and the conditioning that takes place due to this selection are not properly accounted for which can lead to selection bias. AIM OF REVIEW In this tutorial, we start with a brief theoretical introduction on the issue of selection bias. Subsequently, we demonstrate how selection bias can occur in metabolomics studies by means of an investigation into associations of metabolites with total body fat in a nested case-control study that was originally designed to study effects of elevated fasting glucose. KEY SCIENTIFIC CONCEPTS OF REVIEW We demonstrate that standard analytical methods, such as stratification or adjustment in regression analyses, are not suited to deal with selection bias and may even induce the bias when analysing metabolite-phenotype relationships in selected groups. Finally, we show that inverse probability weighting, also known as survey weighting, can be used in some situations to make unbiased estimates of the outcomes.
Affiliation(s)
- S C Boone
- Department of Clinical Epidemiology, Department C7-P, Leiden University Medical Center (LUMC), PO Box 9600, 2300 RC, Leiden, The Netherlands.
| | - S le Cessie
- Department of Clinical Epidemiology, Department C7-P, Leiden University Medical Center (LUMC), PO Box 9600, 2300 RC, Leiden, The Netherlands
- Department of Biomedical Data Sciences, Section Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, The Netherlands
| | - K Willems van Dijk
- Department of Endocrinology, Leiden University Medical Center, Leiden, The Netherlands
- Einthoven Laboratory for Experimental Vascular Medicine, Leiden University Medical Center, Leiden, The Netherlands
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | - R de Mutsert
- Department of Clinical Epidemiology, Department C7-P, Leiden University Medical Center (LUMC), PO Box 9600, 2300 RC, Leiden, The Netherlands
| | - D O Mook-Kanamori
- Department of Clinical Epidemiology, Department C7-P, Leiden University Medical Center (LUMC), PO Box 9600, 2300 RC, Leiden, The Netherlands
- Department of Public Health and Primary Care, Leiden University Medical Center, Leiden, The Netherlands
|
29
|
Fuady AM, Lent S, Sarnowski C, Tintle NL. Application of novel and existing methods to identify genes with evidence of epigenetic association: results from GAW20. BMC Genet 2018; 19:72. [PMID: 30255777 PMCID: PMC6157126 DOI: 10.1186/s12863-018-0647-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND The rise in popularity and accessibility of DNA methylation data to evaluate epigenetic associations with disease has led to numerous methodological questions. As part of GAW20, our working group of 8 research groups focused on gene searching methods. RESULTS Although the methods were varied, we identified 3 main themes within our group. First, many groups tackled the question of how best to use pedigree information in downstream analyses, finding that (a) the use of kinship matrices is common practice, (b) ascertainment corrections may be necessary, and (c) pedigree information may be useful for identifying parent-of-origin effects. Second, many groups also considered multimarker versus single-marker tests. Multimarker tests had modestly improved power versus single-marker methods on simulated data, and on real data identified additional associations that were not identified with single-marker methods, including identification of a gene with a strong biological interpretation. Finally, some of the groups explored methods to combine single-nucleotide polymorphism (SNP) and DNA methylation into a single association analysis. CONCLUSIONS A causal inference method showed promise at discovering new mechanisms of SNP activity; gene-based methods of summarizing SNP and DNA methylation data also showed promise. Even though numerous questions still remain in the analysis of DNA methylation data, our discussions at GAW20 suggest some emerging best practices.
Affiliation(s)
- Angga M. Fuady
- Medical Statistics, Department of Biomedical Data Sciences, Leiden University Medical Center, Einthovenweg 20, 2333 ZC Leiden, Netherlands
| | - Samantha Lent
- Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Avenue, Boston, MA 02118 USA
| | - Chloé Sarnowski
- Department of Biostatistics, Boston University School of Public Health, 801 Massachusetts Avenue, Boston, MA 02118 USA
| | - Nathan L. Tintle
- Department of Mathematics and Statistics, Dordt College, Sioux Center, IA 51250 USA
|
30
|
Liang L, Ma Y, Wei Y, Carroll RJ. Semiparametrically efficient estimation in quantile regression of secondary analysis. J R Stat Soc Series B Stat Methodol 2018; 80:625-648. [PMID: 30337833 PMCID: PMC6191046 DOI: 10.1111/rssb.12272] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/29/2022]
Abstract
Analysing secondary outcomes is a common practice for case-control studies. Traditional secondary analysis employs either completely parametric models or conditional mean regression models to link the secondary outcome to covariates. In many situations, quantile regression models complement mean-based analyses and provide alternative new insights on the associations of interest. For example, biomedical outcomes are often highly asymmetric, and median regression is more useful in describing the 'central' behaviour than mean regressions. There are also cases where the research interest is to study the high or low quantiles of a population, as they are more likely to be at risk. We approach the secondary quantile regression problem from a semiparametric perspective, allowing the covariate distribution to be completely unspecified. We derive a class of consistent semiparametric estimators and identify the efficient member. The asymptotic properties of the resulting estimators are established. Simulation results and a real data analysis are provided to demonstrate the superior performance of our approach with a comparison with the only existing approach so far in the literature.
Affiliation(s)
| | - Yanyuan Ma
- Penn State University, University Park, USA
| | - Ying Wei
- Columbia University, New York, USA
| | - Raymond J Carroll
- Texas A&M University, College Station, USA, and University of Technology, Sydney, Australia
|
31
|
Harrewijn A, Van der Molen MJW, Verkuil B, Sweijen SW, Houwing-Duistermaat JJ, Westenberg PM. Heart rate variability as candidate endophenotype of social anxiety: A two-generation family study. J Affect Disord 2018; 237:47-55. [PMID: 29763849 DOI: 10.1016/j.jad.2018.05.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 01/30/2018] [Revised: 04/09/2018] [Accepted: 05/07/2018] [Indexed: 01/17/2023]
Abstract
BACKGROUND Social anxiety disorder (SAD) is the extreme fear and avoidance of one or more social situations. The goal of the current study was to investigate whether heart rate variability (HRV) during resting state and a social performance task (SPT) is a candidate endophenotype of SAD. METHODS In this two-generation family study, patients with SAD with their partner and children, and their siblings with partner and children, took part in an SPT (total n = 121, 9 families, 3-30 persons per family, age range: 8-61 years, 17 patients with SAD). In this task, participants had to watch and evaluate the speech of a female peer, and then had to give a similar speech themselves. HRV was measured during two resting state phases, and during the anticipation, speech and recovery phases of the SPT. We tested two criteria for endophenotypes: co-segregation with SAD within families and heritability. RESULTS HRV did not co-segregate with SAD within families. Root mean square of successive differences during the first resting phase and recovery, and high frequency power during all phases of the task, were heritable. LIMITATIONS It should be noted that few participants were diagnosed with SAD. Results during the speech should be interpreted with caution, because the duration was short and there was a lot of movement. CONCLUSIONS HRV during resting state and the SPT is a possible endophenotype, but not of SAD. As other studies have shown that HRV is related to different internalizing disorders, HRV might reflect a transdiagnostic genetic vulnerability for internalizing disorders. Future research should investigate which factors influence the development of psychopathology in persons with decreased HRV.
Affiliation(s)
- A Harrewijn
- Developmental and educational psychology, Leiden University, The Netherlands; Leiden Institute for Brain and Cognition, Leiden University, The Netherlands.
| | - M J W Van der Molen
- Developmental and educational psychology, Leiden University, The Netherlands; Leiden Institute for Brain and Cognition, Leiden University, The Netherlands
| | - B Verkuil
- Leiden Institute for Brain and Cognition, Leiden University, The Netherlands; Clinical psychology, Leiden University, The Netherlands
| | - S W Sweijen
- Developmental and educational psychology, Leiden University, The Netherlands
| | - J J Houwing-Duistermaat
- Department of Medical Statistics and BioInformatics, Leiden University Medical Center, The Netherlands; Department of Statistics, University of Leeds, United Kingdom
| | - P M Westenberg
- Developmental and educational psychology, Leiden University, The Netherlands; Leiden Institute for Brain and Cognition, Leiden University, The Netherlands
|
32
|
Fuady AM, Tissier RLM, Houwing-Duistermaat JJ. Genome-wide analysis in multiple-case families: assessing the relationship between triglyceride and methylation. BMC Proc 2018; 12:33. [PMID: 30275885 PMCID: PMC6157284 DOI: 10.1186/s12919-018-0123-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
The main goal of this paper is to estimate the effect of triglyceride levels on methylation of cytosine-phosphate-guanine (CpG) sites in multiple-case families. These families are selected because they have 2 or more cases of metabolic syndrome (primary phenotype). The methylations at the CpG sites are the secondary phenotypes. Ascertainment corrections are needed when there is an association between the primary and secondary phenotype. We will apply the newly developed secondary phenotype analysis for multiple-case family studies to identify CpG sites where methylations are influenced by triglyceride levels. Our second goal is to compare the performance of the naïve approach, which ignores the sampling of the families, SOLAR (Sequential Oligogenic Linkage Analysis Routines), which adjusts for ascertainment via probands, and the secondary phenotype approach. The analysis of possible CpG sites associated with triglyceride levels shows results consistent with the literature when using the secondary phenotype approach. Overall, the secondary phenotype approach performed well, but the comparison of the different approaches does not show significant differences between them. However, for genome-wide applications, we recommend using the secondary phenotype approach when there is an association between primary and secondary phenotypes, and to use the naïve approach otherwise.
|
33
|
Pan Y, Cai J, Longnecker MP, Zhou H. Secondary outcome analysis for data from an outcome-dependent sampling design. Stat Med 2018; 37:2321-2337. [PMID: 29682775 DOI: 10.1002/sim.7672] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 11/07/2016] [Revised: 01/19/2018] [Accepted: 03/08/2018] [Indexed: 11/11/2022]
Abstract
An outcome-dependent sampling (ODS) scheme is a cost-effective way to conduct a study. For a study with a continuous primary outcome, an ODS scheme can be implemented in which the expensive exposure is measured only on a simple random sample and on supplemental samples selected from the 2 tails of the primary outcome variable. With the tremendous cost invested in collecting the primary exposure information, investigators often would like to use the available data to study the relationship between a secondary outcome and the obtained exposure variable. This is referred to as secondary analysis. Secondary analysis in ODS designs can be tricky, as the ODS sample is not a random sample from the general population. In this article, we use the inverse probability weighted and augmented inverse probability weighted estimating equations to analyze the secondary outcome for data obtained from the ODS design. We do not make any parametric assumptions on the primary and secondary outcomes and only specify the form of the regression mean models, thus allowing an arbitrary error distribution. Our approach is robust to second- and higher-order moment misspecification. It also leads to more precise estimates of the parameters by effectively using all the available participants. Through simulation studies, we show that the proposed estimator is consistent and asymptotically normal. Data from the Collaborative Perinatal Project are analyzed to illustrate our method.
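As a rough illustration of the inverse probability weighting idea mentioned in this abstract, the sketch below solves the weighted estimating equation for a simple linear mean model. It is a minimal sketch and not the authors' estimator: the function name `ipw_linear_fit`, the single-covariate model, and the toy selection probabilities are all assumptions made for illustration, and the augmented (AIPW) version is not shown.

```python
def ipw_linear_fit(x, y2, pi):
    """Solve sum_i (1/pi_i) * (1, x_i)^T * (y2_i - a - b*x_i) = 0 for (a, b).

    x   : exposure values for the sampled participants
    y2  : secondary outcome values
    pi  : assumed known probability that each participant was selected
          into the ODS sample (hypothetical inputs for this sketch)
    """
    w = [1.0 / p for p in pi]  # inverse probability weights
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y2)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y2))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx          # weighted least-squares slope
    a = ybar - b * xbar    # weighted least-squares intercept
    return a, b


# Toy data: the first participant had only a 50% chance of being sampled,
# so that observation counts twice in the estimating equation.
a, b = ipw_linear_fit(x=[0.0, 1.0, 1.0], y2=[0.0, 1.0, 3.0], pi=[0.5, 1.0, 1.0])
print(a, b)
```

Participants who were less likely to be sampled receive larger weights, so the weighted sample mimics a random sample from the underlying population.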
Affiliation(s)
- Yinghao Pan
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Jianwen Cai
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Matthew P Longnecker
- Epidemiology Branch, National Institute of Environmental Health Sciences, Research Triangle Park, NC, USA
| | - Haibo Zhou
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
|
34
|
Harrewijn A, van der Molen MJW, van Vliet IM, Houwing-Duistermaat JJ, Westenberg PM. Delta-beta correlation as a candidate endophenotype of social anxiety: A two-generation family study. J Affect Disord 2018; 227:398-405. [PMID: 29154156 DOI: 10.1016/j.jad.2017.11.019] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 07/03/2017] [Revised: 10/01/2017] [Accepted: 11/07/2017] [Indexed: 02/02/2023]
Abstract
BACKGROUND Social anxiety disorder (SAD) is characterized by an extreme and intense fear and avoidance of social situations. In this two-generation family study we examined delta-beta correlation during a social performance task as a candidate endophenotype of SAD. METHODS Nine families with a target participant (diagnosed with SAD), their spouse and children, as well as the target's siblings with spouse and children, performed a social performance task in which they gave a speech in front of a camera. EEG was measured during resting state, anticipation, and recovery. Our analyses focused on two criteria for endophenotypes: co-segregation within families and heritability. RESULTS Co-segregation analyses revealed increased negative delta-low beta correlation during anticipation in participants with (sub)clinical SAD compared to participants without (sub)clinical SAD. Heritability analyses revealed that delta-low beta and delta-high beta correlation during anticipation were heritable. Delta-beta correlation did not differ between participants with and without (sub)clinical SAD during resting state or recovery, nor between participants with and without SAD during all phases of the task. LIMITATIONS It should be noted that participants were seen only once, they all performed the EEG tasks in the same order, and some participants were too anxious to give a speech. CONCLUSIONS Delta-low beta correlation during anticipation of giving a speech might be a candidate endophenotype of SAD, possibly reflecting increased crosstalk between cortical and subcortical regions. If validated as an endophenotype, delta-beta correlation during anticipation could be useful in studying the genetic basis of SAD, as well as in improving treatment and early detection of persons at risk for developing SAD.
Affiliation(s)
- Anita Harrewijn
- Developmental and Educational Psychology, Leiden University, The Netherlands; Leiden Institute for Brain and Cognition, Leiden University, The Netherlands.
| | - Melle J W van der Molen
- Developmental and Educational Psychology, Leiden University, The Netherlands; Leiden Institute for Brain and Cognition, Leiden University, The Netherlands
| | - Irene M van Vliet
- Department of Psychiatry, Leiden University Medical Center, The Netherlands
| | - Jeanine J Houwing-Duistermaat
- Department of Medical Statistics and BioInformatics, Leiden University Medical Center, The Netherlands; Department of Statistics, University of Leeds, United Kingdom
| | - P Michiel Westenberg
- Developmental and Educational Psychology, Leiden University, The Netherlands; Leiden Institute for Brain and Cognition, Leiden University, The Netherlands
|
35
|
Pan Y, Cai J, Kim S, Zhou H. Regression analysis for secondary response variable in a case-cohort study. Biometrics 2017; 74:1014-1022. [PMID: 29286533 DOI: 10.1111/biom.12838] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 01/01/2017] [Revised: 11/01/2017] [Accepted: 11/01/2017] [Indexed: 12/01/2022]
Abstract
The case-cohort study design has been widely used for its cost-effectiveness. In any real study, there are always other important outcomes of interest besides the failure time on which the original case-cohort study is based. How to utilize the available case-cohort data to study the relationship of a secondary outcome with the primary exposure obtained through the case-cohort study is not well studied. In this article, we propose a non-parametric estimated likelihood approach for analyzing a secondary outcome in a case-cohort study. The estimation is based on maximizing a semiparametric likelihood function that is built jointly on both the time-to-failure outcome and the secondary outcome. The proposed estimator is shown to be consistent, efficient, and asymptotically normal. Finite sample performance is evaluated via simulation studies. Data from the Sister Study are analyzed to illustrate our method.
Affiliation(s)
- Yinghao Pan
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, U.S.A
| | - Jianwen Cai
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, U.S.A
| | - Sangmi Kim
- Medical College of Georgia, GRU Cancer Center, Augusta University, Augusta, Georgia 30912, U.S.A
| | - Haibo Zhou
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, U.S.A
|
36
|
Harrewijn A, van der Molen MJW, van Vliet IM, Tissier RLM, Westenberg PM. Behavioral and EEG responses to social evaluation: A two-generation family study on social anxiety. Neuroimage Clin 2017; 17:549-562. [PMID: 29527481 PMCID: PMC5842666 DOI: 10.1016/j.nicl.2017.11.010] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 07/25/2017] [Revised: 11/06/2017] [Accepted: 11/08/2017] [Indexed: 02/07/2023]
Abstract
Social anxiety disorder is a disabling psychiatric disorder characterized by extreme fear and avoidance of one or more social situations in which patients might experience scrutiny by others. The goal of this two-generation family study was to delineate behavioral and electrocortical endophenotypes of social anxiety disorder related to social evaluation. Nine families of patients with social anxiety disorder (their spouse and children, and siblings of these patients with spouse and children) performed a social judgment paradigm in which they believed they were being evaluated by peers. For each peer, participants indicated their expectation about the evaluative outcome, after which they received social acceptance or rejection feedback. Task behavior, as well as the feedback-related EEG brain potentials (N1, FRN, P3) and theta power, were tested as candidate endophenotypes based on two criteria: co-segregation with social anxiety disorder within families and heritability. Results indicated that reaction time for indicating acceptance-expectations might be a candidate behavioral endophenotype of social anxiety disorder, possibly reflecting increased uncertainty or self-focused attention and vigilance during the social judgment paradigm. N1 in response to expected rejection feedback and P3 in response to acceptance feedback might be candidate electrocortical endophenotypes of social anxiety disorder, although the heritability analyses did not remain significant after correcting for multiple tests. Increased N1 possibly reflects hypervigilance to socially threatening stimuli, and increased P3 might reflect that positive feedback is more important for, and/or less expected by, participants with social anxiety disorder. Finally, increased feedback-related negativity and theta power in response to unexpected rejection feedback compared to the other conditions co-segregated with social anxiety disorder, but these EEG measures were not heritable. The candidate endophenotypes might play a new and promising role in future research on genetic mechanisms, early detection and/or prevention of social anxiety disorder.
Affiliation(s)
- Anita Harrewijn
- Developmental and Educational Psychology, Leiden University, The Netherlands; Leiden Institute for Brain and Cognition, Leiden University, The Netherlands.
| | - Melle J W van der Molen
- Developmental and Educational Psychology, Leiden University, The Netherlands; Leiden Institute for Brain and Cognition, Leiden University, The Netherlands
| | - Irene M van Vliet
- Department of Psychiatry, Leiden University Medical Center, The Netherlands
| | - Renaud L M Tissier
- Developmental and Educational Psychology, Leiden University, The Netherlands
| | - P Michiel Westenberg
- Developmental and Educational Psychology, Leiden University, The Netherlands; Leiden Institute for Brain and Cognition, Leiden University, The Netherlands
|
37
|
Houwing-Duistermaat JJ, Uh HW, Gusnanto A. Discussion on the paper ‘Statistical contributions to bioinformatics: Design, modelling, structure learning and integration’ by Jeffrey S. Morris and Veerabhadran Baladandayuthapani. STAT MODEL 2017. [DOI: 10.1177/1471082x17706135] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 01/05/2023]
Abstract
Bioinformatics is an important research area for statisticians. This discussion adds some topics to those covered in the paper, namely statistical contributions to the detection of differentially expressed genes, to protein structure prediction, and to the analysis of highly correlated features in glycomics datasets.
Affiliation(s)
- Jeanine J Houwing-Duistermaat
- Department of Statistics, University of Leeds, Leeds LS2 9JT, United Kingdom
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, The Netherlands
| | - Hae Won Uh
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, The Netherlands
| | - Arief Gusnanto
- Department of Statistics, University of Leeds, Leeds LS2 9JT, United Kingdom
|
38
|
Yaghootkar H, Bancks MP, Jones SE, McDaid A, Beaumont R, Donnelly L, Wood AR, Campbell A, Tyrrell J, Hocking LJ, Tuke MA, Ruth KS, Pearson ER, Murray A, Freathy RM, Munroe PB, Hayward C, Palmer C, Weedon MN, Pankow JS, Frayling TM, Kutalik Z. Quantifying the extent to which index event biases influence large genetic association studies. Hum Mol Genet 2017; 26:1018-1030. [PMID: 28040731 DOI: 10.1093/hmg/ddw433] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 10/17/2016] [Accepted: 12/19/2016] [Indexed: 11/12/2022]
Abstract
As genetic association studies increase in size to 100 000s of individuals, subtle biases may influence conclusions. One possible bias is 'index event bias' (IEB) that appears due to the stratification by, or enrichment for, disease status when testing associations between genetic variants and a disease-associated trait. We aimed to test the extent to which IEB influences some known trait associations in a range of study designs and provide a statistical framework for assessing future associations. Analyzing data from 113 203 non-diabetic UK Biobank participants, we observed three (near TCF7L2, CDKN2AB and CDKAL1) overestimated (body mass index (BMI) decreasing) and one (near MTNR1B) underestimated (BMI increasing) associations among 11 type 2 diabetes risk alleles (at P < 0.05). IEB became even stronger when we tested a type 2 diabetes genetic risk score composed of these 11 variants (-0.010 standard deviations BMI per allele, P = 5 × 10⁻⁴), which was confirmed in four additional independent studies. Similar results emerged when examining the effect of blood pressure increasing alleles on BMI in normotensive UK Biobank samples. Furthermore, we demonstrated that, under realistic scenarios, common disease alleles would become associated at P < 5 × 10⁻⁸ with disease-related traits through IEB alone, if disease prevalence in the sample differs appreciably from the background population prevalence. For example, some hypertension and type 2 diabetes alleles will be associated with BMI in sample sizes of >500 000 if the prevalence of those diseases differs by >10% from the background population. In conclusion, IEB may result in false positive or negative genetic associations in very large studies stratified or strongly enriched for/against disease cases.
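The mechanism behind index event bias can be made concrete with a small exact calculation. The sketch below is a toy model and not the framework used in the study: a binary risk allele G and a binary trait T (for example, high BMI) are independent in the full population, both raise disease risk additively, and all numeric values and the function name are hypothetical.

```python
def p_trait_among_disease_free(g, p_trait=0.5, base=0.1, eff_g=0.2, eff_t=0.2):
    """Return P(T=1 | G=g, D=0) when T and G are independent overall.

    Disease risk follows a hypothetical additive model:
        P(D=1 | G=g, T=t) = base + eff_g * g + eff_t * t
    """
    def p_no_disease(t):
        return 1.0 - (base + eff_g * g + eff_t * t)

    # Bayes' rule, restricted to disease-free individuals with genotype g
    num = p_trait * p_no_disease(1)
    den = num + (1.0 - p_trait) * p_no_disease(0)
    return num / den


p_noncarrier = p_trait_among_disease_free(0)  # 0.35 / 0.80 = 0.4375
p_carrier = p_trait_among_disease_free(1)     # 0.25 / 0.60, about 0.4167
print(p_noncarrier, p_carrier)
```

Although G has no effect on T in this toy population, restricting to disease-free individuals makes carriers of the risk allele less likely to have the trait, which is the direction of the spurious BMI-decreasing associations the abstract reports among non-diabetic participants.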
Affiliation(s)
- Hanieh Yaghootkar
- Genetics of Complex Traits, University of Exeter Medical School, University of Exeter, Exeter, UK
| | - Michael P Bancks
- Division of Epidemiology and Community Health, University of Minnesota, Minneapolis, MN, USA
| | - Sam E Jones
- Genetics of Complex Traits, University of Exeter Medical School, University of Exeter, Exeter, UK
| | - Aaron McDaid
- Institute of Social and Preventive Medicine, Lausanne University Hospital, Lausanne 1010, Switzerland
- Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
| | - Robin Beaumont
- Genetics of Complex Traits, University of Exeter Medical School, University of Exeter, Exeter, UK
| | - Louise Donnelly
- Division of Cardiovascular & Diabetes Medicine, Medical Research Institute, University of Dundee, Dundee, Scotland, UK
| | - Andrew R Wood
- Genetics of Complex Traits, University of Exeter Medical School, University of Exeter, Exeter, UK
| | - Archie Campbell
- Generation Scotland, Centre for Genomic and Experimental Medicine, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Crewe Road, Edinburgh, UK
| | - Jessica Tyrrell
- Genetics of Complex Traits, University of Exeter Medical School, University of Exeter, Exeter, UK
| | - Lynne J Hocking
- Institute of Medical Sciences, University of Aberdeen, Aberdeen, UK
| | - Marcus A Tuke
- Genetics of Complex Traits, University of Exeter Medical School, University of Exeter, Exeter, UK
| | - Katherine S Ruth
- Genetics of Complex Traits, University of Exeter Medical School, University of Exeter, Exeter, UK
| | - Ewan R Pearson
- Division of Cardiovascular & Diabetes Medicine, Medical Research Institute, University of Dundee, Dundee, Scotland, UK
| | - Anna Murray
- Genetics of Complex Traits, University of Exeter Medical School, University of Exeter, Exeter, UK
| | - Rachel M Freathy
- Genetics of Complex Traits, University of Exeter Medical School, University of Exeter, Exeter, UK
| | - Patricia B Munroe
- Clinical Pharmacology, William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, UK
- NIHR Barts Cardiovascular Biomedical Research Unit, Barts and The London School of Medicine, Queen Mary University of London, London, UK
| | - Caroline Hayward
- Generation Scotland, MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Crewe Road, Edinburgh, UK
| | - Colin Palmer
- Division of Cardiovascular & Diabetes Medicine, Medical Research Institute, University of Dundee, Dundee, Scotland, UK
| | - Michael N Weedon
- Genetics of Complex Traits, University of Exeter Medical School, University of Exeter, Exeter, UK
| | - James S Pankow
- Division of Epidemiology and Community Health, University of Minnesota, Minneapolis, MN, USA
| | - Timothy M Frayling
- Genetics of Complex Traits, University of Exeter Medical School, University of Exeter, Exeter, UK
| | - Zoltán Kutalik
- Institute of Social and Preventive Medicine, Lausanne University Hospital, Lausanne 1010, Switzerland
- Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
|
39
|
Ray D, Basu S. A novel association test for multiple secondary phenotypes from a case-control GWAS. Genet Epidemiol 2017; 41:413-426. [PMID: 28393390 DOI: 10.1002/gepi.22045] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 10/27/2016] [Revised: 12/22/2016] [Accepted: 02/05/2017] [Indexed: 12/13/2022]
Abstract
In the past decade, many genome-wide association studies (GWASs) have been conducted to explore association of single nucleotide polymorphisms (SNPs) with complex diseases using a case-control design. These GWASs not only collect information on the disease status (primary phenotype, D) and the SNPs (genotypes, X), but also collect extensive data on several risk factors and traits. Recent literature and grant proposals point toward a trend in reusing existing large case-control data for exploring genetic associations of some additional traits (secondary phenotypes, Y) collected during the study. These secondary phenotypes may be correlated, and a proper analysis warrants a multivariate approach. Commonly used multivariate methods are not equipped to properly account for the non-random sampling scheme. Current ad hoc practices include analyses without any adjustment, and analyses with D adjusted as a covariate. Our theoretical and empirical studies suggest that the type I error for testing genetic association of secondary traits can be substantial when X as well as Y are associated with D, even when there is no association between X and Y in the underlying (target) population. Whether using D as a covariate helps maintain type I error depends heavily on the disease mechanism and the underlying causal structure (which is often unknown). To avoid grossly incorrect inference, we propose a proportional odds model adjusted for propensity score (POM-PS). It uses a proportional odds logistic regression of X on Y and includes the estimated conditional probability of being diseased as a covariate. We demonstrate the validity and advantage of POM-PS, and compare it to some existing methods in extensive simulation experiments mimicking plausible scenarios of dependency among Y, X, and D. Finally, we use POM-PS to jointly analyze four adiposity traits using a type 2 diabetes (T2D) case-control sample from the population-based Metabolic Syndrome in Men (METSIM) study. Only POM-PS analysis of the T2D case-control sample seems to provide valid association signals.
Affiliation(s)
- Debashree Ray
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan, United States of America
| | - Saonli Basu
- Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, United States of America
|
40
|
Sofer T, Schifano ED, Christiani DC, Lin X. Weighted pseudolikelihood for SNP set analysis with multiple secondary outcomes in case-control genetic association studies. Biometrics 2017; 73:1210-1220. [PMID: 28346824 DOI: 10.1111/biom.12680] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/01/2015] [Revised: 01/01/2017] [Accepted: 02/01/2017] [Indexed: 11/29/2022]
Abstract
We propose a weighted pseudolikelihood method for analyzing the association of a SNP set, for example, SNPs in a gene or a genetic pathway or network, with multiple secondary phenotypes in case-control genetic association studies. To boost analysis power, we assume that the SNP-specific effects are shared across all secondary phenotypes using a scaled mean model. We estimate regression parameters using Inverse Probability Weighted (IPW) estimating equations obtained from the weighted pseudolikelihood, which accounts for case-control sampling to prevent potential ascertainment bias. To test the effect of a SNP set, we propose a weighted variance component pseudo-score test. We also propose a penalized IPW pseudolikelihood method for selecting a subset of SNPs that are associated with the multiple secondary phenotypes. We show that the proposed variable selection procedure has the oracle properties and is robust to misspecification of the correlation structure among secondary phenotypes. We select the tuning parameter using a weighted Bayesian Information-like Criterion (wBIC). We evaluate the finite sample performance of the proposed methods via simulations, and illustrate the methods by analyzing multiple secondary smoking behavior outcomes in a lung cancer case-control genetic association study.
Affiliation(s)
- Tamar Sofer
- Department of Biostatistics, University of Washington, Seattle, Washington 98105, U.S.A
| | - Elizabeth D Schifano
- Department of Statistics, University of Connecticut, Storrs, Connecticut 06269, U.S.A
| | - David C Christiani
- Department of Environmental Health, Harvard School of Public Health, Boston, Massachusetts 02115, U.S.A
| | - Xihong Lin
- Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts 02115, U.S.A
|
41
|
Lindström S, Loomis S, Turman C, Huang H, Huang J, Aschard H, Chan AT, Choi H, Cornelis M, Curhan G, De Vivo I, Eliassen AH, Fuchs C, Gaziano M, Hankinson SE, Hu F, Jensen M, Kang JH, Kabrhel C, Liang L, Pasquale LR, Rimm E, Stampfer MJ, Tamimi RM, Tworoger SS, Wiggs JL, Hunter DJ, Kraft P. A comprehensive survey of genetic variation in 20,691 subjects from four large cohorts. PLoS One 2017; 12:e0173997. [PMID: 28301549 PMCID: PMC5354293 DOI: 10.1371/journal.pone.0173997] [Citation(s) in RCA: 47] [Impact Index Per Article: 6.7] [Received: 06/08/2016] [Accepted: 03/01/2017] [Indexed: 12/18/2022]
Abstract
The Nurses' Health Study (NHS), Nurses' Health Study II (NHSII), Health Professionals Follow-up Study (HPFS) and the Physicians' Health Study (PHS) have collected detailed longitudinal data on multiple exposures and traits for approximately 310,000 study participants over the last 35 years. Over 160,000 study participants across the cohorts have donated a DNA sample and to date, 20,691 subjects have been genotyped as part of genome-wide association studies (GWAS) of twelve primary outcomes. However, these studies utilized six different GWAS arrays, making it difficult to conduct analyses of secondary phenotypes or share controls across studies. To allow for secondary analyses of these data, we have created three new datasets merged by platform family and performed imputation using a common reference panel, the 1000 Genomes Phase I release. Here, we describe the methodology behind the data merging and imputation and present imputation quality statistics and association results from two GWAS of secondary phenotypes (body mass index (BMI) and venous thromboembolism (VTE)). We observed the strongest BMI association for the FTO SNP rs55872725 (β = 0.45, p = 3.48 × 10⁻²²), and using a significance level of p = 0.05, we replicated 19 out of 32 known BMI SNPs. For VTE, we observed the strongest association for the rs2040445 SNP (OR = 2.17, 95% CI: 1.79-2.63, p = 2.70 × 10⁻¹⁵), located downstream of F5, and also observed significant associations for the known ABO and F11 regions. This pooled resource can be used to maximize power in GWAS of phenotypes collected across the cohorts and for studying gene-environment interactions as well as rare phenotypes and genotypes.
Affiliation(s)
- Sara Lindström
  - Program in Genetic Epidemiology and Statistical Genetics, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
  - Department of Epidemiology, University of Washington, Seattle, WA, United States of America
- Stephanie Loomis
  - Department of Ophthalmology, Harvard Medical School, Massachusetts Eye and Ear Infirmary, Boston, MA, United States of America
- Constance Turman
  - Program in Genetic Epidemiology and Statistical Genetics, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
- Hongyan Huang
  - Program in Genetic Epidemiology and Statistical Genetics, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
- Jinyan Huang
  - Program in Genetic Epidemiology and Statistical Genetics, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
- Hugues Aschard
  - Program in Genetic Epidemiology and Statistical Genetics, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
- Andrew T. Chan
  - Gastrointestinal Unit, Massachusetts General Hospital, Boston, MA, United States of America
- Hyon Choi
  - Section of Rheumatology and Clinical Epidemiology Unit, Boston University School of Medicine, Boston, MA, United States of America
- Marilyn Cornelis
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, United States of America
- Gary Curhan
  - Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, United States of America
  - Renal Division, Department of Medicine, Brigham and Women's Hospital, Boston, MA, United States of America
- Immaculata De Vivo
  - Program in Genetic Epidemiology and Statistical Genetics, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
  - Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, United States of America
- A. Heather Eliassen
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
  - Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, United States of America
- Charles Fuchs
  - Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, United States of America
  - Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, United States of America
- Michael Gaziano
  - Division of Aging, Department of Medicine, Brigham and Women's Hospital, Boston, MA, United States of America
- Susan E. Hankinson
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
  - Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, United States of America
  - Department of Biostatistics and Epidemiology, University of Massachusetts, Amherst, MA, United States of America
- Frank Hu
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
  - Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
- Majken Jensen
  - Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, United States of America
  - Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
- Jae H. Kang
  - Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, United States of America
- Christopher Kabrhel
  - Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, United States of America
  - Department of Emergency Medicine, Center for Vascular Emergencies, Massachusetts General Hospital, Harvard Medical School, Boston, MA, United States of America
- Liming Liang
  - Program in Genetic Epidemiology and Statistical Genetics, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
- Louis R. Pasquale
  - Department of Ophthalmology, Harvard Medical School, Massachusetts Eye and Ear Infirmary, Boston, MA, United States of America
  - Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, United States of America
- Eric Rimm
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
  - Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, United States of America
  - Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
- Meir J. Stampfer
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
  - Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, United States of America
  - Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
- Rulla M. Tamimi
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
  - Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, United States of America
- Shelley S. Tworoger
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
  - Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, United States of America
- Janey L. Wiggs
  - Department of Ophthalmology, Harvard Medical School, Massachusetts Eye and Ear Infirmary, Boston, MA, United States of America
- David J. Hunter
  - Program in Genetic Epidemiology and Statistical Genetics, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
  - Channing Division of Network Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, United States of America
  - Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
- Peter Kraft
  - Program in Genetic Epidemiology and Statistical Genetics, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
42
Tissier R, Tsonaka R, Mooijaart SP, Slagboom E, Houwing-Duistermaat JJ. Secondary phenotype analysis in ascertained family designs: application to the Leiden longevity study. Stat Med 2017; 36:2288-2301. [PMID: 28303589 PMCID: PMC5485037 DOI: 10.1002/sim.7281] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 01/29/2016] [Revised: 02/17/2017] [Accepted: 02/20/2017] [Indexed: 01/14/2023]
Abstract
The case-control design is often used to test associations between the case-control status and genetic variants. In addition to this primary phenotype, a number of additional traits, known as secondary phenotypes, are routinely recorded, and typically, associations between genetic factors and these secondary traits are studied too. Analysing secondary phenotypes in case-control studies may lead to biased genetic effect estimates, especially when the marker tested is associated with the primary phenotype and when the primary and secondary phenotypes tested are correlated. Several methods have been proposed in the literature to overcome the problem, but they are limited to case-control studies and not directly applicable to more complex designs, such as multiple-cases family studies. A proper secondary phenotype analysis, in this case, is complicated by the within-family correlations on top of the biased sampling design. We propose a novel approach to accommodate the ascertainment process while explicitly modelling the familial relationships. Our approach pairs existing methods for mixed-effects models with the retrospective likelihood framework and uses a multivariate probit model to capture the association between the mixed-type primary and secondary phenotypes. To examine the efficiency and bias of the estimates, we performed simulations under several scenarios for the association between the primary phenotype, secondary phenotype and genetic markers. We illustrate the method by analysing the association between triglyceride and glucose levels (secondary phenotypes) and genetic markers from the Leiden Longevity Study, a multiple-cases family study that investigates longevity. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
Affiliation(s)
- Renaud Tissier
  - Department of Medical Statistics and Bioinformatics, Leiden University Medical Centre, Leiden, The Netherlands
- Roula Tsonaka
  - Department of Medical Statistics and Bioinformatics, Leiden University Medical Centre, Leiden, The Netherlands
- Simon P Mooijaart
  - Department of Gerontology and Geriatrics, Leiden University Medical Centre, Leiden, The Netherlands
- Eline Slagboom
  - Department of Molecular Epidemiology, Leiden University Medical Centre, Leiden, The Netherlands
- Jeanine J Houwing-Duistermaat
  - Department of Medical Statistics and Bioinformatics, Leiden University Medical Centre, Leiden, The Netherlands
  - Department of Statistics, University of Leeds, U.K.
43
Ostrom C, Bair E, Maixner W, Dubner R, Fillingim RB, Ohrbach R, Slade GD, Greenspan JD. Demographic Predictors of Pain Sensitivity: Results From the OPPERA Study. J Pain 2017; 18:295-307. [PMID: 27884689 PMCID: PMC6408952 DOI: 10.1016/j.jpain.2016.10.018] [Citation(s) in RCA: 77] [Impact Index Per Article: 11.0] [Received: 05/27/2016] [Revised: 10/29/2016] [Accepted: 10/31/2016] [Indexed: 11/20/2022]
Abstract
The demographic factors of sex, age, and race/ethnicity are well recognized as relevant to pain sensitivity and clinical pain expression. Of these, sex differences have been the most frequently studied, and most of the literature describes greater pain sensitivity for women. The other 2 factors have been less frequently evaluated, and current literature is not definitive. Taking advantage of the large Orofacial Pain: Prospective Evaluation and Risk Assessment (OPPERA) study cohort, we evaluated the association of sex, age, and self-reported race with 34 measures of pressure, mechanical, and thermal pain sensitivity encompassing threshold and suprathreshold perception. Women were significantly more pain-sensitive than men for 29 of 34 measures. Age effects were small and significant for only 7 of 34 measures; however, the age range was limited (18-44 years of age). Race/ethnicity differences varied across groups and pain assessment type. Non-Hispanic white individuals were less pain-sensitive than African-American (for 21 of 34 measures), Hispanic (19 of 34), and Asian (6 of 34) individuals. No pain threshold measure showed significant racial differences, whereas several suprathreshold pain measures did. This suggests that racial differences are not related to tissue characteristics or inherent nociceptor sensitivity. Rather, the differences observed for suprathreshold pain ratings or tolerance are more likely related to differences in central nociceptive processing, including modulation imposed by cognitive, psychological, and/or affective factors. PERSPECTIVE The influence of sex, age, and race/ethnicity on various aspects of pain sensitivity, encompassing threshold and suprathreshold measures and multiple stimulus modalities, allows for a more complete evaluation of the relevance of these demographic factors to acute pain perception.
Affiliation(s)
- Cara Ostrom
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Eric Bair
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
  - Center for Pain Research and Innovation, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
  - Department of Endodontics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- William Maixner
  - Center for Pain Research and Innovation, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
  - Department of Endodontics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Ronald Dubner
  - Department of Neural and Pain Sciences, and Brotman Facial Pain Clinic, University of Maryland School of Dentistry, Baltimore, Maryland
- Roger B Fillingim
  - Department of Community Dentistry and Behavioral Science, University of Florida, Gainesville, Florida
- Richard Ohrbach
  - Department of Oral Diagnostic Services, University at Buffalo, Buffalo, New York
- Gary D Slade
  - Center for Pain Research and Innovation, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
  - Department of Dental Ecology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
  - Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Joel D Greenspan
  - Department of Neural and Pain Sciences, and Brotman Facial Pain Clinic, University of Maryland School of Dentistry, Baltimore, Maryland
44
Kang G, Bi W, Zhang H, Pounds S, Cheng C, Shete S, Zou F, Zhao Y, Zhang JF, Yue W. A Robust and Powerful Set-Valued Approach to Rare Variant Association Analyses of Secondary Traits in Case-Control Sequencing Studies. Genetics 2017; 205:1049-1062. [PMID: 28040743 PMCID: PMC5340322 DOI: 10.1534/genetics.116.192377] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 06/06/2016] [Accepted: 12/29/2016] [Indexed: 12/16/2022] Open
Abstract
In many case-control designs of genome-wide association (GWAS) or next generation sequencing (NGS) studies, extensive data are available on secondary traits that may correlate with the primary disease and share common genetic variants with it. Investigating these secondary traits can provide critical insights into the disease etiology or pathology, and enhance the GWAS or NGS results. Methods based on logistic regression (LG) were developed for this purpose. However, for the identification of rare variants (RVs), certain inadequacies in the LG models and algorithmic instability can cause severely inflated type I error and significant loss of power when the two traits are correlated and the RV is associated with the disease, especially at stringent significance levels. To address this issue, we propose a novel set-valued (SV) method that models a binary trait by dichotomization of an underlying continuous variable, and incorporate this into the genetic association model as a critical component. Extensive simulations and an analysis of seven secondary traits in a GWAS of benign ethnic neutropenia show that the SV method consistently controls type I error well at stringent significance levels, has larger power than the LG-based methods, and is robust to the effect pattern of the genetic variant (risk or protective), rare or common variants, rare or common diseases, and trait distributions. Because of the SV method's striking and profound advantage, we strongly recommend the SV method be employed instead of the LG-based methods for secondary traits analyses in case-control sequencing studies.
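The modeling idea named in this abstract, a binary trait arising from dichotomization of an underlying continuous variable, can be sketched as follows. This is only an illustration of the latent-threshold construction and not the authors' SV association test; the function names, the additive genotype effect `beta`, and the standard normal noise are assumptions made for the example.

```python
import numpy as np


def dichotomize(latent, threshold):
    """Return 1 where the latent continuous value exceeds the threshold, else 0."""
    return (np.asarray(latent, dtype=float) > threshold).astype(int)


def simulate_binary_trait(genotype, beta, threshold, rng):
    """Simulate a binary trait from a latent variable (hypothetical model).

    Assumed model: latent = beta * genotype + standard normal noise,
    and the observed trait is 1 when the latent value exceeds `threshold`.
    """
    g = np.asarray(genotype, dtype=float)
    latent = beta * g + rng.standard_normal(g.shape)
    return dichotomize(latent, threshold)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Genotypes coded as minor-allele counts (0, 1, or 2).
    genotype = rng.integers(0, 3, size=1000)
    trait = simulate_binary_trait(genotype, beta=0.3, threshold=1.0, rng=rng)
    print("observed trait frequency:", trait.mean())
```

Under this construction the threshold controls how common the binary trait is, while `beta` shifts the latent distribution for carriers of the variant.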
Affiliation(s)
- Guolian Kang
  - Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, Tennessee 38105
- Wenjian Bi
  - Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, Tennessee 38105
- Hang Zhang
  - Key Laboratory of Systems and Control, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, People's Republic of China
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, People's Republic of China
- Stanley Pounds
  - Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, Tennessee 38105
- Cheng Cheng
  - Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, Tennessee 38105
- Sanjay Shete
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030
- Fei Zou
  - Department of Biostatistics, The University of North Carolina at Chapel Hill, North Carolina 27599
- Yanlong Zhao
  - Key Laboratory of Systems and Control, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, People's Republic of China
- Ji-Feng Zhang
  - Key Laboratory of Systems and Control, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, People's Republic of China
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, People's Republic of China
- Weihua Yue
  - Institute of Mental Health, Key Laboratory of Mental Health, Ministry of Health & National Clinical Research Center for Mental Disorders, Sixth Hospital, Peking University, Beijing 100191, People's Republic of China
45
Abstract
Catechol-O-methyltransferase (COMT) is a polymorphic gene whose variants affect enzymatic activity and pain sensitivity via adrenergic pathways. Although COMT represents one of the most studied genes in human pain genetics, findings regarding its association with pain phenotypes are not always replicated. Here, we investigated whether interactions among functional COMT haplotypes, stress, and sex can modify the effect of COMT genetic variants on pain sensitivity. We tested these interactions in a cross-sectional study, including 2 cohorts, one of 2972 subjects tested for thermal pain sensitivity (Orofacial Pain: Prospective Evaluation and Risk Assessment) and one of 948 subjects with clinical acute pain after motor vehicle collision (post-motor vehicle collision). In both cohorts, the COMT high-pain sensitivity (HPS) haplotype showed robust interaction with stress, and the number of copies of the HPS haplotype was positively associated with pain sensitivity in nonstressed individuals, but not in stressed individuals. In the post-motor vehicle collision cohort, there was additional modification by sex: the HPS-stress interaction was apparent in males, but not in females. In summary, our findings indicate that stress and sex should be evaluated in association studies aiming to investigate the effect of COMT genetic variants on pain sensitivity.
46
Sofer T, Cornelis MC, Kraft P, Tchetgen Tchetgen EJ. Control function assisted IPW estimation with a secondary outcome in case-control studies. Stat Sin 2017. [PMID: 28649172 DOI: 10.5705/ss.202015.0116] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 11/06/2022]
Abstract
Case-control studies are designed to study associations between risk factors and a single, primary outcome. Information about additional, secondary outcomes is also collected, but association studies targeting such secondary outcomes should account for the case-control sampling scheme; otherwise, results may be biased. Often, one uses inverse probability weighted (IPW) estimators to estimate population effects in such studies. IPW estimators are robust, as they only require correct specification of the mean regression model of the secondary outcome on covariates, and knowledge of the disease prevalence. However, IPW estimators are inefficient relative to estimators that make additional assumptions about the data generating mechanism. We propose a class of estimators for the effect of risk factors on a secondary outcome in case-control studies that combine IPW with an additional modeling assumption: specification of the disease outcome probability model. We incorporate this model via a mean zero control function. We derive the class of all regular and asymptotically linear estimators corresponding to our modeling assumption, when the secondary outcome mean is modeled using either the identity or the log link. We find the efficient estimator in our class of estimators and show that it reduces to standard IPW when the model for the primary disease outcome is unrestricted, and is more efficient than standard IPW when the model is either parametric or semiparametric.
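As a rough illustration of the standard IPW idea that this abstract builds on (not the control-function estimator the authors propose), the sketch below reweights a case-control sample back to population proportions using an assumed known disease prevalence, then fits a weighted least-squares regression of the secondary outcome. The function names `ipw_weights` and `weighted_least_squares` are hypothetical, and the linear (identity-link) mean model is an assumption made for the example.

```python
import numpy as np


def ipw_weights(case_status, prevalence):
    """Weights that rescale cases and controls to their population shares.

    Assumes the population disease prevalence is known. Cases receive
    prevalence / (sample fraction of cases); controls receive
    (1 - prevalence) / (sample fraction of controls).
    """
    d = np.asarray(case_status, dtype=float)
    frac_cases = d.mean()
    w_case = prevalence / frac_cases
    w_control = (1.0 - prevalence) / (1.0 - frac_cases)
    return np.where(d == 1, w_case, w_control)


def weighted_least_squares(X, y, weights):
    """Solve the weighted normal equations (X' W X) beta = X' W y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(weights, dtype=float)
    Xw = X * w[:, None]  # each row of X scaled by its weight
    return np.linalg.solve(X.T @ Xw, Xw.T @ y)


if __name__ == "__main__":
    # Toy sample: two cases, two controls; assumed population prevalence 10%.
    case_status = np.array([1, 1, 0, 0])
    exposure = np.array([0.0, 1.0, 2.0, 3.0])
    secondary = 1.0 + 2.0 * exposure  # noise-free secondary outcome
    w = ipw_weights(case_status, prevalence=0.1)
    X = np.column_stack([np.ones_like(exposure), exposure])
    print(weighted_least_squares(X, secondary, w))
```

With a balanced sample and a 10% prevalence, each control counts nine times as much as each case, which is how the weighted fit targets the population rather than the ascertained sample.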
Affiliation(s)
- Tamar Sofer
  - University of Washington and Harvard T.H. Chan School of Public Health
- Peter Kraft
  - University of Washington and Harvard T.H. Chan School of Public Health
47
Zhu W, Yuan Y, Zhang J, Zhou F, Knickmeyer RC, Zhu H. Genome-wide association analysis of secondary imaging phenotypes from the Alzheimer's disease neuroimaging initiative study. Neuroimage 2016; 146:983-1002. [PMID: 27717770 DOI: 10.1016/j.neuroimage.2016.09.055] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 06/30/2016] [Revised: 08/13/2016] [Accepted: 09/21/2016] [Indexed: 11/17/2022] Open
Abstract
The aim of this paper is to systematically evaluate a biased sampling issue associated with genome-wide association analysis (GWAS) of imaging phenotypes for most imaging genetic studies, including the Alzheimer's Disease Neuroimaging Initiative (ADNI). Specifically, the original sampling scheme of these imaging genetic studies is primarily the retrospective case-control design, whereas most existing statistical analyses of these studies ignore this sampling scheme by directly correlating imaging phenotypes (called the secondary traits) with genotype. Although it has been well documented in genetic epidemiology that ignoring the case-control sampling scheme can produce highly biased estimates, and subsequently lead to misleading results and suspicious associations, such findings are not well documented in imaging genetics. We use extensive simulations and a large-scale imaging genetic data analysis of the ADNI data to evaluate the effects of the case-control sampling scheme on GWAS results based on some standard statistical methods, such as linear regression methods, while comparing them with several advanced statistical methods that appropriately adjust for the case-control sampling scheme.
Affiliation(s)
- Wensheng Zhu
  - School of Mathematics & Statistics and KLAS, Northeast Normal University, Changchun 130024, China
  - Departments of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Ying Yuan
  - Takeda Pharmaceuticals U.S.A., Inc., 300 Massachusetts Ave, Cambridge, MA 02139, USA
- Jingwen Zhang
  - Departments of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Fan Zhou
  - Departments of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Rebecca C Knickmeyer
  - Departments of Psychiatry, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Hongtu Zhu
  - Departments of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
  - Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
48
Yung G, Lin X. Validity of using ad hoc methods to analyze secondary traits in case-control association studies. Genet Epidemiol 2016; 40:732-743. [PMID: 27670932 DOI: 10.1002/gepi.21994] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 12/11/2015] [Revised: 06/23/2016] [Accepted: 06/26/2016] [Indexed: 11/10/2022]
Abstract
Case-control association studies often collect from their subjects information on secondary phenotypes. Reusing the data and studying the association between genes and secondary phenotypes provide an attractive and cost-effective approach that can lead to discovery of new genetic associations. A number of approaches have been proposed, including simple and computationally efficient ad hoc methods that ignore ascertainment or stratify on case-control status. Justification for these approaches relies on the assumption of no covariates and the correct specification of the primary disease model as a logistic model. Both might not be true in practice, for example, in the presence of population stratification or the primary disease model following a probit model. In this paper, we investigate the validity of ad hoc methods in the presence of covariates and possible disease model misspecification. We show that in taking an ad hoc approach, it may be desirable to include covariates that affect the primary disease in the secondary phenotype model, even though these covariates are not necessarily associated with the secondary phenotype. We also show that when the disease is rare, ad hoc methods can lead to severely biased estimation and inference if the true disease model follows a probit model instead of a logistic model. Our results are justified theoretically and via simulations. Applied to real data analysis of genetic associations with cigarette smoking, ad hoc methods collectively identified as highly significant (P < 10⁻⁵) single nucleotide polymorphisms from over 10 genes, genes that were identified in previous studies of smoking cessation.
Affiliation(s)
- Godwin Yung
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
- Xihong Lin
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
49
Zhang H, Wu CO, Yang Y, Berndt SI, Chanock SJ, Yu K. A multi-locus genetic association test for a dichotomous trait and its secondary phenotype. Stat Methods Med Res 2016; 27:1464-1475. [PMID: 27507288 DOI: 10.1177/0962280216662071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Genetic association studies often collect information on secondary phenotypes related to the primary disease status. In many situations, the secondary phenotypes are only measured in subjects with the disease condition. It would be advantageous to model the primary trait and the secondary phenotype together if they share a certain level of genetic heritability. We propose a family of multi-locus testing procedures to detect the composite association between a set of genetic markers and two traits (the primary trait and a secondary phenotype), in order to identify genes influencing both traits. The proposed test is derived from a random effect model with two variance components, each representing the genetic effect on one trait, and incorporates a model selection procedure for seeking the optimal model to represent the two sources of genetic effects. We conduct simulation studies to evaluate the performance of the proposed procedure and apply the method to a genome-wide association study of prostate cancer with the Gleason score as the secondary phenotype.
Affiliation(s)
- Han Zhang
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, USA
- Colin O Wu
  - Office of Biostatistics Research, National Heart, Lung and Blood Institute, Bethesda, USA
- Yifan Yang
  - Department of Statistics, University of Kentucky, Lexington, USA
- Sonja I Berndt
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, USA
- Stephen J Chanock
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, USA
- Kai Yu
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, USA
50
Xing C, McCarthy JM, Dupuis J, Cupples LA, Meigs JB, Lin X, Allen AS. Robust analysis of secondary phenotypes in case-control genetic association studies. Stat Med 2016; 35:4226-37. [PMID: 27241694 DOI: 10.1002/sim.6976] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 04/22/2015] [Revised: 02/04/2016] [Accepted: 04/04/2016] [Indexed: 11/11/2022]
Abstract
The case-control study is a common design for assessing the association between genetic exposures and a disease phenotype. Though association with a given (case-control) phenotype is always of primary interest, there is often considerable interest in assessing relationships between genetic exposures and other (secondary) phenotypes. However, the case-control sample represents a biased sample from the general population. As a result, if this sampling framework is not correctly taken into account, analyses estimating the effect of exposures on secondary phenotypes can be biased, leading to incorrect inference. In this paper, we address this problem and propose a general approach for estimating and testing the population effect of a genetic variant on a secondary phenotype. Our approach is based on inverse probability weighted estimating equations, where the weights depend on genotype and the secondary phenotype. We show that, though slightly less efficient than a full likelihood-based analysis when the likelihood is correctly specified, it is substantially more robust to model misspecification, and can outperform likelihood-based analysis, both in terms of validity and power, when the model is misspecified. We illustrate our approach with an application to a case-control study extracted from the Framingham Heart Study. Copyright © 2016 John Wiley & Sons, Ltd.
Affiliation(s)
- Chuanhua Xing
  - Department of Biostatistics, Boston University School of Public Health, Boston, 02118, MA, U.S.A.
- Janice M McCarthy
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, 27710, NC, U.S.A.
- Josée Dupuis
  - Department of Biostatistics, Boston University School of Public Health, Boston, 02118, MA, U.S.A.
  - National Heart, Lung and Blood Institute's Framingham Heart Study, Framingham, 01702, MA, U.S.A.
- L Adrienne Cupples
  - Department of Biostatistics, Boston University School of Public Health, Boston, 02118, MA, U.S.A.
  - National Heart, Lung and Blood Institute's Framingham Heart Study, Framingham, 01702, MA, U.S.A.
- James B Meigs
  - General Medicine Division, Massachusetts General Hospital, Boston, 02114, MA, U.S.A.
  - Department of Medicine, Harvard Medical School, Boston, 02115, MA, U.S.A.
- Xihong Lin
  - Department of Biostatistics, Harvard University, Cambridge, 01238, MA, U.S.A.
- Andrew S Allen
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, 27710, NC, U.S.A.
  - Center for Human Genome Variation, Duke University, Durham, 27710, NC, U.S.A.