1
|
Mbatchou J, McPeek MS. JASPER: Fast, powerful, multitrait association testing in structured samples gives insight on pleiotropy in gene expression. Am J Hum Genet 2024; 111:1750-1769. [PMID: 39025064 DOI: 10.1016/j.ajhg.2024.06.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Revised: 06/19/2024] [Accepted: 06/20/2024] [Indexed: 07/20/2024] Open
Abstract
Joint association analysis of multiple traits with multiple genetic variants can provide insight into genetic architecture and pleiotropy, improve trait prediction, and increase power for detecting association. Furthermore, some traits are naturally high-dimensional, e.g., images, networks, or longitudinally measured traits. Assessing significance for multitrait genetic association can be challenging, especially when the sample has population sub-structure and/or related individuals. Failure to adequately adjust for sample structure can lead to power loss and inflated type 1 error, and commonly used methods for assessing significance can work poorly with a large number of traits or be computationally slow. We developed JASPER, a fast, powerful, robust method for assessing significance of multitrait association with a set of genetic variants, in samples that have population sub-structure, admixture, and/or relatedness. In simulations, JASPER has higher power, better type 1 error control, and faster computation than existing methods, with the power and speed advantage of JASPER increasing with the number of traits. JASPER is potentially applicable to a wide range of association testing applications, including for multiple disease traits, expression traits, image-derived traits, and microbiome abundances. It allows for covariates, ascertainment, and rare variants and is robust to phenotype model misspecification. We apply JASPER to analyze gene expression in the Framingham Heart Study, where, compared to alternative approaches, JASPER finds more significant associations, including several that indicate pleiotropic effects, most of which replicate previous results, while others have not previously been reported. Our results demonstrate the promise of JASPER for powerful multitrait analysis in structured samples.
Affiliation(s)
- Joelle Mbatchou
- Regeneron Genetics Center, Tarrytown, NY 10591, USA; Department of Statistics, The University of Chicago, Chicago, IL 60637, USA
| | - Mary Sara McPeek
- Department of Statistics, The University of Chicago, Chicago, IL 60637, USA; Department of Human Genetics, The University of Chicago, Chicago, IL 60637, USA.
|
2
|
Rossi N, Syed N, Visconti A, Aliyev E, Berry S, Bourbon M, Spector TD, Hysi PG, Fakhro KA, Falchi M. Rare variants at KCNJ2 are associated with LDL-cholesterol levels in a cross-population study. NPJ Genom Med 2024; 9:36. [PMID: 38942744 PMCID: PMC11213907 DOI: 10.1038/s41525-024-00417-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Accepted: 05/03/2024] [Indexed: 06/30/2024] Open
Abstract
Leveraging whole-genome sequencing data of 1751 individuals from the UK and 2587 Qatari subjects, we suggest here an association of rare variants mapping to the sour taste-associated gene KCNJ2 with reduced low-density lipoprotein cholesterol (LDL-C, P = 2.10 × 10⁻¹²) and with a 22% decreased dietary trans-fat intake. This study identifies a novel candidate rare locus for LDL-C, adding insights into the genetic architecture of a complex trait implicated in cardiovascular disease.
Affiliation(s)
- Niccolò Rossi
- Department of Twin Research & Genetic Epidemiology, King's College London, London, UK
| | - Najeeb Syed
- Department of Human Genetics, Sidra Medical and Research Center, Doha, Qatar
| | - Alessia Visconti
- Department of Twin Research & Genetic Epidemiology, King's College London, London, UK
- Center for Biostatistics, Epidemiology and Public Health, Department of Clinical and Biological Sciences, University of Turin, Turin, Italy
| | - Elbay Aliyev
- Department of Human Genetics, Sidra Medical and Research Center, Doha, Qatar
| | - Sarah Berry
- Department of Nutritional Sciences, King's College London, London, UK
| | - Mafalda Bourbon
- Cardiovascular Research Group, Department of Health Promotion and Prevention of non-Communicable Diseases, Instituto Nacional de Saúde Dr. Ricardo Jorge, Lisbon, Portugal
| | - Tim D Spector
- Department of Twin Research & Genetic Epidemiology, King's College London, London, UK
| | - Pirro G Hysi
- Department of Twin Research & Genetic Epidemiology, King's College London, London, UK
| | - Khalid A Fakhro
- Department of Human Genetics, Sidra Medical and Research Center, Doha, Qatar
- Department of Genetic Medicine, Weill-Cornell Medical College, Doha, Qatar
| | - Mario Falchi
- Department of Twin Research & Genetic Epidemiology, King's College London, London, UK.
|
3
|
Mbatchou J, McPeek MS. JASPER: fast, powerful, multitrait association testing in structured samples gives insight on pleiotropy in gene expression. bioRxiv 2023:2023.12.18.571948. [PMID: 38187553 PMCID: PMC10769254 DOI: 10.1101/2023.12.18.571948] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
Joint association analysis of multiple traits with multiple genetic variants can provide insight into genetic architecture and pleiotropy, improve trait prediction and increase power for detecting association. Furthermore, some traits are naturally high-dimensional, e.g., images, networks or longitudinally measured traits. Assessing significance for multitrait genetic association can be challenging, especially when the sample has population sub-structure and/or related individuals. Failure to adequately adjust for sample structure can lead to power loss and inflated type 1 error, and commonly used methods for assessing significance can work poorly with a large number of traits or be computationally slow. We developed JASPER, a fast, powerful, robust method for assessing significance of multitrait association with a set of genetic variants, in samples that have population sub-structure, admixture and/or relatedness. In simulations, JASPER has higher power, better type 1 error control, and faster computation than existing methods, with the power and speed advantage of JASPER increasing with the number of traits. JASPER is potentially applicable to a wide range of association testing applications, including for multiple disease traits, expression traits, image-derived traits and microbiome abundances. It allows for covariates, ascertainment and rare variants and is robust to phenotype model misspecification. We apply JASPER to analyze gene expression in the Framingham Heart Study, where, compared to alternative approaches, JASPER finds more significant associations, including several that indicate pleiotropic effects, some of which replicate previous results, while others have not previously been reported. Our results demonstrate the promise of JASPER for powerful multitrait analysis in structured samples.
Affiliation(s)
- Joelle Mbatchou
- Regeneron Genetics Center, Tarrytown, NY 10591, USA
- Department of Statistics, The University of Chicago, Chicago, IL 60637, USA
| | - Mary Sara McPeek
- Department of Statistics, The University of Chicago, Chicago, IL 60637, USA
- Department of Human Genetics, The University of Chicago, Chicago, IL 60637, USA
|
4
|
Dossa HRG, Bureau A, Maziade M, Lakhal-Chaieb L, Oualkacha K. A novel rare variants association test for binary traits in family-based designs via copulas. Stat Methods Med Res 2023; 32:2096-2122. [PMID: 37832140 PMCID: PMC10683345 DOI: 10.1177/09622802231197977] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/15/2023]
Abstract
With cost-effective whole-genome sequencing technology now available, more sophisticated statistical methods for testing genetic association with both rare and common variants are being investigated to identify the genetic variation between individuals. Several methods that group variants, also called gene-based approaches, have been developed. For instance, advanced extensions of the sequence kernel association test, a widely used variant-set test, have been proposed for unrelated samples and extended to family data. Family data have been shown to be powerful when analyzing rare variants. However, most such methods capture familial relatedness using a random-effect component within the generalized linear mixed model framework. Therefore, there is a need to develop unified and flexible methods to study the association between a set of genetic variants and a trait, especially for a binary outcome. Copulas are multivariate distribution functions with uniform margins on the [0, 1] interval, and they provide suitable models to capture familial dependence structure. In this work, we propose a flexible family-based association test for both rare and common variants in the presence of binary traits. The method, termed novel rare variant association test (NRVAT), uses a marginal logistic model and a Gaussian copula. The latter is employed to model the dependence between relatives. An analytic score-type test is derived. Through simulations, we show that our method can achieve greater power than existing approaches. The proposed model is applied to investigate the association between schizophrenia and bipolar disorder in a family-based cohort consisting of 17 extended families from Eastern Quebec.
Affiliation(s)
- Houssou R. G. Dossa
- Département de Mathématiques, Université du Québec à Montréal (UQAM), Québec, Canada
| | - Alexandre Bureau
- Département de Médecine Sociale et Préventive, Université Laval, Québec, Canada
- Centre de Recherche CERVO, Quebec, Canada
| | - Michel Maziade
- Centre de Recherche CERVO, Quebec, Canada
- Département de Psychiatrie et Neuroscience, Université Laval, Québec, Canada
| | - Lajmi Lakhal-Chaieb
- Département de Mathématiques et Statistique, Université Laval, Québec, Canada
| | - Karim Oualkacha
- Département de Mathématiques, Université du Québec à Montréal (UQAM), Québec, Canada
|
5
|
Mueller SH, Lai AG, Valkovskaya M, Michailidou K, Bolla MK, Wang Q, Dennis J, Lush M, Abu-Ful Z, Ahearn TU, Andrulis IL, Anton-Culver H, Antonenkova NN, Arndt V, Aronson KJ, Augustinsson A, Baert T, Freeman LEB, Beckmann MW, Behrens S, Benitez J, Bermisheva M, Blomqvist C, Bogdanova NV, Bojesen SE, Bonanni B, Brenner H, Brucker SY, Buys SS, Castelao JE, Chan TL, Chang-Claude J, Chanock SJ, Choi JY, Chung WK, Colonna SV, Cornelissen S, Couch FJ, Czene K, Daly MB, Devilee P, Dörk T, Dossus L, Dwek M, Eccles DM, Ekici AB, Eliassen AH, Engel C, Evans DG, Fasching PA, Fletcher O, Flyger H, Gago-Dominguez M, Gao YT, García-Closas M, García-Sáenz JA, Genkinger J, Gentry-Maharaj A, Grassmann F, Guénel P, Gündert M, Haeberle L, Hahnen E, Haiman CA, Håkansson N, Hall P, Harkness EF, Harrington PA, Hartikainen JM, Hartman M, Hein A, Ho WK, Hooning MJ, Hoppe R, Hopper JL, Houlston RS, Howell A, Hunter DJ, Huo D, Ito H, Iwasaki M, Jakubowska A, Janni W, John EM, Jones ME, Jung A, Kaaks R, Kang D, Khusnutdinova EK, Kim SW, Kitahara CM, Koutros S, Kraft P, Kristensen VN, Kubelka-Sabit K, Kurian AW, Kwong A, Lacey JV, Lambrechts D, Le Marchand L, Li J, Linet M, Lo WY, Long J, Lophatananon A, Mannermaa A, Manoochehri M, Margolin S, Matsuo K, Mavroudis D, Menon U, Muir K, Murphy RA, Nevanlinna H, Newman WG, Niederacher D, O'Brien KM, Obi N, Offit K, Olopade OI, Olshan AF, Olsson H, Park SK, Patel AV, Patel A, Perou CM, Peto J, Pharoah PDP, Plaseska-Karanfilska D, Presneau N, Rack B, Radice P, Ramachandran D, Rashid MU, Rennert G, Romero A, Ruddy KJ, Ruebner M, Saloustros E, Sandler DP, Sawyer EJ, Schmidt MK, Schmutzler RK, Schneider MO, Scott C, Shah M, Sharma P, Shen CY, Shu XO, Simard J, Surowy H, Tamimi RM, Tapper WJ, Taylor JA, Teo SH, Teras LR, Toland AE, Tollenaar RAEM, Torres D, Torres-Mejía G, Troester MA, Truong T, Vachon CM, Vijai J, Weinberg CR, Wendt C, Winqvist R, Wolk A, Wu AH, Yamaji T, Yang XR, Yu JC, Zheng W, Ziogas A, Ziv E, Dunning AM, Easton DF, Hemingway H, 
Hamann U, Kuchenbaecker KB. Aggregation tests identify new gene associations with breast cancer in populations with diverse ancestry. Genome Med 2023; 15:7. [PMID: 36703164 PMCID: PMC9878779 DOI: 10.1186/s13073-022-01152-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/05/2022] [Accepted: 12/16/2022] [Indexed: 01/27/2023] Open
Abstract
BACKGROUND: Low-frequency variants play an important role in breast cancer (BC) susceptibility. Gene-based methods can increase power by combining multiple variants in the same gene and help identify target genes. METHODS: We evaluated the potential of gene-based aggregation in the Breast Cancer Association Consortium cohorts including 83,471 cases and 59,199 controls. Low-frequency variants were aggregated for individual genes' coding and regulatory regions. Association results in European ancestry samples were compared to single-marker association results in the same cohort. Gene-based associations were also combined in meta-analysis across individuals with European, Asian, African, and Latin American and Hispanic ancestry. RESULTS: In European ancestry samples, 14 genes were significantly associated (q < 0.05) with BC. Of those, two genes, FMNL3 (P = 6.11 × 10⁻⁶) and AC058822.1 (P = 1.47 × 10⁻⁴), represent new associations. High FMNL3 expression has previously been linked to poor prognosis in several other cancers. Meta-analysis of samples with diverse ancestry discovered further associations including established candidate genes ESR1 and CBLB. Furthermore, literature review and database query found further support for a biologically plausible link with cancer for genes CBLB, FMNL3, FGFR2, LSP1, MAP3K1, and SRGAP2C. CONCLUSIONS: Using extended gene-based aggregation tests including coding and regulatory variation, we report identification of plausible target genes for previously identified single-marker associations with BC as well as the discovery of novel genes implicated in BC development. Including multi-ancestral cohorts in this study enabled the identification of otherwise missed disease associations such as ESR1 (P = 1.31 × 10⁻⁵), demonstrating the importance of diversifying study cohorts.
Affiliation(s)
| | - Alvina G Lai
- Institute of Health Informatics, University College London, London, UK
| | | | - Kyriaki Michailidou
- Biostatistics Unit, The Cyprus Institute of Neurology and Genetics, 2371, Nicosia, Cyprus
- Cyprus School of Molecular Medicine, The Cyprus Institute of Neurology and Genetics, 2371, Nicosia, Cyprus
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, CB1 8RN, UK
| | - Manjeet K Bolla
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, CB1 8RN, UK
| | - Qin Wang
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, CB1 8RN, UK
| | - Joe Dennis
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, CB1 8RN, UK
| | - Michael Lush
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, CB1 8RN, UK
| | - Zomoruda Abu-Ful
- Clalit National Cancer Control Center, Carmel Medical Center and Technion Faculty of Medicine, 35254, Haifa, Israel
| | - Thomas U Ahearn
- Division of Cancer Epidemiology and Genetics, Department of Health and Human Services, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20850, USA
| | - Irene L Andrulis
- Fred A. Litwin Center for Cancer Genetics, Lunenfeld-Tanenbaum Research Institute of Mount Sinai Hospital, Toronto, ON, M5G 1X5, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, M5S 1A8, Canada
| | - Hoda Anton-Culver
- Department of Medicine, Genetic Epidemiology Research Institute, University of California Irvine, Irvine, CA, 92617, USA
| | - Natalia N Antonenkova
- N.N. Alexandrov Research Institute of Oncology and Medical Radiology, 223040, Minsk, Belarus
| | - Volker Arndt
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), 69120, Heidelberg, Germany
| | - Kristan J Aronson
- Department of Public Health Sciences, and Cancer Research Institute, Queen's University, Kingston, ON, K7L 3N6, Canada
| | - Annelie Augustinsson
- Department of Cancer Epidemiology, Clinical Sciences, Lund University, 222 42, Lund, Sweden
| | - Thais Baert
- Leuven Multidisciplinary Breast Center, Department of Oncology, Leuven Cancer Institute, University Hospitals Leuven, 3000, Louvain, Belgium
| | - Laura E Beane Freeman
- Division of Cancer Epidemiology and Genetics, Department of Health and Human Services, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20850, USA
| | - Matthias W Beckmann
- Department of Gynecology and Obstetrics, Comprehensive Cancer Center Erlangen-EMN, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nuremberg (FAU), 91054, Erlangen, Germany
| | - Sabine Behrens
- Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), 69120, Heidelberg, Germany
| | - Javier Benitez
- Biomedical Network On Rare Diseases (CIBERER), 28029, Madrid, Spain
- Human Cancer Genetics Programme, Spanish National Cancer Research Centre (CNIO), 28029, Madrid, Spain
| | - Marina Bermisheva
- Institute of Biochemistry and Genetics, Ufa Federal Research Centre of the Russian Academy of Sciences, Ufa, 450054, Russia
| | - Carl Blomqvist
- Department of Oncology, Helsinki University Hospital, University of Helsinki, 00290, Helsinki, Finland
- Department of Oncology, Örebro University Hospital, 70185, Örebro, Sweden
| | - Natalia V Bogdanova
- N.N. Alexandrov Research Institute of Oncology and Medical Radiology, 223040, Minsk, Belarus
- Department of Radiation Oncology, Hannover Medical School, 30625, Hannover, Germany
- Gynaecology Research Unit, Hannover Medical School, 30625, Hannover, Germany
| | - Stig E Bojesen
- Copenhagen General Population Study, Herlev and Gentofte Hospital, Copenhagen University Hospital, 2730, Herlev, Denmark
- Department of Clinical Biochemistry, Herlev and Gentofte Hospital, Copenhagen University Hospital, 2730, Herlev, Denmark
- Faculty of Health and Medical Sciences, University of Copenhagen, 2200, Copenhagen, Denmark
| | - Bernardo Bonanni
- Division of Cancer Prevention and Genetics, IEO, European Institute of Oncology IRCCS, 20141, Milan, Italy
| | - Hermann Brenner
- Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), 69120, Heidelberg, Germany
- Division of Preventive Oncology, German Cancer Research Center (DKFZ), National Center for Tumor Diseases (NCT), 69120, Heidelberg, Germany
- German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), 69120, Heidelberg, Germany
| | - Sara Y Brucker
- Department of Gynecology and Obstetrics, University of Tübingen, 72076, Tübingen, Germany
| | - Saundra S Buys
- Department of Medicine, Huntsman Cancer Institute, Salt Lake City, UT, 84112, USA
| | - Jose E Castelao
- Oncology and Genetics Unit, Instituto de Investigación Sanitaria Galicia Sur (IISGS), Xerencia de Xestion Integrada de Vigo-SERGAS, 36312, Vigo, Spain
| | - Tsun L Chan
- Hong Kong Hereditary Breast Cancer Family Registry, Hong Kong, China
- Department of Molecular Pathology, Hong Kong Sanatorium and Hospital, Hong Kong, China
| | - Jenny Chang-Claude
- Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), 69120, Heidelberg, Germany
- Cancer Epidemiology Group, University Cancer Center Hamburg (UCCH), University Medical Center Hamburg-Eppendorf, 20246, Hamburg, Germany
| | - Stephen J Chanock
- Division of Cancer Epidemiology and Genetics, Department of Health and Human Services, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20850, USA
| | - Ji-Yeob Choi
- Department of Biomedical Sciences, Seoul National University Graduate School, Seoul, 03080, Korea
- Cancer Research Institute, Seoul National University, Seoul, 03080, Korea
- Institute of Health Policy and Management, Seoul National University Medical Research Center, Seoul, 03080, Korea
| | - Wendy K Chung
- Departments of Pediatrics and Medicine, Columbia University, New York, NY, 10032, USA
| | - Sarah V Colonna
- Department of Medicine, Huntsman Cancer Institute, Salt Lake City, UT, 84112, USA
| | - Sten Cornelissen
- Division of Molecular Pathology, The Netherlands Cancer Institute - Antoni Van Leeuwenhoek Hospital, Amsterdam, 1066 CX, The Netherlands
| | - Fergus J Couch
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, 55905, USA
| | - Kamila Czene
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, 171 65, Stockholm, Sweden
| | - Mary B Daly
- Department of Clinical Genetics, Fox Chase Cancer Center, Philadelphia, PA, 19111, USA
| | - Peter Devilee
- Department of Pathology, Leiden University Medical Center, Leiden, 2333 ZA, The Netherlands
- Department of Human Genetics, Leiden University Medical Center, Leiden, 2333 ZA, The Netherlands
| | - Thilo Dörk
- Gynaecology Research Unit, Hannover Medical School, 30625, Hannover, Germany
| | - Laure Dossus
- Nutrition and Metabolism Section, International Agency for Research On Cancer (IARC-WHO), 69372, Lyon, France
| | - Miriam Dwek
- School of Life Sciences, University of Westminster, London, W1W 6UW, UK
| | - Diana M Eccles
- Faculty of Medicine, University of Southampton, Southampton, SO17 1BJ, UK
| | - Arif B Ekici
- Institute of Human Genetics, Comprehensive Cancer Center Erlangen-EMN, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nuremberg (FAU), 91054, Erlangen, Germany
| | - A Heather Eliassen
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02115, USA
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
- Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
| | - Christoph Engel
- Institute for Medical Informatics, Statistics and Epidemiology, University of Leipzig, 04107, Leipzig, Germany
- LIFE - Leipzig Research Centre for Civilization Diseases, University of Leipzig, 04103, Leipzig, Germany
| | - D Gareth Evans
- Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, M13 9WL, UK
- North West Genomics Laboratory Hub, Manchester Centre for Genomic Medicine, St Mary's Hospital, Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, Manchester, M13 9WL, UK
| | - Peter A Fasching
- Department of Gynecology and Obstetrics, Comprehensive Cancer Center Erlangen-EMN, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nuremberg (FAU), 91054, Erlangen, Germany
- David Geffen School of Medicine, Department of Medicine Division of Hematology and Oncology, University of California at Los Angeles, Los Angeles, CA, 90095, USA
| | - Olivia Fletcher
- The Breast Cancer Now Toby Robins Research Centre, The Institute of Cancer Research, London, SW7 3RP, UK
| | - Henrik Flyger
- Department of Breast Surgery, Herlev and Gentofte Hospital, Copenhagen University Hospital, 2730, Herlev, Denmark
| | - Manuela Gago-Dominguez
- Genomic Medicine Group, International Cancer Genetics and Epidemiology Group, Fundación Pública Galega de Medicina Xenómica, Instituto de Investigación Sanitaria de Santiago de Compostela (IDIS), Complejo Hospitalario Universitario de Santiago, SERGAS, 15706, Santiago de Compostela, Spain
- Moores Cancer Center, University of California San Diego, La Jolla, CA, 92037, USA
| | - Yu-Tang Gao
- Department of Epidemiology, Shanghai Cancer Institute, Shanghai, 20032, China
| | - Montserrat García-Closas
- Division of Cancer Epidemiology and Genetics, Department of Health and Human Services, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20850, USA
| | - José A García-Sáenz
- Medical Oncology Department, Centro Investigación Biomédica en Red de Cáncer (CIBERONC), Hospital Clínico San Carlos, Instituto de Investigación Sanitaria San Carlos (IdISSC), 28040, Madrid, Spain
| | - Jeanine Genkinger
- Department of Epidemiology, Mailman School of Public Health, Columbia University, New York, NY, 10032, USA
| | | | - Felix Grassmann
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, 171 65, Stockholm, Sweden
- Health and Medical University, 14471, Potsdam, Germany
| | - Pascal Guénel
- Center for Research in Epidemiology and Population Health (CESP), Team Exposome and Heredity, INSERM, University Paris-Saclay, 94805, Villejuif, France
| | - Melanie Gündert
- Molecular Epidemiology Group, German Cancer Research Center (DKFZ), C080, 69120, Heidelberg, Germany
- Molecular Biology of Breast Cancer, University Womens Clinic Heidelberg, University of Heidelberg, 69120, Heidelberg, Germany
- Institute of Diabetes Research, Helmholtz Zentrum München, German Research Center for Environmental Health, 85764, Neuherberg, Germany
| | - Lothar Haeberle
- Department of Gynecology and Obstetrics, Comprehensive Cancer Center Erlangen-EMN, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nuremberg (FAU), 91054, Erlangen, Germany
| | - Eric Hahnen
- Center for Familial Breast and Ovarian Cancer, Faculty of Medicine, University Hospital Cologne, University of Cologne, 50937, Cologne, Germany
- Center for Integrated Oncology (CIO), Faculty of Medicine, University Hospital Cologne, University of Cologne, 50937, Cologne, Germany
| | - Christopher A Haiman
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, 90033, USA
| | - Niclas Håkansson
- Institute of Environmental Medicine, Karolinska Institutet, 171 77, Stockholm, Sweden
| | - Per Hall
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, 171 65, Stockholm, Sweden
- Department of Oncology, 118 83, Södersjukhuset, Stockholm, Sweden
| | - Elaine F Harkness
- Division of Informatics, Imaging and Data Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, M13 9PT, UK
- Nightingale and Genesis Prevention Centre, Wythenshawe Hospital, Manchester University NHS Foundation Trust, Manchester, M23 9LT, UK
- NIHR Manchester Biomedical Research Unit, Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, Manchester, M13 9WL, UK
| | - Patricia A Harrington
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge, CB1 8RN, UK
| | - Jaana M Hartikainen
- Translational Cancer Research Area, University of Eastern Finland, 70210, Kuopio, Finland
- Institute of Clinical Medicine, Pathology and Forensic Medicine, University of Eastern Finland, 70210, Kuopio, Finland
| | - Mikael Hartman
- Saw Swee Hock School of Public Health, National University of Singapore, National University Health System, Singapore, 119077, Singapore
- Department of Surgery, National University Health System, Singapore, 119228, Singapore
| | - Alexander Hein
- Department of Gynecology and Obstetrics, Comprehensive Cancer Center Erlangen-EMN, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nuremberg (FAU), 91054, Erlangen, Germany
| | - Weang-Kee Ho
- Department of Mathematical Sciences, Faculty of Science and Engineering, University of Nottingham Malaysia Campus, 43500, Semenyih, Selangor, Malaysia
- Breast Cancer Research Programme, Cancer Research Malaysia, Subang Jaya, 47500, Selangor, Malaysia
| | - Maartje J Hooning
- Department of Medical Oncology, Erasmus MC Cancer Institute, Rotterdam, 3015 GD, The Netherlands
| | - Reiner Hoppe
- Dr. Margarete Fischer-Bosch-Institute of Clinical Pharmacology, 70376, Stuttgart, Germany
- University of Tübingen, 72074, Tübingen, Germany
| | - John L Hopper
- Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Melbourne, VIC, 3010, Australia
| | - Richard S Houlston
- Division of Genetics and Epidemiology, The Institute of Cancer Research, London, SM2 5NG, UK
| | - Anthony Howell
- Division of Cancer Sciences, University of Manchester, Manchester, M13 9PL, UK
| | - David J Hunter
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
- Nuffield Department of Population Health, University of Oxford, Oxford, OX3 7LF, UK
| | - Dezheng Huo
- Center for Clinical Cancer Genetics, The University of Chicago, Chicago, IL, 60637, USA
| | - Hidemi Ito
- Division of Cancer Information and Control, Aichi Cancer Center Research Institute, Nagoya, 464-8681, Japan
- Division of Cancer Epidemiology, Nagoya University Graduate School of Medicine, Nagoya, 466-8550, Japan
| | - Motoki Iwasaki
- Division of Epidemiology, Center for Public Health Sciences, National Cancer Center Institute for Cancer Control, Tokyo, 104-0045, Japan
| | - Anna Jakubowska
- Department of Genetics and Pathology, Pomeranian Medical University, 71-252, Szczecin, Poland
- Independent Laboratory of Molecular Biology and Genetic Diagnostics, Pomeranian Medical University, 71-252, Szczecin, Poland
| | - Wolfgang Janni
- Department of Gynaecology and Obstetrics, University Hospital Ulm, 89075, Ulm, Germany
| | - Esther M John
- Department of Epidemiology and Population Health, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Department of Medicine, Division of Oncology, Stanford Cancer Institute, Stanford University School of Medicine, Stanford, CA, 94304, USA
| | - Michael E Jones
- Division of Genetics and Epidemiology, The Institute of Cancer Research, London, SM2 5NG, UK
| | - Audrey Jung
- Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), 69120, Heidelberg, Germany
| | - Rudolf Kaaks
- Division of Cancer Epidemiology, German Cancer Research Center (DKFZ), 69120, Heidelberg, Germany
| | - Daehee Kang
- Cancer Research Institute, Seoul National University, Seoul, 03080, Korea
- Department of Preventive Medicine, Seoul National University College of Medicine, Seoul, 03080, Korea
| | - Elza K Khusnutdinova
- Institute of Biochemistry and Genetics, Ufa Federal Research Centre of the Russian Academy of Sciences, Ufa, 450054, Russia
- Department of Genetics and Fundamental Medicine, Bashkir State University, Ufa, 450000, Russia
| | - Sung-Won Kim
- Department of Surgery, Daerim Saint Mary's Hospital, Seoul, 07442, Korea
| | - Cari M Kitahara
- Radiation Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, 20892, USA
| | - Stella Koutros
- Division of Cancer Epidemiology and Genetics, Department of Health and Human Services, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20850, USA
| | - Peter Kraft
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
- Program in Genetic Epidemiology and Statistical Genetics, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
| | - Vessela N Kristensen
- Institute of Clinical Medicine, Faculty of Medicine, University of Oslo, 0450, Oslo, Norway
- Department of Medical Genetics, Oslo University Hospital and University of Oslo, 0379, Oslo, Norway
| | - Katerina Kubelka-Sabit
- Department of Histopathology and Cytology, Clinical Hospital Acibadem Sistina, Skopje, 1000, Republic of North Macedonia
| | - Allison W Kurian
- Department of Epidemiology and Population Health, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Department of Medicine, Division of Oncology, Stanford Cancer Institute, Stanford University School of Medicine, Stanford, CA, 94304, USA
| | - Ava Kwong
- Hong Kong Hereditary Breast Cancer Family Registry, Hong Kong, China
- Department of Surgery, The University of Hong Kong, Hong Kong, China
- Department of Surgery and Cancer Genetics Center, Hong Kong Sanatorium and Hospital, Hong Kong, China
| | - James V Lacey
- Department of Computational and Quantitative Medicine, City of Hope, Duarte, CA, 91010, USA
- City of Hope Comprehensive Cancer Center, City of Hope, Duarte, CA, 91010, USA
| | - Diether Lambrechts
- VIB Center for Cancer Biology, 3001, Louvain, Belgium
- Laboratory for Translational Genetics, Department of Human Genetics, University of Leuven, 3000, Louvain, Belgium
| | - Loic Le Marchand
- Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI, 96813, USA
| | - Jingmei Li
- Human Genetics Division, Genome Institute of Singapore, Singapore, 138672, Singapore
| | - Martha Linet
- Radiation Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD, 20892, USA
| | - Wing-Yee Lo
- Dr. Margarete Fischer-Bosch-Institute of Clinical Pharmacology, 70376, Stuttgart, Germany
- University of Tübingen, 72074, Tübingen, Germany
| | - Jirong Long
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN, 37232, USA
| | - Artitaya Lophatananon
- Division of Population Health, Health Services Research and Primary Care, School of Health Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, M13 9PL, UK
| | - Arto Mannermaa
- Translational Cancer Research Area, University of Eastern Finland, 70210, Kuopio, Finland
- Institute of Clinical Medicine, Pathology and Forensic Medicine, University of Eastern Finland, 70210, Kuopio, Finland
- Biobank of Eastern Finland, Kuopio University Hospital, Kuopio, Finland
| | - Mehdi Manoochehri
- Molecular Genetics of Breast Cancer, German Cancer Research Center (DKFZ), 69120, Heidelberg, Germany
| | - Sara Margolin
- Department of Oncology, 118 83, Södersjukhuset, Stockholm, Sweden
- Department of Clinical Science and Education, Södersjukhuset, Karolinska Institutet, 118 83, Stockholm, Sweden
| | - Keitaro Matsuo
- Division of Cancer Epidemiology, Nagoya University Graduate School of Medicine, Nagoya, 466-8550, Japan
- Division of Cancer Epidemiology and Prevention, Aichi Cancer Center Research Institute, Nagoya, 464-8681, Japan
| | - Dimitrios Mavroudis
- Department of Medical Oncology, University Hospital of Heraklion, 711 10, Heraklion, Greece
| | - Usha Menon
- Institute of Clinical Trials and Methodology, University College London, London, WC1V 6LJ, UK
| | - Kenneth Muir
- Division of Population Health, Health Services Research and Primary Care, School of Health Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, M13 9PL, UK
| | - Rachel A Murphy
- School of Population and Public Health, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Cancer Control Research, BC Cancer, Vancouver, BC, V5Z 1L3, Canada
| | - Heli Nevanlinna
- Department of Obstetrics and Gynecology, Helsinki University Hospital, University of Helsinki, 00290, Helsinki, Finland
| | - William G Newman
- Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, M13 9WL, UK
- North West Genomics Laboratory Hub, Manchester Centre for Genomic Medicine, St Mary's Hospital, Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, Manchester, M13 9WL, UK
| | - Dieter Niederacher
- Department of Gynecology and Obstetrics, University Hospital Düsseldorf, Heinrich-Heine University Düsseldorf, 40225, Düsseldorf, Germany
| | - Katie M O'Brien
- Epidemiology Branch, National Institute of Environmental Health Sciences, NIH, Research Triangle Park, NC, 27709, USA
| | - Nadia Obi
- Institute for Medical Biometry and Epidemiology, University Medical Center Hamburg-Eppendorf, 20246, Hamburg, Germany
| | - Kenneth Offit
- Clinical Genetics Research Lab, Department of Cancer Biology and Genetics, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
- Clinical Genetics Service, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
| | - Andrew F Olshan
- Department of Epidemiology, Gillings School of Global Public Health and UNC Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Håkan Olsson
- Department of Cancer Epidemiology, Clinical Sciences, Lund University, 222 42, Lund, Sweden
| | - Sue K Park
- Cancer Research Institute, Seoul National University, Seoul, 03080, Korea
- Department of Preventive Medicine, Seoul National University College of Medicine, Seoul, 03080, Korea
- Integrated Major in Innovative Medical Science, Seoul National University College of Medicine, Seoul, 03080, South Korea
| | - Alpa V Patel
- Department of Population Science, American Cancer Society, Atlanta, GA, 30303, USA
| | - Achal Patel
- Department of Epidemiology, Gillings School of Global Public Health and UNC Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Charles M Perou
- Department of Genetics, Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Julian Peto
- Department of Non-Communicable Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, WC1E 7HT, UK
| | - Paul D P Pharoah
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, CB1 8RN, UK
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge, CB1 8RN, UK
| | - Dijana Plaseska-Karanfilska
- Research Centre for Genetic Engineering and Biotechnology "Georgi D. Efremov", MASA, Skopje, 1000, Republic of North Macedonia
| | - Nadege Presneau
- School of Life Sciences, University of Westminster, London, W1W 6UW, UK
| | - Brigitte Rack
- Department of Gynaecology and Obstetrics, University Hospital Ulm, 89075, Ulm, Germany
| | - Paolo Radice
- Unit of Molecular Bases of Genetic Risk and Genetic Testing, Department of Research, Fondazione IRCCS Istituto Nazionale Dei Tumori (INT), 20133, Milan, Italy
| | - Dhanya Ramachandran
- Gynaecology Research Unit, Hannover Medical School, 30625, Hannover, Germany
| | - Muhammad U Rashid
- Molecular Genetics of Breast Cancer, German Cancer Research Center (DKFZ), 69120, Heidelberg, Germany
- Department of Basic Sciences, Shaukat Khanum Memorial Cancer Hospital and Research Centre (SKMCH & RC), Lahore, 54000, Pakistan
| | - Gad Rennert
- Clalit National Cancer Control Center, Carmel Medical Center and Technion Faculty of Medicine, 35254, Haifa, Israel
| | - Atocha Romero
- Medical Oncology Department, Hospital Universitario Puerta de Hierro, 28222, Madrid, Spain
| | - Kathryn J Ruddy
- Department of Oncology, Mayo Clinic, Rochester, MN, 55905, USA
| | - Matthias Ruebner
- Department of Gynecology and Obstetrics, Comprehensive Cancer Center Erlangen-EMN, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nuremberg (FAU), 91054, Erlangen, Germany
| | - Dale P Sandler
- Epidemiology Branch, National Institute of Environmental Health Sciences, NIH, Research Triangle Park, NC, 27709, USA
| | - Elinor J Sawyer
- School of Cancer and Pharmaceutical Sciences, Comprehensive Cancer Centre, Guy's Campus, King's College London, London, SE1 9RT, UK
| | - Marjanka K Schmidt
- Division of Molecular Pathology, The Netherlands Cancer Institute - Antoni Van Leeuwenhoek Hospital, Amsterdam, 1066 CX, The Netherlands
- Division of Psychosocial Research and Epidemiology, The Netherlands Cancer Institute - Antoni Van Leeuwenhoek Hospital, Amsterdam, 1066 CX, The Netherlands
| | - Rita K Schmutzler
- Center for Familial Breast and Ovarian Cancer, Faculty of Medicine, University Hospital Cologne, University of Cologne, 50937, Cologne, Germany
- Center for Integrated Oncology (CIO), Faculty of Medicine, University Hospital Cologne, University of Cologne, 50937, Cologne, Germany
- Center for Molecular Medicine Cologne (CMMC), Faculty of Medicine and University Hospital Cologne, University of Cologne, 50931, Cologne, Germany
| | - Michael O Schneider
- Department of Gynecology and Obstetrics, Comprehensive Cancer Center Erlangen-EMN, University Hospital Erlangen, Friedrich-Alexander University Erlangen-Nuremberg (FAU), 91054, Erlangen, Germany
| | - Christopher Scott
- Department of Health Sciences Research, Mayo Clinic, Rochester, MN, 55905, USA
| | - Mitul Shah
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge, CB1 8RN, UK
| | - Priyanka Sharma
- Department of Internal Medicine, Division of Medical Oncology, University of Kansas Medical Center, Westwood, KS, 66205, USA
| | - Chen-Yang Shen
- Institute of Biomedical Sciences, Academia Sinica, Taipei, 115, Taiwan
- School of Public Health, China Medical University, Taichung, Taiwan
| | - Xiao-Ou Shu
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN, 37232, USA
| | - Jacques Simard
- Genomics Center, Centre Hospitalier Universitaire de Québec - Université Laval Research Center, Québec City, QC, G1V 4G2, Canada
| | - Harald Surowy
- Molecular Epidemiology Group, German Cancer Research Center (DKFZ), C080, 69120, Heidelberg, Germany
- Molecular Biology of Breast Cancer, University Womens Clinic Heidelberg, University of Heidelberg, 69120, Heidelberg, Germany
| | - Rulla M Tamimi
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
- Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, 10065, USA
| | - William J Tapper
- Faculty of Medicine, University of Southampton, Southampton, SO17 1BJ, UK
| | - Jack A Taylor
- Epidemiology Branch, National Institute of Environmental Health Sciences, NIH, Research Triangle Park, NC, 27709, USA
- Epigenetic and Stem Cell Biology Laboratory, National Institute of Environmental Health Sciences, NIH, Research Triangle Park, NC, 27709, USA
| | - Soo Hwang Teo
- Breast Cancer Research Programme, Cancer Research Malaysia, Subang Jaya, 47500, Selangor, Malaysia
- Department of Surgery, Faculty of Medicine, University of Malaya, 50603, Kuala Lumpur, Malaysia
| | - Lauren R Teras
- Department of Population Science, American Cancer Society, Atlanta, GA, 30303, USA
| | - Amanda E Toland
- Department of Cancer Biology and Genetics, The Ohio State University, Columbus, OH, 43210, USA
| | - Rob A E M Tollenaar
- Department of Surgery, Leiden University Medical Center, Leiden, 2333 ZA, The Netherlands
| | - Diana Torres
- Molecular Genetics of Breast Cancer, German Cancer Research Center (DKFZ), 69120, Heidelberg, Germany
- Institute of Human Genetics, Pontificia Universidad Javeriana, 110231, Bogota, Colombia
| | - Gabriela Torres-Mejía
- Center for Population Health Research, National Institute of Public Health, 62100, Cuernavaca, Morelos, Mexico
| | - Melissa A Troester
- Department of Epidemiology, Gillings School of Global Public Health and UNC Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Thérèse Truong
- Center for Research in Epidemiology and Population Health (CESP), Team Exposome and Heredity, INSERM, University Paris-Saclay, 94805, Villejuif, France
| | - Celine M Vachon
- Department of Quantitative Health Sciences, Division of Epidemiology, Mayo Clinic, Rochester, MN, 55905, USA
| | - Joseph Vijai
- Clinical Genetics Research Lab, Department of Cancer Biology and Genetics, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
- Clinical Genetics Service, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
| | - Clarice R Weinberg
- Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, NIH, Research Triangle Park, NC, 27709, USA
| | - Camilla Wendt
- Department of Clinical Science and Education, Södersjukhuset, Karolinska Institutet, 118 83, Stockholm, Sweden
| | - Robert Winqvist
- Laboratory of Cancer Genetics and Tumor Biology, Cancer and Translational Medicine Research Unit, Biocenter Oulu, University of Oulu, 90570, Oulu, Finland
- Laboratory of Cancer Genetics and Tumor Biology, Northern Finland Laboratory Centre Oulu, 90570, Oulu, Finland
| | - Alicja Wolk
- Institute of Environmental Medicine, Karolinska Institutet, 171 77, Stockholm, Sweden
- Department of Surgical Sciences, Uppsala University, 751 05, Uppsala, Sweden
| | - Anna H Wu
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, 90033, USA
| | - Taiki Yamaji
- Division of Epidemiology, Center for Public Health Sciences, National Cancer Center Institute for Cancer Control, Tokyo, 104-0045, Japan
| | - Xiaohong R Yang
- Division of Cancer Epidemiology and Genetics, Department of Health and Human Services, National Cancer Institute, National Institutes of Health, Bethesda, MD, 20850, USA
| | - Jyh-Cherng Yu
- Department of Surgery, Tri-Service General Hospital, National Defense Medical Center, Taipei, 114, Taiwan
| | - Wei Zheng
- Division of Epidemiology, Department of Medicine, Vanderbilt Epidemiology Center, Vanderbilt-Ingram Cancer Center, Vanderbilt University School of Medicine, Nashville, TN, 37232, USA
| | - Argyrios Ziogas
- Department of Medicine, Genetic Epidemiology Research Institute, University of California Irvine, Irvine, CA, 92617, USA
| | - Elad Ziv
- Department of Medicine, Diller Family Comprehensive Cancer Center, Institute for Human Genetics, UCSF Helen, University of California San Francisco, San Francisco, CA, 94115, USA
| | - Alison M Dunning
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge, CB1 8RN, UK
| | - Douglas F Easton
- Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, CB1 8RN, UK
- Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge, CB1 8RN, UK
| | - Harry Hemingway
- Institute of Health Informatics, University College London, London, UK
- Health Data Research UK, University College London, London, UK
- University College London Hospitals Biomedical Research Centre (UCLH BRC), London, UK
- The Alan Turing Institute, London, UK
| | - Ute Hamann
- Molecular Genetics of Breast Cancer, German Cancer Research Center (DKFZ), 69120, Heidelberg, Germany
| | - Karoline B Kuchenbaecker
- Division of Psychiatry, University College London, London, UK.
- UCL Genetics Institute, University College London, London, UK.
| |
|
6
|
Wang Y, Chen H, Peloso GM, Meigs JB, Beiser AS, Seshadri S, DeStefano AL, Dupuis J. Family history aggregation unit-based tests to detect rare genetic variant associations with application to the Framingham Heart Study. Am J Hum Genet 2022; 109:738-749. [PMID: 35316615 PMCID: PMC9069079 DOI: 10.1016/j.ajhg.2022.03.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2021] [Accepted: 02/28/2022] [Indexed: 11/15/2022] Open
Abstract
A challenge in standard genetic studies is maintaining good power to detect associations, especially for low-prevalence diseases and rare variants. The traditional methods are most powerful when evaluating the association between variants in balanced study designs. Without accounting for family correlation and unbalanced case-control ratio, these analyses could result in inflated type I error. One cost-effective solution to increase statistical power is exploitation of available family history (FH) that contains valuable information about disease heritability. Here, we develop methods to address the aforementioned type I error issues while providing optimal power to analyze aggregates of rare variants by incorporating additional information from FH. With enhanced power in these methods exploiting FH and accounting for relatedness and unbalanced designs, we successfully detect genes with suggestive associations with Alzheimer disease, dementia, and type 2 diabetes by using the exome chip data from the Framingham Heart Study.
Affiliation(s)
- Yanbing Wang
- Department of Biostatistics, School of Public Health, Boston University, Boston, MA 02215, USA.
| | - Han Chen
- Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA; Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Gina M Peloso
- Department of Biostatistics, School of Public Health, Boston University, Boston, MA 02215, USA
| | - James B Meigs
- Division of General Internal Medicine, Massachusetts General Hospital, Boston, MA 02214, USA; Harvard Medical School, Boston, MA 02215, USA; Program in Medical and Population Genetics, Broad Institute, Cambridge, MA 02115, USA
| | - Alexa S Beiser
- Department of Biostatistics, School of Public Health, Boston University, Boston, MA 02215, USA; Framingham Heart Study, Framingham, MA 01701, USA; Department of Neurology, Boston University School of Medicine, Boston, MA 02215, USA
| | - Sudha Seshadri
- Framingham Heart Study, Framingham, MA 01701, USA; Department of Neurology, Boston University School of Medicine, Boston, MA 02215, USA; Glenn Biggs Institute for Alzheimer Disease and Neurodegenerative Diseases, University of Texas Health San Antonio, San Antonio, TX 78229, USA
| | - Anita L DeStefano
- Department of Biostatistics, School of Public Health, Boston University, Boston, MA 02215, USA
| | - Josée Dupuis
- Department of Biostatistics, School of Public Health, Boston University, Boston, MA 02215, USA
| |
|
7
|
Insights into the genetic architecture of haematological traits from deep phenotyping and whole-genome sequencing for two Mediterranean isolated populations. Sci Rep 2022; 12:1131. [PMID: 35064169 PMCID: PMC8782863 DOI: 10.1038/s41598-021-04436-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2021] [Accepted: 12/06/2021] [Indexed: 11/08/2022] Open
Abstract
Haematological traits are linked to cardiovascular, metabolic, infectious and immune disorders, as well as cancer. Here, we examine the role of genetic variation in shaping haematological traits in two isolated Mediterranean populations. Using whole-genome sequencing data at 22× depth for 1457 individuals from Crete (MANOLIS) and 1617 from the Pomak villages in Greece, we carry out a genome-wide association scan for haematological traits using linear mixed models. We discover novel associations (p < 5 × 10⁻⁹) of five rare non-coding variants with alleles conferring effects of 1.44–2.63 units of standard deviation on red and white blood cell count, platelet and red cell distribution width. Moreover, 10.0% of individuals in the Pomak population and 6.8% in MANOLIS carry a pathogenic mutation in the Haemoglobin Subunit Beta (HBB) gene. The mutational spectrum is highly diverse (10 different mutations). The most frequent mutation in MANOLIS is the common Mediterranean variant IVS-I-110 (G>A) (rs35004220). In the Pomak population, c.364C>A (“HbO-Arab”, rs33946267) is most frequent (4.4% allele frequency). We demonstrate effects on haematological and other traits, including bilirubin, cholesterol, and, in MANOLIS, height and gestation age. We find less severe effects on red blood cell traits for HbS, HbO, and IVS-I-6 (T>C) compared to other β+ mutations. Overall, we uncover allelic diversity of HBB in Greek isolated populations and find an important role for additional rare variants outside of HBB.
|
8
|
Novel directions in data pre-processing and genome-wide association study (GWAS) methodologies to overcome ongoing challenges. INFORMATICS IN MEDICINE UNLOCKED 2021. [DOI: 10.1016/j.imu.2021.100586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open
|
9
|
Whole-genome sequencing analysis of the cardiometabolic proteome. Nat Commun 2020; 11:6336. [PMID: 33303764 PMCID: PMC7729872 DOI: 10.1038/s41467-020-20079-2] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Received: 04/07/2020] [Accepted: 10/26/2020] [Indexed: 12/14/2022] Open
Abstract
The human proteome is a crucial intermediate between complex diseases and their genetic and environmental components, and an important source of drug development targets and biomarkers. Here, we comprehensively assess the genetic architecture of 257 circulating protein biomarkers of cardiometabolic relevance through high-depth (22.5×) whole-genome sequencing (WGS) in 1328 individuals. We discover 131 independent sequence variant associations (P < 7.45 × 10⁻¹¹) across the allele frequency spectrum, all of which replicate in an independent cohort (n = 1605, 18.4× WGS). We identify for the first time replicating evidence for rare-variant cis-acting protein quantitative trait loci for five genes, involving both coding and noncoding variation. We construct and validate polygenic scores that explain up to 45% of protein level variation. We find causal links between protein levels and disease risk, identifying high-value biomarkers and drug development targets. The human proteome represents a crucial link between complex disease and genetic/environmental factors. Here, the authors investigate 257 cardiometabolic-relevant protein biomarkers in whole genome sequencing data from 1328 individuals, revealing the genetic architecture underlying biomarker variation.
|
10
|
Gao C, Sha Q, Zhang S, Zhang K. MF-TOWmuT: Testing an optimally weighted combination of common and rare variants with multiple traits using family data. Genet Epidemiol 2020; 45:64-81. [PMID: 33047835 DOI: 10.1002/gepi.22355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2020] [Revised: 08/03/2020] [Accepted: 08/18/2020] [Indexed: 11/11/2022]
Abstract
With rapid advancements of sequencing technologies and accumulations of electronic health records, a large number of genetic variants and multiple correlated human complex traits have become available in many genetic association studies. Thus, it becomes necessary and important to develop new methods that can jointly analyze the association between multiple genetic variants and multiple traits. Compared with methods that only use a single marker or trait, the joint analysis of multiple genetic variants and multiple traits is more powerful since such an analysis can fully incorporate the correlation structure of genetic variants and/or traits and their mutual dependence patterns. However, most existing methods that simultaneously analyze multiple genetic variants and multiple traits are only applicable to unrelated samples. We develop a new method called MF-TOWmuT to detect association of multiple phenotypes and multiple genetic variants in a genomic region with family samples. MF-TOWmuT is based on an optimally weighted combination of variants. Our method can be applied to both rare and common variants and both qualitative and quantitative traits. Our simulation results show that (1) the type I error of MF-TOWmuT is preserved; (2) MF-TOWmuT outperforms two existing methods, the Multiple Family-based Quasi-Likelihood Score Test and the Multivariate Family-based Rare Variant Association Test, in terms of power. We also illustrate the usefulness of MF-TOWmuT by analyzing genotypic and phenotypic data from the Genetics of Kidneys in Diabetes study. An R program is available at https://github.com/gaochengPRC/MF-TOWmuT.
Affiliation(s)
- Cheng Gao
- Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan, USA
| | - Qiuying Sha
- Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan, USA
| | - Shuanglin Zhang
- Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan, USA
| | - Kui Zhang
- Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan, USA
| |
|
11
|
Wang Y, Bandyopadhyay D, Shaffer JR, Wu X. Gene-Based Association Mapping for Dental Caries in The GENEVA Consortium. JOURNAL OF DENTISTRY AND DENTAL MEDICINE 2020; 3:156. [PMID: 34622142 PMCID: PMC8494074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
OBJECTIVE Dental caries is a multifactorial disease with high prevalence in both children and adults. Recent genome-wide association studies (GWASs) have revealed that genetic factors play an important role in caries incidence. However, existing methods are not sufficient to identify caries-associated genes, due to the complex correlation structure of caries GWAS data and the lack of appropriate summarization at the gene level. This paper attempts to address that by analyzing data from the Gene, Environment Association Studies (GENEVA) consortium. METHODS We investigated gene-based genetic associations for dental caries based on genome-wide data derived from the GENEVA database, with adjustment for covariates, linkage disequilibrium among single-nucleotide polymorphisms, and family relations in sampled individuals. RESULTS Several suggestive genes were identified, some of which have previously been found to have potential biological functions in cariogenesis. CONCLUSIONS By comparing the gene sets identified from gene-based and SNP-based association testing methods, we found a non-negligible overlap, which indicates that our gene-based analysis can provide a substantial supplement to the traditional GWAS analysis.
Affiliation(s)
- Yueyao Wang
- Department of Statistics, Virginia Polytechnic Institute & State University, Blacksburg, VA
| | - John R. Shaffer
- Department of Human Genetics, University of Pittsburgh, Pittsburgh, PA
| | - Xiaowei Wu
- Department of Statistics, Virginia Polytechnic Institute & State University, Blacksburg, VA
| |
|
12
|
Peterson RE, Kuchenbaecker K, Walters RK, Chen CY, Popejoy AB, Periyasamy S, Lam M, Iyegbe C, Strawbridge RJ, Brick L, Carey CE, Martin AR, Meyers JL, Su J, Chen J, Edwards AC, Kalungi A, Koen N, Majara L, Schwarz E, Smoller JW, Stahl EA, Sullivan PF, Vassos E, Mowry B, Prieto ML, Cuellar-Barboza A, Bigdeli TB, Edenberg HJ, Huang H, Duncan LE. Genome-wide Association Studies in Ancestrally Diverse Populations: Opportunities, Methods, Pitfalls, and Recommendations. Cell 2019; 179:589-603. [PMID: 31607513 PMCID: PMC6939869 DOI: 10.1016/j.cell.2019.08.051] [Citation(s) in RCA: 371] [Impact Index Per Article: 74.2] [Received: 03/26/2019] [Revised: 07/10/2019] [Accepted: 08/26/2019] [Indexed: 12/19/2022]
Abstract
Genome-wide association studies (GWASs) have focused primarily on populations of European descent, but it is essential that diverse populations become better represented. Increasing diversity among study participants will advance our understanding of genetic architecture in all populations and ensure that genetic research is broadly applicable. To facilitate and promote research in multi-ancestry and admixed cohorts, we outline key methodological considerations and highlight opportunities, challenges, solutions, and areas in need of development. Despite the perception that analyzing genetic data from diverse populations is difficult, it is scientifically and ethically imperative, and there is an expanding analytical toolbox to do it well.
Affiliation(s)
- Roseann E Peterson
- Virginia Institute for Psychiatric and Behavioral Genetics, Department of Psychiatry, Virginia Commonwealth University, Richmond, VA 23298, USA.
| | - Karoline Kuchenbaecker
- Division of Psychiatry and UCL Genetics Institute, University College London, London W1T 7NF, UK
| | - Raymond K Walters
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA; Department of Medicine, Harvard Medical School, Boston, MA 02115, USA
| | - Chia-Yen Chen
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA; Department of Medicine, Harvard Medical School, Boston, MA 02115, USA; Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Alice B Popejoy
- Department of Biomedical Data Science, School of Medicine, Stanford University, Stanford, CA 94305, USA
| | - Sathish Periyasamy
- Queensland Brain Institute and Queensland Centre for Mental Health Research, The University of Queensland, Brisbane, QLD 4072, Australia
| | - Max Lam
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA; Department of Medicine, Harvard Medical School, Boston, MA 02115, USA
| | - Conrad Iyegbe
- Department of Psychosis Studies, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London SE5 8AF, UK
| | - Rona J Strawbridge
- Institute of Health and Wellbeing, University of Glasgow, Glasgow G12 8RZ, UK; Department of Medicine Solna, Karolinska Institute, Stockholm, SE 17176, Sweden
| | - Leslie Brick
- Department of Psychiatry and Human Behavior, Warren Alpert Medical School, Brown University, Providence, RI 02906, USA
| | - Caitlin E Carey
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Alicia R Martin
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA; Department of Medicine, Harvard Medical School, Boston, MA 02115, USA
| | - Jacquelyn L Meyers
- Department of Psychiatry, State University of New York Downstate Medical Center, Brooklyn, NY 11203, USA
| | - Jinni Su
- Department of Psychology, Arizona State University, Tempe, AZ 85281, USA
| | - Junfang Chen
- Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, 68159 Mannheim, Germany
| | - Alexis C Edwards
- Virginia Institute for Psychiatric and Behavioral Genetics, Department of Psychiatry, Virginia Commonwealth University, Richmond, VA 23298, USA
| | - Allan Kalungi
- Mental Health Section of MRC/UVRI and LSHTM Uganda Research Unit, P.O. Box 49, Entebbe, Uganda; Department of Psychiatry, Faculty of Medicine & Health Sciences, University of Stellenbosch, Cape Town, South Africa; Department of Medical Microbiology, College of Health Sciences, Makerere University, Kampala, Uganda; Global Initiative for Neuropsychiatric Genetics Education in Research, Harvard T.H. Chan School of Public Health and Broad Institute, Boston, MA 02115, USA
| | - Nastassja Koen
- Department of Psychiatry, Faculty of Medicine & Health Sciences, University of Stellenbosch, Cape Town, South Africa; Department of Medical Microbiology, College of Health Sciences, Makerere University, Kampala, Uganda; Global Initiative for Neuropsychiatric Genetics Education in Research, Harvard T.H. Chan School of Public Health and Broad Institute, Boston, MA 02115, USA
| | - Lerato Majara
- Global Initiative for Neuropsychiatric Genetics Education in Research, Harvard T.H. Chan School of Public Health and Broad Institute, Boston, MA 02115, USA; MRC Human Genetics Research Unit, Division of Human Genetics, Department of Pathology, Institute of Infectious Diseases and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Cape Town, 7925, South Africa
| | - Emanuel Schwarz
- Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, 68159 Mannheim, Germany
| | - Jordan W Smoller
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Eli A Stahl
- Division of Psychiatric Genomics, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Patrick F Sullivan
- Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, SE 17176, Sweden; Genetics and Psychiatry, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Evangelos Vassos
- Social, Genetic & Developmental Psychiatry Centre, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, SE5 8AF, UK
| | - Bryan Mowry
- Queensland Brain Institute and Queensland Centre for Mental Health Research, The University of Queensland, Brisbane, QLD 4072, Australia
| | - Miguel L Prieto
- Department of Psychiatry, Faculty of Medicine, Universidad de los Andes, Santiago 7620001, Chile; Mental Health Service, Clínica Universidad de los Andes, Santiago 7620001, Chile; Department of Psychiatry and Psychology, Mayo Clinic, Rochester, MN, USA
| | - Alfredo Cuellar-Barboza
- Department of Psychiatry, University Hospital and School of Medicine, Universidad Autonoma de Nuevo Leon, Monterrey, Mexico; Department of Psychiatry and Psychology, Mayo Clinic, Rochester, MN, USA
| | - Tim B Bigdeli
- Department of Psychiatry, State University of New York Downstate Medical Center, Brooklyn, NY 11203, USA
| | - Howard J Edenberg
- Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, IN 46202, USA
| | - Hailiang Huang
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA; Department of Medicine, Harvard Medical School, Boston, MA 02115, USA
| | - Laramie E Duncan
- Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA 94305, USA
| |
|
13
|
Svishcheva GR. A generalized model for combining dependent SNP-level summary statistics and its extensions to statistics of other levels. Sci Rep 2019; 9:5461. [PMID: 30940856 PMCID: PMC6445108 DOI: 10.1038/s41598-019-41827-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/16/2018] [Accepted: 03/06/2019] [Indexed: 11/12/2022] Open
Abstract
Here I propose a fundamentally new flexible model to reveal the association between a trait and a set of genetic variants in a genomic region/gene. This model was developed for the situation when original individual-level phenotype and genotype data are not available, but the researcher possesses the results of statistical analyses conducted on these data (namely, SNP-level summary Z score statistics and SNP-by-SNP correlations). The new model was analytically derived from the classical multiple linear regression model applied for the region-based association analysis of individual-level phenotype and genotype data by using the linear compression of data, where the SNP-by-SNP correlations are among the explanatory variables, and the summary Z score statistics are categorized as the response variables. I analytically show that the regional association analysis methods developed within the framework of the classical multiple linear regression model with additive effects of genetic variants can be reformulated in terms of the new model without the loss of information. The results obtained from the regional association analysis utilizing the classical model and those derived using the proposed model are identical when SNP-by-SNP correlations and SNP-level statistics are estimated from the same genetic data.
Affiliation(s)
- Gulnara R Svishcheva
- Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, 630090, Russia; Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, 119991, Russia
| |
|
14
|
Chen H, Huffman JE, Brody JA, Wang C, Lee S, Li Z, Gogarten SM, Sofer T, Bielak LF, Bis JC, Blangero J, Bowler RP, Cade BE, Cho MH, Correa A, Curran JE, de Vries PS, Glahn DC, Guo X, Johnson AD, Kardia S, Kooperberg C, Lewis JP, Liu X, Mathias RA, Mitchell BD, O’Connell JR, Peyser PA, Post WS, Reiner AP, Rich SS, Rotter JI, Silverman EK, Smith JA, Vasan RS, Wilson JG, Yanek LR, Redline S, Smith NL, Boerwinkle E, Borecki IB, Cupples LA, Laurie CC, Morrison AC, Rice KM, Lin X. Efficient Variant Set Mixed Model Association Tests for Continuous and Binary Traits in Large-Scale Whole-Genome Sequencing Studies. Am J Hum Genet 2019; 104:260-274. [PMID: 30639324 DOI: 10.1016/j.ajhg.2018.12.012] [Citation(s) in RCA: 77] [Impact Index Per Article: 15.4] [Received: 08/18/2018] [Accepted: 12/17/2018] [Indexed: 12/12/2022] Open
Abstract
With advances in whole-genome sequencing (WGS) technology, more advanced statistical methods for testing genetic association with rare variants are being developed. Methods in which variants are grouped for analysis are also known as variant-set, gene-based, and aggregate unit tests. The burden test and sequence kernel association test (SKAT) are two widely used variant-set tests, which were originally developed for samples of unrelated individuals and later have been extended to family data with known pedigree structures. However, computationally efficient and powerful variant-set tests are needed to make analyses tractable in large-scale WGS studies with complex study samples. In this paper, we propose the variant-set mixed model association tests (SMMAT) for continuous and binary traits using the generalized linear mixed model framework. These tests can be applied to large-scale WGS studies involving samples with population structure and relatedness, such as in the National Heart, Lung, and Blood Institute's Trans-Omics for Precision Medicine (TOPMed) program. SMMATs share the same null model for different variant sets, and a virtue of this null model, which includes covariates only, is that it needs to be fit only once for all tests in each genome-wide analysis. Simulation studies show that all the proposed SMMATs correctly control type I error rates for both continuous and binary traits in the presence of population structure and relatedness. We also illustrate our tests in a real data example of analysis of plasma fibrinogen levels in the TOPMed program (n = 23,763), using the Analysis Commons, a cloud-based computing platform.
Affiliation(s)
| | - Kenneth M Rice
- Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
| | - Xihong Lin
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Department of Statistics, Harvard University, Cambridge, MA 02138, USA.
| |
|
15
|
Qi W, Allen AS, Li YJ. Family-based association tests for rare variants with censored traits. PLoS One 2019; 14:e0210870. [PMID: 30682063 PMCID: PMC6347269 DOI: 10.1371/journal.pone.0210870] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/30/2018] [Accepted: 12/27/2018] [Indexed: 11/30/2022] Open
Abstract
We propose a set of family-based burden and kernel tests for censored traits (FamBAC and FamKAC). Here, censored traits refer to time-to-event outcomes, for instance, age-at-onset of a disease. To model censored traits in family-based designs, we used the frailty model, which incorporated not only fixed genetic effects of rare variants in a region of interest but also random polygenic effects shared within families. We first partitioned genotype scores of rare variants into orthogonal between- and within-family components, and then derived their corresponding efficient score statistics from the frailty model. Finally, FamBAC and FamKAC were constructed by aggregating the weighted efficient scores of the within-family components across rare variants and subjects. FamBAC collapsed rare variants within subject first to form a burden test that followed a chi-squared distribution, whereas FamKAC was a variance component test following a mixture of chi-squared distributions. For FamKAC, p-values can be computed by permutation tests or, for computational efficiency, by approximation methods. Through simulation studies, we showed that type I error was correctly controlled by FamBAC for various variant weighting schemes (0.0371 to 0.0527). However, FamKAC type I error rates based on approximation methods were deflated (max 0.0376) but improved by permutation tests. Our simulations also demonstrated that the burden test FamBAC had higher power than the kernel test FamKAC when a high proportion (e.g. ≥ 80%) of causal variants had effects in the same direction. In contrast, when the effects of causal variants on the censored trait were in mixed directions, FamKAC outperformed FamBAC and had comparable or higher power than an existing method, RVFam. Our proposed framework has the flexibility to accommodate general nuclear families, and can be used to analyze sequence data for censored traits such as age-at-onset of a complex disease of interest.
Affiliation(s)
- Wenjing Qi
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, United States of America
- Duke Molecular Physiology Institute, Duke University, Durham, NC, United States of America
| | - Andrew S. Allen
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, United States of America
- Center for Statistical Genetics and Genomics, Duke University, Durham, NC, United States of America
| | - Yi-Ju Li
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, United States of America
- Duke Molecular Physiology Institute, Duke University, Durham, NC, United States of America
| |
|
16
|
Rhoades R, Jackson F, Teng S. Discovery of rare variants implicated in schizophrenia using next-generation sequencing. JOURNAL OF TRANSLATIONAL GENETICS AND GENOMICS 2019; 3:1-20. [PMID: 33981965 PMCID: PMC8112455 DOI: 10.20517/jtgg.2018.26] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 04/12/2023]
Abstract
Schizophrenia is a highly heritable psychiatric disorder that affects 1% of the population. Genome-wide association studies have identified common variants in candidate genes associated with schizophrenia, but the genetic mechanisms of this disorder have not yet been elucidated. The discovery of rare genetic variants that contribute to schizophrenia symptoms promises to help explain the missing heritability of the disease. Next generation sequencing techniques are revolutionizing the field of psychiatric genetics. Various statistical approaches have been developed for rare variant association testing in case-control and family studies. Targeted resequencing, whole exome sequencing and whole genome sequencing combined with these computational tools are used for the discovery of rare genetic variations in schizophrenia. The findings provide useful information for characterizing the rare mutations and elucidating the genetic mechanisms by which the variants cause schizophrenia.
Affiliation(s)
- Raina Rhoades
- Department of Biology, Howard University, Washington, DC 20059, USA
| | - Fatimah Jackson
- Department of Biology, Howard University, Washington, DC 20059, USA
| | - Shaolei Teng
- Department of Biology, Howard University, Washington, DC 20059, USA
| |
|
17
|
Robust Rare-Variant Association Tests for Quantitative Traits in General Pedigrees. STATISTICS IN BIOSCIENCES 2018; 10:491-505. [DOI: 10.1007/s12561-017-9197-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
|
18
|
Zhang Q, Sahana G, Su G, Guldbrandtsen B, Lund MS, Calus MPL. Impact of rare and low-frequency sequence variants on reliability of genomic prediction in dairy cattle. Genet Sel Evol 2018; 50:62. [PMID: 30458700 PMCID: PMC6247626 DOI: 10.1186/s12711-018-0432-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 03/12/2018] [Accepted: 11/14/2018] [Indexed: 11/05/2022] Open
Abstract
Background: Availability of whole-genome sequence data for a large number of cattle and efficient imputation methodologies open a new opportunity to include rare and low-frequency variants (RLFV) in genomic prediction in dairy cattle. The objective of this study was to examine the impact of including RLFV that are within genes and selected from whole-genome sequence variants, on the reliability of genomic prediction for fertility, health and longevity in dairy cattle. Results: All genic RLFV with a minor allele frequency lower than 0.05 were extracted from imputed sequence data and subsets were created using different strategies. These subsets were subsequently combined with Illumina 50 k single nucleotide polymorphism (SNP) data and used for genomic prediction. Reliability of prediction obtained by using 50 k SNP data alone was used as reference value and absolute changes in reliabilities are referred to as changes in percentage points. Adding a component that included either all the genic or a subset of selected RLFV into the model in addition to the 50 k component changed the reliability of predictions by −2.2 to 1.1%, i.e. hardly any change in reliability of prediction was found, regardless of how the RLFV were selected. In addition to these empirical analyses, a simulation study was performed to evaluate the potential impact of adding RLFV in the model on the reliability of prediction. Three sets of causal RLFV (containing 21,468, 1348 and 235 RLFV) that were randomly selected from different numbers of genes were generated and accounted for 10% additional genetic variance of the estimated variance explained by the 50 k SNPs. When genic RLFV based on mapping results were included in the prediction model, reliabilities improved by up to 4.0% and when the causal RLFV were included they improved by up to 6.8%. Conclusions: Using selected RLFV from whole-genome sequence data had only a small impact on the empirical reliability of genomic prediction in dairy cattle. Our simulations revealed that for sequence data to bring a benefit, the key is to identify causal RLFV.
Affiliation(s)
- Qianqian Zhang
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Tjele, Denmark; Wageningen University and Research, Animal Breeding and Genomics, Wageningen, The Netherlands; Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Goutam Sahana
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Tjele, Denmark
| | - Guosheng Su
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Tjele, Denmark
| | - Bernt Guldbrandtsen
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Tjele, Denmark
| | - Mogens Sandø Lund
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Tjele, Denmark
| | - Mario P L Calus
- Wageningen University and Research, Animal Breeding and Genomics, Wageningen, The Netherlands
| |
|
19
|
Gilly A, Suveges D, Kuchenbaecker K, Pollard M, Southam L, Hatzikotoulas K, Farmaki AE, Bjornland T, Waples R, Appel EVR, Casalone E, Melloni G, Kilian B, Rayner NW, Ntalla I, Kundu K, Walter K, Danesh J, Butterworth A, Barroso I, Tsafantakis E, Dedoussis G, Moltke I, Zeggini E. Cohort-wide deep whole genome sequencing and the allelic architecture of complex traits. Nat Commun 2018; 9:4674. [PMID: 30405126 PMCID: PMC6220258 DOI: 10.1038/s41467-018-07070-8] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 03/16/2018] [Accepted: 10/08/2018] [Indexed: 11/08/2022] Open
Abstract
The role of rare variants in complex traits remains uncharted. Here, we conduct deep whole genome sequencing of 1457 individuals from an isolated population, and test for rare variant burdens across six cardiometabolic traits. We identify a role for rare regulatory variation, which has hitherto been missed. We find evidence of rare variant burdens that are independent of established common variant signals (ADIPOQ and adiponectin, P = 4.2 × 10⁻⁸; APOC3 and triglyceride levels, P = 1.5 × 10⁻²⁶), and identify replicating evidence for a burden associated with triglyceride levels in FAM189B (P = 2.2 × 10⁻⁸), indicating a role for this gene in lipid metabolism.
Affiliation(s)
- Arthur Gilly
- Department of Human Genetics, Wellcome Sanger Institute, Hinxton, CB10 1SA, United Kingdom
| | - Daniel Suveges
- Department of Human Genetics, Wellcome Sanger Institute, Hinxton, CB10 1SA, United Kingdom
| | - Karoline Kuchenbaecker
- Department of Human Genetics, Wellcome Sanger Institute, Hinxton, CB10 1SA, United Kingdom
- Division of Psychiatry, University College of London, London, W1T 7NF, United Kingdom
- UCL Genetics Institute, University College London, London, WC1E 6BT, United Kingdom
| | - Martin Pollard
- Department of Human Genetics, Wellcome Sanger Institute, Hinxton, CB10 1SA, United Kingdom
- Department of Medicine, Addenbrooke's Hospital, University of Cambridge, Hills Road, Cambridge, CB2 0QQ, United Kingdom
| | - Lorraine Southam
- Department of Human Genetics, Wellcome Sanger Institute, Hinxton, CB10 1SA, United Kingdom
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, United Kingdom
| | - Konstantinos Hatzikotoulas
- Department of Human Genetics, Wellcome Sanger Institute, Hinxton, CB10 1SA, United Kingdom
- Institute of Translational Genomics, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, D-85764, Germany
| | - Aliki-Eleni Farmaki
- Department of Health Sciences, College of Life Sciences, University of Leicester, Leicester, LE1 6TP, United Kingdom
- Department of Nutrition and Dietetics, School of Health Science and Education, Harokopio University of Athens, Athens, 176-71, Greece
| | - Thea Bjornland
- Department of Mathematical Sciences, Norwegian Institute of Science and Technology, Trondheim, 7491, Norway
| | - Ryan Waples
- The Bioinformatics Centre, Department of Biology, University of Copenhagen, Copenhagen, 2200, Denmark
| | - Emil V R Appel
- Section for Metabolic Genetics, Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, Copenhagen, 2200, Denmark
| | - Giorgio Melloni
- Department of Biomedical Informatics, Harvard Medical School, Boston, 02115, MA, USA
| | - Britt Kilian
- Department of Human Genetics, Wellcome Sanger Institute, Hinxton, CB10 1SA, United Kingdom
| | - Nigel W Rayner
- Department of Human Genetics, Wellcome Sanger Institute, Hinxton, CB10 1SA, United Kingdom
- Wellcome Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, United Kingdom
- Oxford Centre for Diabetes, Endocrinology and Metabolism, Radcliffe Department of Medicine, University of Oxford, Old Road, Headington, Oxford, OX3 7LE, United Kingdom
| | - Ioanna Ntalla
- William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, EC1M 6BQ, United Kingdom
| | - Kousik Kundu
- Department of Human Genetics, Wellcome Sanger Institute, Hinxton, CB10 1SA, United Kingdom
- Department of Haematology, Cambridge Biomedical Campus, University of Cambridge, Long Road, Cambridge, CB2 0PT, United Kingdom
| | - Klaudia Walter
- Department of Human Genetics, Wellcome Sanger Institute, Hinxton, CB10 1SA, United Kingdom
| | - John Danesh
- Department of Human Genetics, Wellcome Sanger Institute, Hinxton, CB10 1SA, United Kingdom
- The National Institute for Health Research Blood and Transplant Unit (NIHR BTRU) in Donor Health and Genomics at the University of Cambridge, Strangeways Research Laboratory, Wort's Causeway, University of Cambridge, Cambridge, CB1 8RN, United Kingdom
- MRC/BHF Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, Wort's Causeway, University of Cambridge, Strangeways Research Laboratory, Cambridge, CB1 8RN, United Kingdom
| | - Adam Butterworth
- The National Institute for Health Research Blood and Transplant Unit (NIHR BTRU) in Donor Health and Genomics at the University of Cambridge, Strangeways Research Laboratory, Wort's Causeway, University of Cambridge, Cambridge, CB1 8RN, United Kingdom
- MRC/BHF Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, Wort's Causeway, University of Cambridge, Strangeways Research Laboratory, Cambridge, CB1 8RN, United Kingdom
- British Heart Foundation Centre of Excellence, Division of Cardiovascular Medicine, Addenbrooke's Hospital, Hills Road, Cambridge, CB2 0QQ, United Kingdom
| | - Inês Barroso
- Department of Human Genetics, Wellcome Sanger Institute, Hinxton, CB10 1SA, United Kingdom
| | - George Dedoussis
- Department of Nutrition and Dietetics, School of Health Science and Education, Harokopio University of Athens, Athens, 176-71, Greece
| | - Ida Moltke
- The Bioinformatics Centre, Department of Biology, University of Copenhagen, Copenhagen, 2200, Denmark
| | - Eleftheria Zeggini
- Department of Human Genetics, Wellcome Sanger Institute, Hinxton, CB10 1SA, United Kingdom.
- Institute of Translational Genomics, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, D-85764, Germany.
| |
|
20
|
Qiao D, Ameli A, Prokopenko D, Chen H, Kho AT, Parker MM, Morrow J, Hobbs BD, Liu Y, Beaty TH, Crapo JD, Barnes KC, Nickerson DA, Bamshad M, Hersh CP, Lomas DA, Agusti A, Make BJ, Calverley PMA, Donner CF, Wouters EF, Vestbo J, Paré PD, Levy RD, Rennard SI, Tal-Singer R, Spitz MR, Sharma A, Ruczinski I, Lange C, Silverman EK, Cho MH. Whole exome sequencing analysis in severe chronic obstructive pulmonary disease. Hum Mol Genet 2018; 27:3801-3812. [PMID: 30060175 PMCID: PMC6196654 DOI: 10.1093/hmg/ddy269] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 02/13/2018] [Revised: 07/09/2018] [Accepted: 07/17/2018] [Indexed: 12/13/2022] Open
Abstract
Chronic obstructive pulmonary disease (COPD), one of the leading causes of death worldwide, is substantially influenced by genetic factors. Alpha-1 antitrypsin deficiency demonstrates that rare coding variants of large effect can influence COPD susceptibility. To identify additional rare coding variants in patients with severe COPD, we conducted whole exome sequencing analysis in 2543 subjects from two family-based studies (Boston Early-Onset COPD Study and International COPD Genetics Network) and one case-control study (COPDGene). Applying a gene-based segregation test in the family-based data, we identified significant segregation of rare loss of function variants in TBC1D10A and RFPL1 (P-value < 2 × 10⁻⁶), but were unable to find similar variants in the case-control study. In single-variant, gene-based and pathway association analyses, we were unable to find significant findings that replicated or were significant in meta-analysis. However, we found that the top results in the two datasets were in proximity to each other in the protein-protein interaction network (P-value = 0.014), suggesting enrichment of these results for similar biological processes. A network of these association results and their neighbors was significantly enriched in the transforming growth factor beta-receptor binding and cilia-related pathways. Finally, in a more detailed examination of candidate genes, we identified individuals with putative high-risk variants, including patients harboring homozygous mutations in genes associated with cutis laxa and Niemann-Pick Disease Type C. Our results likely reflect heterogeneity of genetic risk for COPD along with limitations of statistical power and functional annotation, and highlight the potential of network analysis to gain insight into genetic association studies.
Affiliation(s)
- Dandi Qiao
- Channing Division of Network Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
| | - Asher Ameli
- Channing Division of Network Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Department of Physics, Northeastern University, Boston, Massachusetts, United States of America
| | - Dmitry Prokopenko
- Channing Division of Network Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
| | - Han Chen
- Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, Texas, United States of America
- Center for Precision Health, School of Public Health and School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, United States of America
| | - Alvin T Kho
- Boston Children’s Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
| | - Margaret M Parker
- Channing Division of Network Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
| | - Jarrett Morrow
- Channing Division of Network Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
| | - Brian D Hobbs
- Channing Division of Network Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
| | - Yanhong Liu
- Dan L. Duncan Comprehensive Cancer Center, Department of Medicine, Baylor College of Medicine, Houston, Texas, United States of America
| | - Terri H Beaty
- Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland, United States of America
| | - James D Crapo
- National Jewish Health, Denver, Colorado, United States of America
| | - Kathleen C Barnes
- Division of Allergy and Clinical Immunology, Department of Medicine, Johns Hopkins University, Baltimore, Maryland, United States of America
| | - Deborah A Nickerson
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
| | - Michael Bamshad
- Division of Genetic Medicine, Department of Pediatrics, University of Washington and Seattle Children’s Hospital, Seattle, Washington, United States of America
| | - Craig P Hersh
- Channing Division of Network Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
| | - Alvar Agusti
- Respiratory Institute, Hospital Clinic, IDIBAPS, University of Barcelona, CIBERES, Barcelona, Spain
| | - Barry J Make
- National Jewish Health, Denver, Colorado, United States of America
| | - Claudio F Donner
- Mondo Medico di I.F.I.M. srl, Multidisciplinary and Rehabilitation Outpatient Clinic, Borgomanero, Novara, Italy
| | - Emiel F Wouters
- Department of Respiratory Medicine, Maastricht University Medical Center, AZ Maastricht, The Netherlands
| | - Jørgen Vestbo
- University of Manchester, Manchester, United Kingdom
| | - Peter D Paré
- Respiratory Division, Department of Medicine, University of British Columbia, Vancouver, British Columbia V6T, Canada
| | - Robert D Levy
- Respiratory Division, Department of Medicine, University of British Columbia, Vancouver, British Columbia V6T, Canada
| | - Stephen I Rennard
- University of Nebraska Medical Center, Omaha, Nebraska, United States of America
- AstraZeneca, Cambridge CB2 0RE, United Kingdom
| | - Ruth Tal-Singer
- GSK Research and Development, King of Prussia, Pennsylvania, United States of America
| | - Margaret R Spitz
- Dan L. Duncan Comprehensive Cancer Center, Department of Medicine, Baylor College of Medicine, Houston, Texas, United States of America
| | - Amitabh Sharma
- Channing Division of Network Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
| | - Ingo Ruczinski
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, United States of America
| | - Christoph Lange
- Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America
| | - Edwin K Silverman
- Channing Division of Network Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
| | - Michael H Cho
- Channing Division of Network Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts, United States of America
- Channing Division of Network Medicine, Longwood Avenue, Boston, MA, USA
|
21
|
Genetic pleiotropy between mood disorders, metabolic, and endocrine traits in a multigenerational pedigree. Transl Psychiatry 2018; 8:218. [PMID: 30315151 PMCID: PMC6185949 DOI: 10.1038/s41398-018-0226-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 01/24/2018] [Revised: 05/10/2018] [Accepted: 07/14/2018] [Indexed: 12/15/2022] Open
Abstract
Bipolar disorder (BD) is a mental disorder characterized by alternating periods of depression and mania. Individuals with BD have higher levels of early mortality than the general population, and a substantial proportion of this is due to increased risk for comorbid diseases. To identify the molecular events that underlie BD and related medical comorbidities, we generated imputed whole-genome sequence data using a population-specific reference panel for an extended multigenerational Old Order Amish pedigree (n = 394), segregating BD and related disorders. First, we investigated all putative disease-causing variants at known Mendelian disease loci present in this pedigree. Second, we performed genomic profiling using polygenic risk scores (PRS) to establish each individual's risk for several complex diseases. We identified a set of Mendelian variants that co-occur in individuals with BD more frequently than their unaffected family members, including the R3527Q mutation in APOB associated with hypercholesterolemia. Using PRS, we demonstrated that BD individuals from this pedigree were enriched for the same common risk alleles for BD as the general population (β = 0.416, p = 6 × 10⁻⁴). Furthermore, we find evidence for a common genetic etiology between BD risk and polygenic risk for clinical autoimmune thyroid disease (p = 1 × 10⁻⁴), diabetes (p = 1 × 10⁻³), and lipid traits such as triglyceride levels (p = 3 × 10⁻⁴) in the pedigree. We identify genomic regions that contribute to the differences between BD individuals and unaffected family members by calculating local genetic risk for independent LD blocks. Our findings provide evidence for the extensive genetic pleiotropy that can drive epidemiological findings of comorbidities between diseases and other complex traits.
|
22
|
Wu X, Guan T, Liu DJ, León Novelo LG, Bandyopadhyay D. Adaptive-weight burden test for associations between quantitative traits and genotype data with complex correlations. Ann Appl Stat 2018; 12:1558-1582. [PMID: 30214655 DOI: 10.1214/17-aoas1121] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 01/06/2023]
Abstract
High-throughput sequencing has often been used to screen samples from pedigrees or with population structure, producing genotype data with complex correlations rendered from both familial relation and linkage disequilibrium. With such data, it is critical to account for these genotypic correlations when assessing the contribution of variants by gene or pathway. Recognizing the limitations of existing association testing methods, we propose Adaptive-weight Burden Test (ABT), a retrospective, mixed-model test for genetic association of quantitative traits on genotype data with complex correlations. This method makes full use of genotypic correlations across both samples and variants, and adopts "data-driven" weights to improve power. We derive the ABT statistic and its explicit distribution under the null hypothesis, and demonstrate through simulation studies that it is generally more powerful than the fixed-weight burden test and family-based SKAT in various scenarios, controlling for the type I error rate. Further investigation reveals the connection of ABT with kernel tests, as well as the adaptability of its weights to the direction of genetic effects. The application of ABT is illustrated by a whole genome analysis of genes with common and rare variants associated with fasting glucose from the NHLBI "Grand Opportunity" Exome Sequencing Project.
Affiliation(s)
- Xiaowei Wu
- Department of Statistics, Virginia Tech, 250 Drillfield Drive, MC0439, Blacksburg, VA 24061, USA
| | - Ting Guan
- Department of Statistics, Virginia Tech, 250 Drillfield Drive, MC0439, Blacksburg, VA 24061, USA
| | - Dajiang J Liu
- Department of Public Health Sciences, Hershey Institute of Personalized Medicine, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
| | - Luis G León Novelo
- Department of Biostatistics, School of Public Health, University of Texas Health Science Center, Houston, TX 77030, USA
|
23
|
Wang X, Zhang Z, Morris N, Cai T, Lee S, Wang C, Yu TW, Walsh CA, Lin X. Rare variant association test in family-based sequencing studies. Brief Bioinform 2018; 18:954-961. [PMID: 27677958 DOI: 10.1093/bib/bbw083] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 05/06/2016] [Indexed: 12/20/2022] Open
Abstract
The objective of this article is to introduce valid and robust methods for the analysis of rare variants for family-based exome chips, whole-exome sequencing or whole-genome sequencing data. Family-based designs provide unique opportunities to detect genetic variants that complement studies of unrelated individuals. Currently, limited methods and software tools have been developed to assist family-based association studies with rare variants, especially for analyzing binary traits. In this article, we address this gap by extending existing burden and kernel-based gene set association tests for population data to related samples, with a particular emphasis on binary phenotypes. The proposed approach blends the strengths of kernel machine methods and generalized estimating equations. Importantly, the efficient generalized kernel score test can be applied as a mega-analysis framework to combine studies with different designs. We illustrate the application of the proposed method using data from an exome sequencing study of autism. Methods discussed in this article are implemented in an R package 'gskat', which is available on CRAN and GitHub.
|
24
|
Lee S, Choi S, Qiao D, Cho M, Silverman EK, Park T, Won S. WISARD: workbench for integrated superfast association studies for related datasets. BMC Med Genomics 2018; 11:39. [PMID: 29697360 PMCID: PMC5918457 DOI: 10.1186/s12920-018-0345-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Mendelian transmission produces phenotypic and genetic relatedness between family members, giving family-based analytical methods an important role in genetic epidemiological studies, from heritability estimations to genetic association analyses. With the advance in genotyping technologies, whole-genome sequence data can be utilized for genetic epidemiological studies, and family-based samples may become more useful for detecting de novo mutations. However, genetic analyses employing family-based samples usually suffer from the complexity of the computational/statistical algorithms, and certain types of family designs, such as incorporating data from extended families, have rarely been used. RESULTS We present a Workbench for Integrated Superfast Association studies for Related Data (WISARD) programmed in C/C++. WISARD enables fast and comprehensive analysis of SNP-chip and next-generation sequencing data on extended families, with applications from designing genetic studies to summarizing analysis results. In addition, WISARD can automatically be run in a fully multithreaded manner, and the integration of R software for visualization makes it more accessible to non-experts. CONCLUSIONS Comparison with existing toolsets showed that WISARD is computationally suitable for integrated analysis of related subjects, and demonstrated that WISARD outperforms existing toolsets. WISARD has also been successfully utilized to analyze a large-scale sequencing dataset on chronic obstructive pulmonary disease (COPD), and we identified multiple genes associated with COPD, which demonstrates its practical value.
Affiliation(s)
- Sungyoung Lee
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, South Korea
| | - Sungkyoung Choi
- Department of Pharmacology, Yonsei University College of Medicine, Seoul, South Korea
| | - Dandi Qiao
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
| | - Michael Cho
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA; Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | - Edwin K Silverman
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA; Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | - Taesung Park
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, South Korea; Department of Statistics, Seoul National University, 1 Kwanak-ro, Kwanak-gu, Seoul, 151-742, South Korea
| | - Sungho Won
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, South Korea; Department of Public Health Sciences, Graduate School of Public Health, Seoul National University, 1 Kwanak-ro, Kwanak-gu, Seoul, 151-742, South Korea; Institute of Health and Environment, Seoul National University, Seoul, South Korea
|
25
|
Abstract
Relatedness within a sample can be of ancient (population stratification) or recent (familial structure) origin, and can either be known (pedigree data) or unknown (cryptic relatedness). All of these forms of familial relatedness have the potential to confound the results of genome-wide association studies. This chapter reviews the major methods available to researchers to adjust for the biases introduced by relatedness and maximize power to detect associations. The advantages and disadvantages of different methods are presented with reference to elements of study design, population characteristics, and computational requirements.
Affiliation(s)
- Russell Thomson
- Centre for Research in Mathematics, School of Computing, Engineering and Mathematics, Western Sydney University, Parramatta, Australia.
| | - Rebekah McWhirter
- Menzies Institute for Medical Research, University of Tasmania, Hobart, TAS, Australia
|
26
|
Abstract
While genome-wide association studies have been very successful in identifying associations of common genetic variants with many different traits, the rarer frequency spectrum of the genome has not yet been comprehensively explored. Technological developments increasingly lift restrictions to access rare genetic variation. Dense reference panels enable improved genotype imputation for rarer variants in studies using DNA microarrays. Moreover, the decreasing cost of next generation sequencing makes whole exome and genome sequencing increasingly affordable for large samples. Large-scale efforts based on sequencing, such as ExAC, 100,000 Genomes, and TopMed, are likely to significantly advance this field. The main challenge in evaluating complex trait associations of rare variants is statistical power. The choice of population should be considered carefully because allele frequencies and linkage disequilibrium structure differ between populations. Genetically isolated populations can have favorable genomic characteristics for the study of rare variants. One strategy to increase power is to assess the combined effect of multiple rare variants within a region, known as aggregate testing. A range of methods have been developed for this. Model performance depends on the genetic architecture of the region of interest.
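The aggregate testing idea mentioned in this abstract can be illustrated with a minimal burden-score sketch. It is not taken from the chapter: the function names `burden_scores` and `burden_slope`, the unit weights, and the toy genotype matrix are all hypothetical, and the sketch ignores covariates, relatedness, and significance testing.

```python
def burden_scores(genotypes, weights=None):
    """Collapse each sample's minor-allele counts in a region into one weighted sum.

    genotypes: list of rows, one row per sample, one column per rare variant.
    weights:   optional per-variant weights (assumed to be 1.0 when omitted).
    """
    n_variants = len(genotypes[0])
    if weights is None:
        weights = [1.0] * n_variants
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]


def burden_slope(scores, trait):
    """Ordinary least-squares slope of a quantitative trait on the burden score."""
    n = len(scores)
    s_mean = sum(scores) / n
    t_mean = sum(trait) / n
    sxy = sum((s - s_mean) * (t - t_mean) for s, t in zip(scores, trait))
    sxx = sum((s - s_mean) ** 2 for s in scores)
    return sxy / sxx


# Toy data (hypothetical): 4 samples, 3 rare variants in one region.
G = [[0, 1, 0],
     [1, 1, 0],
     [0, 0, 0],
     [0, 0, 2]]
y = [1.0, 2.0, 0.0, 2.0]

scores = burden_scores(G)        # one aggregate allele count per sample
slope = burden_slope(scores, y)  # effect of the aggregate score on the trait
```

Collapsing variants this way trades per-variant resolution for a single, better-powered test, which is why performance depends on whether the variants in the region act in the same direction.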
Affiliation(s)
- Karoline Kuchenbaecker
- Wellcome Trust Sanger Institute, Cambridge, UK; University College London, London, UK
| | - Emil Vincent Rosenbaum Appel
- Novo Nordisk Foundation Center for Basic Metabolic Research, Section for Metabolic Genetics, Faculty of Health Sciences, University of Copenhagen, Copenhagen, Denmark
|
27
|
Amin N, Belonogova NM, Jovanova O, Brouwer RWW, van Rooij JGJ, van den Hout MCGN, Svishcheva GR, Kraaij R, Zorkoltseva IV, Kirichenko AV, Hofman A, Uitterlinden AG, van IJcken WFJ, Tiemeier H, Axenovich TI, van Duijn CM. Nonsynonymous Variation in NKPD1 Increases Depressive Symptoms in European Populations. Biol Psychiatry 2017; 81:702-707. [PMID: 27745872 DOI: 10.1016/j.biopsych.2016.08.008] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 01/06/2016] [Revised: 07/28/2016] [Accepted: 08/02/2016] [Indexed: 12/21/2022]
Abstract
BACKGROUND Despite high heritability, little success was achieved in mapping genetic determinants of depression-related traits by means of genome-wide association studies. METHODS To identify genes associated with depressive symptomology, we performed a gene-based association analysis of nonsynonymous variation captured using exome-sequencing and exome-chip genotyping in a genetically isolated population from the Netherlands (n = 1999). Finally, we reproduced our significant findings in an independent population-based cohort (n = 1604). RESULTS We detected significant association of depressive symptoms with a gene NKPD1 (p = 3.7 × 10⁻⁸). Nonsynonymous variants in the gene explained 0.9% of sex- and age-adjusted variance of depressive symptoms in the discovery study, which is translated into 3.8% of the total estimated heritability (h² = 0.24). Significant association of depressive symptoms with NKPD1 was also observed (n = 1604; p = 1.5 × 10⁻³) in the independent replication sample despite little overlap with the discovery cohort in the set of nonsynonymous genetic variants observed in the NKPD1 gene. Meta-analysis of the discovery and replication studies improved the association signal (p = 1.0 × 10⁻⁹). CONCLUSIONS Our study suggests that nonsynonymous variation in the gene NKPD1 affects depressive symptoms in the general population. NKPD1 is predicted to be involved in the de novo synthesis of sphingolipids, which have been implicated in the pathogenesis of depression.
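The relation between the two percentages quoted in this abstract can be checked with a short calculation. This is a sketch under the assumption that the share of heritability is simply the variance explained divided by the heritability estimate; the variable names are illustrative, and the inputs are the rounded figures from the abstract.

```python
# Rounded figures quoted in the abstract.
variance_explained = 0.009   # 0.9% of sex- and age-adjusted trait variance
heritability = 0.24          # total estimated heritability

# Assumed relation: fraction of heritability = variance explained / heritability.
share_of_heritability = variance_explained / heritability

print(f"{share_of_heritability:.4f}")
```

With these rounded inputs the ratio is about 0.0375, which is in line with the reported 3.8%; the small gap is what one would expect if the authors divided unrounded values.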
Affiliation(s)
- Najaf Amin
- Departments of Epidemiology, Erasmus University Medical Center, Rotterdam, the Netherlands.
| | | | - Olivera Jovanova
- Departments of Epidemiology, Erasmus University Medical Center, Rotterdam, the Netherlands
| | - Rutger W W Brouwer
- Center for Biomics, Erasmus University Medical Center, Rotterdam, the Netherlands
| | - Jeroen G J van Rooij
- Internal Medicine, Erasmus University Medical Center, Rotterdam, the Netherlands
| | | | - Gulnara R Svishcheva
- Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
| | - Robert Kraaij
- Internal Medicine, Erasmus University Medical Center, Rotterdam, the Netherlands
| | - Irina V Zorkoltseva
- Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
| | - Anatoly V Kirichenko
- Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
| | - Albert Hofman
- Departments of Epidemiology, Erasmus University Medical Center, Rotterdam, the Netherlands; Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
| | - André G Uitterlinden
- Departments of Epidemiology, Erasmus University Medical Center, Rotterdam, the Netherlands; Internal Medicine, Erasmus University Medical Center, Rotterdam, the Netherlands
| | | | - Henning Tiemeier
- Departments of Epidemiology, Erasmus University Medical Center, Rotterdam, the Netherlands
| | - Tatiana I Axenovich
- Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia; Department of Natural Sciences, Novosibirsk State University, Novosibirsk, Russia
| | - Cornelia M van Duijn
- Departments of Epidemiology, Erasmus University Medical Center, Rotterdam, the Netherlands
|
28
|
Wang Z, Xu K, Zhang X, Wu X, Wang Z. Longitudinal SNP-set association analysis of quantitative phenotypes. Genet Epidemiol 2017; 41:81-93. [PMID: 27859628 PMCID: PMC5154867 DOI: 10.1002/gepi.22016] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 03/15/2016] [Revised: 08/10/2016] [Accepted: 09/19/2016] [Indexed: 02/06/2023]
Abstract
Many genetic epidemiological studies collect repeated measurements over time. This design not only provides a more accurate assessment of disease condition, but allows us to explore the genetic influence on disease development and progression. Thus, it is of great interest to study the longitudinal contribution of genes to disease susceptibility. Most association testing methods for longitudinal phenotypes are developed for single variant, and may have limited power to detect association, especially for variants with low minor allele frequency. We propose Longitudinal SNP-set/sequence kernel association test (LSKAT), a robust, mixed-effects method for association testing of rare and common variants with longitudinal quantitative phenotypes. LSKAT uses several random effects to account for the within-subject correlation in longitudinal data, and allows for adjustment for both static and time-varying covariates. We also present a longitudinal trait burden test (LBT), where we test association between the trait and the burden score in linear mixed models. In simulation studies, we demonstrate that LBT achieves high power when variants are almost all deleterious or all protective, while LSKAT performs well in a wide range of genetic models. By making full use of trait values from repeated measures, LSKAT is more powerful than several tests applied to a single measurement or average over all time points. Moreover, LSKAT is robust to misspecification of the covariance structure. We apply the LSKAT and LBT methods to detect association with longitudinally measured body mass index in the Framingham Heart Study, where we are able to replicate association with a circadian gene NR1D2.
Affiliation(s)
- Zhong Wang
- Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, United States of America
- Baker Institute for Animal Health, Cornell University, Ithaca, New York, United States of America
- Center for Computational Biology, Beijing Forestry University, Beijing, China
| | - Ke Xu
- Department of Psychiatry, Yale School of Medicine, New Haven, Connecticut, United States of America
- VA Connecticut Healthcare System, West Haven, Connecticut, United States of America
| | - Xinyu Zhang
- Department of Psychiatry, Yale School of Medicine, New Haven, Connecticut, United States of America
- VA Connecticut Healthcare System, West Haven, Connecticut, United States of America
| | - Xiaowei Wu
- Department of Statistics, Virginia Polytechnic Institute and State University, Blacksburg, Virginia, United States of America
| | - Zuoheng Wang
- Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, United States of America
|
29
|
Xu L, Craiu RV, Sun L, Paterson AD. Parameter Expanded Algorithms for Bayesian Latent Variable Modeling of Genetic Pleiotropy Data. J Comput Graph Stat 2016; 25:405-425. [PMID: 27752219 DOI: 10.1080/10618600.2014.988337] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/24/2022]
Abstract
Motivated by genetic association studies of pleiotropy, we propose a Bayesian latent variable approach to jointly study multiple outcomes. The models studied here can incorporate both continuous and binary responses, and can account for serial and cluster correlations. We consider Bayesian estimation for the model parameters, and we develop a novel MCMC algorithm that builds upon hierarchical centering and parameter expansion techniques to efficiently sample from the posterior distribution. We evaluate the proposed method via extensive simulations and demonstrate its utility with an application to an association study of various complication outcomes related to type 1 diabetes. This article has supplementary material online.
Affiliation(s)
- Lizhen Xu
- Department of Statistical Sciences, University of Toronto,
| | - Radu V Craiu
- Department of Statistical Sciences, University of Toronto,
| | - Lei Sun
- Department of Statistical Sciences and Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto,
| | - Andrew D Paterson
- Program in Genetics and Genomic Biology, Hospital for Sick Children, and Dalla Lana School of Public Health, University of Toronto, Toronto,
|
30
|
Darst BF, Engelman CD. Transmission and decorrelation methods for detecting rare variants using sequencing data from related individuals. BMC Proc 2016; 10:203-207. [PMID: 27980637 PMCID: PMC5133523 DOI: 10.1186/s12919-016-0031-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/03/2023] Open
Abstract
BACKGROUND Advances in whole genome sequencing have enabled the investigation of rare variants, which could explain some of the missing heritability that genome-wide association studies are unable to detect. Most methods to detect associations with rare variants are developed for unrelated individuals; however, several methods exist that utilize family studies and could have better power to detect such associations. METHODS Using whole genome sequencing data and simulated phenotypes provided by the organizers of the Genetic Analysis Workshop 19 (GAW19), we compared family-based methods that test for associations between rare and common variants with a quantitative trait. This was done using 2 fairly novel methods: family-based association test for rare variants (FBAT-RV), which is a transmission-based method that utilizes the transmission of genetic information from parent to offspring; and Minimum p value Optimized Nuisance parameter Score Test Extended to Relatives (MONSTER), which is a decorrelation method that instead attempts to adjust for relatedness using a regression-based method. We also considered family-based association test linear combination (FBAT-LC) and FBAT-Min P, which are slightly older methods that do not allow for the weighting of rare or common variants, but contrast some of the limitations of FBAT-RV. RESULTS MONSTER had much higher overall power than FBAT-RV and FBAT-Min P. Interestingly, FBAT-LC had similar overall power as MONSTER. MONSTER had the highest power for a gene accounting for a larger percent of the phenotypic variance, whereas MONSTER and FBAT-LC both had the highest power for a gene accounting for moderate variance. FBAT-LC had the highest power for a gene accounting for the least variance. CONCLUSIONS Based on the simulated data from GAW19, MONSTER and FBAT-LC were the most powerful of the methods assessed. 
However, there are limitations to each of these methods that should be carefully considered when conducting an analysis of rare variants in related individuals. This emphasizes the need for methods that can incorporate the advantages of each of these methods into 1 family-based association test for rare variants.
Affiliation(s)
- Burcu F. Darst
- University of Wisconsin, Madison, WI USA
- Department of Population Health Sciences, University of Wisconsin School of Medicine and Public Health, Madison, WI USA
| | - Corinne D. Engelman
- University of Wisconsin, Madison, WI USA
- Department of Population Health Sciences, University of Wisconsin School of Medicine and Public Health, Madison, WI USA
|
31
|
Sippy R, Kolesar JM, Darst BF, Engelman CD. Prioritization of family member sequencing for the detection of rare variants. BMC Proc 2016; 10:227-231. [PMID: 27980641 PMCID: PMC5133500 DOI: 10.1186/s12919-016-0035-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The advent of affordable sequencing has enabled researchers to discover many variants contributing to disease, including rare variants. There are methods for determining the most informative individuals for sequencing, but the application of these methods is more complex when working with families. Sets of large families can be beneficial in finding rare variants, but it may be unfeasible to sequence all members of these family sets. METHODS Using simulated data from the Genetic Analysis Workshop 19, we apply multiple regression to identify cases and controls. To find the best controls for each case, we used kinship coefficients to match within families. Selected cases and controls were analyzed for rare variants, collapsed by gene, associated with hypertension using the family-based rare variant association test (FARVAT). RESULTS The gene with the strongest simulated effect, MAP4, did not meet the Bonferroni corrected significance threshold. However, analysis of cases and controls using our selection method substantially improved the significance of MAP4, despite the reduction in sample size. CONCLUSIONS Taking the additional steps to select the optimal cases and controls from large family data sets can help ensure that only informative individuals are included in analysis and may improve the ability to detect rare variants.
Affiliation(s)
- Rachel Sippy
- Department of Population Health Sciences, University of Wisconsin-Madison, 610 WARF Building, Madison, WI 53726 USA
| | - Jill M Kolesar
- Department of Population Health Sciences, University of Wisconsin-Madison, 610 WARF Building, Madison, WI 53726 USA
- School of Pharmacy, University of Wisconsin-Madison, 777 Highland Avenue, Madison, WI 53705 USA
| | - Burcu F Darst
- Department of Population Health Sciences, University of Wisconsin-Madison, 610 WARF Building, Madison, WI 53726 USA
| | - Corinne D Engelman
- Department of Population Health Sciences, University of Wisconsin-Madison, 610 WARF Building, Madison, WI 53726 USA
|
32
|
Zhang Q, Guldbrandtsen B, Calus MPL, Lund MS, Sahana G. Comparison of gene-based rare variant association mapping methods for quantitative traits in a bovine population with complex familial relationships. Genet Sel Evol 2016; 48:60. [PMID: 27534618 PMCID: PMC4989328 DOI: 10.1186/s12711-016-0238-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 01/08/2016] [Accepted: 08/04/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND There is growing interest in the role of rare variants in the variation of complex traits due to increasing evidence that rare variants are associated with quantitative traits. However, association methods that are commonly used for mapping common variants are not effective to map rare variants. Besides, livestock populations have large half-sib families and the occurrence of rare variants may be confounded with family structure, which makes it difficult to disentangle their effects from family mean effects. We compared the power of methods that are commonly applied in human genetics to map rare variants in cattle using whole-genome sequence data and simulated phenotypes. We also studied the power of mapping rare variants using linear mixed models (LMM), which are the method of choice to account for both family relationships and population structure in cattle. RESULTS We observed that the power of the LMM approach was low for mapping a rare variant (defined as those that have frequencies lower than 0.01) with a moderate effect (5 to 8 % of phenotypic variance explained by multiple rare variants that vary from 5 to 21 in number) contributing to a QTL with a sample size of 1000. In contrast, across the scenarios studied, statistical methods that are specialized for mapping rare variants increased power regardless of whether multiple rare variants or a single rare variant underlie a QTL. Different methods for combining rare variants in the test single nucleotide polymorphism set resulted in similar power irrespective of the proportion of total genetic variance explained by the QTL. However, when the QTL variance is very small (only 0.1 % of the total genetic variance), these specialized methods for mapping rare variants and LMM generally had no power to map the variants within a gene with sample sizes of 1000 or 5000. 
CONCLUSIONS We observed that the methods that combine multiple rare variants within a gene into a meta-variant generally had greater power to map rare variants compared to LMM. Therefore, it is recommended to use rare variant association mapping methods to map rare genetic variants that affect quantitative traits in livestock, such as bovine populations.
Affiliation(s)
- Qianqian Zhang
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Tjele, 8830, Denmark; Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, Wageningen, The Netherlands
| | - Bernt Guldbrandtsen
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Tjele, 8830, Denmark
| | - Mario P L Calus
- Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, Wageningen, The Netherlands
| | - Mogens Sandø Lund
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Tjele, 8830, Denmark
| | - Goutam Sahana
- Department of Molecular Biology and Genetics, Center for Quantitative Genetics and Genomics, Aarhus University, Tjele, 8830, Denmark
| |
Collapse
|
33
|
Abstract
Participants in the family-based analysis group at Genetic Analysis Workshop 19 addressed diverse topics, all of which used the family data. Topics addressed included questions of study design and data quality control (QC), genotype imputation to augment available sequence data, and linkage and/or association analyses. Results show that pedigree-based tests that are sensitive to genotype error may be useful for QC. Imputation quality improved with inclusion of small amounts of pedigree information used to phase the data in evaluation of 5 commonly used approaches for imputation in samples of (typically) unrelated subjects. It improved still further when pedigree-based imputation using larger pedigrees was also added. An important distinction was made between methods that do versus do not make use of Mendelian transmission in pedigrees, because this serves as a key difference between underlying models and assumptions. Methods that model relatedness generally had higher power in association testing than did analyses that carry out testing in the presence of a transmission model, but this may reflect details of implementation and/or ability of more general methods to jointly include data from larger pedigrees. In either case, for single nucleotide polymorphism-set approaches, weights that incorporate information on functional effects may be more useful than those that are based only on allele frequencies. The overall results demonstrate that family data continue to provide important information in the search for trait loci.
Affiliation(s)
- Ellen M Wijsman
  - Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA, 98195, USA
  - Department of Biostatistics, University of Washington, Seattle, WA, 98195, USA
|
34
|
Chien LC, Hsu FC, Bowden DW, Chiu YF. Generalization of Rare Variant Association Tests for Longitudinal Family Studies. Genet Epidemiol 2016; 40:101-12. [PMID: 26783077 DOI: 10.1002/gepi.21951] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 03/12/2015] [Revised: 11/19/2015] [Accepted: 11/19/2015] [Indexed: 11/06/2022]
Abstract
Given the functional relevance of many rare variants, their identification is frequently critical for dissecting disease etiology. Functional variants are likely to be aggregated in family studies enriched with affected members, and this aggregation increases the statistical power to detect rare variants associated with a trait of interest. Longitudinal family studies provide additional information for identifying genetic and environmental factors associated with disease over time. However, methods to analyze rare variants in longitudinal family data remain fairly limited. These methods should be capable of accounting for different sources of correlations and handling large amounts of sequencing data efficiently. To identify rare variants associated with a phenotype in longitudinal family studies, we extended pedigree-based burden (BT) and kernel (KS) association tests to genetic longitudinal studies. Generalized estimating equation (GEE) approaches were used to generalize the pedigree-based BT and KS to multiple correlated phenotypes under the generalized linear model framework, adjusting for fixed effects of confounding factors. These tests accounted for complex correlations between repeated measures of the same phenotype (serial correlations) and between individuals in the same family (familial correlations). We conducted comprehensive simulation studies to compare the proposed tests with mixed-effects models and marginal models, using GEEs under various configurations. When the proposed tests were applied to data from the Diabetes Heart Study, we found exome variants of POMGNT1 and JAK1 genes were associated with type 2 diabetes.
Affiliation(s)
- Li-Chu Chien
  - Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
- Fang-Chi Hsu
  - Department of Biostatistical Sciences, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America
- Donald W Bowden
  - Center for Diabetes Research, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America
  - Center for Genomics and Personalized Medicine Research, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America
  - Department of Biochemistry, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America
- Yen-Feng Chiu
  - Institute of Population Health Sciences, National Health Research Institutes, Miaoli, Taiwan
|
35
|
McCoy AM, Beeson SK, Splan RK, Lykkjen S, Ralston SL, Mickelson JR, McCue ME. Identification and validation of risk loci for osteochondrosis in standardbreds. BMC Genomics 2016; 17:41. [PMID: 26753841 PMCID: PMC4709891 DOI: 10.1186/s12864-016-2385-z] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 06/27/2015] [Accepted: 01/07/2016] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND Osteochondrosis (OC), simply defined as a failure of endochondral ossification, is a complex disease with both genetic and environmental risk factors that is commonly diagnosed in young horses, as well as other domestic species. Although up to 50 % of the risk for developing OC is reportedly inherited, specific genes and alleles underlying risk are thus far completely unknown. Regions of the genome identified as associated with OC vary across studies in different populations of horses. In this study, we used a cohort of Standardbred horses from the U.S. (n = 182) specifically selected for a shared early environment (to reduce confounding factors) to identify regions of the genome associated with tarsal OC. Subsequently, putative risk variants within these regions were evaluated in both the discovery population and an independently sampled validation population of Norwegian Standardbreds (n = 139) with tarsal OC. RESULTS After genome-wide association analysis of imputed data with information from >200,000 single nucleotide polymorphisms, two regions on equine chromosome 14 were associated with OC in the discovery cohort. Variant discovery in these and 30 additional regions of interest (including 11 from other published studies) was performed via whole-genome sequencing. 240 putative risk variants from 10 chromosomes were subsequently genotyped in both the discovery and validation cohorts. After correction for population structure, gait (trot or pace) and sex, the variants most highly associated with OC status in both populations were located within the chromosome 14 regions of association. CONCLUSIONS The association of putative risk alleles from within the same regions with disease status in two independent populations of Standardbreds suggests that these are true risk loci in this breed, although population-specific risk factors may still exist.
Evaluation of these loci in other populations will help determine if they are specific to the Standardbred breed, or to tarsal OC or are universal risk loci for OC. Further work is needed to identify the specific variants underlying OC risk within these loci. This is the first step towards the long-term goal of constructing a genetic risk model for OC that allows for genetic testing and quantification of risk in individuals.
Affiliation(s)
- Annette M McCoy
  - Veterinary Population Medicine Department, University of Minnesota, 1365 Gortner Ave., St. Paul, MN, USA
  - Department of Veterinary Clinical Medicine, University of Illinois, 1008 Hazelwood Dr., Urbana, IL, USA
- Samantha K Beeson
  - Veterinary Population Medicine Department, University of Minnesota, 1365 Gortner Ave., St. Paul, MN, USA
- Rebecca K Splan
  - Department of Animal and Poultry Sciences, Virginia Tech, 3470 Litton Reaves Hall, Blacksburg, VA, USA
- Sigrid Lykkjen
  - Faculty of Veterinary Medicine and Biosciences, Norwegian University of Life Sciences, NMBU-School of Veterinary Science, P.O. Box 8146 Dep., Oslo, Norway
- Sarah L Ralston
  - School of Environmental and Biological Sciences, Rutgers, The State University of New Jersey, 84 Lipman Dr., New Brunswick, NJ, USA
- James R Mickelson
  - Veterinary Biological Sciences Department, University of Minnesota, 1988 Fitch Ave., St. Paul, MN, USA
- Molly E McCue
  - Veterinary Population Medicine Department, University of Minnesota, 1365 Gortner Ave., St. Paul, MN, USA
|
36
|
Wu B, Guan W, Pankow JS. On Efficient and Accurate Calculation of Significance P-Values for Sequence Kernel Association Testing of Variant Set. Ann Hum Genet 2016; 80:123-35. [PMID: 26757198 DOI: 10.1111/ahg.12144] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 07/10/2015] [Accepted: 11/12/2015] [Indexed: 01/04/2023]
Abstract
The objective of this paper is to discuss and develop alternative computational methods to accurately and efficiently calculate significance P-values for the commonly used sequence kernel association test (SKAT) and adaptive sum of SKAT and burden test (SKAT-O) for variant set association. We show that the existing software can lead to either conservative or inflated type I errors. We develop alternative and efficient computational algorithms that quickly compute the SKAT P-value and have well-controlled type I errors. In addition, we derive an alternative and simplified formula for calculating the significance P-value of SKAT-O, which sheds light on the development of efficient and accurate numerical algorithms. We implement the proposed methods in the publicly available R package that can be readily used or adapted to large-scale sequencing studies. Given that more and more large-scale exome and whole genome sequencing or re-sequencing studies are being conducted, the proposed methods are practically very important. We conduct extensive numerical studies to investigate the performance of the proposed methods. We further illustrate their usefulness with application to associations between rare exonic variants and fasting glucose levels in the Atherosclerosis Risk in Communities (ARIC) study.
Affiliation(s)
- Baolin Wu
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
- Weihua Guan
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
- James S Pankow
  - Division of Epidemiology and Community Health, School of Public Health, University of Minnesota, Minneapolis, MN, USA
|
37
|
Lakhal-Chaieb L, Oualkacha K, Richards BJ, Greenwood CM. A rare variant association test in family-based designs and non-normal quantitative traits. Stat Med 2015; 35:905-21. [DOI: 10.1002/sim.6750] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 11/18/2014] [Revised: 09/04/2015] [Accepted: 09/05/2015] [Indexed: 12/13/2022]
Affiliation(s)
- Lajmi Lakhal-Chaieb
  - Département de mathématiques et statistique; Université Laval; Québec G1V 0A6 Québec Canada
- Karim Oualkacha
  - Département de mathématiques; Université de Québec À Montréal; Montreal Québec Canada
- Brent J. Richards
  - Lady Davis Institute for Medical Research; Jewish General Hospital; Montreal Québec Canada
  - Department of Epidemiology, Biostatistics and Occupational Health; McGill University; Montreal Québec Canada
  - Department of Twin Research; King's College London; London U.K
- Celia M.T. Greenwood
  - Lady Davis Institute for Medical Research; Jewish General Hospital; Montreal Québec Canada
  - Department of Epidemiology, Biostatistics and Occupational Health; McGill University; Montreal Québec Canada
  - Departments of Oncology and Human Genetics; McGill University; Montreal Québec Canada
|
38
|
Xu Z, Pan W. Approximate score-based testing with application to multivariate trait association analysis. Genet Epidemiol 2015. [PMID: 26198454 DOI: 10.1002/gepi.21911] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 01/13/2023]
Abstract
For genome-wide association studies and DNA sequencing studies, several powerful score-based tests, such as kernel machine regression and sum of powered score tests, have been proposed in the last few years. However, extensions of these score-based tests to more complex models, such as mixed-effects models for analysis of multiple and correlated traits, have been hindered by the unavailability of the score vector, due to either no output from statistical software or no closed-form solution at all. We propose a simple and general method to asymptotically approximate the score vector based on an asymptotically normal and consistent estimate of a parameter vector to be tested and its (consistent) covariance matrix. The proposed method is applicable to both maximum-likelihood estimation and estimating function-based approaches. We use the derived approximate score vector to extend several score-based tests to mixed-effects models. We demonstrate the feasibility and possible power gains of these tests in association analysis of multiple and correlated quantitative or binary traits with both real and simulated data. The proposed method is easy to implement with a wide applicability.
Affiliation(s)
- Zhiyuan Xu
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, United States of America
- Wei Pan
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, United States of America
|
39
|
Svishcheva GR, Belonogova NM, Axenovich TI. Region-Based Association Test for Familial Data under Functional Linear Models. PLoS One 2015; 10:e0128999. [PMID: 26111046 PMCID: PMC4481467 DOI: 10.1371/journal.pone.0128999] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 10/27/2014] [Accepted: 05/04/2015] [Indexed: 12/22/2022] Open
Abstract
Region-based association analysis is a more powerful tool for gene mapping than testing of individual genetic variants, particularly for rare genetic variants. The most powerful methods for regional mapping are based on the functional data analysis approach, which assumes that the regional genome of an individual may be considered as a continuous stochastic function that contains information about both linkage and linkage disequilibrium. Here, we extend this powerful approach, earlier applied only to independent samples, to samples of related individuals. To this end, we additionally include random polygenic effects in the functional linear model used for testing association between quantitative traits and multiple genetic variants in the region. We compare the statistical power of different methods using Genetic Analysis Workshop 17 mini-exome family data and a wide range of simulation scenarios. Our method increases the power of regional association analysis of quantitative traits compared with burden-based and kernel-based methods for the majority of the scenarios. In addition, we estimate the statistical power of our method using regions with a small number of genetic variants, and show that our method retains its advantage over burden-based and kernel-based methods in this case as well. The new method is implemented as the R-function 'famFLM' using two types of basis functions: the B-spline and Fourier bases. We compare the properties of the new method using models that differ from each other in the type of their function basis. The models based on the Fourier basis functions have an advantage in terms of speed and power over the models that use the B-spline basis functions and those that combine B-spline and Fourier basis functions. The 'famFLM' function is distributed under GPLv3 license and is freely available at http://mga.bionet.nsc.ru/soft/famFLM/.
Affiliation(s)
- Gulnara R. Svishcheva
  - Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- Nadezhda M. Belonogova
  - Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- Tatiana I. Axenovich
  - Institute of Cytology and Genetics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
  - Department of Natural Sciences, Novosibirsk State University, Novosibirsk, Russia
|
40
|
A statistical approach for rare-variant association testing in affected sibships. Am J Hum Genet 2015; 96:543-54. [PMID: 25799106 DOI: 10.1016/j.ajhg.2015.01.020] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 10/08/2014] [Accepted: 01/30/2015] [Indexed: 11/21/2022] Open
Abstract
Sequencing and exome-chip technologies have motivated development of novel statistical tests to identify rare genetic variation that influences complex diseases. Although many rare-variant association tests exist for case-control or cross-sectional studies, far fewer methods exist for testing association in families. This is unfortunate, because cosegregation of rare variation and disease status in families can amplify association signals for rare variants. Many researchers have begun sequencing (or genotyping via exome chips) familial samples that were either recently collected or previously collected for linkage studies. Because many linkage studies of complex diseases sampled affected sibships, we propose a strategy for association testing of rare variants for use in this study design. The logic behind our approach is that rare susceptibility variants should be found more often on regions shared identical by descent by affected sibling pairs than on regions not shared identical by descent. We propose both burden and variance-component tests of rare variation that are applicable to affected sibships of arbitrary size and that do not require genotype information from unaffected siblings or independent controls. Our approaches are robust to population stratification and produce analytic p values, thereby enabling our approach to scale easily to genome-wide studies of rare variation. We illustrate our methods by using simulated data and exome chip data from sibships ascertained for hypertension collected as part of the Genetic Epidemiology Network of Arteriopathy (GENOA) study.
|
41
|
Thornton T. Statistical methods for genome-wide and sequencing association studies of complex traits in related samples. Curr Protoc Hum Genet 2015; 84:1.28.1-1.28.9. [PMID: 25599666 PMCID: PMC4327940 DOI: 10.1002/0471142905.hg0128s84] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Indexed: 12/23/2022]
Abstract
Genome-wide association studies (GWAS) and sequencing studies are routinely conducted for the identification of genetic variants that are associated with complex traits. Many genetic studies for association mapping include related individuals. When relatives are included in an association analysis, familial correlations must be appropriately taken into account to ensure correct type I error and to increase power. This unit provides an overview of statistical methods that are available for GWAS and sequencing association studies of complex traits in samples with related individuals.
Affiliation(s)
- Timothy Thornton
  - Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
|
42
|
Peng B. Reproducible simulations of realistic samples for next-generation sequencing studies using Variant Simulation Tools. Genet Epidemiol 2015; 39:45-52. [PMID: 25395236 PMCID: PMC6432799 DOI: 10.1002/gepi.21867] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 06/16/2014] [Revised: 09/14/2014] [Accepted: 09/26/2014] [Indexed: 12/31/2022]
Abstract
Computer simulations have been widely used to validate and evaluate the power of statistical methods for genetic epidemiological studies. Although a large number of simulation methods and software packages have been developed for genome-wide association studies, methodological and bioinformatics challenges have limited their applications in simulating datasets for whole-genome and whole-exome sequencing studies. With the development of more sophisticated statistical methods that make fuller use of available data and our knowledge of the human genome, there is a pressing need for genetic simulators that capture more features of empirical data (e.g., multiallele variants, indels, use of the Variant Call Format) and the human genome (e.g., functional annotations of genetic variants). This article introduces Variant Simulation Tools (VST), a module of Variant Tools for the simulation of genetic variants for sequencing-based genetic epidemiological studies. Although multiple simulation engines are provided, the core of VST is a novel forward-time simulation engine that simulates real nucleotide sequences of the human genome using DNA mutation models, fine-scale recombination maps, and a selection model based on amino acid changes of translated protein sequences. The design of VST allows users to easily create and distribute simulation methods and simulated datasets for a variety of applications and encourages fair comparison between statistical methods through the use of existing or reproduced simulated datasets.
Affiliation(s)
- Bo Peng
  - Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, 1400 Pressler Street, Unit 1401, Houston, TX, 77030
|
43
|
Saad M, Wijsman EM. Combining family- and population-based imputation data for association analysis of rare and common variants in large pedigrees. Genet Epidemiol 2014; 38:579-90. [PMID: 25132070 PMCID: PMC4190076 DOI: 10.1002/gepi.21844] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Received: 02/28/2014] [Revised: 05/24/2014] [Accepted: 06/27/2014] [Indexed: 12/27/2022]
Abstract
In the last two decades, complex traits have become the main focus of genetic studies. The hypothesis that both rare and common variants are associated with complex traits is increasingly being discussed. Family-based association studies using relatively large pedigrees are suitable for both rare and common variant identification. Because of the high cost of sequencing technologies, imputation methods are important for increasing the amount of information at low cost. A recent family-based imputation method, Genotype Imputation Given Inheritance (GIGI), is able to handle large pedigrees and accurately impute rare variants, but does less well for common variants where population-based methods perform better. Here, we propose a flexible approach to combine imputation data from both family- and population-based methods. We also extend the Sequence Kernel Association Test for Rare and Common variants (SKAT-RC), originally proposed for data from unrelated subjects, to family data in order to make use of such imputed data. We call this extension "famSKAT-RC." We compare the performance of famSKAT-RC and several other existing burden and kernel association tests. In simulated pedigree sequence data, our results show an increase of imputation accuracy from use of our combining approach. Also, they show an increase of power of the association tests with this approach over the use of either family- or population-based imputation methods alone, in the context of rare and common variants. Moreover, our results show better performance of famSKAT-RC compared to the other considered tests, in most scenarios investigated here.
Affiliation(s)
- Mohamad Saad
  - Division of Medical Genetics, Department of Medicine; and Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
- Ellen M. Wijsman
  - Division of Medical Genetics, Department of Medicine; and Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
|
44
|
Jiang Y, Conneely KN, Epstein MP. Flexible and robust methods for rare-variant testing of quantitative traits in trios and nuclear families. Genet Epidemiol 2014; 38:542-51. [PMID: 25044337 DOI: 10.1002/gepi.21839] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 03/24/2014] [Revised: 05/21/2014] [Accepted: 05/29/2014] [Indexed: 11/07/2022]
Abstract
Most rare-variant association tests for complex traits are applicable only to population-based or case-control resequencing studies. There are fewer rare-variant association tests for family-based resequencing studies, which is unfortunate because pedigrees possess many attractive characteristics for such analyses. Family-based studies can be more powerful than their population-based counterparts due to increased genetic load and further enable the implementation of rare-variant association tests that, by design, are robust to confounding due to population stratification. With this in mind, we propose a rare-variant association test for quantitative traits in families; this test integrates the QTDT approach of Abecasis et al. into the kernel-based SNP association test KMFAM of Schifano et al. The resulting within-family test enjoys the many benefits of the kernel framework for rare-variant association testing, including rapid evaluation of P-values and preservation of power when a region harbors rare causal variation that acts in different directions on phenotype. Additionally, by design, this within-family test is robust to confounding due to population stratification. Although within-family association tests are generally less powerful than their counterparts that use all genetic information, we show that we can recover much of this power (although still ensuring robustness to population stratification) using a straightforward screening procedure. Our method accommodates covariates and allows for missing parental genotype data, and we have written software implementing the approach in R for public use.
Affiliation(s)
- Yunxuan Jiang
  - Department of Biostatistics and Bioinformatics, Emory University, Atlanta, Georgia, United States of America
|
45
|
Abstract
The use of genetically isolated populations can empower next-generation association studies. In this review, we discuss the advantages of this approach and review study design and analytical considerations of genetic association studies focusing on isolates. We cite successful examples of using population isolates in association studies and outline potential ways forward.
|