1
Timoteo VJ, Chiang KM, Yang HC, Pan WH. Common and ethnic-specific genetic determinants of hemoglobin concentration between Taiwanese Han Chinese and European Whites: findings from comparative two-stage genome-wide association studies. J Nutr Biochem 2023; 111:109126. [PMID: 35964923 DOI: 10.1016/j.jnutbio.2022.109126] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/26/2021] [Revised: 06/21/2022] [Accepted: 07/15/2022] [Indexed: 12/23/2022]
Abstract
Human iron nutrition is a result of interplays between genetic and environmental factors. However, there has been scarcity of data on the genetic variants associated with altered iron homeostasis and ethnic-specific associations are further lacking. In this study, we compared between the Taiwanese Han Chinese (HC) and European Whites the genetic determinants of hemoglobin (Hb) concentration, a biochemical parameter that in part reflects the amount of functional iron in the body. Through sex-specific two-stage genome-wide association studies (2S-GWAS), we observed the consistent Hb-association of SNPs in TMPRSS6 (chr 22), ABO (chr 9), and PRKCE (chr 2) across sexes in both ethnic groups. Specific to the Taiwanese HC, the Hb-association of AXIN1, together with other loci near the chr 16 alpha-globin gene cluster, was found novel. On the other hand, majority of the Hb-associated SNPs among Europeans were identified along the chr 6 major histocompatibility complex (MHC) region, which has established roles in immune system control. We report here strong Hb-associations of HFE and members of gene families (SLC17; H2A, H2B, H3, H4, H1; TRIM; ZSCAN, ZKSCAN, ZNF; HLA; BTN, OR), numerous SNPs in/nearby CARMIL1, PRRC2A, PSORS1C1, NOTCH4, TSBP1, C6orf15, and distinct associations with non-coding RNA genes. Our findings provide evidence for both common and ethnic-specific genetic determinants of Hb between East Asians and Caucasians. These will help to further our understanding of the iron and/or erythropoiesis physiology in humans and to identify high risk subgroups for iron imbalances - a primary requirement to meet the goal of precision nutrition for optimal health.
Affiliation(s)
- Vanessa Joy Timoteo
  - Taiwan International Graduate Program in Molecular Medicine, National Yang Ming Chiao Tung University and Academia Sinica, Taipei City, Taiwan; Institute of Biomedical Sciences, Academia Sinica, Taipei City, Taiwan
- Kuang-Mao Chiang
  - Institute of Biomedical Sciences, Academia Sinica, Taipei City, Taiwan
- Hsin-Chou Yang
  - Institute of Statistical Science, Academia Sinica, Taipei City, Taiwan
- Wen-Harn Pan
  - Institute of Biomedical Sciences, Academia Sinica, Taipei City, Taiwan
2
Perng W, Aslibekyan S. Find the Needle in the Haystack, Then Find It Again: Replication and Validation in the 'Omics Era. Metabolites 2020; 10:metabo10070286. [PMID: 32664690 PMCID: PMC7408356 DOI: 10.3390/metabo10070286] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 05/29/2020] [Revised: 07/01/2020] [Accepted: 07/10/2020] [Indexed: 01/25/2023] Open
Abstract
Advancements in high-throughput technologies have made it feasible to study thousands of biological pathways simultaneously for a holistic assessment of health and disease risk via ‘omics platforms. A major challenge in ‘omics research revolves around the reproducibility of findings—a feat that hinges upon balancing false-positive associations with generalizability. Given the foundational role of reproducibility in scientific inference, replication and validation of ‘omics findings are cornerstones of this effort. In this narrative review, we define key terms relevant to replication and validation, present issues surrounding each concept with historical and contemporary examples from genomics (the most well-established and upstream ‘omics), discuss special issues and unique considerations for replication and validation in metabolomics (an emerging field and most downstream ‘omics for which best practices remain yet to be established), and make suggestions for future research leveraging multiple ‘omics datasets.
Affiliation(s)
- Wei Perng
  - Department of Epidemiology, Colorado School of Public Health, University of Colorado Denver Anschutz Medical Campus, Aurora, CO 80045, USA
  - Lifecourse Epidemiology of Adiposity and Diabetes (LEAD) Center, University of Colorado Denver Anschutz Medical Campus, Aurora, CO 80045, USA
- Stella Aslibekyan
  - Department of Epidemiology, University of Alabama at Birmingham, Birmingham, AL 35294, USA
3
Kawabata T, Emoto R, Nishino J, Takahashi K, Matsui S. Two-stage analysis for selecting fixed numbers of features in omics association studies. Stat Med 2019; 38:2956-2971. [PMID: 30931544 DOI: 10.1002/sim.8150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2018] [Revised: 12/31/2018] [Accepted: 02/28/2019] [Indexed: 11/07/2022]
Abstract
One of main roles of omics-based association studies with high-throughput technologies is to screen out relevant molecular features, such as genetic variants, genes, and proteins, from a large pool of such candidate features based on their associations with the phenotype of interest. Typically, screened features are subject to validation studies using more established or conventional assays, where the number of evaluable features is relatively limited, so that there may exist a fixed number of features measurable by these assays. Such a limitation necessitates narrowing a feature set down to a fixed size, following an initial screening analysis via multiple testing where adjustment for multiplicity is made. We propose a two-stage screening approach to control the false discovery rate (FDR) for a feature set with fixed size that is subject to validation studies, rather than for a feature set from the initial screening analysis. Out of the feature set selected in the first stage with a relaxed FDR level, a fraction of features with most statistical significance is firstly selected. For the remaining feature set, features are selected based on biological consideration only, without regard to any statistical information, which allows evaluating the FDR level for the finally selected feature set with fixed size. Improvement of the power is discussed in the proposed two-stage screening approach. Simulation experiments based on parametric models and real microarray datasets demonstrated substantial increment in the number of screened features for biological consideration compared with the standard screening approach, allowing for more extensive and in-depth biological investigations in omics association studies.
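The first-stage screening in this abstract relies on controlling the false discovery rate at a chosen level. As a minimal sketch of that general idea, the snippet below implements the standard Benjamini-Hochberg step-up procedure. This is a swapped-in textbook technique, not the authors' two-stage method; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues, fdr_level):
    """Return the indices of features selected by the Benjamini-Hochberg procedure."""
    m = len(pvalues)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k with p_(k) <= k * fdr_level / m.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * fdr_level / m:
            k = rank
    # Select every feature ranked at or below k.
    return sorted(order[:k])


# Hypothetical p-values, for illustration only.
example_pvalues = [0.04, 0.001, 0.03, 0.5]
strict = benjamini_hochberg(example_pvalues, 0.05)
relaxed = benjamini_hochberg(example_pvalues, 0.10)
```

Relaxing the FDR level from 0.05 to 0.10 enlarges the selected set in this toy input, which is the kind of trade-off a relaxed first-stage threshold is meant to exploit.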
Affiliation(s)
- Takanori Kawabata
  - Clinical Research Promotion Unit, Clinical Research Center, Shizuoka Cancer Center, Shizuoka, Japan
- Ryo Emoto
  - Department of Biostatistics, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Jo Nishino
  - Department of Medical Science Mathematics, Medical Research Institute, Tokyo Medical and Dental University, Tokyo, Japan
- Kunihiko Takahashi
  - Department of Biostatistics, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Shigeyuki Matsui
  - Department of Biostatistics, Nagoya University Graduate School of Medicine, Nagoya, Japan
4
Moon CM, Kim SW, Ahn JB, Ma HW, Che X, Kim TI, Kim WH, Cheon JH. Deep Resequencing of Ulcerative Colitis-Associated Genes Identifies Novel Variants in Candidate Genes in the Korean Population. Inflamm Bowel Dis 2018; 24:1706-1717. [PMID: 29733354 DOI: 10.1093/ibd/izy122] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 07/17/2017] [Indexed: 01/08/2023]
Abstract
BACKGROUND Genome-wide association studies and meta-analyses have revealed the genetic background of ulcerative colitis (UC) by identifying common variants. However, these variants do not fully explain the disease variance in UC. To identify novel variants, we performed deep resequencing of UC-associated genes in Korean UC patients and subsequently investigated the functional roles of identified susceptibility genes. METHODS We performed targeted deep resequencing of 108 genes in 24 Korean UC patients and then performed association analysis with data from 126 healthy controls. We validated these variants using 2-stage replication studies including 793 UC patients and 783 controls. We performed in silico and pathway analyses and functional analyses. RESULTS The combined analysis including 2 replication studies identified 6 novel susceptibility loci and reconfirmed 10 previously reported loci. Among the novel single nucleotide variants (SNVs), rs10035653 in C5orf55 (P = 2.08 × 10⁻³; OR = 1.50), rs41417449 in BTNL2 (P = 1.27 × 10⁻²; OR = 1.32), rs3117099 in HCG23 (P = 9.98 × 10⁻⁶; OR = 1.40), rs7192 in HLA-DRA (P = 6.95 × 10⁻⁹; OR = 1.57), and rs3744246 in ORMDL3 (P = 2.21 × 10⁻²; OR = 1.21) were identified as causal variants, whereas rs713669 in IL17REL (P = 2.69 × 10⁻²; OR = 0.84) as a protective variant for UC. When correcting multiple testing, 3 novel SNVs (rs41417449 in BTNL2, rs3744246 in ORMDL3, and rs713669 in IL17REL) and 4 previously reported SNVs did not reach a statistical significance. Functional study suggested that SNVs of BTNL2 and C5orf55 exacerbated the inflammatory response both in vitro and in vivo. CONCLUSIONS This study identified 3 novel susceptibility loci and validated 6 previously reported SNVs for UC through deep resequencing in Koreans and revealed the functional roles of BTNL2 and C5orf55.
Affiliation(s)
- Chang Mo Moon
  - Department of Internal Medicine, Institute of Gastroenterology, Yonsei University College of Medicine, Seoul, Republic of Korea
- Seung Won Kim
  - Department of Internal Medicine, Institute of Gastroenterology, Yonsei University College of Medicine, Seoul, Republic of Korea
  - Severance Biomedical Science Institute, Yonsei University College of Medicine, Seoul, Republic of Korea
  - Brain Korea 21 PLUS Project for Medical Science, Yonsei University College of Medicine, Seoul, Republic of Korea
- Jae Bum Ahn
  - Department of Medicine, Yonsei University College of Medicine, Seoul, Republic of Korea
- Hyun Woo Ma
  - Department of Internal Medicine, Institute of Gastroenterology, Yonsei University College of Medicine, Seoul, Republic of Korea
- Xiumei Che
  - Department of Internal Medicine, Institute of Gastroenterology, Yonsei University College of Medicine, Seoul, Republic of Korea
  - Brain Korea 21 PLUS Project for Medical Science, Yonsei University College of Medicine, Seoul, Republic of Korea
- Tae Il Kim
  - Department of Internal Medicine, Institute of Gastroenterology, Yonsei University College of Medicine, Seoul, Republic of Korea
- Won Ho Kim
  - Department of Internal Medicine, Institute of Gastroenterology, Yonsei University College of Medicine, Seoul, Republic of Korea
- Jae Hee Cheon
  - Department of Internal Medicine, Institute of Gastroenterology, Yonsei University College of Medicine, Seoul, Republic of Korea
  - Severance Biomedical Science Institute, Yonsei University College of Medicine, Seoul, Republic of Korea
  - Brain Korea 21 PLUS Project for Medical Science, Yonsei University College of Medicine, Seoul, Republic of Korea
5
Floyd JS, Sitlani CM, Avery CL, Noordam R, Li X, Smith AV, Gogarten SM, Li J, Broer L, Evans DS, Trompet S, Brody JA, Stewart JD, Eicher JD, Seyerle AA, Roach J, Lange LA, Lin HJ, Kors JA, Harris TB, Li-Gao R, Sattar N, Cummings SR, Wiggins KL, Napier MD, Stürmer T, Bis JC, Kerr KF, Uitterlinden AG, Taylor KD, Stott DJ, de Mutsert R, Launer LJ, Busch EL, Méndez-Giráldez R, Sotoodehnia N, Soliman EZ, Li Y, Duan Q, Rosendaal FR, Slagboom PE, Wilhelmsen KC, Reiner AP, Chen YDI, Heckbert SR, Kaplan RC, Rice KM, Jukema JW, Johnson AD, Liu Y, Mook-Kanamori DO, Gudnason V, Wilson JG, Rotter JI, Laurie CC, Psaty BM, Whitsel EA, Cupples LA, Stricker BH. Large-scale pharmacogenomic study of sulfonylureas and the QT, JT and QRS intervals: CHARGE Pharmacogenomics Working Group. The Pharmacogenomics Journal 2018; 18:127-135. [PMID: 27958378 PMCID: PMC5468495 DOI: 10.1038/tpj.2016.90] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 08/19/2016] [Revised: 10/25/2016] [Accepted: 11/14/2016] [Indexed: 12/17/2022]
Abstract
Sulfonylureas, a commonly used class of medication used to treat type 2 diabetes, have been associated with an increased risk of cardiovascular disease. Their effects on QT interval duration and related electrocardiographic phenotypes are potential mechanisms for this adverse effect. In 11 ethnically diverse cohorts that included 71 857 European, African-American and Hispanic/Latino ancestry individuals with repeated measures of medication use and electrocardiogram (ECG) measurements, we conducted a pharmacogenomic genome-wide association study of sulfonylurea use and three ECG phenotypes: QT, JT and QRS intervals. In ancestry-specific meta-analyses, eight novel pharmacogenomic loci met the threshold for genome-wide significance (P < 5 × 10⁻⁸), and a pharmacokinetic variant in CYP2C9 (rs1057910) that has been associated with sulfonylurea-related treatment effects and other adverse drug reactions in previous studies was replicated. Additional research is needed to replicate the novel findings and to understand their biological basis.
Affiliation(s)
- James S Floyd
  - Departments of Epidemiology and Medicine, University of Washington, Seattle, WA, USA
- Christy L Avery
  - Department of Epidemiology, University of North Carolina, Chapel Hill, NC, USA
- Raymond Noordam
  - Department of Epidemiology, Erasmus MC - University Medical Center Rotterdam, Rotterdam, the Netherlands
  - Department of Gerontology and Geriatrics, Leiden University Medical Center, Leiden, the Netherlands
- Xiaohui Li
  - Institute for Translational Genomics and Population Sciences, Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center, Torrance, California, USA
- Albert V Smith
  - Icelandic Heart Association, Kopavogur, Iceland
  - Faculty of Medicine, University of Iceland, Reykjavik, Iceland
- Jin Li
  - Department of Medicine, Division of Cardiovascular Medicine, Stanford University School of Medicine, Palo Alto, CA, USA
- Linda Broer
  - Department of Internal Medicine, Erasmus MC - University Medical Center Rotterdam, Rotterdam, the Netherlands
- Daniel S Evans
  - California Pacific Medical Center Research Institute, San Francisco, CA, USA
- Stella Trompet
  - Department of Cardiology and Department of Gerontology and Geriatrics, Leiden University Medical Center, Leiden, the Netherlands
- Jennifer A Brody
  - Department of Medicine, University of Washington, Seattle, WA, USA
- James D Stewart
  - Department of Epidemiology, University of North Carolina, Chapel Hill, NC, USA
  - Carolina Population Center, University of North Carolina, Chapel Hill, NC, USA
- John D Eicher
  - Population Sciences Branch, National Heart Lung and Blood Institute, National Institutes of Health, Framingham, MA, USA
  - The Framingham Heart Study, Framingham, MA, USA
- Amanda A Seyerle
  - Department of Epidemiology and Community Health, University of Minnesota, Minneapolis, MN, USA
- Jeffrey Roach
  - Research Computing Center, University of North Carolina, Chapel Hill, NC
- Leslie A Lange
  - Department of Genetics, University of North Carolina, Chapel Hill, NC, USA
- Henry J Lin
  - Institute for Translational Genomics and Population Sciences, Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center, Torrance, California, USA
  - Division of Medical Genetics, Harbor-UCLA Medical Center, Torrance, California, USA
- Jan A Kors
  - Department of Medical Informatics, Erasmus MC - University Medical Center Rotterdam, Rotterdam, the Netherlands
- Tamara B Harris
  - Laboratory of Epidemiology, Demography, and Biometry, National Institute on Aging, Bethesda, MD, USA
- Ruifang Li-Gao
  - Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, the Netherlands
- Naveed Sattar
  - BHF Glasgow Cardiovascular Research Centre, Faculty of Medicine, Glasgow, United Kingdom
- Steven R Cummings
  - California Pacific Medical Center Research Institute, San Francisco, CA, USA
- Kerri L Wiggins
  - Department of Medicine, University of Washington, Seattle, WA, USA
- Melanie D Napier
  - Department of Epidemiology, University of North Carolina, Chapel Hill, NC, USA
- Til Stürmer
  - Department of Epidemiology, University of North Carolina, Chapel Hill, NC, USA
  - Center for Pharmacoepidemiology, University of North Carolina, Chapel Hill, NC, USA
- Joshua C Bis
  - Department of Medicine, University of Washington, Seattle, WA, USA
- Kathleen F Kerr
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
- André G Uitterlinden
  - Department of Internal Medicine, Erasmus MC - University Medical Center Rotterdam, Rotterdam, the Netherlands
- Kent D Taylor
  - Institute for Translational Genomics and Population Sciences, Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center, Torrance, California, USA
- David J Stott
  - Institute of Cardiovascular and Medical Sciences, Faculty of Medicine, University of Glasgow, Scotland, United Kingdom
- Renée de Mutsert
  - Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, the Netherlands
- Lenore J Launer
  - Laboratory of Epidemiology, Demography, and Biometry, National Institute on Aging, Bethesda, MD, USA
- Evan L Busch
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Nona Sotoodehnia
  - Departments of Epidemiology and Medicine, University of Washington, Seattle, WA, USA
- Elsayed Z Soliman
  - Epidemiological Cardiology Research Center (EPICARE), Wake Forest School of Medicine, Winston-Salem, NC, USA
- Yun Li
  - Department of Biostatistics, Computer Science, and Genetics, University of North Carolina, Chapel Hill, NC, USA
- Qing Duan
  - Research Computing Center, University of North Carolina, Chapel Hill, NC
- Frits R Rosendaal
  - Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, the Netherlands
- P Eline Slagboom
  - Department of Medical Statistics and Bioinformatics, Section of Molecular Epidemiology, Leiden University Medical Center, Leiden, The Netherlands
- Kirk C Wilhelmsen
  - Research Computing Center, University of North Carolina, Chapel Hill, NC
  - The Renaissance Computing Institute, Chapel Hill, NC, USA
- Alexander P Reiner
  - Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
  - Department of Epidemiology, University of Washington, Seattle, WA, USA
- Yii-Der I Chen
  - Institute for Translational Genomics and Population Sciences, Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center, Torrance, California, USA
- Susan R Heckbert
  - Department of Epidemiology, University of Washington, Seattle, WA, USA
- Robert C Kaplan
  - Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, NY, USA
- Kenneth M Rice
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
- J Wouter Jukema
  - Department of Cardiology, Leiden University Medical Center, Leiden, The Netherlands
  - Einthoven Laboratory for Experimental Vascular Medicine, Leiden University Medical Center, Leiden, the Netherlands
  - Interuniversity Cardiology Institute of the Netherlands, Utrecht, The Netherlands
- Andrew D Johnson
  - Population Sciences Branch, National Heart Lung and Blood Institute, National Institutes of Health, Framingham, MA, USA
  - The Framingham Heart Study, Framingham, MA, USA
- Yongmei Liu
  - Department of Epidemiology and Prevention, Division of Public Health Sciences, Wake Forest University, Winston-Salem, NC, USA
- Dennis O Mook-Kanamori
  - Department of Clinical Epidemiology, Leiden University Medical Center, Leiden, the Netherlands
  - Department of Public Health and Primary Care, Leiden University Medical Center, Leiden, the Netherlands
- Vilmundur Gudnason
  - Icelandic Heart Association, Kopavogur, Iceland
  - Faculty of Medicine, University of Iceland, Reykjavik, Iceland
- James G Wilson
  - Department of Physiology and Biophysics, University of Mississippi Medical Center, Jackson, MS, USA
- Jerome I Rotter
  - Institute for Translational Genomics and Population Sciences, Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center, Torrance, California, USA
- Cathy C Laurie
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
- Bruce M Psaty
  - Departments of Epidemiology, Health Services, and Medicine, University of Washington, Seattle, WA, USA
  - Group Health Research Institute, Group Health Cooperative, Seattle, WA, USA
- Eric A Whitsel
  - Departments of Epidemiology and Medicine, University of North Carolina, Chapel Hill, NC, USA
- L Adrienne Cupples
  - The Framingham Heart Study, Framingham, MA, USA
  - Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Bruno H Stricker
  - Department of Epidemiology, Erasmus MC - University Medical Center Rotterdam, Rotterdam, the Netherlands
  - Inspectorate of Health Care, Utrecht, the Netherlands
6
Espin-Garcia O, Craiu RV, Bull SB. Two-phase designs for joint quantitative-trait-dependent and genotype-dependent sampling in post-GWAS regional sequencing. Genet Epidemiol 2017; 42:104-116. [PMID: 29239496 PMCID: PMC5814750 DOI: 10.1002/gepi.22099] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 09/07/2017] [Revised: 10/23/2017] [Accepted: 10/23/2017] [Indexed: 11/09/2022]
Abstract
We evaluate two‐phase designs to follow‐up findings from genome‐wide association study (GWAS) when the cost of regional sequencing in the entire cohort is prohibitive. We develop novel expectation‐maximization‐based inference under a semiparametric maximum likelihood formulation tailored for post‐GWAS inference. A GWAS‐SNP (where SNP is single nucleotide polymorphism) serves as a surrogate covariate in inferring association between a sequence variant and a normally distributed quantitative trait (QT). We assess test validity and quantify efficiency and power of joint QT‐SNP‐dependent sampling and analysis under alternative sample allocations by simulations. Joint allocation balanced on SNP genotype and extreme‐QT strata yields significant power improvements compared to marginal QT‐ or SNP‐based allocations. We illustrate the proposed method and evaluate the sensitivity of sample allocation to sampling variation using data from a sequencing study of systolic blood pressure.
Affiliation(s)
- Osvaldo Espin-Garcia
  - Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
- Radu V Craiu
  - Department of Statistical Sciences, University of Toronto, Toronto, ON, Canada
- Shelley B Bull
  - Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
7
Bull SB, Andrulis IL, Paterson AD. Statistical challenges in high-dimensional molecular and genetic epidemiology. Can J Stat 2017. [DOI: 10.1002/cjs.11342] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Affiliation(s)
- Shelley B. Bull
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, Ontario, Canada M5T 3L9
  - Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada M5T 3M7
- Irene L. Andrulis
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, Ontario, Canada M5T 3L9
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada M5S 1A8
- Andrew D. Paterson
  - Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada M5T 3M7
  - Genetics and Genome Biology Program, The Hospital for Sick Children, Toronto, Ontario, Canada M5G 0A4
8
Pattaro C. Genome-wide association studies of albuminuria: towards genetic stratification in diabetes? J Nephrol 2017; 31:475-487. [PMID: 28918587 DOI: 10.1007/s40620-017-0437-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 04/23/2017] [Accepted: 09/02/2017] [Indexed: 12/16/2022]
Abstract
Genome-wide association studies (GWAS) have been very successful in unraveling the polygenic structure of several complex diseases and traits. In the case of albuminuria, despite the large sample size achieved by some studies, results look sparse with a limited number of loci reported so far. This review searched for GWAS studies of albumin excretion, albuminuria, and proteinuria. The resulting picture sets elements of uniqueness for albuminuria GWAS with respect to other complex traits. So far, very few loci associated with albuminuria have been validated by means of genome-wide significant evidence or formal replication. With rare exceptions, the validated loci are ethnicity specific. Within a given ethnicity, variants are common and have relatively large effects, especially in the presence of diabetes. In most cases, the identified variants were functional and a biological involvement of the target genes in renal damage was established. Recently reported variants associated with albuminuria in diabetes may be potentially combined into a genetic risk score, making it possible to rank diabetic patients by increasing risk of albuminuria. Validation of this model is required. To expand the understanding of the biological basis of albumin excretion regulation, future initiatives should achieve larger sample sizes and favor a transethnic study design.
Affiliation(s)
- Cristian Pattaro
  - Institute for Biomedicine, Eurac Research, Affiliated Institute of the University of Lübeck, Via Galvani 31, 39100, Bolzano, Italy
9
Yin X, Bizon C, Tilson J, Lin Y, Gizer IR, Ehlers CL, Wilhelmsen KC. Genome-wide meta-analysis identifies a novel susceptibility signal at CACNA2D3 for nicotine dependence. Am J Med Genet B Neuropsychiatr Genet 2017; 174:557-567. [PMID: 28440896 PMCID: PMC5656555 DOI: 10.1002/ajmg.b.32540] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 10/17/2016] [Accepted: 03/07/2017] [Indexed: 11/11/2022]
Abstract
Nicotine dependence (ND) has a reported heritability of 40-70%. Low-coverage whole-genome sequencing was conducted in 1,889 samples from the UCSF Family study. Linear mixed models were used to conduct genome-wide association (GWA) tests of ND in this and five cohorts obtained from the database of Genotypes and Phenotypes. Fixed-effect meta-analysis was carried out separately for European (n = 14,713) and African (n = 3,369) participants, and then in a combined analysis of both ancestral groups. The meta-analysis of African participants identified a significant and novel susceptibility signal (rs56247223; p = 4.11 × 10⁻⁸). Data from the Genotype-Tissue Expression (GTEx) study suggested the protective allele is associated with reduced mRNA expression of CACNA2D3 in three human brain tissues (p < 4.94 × 10⁻²). Sequence data from the UCSF Family study suggested that a rare nonsynonymous variant in this gene conferred increased risk for ND (p = 0.01) providing further support for CACNA2D3 involvement in ND. Suggestive associations were observed in six additional regions in both European and merged populations (p < 5.00 × 10⁻⁶). The top variants were found to regulate mRNA expression levels of genes in human brains using GTEx data (p < 0.05): HAX1 and CHRNB2 (rs1760803), ADAMTSL1 (rs17198023), PEX2 (rs12680810), GLIS3 (rs12348139), non-coding RNA for LINC00476 (rs10759883), and GABBR1 (rs56020557 and rs62392942). A gene-based association test further supported the relation between GABBR1 and ND (p = 6.36 × 10⁻⁷). These findings will inform the biological mechanisms and development of therapeutic targets for ND.
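A fixed-effect meta-analysis of the kind named in this abstract is commonly computed by inverse-variance weighting of per-cohort effect estimates. The snippet below is a minimal sketch of that standard calculation, assuming each cohort supplies an effect estimate and its standard error. The function name and the input numbers are hypothetical, and the abstract does not state which software or exact weighting scheme the authors used.

```python
import math


def fixed_effect_meta(betas, ses):
    """Combine per-cohort effect estimates by inverse-variance weighting."""
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and of equal length")
    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se ** 2) for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal reference distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p_value


# Hypothetical per-cohort estimates, for illustration only.
pooled_beta, pooled_se, z, p_value = fixed_effect_meta([0.1, 0.1], [0.1, 0.1])
```

With two equally precise cohorts the pooled estimate equals their common effect, while the pooled standard error shrinks by a factor of the square root of two.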
Affiliation(s)
- Xianyong Yin
  - Department of Genetics, and Renaissance Computing Institute, University of North Carolina at Chapel Hill, 120 Mason Farm Road 5000 D, Chapel Hill, NC 27599-7264, United States
- Chris Bizon
  - Department of Genetics, and Renaissance Computing Institute, University of North Carolina at Chapel Hill, 120 Mason Farm Road 5000 D, Chapel Hill, NC 27599-7264, United States
- Jeffrey Tilson
  - Department of Genetics, and Renaissance Computing Institute, University of North Carolina at Chapel Hill, 120 Mason Farm Road 5000 D, Chapel Hill, NC 27599-7264, United States
- Yuan Lin
  - Department of Genetics, and Renaissance Computing Institute, University of North Carolina at Chapel Hill, 120 Mason Farm Road 5000 D, Chapel Hill, NC 27599-7264, United States
- Ian R. Gizer
  - Department of Psychological Sciences, University of Missouri, 210 McAlester Hall, Columbia, MO 65211, United States
- Cindy L. Ehlers
  - Department of Molecular and Cellular Neurosciences, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, United States
- Kirk C. Wilhelmsen
  - Department of Genetics, and Renaissance Computing Institute, University of North Carolina at Chapel Hill, 120 Mason Farm Road 5000 D, Chapel Hill, NC 27599-7264, United States
  - Correspondence to: Kirk C. Wilhelmsen, MD, PhD, Department of Genetics, and Renaissance Computing Institute, University of North Carolina at Chapel Hill, 120 Mason Farm Road 5000 D, Chapel Hill, NC 27599-7264, USA. Tel: 1-919-966-1373; Fax: 1-919-843-4682
10
Lynch SM, Mitra N, Ross M, Newcomb C, Dailey K, Jackson T, Zeigler-Johnson CM, Riethman H, Branas CC, Rebbeck TR. A Neighborhood-Wide Association Study (NWAS): Example of prostate cancer aggressiveness. PLoS One 2017; 12:e0174548. [PMID: 28346484 PMCID: PMC5367705 DOI: 10.1371/journal.pone.0174548] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 06/14/2016] [Accepted: 03/11/2017] [Indexed: 12/31/2022] Open
Abstract
PURPOSE Cancer results from complex interactions of multiple variables at the biologic, individual, and social levels. Compared to other levels, social effects that occur geospatially in neighborhoods are not as well-studied, and empiric methods to assess these effects are limited. We propose a novel Neighborhood-Wide Association Study (NWAS), analogous to genome-wide association studies (GWAS), that utilizes high-dimensional computing approaches from biology to comprehensively and empirically identify neighborhood factors associated with disease. METHODS Pennsylvania Cancer Registry data were linked to U.S. Census data. In a successively more stringent multiphase approach, we evaluated the association between neighborhood (n = 14,663 census variables) and prostate cancer aggressiveness (PCA) with n = 6,416 aggressive (Stage≥3/Gleason grade≥7) cases vs. n = 70,670 non-aggressive (Stage<3/Gleason grade<7) cases in White men. Analyses accounted for age, year of diagnosis, spatial correlation, and multiple-testing. We used generalized estimating equations in Phase 1 and Bayesian mixed effects models in Phase 2 to calculate odds ratios (OR) and confidence/credible intervals (CI). In Phase 3, principal components analysis grouped correlated variables. RESULTS We identified 17 new neighborhood variables associated with PCA. These variables represented income, housing, employment, immigration, access to care, and social support. The top hits or most significant variables related to transportation (OR = 1.05; CI = 1.001-1.09) and poverty (OR = 1.07; CI = 1.01-1.12). CONCLUSIONS This study introduces the application of high-dimensional, computational methods to large-scale, publicly available geospatial data. Although NWAS requires further testing, it is hypothesis-generating and addresses gaps in geospatial analysis related to empiric assessment. Further, NWAS could have broad implications for many diseases and future precision medicine studies focused on multilevel risk factors of disease.
Affiliation(s)
- Shannon M. Lynch
  - Fox Chase Cancer Center, Cancer Prevention and Control, Philadelphia, Pennsylvania, United States of America
- Nandita Mitra
  - University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, United States of America
- Michelle Ross
  - University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, United States of America
- Craig Newcomb
  - University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, United States of America
- Karl Dailey
  - University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, United States of America
- Tara Jackson
  - University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, United States of America
- Harold Riethman
  - Old Dominion University, Norfolk, Virginia, United States of America
- Charles C. Branas
  - University of Pennsylvania, Perelman School of Medicine, Philadelphia, Pennsylvania, United States of America
  - Columbia University, Mailman School of Public Health, New York, New York, United States of America
- Timothy R. Rebbeck
  - Dana Farber Cancer Institute and Harvard TH Chan School of Public Health, Boston, Massachusetts, United States of America
|
11
|
Jiang W, Yu W. Jointly determining significance levels of primary and replication studies by controlling the false discovery rate in two-stage genome-wide association studies. Stat Methods Med Res 2017; 27:2795-2808. [PMID: 28067114 DOI: 10.1177/0962280216687168] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/17/2022]
Abstract
In genome-wide association studies, we normally discover associations between genetic variants and diseases/traits in primary studies, and validate the findings in replication studies. We consider the associations identified in both primary and replication studies as true findings. An important question under this two-stage setting is how to determine significance levels in both studies. In traditional methods, significance levels of the primary and replication studies are determined separately. We argue that the separate determination strategy reduces the power of the overall two-stage study. Therefore, we propose a novel method to determine significance levels jointly. Our method is a reanalysis method that needs summary statistics from both studies. We find the most powerful significance levels when controlling the false discovery rate in the two-stage study. To benefit from the power improvement of the joint determination method, we need to select single nucleotide polymorphisms for replication at a less stringent significance level. This is a common practice in studies designed for discovery purposes. We suggest this practice is also suitable in studies with validation purposes in order to identify more true findings. Simulation experiments show that our method can provide more power than traditional methods and that the false discovery rate is well controlled. Empirical experiments on datasets of five diseases/traits demonstrate that our method can help identify more associations. The R package is available at: http://bioinformatics.ust.hk/RFdr.html.
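For readers unfamiliar with false discovery rate control, the following minimal Python sketch shows the standard single-stage Benjamini-Hochberg step-up procedure. It is not the joint two-stage method proposed in this abstract (that is implemented in the authors' R package); the function name and the p-values are hypothetical and serve only to illustrate what "controlling the false discovery rate" means in practice.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected at FDR level q."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= k * q / m.
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * q / m:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for i in order[:cutoff_rank]:
        rejected[i] = True
    return rejected


# Hypothetical p-values for ten SNPs (illustrative only).
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
example_rejected = benjamini_hochberg(example_p, q=0.05)
print(sum(example_rejected), "of", len(example_p), "hypotheses rejected")
```

Note that the step-up rule rejects everything up to the largest qualifying rank, so a p-value can be rejected even when it misses its own rank-specific threshold.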
Affiliation(s)
- Wei Jiang
  - Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
- Weichuan Yu
  - Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
|
12
|
Robertson DS, Prevost AT, Bowden J. Accounting for selection and correlation in the analysis of two-stage genome-wide association studies. Biostatistics 2016; 17:634-49. [PMID: 26993061 PMCID: PMC5031943 DOI: 10.1093/biostatistics/kxw012] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 06/23/2015] [Revised: 11/11/2015] [Accepted: 01/15/2016] [Indexed: 11/15/2022]
Abstract
The problem of selection bias has long been recognized in the analysis of two-stage trials, where promising candidates are selected in stage 1 for confirmatory analysis in stage 2. To efficiently correct for bias, uniformly minimum variance conditionally unbiased estimators (UMVCUEs) have been proposed for a wide variety of trial settings, but where the population parameter estimates are assumed to be independent. We relax this assumption and derive the UMVCUE in the multivariate normal setting with an arbitrary known covariance structure. One area of application is the estimation of odds ratios (ORs) when combining a genome-wide scan with a replication study. Our framework explicitly accounts for correlated single nucleotide polymorphisms, as might occur due to linkage disequilibrium. We illustrate our approach on the measurement of the association between 11 genetic variants and the risk of Crohn's disease, as reported in Parkes and others (2007. Sequence variants in the autophagy gene IRGM and multiple other replicating loci contribute to Crohn's disease susceptibility. Nat. Gen. 39: (7), 830-832.), and show that the estimated ORs can vary substantially if both selection and correlation are taken into account.
Affiliation(s)
- David S Robertson
  - MRC Biostatistics Unit, IPH Forvie Site, Robinson Way, Cambridge CB2 0SR, UK
- A Toby Prevost
  - Imperial College London, 1st Floor, Stadium House, 68 Wood Lane, London W12 7RH, UK
- Jack Bowden
  - MRC Integrative Epidemiology Unit, University of Bristol, Oakfield House, Bristol BS8 2BN, UK and MRC Biostatistics Unit, IPH Forvie Site, Robinson Way, Cambridge CB2 0SR, UK
|
13
|
Holland D, Wang Y, Thompson WK, Schork A, Chen CH, Lo MT, Witoelar A, Werge T, O'Donovan M, Andreassen OA, Dale AM. Estimating Effect Sizes and Expected Replication Probabilities from GWAS Summary Statistics. Front Genet 2016; 7:15. [PMID: 26909100 PMCID: PMC4754432 DOI: 10.3389/fgene.2016.00015] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 12/16/2015] [Accepted: 01/28/2016] [Indexed: 12/19/2022]
Abstract
Genome-wide Association Studies (GWAS) result in millions of summary statistics (“z-scores”) for single nucleotide polymorphism (SNP) associations with phenotypes. These rich datasets afford deep insights into the nature and extent of genetic contributions to complex phenotypes such as psychiatric disorders, which are understood to have substantial genetic components that arise from very large numbers of SNPs. The complexity of the datasets, however, poses a significant challenge to maximizing their utility. This is reflected in a need for better understanding the landscape of z-scores, as such knowledge would enhance causal SNP and gene discovery, help elucidate mechanistic pathways, and inform future study design. Here we present a parsimonious methodology for modeling effect sizes and replication probabilities, relying only on summary statistics from GWAS substudies, and a scheme allowing for direct empirical validation. We show that modeling z-scores as a mixture of Gaussians is conceptually appropriate, in particular taking into account ubiquitous non-null effects that are likely in the datasets due to weak linkage disequilibrium with causal SNPs. The four-parameter model allows for estimating the degree of polygenicity of the phenotype and predicting the proportion of chip heritability explainable by genome-wide significant SNPs in future studies with larger sample sizes. We apply the model to recent GWAS of schizophrenia (N = 82,315) and putamen volume (N = 12,596), with approximately 9.3 million SNP z-scores in both cases. We show that, over a broad range of z-scores and sample sizes, the model accurately predicts expectation estimates of true effect sizes and replication probabilities in multistage GWAS designs. We assess the degree to which effect sizes are over-estimated when based on linear-regression association coefficients. 
We estimate the polygenicity of schizophrenia to be 0.037 and the putamen to be 0.001, while the respective sample sizes required to approach fully explaining the chip heritability are 10^6 and 10^5. The model can be extended to incorporate prior knowledge such as pleiotropy and SNP annotation. The current findings suggest that the model is applicable to a broad array of complex phenotypes and will enhance understanding of their genetic architectures.
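The idea of modeling z-scores as a mixture of Gaussians can be illustrated with a deliberately simplified Python sketch. It assumes just two components, a null N(0, 1) and a non-null N(0, 1 + sigma2), and it does not reproduce the four-parameter model of the paper. The function names and the parameter values (`pi1`, `sigma2`) are hypothetical.

```python
import math


def normal_pdf(z, variance):
    """Density of a zero-mean normal distribution with the given variance."""
    return math.exp(-z * z / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)


def posterior_nonnull(z, pi1, sigma2):
    """Posterior probability that a z-score comes from the non-null component.

    pi1    -- assumed prior proportion of non-null SNPs (hypothetical)
    sigma2 -- assumed extra variance contributed by true effects (hypothetical)
    """
    null_part = (1.0 - pi1) * normal_pdf(z, 1.0)
    alt_part = pi1 * normal_pdf(z, 1.0 + sigma2)
    return alt_part / (null_part + alt_part)


# Illustrative parameters only; they are not estimates from any study.
for z in (0.0, 2.0, 4.0, 6.0):
    print(f"z = {z:.1f}: P(non-null | z) = {posterior_nonnull(z, pi1=0.01, sigma2=10.0):.4f}")
```

In this toy setting the posterior probability rises with |z| because the non-null component has heavier tails than the null.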
Affiliation(s)
- Dominic Holland
  - Multimodal Imaging Laboratory, University of California San Diego, La Jolla, CA, USA; Department of Neurosciences, University of California San Diego, La Jolla, CA, USA
- Yunpeng Wang
  - Multimodal Imaging Laboratory, University of California San Diego, La Jolla, CA, USA; Department of Neurosciences, University of California San Diego, La Jolla, CA, USA; NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway; Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
- Wesley K Thompson
  - Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
- Andrew Schork
  - Multimodal Imaging Laboratory, University of California San Diego, La Jolla, CA, USA; Department of Cognitive Sciences, University of California San Diego, La Jolla, CA, USA
- Chi-Hua Chen
  - Multimodal Imaging Laboratory, University of California San Diego, La Jolla, CA, USA; Department of Radiology, University of California San Diego, La Jolla, CA, USA
- Min-Tzu Lo
  - Multimodal Imaging Laboratory, University of California San Diego, La Jolla, CA, USA; Department of Radiology, University of California San Diego, La Jolla, CA, USA
- Aree Witoelar
  - NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway; Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
- Thomas Werge
  - Institute of Biological Psychiatry, MHC, Sct. Hans Hospital and University of Copenhagen, Copenhagen, Denmark
- Michael O'Donovan
  - MRC Centre for Neuropsychiatric Genetics and Genomics, School of Medicine, Cardiff University, Cardiff, UK
- Ole A Andreassen
  - NORMENT, KG Jebsen Centre for Psychosis Research, Institute of Clinical Medicine, University of Oslo, Oslo, Norway; Division of Mental Health and Addiction, Oslo University Hospital, Oslo, Norway
- Anders M Dale
  - Multimodal Imaging Laboratory, University of California San Diego, La Jolla, CA, USA; Department of Neurosciences, University of California San Diego, La Jolla, CA, USA; Department of Psychiatry, University of California San Diego, La Jolla, CA, USA; Department of Radiology, University of California San Diego, La Jolla, CA, USA
|
14
|
Minelli C, Dean CH, Hind M, Alves AC, Amaral AFS, Siroux V, Huikari V, Soler Artigas M, Evans DM, Loth DW, Bossé Y, Postma DS, Sin D, Thompson J, Demenais F, Henderson J, Bouzigon E, Jarvis D, Järvelin MR, Burney P. Association of Forced Vital Capacity with the Developmental Gene NCOR2. PLoS One 2016; 11:e0147388. [PMID: 26836265 PMCID: PMC4737618 DOI: 10.1371/journal.pone.0147388] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 08/28/2015] [Accepted: 01/04/2016] [Indexed: 12/31/2022]
Abstract
Background Forced Vital Capacity (FVC) is an important predictor of all-cause mortality in the absence of chronic respiratory conditions. Epidemiological evidence highlights the role of early life factors on adult FVC, pointing to environmental exposures and genes affecting lung development as risk factors for low FVC later in life. Although FVC is highly heritable, only a small number of genes have been found to be associated with it, and we aimed to identify further genetic variants by focusing on lung development genes. Methods Per-allele effects of 24,728 SNPs in 403 genes involved in lung development were tested in 7,749 adults from three studies (NFBC1966, ECRHS, EGEA). The most significant SNP for each of the top 25 genes was followed up in 46,103 adults (CHARGE and SpiroMeta consortia) and 5,062 children (ALSPAC). Associations were considered replicated if the replication p-value survived Bonferroni correction (p<0.002; 0.05/25), with a nominal p-value considered as suggestive evidence. For SNPs with evidence of replication, effects on the expression levels of nearby genes in lung tissue were tested in 1,111 lung samples (Lung eQTL consortium), with further functional investigation performed using public epigenomic profiling data (ENCODE). Results NCOR2-rs12708369 showed strong replication in children (p = 0.0002), with replication unavailable in adults due to low imputation quality. This intronic variant is in a strong transcriptional enhancer element in lung fibroblasts, but its eQTL effects could not be tested due to low imputation quality in the eQTL dataset. SERPINE2-rs6754561 replicated at nominal level in both adults (p = 0.036) and children (p = 0.045), while WNT16-rs2707469 replicated at nominal level only in adults (p = 0.026). The eQTL analyses showed association of WNT16-rs2707469 with expression levels of the nearby gene CPED1. We found no statistically significant eQTL effects for SERPINE2-rs6754561. 
Conclusions We have identified a new gene, NCOR2, in the retinoic acid signalling pathway pointing to a role of vitamin A metabolism in the regulation of FVC. Our findings also support SERPINE2, a COPD gene with weak previous evidence of association with FVC, and suggest WNT16 as a further promising candidate.
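The replication rule stated in this abstract (Bonferroni threshold 0.05/25 = 0.002, with nominal p < 0.05 treated as suggestive) can be applied to the reported p-values in a few lines of Python. This is only an illustrative sketch: the function name `classify` and the label strings are hypothetical, while the thresholds and p-values are those quoted in the abstract.

```python
# Bonferroni-corrected replication threshold from the abstract: 0.05 / 25 tests.
ALPHA = 0.05
N_TESTS = 25
BONFERRONI = ALPHA / N_TESTS  # 0.002


def classify(p_value):
    """Label a replication p-value following the abstract's decision rule."""
    if p_value < BONFERRONI:
        return "replicated"
    if p_value < ALPHA:
        return "suggestive"
    return "not replicated"


# Replication p-values quoted in the abstract.
reported = {
    ("NCOR2-rs12708369", "children"): 0.0002,
    ("SERPINE2-rs6754561", "adults"): 0.036,
    ("SERPINE2-rs6754561", "children"): 0.045,
    ("WNT16-rs2707469", "adults"): 0.026,
}

labels = {key: classify(p) for key, p in reported.items()}
for (snp, cohort), label in labels.items():
    print(f"{snp} ({cohort}): {label}")
```

Only the NCOR2 variant in children falls below the corrected threshold; the SERPINE2 and WNT16 results are nominal, which matches how the abstract describes them.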
Affiliation(s)
- Cosetta Minelli
  - Respiratory Epidemiology, Occupational Medicine and Public Health, National Heart and Lung Institute, Imperial College, London, United Kingdom
- Charlotte H. Dean
  - Leukocyte Biology, National Heart and Lung Institute, Imperial College London, London, United Kingdom
  - Mammalian Genetics Unit, MRC Harwell, Oxon, United Kingdom
- Matthew Hind
  - Respiratory Department, Royal Brompton and Harefield NHS Foundation Trust, London, United Kingdom
- Alexessander Couto Alves
  - Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, London, United Kingdom
- André F. S. Amaral
  - Respiratory Epidemiology, Occupational Medicine and Public Health, National Heart and Lung Institute, Imperial College, London, United Kingdom
  - MRC-PHE Centre for Environment & Health, London, United Kingdom
- Valerie Siroux
  - Univ. Grenoble Alpes, IAB, Team of Environmental Epidemiology applied to Reproduction and Respiratory Health, F-38000, Grenoble, France
  - INSERM, IAB, Team of Environmental Epidemiology applied to Reproduction and Respiratory Health, F-38000, Grenoble, France
  - CHU de Grenoble, IAB, Team of Environmental Epidemiology applied to Reproduction and Respiratory Health, F-38000, Grenoble, France
- María Soler Artigas
  - Genetic Epidemiology Group, Department of Health Sciences, University of Leicester, Leicester, United Kingdom
- David M. Evans
  - University of Queensland Diamantina Institute, Translational Research Institute, Brisbane, Australia
  - MRC Integrative Epidemiology Unit, University of Bristol, Bristol, United Kingdom
- Daan W. Loth
  - Department of Epidemiology, Erasmus Medical Center, Rotterdam, The Netherlands
- Yohan Bossé
  - Institut universitaire de cardiologie et de pneumologie de Québec, Department of Molecular Medicine, Laval University, Québec, Canada
- Dirkje S. Postma
  - Department of Pulmonary Diseases, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
- Don Sin
  - The University of British Columbia Center for Heart Lung Innovation, St-Paul’s Hospital, Vancouver, Canada
- John Thompson
  - Department of Health Sciences, University of Leicester, Leicester, United Kingdom
- Florence Demenais
  - INSERM, UMRS-946, Genetic Variation of Human Diseases Unit, Paris, France
  - Univ. Paris Diderot, Sorbonne Paris Cité, Institut Universitaire d’Hématologie, F-75007, Paris, France
- John Henderson
  - School of Social and Community Medicine, University of Bristol, Bristol, United Kingdom
- SpiroMeta consortium
  - SpiroMeta consortium, Genetic Epidemiology Group, Department of Health Sciences, University of Leicester, Leicester, United Kingdom
- CHARGE consortium
  - CHARGE consortium, Institutes of Health, Department of Health and Human Services, Research Triangle Park, North Carolina, United States of America
- Emmanuelle Bouzigon
  - INSERM, UMRS-946, Genetic Variation of Human Diseases Unit, Paris, France
  - Univ. Paris Diderot, Sorbonne Paris Cité, Institut Universitaire d’Hématologie, F-75007, Paris, France
- Deborah Jarvis
  - Respiratory Epidemiology, Occupational Medicine and Public Health, National Heart and Lung Institute, Imperial College, London, United Kingdom
  - MRC-PHE Centre for Environment & Health, London, United Kingdom
- Marjo-Riitta Järvelin
  - Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, London, United Kingdom
  - MRC-PHE Centre for Environment & Health, London, United Kingdom
  - Biocenter Oulu, University of Oulu, Oulu, Finland
  - Center for Life Course Epidemiology, Faculty of Medicine, P.O. Box 5000, FI-90014 University of Oulu, Oulu, Finland
  - Unit of Primary Care, Oulu University Hospital, Kajaanintie 50, P.O. Box 20, FI-90220, Oulu, 90029 OYS, Finland
- Peter Burney
  - Respiratory Epidemiology, Occupational Medicine and Public Health, National Heart and Lung Institute, Imperial College, London, United Kingdom
  - MRC-PHE Centre for Environment & Health, London, United Kingdom
|
15
|
Robust Association Tests for the Replication of Genome-Wide Association Studies. BioMed Research International 2015; 2015:461593. [PMID: 26345547 PMCID: PMC4539975 DOI: 10.1155/2015/461593] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2014] [Revised: 02/14/2015] [Accepted: 02/14/2015] [Indexed: 11/18/2022]
Abstract
In genome-wide association studies (GWAS), robust genetic association tests have been shown to provide better power under a wide range of underlying genetic models. These include the maximum of three Cochran-Armitage trend tests (CATTs) corresponding to the recessive, additive, and dominant genetic models (MAX3); the minimum of the p values of Pearson's chi-square test with 2 degrees of freedom and of the CATT based on the additive genetic model (MIN2); and the genetic model selection (GMS) and genetic model exclusion (GME) methods. In this paper, we demonstrate how these robust tests can be applied to the replication study of a GWAS and how the overall statistical significance can be evaluated using a combined test formed from the p values of the discovery and replication studies.
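As a rough illustration of the MAX3 statistic mentioned above, the Python sketch below computes the Cochran-Armitage trend test (CATT) statistic under recessive, additive, and dominant scores (0, x, 1) with x = 0, 0.5, and 1, and takes the largest absolute value. It uses the standard textbook form of the CATT and is not taken from the paper. The genotype counts are hypothetical, and the p-value for MAX3, which does not follow a standard normal distribution, is not computed here.

```python
import math


def catt_z(cases, controls, x):
    """Cochran-Armitage trend statistic for genotype scores (0, x, 1).

    cases, controls -- counts for genotypes carrying 0, 1, and 2 risk alleles
    x               -- heterozygote score: 0 recessive, 0.5 additive, 1 dominant
    """
    scores = (0.0, x, 1.0)
    n_cases, n_controls = sum(cases), sum(controls)
    n_total = n_cases + n_controls
    genotype_totals = [r + s for r, s in zip(cases, controls)]
    numerator = sum(w * (n_controls * r - n_cases * s)
                    for w, r, s in zip(scores, cases, controls))
    spread = (n_total * sum(w * w * n for w, n in zip(scores, genotype_totals))
              - sum(w * n for w, n in zip(scores, genotype_totals)) ** 2)
    return math.sqrt(n_total) * numerator / math.sqrt(n_cases * n_controls * spread)


def max3(cases, controls):
    """Maximum absolute CATT statistic over the three genetic models."""
    return max(abs(catt_z(cases, controls, x)) for x in (0.0, 0.5, 1.0))


# Hypothetical genotype counts (0, 1, 2 risk alleles); illustrative only.
example_cases = (120, 250, 130)
example_controls = (160, 240, 100)
print("MAX3 =", round(max3(example_cases, example_controls), 3))
```

Taking the maximum over the three models guards against choosing the wrong genetic model in advance, at the cost of a more involved null distribution.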
|
16
|
Verification of the Chromosome Region 9q21 Association with Pelvic Organ Prolapse Using RegulomeDB Annotations. BioMed Research International 2015; 2015:837904. [PMID: 26347886 PMCID: PMC4546950 DOI: 10.1155/2015/837904] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 06/12/2015] [Accepted: 07/28/2015] [Indexed: 12/16/2022]
Abstract
Pelvic organ prolapse (POP) is a common, highly disabling disorder with a large hereditary component. It is characterized by a loss of pelvic floor support that leads to the herniation of the uterus into or outside the vagina. Genome-wide linkage studies have shown evidence of POP association with the region 9q21 and six other loci in European pedigrees. The aim of our study was to test these associations in a case-control study in a Russian population. Twelve SNPs, including SNPs cited in the above studies and those selected using the RegulomeDB annotations for the region 9q21, were genotyped in 210 patients with POP (stages III-IV) and 292 controls without even minimal POP. Genotyping was performed using the polymerase chain reaction with confronting two-pair primers (PCR–CTPP). Association analyses were conducted for individual SNPs, 9q21 haplotypes, and SNP-SNP interactions. SNP rs12237222, with the highest RegulomeDB score (1a), appeared to be the key SNP in haplotypes associated with POP. Other RegulomeDB Category 1 SNPs, rs12551710 and rs2236479 (scores 1d and 1f, respectively), exhibited epistatic effects. In this study, we verified the region 9q21 association with POP in Russians, using RegulomeDB annotations.
|
17
|
Chen Z, Craiu RV, Bull SB. A note on the efficiencies of sampling strategies in two-stage Bayesian regional fine mapping of a quantitative trait. Genet Epidemiol 2014; 38:599-609. [PMID: 25132153 DOI: 10.1002/gepi.21845] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 12/21/2013] [Revised: 06/12/2014] [Accepted: 06/16/2014] [Indexed: 11/09/2022]
Abstract
In focused studies designed to follow up associations detected in a genome-wide association study (GWAS), investigators can proceed to fine-map a genomic region by targeted sequencing or dense genotyping of all variants in the region, aiming to identify a functional sequence variant. For the analysis of a quantitative trait, we consider a Bayesian approach to fine-mapping study design that incorporates stratification according to a promising GWAS tag SNP in the same region. Improved cost-efficiency can be achieved when the fine-mapping phase incorporates a two-stage design, with identification of a smaller set of more promising variants in a subsample taken in stage 1, followed by their evaluation in an independent stage 2 subsample. To avoid the potential negative impact of genetic model misspecification on inference we incorporate genetic model selection based on posterior probabilities for each competing model. Our simulation study shows that, compared to simple random sampling that ignores genetic information from GWAS, tag-SNP-based stratified sample allocation methods reduce the number of variants continuing to stage 2 and are more likely to promote the functional sequence variant into confirmation studies.
Affiliation(s)
- Zhijian Chen
  - Lunenfeld-Tanenbaum Research Institute of Mount Sinai Hospital, Toronto, Ontario, Canada
|
18
|
Abstract
The cost of next-generation sequencing is now approaching that of the first generation of genome-wide single-nucleotide genotyping panels, but this is still out of reach for large-scale epidemiologic studies with tens of thousands of subjects. Furthermore, the anticipated yield of millions of rare variants poses serious challenges for distinguishing causal from noncausal variants for disease. We explore the merits of using family-based designs for sequencing substudies to identify novel variants and prioritize them for their likelihood of causality. While the sharing of variants within families means that family-based designs may be less efficient for discovery than sequencing of a comparable number of unrelated individuals, the ability to exploit cosegregation of variants with disease within families helps distinguish causal from noncausal ones. We introduce a score test criterion for prioritizing discovered variants in terms of their likelihood of being functional. We compare the relative statistical efficiency of 2-stage versus 1-stage family-based designs by application to the Genetic Analysis Workshop 18 simulated sequence data.
Affiliation(s)
- Zhao Yang
  - Department of Preventive Medicine, University of Southern California, Los Angeles, CA 90089-9234, USA
- Duncan C Thomas
  - Department of Preventive Medicine, University of Southern California, Los Angeles, CA 90089-9234, USA
|
19
|
Thomas DC, Yang Z, Yang F. Two-phase and family-based designs for next-generation sequencing studies. Front Genet 2013; 4:276. [PMID: 24379824 PMCID: PMC3861783 DOI: 10.3389/fgene.2013.00276] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 09/28/2013] [Accepted: 11/19/2013] [Indexed: 12/21/2022]
Abstract
The cost of next-generation sequencing is now approaching that of early GWAS panels, but it is still out of reach for large epidemiologic studies, and the millions of rare variants expected pose challenges for distinguishing causal from non-causal variants. We review two types of designs for sequencing studies: two-phase designs for targeted follow-up of genome-wide association studies using unrelated individuals; and family-based designs exploiting co-segregation for prioritizing variants and genes. Two-phase designs subsample subjects for sequencing from a larger case-control study jointly on the basis of their disease and carrier status; the discovered variants are then tested for association in the parent study. The analysis combines the full sequence data from the substudy with the more limited SNP data from the main study. We discuss various methods for selecting this subset of variants and describe the expected yield of true positive associations in the context of an ongoing study of second breast cancers following radiotherapy. While the sharing of variants within families means that family-based designs are less efficient for discovery than sequencing unrelated individuals, the ability to exploit co-segregation of variants with disease within families helps distinguish causal from non-causal ones. Furthermore, by enriching for family history, the yield of causal variants can be improved, and use of identity-by-descent information improves imputation of genotypes for other family members. We compare the relative efficiency of these designs with those using unrelated individuals for discovering and prioritizing variants or genes for testing association in larger studies. While associations can be tested with single variants, power is low for rare ones. Recent generalizations of burden or kernel tests for gene-level associations to family-based data are appealing. These approaches are illustrated in the context of a family-based study of colorectal cancer.
Affiliation(s)
- Duncan C Thomas
  - Department of Preventive Medicine, University of Southern California, Los Angeles, CA, USA
- Zhao Yang
  - Department of Preventive Medicine, University of Southern California, Los Angeles, CA, USA
- Fan Yang
  - Department of Preventive Medicine, University of Southern California, Los Angeles, CA, USA
|
20
|
Sohns M, Viktorova E, Amos CI, Brennan P, Fehringer G, Gaborieau V, Han Y, Heinrich J, Chang-Claude J, Hung RJ, Müller-Nurasyid M, Risch A, Thomas D, Bickeböller H. Empirical hierarchical bayes approach to gene-environment interactions: development and application to genome-wide association studies of lung cancer in TRICL. Genet Epidemiol 2013; 37:551-559. [PMID: 23893921 DOI: 10.1002/gepi.21741] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 10/23/2012] [Revised: 04/15/2013] [Accepted: 05/09/2013] [Indexed: 11/06/2022]
Abstract
The analysis of gene-environment (G × E) interactions remains one of the greatest challenges in the post-genome-wide association study (GWAS) era. Recent methods constitute a compromise between the robust but underpowered case-control methods and the powerful case-only methods. Inferences of the latter are biased when the assumption of gene-environment (G-E) independence in controls fails. We propose a novel empirical hierarchical Bayes approach to G × E interaction (EHB-GE), which benefits from greater rank power while accounting for population-based G-E correlation. Building on Lewinger et al.'s ([2007] Genet Epidemiol 31:871-882) hierarchical Bayes prioritization approach, the method first obtains posterior G-E correlation estimates in controls for each marker, borrowing strength from G-E information across the genome. These posterior estimates are then subtracted from the corresponding case-only G × E estimates. We compared EHB-GE with rival methods using simulation. EHB-GE has similar or greater rank power to detect G × E interactions in the presence of large numbers of G-E correlations with weak to strong effects or only a low number of such correlations with large effect. When there are no or only a few weak G-E correlations, Murcray et al.'s method ([2009] Am J Epidemiol 169:219-226) identifies markers with low G × E interaction effects better. We applied EHB-GE and competing methods to four lung cancer case-control GWAS from the Interdisciplinary Research in Cancer of the Lung/International Lung Cancer Consortium with smoking as the environmental factor. A number of genes worth investigating were identified by the EHB-GE approach.
Affiliation(s)
- Melanie Sohns
  - Department of Genetic Epidemiology, University Medical Center, Georg-August University of Goettingen, Goettingen, Germany
- Elena Viktorova
  - Department of Genetic Epidemiology, University Medical Center, Georg-August University of Goettingen, Goettingen, Germany
- Christopher I Amos
  - Department of Epidemiology, University of Texas M.D. Anderson Cancer Center, Houston, TX, USA
- Paul Brennan
  - International Agency for Research on Cancer (IARC), Lyon, France
- Gord Fehringer
  - Prosserman Centre for Health Research, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, Ontario, Canada
- Younghun Han
  - Department of Epidemiology, University of Texas M.D. Anderson Cancer Center, Houston, TX, USA
- Joachim Heinrich
  - Institute of Epidemiology I, Helmholtz Zentrum München, Neuherberg, Germany
- Jenny Chang-Claude
  - Unit of Genetic Epidemiology, Division of Cancer Epidemiology, Deutsches Krebsforschungszentrum, Heidelberg, Germany
- Rayjean J Hung
  - Prosserman Centre for Health Research, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, Ontario, Canada
- Martina Müller-Nurasyid
  - Institute of Medical Informatics, Biometry and Epidemiology, Chair of Epidemiology and Chair of Genetic Epidemiology, Ludwig-Maximilians-Universität, Munich, Germany; Institute of Genetic Epidemiology, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany
- Angela Risch
  - Division of Epigenomics and Cancer Risk Factors, German Cancer Research Center, Heidelberg, Germany
- Duncan Thomas
  - Department of Preventive Medicine, University of Southern California, Los Angeles, CA, USA
- Heike Bickeböller
  - Department of Genetic Epidemiology, University Medical Center, Georg-August University of Goettingen, Goettingen, Germany
|
21
|
Thomas DC. Some surprising twists on the road to discovering the contribution of rare variants to complex diseases. Hum Hered 2013; 74:113-7. [PMID: 23594489 DOI: 10.1159/000347020] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/19/2022]
|
22
Cheung C, Thompson E, Wijsman E. GIGI: an approach to effective imputation of dense genotypes on large pedigrees. Am J Hum Genet 2013; 92:504-16. [PMID: 23561844 DOI: 10.1016/j.ajhg.2013.02.011] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.0] [Received: 12/10/2012] [Revised: 01/15/2013] [Accepted: 02/27/2013] [Indexed: 12/11/2022] Open
Abstract
Recent emergence of the common-disease-rare-variant hypothesis has renewed interest in the use of large pedigrees for identifying rare causal variants. Genotyping with modern sequencing platforms is increasingly common in the search for such variants but remains expensive and often is limited to only a few subjects per pedigree. In population-based samples, genotype imputation is widely used so that additional genotyping is not needed. We now introduce an analogous approach that enables computationally efficient imputation in large pedigrees. Our approach samples inheritance vectors (IVs) from a Markov Chain Monte Carlo sampler by conditioning on genotypes from a sparse set of framework markers. Missing genotypes are probabilistically inferred from these IVs along with observed dense genotypes that are available on a subset of subjects. We implemented our approach in the Genotype Imputation Given Inheritance (GIGI) program and evaluated the approach on both simulated and real large pedigrees. With a real pedigree, we also compared imputed results obtained from this approach with those from the population-based imputation program BEAGLE. We demonstrated that our pedigree-based approach imputes many alleles with high accuracy. It is much more accurate for calling rare alleles than is population-based imputation and does not require an outside reference sample. We also evaluated the effect of varying other parameters, including the marker type and density of the framework panel, threshold for calling genotypes, and population allele frequencies. By leveraging information from existing genotypes already assayed on large pedigrees, our approach can facilitate cost-effective use of sequence data in the pursuit of rare causal variants.
23
Rotondi MA, Bull SB. Cumulative meta-analysis for genetic association: when is a new study worthwhile? Hum Hered 2012; 74:61-70. [PMID: 23258221 DOI: 10.1159/000345604] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 05/31/2012] [Accepted: 10/30/2012] [Indexed: 11/19/2022] Open
Abstract
OBJECTIVES In this paper, we address the questions: how large a sample size would be required to show genome-wide significance between a single nucleotide polymorphism (SNP) and a genetic trait in a meta-analysis of a newly planned study together with the existing ones? Or alternatively: will a planned study of size n be able to provide evidence of a genetic association when this study is combined with a current meta-analysis? METHODS We examine the potential impact of a newly planned genetic study on an existing meta-analysis through the use of a simulation-based algorithm. The proposed approach provides an empirical estimate of the power of the updated meta-analysis to detect genome-wide significance (p < 5.0 × 10⁻⁸) of a complex trait and each of a set of specific SNPs of interest or the expected p value of the updated meta-analysis including the current and proposed studies. RESULTS This technique is illustrated in the context of an updated meta-analysis of case-control studies in Paget's disease. A second example illustrates the impact of adding a newly planned study to a large meta-analysis of SNP associations with human height. CONCLUSIONS The proposed algorithm is particularly useful for the design of studies to assess a selected set of high-priority SNP associations that are 'nearly' significant in meta-analysis of existing studies. The results may help investigators decide whether an updated meta-analysis is likely to achieve genome-wide significance.
Affiliation(s)
- Michael A Rotondi
- School of Kinesiology and Health Science, York University, and Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, ON, Canada.
24
Abstract
BACKGROUND Risk factors for diastolic dysfunction in hypertrophic cardiomyopathy (HCM) are poorly understood. We investigated the association of variants in hypoxia-response genes with phenotype severity in pediatric HCM. METHODS A total of 80 unrelated patients <21 y and 14 related members from eight families with HCM were genotyped for six variants associated with vascular endothelial growth factor A (VEGFA) downregulation, or hypoxia-inducible factor A (HIF1A) upregulation. Associations between risk genotypes and left-ventricular (LV) hypertrophy, LV dysfunction, and freedom from myectomy were assessed. Tissue expression was measured in myocardial samples from 17 patients with HCM and 20 patients without HCM. RESULTS Age at enrollment was 9 ± 5 y (follow-up, 3.1 ± 3.6 y). Risk allele frequency was 67% VEGFA and 92% HIF1A. Risk genotypes were associated with younger age at diagnosis (P < 0.001), septal hypertrophy (P < 0.01), prolonged E-wave deceleration time (EWDT) (P < 0.0001) and isovolumic relaxation time (IVRT) (P < 0.0001), and lower freedom from myectomy (P < 0.05). These associations were seen in sporadic and familial HCM independent of the disease-causing mutation. Risk genotypes were associated with higher myocardial HIF1A and transforming growth factor B1 (TGFB1) expression and increased endothelial-fibroblast transformation (P < 0.05). CONCLUSION HIF1A-upregulation and/or VEGFA-downregulation genotypes were associated with more severe septal hypertrophy and diastolic dysfunction and may provide genetic markers to improve risk prediction in HCM.
25
Gögele M, Minelli C, Thakkinstian A, Yurkiewich A, Pattaro C, Pramstaller PP, Little J, Attia J, Thompson JR. Methods for meta-analyses of genome-wide association studies: critical assessment of empirical evidence. Am J Epidemiol 2012; 175:739-49. [PMID: 22427610 DOI: 10.1093/aje/kwr385] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Indexed: 12/16/2022] Open
Abstract
There has been a steep increase in the number of meta-analyses of genome-wide association (GWA) studies aimed at identifying genetic variants with increasingly smaller effects, but pressure to publish findings of new genetic associations has limited the time available for careful consideration of all of their methodological aspects. The authors surveyed the literature (2007-2010) to provide empirical evidence on the methods used in GWA meta-analyses, including their organization, requirements about the uniformity of methods used in primary studies, methods for data pooling, investigation of between-study heterogeneity, and quality of reporting. This review showed that a great variety of methods are being used, but the rationale for their choice is often unclear. It also highlights how important methodological aspects have received insufficient attention, potentially leading to missed opportunities for improving gene discovery and characterization. Evaluation of power to replicate findings was inadequate, and the number of variants selected for replication was not associated with replication sample size. A low proportion of GWA meta-analyses investigated the presence and magnitude of heterogeneity, even when there was little uniformity in methods used in primary studies. More methodological work is required before clear guidance can be offered as to optimal methods or tradeoffs between alternative methods. However, there is a clear need for guidelines for reporting the results of GWA meta-analyses.
Affiliation(s)
- Martin Gögele
- Center for Biomedicine, European Academy of Bozen/Bolzano (EURAC), Viale Druso 1, 39100 Bolzano, Italy
26
Chen Z, Craiu RV, Bull SB. Two-Phase Stratified Sampling Designs for Regional Sequencing. Genet Epidemiol 2012; 36:320-32. [DOI: 10.1002/gepi.21624] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 04/29/2011] [Revised: 01/16/2012] [Accepted: 01/17/2012] [Indexed: 12/12/2022]
Affiliation(s)
- Zhijian Chen
- Samuel Lunenfeld Research Institute of Mount Sinai Hospital; Toronto ON; Canada
- Radu V. Craiu
- Department of Statistics; University of Toronto; Toronto ON; Canada
27
Thymic stromal lymphopoietin: an immune cytokine gene associated with the metabolic syndrome and blood pressure in severe obesity. Clin Sci (Lond) 2012; 123:99-109. [DOI: 10.1042/cs20110584] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 11/17/2022]
Abstract
A previous expression profiling of VAT (visceral adipose tissue) revealed that the TSLP (thymic stromal lymphopoietin) gene was less expressed in severely obese men with (n=7) compared with without (n=7) the MetS (metabolic syndrome). We hypothesized that TSLP SNPs (single nucleotide polymorphisms) are associated with TSLP gene expression in VAT and with MetS phenotypes. Following validation of lower TSLP expression (P=0.003) in VAT of severely obese men and women with (n=70) compared with without (n=60) the MetS, a detailed genetic investigation was performed at the TSLP locus by sequencing its promoter, exons and intron–exon splicing boundaries using DNA of 25 severely obese subjects. Five tagging SNPs were genotyped in the 130 subjects from the expression analysis to test whether these SNPs contributed to TSLP expression variability (ANOVAs) and then genotyped in two independent samples of severely obese men (total, n=389) and women (total, n=894). In a sex-stratified multistage experimental design, ANOVAs were performed to test whether tagging SNPs were associated with MetS components treated as continuous variables. We observed that the non-coding SNP rs2289277 was associated with TSLP mRNA abundance (P=0.04), as well as with SBP [systolic BP (blood pressure)] (P=0.004) and DBP (diastolic BP) (P=0.0003) in men when adjusting for age, waist circumference, smoking and medication treating hypertension. These novel observations suggest that TSLP expression in VAT may partly explain the inter-individual variability for metabolic impairments in the presence of obesity and that specific SNPs (rs2289277 and/or correlating SNPs) may influence TSLP gene expression as well as BP in obese men.
28
Yang F, Thomas DC. Two-stage design of sequencing studies for testing association with rare variants. Hum Hered 2011; 71:209-20. [PMID: 21734405 DOI: 10.1159/000328193] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 12/01/2010] [Accepted: 03/31/2011] [Indexed: 01/28/2023] Open
Abstract
Multiple rare variants have been suggested as accounting for some of the associations with common single nucleotide polymorphisms identified in genome-wide association studies or possibly some of the as yet undiscovered heritability. We consider the power of various approaches to designing substudies aimed at using next-generation sequencing technologies to discover novel variants and to select some subsets that are possibly causal for genotyping in the original case-control study and testing for association using various weighted sum indices. We find that the selection of variants based on the statistical significance of the case-control difference in the subsample yields good power for testing rare variant indices in the main study, and that multivariate models including both the summary index of rare variants and the associated common single nucleotide polymorphisms can distinguish which is the causal factor. By simulation, we explore the effects of varying the size of the discovery subsample, choice of index, and true causal model.
Affiliation(s)
- Fan Yang
- Department of Preventive Medicine, University of Southern California, Los Angeles, USA
29
Edwards TL, Song Z, Li C. Enriching targeted sequencing experiments for rare disease alleles. Bioinformatics 2011; 27:2112-8. [PMID: 21700677 DOI: 10.1093/bioinformatics/btr324] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/12/2022]
Abstract
MOTIVATION Next-generation targeted resequencing of genome-wide association study (GWAS)-associated genomic regions is a common approach for follow-up of indirect association of common alleles. However, it is prohibitively expensive to sequence all the samples from a well-powered GWAS study with sufficient depth of coverage to accurately call rare genotypes. As a result, many studies may use next-generation sequencing for single nucleotide polymorphism (SNP) discovery in a smaller number of samples, with the intent to genotype candidate SNPs with rare alleles captured by resequencing. This approach is reasonable, but may be inefficient for rare alleles if samples are not carefully selected for the resequencing experiment. RESULTS We have developed a probability-based approach, SampleSeq, to select samples for a targeted resequencing experiment that increases the yield of rare disease alleles substantially over random sampling of cases or controls or sampling based on genotypes at associated SNPs from GWAS data. This technique allows for smaller sample sizes for resequencing experiments, or allows the capture of rarer risk alleles. When following up multiple regions, SampleSeq selects subjects with an even representation of all the regions. SampleSeq also can be used to calculate the sample size needed for the resequencing to increase the chance of successful capture of rare alleles of desired frequencies. SOFTWARE http://biostat.mc.vanderbilt.edu/SampleSeq
Affiliation(s)
- Todd L Edwards
- Vanderbilt Epidemiology Center, Division of Epidemiology, Department of Medicine, Vanderbilt University, Nashville, TN 37203, USA
30
Van Steen K. Perspectives on genome-wide multi-stage family-based association studies. Stat Med 2011; 30:2201-21. [DOI: 10.1002/sim.4259] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 02/16/2010] [Accepted: 03/07/2011] [Indexed: 01/03/2023]
31
Graves KD, Peshkin BN, Luta G, Tuong W, Schwartz MD. Interest in genetic testing for modest changes in breast cancer risk: implications for SNP testing. Public Health Genomics 2011; 14:178-89. [PMID: 21464556 DOI: 10.1159/000324703] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.4] [Received: 04/27/2010] [Accepted: 01/26/2011] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND Advances in genomics may eventually lead to 'personalized genetic medicine,' yet the clinical utility of predictive testing for modest changes in risk is unclear. We explored interest in genetic testing for genes related to modest changes in breast cancer risk in women at moderate to high risk for breast cancer. METHODS Women (n = 105) with a negative breast biopsy and ≥1 relative with breast or ovarian cancer completed telephone surveys. We measured demographic and psychosocial variables and, following presentation of hypothetical scenarios of genetic tests for lower-penetrance breast cancer gene mutations, assessed interest in willingness to pay for and comprehension of test results. We used logistic regression models with generalized estimating equations to evaluate combinations of risk level, cost and behavioral modifiers. RESULTS Many women (77%) reported 'definite' interest in genetic testing, with greater interest in tests that conveyed more risk and cost less. Behavioral modifiers of risk (taking a vitamin; diet/exercise), having a regular physician, greater perceived benefits of genetic testing, and greater cancer worry also influenced interest. Most participants (63%) did not understand relative vs. absolute risk. Women with less understanding reported more cancer worry and greater willingness to pay for testing. CONCLUSION Interest in genetic testing for mutations related to modest changes in risk was high, modified by both test and psychosocial factors. Findings highlight the need for education about benefits and risks of testing for mutations that convey modest changes in risk, particularly given the current lack of clinical validity/utility and availability of direct-to-consumer genetic testing.
Affiliation(s)
- K D Graves
- Department of Oncology, Lombardi Comprehensive Cancer Center, Georgetown University, Washington, DC 20007, USA. kdg9@georgetown.edu
32
Thomas D. Methods for investigating gene-environment interactions in candidate pathway and genome-wide association studies. Annu Rev Public Health 2010; 31:21-36. [PMID: 20070199 DOI: 10.1146/annurev.publhealth.012809.103619] [Citation(s) in RCA: 121] [Impact Index Per Article: 8.6] [Indexed: 01/11/2023]
Abstract
Despite the considerable enthusiasm about the yield of novel and replicated discoveries of genetic associations from the new generation of genome-wide association studies (GWAS), the proportion of the heritability of most complex diseases that have been studied to date remains small. Some of this "dark matter" could be due to gene-environment (G x E) interactions or more complex pathways involving multiple genes and exposures. We review the basic epidemiologic study design and statistical analysis approaches to studying G x E interactions individually and then consider more comprehensive approaches to studying entire pathways or GWAS data. In addition to the usual issues in genetic association studies, particular care is needed in exposure assessment, and very large sample sizes are required. Although hypothesis-driven, pathway-based and agnostic GWA study approaches are generally viewed as opposite poles, we suggest that the two can be usefully married using hierarchical modeling strategies that exploit external pathway knowledge in mining genome-wide data.
Affiliation(s)
- Duncan Thomas
- Department of Preventive Medicine, University of Southern California, Los Angeles, California, 90089-9011, USA.
33
Abstract
Despite the yield of recent genome-wide association (GWA) studies, the identified variants explain only a small proportion of the heritability of most complex diseases. This unexplained heritability could be partly due to gene-environment (G×E) interactions or more complex pathways involving multiple genes and exposures. This Review provides a tutorial on the available epidemiological designs and statistical analysis approaches for studying specific G×E interactions and choosing the most appropriate methods. I discuss the approaches that are being developed for studying entire pathways and available techniques for mining interactions in GWA data. I also explore methods for marrying hypothesis-driven pathway-based approaches with 'agnostic' GWA studies.
Affiliation(s)
- Duncan Thomas
- Medicine, University of Southern California, 1540 Alcazar Street, CHP‑220, Los Angeles, California 90089‑9011, USA.
34
Abstract
False positive findings are a common problem in whole genome association studies. In this commentary we show that nothing is gained by randomly splitting a data sample into two equal-sized subsets, where the first subset is used for exploratory purposes and the other is used to confirm the findings in the first. We compare the random splitting procedure with using the full data sample for analysis, taking a Bayesian perspective that accounts for the prior probability of a false positive finding.
35
Zheng G, Joo J, Zaykin D, Wu C, Geller N. Robust Tests in Genome-Wide Scans under Incomplete Linkage Disequilibrium. Stat Sci 2009. [DOI: 10.1214/09-sts314] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 11/19/2022]
36
Abstract
Replication helps ensure that a genotype-phenotype association observed in a genome-wide association (GWA) study represents a credible association and is not a chance finding or an artifact due to uncontrolled biases. We discuss prerequisites for exact replication; issues of heterogeneity; advantages and disadvantages of different methods of data synthesis across multiple studies; frequentist vs. Bayesian inferences for replication; and challenges that arise from multi-team collaborations. While consistent replication can greatly improve the credibility of a genotype-phenotype association, it may not eliminate spurious associations due to biases shared by many studies. Conversely, lack of replication in well-powered follow-up studies usually invalidates the initially proposed association, although occasionally it may point to differences in linkage disequilibrium or effect modifiers across studies.
Affiliation(s)
- Peter Kraft
- Departments of Epidemiology and Biostatistics, Harvard School of Public Health, Boston, MA, USA