1
Sun Q, Rowland BT, Chen J, Mikhaylova AV, Avery C, Peters U, Lundin J, Matise T, Buyske S, Tao R, Mathias RA, Reiner AP, Auer PL, Cox NJ, Kooperberg C, Thornton TA, Raffield LM, Li Y. Improving polygenic risk prediction in admixed populations by explicitly modeling ancestral-differential effects via GAUDI. Nat Commun 2024; 15:1016. [PMID: 38310129 PMCID: PMC10838303 DOI: 10.1038/s41467-024-45135-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2022] [Accepted: 01/16/2024] [Indexed: 02/05/2024] Open
Abstract
Polygenic risk scores (PRS) have shown successes in clinics, but most PRS methods focus only on participants with distinct primary continental ancestry without accommodating recently-admixed individuals with mosaic continental ancestry backgrounds for different segments of their genomes. Here, we develop GAUDI, a novel penalized-regression-based method specifically designed for admixed individuals. GAUDI explicitly models ancestry-differential effects while borrowing information across segments with shared ancestry in admixed genomes. We demonstrate marked advantages of GAUDI over other methods through comprehensive simulation and real data analyses for traits with associated variants exhibiting ancestral-differential effects. Leveraging data from the Women's Health Initiative study, we show that GAUDI improves PRS prediction of white blood cell count and C-reactive protein in African Americans by > 64% compared to alternative methods, and even outperforms PRS-CSx with large European GWAS for some scenarios. We believe GAUDI will be a valuable tool to mitigate disparities in PRS performance in admixed individuals.
Affiliation(s)
- Quan Sun
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Bryce T Rowland
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Jiawen Chen
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Anna V Mikhaylova
  - Department of Biostatistics, University of Washington, Seattle, WA, 98195, USA
- Christy Avery
  - Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Ulrike Peters
  - Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, WA, 98109, USA
- Jessica Lundin
  - Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, WA, 98109, USA
- Tara Matise
  - Department of Genetics, Rutgers University, New Brunswick, NJ, 08901, USA
- Steve Buyske
  - Department of Statistics, Rutgers University, New Brunswick, NJ, 08901, USA
- Ran Tao
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- Rasika A Mathias
  - Department of Medicine, Johns Hopkins University, Baltimore, MD, 21287, USA
- Alexander P Reiner
  - Department of Epidemiology, University of Washington, Seattle, WA, 98195, USA
- Paul L Auer
  - Division of Biostatistics, Institute for Health and Equity, and Cancer Center, Medical College of Wisconsin, Milwaukee, WI, 53226, USA
- Nancy J Cox
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
  - Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- Charles Kooperberg
  - Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, WA, 98109, USA
- Timothy A Thornton
  - Department of Biostatistics, University of Washington, Seattle, WA, 98195, USA
- Laura M Raffield
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Yun Li
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
2
Shi M, Tanikawa C, Munter HM, Akiyama M, Koyama S, Tomizuka K, Matsuda K, Lathrop GM, Terao C, Koido M, Kamatani Y. Genotype imputation accuracy and the quality metrics of the minor ancestry in multi-ancestry reference panels. Brief Bioinform 2023; 25:bbad509. [PMID: 38221906 PMCID: PMC10788679 DOI: 10.1093/bib/bbad509] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Revised: 11/20/2023] [Accepted: 12/13/2023] [Indexed: 01/16/2024] Open
Abstract
Large-scale imputation reference panels are currently available and have contributed to efficient genome-wide association studies through genotype imputation. However, whether large-size multi-ancestry or small-size population-specific reference panels are the optimal choices for under-represented populations continues to be debated. We imputed genotypes of East Asian (180k Japanese) subjects using the Trans-Omics for Precision Medicine reference panel and found that the standard imputation quality metric (Rsq) overestimated dosage r² (squared correlation between imputed dosage and true genotype) particularly in marginal-quality bins. Variance component analysis of Rsq revealed that the increased imputed-genotype certainty (dosages closer to 0, 1 or 2) caused upward bias, indicating some systemic bias in the imputation. Through systematic simulations using different template switching rates (θ value) in the hidden Markov model, we revealed that the lower θ value increased the imputed-genotype certainty and Rsq; however, dosage r² was insensitive to the θ value, thereby causing a deviation. In simulated reference panels with different sizes and ancestral diversities, the θ value estimates from Minimac decreased with the size of a single ancestry and increased with the ancestral diversity. Thus, Rsq could be deviated from dosage r² for a subpopulation in the multi-ancestry panel, and the deviation represents different imputed-dosage distributions. Finally, despite the impact of the θ value, distant ancestries in the reference panel contributed only a few additional variants passing a predefined Rsq threshold. We conclude that the θ value substantially impacts the imputed dosage and the imputation quality metric value.
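As an illustration of the dosage r² definition quoted in this abstract (squared correlation between imputed dosage and true genotype), here is a minimal Python sketch. The function name `dosage_r2` and the toy data are hypothetical and are not taken from the paper or from any imputation software; the sketch only restates the definition and does not reproduce the Rsq metric.

```python
from statistics import mean


def dosage_r2(imputed_dosages, true_genotypes):
    """Squared Pearson correlation between imputed dosages and true genotypes.

    Hypothetical helper for illustration only. Dosages are expected in [0, 2];
    true genotypes are 0, 1, or 2 copies of the alternate allele.
    """
    if len(imputed_dosages) != len(true_genotypes):
        raise ValueError("inputs must have the same length")
    mx = mean(imputed_dosages)
    my = mean(true_genotypes)
    sxy = sum((x - mx) * (y - my) for x, y in zip(imputed_dosages, true_genotypes))
    sxx = sum((x - mx) ** 2 for x in imputed_dosages)
    syy = sum((y - my) ** 2 for y in true_genotypes)
    if sxx == 0 or syy == 0:
        # Correlation is undefined when either vector has no variance.
        return float("nan")
    return (sxy * sxy) / (sxx * syy)


# Toy example (made-up numbers): six samples at one variant.
truth = [0, 1, 2, 1, 0, 2]
imputed = [0.1, 0.9, 1.8, 1.2, 0.3, 1.7]
print(dosage_r2(imputed, truth))
```

Because this quantity needs true genotypes, it can only be computed where sequenced or directly genotyped calls are available for the imputed samples, which is why a reference-free metric such as Rsq is used in practice.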
Affiliation(s)
- Mingyang Shi
  - Laboratory of Complex Trait Genomics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Chizu Tanikawa
  - Laboratory of Clinical Genome Sequencing, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Hans Markus Munter
  - Victor Phillip Dahdaleh Institute of Genomic Medicine, McGill University, Montreal, Québec, Canada
- Masato Akiyama
  - Department of Ocular Pathology and Imaging Science, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Satoshi Koyama
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Kohei Tomizuka
  - Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Koichi Matsuda
  - Laboratory of Clinical Genome Sequencing, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Gregory Mark Lathrop
  - Victor Phillip Dahdaleh Institute of Genomic Medicine, McGill University, Montreal, Québec, Canada
- Chikashi Terao
  - Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Masaru Koido
  - Laboratory of Complex Trait Genomics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
  - Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Yoichiro Kamatani
  - Laboratory of Complex Trait Genomics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
  - Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
3
Sun Q, Broadaway KA, Edmiston SN, Fajgenbaum K, Miller-Fleming T, Westerkam LL, Melendez-Gonzalez M, Bui H, Blum FR, Levitt B, Lin L, Hao H, Harris KM, Liu Z, Thomas NE, Cox NJ, Li Y, Mohlke KL, Sayed CJ. Genetic Variants Associated With Hidradenitis Suppurativa. JAMA Dermatol 2023; 159:930-938. [PMID: 37494057 PMCID: PMC10372759 DOI: 10.1001/jamadermatol.2023.2217] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/21/2022] [Accepted: 04/25/2023] [Indexed: 07/27/2023]
Abstract
Importance: Hidradenitis suppurativa (HS) is a common and severely morbid chronic inflammatory skin disease that is reported to be highly heritable. However, the genetic understanding of HS is insufficient, and limited genome-wide association studies (GWASs) have been performed for HS, which have not identified significant risk loci. Objective: To identify genetic variants associated with HS and to shed light on the underlying genes and genetic mechanisms. Design, Setting, and Participants: This genetic association study recruited 753 patients with HS in the HS Program for Research and Care Excellence (HS ProCARE) at the University of North Carolina Department of Dermatology from August 2018 to July 2021. A GWAS was performed for 720 patients (after quality control) with controls from the Add Health study and then meta-analyzed with 2 large biobanks, UK Biobank (247 cases) and FinnGen (673 cases). Variants at 3 loci were tested for replication in the BioVU biobank (290 cases). Data analysis was performed from September 2021 to December 2022. Main Outcomes and Measures: Main outcome measures are loci identified, with association of P < 1 × 10⁻⁸ considered significant. Results: A total of 753 patients were recruited, with 720 included in the analysis. Mean (SD) age at symptom onset was 20.3 (10.57) years and at enrollment was 35.3 (13.52) years; 360 (50.0%) patients were Black, and 575 (79.7%) were female. In a meta-analysis of the 4 studies, 2 HS-associated loci were identified and replicated, with lead variants rs10512572 (P = 2.3 × 10⁻¹¹) and rs17090189 (P = 2.1 × 10⁻⁸) near the SOX9 and KLF5 genes, respectively. Variants at these loci are located in enhancer regulatory elements detected in skin tissue. Conclusions and Relevance: In this genetic association study, common variants associated with HS located near the SOX9 and KLF5 genes were associated with risk of HS. These or other nearby genes may be associated with genetic risk of disease and the development of clinical features, such as cysts, comedones, and inflammatory tunnels, that are unique to HS. New insights into disease pathogenesis related to these genes may help predict disease progression and novel treatment approaches in the future.
Affiliation(s)
- Quan Sun
  - Department of Biostatistics, University of North Carolina at Chapel Hill
- Sharon N. Edmiston
  - Department of Dermatology, University of North Carolina at Chapel Hill School of Medicine
  - Lineberger Comprehensive Cancer Center, Chapel Hill, North Carolina
- Kristen Fajgenbaum
  - Department of Dermatology, University of North Carolina at Chapel Hill School of Medicine
- Tyne Miller-Fleming
  - Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee
- Linnea Lackstrom Westerkam
  - Department of Dermatology, University of North Carolina at Chapel Hill School of Medicine
  - University of North Carolina at Chapel Hill School of Medicine
- Helen Bui
  - Department of Internal Medicine, University of North Carolina at Chapel Hill School of Medicine
- Brandt Levitt
  - Carolina Population Center, University of North Carolina at Chapel Hill
- Lan Lin
  - Department of Dermatology, University of North Carolina at Chapel Hill School of Medicine
- Honglin Hao
  - Department of Dermatology, University of North Carolina at Chapel Hill School of Medicine
- Kathleen Mullan Harris
  - Carolina Population Center, University of North Carolina at Chapel Hill
  - Sociology Department, University of North Carolina at Chapel Hill
- Zhi Liu
  - Department of Dermatology, University of North Carolina at Chapel Hill School of Medicine
  - Lineberger Comprehensive Cancer Center, Chapel Hill, North Carolina
- Nancy E. Thomas
  - Department of Dermatology, University of North Carolina at Chapel Hill School of Medicine
  - Carolina Population Center, University of North Carolina at Chapel Hill
- Nancy J. Cox
  - Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee
- Yun Li
  - Department of Biostatistics, University of North Carolina at Chapel Hill
  - Department of Genetics, University of North Carolina at Chapel Hill
- Karen L. Mohlke
  - Department of Genetics, University of North Carolina at Chapel Hill
- Christopher J. Sayed
  - Department of Dermatology, University of North Carolina at Chapel Hill School of Medicine
4
Shen S, Li Z, Jiang Y, Duan W, Li H, Du S, Esteller M, Shen H, Hu Z, Zhao Y, Christiani DC, Chen F. A Large-Scale Exome-Wide Association Study Identifies Novel Germline Mutations in Lung Cancer. Am J Respir Crit Care Med 2023; 208:280-289. [PMID: 37167549 PMCID: PMC10395715 DOI: 10.1164/rccm.202212-2199oc] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/04/2022] [Accepted: 05/11/2023] [Indexed: 05/13/2023] Open
Abstract
Rationale: Genome-wide association studies have identified common variants of lung cancer. However, the contribution of rare exome-wide variants, especially protein-coding variants, to cancers remains largely unexplored. Objectives: To evaluate the role of human exomes in genetic predisposition to lung cancer. Methods: We performed exome-wide association studies to detect the association of exomes with lung cancer in 30,312 patients and 652,902 control subjects. A scalable and accurate implementation of a generalized mixed model was used to detect the association signals for loss-of-function, missense, and synonymous variants and gene-level sets. Furthermore, we performed association and Bayesian colocalization analyses to evaluate their relationships with intermediate exposures. Measurements and Main Results: We systematically analyzed 216,739 single-nucleotide variants in the human exome. The loss-of-function variants exhibited the most notable effects on lung cancer risk. We identified four novel variants, including two missense variants (rs202197044TET3 [Pmeta (P values of meta-analysis) = 3.60 × 10⁻⁸] and rs202187871POT1 [Pmeta = 2.21 × 10⁻⁸]) and two synonymous variants (rs7447927TMEM173 [Pmeta = 1.32 × 10⁻⁹] and rs140624366ATRN [Pmeta = 2.97 × 10⁻⁹]). rs202197044TET3 was significantly associated with emphysema (odds ratio, 3.55; Pfdr = 0.015), whereas rs7447927POT1 was strongly associated with telomere length (β = 1.08; Pfdr (FDR corrected P value) = 3.76 × 10⁻⁵³). Functional evidence of expression of quantitative trait loci, splicing quantitative trait loci, and isoform expression was found for the four novel genes. Gene-level association tests identified several novel genes, including POT1 (protection of telomeres 1), RTEL1, BSG, and ZNF232. Conclusions: Our findings provide insights into the genetic architecture of human exomes and their role in lung cancer predisposition.
Affiliation(s)
- Sipeng Shen
  - Department of Biostatistics and
  - Jiangsu Key Lab of Cancer Biomarkers, Prevention, and Treatment, Jiangsu Collaborative Innovation Center for Cancer Personalized Medicine
  - China International Cooperation Center of Environment and Human Health
- Weiwei Duan
  - Department of Bioinformatics, School of Biomedical Engineering and Informatics, and
- Sha Du
  - Department of Biostatistics and
- Manel Esteller
  - Josep Carreras Leukaemia Research Institute, Barcelona, Spain
  - Centro de Investigacion Biomedica en Red Cancer, Madrid, Spain
  - Institucio Catalana de Recerca i Estudis Avançats, Barcelona, Spain
  - Physiological Sciences Department, School of Medicine and Health Sciences, University of Barcelona, Barcelona, Spain
- Hongbing Shen
  - Department of Epidemiology, Center for Global Health, School of Public Health
  - Jiangsu Key Lab of Cancer Biomarkers, Prevention, and Treatment, Jiangsu Collaborative Innovation Center for Cancer Personalized Medicine
- Zhibin Hu
  - Department of Epidemiology, Center for Global Health, School of Public Health
  - Jiangsu Key Lab of Cancer Biomarkers, Prevention, and Treatment, Jiangsu Collaborative Innovation Center for Cancer Personalized Medicine
- Yang Zhao
  - Department of Biostatistics and
  - Key Laboratory of Biomedical Big Data, Nanjing Medical University, Nanjing, China
- David C. Christiani
  - Department of Environmental Health, Harvard T.H. Chan School of Public Health, Harvard University, Boston, Massachusetts; and
  - Pulmonary and Critical Care Division, Massachusetts General Hospital, Department of Medicine, Harvard Medical School, Boston, Massachusetts
- Feng Chen
  - Department of Biostatistics and
  - Jiangsu Key Lab of Cancer Biomarkers, Prevention, and Treatment, Jiangsu Collaborative Innovation Center for Cancer Personalized Medicine
  - China International Cooperation Center of Environment and Human Health
5
Zhou YH, Gallins PJ, Pace RG, Dang H, Aksit MA, Blue EE, Buckingham KJ, Collaco JM, Faino AV, Gordon WW, Hetrick KN, Ling H, Liu W, Onchiri FM, Pagel K, Pugh EW, Raraigh KS, Rosenfeld M, Sun Q, Wen J, Li Y, Corvol H, Strug LJ, Bamshad MJ, Blackman SM, Cutting GR, Gibson RL, O’Neal WK, Wright FA, Knowles MR. Genetic Modifiers of Cystic Fibrosis Lung Disease Severity: Whole-Genome Analysis of 7,840 Patients. Am J Respir Crit Care Med 2023; 207:1324-1333. [PMID: 36921087 PMCID: PMC10595435 DOI: 10.1164/rccm.202209-1653oc] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2022] [Accepted: 02/27/2023] [Indexed: 03/17/2023] Open
Abstract
Rationale: Lung disease is the major cause of morbidity and mortality in persons with cystic fibrosis (pwCF). Variability in CF lung disease has substantial non-CFTR (CF transmembrane conductance regulator) genetic influence. Identification of genetic modifiers has prognostic and therapeutic importance. Objectives: Identify genetic modifier loci and genes/pathways associated with pulmonary disease severity. Methods: Whole-genome sequencing data on 4,248 unique pwCF with pancreatic insufficiency and lung function measures were combined with imputed genotypes from an additional 3,592 patients with pancreatic insufficiency from the United States, Canada, and France. This report describes association of approximately 15.9 million SNPs using the quantitative Kulich normal residual mortality-adjusted (KNoRMA) lung disease phenotype in 7,840 pwCF using premodulator lung function data. Measurements and Main Results: Testing included common and rare SNPs, transcriptome-wide association, gene-level, and pathway analyses. Pathway analyses identified novel associations with genes that have key roles in organ development, and we hypothesize that these genes may relate to dysanapsis and/or variability in lung repair. Results confirmed and extended previous genome-wide association study findings. These whole-genome sequencing data provide finely mapped genetic information to support mechanistic studies. No novel primary associations with common single variants or rare variants were found. Multilocus effects at chr5p13 (SLC9A3/CEP72) and chr11p13 (EHF/APIP) were identified. Variant effect size estimates at associated loci were consistently ordered across the cohorts, indicating possible age or birth cohort effects. Conclusions: This premodulator genomic, transcriptomic, and pathway association study of 7,840 pwCF will facilitate mechanistic and postmodulator genetic studies and the development of novel therapeutics for CF lung disease.
Affiliation(s)
- Yi-Hui Zhou
  - Bioinformatics Research Center
  - Department of Biological Sciences, and
- Rhonda G. Pace
  - Marsico Lung Institute/UNC CF Research Center, School of Medicine
- Hong Dang
  - Marsico Lung Institute/UNC CF Research Center, School of Medicine
- Elizabeth E. Blue
  - Brotman Baty Institute for Precision Medicine, Seattle, Washington
  - Division of Medical Genetics, Department of Medicine
- Anna V. Faino
  - Children’s Core for Biostatistics, Epidemiology and Analytics in Research and
- Kurt N. Hetrick
  - Department of Genetic Medicine, Center for Inherited Disease Research, and
- Hua Ling
  - Department of Genetic Medicine, Center for Inherited Disease Research, and
- Kymberleigh Pagel
  - The Institute for Computational Medicine, The Johns Hopkins University, Baltimore, Maryland
- Elizabeth W. Pugh
  - Department of Genetic Medicine, Center for Inherited Disease Research, and
- Margaret Rosenfeld
  - Department of Pediatrics, and
  - Center for Clinical and Translational Research, Seattle Children’s Research Institute, Seattle, Washington
- Yun Li
  - Department of Biostatistics
  - Department of Genetics, and
  - Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Harriet Corvol
  - Pediatric Pulmonary Department, Assistance Publique-Hôpitaux de Paris, Hôpital Trousseau, Paris, France
  - Centre de Recherche Saint Antoine, Sorbonne Université, Institut National de la Santé et de la Recherche Médicale, Paris, France
- Lisa J. Strug
  - Division of Biostatistics, Dalla Lana School of Public Health
  - Department of Statistical Sciences, and
  - Department of Computer Science, University of Toronto, Toronto, Ontario, Canada; and
  - Program in Genetics and Genome Biology and
  - The Center for Applied Genomics, The Hospital for Sick Children, Toronto, Ontario, Canada
- Michael J. Bamshad
  - Brotman Baty Institute for Precision Medicine, Seattle, Washington
  - Division of Genetic Medicine, Department of Pediatrics
  - Department of Genome Sciences, University of Washington, Seattle, Washington
- Scott M. Blackman
  - McKusick-Nathans Department of Genetic Medicine
  - Division of Pediatric Endocrinology, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Ronald L. Gibson
  - Department of Pediatrics, and
  - Center for Clinical and Translational Research, Seattle Children’s Research Institute, Seattle, Washington
- Wanda K. O’Neal
  - Marsico Lung Institute/UNC CF Research Center, School of Medicine
- Fred A. Wright
  - Bioinformatics Research Center
  - Department of Biological Sciences, and
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina