1
Cahoon JL, Rui X, Tang E, Simons C, Langie J, Chen M, Lo YC, Chiang CWK. Imputation accuracy across global human populations. Am J Hum Genet 2024; 111:979-989. [PMID: 38604166 PMCID: PMC11080279 DOI: 10.1016/j.ajhg.2024.03.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 03/14/2024] [Accepted: 03/15/2024] [Indexed: 04/13/2024] Open
Abstract
Genotype imputation is now fundamental for genome-wide association studies but lacks fairness due to the underrepresentation of references from non-European ancestries. The state-of-the-art imputation reference panel released by the Trans-Omics for Precision Medicine (TOPMed) initiative improved the imputation of admixed African-ancestry and Hispanic/Latino samples, but imputation for populations primarily residing outside of North America may still fall short in performance due to persisting underrepresentation. To illustrate this point, we imputed the genotypes of over 43,000 individuals across 123 populations around the world and identified numerous populations where imputation accuracy paled in comparison to that of European-ancestry populations. For instance, the mean imputation r-squared (Rsq) for variants with minor allele frequencies between 1% and 5% in Saudi Arabians (n = 1,061), Vietnamese (n = 1,264), Thai (n = 2,435), and Papua New Guineans (n = 776) were 0.79, 0.78, 0.76, and 0.62, respectively, compared to 0.90-0.93 for comparable European populations matched in sample size and SNP array content. Outside of Africa and Latin America, Rsq appeared to decrease as genetic distances to European-ancestry reference increased, as predicted. Using sequencing data as ground truth, we also showed that Rsq may over-estimate imputation accuracy for non-European populations more than European populations, suggesting further disparity in accuracy between populations. Using 1,496 sequenced individuals from Taiwan Biobank as a second reference panel to TOPMed, we also assessed a strategy to improve imputation for non-European populations with meta-imputation, but this design did not improve accuracy across frequency spectra. Taken together, our analyses suggest that we must ultimately strive to increase diversity and size to promote equity within genetics research.
Affiliation(s)
- Jordan L Cahoon
- Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, Los Angeles, CA 90033, USA; Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, Los Angeles, CA 90089, USA; Department of Computer Science, University of Southern California, Los Angeles, Los Angeles, CA 90089, USA
- Xinyue Rui
- Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, Los Angeles, CA 90033, USA
- Echo Tang
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, Los Angeles, CA 90089, USA
- Christopher Simons
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, Los Angeles, CA 90089, USA
- Jalen Langie
- Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, Los Angeles, CA 90033, USA
- Minhui Chen
- Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, Los Angeles, CA 90033, USA
- Ying-Chu Lo
- Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, Los Angeles, CA 90033, USA
- Charleston W K Chiang
- Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, Los Angeles, CA 90033, USA; Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, Los Angeles, CA 90089, USA; Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, Los Angeles, CA 90033, USA.
2
Wiley LK, Shortt JA, Roberts ER, Lowery J, Kudron E, Lin M, Mayer D, Wilson M, Brunetti TM, Chavan S, Phang TL, Pozdeyev N, Lesny J, Wicks SJ, Moore ET, Morgenstern JL, Roff AN, Shalowitz EL, Stewart A, Williams C, Edelmann MN, Hull M, Patton JT, Axell L, Ku L, Lee YM, Jirikowic J, Tanaka A, Todd E, White S, Peterson B, Hearst E, Zane R, Greene CS, Mathias R, Coors M, Taylor M, Ghosh D, Kahn MG, Brooks IM, Aquilante CL, Kao D, Rafaels N, Crooks KR, Hess S, Barnes KC, Gignoux CR. Building a vertically integrated genomic learning health system: The biobank at the Colorado Center for Personalized Medicine. Am J Hum Genet 2024; 111:11-23. [PMID: 38181729 PMCID: PMC10806731 DOI: 10.1016/j.ajhg.2023.12.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2022] [Revised: 11/30/2023] [Accepted: 12/01/2023] [Indexed: 01/07/2024] Open
Abstract
Precision medicine initiatives across the globe have led to a revolution of repositories linking large-scale genomic data with electronic health records, enabling genomic analyses across the entire phenome. Many of these initiatives focus solely on research insights, leading to limited direct benefit to patients. We describe the biobank at the Colorado Center for Personalized Medicine (CCPM Biobank) that was jointly developed by the University of Colorado Anschutz Medical Campus and UCHealth to serve as a unique, dual-purpose research and clinical resource accelerating personalized medicine. This living resource currently has more than 200,000 participants with ongoing recruitment. We highlight the clinical, laboratory, regulatory, and HIPAA-compliant informatics infrastructure along with our stakeholder engagement, consent, recontact, and participant engagement strategies. We characterize aspects of genetic and geographic diversity unique to the Rocky Mountain region, the primary catchment area for CCPM Biobank participants. We leverage linked health and demographic information of the CCPM Biobank participant population to demonstrate the utility of the CCPM Biobank to replicate complex trait associations in the first 33,674 genotyped individuals across multiple disease domains. Finally, we describe our current efforts toward return of clinical genetic test results, including high-impact pathogenic variants and pharmacogenetic information, and our broader goals as the CCPM Biobank continues to grow. Bringing clinical and research interests together fosters unique clinical and translational questions that can be addressed from the large EHR-linked CCPM Biobank resource within a HIPAA- and CLIA-certified environment.
Affiliation(s)
- Laura K Wiley
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; Department of Biomedical Informatics, University of Colorado School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Jonathan A Shortt
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; Department of Biomedical Informatics, University of Colorado School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Emily R Roberts
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Jan Lowery
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; University of Colorado Cancer Center, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; Department of Community and Behavioral Health, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Elizabeth Kudron
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; Department of Biomedical Informatics, University of Colorado School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; Department of Pediatrics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Meng Lin
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; Department of Biomedical Informatics, University of Colorado School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- David Mayer
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; Department of Biomedical Informatics, University of Colorado School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Melissa Wilson
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; Department of Biomedical Informatics, University of Colorado School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Tonya M Brunetti
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Sameer Chavan
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Tzu L Phang
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Nikita Pozdeyev
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; Department of Biomedical Informatics, University of Colorado School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; Division of Endocrinology, Diabetes and Metabolism, Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Joseph Lesny
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Stephen J Wicks
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Ethan T Moore
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Joshua L Morgenstern
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Alanna N Roff
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Elise L Shalowitz
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Adrian Stewart
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Cole Williams
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Michelle N Edelmann
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Madelyne Hull
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- J Tacker Patton
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Lisen Axell
- CU Cancer Center, Hereditary Cancer Clinic, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Lisa Ku
- CU Cancer Center, Hereditary Cancer Clinic, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Yee Ming Lee
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; Department of Clinical Pharmacy, University of Colorado Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Emily Todd
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; UCHealth, Aurora, CO 80045, USA
- Brett Peterson
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Richard Zane
- UCHealth, Aurora, CO 80045, USA; University of Colorado School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Casey S Greene
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; Department of Biomedical Informatics, University of Colorado School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Rasika Mathias
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Marilyn Coors
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Matthew Taylor
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; Division of Cardiology, University of Colorado School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Debashis Ghosh
- Department of Biostatistics and Informatics, Colorado School of Public Health, Aurora, CO 80045, USA
- Michael G Kahn
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Ian M Brooks
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; Department of Biomedical Informatics, University of Colorado School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Christina L Aquilante
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; Department of Pharmaceutical Sciences, University of Colorado Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- David Kao
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; Division of Cardiology, University of Colorado School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; CARE Innovation Center, UCHealth, Aurora, CO 80045, USA
- Nicholas Rafaels
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Kristy R Crooks
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; Department of Pathology, University of Colorado School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Kathleen C Barnes
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA.
- Christopher R Gignoux
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; Department of Biomedical Informatics, University of Colorado School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA.
3
Childebayeva A, Zavala EI. Review: Computational analysis of human skeletal remains in ancient DNA and forensic genetics. iScience 2023; 26:108066. [PMID: 37927550 PMCID: PMC10622734 DOI: 10.1016/j.isci.2023.108066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2023] Open
Abstract
Degraded DNA is used to answer questions in the fields of ancient DNA (aDNA) and forensic genetics. While aDNA studies typically center around human evolution and past history, and forensic genetics is often more concerned with identifying a specific individual, scientists in both fields face similar challenges. The overlap in source material has prompted periodic discussions and studies on the advantages of collaboration between fields toward mutually beneficial methodological advancements. However, most have been centered around wet laboratory methods (sampling, DNA extraction, library preparation, etc.). In this review, we focus on the computational side of the analytical workflow. We discuss limitations and considerations that apply when working with degraded DNA. We hope this review provides researchers new to computational workflows with a framework for thinking about the analysis of highly degraded DNA and prompts increased collaboration between the forensic genetics and aDNA fields.
Affiliation(s)
- Ainash Childebayeva
- Department of Archaeogenetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Department of Anthropology, University of Kansas, Lawrence, KS, USA
- Elena I. Zavala
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Department of Biology, University of Oregon, Eugene, OR, USA
4
Bandres-Ciga S, Faghri F, Majounie E, Koretsky MJ, Kim J, Levine KS, Leonard H, Makarious MB, Iwaki H, Crea PW, Hernandez DG, Arepalli S, Billingsley K, Lohmann K, Klein C, Lubbe SJ, Jabbari E, Saffie-Awad P, Narendra D, Reyes-Palomares A, Quinn JP, Schulte C, Morris HR, Traynor BJ, Scholz SW, Houlden H, Hardy J, Dumanis S, Riley E, Blauwendraat C, Singleton A, Nalls M, Jeff J, Vitale D. NeuroBooster Array: A Genome-Wide Genotyping Platform to Study Neurological Disorders Across Diverse Populations. medRxiv 2023:2023.11.06.23298176. [PMID: 37986980 PMCID: PMC10659467 DOI: 10.1101/2023.11.06.23298176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2023]
Abstract
Genome-wide genotyping platforms have the capacity to capture genetic variation across different populations, but there have been disparities in the representation of population-dependent genetic diversity. The motivation for pursuing this endeavor was to create a comprehensive genome-wide array capable of encompassing a wide range of neuro-specific content for the Global Parkinson's Genetics Program (GP2) and the Center for Alzheimer's and Related Dementias (CARD). CARD aims to increase diversity in genetic studies, using this array as a tool to foster inclusivity. GP2 is the first supported resource project of the Aligning Science Across Parkinson's (ASAP) initiative that aims to support a collaborative global effort aimed at significantly accelerating the discovery of genetic factors contributing to Parkinson's disease and atypical parkinsonism by generating genome-wide data for over 200,000 individuals in a multi-ancestry context. Here, we present the Illumina NeuroBooster array (NBA), a novel, high-throughput and cost-effective custom-designed content platform to screen for genetic variation in neurological disorders across diverse populations. The NBA contains a backbone of 1,914,934 variants (Infinium Global Diversity Array) complemented with custom content of 95,273 variants implicated in over 70 neurological conditions or traits with potential neurological complications. Furthermore, the platform includes over 10,000 tagging variants to facilitate imputation and analyses of neurodegenerative disease-related GWAS loci across diverse populations. The NBA can identify low frequency variants and accurately impute over 15 million common variants from the latest release of the TOPMed Imputation Server as of August 2023 (reference of over 300 million variants and 90,000 participants). We envisage this valuable tool will standardize genetic studies in neurological disorders across different ancestral groups, allowing researchers to perform genetic research inclusively and at a global scale.
Affiliation(s)
- Sara Bandres-Ciga
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Faraz Faghri
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Data Tecnica International, Washington, DC 20037, USA
- Mathew J Koretsky
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Jeffrey Kim
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD 20892, USA
- Kristin S Levine
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Data Tecnica International, Washington, DC 20037, USA
- Hampton Leonard
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Data Tecnica International, Washington, DC 20037, USA
- Mary B Makarious
- Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD 20892, USA
- Department of Clinical and Movement Neurosciences, Queen Square Institute of Neurology, University College London, London, UK
- Hirotaka Iwaki
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Data Tecnica International, Washington, DC 20037, USA
- Peter Wild Crea
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD 20892, USA
- Dena G Hernandez
- Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD 20892, USA
- Sampath Arepalli
- Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD 20892, USA
- Kimberley Billingsley
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD 20892, USA
- Katja Lohmann
- Institute of Neurogenetics, University of Lübeck, Lübeck, Germany
- Christine Klein
- Institute of Neurogenetics, University of Lübeck, Lübeck, Germany
- Steven J Lubbe
- Ken and Ruth Davee Department of Neurology, Northwestern University, Feinberg School of Medicine, Chicago, Illinois, USA
- Simpson Querrey Center for Neurogenetics, Northwestern University, Feinberg School of Medicine, Chicago, Illinois, USA
- Edwin Jabbari
- Department of Clinical and Movement Neurosciences, Queen Square Institute of Neurology, University College London, London, UK
- Paula Saffie-Awad
- Programa de Pós-Graduação em Ciências Médicas, Universidade Federal do Rio Grande do Sul, Porto Alegre, Brazil
- Centro de Trastornos del Movimiento (CETRAM), Santiago, Chile
- Clínica Santa María, Santiago, Chile
- Derek Narendra
- Inherited Movement Disorders Unit, Neurogenetics Branch, Division of Intramural Research, National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, Maryland, USA
- Armando Reyes-Palomares
- Department of Molecular Biology and Biochemistry, Faculty of Sciences, University of Málaga, Málaga, Spain
- John P Quinn
- Department of Pharmacology & Therapeutics, University of Liverpool, Liverpool, UK
- Claudia Schulte
- Department for Neurodegenerative Diseases, Hertie Institute for Clinical Brain Research, University of Tuebingen and German Center for Neurodegenerative Diseases, University of Tuebingen, Tuebingen, Germany
- Huw R Morris
- Department of Clinical and Movement Neurosciences, Queen Square Institute of Neurology, University College London, London, UK
- Aligning Science Against Parkinson’s (ASAP) Collaborative Research Network, Chevy Chase, MD, 20815, USA
- Bryan J. Traynor
- Department of Neurology, Johns Hopkins University Medical Center, Baltimore, MD, USA
- Neuromuscular Diseases Research Section, Laboratory of Neurogenetics, National Institute on Aging, Bethesda, MD, USA
- Sonja W. Scholz
- Neurodegenerative Diseases Research Unit, National Institute of Neurological Disorders and Stroke, Bethesda, MD, USA
- Henry Houlden
- Aligning Science Against Parkinson’s (ASAP) Collaborative Research Network, Chevy Chase, MD, 20815, USA
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, London, UK
- John Hardy
- Aligning Science Against Parkinson’s (ASAP) Collaborative Research Network, Chevy Chase, MD, 20815, USA
- Department of Neurodegenerative Disease, UCL Queen Square Institute of Neurology, London, UK
- Sonya Dumanis
- Aligning Science Against Parkinson’s (ASAP) Collaborative Research Network, Chevy Chase, MD, 20815, USA
- Ekemini Riley
- Aligning Science Against Parkinson’s (ASAP) Collaborative Research Network, Chevy Chase, MD, 20815, USA
- Cornelis Blauwendraat
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD 20892, USA
- Andrew Singleton
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD 20892, USA
- Mike Nalls
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Data Tecnica International, Washington, DC 20037, USA
- Dan Vitale
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Data Tecnica International, Washington, DC 20037, USA
5
Haring B, Hunt RP, Shadyab AH, Eaton C, Kaplan R, Martin LW, Panjrath G, Kuller LH, Assimes T, Kooperberg C, Wassertheil-Smoller S. Cardiovascular Disease and Mortality in Black Women Carrying the Amyloidogenic V122I Transthyretin Gene Variant. JACC Heart Fail 2023; 11:1189-1199. [PMID: 36930136 PMCID: PMC10508305 DOI: 10.1016/j.jchf.2023.02.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/20/2022] [Revised: 02/08/2023] [Accepted: 02/15/2023] [Indexed: 03/07/2023]
Abstract
BACKGROUND Long-term data on cardiovascular disease (CVD) and mortality in female carriers of the transthyretin (TTR) V122I (pV142I) variant, one of the most common variants of hereditary transthyretin cardiac amyloidosis, are sparse and the effects of blood pressure, heart rate, body mass index, and physical activity on CVD outcomes remain largely unknown.
OBJECTIVES The aim was to first examine the relationship of TTR V122I (pV142I) carrier status with CVD and mortality and second to investigate the effects of blood pressure, heart rate, body mass index, and physical activity in a large cohort of postmenopausal women.
METHODS The study population consisted of 9,862 non-Hispanic Black/African American women, 9,529 noncarriers and 333 TTR V122I carriers, enrolled in the Women's Health Initiative at 40 centers in the United States. Women were generally healthy and postmenopausal at the time of enrollment (1993-1998). CVD was defined as a composite endpoint consisting of coronary heart disease, stroke, acute heart failure or CVD death, and all-cause mortality. CVD cases were based on self-reported annual mailed health updates. All information was centrally adjudicated by trained physicians. HRs and 95% CIs were obtained from adjusted Cox proportional hazards models.
RESULTS Among 9,862 Black female participants (mean age: 62 years [IQR: 56-67 years]), the population frequency of the TTR V122I variant was 3.4% (333 variant carriers and 9,529 noncarriers). During a mean follow-up of 16.1 years (IQR: 9.7-22.2 years), incident CVD occurred in 2,229 noncarriers and 96 carriers, whereas 2,689 noncarriers and 108 carriers died. In adjusted models including demographic, lifestyle, and medical history covariates, TTR V122I carriers were at higher risk of the composite endpoint CVD (HR: 1.52; 95% CI: 1.22-1.88), acute heart failure (HR: 2.21; 95% CI: 1.53-3.18), coronary heart disease (HR: 1.80; 95% CI: 1.30-2.47), CVD death (HR: 1.70; 95% CI: 1.26-2.30), and all-cause mortality (HR: 1.28; 95% CI: 1.04-1.56). The authors found a significant interaction by age but not by blood pressure, heart rate, body mass index, or physical activity.
CONCLUSIONS Black female TTR V122I (pV142I) carriers have a higher CVD and all-cause mortality risk compared to noncarriers. In case of clinical suspicion of amyloidosis, they should be screened for TTR V122I (pV142I) carrier status to ensure early treatment onset.
Affiliation(s)
- Bernhard Haring
- Department of Medicine III, Saarland University Hospital, Homburg, Saarland, Germany; Department of Medicine I, University of Würzburg, Würzburg, Bavaria, Germany; Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, New York, USA.
- Rebecca P Hunt
- Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, Washington, USA
- Aladdin H Shadyab
- Herbert Wertheim School of Public Health and Human Longevity Science, University of California-San Diego, La Jolla, California, USA
- Charles Eaton
- Center for Primary Care and Prevention, Department of Family Medicine, Department of Epidemiology, Warren Alpert Medical School of Brown University, Brown University School of Public Health, Providence, Rhode Island, USA
- Robert Kaplan
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, New York, USA; Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, Washington, USA
- Lisa Warsinger Martin
- Division of Cardiology, George Washington University School of Medicine and Health Sciences, Washington, District of Columbia, USA
| | - Gurusher Panjrath
- Division of Cardiology, George Washington University School of Medicine and Health Sciences, Washington, District of Columbia, USA
| | - Lewis H Kuller
- Department of Epidemiology, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
| | - Themistocles Assimes
- Department of Medicine, Stanford University School of Medicine, Stanford, California, USA; VA Palo Alto Health Care System, Palo Alto, California, USA
| | - Charles Kooperberg
- Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, Washington, USA
| | - Sylvia Wassertheil-Smoller
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, New York, USA
|
6
|
Huffman JE, Nicolas J, Hahn J, Heath AS, Raffield LM, Yanek LR, Brody JA, Thibord F, Almasy L, Bartz TM, Bielak LF, Bowler RP, Carrasquilla GD, Chasman DI, Chen MH, Emmert DB, Ghanbari M, Haessle J, Hottenga JJ, Kleber ME, Le NQ, Lee J, Lewis JP, Li-Gao R, Luan J, Malmberg A, Mangino M, Marioni RE, Martinez-Perez A, Pankratz N, Polasek O, Richmond A, Rodriguez BA, Rotter JI, Steri M, Suchon P, Trompet S, Weiss S, Zare M, Auer P, Cho MH, Christofidou P, Davies G, de Geus E, Deleuze JF, Delgado GE, Ekunwe L, Faraday N, Gögele M, Greinacher A, He G, Howard T, Joshi PK, Kilpeläinen TO, Lahti J, Linneberg A, Naitza S, Noordam R, Paüls-Vergés F, Rich SS, Rosendaal FR, Rudan I, Ryan KA, Souto JC, van Rooij FJ, Wang H, Zhao W, Becker LC, Beswick A, Brown MR, Cade BE, Campbell H, Cho K, Crapo JD, Curran JE, de Maat MP, Doyle M, Elliott P, Floyd JS, Fuchsberger C, Grarup N, Guo X, Harris SE, Hou L, Kolcic I, Kooperberg C, Menni C, Nauck M, O'Connell JR, Orrù V, Psaty BM, Räikkönen K, Smith JA, Soria JM, Stott DJ, van Hylckama Vlieg A, Watkins H, Willemsen G, Wilson P, Ben-Shlomo Y, Blangero J, Boomsma D, Cox SR, Dehghan A, Eriksson JG, Fiorillo E, Fornage M, Hansen T, Hayward C, Ikram MA, Jukema JW, Kardia SL, Lange LA, März W, Mathias RA, Mitchell BD, Mook-Kanamori DO, Morange PE, Pedersen O, Pramstaller PP, Redline S, Reiner A, Ridker PM, Silverman EK, Spector TD, Völker U, Wareham N, Wilson JF, Yao J, Trégouët DA, Johnson AD, Wolberg AS, de Vries PS, Sabater-Lleal M, Morrison AC, Smith NL. Whole genome analysis of plasma fibrinogen reveals population-differentiated genetic regulators with putative liver roles. medRxiv 2023:2023.06.07.23291095. [PMID: 37398003 PMCID: PMC10312878 DOI: 10.1101/2023.06.07.23291095] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
Genetic studies have identified numerous regions associated with plasma fibrinogen levels in Europeans, yet missing heritability and limited inclusion of non-Europeans necessitate further studies with improved power and sensitivity. Compared with array-based genotyping, whole genome sequencing (WGS) data provide better coverage of the genome and better representation of non-European variants. To better understand the genetic landscape regulating plasma fibrinogen levels, we meta-analyzed WGS data from the NHLBI's Trans-Omics for Precision Medicine (TOPMed) program (n=32,572) with array-based genotype data from the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium (n=131,340) imputed to the TOPMed or Haplotype Reference Consortium panel. We identified 18 loci that have not been identified in prior genetic studies of fibrinogen. Of these, four are driven by common variants of small effect with reported MAF at least 10% higher in African populations. Three signals (SERPINA1, ZFP36L2, and TLR10) contain predicted deleterious missense variants. Two loci, SOCS3 and HPN, each harbor two conditionally distinct, non-coding variants. The gene region encoding the protein chain subunits (FGG; FGB; FGA) contains 7 distinct signals, including one novel signal driven by rs28577061, a variant common (MAF=0.180) in African reference panels but extremely rare (MAF=0.008) in Europeans. Through phenome-wide association studies in the VA Million Veteran Program, we found associations between fibrinogen polygenic risk scores and thrombotic and inflammatory disease phenotypes, including an association with gout. Our findings demonstrate the utility of WGS to augment genetic discovery in diverse populations and offer new insights for putative mechanisms of fibrinogen regulation.
Key Points: Largest and most diverse genetic study of plasma fibrinogen identifies 54 regions (18 novel), housing 69 conditionally distinct variants (20 novel). Sufficient power achieved to identify signal driven by African population variant. Links to (1) liver enzyme, blood cell and lipid genetic signals, (2) liver regulatory elements, and (3) thrombotic and inflammatory disease.
|
7
|
Assessing effectiveness of many-objective evolutionary algorithms for selection of tag SNPs. PLoS One 2022; 17:e0278560. [PMID: 36480538 PMCID: PMC9731481 DOI: 10.1371/journal.pone.0278560] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/11/2021] [Accepted: 11/19/2022] [Indexed: 12/13/2022]
Abstract
BACKGROUND Studies on genome-wide associations help to determine the cause of many genetic diseases. Genome-wide associations typically focus on associations between single-nucleotide polymorphisms (SNPs). Genotyping every SNP in a chromosomal region for identifying genetic variation is computationally very expensive. A representative subset of SNPs, called tag SNPs, can be used to identify genetic variation. A small set of tag SNPs saves computation time on the genotyping platform; however, a small set may contain missing data or genotyping errors. This study aims to solve the tag SNP selection problem using many-objective evolutionary algorithms. METHODS Tag SNP selection can be viewed as an optimization problem with trade-offs between objectives, e.g., minimizing the number of tag SNPs and maximizing tolerance for missing data. In this study, the tag SNP selection problem is formulated as a many-objective problem. Two many-objective evolutionary algorithms, the Nondominated Sorting based Genetic Algorithm (NSGA-III) and the Multi-Objective Evolutionary Algorithm based on Decomposition (MOEA/D), have been applied and investigated for optimal tag SNP selection. This study also investigates different initialization methods, such as greedy and random initialization. RESULTS The evaluation measures used for comparing results across algorithms are Hypervolume, Range, SumMin, MinSum, Tolerance rate, and Average Hamming distance. Overall, the MOEA/D algorithm gives superior results compared to the other algorithms in most cases. NSGA-III outperforms NSGA-II and other compared algorithms on maximum tolerance rate, and SPEA2 outperforms all algorithms on average Hamming distance. CONCLUSION Experimental results show that the performance of our proposed many-objective algorithms is much superior to the results of existing methods. The outcomes show the advantages of greedy initialization over random initialization using NSGA-III, SPEA2, and MOEA/D to solve tag SNP selection as a many-objective optimization problem.
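The trade-off described in the abstract above (fewer tag SNPs versus higher tolerance for missing data) can be illustrated with a plain Pareto-dominance filter. This is only a minimal sketch of the nondominated-sorting idea that underlies algorithms such as NSGA-III, not the authors' implementation; the candidate solutions, the two-objective simplification, and the function names `dominates` and `pareto_front` are assumptions made for illustration.

```python
from typing import List, Tuple

# Each candidate is (number_of_tag_snps, tolerance_rate).
# All values below are made up for illustration.
Candidate = Tuple[int, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on both objectives and strictly
    better on at least one. Fewer tag SNPs is better; higher tolerance is better."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates (input order preserved)."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


candidates = [(10, 0.60), (12, 0.80), (12, 0.70), (15, 0.80), (8, 0.50)]
front = pareto_front(candidates)
print(front)
```

A many-objective algorithm would evolve the candidate set over generations and rank it with more than two objectives; the filter here shows only the dominance test itself.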
|
8
|
Caliebe A, Tekola‐Ayele F, Darst BF, Wang X, Song YE, Gui J, Sebro RA, Balding DJ, Saad M, Dubé M. Including diverse and admixed populations in genetic epidemiology research. Genet Epidemiol 2022; 46:347-371. [PMID: 35842778 PMCID: PMC9452464 DOI: 10.1002/gepi.22492] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2022] [Revised: 05/31/2022] [Accepted: 06/06/2022] [Indexed: 11/25/2022]
Abstract
The inclusion of ancestrally diverse participants in genetic studies can lead to new discoveries and is important to ensure equitable health care benefit from research advances. Here, members of the Ethical, Legal, Social, Implications (ELSI) committee of the International Genetic Epidemiology Society (IGES) offer perspectives on methods and analysis tools for the conduct of inclusive genetic epidemiology research, with a focus on admixed and ancestrally diverse populations in support of reproducible research practices. We emphasize the importance of distinguishing socially defined population categorizations from genetic ancestry in the design, analysis, reporting, and interpretation of genetic epidemiology research findings. Finally, we discuss the current state of genomic resources used in genetic association studies, functional interpretation, and clinical and public health translation of genomic findings with respect to diverse populations.
Affiliation(s)
- Amke Caliebe
- Institute of Medical Informatics and Statistics, Kiel University and University Hospital Schleswig‐Holstein, Kiel, Germany
| | - Fasil Tekola‐Ayele
- Epidemiology Branch, Division of Population Health Research, Division of Intramural Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland, USA
| | - Burcu F. Darst
- Center for Genetic Epidemiology, University of Southern California, Los Angeles, California, USA
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
| | - Xuexia Wang
- Department of Mathematics, University of North Texas, Denton, Texas, USA
| | - Yeunjoo E. Song
- Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, Ohio, USA
| | - Jiang Gui
- Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, One Medical Center Dr., Lebanon, New Hampshire, USA
| | | | - David J. Balding
- Melbourne Integrative Genomics, Schools of BioSciences and of Mathematics & Statistics, University of Melbourne, Melbourne, Australia
| | - Mohamad Saad
- Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha, Qatar
- Neuroscience Research Center, Faculty of Medical Sciences, Lebanese University, Beirut, Lebanon
| | - Marie‐Pierre Dubé
- Department of Medicine, and Social and Preventive Medicine, Université de Montréal, Montréal, Québec, Canada
- Beaulieu‐Saucier Pharmacogenomics Centre, Montreal Heart Institute, Montreal, Canada
|
9
|
Hanks SC, Forer L, Schönherr S, LeFaive J, Martins T, Welch R, Gagliano Taliun SA, Braff D, Johnsen JM, Kenny EE, Konkle BA, Laakso M, Loos RF, McCarroll S, Pato C, Pato MT, Smith AV, Boehnke M, Scott LJ, Fuchsberger C. Extent to which array genotyping and imputation with large reference panels approximate deep whole-genome sequencing. Am J Hum Genet 2022; 109:1653-1666. [PMID: 35981533 PMCID: PMC9502057 DOI: 10.1016/j.ajhg.2022.07.012] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 03/06/2022] [Accepted: 07/20/2022] [Indexed: 01/02/2023]
Abstract
Understanding the genetic basis of human diseases and traits is dependent on the identification and accurate genotyping of genetic variants. Deep whole-genome sequencing (WGS), the gold standard technology for SNP and indel identification and genotyping, remains very expensive for most large studies. Here, we quantify the extent to which array genotyping followed by genotype imputation can approximate WGS in studies of individuals of African, Hispanic/Latino, and European ancestry in the US and of Finnish ancestry in Finland (a population isolate). For each study, we performed genotype imputation by using the genetic variants present on the Illumina Core, OmniExpress, MEGA, and Omni 2.5M arrays with the 1000G, HRC, and TOPMed imputation reference panels. Using the Omni 2.5M array and the TOPMed panel, ≥90% of bi-allelic single-nucleotide variants (SNVs) are well imputed (r2 > 0.8) down to minor-allele frequencies (MAFs) of 0.14% in African, 0.11% in Hispanic/Latino, 0.35% in European, and 0.85% in Finnish ancestries. There was little difference in TOPMed-based imputation quality among the arrays with >700k variants. Individual-level imputation quality varied widely between and within the three US studies. Imputation quality also varied across genomic regions, producing regions where even common (MAF > 5%) variants were consistently not well imputed across ancestries. The extent to which array genotyping and imputation can approximate WGS therefore depends on reference panel, genotype array, sample ancestry, and genomic location. Imputation quality by variant or genomic region can be queried with our new tool, RsqBrowser, now deployed on the Michigan Imputation Server.
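The "well imputed (r2 > 0.8)" criterion in the abstract above compares imputed genotypes against sequencing-based genotypes. A minimal sketch of that comparison for one variant follows, assuming r2 is the squared Pearson correlation between true allele counts and imputed dosages; the function name `squared_correlation` and the toy data are hypothetical and are not taken from the study.

```python
from typing import Sequence


def squared_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Pearson correlation between two equal-length numeric vectors.
    Returns 0.0 if either vector has no variance (e.g. a monomorphic variant)."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)


# Toy data for a single variant across eight individuals (made up).
true_genotypes = [0, 1, 2, 0, 1, 0, 2, 1]                      # allele counts from sequencing
imputed_dosages = [0.1, 0.9, 1.8, 0.0, 1.2, 0.3, 1.9, 0.7]     # expected allele counts from imputation

r2 = squared_correlation(true_genotypes, imputed_dosages)
well_imputed = r2 > 0.8  # threshold quoted in the abstract
print(round(r2, 3), well_imputed)
```

Note that this is the empirical r2 against a truth set; imputation software also reports an estimated Rsq that is computed without ground truth and can differ from it.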
Affiliation(s)
- Sarah C. Hanks
- Department of Biostatistics and Center for Statistical Genetics, School of Public Health, University of Michigan, Ann Arbor, MI, USA
| | - Lukas Forer
- Institute of Genetic Epidemiology, Medical University of Innsbruck, Innsbruck, Austria
| | - Sebastian Schönherr
- Institute of Genetic Epidemiology, Medical University of Innsbruck, Innsbruck, Austria
| | - Jonathon LeFaive
- Department of Biostatistics and Center for Statistical Genetics, School of Public Health, University of Michigan, Ann Arbor, MI, USA
| | - Taylor Martins
- Department of Biostatistics and Center for Statistical Genetics, School of Public Health, University of Michigan, Ann Arbor, MI, USA
| | - Ryan Welch
- Department of Biostatistics and Center for Statistical Genetics, School of Public Health, University of Michigan, Ann Arbor, MI, USA
| | - Sarah A. Gagliano Taliun
- Department of Medicine and Department of Neurosciences, Université de Montréal, Montreal, QC, Canada; Research Centre, Montreal Heart Institute, Montreal, QC, Canada
| | - David Braff
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
| | - Jill M. Johnsen
- Research Institute, Bloodworks, Seattle, WA, USA; Department of Medicine, University of Washington, Seattle, WA, USA
| | - Eimear E. Kenny
- Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | | | - Markku Laakso
- Institute of Clinical Medicine, Internal Medicine, University of Eastern Finland, Kuopio, Finland
| | - Ruth F.J. Loos
- Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Steven McCarroll
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Carlos Pato
- Departments of Psychiatry, Rutgers University, Robert Wood Johnson Medical School and New Jersey Medical School, New Brunswick, NJ, USA
| | - Michele T. Pato
- Departments of Psychiatry, Rutgers University, Robert Wood Johnson Medical School and New Jersey Medical School, New Brunswick, NJ, USA
| | - Albert V. Smith
- Department of Biostatistics and Center for Statistical Genetics, School of Public Health, University of Michigan, Ann Arbor, MI, USA
| | | | - Michael Boehnke
- Department of Biostatistics and Center for Statistical Genetics, School of Public Health, University of Michigan, Ann Arbor, MI, USA
| | - Laura J. Scott
- Department of Biostatistics and Center for Statistical Genetics, School of Public Health, University of Michigan, Ann Arbor, MI, USA
| | - Christian Fuchsberger
- Institute for Biomedicine (Affiliated with the University of Lübeck), Eurac Research, Bolzano, Italy.
|
10
|
Redondo MJ, Gignoux CR, Dabelea D, Hagopian WA, Onengut-Gumuscu S, Oram RA, Rich SS. Type 1 diabetes in diverse ancestries and the use of genetic risk scores. Lancet Diabetes Endocrinol 2022; 10:597-608. [PMID: 35724677 PMCID: PMC10024251 DOI: 10.1016/s2213-8587(22)00159-0] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 02/16/2022] [Revised: 04/16/2022] [Accepted: 05/06/2022] [Indexed: 02/06/2023]
Abstract
Over 75 genetic loci within and outside of the HLA region influence type 1 diabetes risk. Genetic risk scores (GRS), which facilitate the integration of complex genetic information, have been developed in type 1 diabetes and incorporated into models and algorithms for classification, prognosis, and prediction of disease and response to preventive and therapeutic interventions. However, the development and validation of GRS across different ancestries is still emerging, as is knowledge on type 1 diabetes genetics in populations of diverse genetic ancestries. In this Review, we provide a summary of the current evidence on the evolutionary genetic variation in type 1 diabetes and the racial and ethnic differences in type 1 diabetes epidemiology, clinical characteristics, and preclinical course. We also discuss the influence of genetics on type 1 diabetes with differences across ancestries and the development and validation of GRS in various populations.
Affiliation(s)
- Maria J Redondo
- Division of Diabetes and Endocrinology, Texas Children's Hospital, Baylor College of Medicine, Houston, TX, USA.
| | - Christopher R Gignoux
- Department of Medicine and Colorado Center for Personalized Medicine, Anschutz Medical Campus, University of Colorado, Aurora, CO, USA
| | - Dana Dabelea
- Lifecourse Epidemiology of Adiposity and Diabetes (LEAD) Center, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | - William A Hagopian
- Division of Diabetes Programs, Pacific Northwest Research Institute, Seattle, WA, USA
| | - Suna Onengut-Gumuscu
- Department of Public Health Sciences, University of Virginia, Charlottesville, VA, USA
| | - Richard A Oram
- Institute of Biomedical and Clinical Science, University of Exeter Medical School, University of Exeter, Exeter, UK; The Academic Kidney Unit, Royal Devon and Exeter NHS Foundation Trust, Exeter, UK
| | - Stephen S Rich
- Department of Public Health Sciences, University of Virginia, Charlottesville, VA, USA
|
11
|
Thanh Nguyen D, Hoang Nguyen Q, Thuy Duong N, Vo NS. LmTag: functional-enrichment and imputation-aware tag SNP selection for population-specific genotyping arrays. Brief Bioinform 2022; 23:6627269. [PMID: 35780383 DOI: 10.1093/bib/bbac252] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/06/2022] [Revised: 05/02/2022] [Accepted: 05/31/2022] [Indexed: 12/16/2022]
Abstract
Despite the rapid development of sequencing technology, single-nucleotide polymorphism (SNP) arrays are still the most cost-effective genotyping solutions for large-scale genomic research and applications. Recent years have witnessed the rapid development of numerous genotyping platforms of different sizes and designs, but population-specific platforms are still lacking, especially for those in developing countries. SNP arrays designed for these countries should be cost-effective (small size), yet incorporate key information needed to associate genotypes with traits. A key design principle for most current platforms is to improve genome-wide imputation so that more SNPs not included in the array (imputed SNPs) can be predicted. However, current tag SNP selection methods mostly focus on imputation accuracy and coverage, but not the functional content of the array. It is those functional SNPs that are most likely associated with traits. Here, we propose LmTag, a novel method for tag SNP selection that not only improves imputation performance but also prioritizes highly functional SNP markers. We apply LmTag on a wide range of populations using both public and in-house whole-genome sequencing databases. Our results show that LmTag improved both functional marker prioritization and genome-wide imputation accuracy compared to existing methods. This novel approach could contribute to the next generation genotyping arrays that provide excellent imputation capability as well as facilitate array-based functional genetic studies. Such arrays are particularly suitable for under-represented populations in developing countries or non-model species, where little genomics data are available while investment in genome sequencing or high-density SNP arrays is limited. LmTag is available at: https://github.com/datngu/LmTag.
Affiliation(s)
- Dat Thanh Nguyen
- Center for Biomedical Informatics, Vingroup Big Data Institute, 458 Minh Khai, 10000, Hanoi, Vietnam
| | - Quan Hoang Nguyen
- Institute for Molecular Bioscience, University of Queensland, St Lucia, QLD 4067, Brisbane, Australia
| | - Nguyen Thuy Duong
- Center for Biomedical Informatics, Vingroup Big Data Institute, 458 Minh Khai, 10000, Hanoi, Vietnam; Institute of Genome Research, Vietnam Academy of Science and Technology, 18 Hoang Quoc Viet, 10000, Hanoi, Vietnam
| | - Nam S Vo
- Center for Biomedical Informatics, Vingroup Big Data Institute, 458 Minh Khai, 10000, Hanoi, Vietnam; College of Engineering and Computer Science, VinUniversity, Vinhomes Ocean Park, 10000, Hanoi, Vietnam
|
12
|
Fan C, Mancuso N, Chiang CW. A genealogical estimate of genetic relationships. Am J Hum Genet 2022; 109:812-824. [PMID: 35417677 DOI: 10.1016/j.ajhg.2022.03.016] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 09/24/2021] [Accepted: 03/25/2022] [Indexed: 12/23/2022]
Abstract
The application of genetic relationships among individuals, characterized by a genetic relationship matrix (GRM), has far-reaching effects in human genetics. However, the current standard to calculate the GRM treats linked markers as independent and does not explicitly model the underlying genealogical history of the study sample. Here, we propose a coalescent-informed framework, namely the expected GRM (eGRM), to infer the expected relatedness between pairs of individuals given an ancestral recombination graph (ARG) of the sample. Through extensive simulations, we show that the eGRM is an unbiased estimate of latent pairwise genome-wide relatedness and is robust when computed with ARG inferred from incomplete genetic data. As a result, the eGRM better captures the structure of a population than the canonical GRM, even when using the same genetic information. More importantly, our framework allows a principled approach to estimate the eGRM at different time depths of the ARG, thereby revealing the time-varying nature of population structure in a sample. When applied to SNP array genotypes from a population sample from Northern and Eastern Finland, we find that clustering analysis with the eGRM reveals population structure driven by subpopulations that would not be apparent via the canonical GRM and that temporally the population model is consistent with recent divergence and expansion. Taken together, our proposed eGRM provides a robust tree-centric estimate of relatedness with wide application to genetic studies.
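For context on the "current standard" that the abstract above contrasts with the eGRM, the following is a minimal sketch of a canonical GRM, assuming the widely used formulation in which each SNP is centered by twice its allele frequency and scaled by its expected variance, then averaged over SNPs. It treats every marker as independent, which is the limitation the abstract points out. It does not implement the eGRM or use an ARG; the function name `canonical_grm` and the toy genotype matrix are hypothetical.

```python
from typing import List


def canonical_grm(genotypes: List[List[int]]) -> List[List[float]]:
    """Canonical genetic relationship matrix.

    genotypes[i][j] is the allele count (0, 1, or 2) of individual i at SNP j.
    Entry (a, b) averages (x_aj - 2p_j)(x_bj - 2p_j) / (2 p_j (1 - p_j)) over
    polymorphic SNPs, where p_j is the sample allele frequency of SNP j.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    freqs = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    # Monomorphic SNPs carry no information and would cause division by zero.
    keep = [j for j in range(m) if 0.0 < freqs[j] < 1.0]
    if not keep:
        raise ValueError("no polymorphic SNPs")
    grm = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a, n):
            total = 0.0
            for j in keep:
                p = freqs[j]
                total += ((genotypes[a][j] - 2 * p) * (genotypes[b][j] - 2 * p)
                          / (2 * p * (1 - p)))
            grm[a][b] = grm[b][a] = total / len(keep)
    return grm


# Toy genotype matrix: 4 individuals x 5 SNPs (made up).
toy_genotypes = [
    [0, 1, 2, 1, 0],
    [0, 1, 2, 1, 1],
    [2, 0, 0, 1, 1],
    [1, 2, 1, 0, 2],
]
grm = canonical_grm(toy_genotypes)
for row in grm:
    print([round(v, 3) for v in row])
```

Because each SNP is centered on the sample mean, every row of the matrix sums to zero, which is a quick sanity check on the computation.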
|
13
|
Clark KC, Kwitek AE. Multi-Omic Approaches to Identify Genetic Factors in Metabolic Syndrome. Compr Physiol 2021; 12:3045-3084. [PMID: 34964118 PMCID: PMC9373910 DOI: 10.1002/cphy.c210010] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/06/2022]
Abstract
Metabolic syndrome (MetS) is a highly heritable disease and a major public health burden worldwide. MetS diagnosis criteria are met by the simultaneous presence of any three of the following: high triglycerides, low HDL/high LDL cholesterol, insulin resistance, hypertension, and central obesity. These diseases act synergistically in people suffering from MetS and dramatically increase risk of morbidity and mortality due to stroke and cardiovascular disease, as well as certain cancers. Each of these component features is itself a complex disease, as is MetS. As a genetically complex disease, genetic risk factors for MetS are numerous, but not very powerful individually, often requiring specific environmental stressors for the disease to manifest. When taken together, all sequence variants that contribute to MetS disease risk explain only a fraction of the heritable variance, suggesting additional, novel loci have yet to be discovered. In this article, we give a brief overview of the genetic concepts needed to interpret genome-wide association studies (GWAS) and quantitative trait locus (QTL) data, summarize the state of the field of MetS physiological genomics, and introduce tools and resources that physiologists can use to integrate genomics into their own research on MetS and any of its component features. There is a wealth of phenotypic and molecular data in animal models and humans that can be leveraged as outlined in this article. Integrating these multi-omic QTL data for complex diseases such as MetS provides a means to unravel the pathways and mechanisms leading to complex disease and promise for novel treatments. © 2022 American Physiological Society. Compr Physiol 12:1-40, 2022.
Affiliation(s)
- Karen C Clark
- Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin, USA
| | - Anne E Kwitek
- Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin, USA
|
14
|
Arriaga-MacKenzie IS, Matesi G, Chen S, Ronco A, Marker KM, Hall JR, Scherenberg R, Khajeh-Sharafabadi M, Wu Y, Gignoux CR, Null M, Hendricks AE. Summix: A method for detecting and adjusting for population structure in genetic summary data. Am J Hum Genet 2021; 108:1270-1282. [PMID: 34157305 PMCID: PMC8322937 DOI: 10.1016/j.ajhg.2021.05.016] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/05/2021] [Accepted: 05/26/2021] [Indexed: 12/11/2022]
Abstract
Publicly available genetic summary data have high utility in research and the clinic, including prioritizing putative causal variants, polygenic scoring, and leveraging common controls. However, summarizing individual-level data can mask population structure, resulting in confounding, reduced power, and incorrect prioritization of putative causal variants. This limits the utility of publicly available data, especially for understudied or admixed populations where additional research and resources are most needed. Although several methods exist to estimate ancestry in individual-level data, methods to estimate ancestry proportions in summary data are lacking. Here, we present Summix, a method to efficiently deconvolute ancestry and provide ancestry-adjusted allele frequencies (AFs) from summary data. Using continental reference ancestry, African (AFR), non-Finnish European (EUR), East Asian (EAS), Indigenous American (IAM), South Asian (SAS), we obtain accurate and precise estimates (within 0.1%) for all simulation scenarios. We apply Summix to gnomAD v.2.1 exome and genome groups and subgroups, finding heterogeneous continental ancestry for several groups, including African/African American (∼84% AFR, ∼14% EUR) and American/Latinx (∼4% AFR, ∼5% EAS, ∼43% EUR, ∼46% IAM). Compared to the unadjusted gnomAD AFs, Summix's ancestry-adjusted AFs more closely match respective African and Latinx reference samples. Even on modern, dense panels of summary statistics, Summix yields results in seconds, allowing for estimation of confidence intervals via block bootstrap. Given an accompanying R package, Summix increases the utility and equity of public genetic resources, empowering novel research opportunities.
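The idea of estimating ancestry proportions from summary allele frequencies, as described in the abstract above, can be illustrated with a deliberately simplified two-ancestry case. The sketch below is not the Summix algorithm or its R package; it assumes the observed allele frequency at each SNP is a mixture of two reference frequencies and solves the resulting least-squares problem in closed form. The function name `estimate_mixing_proportion` and all frequencies are made up.

```python
from typing import Sequence


def estimate_mixing_proportion(observed: Sequence[float],
                               ref_a: Sequence[float],
                               ref_b: Sequence[float]) -> float:
    """Least-squares estimate of pi in: observed ~ pi * ref_a + (1 - pi) * ref_b.

    Minimizing sum_j (o_j - b_j - pi * (a_j - b_j))^2 over pi gives
    pi = sum_j (o_j - b_j)(a_j - b_j) / sum_j (a_j - b_j)^2,
    which is then clipped to the valid range [0, 1].
    """
    num = sum((o - b) * (a - b) for o, a, b in zip(observed, ref_a, ref_b))
    den = sum((a - b) ** 2 for a, b in zip(ref_a, ref_b))
    if den == 0:
        raise ValueError("reference frequencies are identical; pi is not identifiable")
    return min(1.0, max(0.0, num / den))


# Hypothetical reference allele frequencies at four SNPs (made up).
ref_a = [0.10, 0.50, 0.80, 0.30]
ref_b = [0.40, 0.20, 0.60, 0.90]

# Simulate a summary-level sample that is 84% ancestry A and 16% ancestry B.
true_pi = 0.84
observed = [true_pi * a + (1 - true_pi) * b for a, b in zip(ref_a, ref_b)]

pi_hat = estimate_mixing_proportion(observed, ref_a, ref_b)
print(round(pi_hat, 4))
```

With more than two reference ancestries the same objective becomes a constrained optimization over proportions that are non-negative and sum to one, which no longer has this simple closed form.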
Affiliation(s)
| | - Gregory Matesi
- Mathematical and Statistical Sciences, University of Colorado Denver, Denver, CO 80204, USA
| | - Samuel Chen
- Mathematical and Statistical Sciences, University of Colorado Denver, Denver, CO 80204, USA
| | - Alexandria Ronco
- Mathematical and Statistical Sciences, University of Colorado Denver, Denver, CO 80204, USA
| | - Katie M Marker
- Human Medical Genetics and Genomics Program, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
| | - Jordan R Hall
- Mathematical and Statistical Sciences, University of Colorado Denver, Denver, CO 80204, USA
| | - Ryan Scherenberg
- Business School, University of Colorado Denver, Denver, CO 80204, USA
| | | | - Yinfei Wu
- Mathematical and Statistical Sciences, University of Colorado Denver, Denver, CO 80204, USA
| | - Christopher R Gignoux
- Human Medical Genetics and Genomics Program, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; Biostatistics and Informatics, Colorado School of Public Health, Aurora, CO 80045, USA
| | - Megan Null
- Mathematical and Statistical Sciences, University of Colorado Denver, Denver, CO 80204, USA; Mathematics and Physical Sciences, The College of Idaho, Caldwell, ID 83605, USA
| | - Audrey E Hendricks
- Mathematical and Statistical Sciences, University of Colorado Denver, Denver, CO 80204, USA; Human Medical Genetics and Genomics Program, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA; Biostatistics and Informatics, Colorado School of Public Health, Aurora, CO 80045, USA.
|
15
|
Sakurai-Yageta M, Kumada K, Gocho C, Makino S, Uruno A, Tadaka S, Motoike IN, Kimura M, Ito S, Otsuki A, Narita A, Kudo H, Aoki Y, Danjoh I, Yasuda J, Kawame H, Minegishi N, Koshiba S, Fuse N, Tamiya G, Yamamoto M, Kinoshita K. Japonica Array NEO with increased genome-wide coverage and abundant disease risk SNPs. J Biochem 2021; 170:399-410. [PMID: 34131746 PMCID: PMC8510329 DOI: 10.1093/jb/mvab060] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 03/15/2021] [Accepted: 04/27/2021] [Indexed: 01/12/2023]
Abstract
Ethnic-specific SNP arrays are becoming more important to increase the power of genome-wide association studies in diverse populations. In the Tohoku Medical Megabank Project, we have been developing a series of Japonica Arrays (JPA) for genotyping participants based on reference panels constructed from whole-genome sequence data of the Japanese population. Here, we designed a novel version of the SNP array for the Japanese population, called Japonica Array NEO (JPA NEO), comprising a total of 666,883 markers. Among them, 654,246 tag SNPs of autosomes and X chromosome were selected from an expanded reference panel of 3,552 Japanese, 3.5KJPNv2, using pairwise r2 of linkage disequilibrium measures. Additionally, 28,298 markers were included for the evaluation of previously identified disease risk markers from the literature and databases, and those present in the Japanese population were extracted using the reference panel. Through genotyping 286 Japanese samples, we found that the imputation quality r2 and INFO score in the minor allele frequency bin >2.5–5% were >0.9 and >0.8, respectively, and >12 million markers were imputed with an INFO score >0.8. From these results, JPA NEO is a promising tool for genotyping the Japanese population with genome-wide coverage, contributing to the development of genetic risk scores.
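Selecting tag SNPs "using pairwise r2 of linkage disequilibrium measures", as the abstract above puts it, is commonly done with a greedy procedure. The sketch below shows one such procedure under stated assumptions: r2 is computed as the squared correlation of genotype allele counts, a SNP is considered tagged when its r2 with a chosen tag meets a threshold, and ties are broken alphabetically. It is not the pipeline used to design JPA NEO; the SNP names, genotypes, the 0.8 threshold, and the function names are hypothetical.

```python
from typing import Dict, List, Set


def r2(x: List[int], y: List[int]) -> float:
    """Squared Pearson correlation of two genotype vectors (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0
    return (sxy * sxy) / (sxx * syy)


def greedy_tag_snps(snps: Dict[str, List[int]], threshold: float = 0.8) -> List[str]:
    """Repeatedly pick the SNP that tags the most still-untagged SNPs."""
    # covers[s] = SNPs captured by s; every SNP always captures itself.
    covers: Dict[str, Set[str]] = {
        s: {s} | {t for t in snps if r2(snps[s], snps[t]) >= threshold}
        for s in snps
    }
    untagged: Set[str] = set(snps)
    tags: List[str] = []
    while untagged:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(untagged), key=lambda s: len(covers[s] & untagged))
        tags.append(best)
        untagged -= covers[best]
    return tags


# Toy genotypes for six individuals at four SNPs (made up).
toy_snps = {
    "snpA": [0, 1, 2, 0, 1, 2],
    "snpB": [0, 1, 2, 0, 1, 2],  # identical to snpA, r2 = 1
    "snpC": [2, 1, 0, 2, 1, 0],  # mirror image of snpA, r2 = 1
    "snpD": [0, 0, 1, 2, 2, 1],  # uncorrelated with the others
}
tags = greedy_tag_snps(toy_snps)
print(tags)
```

Real array design works within sliding genomic windows over millions of variants and typically adds further constraints such as probe design and minor allele frequency, which this sketch omits.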
Affiliation(s)
- Mika Sakurai-Yageta
- Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan; Advanced Research Center for Innovations in Next-Generation Medicine, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan
- Kazuki Kumada
- Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan; Advanced Research Center for Innovations in Next-Generation Medicine, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan
- Chinatsu Gocho
- Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan
- Satoshi Makino
- Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan
- Akira Uruno
- Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan; Graduate School of Medicine, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8575, Japan
- Shu Tadaka
- Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan
- Ikuko N Motoike
- Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan; Graduate School of Information Sciences, Tohoku University, 6-3-09 Aramaki-Aza-Aoba, Aoba-ku, Sendai, Miyagi 980-8579, Japan
- Masae Kimura
- Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan
- Shin Ito
- Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan
- Akihito Otsuki
- Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan; Graduate School of Medicine, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8575, Japan
- Akira Narita
- Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan
- Hisaaki Kudo
- Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan
- Yuichi Aoki
- Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan; Graduate School of Information Sciences, Tohoku University, 6-3-09 Aramaki-Aza-Aoba, Aoba-ku, Sendai, Miyagi 980-8579, Japan
- Inaho Danjoh
- Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan
- Jun Yasuda
- Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan
- Hiroshi Kawame
- Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan
- Naoko Minegishi
- Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan; Advanced Research Center for Innovations in Next-Generation Medicine, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan
- Seizo Koshiba
- Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan; Advanced Research Center for Innovations in Next-Generation Medicine, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan
- Nobuo Fuse
- Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan; Advanced Research Center for Innovations in Next-Generation Medicine, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan; Graduate School of Medicine, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8575, Japan
- Gen Tamiya
- Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan; Advanced Research Center for Innovations in Next-Generation Medicine, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan; Graduate School of Medicine, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8575, Japan
- Masayuki Yamamoto
- Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan; Advanced Research Center for Innovations in Next-Generation Medicine, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan; Graduate School of Medicine, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8575, Japan
- Kengo Kinoshita
- Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan; Advanced Research Center for Innovations in Next-Generation Medicine, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan; Graduate School of Information Sciences, Tohoku University, 6-3-09 Aramaki-Aza-Aoba, Aoba-ku, Sendai, Miyagi 980-8579, Japan
|
16
|
Si Y, Vanderwerff B, Zöllner S. Why are rare variants hard to impute? Coalescent models reveal theoretical limits in existing algorithms. Genetics 2021; 217:iyab011. [PMID: 33686438 PMCID: PMC8049559 DOI: 10.1093/genetics/iyab011] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/09/2020] [Accepted: 12/15/2020] [Indexed: 01/13/2023] Open
Abstract
Genotype imputation is an indispensable step in human genetic studies. Large reference panels with deeply sequenced genomes now allow interrogating variants with minor allele frequency < 1% without sequencing. Although it is critical to consider limits of this approach, imputation methods for rare variants have only done so empirically; the theoretical basis of their imputation accuracy has not been explored. To provide theoretical consideration of imputation accuracy under the current imputation framework, we develop a coalescent model of imputing rare variants, leveraging the joint genealogy of the sample to be imputed and reference individuals. We show that broadly used imputation algorithms include model misspecifications about this joint genealogy that limit the ability to correctly impute rare variants. We develop closed-form solutions for the probability distribution of this joint genealogy and quantify the inevitable error rate resulting from the model misspecification across a range of allele frequencies and reference sample sizes. We show that the probability of a falsely imputed minor allele decreases with reference sample size, but the proportion of falsely imputed minor alleles mostly depends on the allele count in the reference sample. We summarize the impact of this error in genotype imputation on association tests by calculating the r² between imputed and true genotype and show that even when modeling other sources of error, the model misspecification has a significant impact on the r² of rare variants. To evaluate these predictions in practice, we compare the imputation of the same dataset across imputation panels of different sizes. Although this empirical imputation accuracy is substantially lower than our theoretical prediction, modeling misspecification seems to further decrease imputation accuracy for variants with low allele counts in the reference. These results provide a framework for developing new imputation algorithms and for interpreting rare variant association analyses.
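The r² between imputed and true genotypes, which the abstract above uses to summarize accuracy, can be sketched as follows. This is only a minimal illustration (the squared Pearson correlation between true genotype counts and imputed dosages), not the authors' coalescent model; the function name `imputation_r2` and the example values are hypothetical.

```python
import math


def imputation_r2(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between true genotypes (0/1/2 allele
    counts) and imputed dosages (values in [0, 2]) at a single variant.

    Returns NaN when either input has no variance (e.g. a monomorphic site),
    because the correlation is undefined in that case.
    """
    if len(true_genotypes) != len(imputed_dosages):
        raise ValueError("inputs must have the same length")
    n = len(true_genotypes)
    if n == 0:
        raise ValueError("inputs must not be empty")

    mean_g = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n

    # Sums of squares and cross-products around the means.
    cov = sum((g - mean_g) * (d - mean_d)
              for g, d in zip(true_genotypes, imputed_dosages))
    var_g = sum((g - mean_g) ** 2 for g in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)

    if var_g == 0 or var_d == 0:
        return math.nan
    return (cov * cov) / (var_g * var_d)


# Hypothetical example: six samples at one variant.
truth = [0, 0, 1, 0, 2, 1]
dosage = [0.1, 0.0, 0.8, 0.3, 1.7, 1.2]
print(f"r2 = {imputation_r2(truth, dosage):.3f}")
```

Because the statistic is a correlation, it rewards dosages that track the true genotypes linearly and is undefined at sites where no sample carries the minor allele, which is one reason rare variants are harder to evaluate.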
Affiliation(s)
- Yichen Si
- Department of Biostatistics, School of Public Health, University of Michigan, 1420 Washington Heights, Ann Arbor, MI 48109, USA
- Brett Vanderwerff
- Department of Biostatistics, School of Public Health, University of Michigan, 1420 Washington Heights, Ann Arbor, MI 48109, USA
- Sebastian Zöllner
- Department of Biostatistics, School of Public Health, University of Michigan, 1420 Washington Heights, Ann Arbor, MI 48109, USA
- Department of Psychiatry, University of Michigan, 1420 Washington Heights, Ann Arbor, MI 48109, USA
|
17
|
Martin AR, Atkinson EG, Chapman SB, Stevenson A, Stroud RE, Abebe T, Akena D, Alemayehu M, Ashaba FK, Atwoli L, Bowers T, Chibnik LB, Daly MJ, DeSmet T, Dodge S, Fekadu A, Ferriera S, Gelaye B, Gichuru S, Injera WE, James R, Kariuki SM, Kigen G, Koenen KC, Kwobah E, Kyebuzibwa J, Majara L, Musinguzi H, Mwema RM, Neale BM, Newman CP, Newton CRJC, Pickrell JK, Ramesar R, Shiferaw W, Stein DJ, Teferra S, van der Merwe C, Zingela Z. Low-coverage sequencing cost-effectively detects known and novel variation in underrepresented populations. Am J Hum Genet 2021; 108:656-668. [PMID: 33770507 PMCID: PMC8059370 DOI: 10.1016/j.ajhg.2021.03.012] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Received: 11/19/2020] [Accepted: 03/05/2021] [Indexed: 12/21/2022] Open
Abstract
Genetic studies in underrepresented populations identify disproportionate numbers of novel associations. However, most genetic studies use genotyping arrays and sequenced reference panels that best capture variation most common in European ancestry populations. To compare data generation strategies best suited for underrepresented populations, we sequenced the whole genomes of 91 individuals to high coverage as part of the Neuropsychiatric Genetics of African Population-Psychosis (NeuroGAP-Psychosis) study with participants from Ethiopia, Kenya, South Africa, and Uganda. We used a downsampling approach to evaluate the quality of two cost-effective data generation strategies, GWAS arrays versus low-coverage sequencing, by calculating the concordance of imputed variants from these technologies with those from deep whole-genome sequencing data. We show that low-coverage sequencing at a depth of ≥4× captures variants of all frequencies more accurately than all commonly used GWAS arrays investigated and at a comparable cost. Lower depths of sequencing (0.5-1×) performed comparably to commonly used low-density GWAS arrays. Low-coverage sequencing is also sensitive to novel variation; 4× sequencing detects 45% of singletons and 95% of common variants identified in high-coverage African whole genomes. Low-coverage sequencing approaches surmount the problems induced by the ascertainment of common genotyping arrays, effectively identify novel variation particularly in underrepresented populations, and present opportunities to enhance variant discovery at a cost similar to traditional approaches.
Affiliation(s)
- Alicia R Martin
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA; Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA.
- Elizabeth G Atkinson
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA; Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA
- Sinéad B Chapman
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA
- Anne Stevenson
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Rocky E Stroud
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Tamrat Abebe
- Department of Microbiology, Immunology, and Parasitology, School of Medicine, College of Health Sciences, Addis Ababa University, Addis Ababa, Ethiopia
- Dickens Akena
- Department of Psychiatry, School of Medicine, College of Health Sciences, Makerere University, Kampala, Uganda
- Melkam Alemayehu
- Department of Psychiatry, School of Medicine, College of Health Sciences, Addis Ababa University, Addis Ababa, Ethiopia
- Fred K Ashaba
- Department of Immunology & Molecular Biology, College of Health Sciences, Makerere University, Kampala, Uganda
- Lukoye Atwoli
- Department of Mental Health, School of Medicine, Moi University College of Health Sciences, Eldoret, Kenya
- Tera Bowers
- Broad Genomics, Broad Institute of MIT and Harvard, 320 Charles Street, Cambridge, MA 02141, USA
- Lori B Chibnik
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA; Department of Neurology, Massachusetts General Hospital, Boston, MA 02114, USA
- Mark J Daly
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA; Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Institute for Molecular Medicine Finland, Helsinki 00014, Finland
- Timothy DeSmet
- Broad Genomics, Broad Institute of MIT and Harvard, 320 Charles Street, Cambridge, MA 02141, USA
- Sheila Dodge
- Broad Genomics, Broad Institute of MIT and Harvard, 320 Charles Street, Cambridge, MA 02141, USA
- Abebaw Fekadu
- Department of Psychiatry, School of Medicine, College of Health Sciences, Addis Ababa University, Addis Ababa, Ethiopia; Centre for Innovative Drug Development & Therapeutic Trials for Africa, Addis Ababa University, Addis Ababa, Ethiopia
- Steven Ferriera
- Broad Genomics, Broad Institute of MIT and Harvard, 320 Charles Street, Cambridge, MA 02141, USA
- Bizu Gelaye
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Stella Gichuru
- Department of Mental Health, Moi Teaching and Referral Hospital, Eldoret, Kenya
- Wilfred E Injera
- Department of Immunology, School of Medicine, Moi University College of Health Sciences, Eldoret, Kenya
- Roxanne James
- Department of Psychiatry and Mental Health, University of Cape Town, Cape Town, South Africa
- Symon M Kariuki
- Neurosciences Unit, Clinical Department, KEMRI-Wellcome Trust Research Programme-Coast, Kilifi, Kenya; Department of Psychiatry, University of Oxford, Oxford OX3 7JX, UK
- Gabriel Kigen
- Department of Pharmacology and Toxicology, School of Medicine, Moi University College of Health Sciences, Eldoret, Kenya
- Karestan C Koenen
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Edith Kwobah
- Department of Mental Health, Moi Teaching and Referral Hospital, Eldoret, Kenya
- Joseph Kyebuzibwa
- Department of Psychiatry, School of Medicine, College of Health Sciences, Makerere University, Kampala, Uganda
- Lerato Majara
- Department of Psychiatry and Mental Health, University of Cape Town, Cape Town, South Africa; SA MRC Human Genetics Research Unit, Division of Human Genetics, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, Observatory 7925, South Africa
- Henry Musinguzi
- Department of Immunology & Molecular Biology, College of Health Sciences, Makerere University, Kampala, Uganda
- Rehema M Mwema
- Neurosciences Unit, Clinical Department, KEMRI-Wellcome Trust Research Programme-Coast, Kilifi, Kenya
- Benjamin M Neale
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA; Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA
- Carter P Newman
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Charles R J C Newton
- Neurosciences Unit, Clinical Department, KEMRI-Wellcome Trust Research Programme-Coast, Kilifi, Kenya; Department of Psychiatry, University of Oxford, Oxford OX3 7JX, UK
- Raj Ramesar
- SA MRC Genomic and Precision Medicine Research Unit, Division of Human Genetics, Department of Pathology, Institute of Infectious Diseases and Molecular Medicine, University of Cape Town, Cape Town, South Africa
- Welelta Shiferaw
- Department of Microbiology, Immunology, and Parasitology, School of Medicine, College of Health Sciences, Addis Ababa University, Addis Ababa, Ethiopia
- Dan J Stein
- Department of Psychiatry and Mental Health, University of Cape Town, Cape Town, South Africa; SA MRC Unit on Risk & Resilience in Mental Disorders, University of Cape Town and Neuroscience Institute, Cape Town, South Africa
- Solomon Teferra
- Department of Psychiatry, School of Medicine, College of Health Sciences, Addis Ababa University, Addis Ababa, Ethiopia
- Celia van der Merwe
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA; Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Department of Psychiatry and Mental Health, University of Cape Town, Cape Town, South Africa
- Zukiswa Zingela
- Department of Psychiatry and Human Behavioral Sciences, Walter Sisulu University, Mthatha, South Africa
|
18
|
Khvorykh G, Khrunin A, Filippenkov I, Stavchansky V, Dergunova L, Limborska S. A Workflow for Selection of Single Nucleotide Polymorphic Markers for Studying of Genetics of Ischemic Stroke Outcomes. Genes (Basel) 2021; 12:328. [PMID: 33668793 PMCID: PMC7996278 DOI: 10.3390/genes12030328] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/07/2021] [Revised: 02/21/2021] [Accepted: 02/21/2021] [Indexed: 11/17/2022] Open
Abstract
In this paper we propose a workflow for studying the genetic architecture of ischemic stroke outcomes. It further develops the candidate gene approach. The workflow is based on the animal model of brain ischemia, comparative genomics, human genomic variations, and algorithms for selecting tagging single nucleotide polymorphisms (tagSNPs) in genes whose expression changed after ischemic stroke. The workflow starts from a set of rat genes that changed their expression in response to brain ischemia and results in a set of tagSNPs, which represent other SNPs in the human genes analyzed and may also influence their expression.
|
19
|
Sun H, Lin M, Russell EM, Minster RL, Chan TF, Dinh BL, Naseri T, Reupena MS, Lum-Jones A, Cheng I, Wilkens LR, Le Marchand L, Haiman CA, Chiang CWK. The impact of global and local Polynesian genetic ancestry on complex traits in Native Hawaiians. PLoS Genet 2021; 17:e1009273. [PMID: 33571193 PMCID: PMC7877570 DOI: 10.1371/journal.pgen.1009273] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 05/12/2020] [Accepted: 11/18/2020] [Indexed: 12/17/2022] Open
Abstract
Epidemiological studies of obesity, Type-2 diabetes (T2D), cardiovascular diseases and several common cancers have revealed an increased risk in Native Hawaiians compared to European- or Asian-Americans living in the Hawaiian islands. However, there remains a gap in our understanding of the genetic factors that affect the health of Native Hawaiians. To fill this gap, we studied the genetic risk factors at both the chromosomal and sub-chromosomal scales using genome-wide SNP array data on ~4,000 Native Hawaiians from the Multiethnic Cohort. We estimated the genomic proportion of Native Hawaiian ancestry ("global ancestry," which we presumed to be Polynesian in origin), as well as this ancestral component along each chromosome ("local ancestry") and tested their respective association with binary and quantitative cardiometabolic traits. After attempting to adjust for non-genetic covariates evaluated through questionnaires, we found that per 10% increase in global Polynesian genetic ancestry, there is a respective 8.6% and 11.0% increase in the odds of being diabetic (P = 1.65 × 10⁻⁴) and having heart failure (P = 2.18 × 10⁻⁴), as well as a 0.059 s.d. increase in BMI (P = 1.04 × 10⁻¹⁰). When testing the association of local Polynesian ancestry with risk of disease or biomarkers, we identified a chr6 region associated with T2D. This association was driven by a uniquely prevalent variant in Polynesian ancestry individuals. However, we could not replicate this finding in an independent Polynesian cohort from Samoa due to the small sample size of the replication cohort. In conclusion, we showed that Polynesian ancestry, which likely captures both genetic and lifestyle risk factors, is associated with an increased risk of obesity, Type-2 diabetes, and heart failure, and that larger cohorts of Polynesian ancestry individuals will be needed to replicate the putative association on chr6 with T2D.
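As a worked illustration of the per-10% odds figures quoted in the abstract above, the sketch below rescales an odds ratio to a different ancestry increment. It assumes that log-odds are linear in ancestry proportion, as in a standard logistic regression; the abstract does not state the model, so this is an assumption. The function name `rescale_odds_ratio` is hypothetical.

```python
def rescale_odds_ratio(odds_ratio, from_step, to_step):
    """Rescale an odds ratio reported per `from_step` of an exposure so that
    it applies per `to_step` of the same exposure.

    Assumes log-odds are linear in the exposure (an assumption made for this
    illustration), so OR_to = OR_from ** (to_step / from_step).
    """
    if odds_ratio <= 0:
        raise ValueError("odds_ratio must be positive")
    if from_step == 0:
        raise ValueError("from_step must be non-zero")
    return odds_ratio ** (to_step / from_step)


# An 8.6% increase in odds per 10% ancestry is an odds ratio of 1.086 per 0.10.
or_per_10_percent = 1.086
or_per_50_percent = rescale_odds_ratio(or_per_10_percent, from_step=0.10, to_step=0.50)
print(f"Implied odds ratio per 50% ancestry: {or_per_50_percent:.2f}")
```

Under this assumption the effect compounds multiplicatively rather than adding 8.6 percentage points per step.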
Affiliation(s)
- Hanxiao Sun
- Center for Genetic Epidemiology, Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
- Meng Lin
- Center for Genetic Epidemiology, Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
- Emily M. Russell
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Ryan L. Minster
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Tsz Fung Chan
- Center for Genetic Epidemiology, Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
- Bryan L. Dinh
- Center for Genetic Epidemiology, Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- Take Naseri
- Ministry of Health, Government of Samoa, Apia, Samoa
- Annette Lum-Jones
- Epidemiology Program, University of Hawai‘i Cancer Center, University of Hawai‘i, Manoa, Honolulu, Hawaii, United States of America
- Iona Cheng
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California, United States of America
- Lynne R. Wilkens
- Epidemiology Program, University of Hawai‘i Cancer Center, University of Hawai‘i, Manoa, Honolulu, Hawaii, United States of America
- Loïc Le Marchand
- Epidemiology Program, University of Hawai‘i Cancer Center, University of Hawai‘i, Manoa, Honolulu, Hawaii, United States of America
- Christopher A. Haiman
- Center for Genetic Epidemiology, Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
- Charleston W. K. Chiang
- Center for Genetic Epidemiology, Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, California, United States of America
|
20
|
Lanata CM, Blazer A, Criswell LA. The Contribution of Genetics and Epigenetics to Our Understanding of Health Disparities in Rheumatic Diseases. Rheum Dis Clin North Am 2020; 47:65-81. [PMID: 34042055 DOI: 10.1016/j.rdc.2020.09.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/17/2022]
Abstract
Socioeconomic determinants of health are associated with worse outcomes in the rheumatic diseases and contribute significantly to health disparities. However, genetic and epigenetic risk factors may affect different populations disproportionately and further exacerbate health disparities. We discuss the contribution of genetics and epigenetics to the health disparities observed in rheumatic diseases. We review concepts of population genetics and natural selection, current genome-wide genetic and epigenetic studies of several autoimmune diseases, and environmental exposures associated with disease risk in different populations. To understand how genomics influences health disparities in the rheumatic diseases, further studies in different populations worldwide are needed.
Affiliation(s)
- Cristina M Lanata
- Russell/Engleman Rheumatology Research Center, University of California, San Francisco, 513 Parnassus Avenue, MSB S865, San Francisco, CA, USA
- Ashira Blazer
- Department of Medicine, Division of Rheumatology, NYU Langone Health, 550 1st Avenue, MSB 606, New York, NY 10029, USA
- Lindsey A Criswell
- Russell/Engleman Rheumatology Research Center, University of California, San Francisco, 513 Parnassus Avenue, MSB S864, San Francisco, CA, USA.
|
21
|
Raffield LM, Iyengar AK, Wang B, Gaynor SM, Spracklen CN, Zhong X, Kowalski MH, Salimi S, Polfus LM, Benjamin EJ, Bis JC, Bowler R, Cade BE, Choi WJ, Comellas AP, Correa A, Cruz P, Doddapaneni H, Durda P, Gogarten SM, Jain D, Kim RW, Kral BG, Lange LA, Larson MG, Laurie C, Lee J, Lee S, Lewis JP, Metcalf GA, Mitchell BD, Momin Z, Muzny DM, Pankratz N, Park CJ, Rich SS, Rotter JI, Ryan K, Seo D, Tracy RP, Viaud-Martinez KA, Yanek LR, Zhao LP, Lin X, Li B, Li Y, Dupuis J, Reiner AP, Mohlke KL, Auer PL. Allelic Heterogeneity at the CRP Locus Identified by Whole-Genome Sequencing in Multi-ancestry Cohorts. Am J Hum Genet 2020; 106:112-120. [PMID: 31883642 PMCID: PMC7042494 DOI: 10.1016/j.ajhg.2019.12.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/04/2019] [Accepted: 12/02/2019] [Indexed: 12/19/2022] Open
Abstract
Whole-genome sequencing (WGS) can improve assessment of low-frequency and rare variants, particularly in non-European populations that have been underrepresented in existing genomic studies. The genetic determinants of C-reactive protein (CRP), a biomarker of chronic inflammation, have been extensively studied, with existing genome-wide association studies (GWASs) conducted in >200,000 individuals of European ancestry. In order to discover novel loci associated with CRP levels, we examined a multi-ancestry population (n = 23,279) with WGS (∼38× coverage) from the Trans-Omics for Precision Medicine (TOPMed) program. We found evidence for eight distinct associations at the CRP locus, including two variants that have not been identified previously (rs11265259 and rs181704186), both of which are non-coding and more common in individuals of African ancestry (∼10% and ∼1% minor allele frequency, respectively, and rare or monomorphic in 1000 Genomes populations of East Asian, South Asian, and European ancestry). We show that the minor (G) allele of rs181704186 is associated with lower CRP levels and decreased transcriptional activity and protein binding in vitro, providing a plausible molecular mechanism for this African ancestry-specific signal. The individuals homozygous for rs181704186-G have a mean CRP level of 0.23 mg/L, in contrast to individuals heterozygous for rs181704186 with mean CRP of 2.97 mg/L and major allele homozygotes with mean CRP of 4.11 mg/L. This study demonstrates the utility of WGS in multi-ethnic populations to drive discovery of complex trait associations of large effect and to identify functional alleles in noncoding regulatory regions.
Affiliation(s)
- Laura M Raffield
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
- Apoorva K Iyengar
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
- Biqi Wang
- Department of Biostatistics, Boston University School of Public Health, Boston, MA 02118, USA
- Sheila M Gaynor
- Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA
- Xue Zhong
- Department of Medicine, Division of Genetic Medicine, Vanderbilt University, Nashville, TN 37232, USA
- Madeline H Kowalski
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599, USA
- Shabnam Salimi
- Department of Epidemiology and Public Health, School of Medicine, University of Maryland, Baltimore, MD 21201, USA
- Linda M Polfus
- Department of Preventive Medicine, Center for Genetic Epidemiology, University of Southern California, Los Angeles, CA 90089, USA
- Emelia J Benjamin
- Department of Medicine, Boston University School of Medicine, Boston, MA 02118, USA; Department of Epidemiology, Boston University School of Public Health, Boston, MA 02118, USA; National Heart, Lung, and Blood Institute's and Boston University's Framingham Heart Study, Framingham, MA 01702, USA
- Joshua C Bis
- Department of Medicine, Cardiovascular Health Research Unit, University of Washington, Seattle, WA 98101, USA
- Russell Bowler
- Department of Medicine, Division of Pulmonary, Critical Care & Sleep Medicine, National Jewish Health, Denver, CO 80206, USA
- Brian E Cade
- Department of Medicine, Division of Sleep and Circadian Disorders, Brigham and Women's Hospital, Boston, MA 02115, USA; Department of Medicine, Division of Sleep Medicine, Harvard Medical School, Boston, MA 02115, USA
- Alejandro P Comellas
- Department of Medicine, Division of Pulmonary and Critical Care, University of Iowa, Iowa City, IA 52242, USA
- Adolfo Correa
- Department of Medicine, University of Mississippi Medical Center, Jackson, MS 39216, USA
- Pedro Cruz
- Illumina Laboratory Services, Illumina Inc., San Diego, CA 92122, USA
- Harsha Doddapaneni
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
- Peter Durda
- Department of Pathology & Laboratory Medicine, Larner College of Medicine, University of Vermont, Burlington, VT 05446, USA
- Deepti Jain
- Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
- Brian G Kral
- GeneSTAR Research Program, Division of General Internal Medicine, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA; Division of Cardiology, Department of Medicine, The Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Leslie A Lange
- Department of Medicine, University of Colorado Denver, Anschutz Medical Campus, Aurora, CO 80045, USA
- Martin G Larson
- Department of Biostatistics, Boston University School of Public Health, Boston, MA 02118, USA; National Heart, Lung, and Blood Institute's and Boston University's Framingham Heart Study, Framingham, MA 01702, USA
- Cecelia Laurie
- Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
- Jiwon Lee
- Department of Medicine, Division of Sleep and Circadian Disorders, Brigham and Women's Hospital, Boston, MA 02115, USA
- Joshua P Lewis
- Department of Medicine, Division of Endocrinology, Diabetes, and Nutrition, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Ginger A Metcalf
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
- Braxton D Mitchell
- Department of Medicine, Division of Endocrinology, Diabetes, and Nutrition, University of Maryland School of Medicine, Baltimore, MD 21201, USA; Geriatrics Research and Education Clinical Center, Baltimore Veterans Administration Medical Center, Baltimore, MD 21201, USA
- Zeineen Momin
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
- Donna M Muzny
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
- Nathan Pankratz
- Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55455, USA
- Stephen S Rich
- Department of Public Health Sciences, Center for Public Health Genomics, University of Virginia, Charlottesville, VA 22908, USA
- Jerome I Rotter
- The Institute for Translational Genomics and Population Sciences, Department of Pediatrics, Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center, Torrance, CA 90502, USA
- Kathleen Ryan
- Department of Medicine, Division of Endocrinology, Diabetes, and Nutrition, University of Maryland School of Medicine, Baltimore, MD 21201, USA
- Russell P Tracy
- Department of Pathology & Laboratory Medicine, Larner College of Medicine, University of Vermont, Burlington, VT 05446, USA; Department of Biochemistry, Larner College of Medicine, University of Vermont, Burlington, VT 05446, USA
| | | | - Lisa R Yanek
- GeneSTAR Research Program, Division of General Internal Medicine, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
| | - Lue Ping Zhao
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA; School of Public Health, University of Washington, Seattle, WA 98195, USA
| | - Xihong Lin
- Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Department of Statistics, Harvard University, Cambridge, MA 02138, USA
| | - Bingshan Li
- Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN 37232, USA
| | - Yun Li
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA; Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599, USA; Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599, USA
| | - Josée Dupuis
- Department of Biostatistics, Boston University School of Public Health, Boston, MA 02118, USA; National Heart, Lung, and Blood Institute's and Boston University's Framingham Heart Study, Framingham, MA 01702, USA
| | - Alexander P Reiner
- Department of Epidemiology, University of Washington, Seattle, WA 98195, USA
| | - Karen L Mohlke
- Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
| | - Paul L Auer
- Joseph J. Zilber School of Public Health, University of Wisconsin Milwaukee, Milwaukee, WI 53205, USA.
22
|
Homburger JR, Neben CL, Mishne G, Zhou AY, Kathiresan S, Khera AV. Low coverage whole genome sequencing enables accurate assessment of common variants and calculation of genome-wide polygenic scores. Genome Med 2019; 11:74. [PMID: 31771638 PMCID: PMC6880438 DOI: 10.1186/s13073-019-0682-2] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Received: 08/02/2019] [Accepted: 11/01/2019] [Indexed: 01/01/2023] Open
Abstract
BACKGROUND: Inherited susceptibility to common, complex diseases may be caused by rare, pathogenic variants ("monogenic") or by the cumulative effect of numerous common variants ("polygenic"). Comprehensive genome interpretation should enable assessment for both monogenic and polygenic components of inherited risk. The traditional approach requires two distinct genetic testing technologies: high coverage sequencing of known genes to detect monogenic variants, and a genome-wide genotyping array followed by imputation to calculate genome-wide polygenic scores (GPSs). We assessed the feasibility and accuracy of using low coverage whole genome sequencing (lcWGS) as an alternative to genotyping arrays to calculate GPSs.
METHODS: First, we performed downsampling and imputation of WGS data from ten individuals to assess concordance with known genotypes. Second, we assessed the correlation between GPSs for 3 common diseases (coronary artery disease [CAD], breast cancer [BC], and atrial fibrillation [AF]) calculated using lcWGS and genotyping array in 184 samples. Third, we assessed concordance of lcWGS-based genotype calls and GPS calculation in 120 individuals with known genotypes, selected to reflect diverse ancestral backgrounds. Fourth, we assessed the relationship between GPSs calculated using lcWGS and disease phenotypes in a cohort of 11,502 individuals of European ancestry.
RESULTS: We found imputation accuracy r² values of greater than 0.90 for all ten samples (including those of African and Ashkenazi Jewish ancestry) with lcWGS data at 0.5×. GPSs calculated using lcWGS and genotyping array followed by imputation in 184 individuals were highly correlated for each of the 3 common diseases (r² = 0.93-0.97) with similar score distributions. Using lcWGS data from 120 individuals of diverse ancestral backgrounds, we found similar results with respect to imputation accuracy and GPS correlations. Finally, we calculated GPSs for CAD, BC, and AF using lcWGS in 11,502 individuals of European ancestry, confirming odds ratios per standard deviation increment ranging from 1.28 to 1.59, consistent with previous studies.
CONCLUSIONS: lcWGS is an alternative technology to genotyping arrays for common genetic variant assessment and GPS calculation. lcWGS provides comparable imputation accuracy while also overcoming the ascertainment bias inherent to variant selection in genotyping array design.
Affiliation(s)
| | - Cynthia L Neben
- Color Genomics, 831 Mitten Road, Suite 100, Burlingame, CA, 94010, USA
| | - Gilad Mishne
- Color Genomics, 831 Mitten Road, Suite 100, Burlingame, CA, 94010, USA
| | - Alicia Y Zhou
- Color Genomics, 831 Mitten Road, Suite 100, Burlingame, CA, 94010, USA
| | | | - Amit V Khera
- Center for Genomic Medicine and Cardiology Division, Department of Medicine, Massachusetts General Hospital, Simches Research Building | CPZN 6.256, Boston, MA, 02114, USA.
- Cardiovascular Disease Initiative, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA.
- Harvard Medical School, Boston, MA, 02115, USA.
23
|
Wojcik GL, Graff M, Nishimura KK, Tao R, Haessler J, Gignoux CR, Highland HM, Patel YM, Sorokin EP, Avery CL, Belbin GM, Bien SA, Cheng I, Cullina S, Hodonsky CJ, Hu Y, Huckins LM, Jeff J, Justice AE, Kocarnik JM, Lim U, Lin BM, Lu Y, Nelson SC, Park SSL, Poisner H, Preuss MH, Richard MA, Schurmann C, Setiawan VW, Sockell A, Vahi K, Verbanck M, Vishnu A, Walker RW, Young KL, Zubair N, Acuña-Alonso V, Ambite JL, Barnes KC, Boerwinkle E, Bottinger EP, Bustamante CD, Caberto C, Canizales-Quinteros S, Conomos MP, Deelman E, Do R, Doheny K, Fernández-Rhodes L, Fornage M, Hailu B, Heiss G, Henn BM, Hindorff LA, Jackson RD, Laurie CA, Laurie CC, Li Y, Lin DY, Moreno-Estrada A, Nadkarni G, Norman PJ, Pooler LC, Reiner AP, Romm J, Sabatti C, Sandoval K, Sheng X, Stahl EA, Stram DO, Thornton TA, Wassel CL, Wilkens LR, Winkler CA, Yoneyama S, Buyske S, Haiman CA, Kooperberg C, Le Marchand L, Loos RJF, Matise TC, North KE, Peters U, Kenny EE, Carlson CS. Genetic analyses of diverse populations improves discovery for complex traits. Nature 2019; 570:514-518. [PMID: 31217584 DOI: 10.1038/s41586-019-1310-4] [Citation(s) in RCA: 534] [Impact Index Per Article: 106.8] [Received: 06/30/2017] [Accepted: 05/15/2019] [Indexed: 12/20/2022]
Abstract
Genome-wide association studies (GWAS) have laid the foundation for investigations into the biology of complex traits, drug development and clinical guidelines. However, the majority of discovery efforts are based on data from populations of European ancestry [1-3]. In light of the differential genetic architecture that is known to exist between populations, bias in representation can exacerbate existing disease and healthcare disparities. Critical variants may be missed if they have a low frequency or are completely absent in European populations, especially as the field shifts its attention towards rare variants, which are more likely to be population-specific [4-10]. Additionally, effect sizes and risk prediction scores derived in one population may not accurately extrapolate to other populations [11,12]. Here we demonstrate the value of diverse, multi-ethnic participants in large-scale genomic studies. The Population Architecture using Genomics and Epidemiology (PAGE) study conducted a GWAS of 26 clinical and behavioural phenotypes in 49,839 non-European individuals. Using strategies tailored for analysis of multi-ethnic and admixed populations, we describe a framework for analysing diverse populations, identify 27 novel loci and 38 secondary signals at known loci, as well as replicate 1,444 GWAS catalogue associations across these traits. Our data show evidence of effect-size heterogeneity across ancestries for published GWAS associations, substantial benefits for fine-mapping using diverse cohorts and insights into clinical implications. In the United States, where minority populations have a disproportionately higher burden of chronic conditions [13], the lack of representation of diverse populations in genetic research will result in inequitable access to precision medicine for those with the highest burden of disease. We strongly advocate for continued, large genome-wide efforts in diverse populations to maximize genetic discovery and reduce health disparities.
Affiliation(s)
- Genevieve L Wojcik
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
| | - Mariaelisa Graff
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Katherine K Nishimura
- Division of Public Health Science, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Ran Tao
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA; Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Jeffrey Haessler
- Division of Public Health Science, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Christopher R Gignoux
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA; Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | - Heather M Highland
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Yesha M Patel
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Elena P Sorokin
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
| | - Christy L Avery
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Gillian M Belbin
- The Center for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA; The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Stephanie A Bien
- Division of Public Health Science, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Iona Cheng
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, USA
| | - Sinead Cullina
- The Center for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA; The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Chani J Hodonsky
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Yao Hu
- Division of Public Health Science, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Laura M Huckins
- Pamela Sklar Division of Psychiatric Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Janina Jeff
- The Center for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA; The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Anne E Justice
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Jonathan M Kocarnik
- Division of Public Health Science, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Unhee Lim
- Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI, USA
| | - Bridget M Lin
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Yingchang Lu
- The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Sarah C Nelson
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Sung-Shim L Park
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Hannah Poisner
- The Center for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA; The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Michael H Preuss
- The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Melissa A Richard
- Brown Foundation Institute for Molecular Medicine, The University of Texas Health Science Center, Houston, TX, USA
| | - Claudia Schurmann
- The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Hasso-Plattner-Institute for Digital Engineering, Digital Health Center, Potsdam, Germany; Hasso-Plattner-Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Veronica W Setiawan
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Alexandra Sockell
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
| | - Karan Vahi
- Information Sciences Institute, University of Southern California, Marina del Rey, CA, USA
| | - Marie Verbanck
- The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Abhishek Vishnu
- The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Ryan W Walker
- The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Kristin L Young
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Niha Zubair
- Division of Public Health Science, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | | | - Jose Luis Ambite
- Information Sciences Institute, University of Southern California, Marina del Rey, CA, USA
| | - Kathleen C Barnes
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | - Eric Boerwinkle
- Human Genetics Center, School of Public Health, The University of Texas Health Science Center, Houston, TX, USA
| | - Erwin P Bottinger
- The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Hasso-Plattner-Institute for Digital Engineering, Digital Health Center, Potsdam, Germany; Hasso-Plattner-Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Carlos D Bustamante
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
| | - Christian Caberto
- Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI, USA
| | | | - Matthew P Conomos
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Ewa Deelman
- Information Sciences Institute, University of Southern California, Marina del Rey, CA, USA
| | - Ron Do
- The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Pamela Sklar Division of Psychiatric Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Kimberly Doheny
- Center for Inherited Disease Research, Johns Hopkins University, Baltimore, MD, USA
| | - Lindsay Fernández-Rhodes
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA; Department of Biobehavioral Health, The Pennsylvania State University, University Park, PA, USA
| | - Myriam Fornage
- Brown Foundation Institute for Molecular Medicine, The University of Texas Health Science Center, Houston, TX, USA
| | - Benyam Hailu
- NIH National Institute on Minority Health and Health Disparities, Bethesda, MD, USA
| | - Gerardo Heiss
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Brenna M Henn
- Department of Anthropology, University of California Davis, Davis, CA, USA
| | | | - Rebecca D Jackson
- Center for Clinical and Translational Science, Ohio State Medical Center, Columbus, OH, USA
| | - Cecelia A Laurie
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Cathy C Laurie
- Department of Biostatistics, University of Washington, Seattle, WA, USA
| | - Yuqing Li
- Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, USA; Cancer Prevention Institute of California, Fremont, CA, USA
| | - Dan-Yu Lin
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | | | - Girish Nadkarni
- The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Paul J Norman
- Colorado Center for Personalized Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | - Loreall C Pooler
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | | | - Jane Romm
- Center for Inherited Disease Research, Johns Hopkins University, Baltimore, MD, USA
| | - Chiara Sabatti
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
| | - Karla Sandoval
- National Laboratory of Genomics for Biodiversity (UGA-LANGEBIO), Irapuato, Mexico
| | - Xin Sheng
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Eli A Stahl
- Pamela Sklar Division of Psychiatric Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Daniel O Stram
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | | | | | - Lynne R Wilkens
- Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI, USA
| | - Cheryl A Winkler
- Basic Science Program, Frederick National Laboratory, Frederick, MD, USA
| | - Sachi Yoneyama
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Steven Buyske
- Department of Statistics, Rutgers University, New Brunswick, NJ, USA
| | - Christopher A Haiman
- Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| | - Charles Kooperberg
- Division of Public Health Science, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Loic Le Marchand
- Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI, USA
| | - Ruth J F Loos
- The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Pamela Sklar Division of Psychiatric Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Tara C Matise
- Department of Genetics, Rutgers University, New Brunswick, NJ, USA
| | - Kari E North
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Ulrike Peters
- Division of Public Health Science, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
| | - Eimear E Kenny
- The Center for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA; The Charles Bronfman Institute of Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Pamela Sklar Division of Psychiatric Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
| | - Christopher S Carlson
- Division of Public Health Science, Fred Hutchinson Cancer Research Center, Seattle, WA, USA.
24
|
Bien SA, Wojcik GL, Hodonsky CJ, Gignoux CR, Cheng I, Matise TC, Peters U, Kenny EE, North KE. The Future of Genomic Studies Must Be Globally Representative: Perspectives from PAGE. Annu Rev Genomics Hum Genet 2019; 20:181-200. [PMID: 30978304 DOI: 10.1146/annurev-genom-091416-035517] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Indexed: 12/12/2022]
Abstract
The past decade has seen a technological revolution in human genetics that has empowered population-level investigations into genetic associations with phenotypes. Although these discoveries rely on genetic variation across individuals, association studies have overwhelmingly been performed in populations of European descent. In this review, we describe limitations faced by single-population studies and provide an overview of strategies to improve global representation in existing data sets and future human genomics research via diversity-focused, multiethnic studies. We highlight the successes of individual studies and meta-analysis consortia that have provided unique knowledge. Additionally, we outline the approach taken by the Population Architecture Using Genomics and Epidemiology (PAGE) study to develop best practices for performing genetic epidemiology in multiethnic contexts. Finally, we discuss how limiting investigations to single populations impairs findings in the clinical domain for both rare-variant identification and genetic risk prediction.
Affiliation(s)
- Stephanie A Bien
- Department of Public Health Science, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA
| | - Genevieve L Wojcik
- Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, California 94305, USA
| | - Chani J Hodonsky
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
| | - Christopher R Gignoux
- Colorado Center for Personalized Medicine, Anschutz Medical Campus, University of Colorado, Aurora, Colorado 80045, USA
| | - Iona Cheng
- Department of Epidemiology and Biostatistics, University of California, San Francisco, California 94158, USA
| | - Tara C Matise
- Department of Genetics, Rutgers University, New Brunswick, New Jersey 08554, USA
| | - Ulrike Peters
- Department of Public Health Science, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109, USA
| | - Eimear E Kenny
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Kari E North
- Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
25
|
Martin AR, Teferra S, Möller M, Hoal EG, Daly MJ. The critical needs and challenges for genetic architecture studies in Africa. Curr Opin Genet Dev 2018; 53:113-120. [PMID: 30240950 PMCID: PMC6494470 DOI: 10.1016/j.gde.2018.08.005] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 05/08/2018] [Revised: 08/17/2018] [Accepted: 08/31/2018] [Indexed: 12/11/2022]
Abstract
Human genetic studies have long been vastly Eurocentric, raising a key question about the generalizability of these study findings to other populations. Because humans originated in Africa, these populations retain more genetic diversity, and yet individuals of African descent have been tremendously underrepresented in genetic studies. The diversity in Africa affords ample opportunities to improve fine-mapping resolution for associated loci, discover novel genetic associations with phenotypes, build more generalizable genetic risk prediction models, and better understand the genetic architecture of complex traits and diseases subject to varying environmental pressures. Thus, it is both ethically and scientifically imperative that geneticists globally surmount challenges that have limited progress in African genetic studies to date. Additionally, African investigators need to be meaningfully included, as greater inclusivity and enhanced research capacity afford enormous opportunities to accelerate genomic discoveries that translate more effectively to all populations. We review the advantages, challenges, and examples of genetic architecture studies of complex traits and diseases in Africa. For example, with greater genetic diversity comes greater ancestral heterogeneity; this higher level of understudied diversity can yield novel genetic findings, but some methods that assume homogeneous population structure and work well in European populations may work less well in the presence of greater heterogeneity in African populations. Consequently, we advocate for methodological development that will accelerate studies important for all populations, especially those currently underrepresented in genetics.
Affiliation(s)
- Alicia R Martin
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA.
| | - Solomon Teferra
- Department of Psychiatry, School of Medicine, College of Health Sciences, Addis Ababa University, Addis Ababa, Ethiopia; Department of Epidemiology, Harvard T. H. Chan School of Public Health, Harvard University, Boston, USA
| | - Marlo Möller
- DST-NRF Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Tygerberg, Cape Town, South Africa
| | - Eileen G Hoal
- DST-NRF Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Tygerberg, Cape Town, South Africa
| | - Mark J Daly
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
26
|
Vergara C, Parker MM, Franco L, Cho MH, Valencia-Duarte AV, Beaty TH, Duggal P. Genotype imputation performance of three reference panels using African ancestry individuals. Hum Genet 2018; 137:281-292. [PMID: 29637265 PMCID: PMC6209094 DOI: 10.1007/s00439-018-1881-4] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 01/10/2018] [Accepted: 03/31/2018] [Indexed: 12/22/2022]
Abstract
Genotype imputation estimates unobserved genotypes from genome-wide markers, to increase genome coverage and power for genome-wide association studies. Imputation has been successful for European ancestry populations in which very large reference panels are available. Smaller subsets of African descent populations are available in 1000 Genomes (1000G), the Consortium on Asthma among African ancestry Populations in the Americas (CAAPA) and the Haplotype Reference Consortium (HRC). We compared the performance of these reference panels when imputing variation in 3747 African Americans (AA) from two cohorts (HCV and COPDGene) genotyped using Illumina Omni microarrays. The haplotypes of 2504 (1000G), 883 (CAAPA) and 32,470 individuals (HRC) were used as reference. We compared the number of variants, imputation quality, imputation accuracy and coverage between panels. In both cohorts, 1000G imputed 1.5-1.6× more variants than CAAPA and 1.2× more than HRC. Similar findings were observed for variants with imputation R² > 0.5 and for rare, low-frequency, and common variants. When merging imputed variants of the three panels, the total number was 62-63 M with 20 M overlapping variants imputed by all three panels, and a range of 5-15 M variants imputed exclusively with one of them. For overlapping variants, imputation quality was highest for HRC, followed by 1000G, then CAAPA, and improved as the minor allele frequency increased. 1000G, HRC and CAAPA provided high performance and accuracy for imputation of African American individuals, increasing the number of variants available for subsequent analyses. These panels are complementary and would benefit from the development of an integrated African reference panel.
Affiliation(s)
| | - Margaret M Parker
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | - Liliana Franco
- National School of Public Health, Universidad de Antioquia, Medellín, Colombia
- School of Medicine, Universidad Pontificia Bolivariana, Medellín, Colombia
| | - Michael H Cho
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | | | - Terri H Beaty
- Johns Hopkins University, Bloomberg School of Public Health, Baltimore, MD, USA
| | - Priya Duggal
- Johns Hopkins University, Bloomberg School of Public Health, Baltimore, MD, USA.