1
Rocheleau G, Clarke SL, Auguste G, Hasbani NR, Morrison AC, Heath AS, Bielak LF, Iyer KR, Young EP, Stitziel NO, Jun G, Laurie C, Broome JG, Khan AT, Arnett DK, Becker LC, Bis JC, Boerwinkle E, Bowden DW, Carson AP, Ellinor PT, Fornage M, Franceschini N, Freedman BI, Heard-Costa NL, Hou L, Chen YDI, Kenny EE, Kooperberg C, Kral BG, Loos RJF, Lutz SM, Manson JE, Martin LW, Mitchell BD, Nassir R, Palmer ND, Post WS, Preuss MH, Psaty BM, Raffield LM, Regan EA, Rich SS, Smith JA, Taylor KD, Yanek LR, Young KA, Hilliard AT, Tcheandjieu C, Peyser PA, Vasan RS, Rotter JI, Miller CL, Assimes TL, de Vries PS, Do R. Rare variant contribution to the heritability of coronary artery disease. Nat Commun 2024; 15:8741. [PMID: 39384761 PMCID: PMC11464707 DOI: 10.1038/s41467-024-52939-6] [Received: 02/07/2024] [Accepted: 09/26/2024] [Indexed: 10/11/2024]
Abstract
Whole genome sequences (WGS) enable discovery of rare variants which may contribute to missing heritability of coronary artery disease (CAD). To measure their contribution, we apply the GREML-LDMS-I approach to WGS of 4949 cases and 17,494 controls of European ancestry from the NHLBI TOPMed program. We estimate CAD heritability at 34.3% assuming a prevalence of 8.2%. Ultra-rare (minor allele frequency ≤ 0.1%) variants with low linkage disequilibrium (LD) score contribute ~50% of the heritability. We also investigate CAD heritability enrichment using a diverse set of functional annotations: i) constraint; ii) predicted protein-altering impact; iii) cis-regulatory elements from a cell-specific chromatin atlas of the human coronary; and iv) annotation principal components representing a wide range of functional processes. We observe marked enrichment of CAD heritability for most functional annotations. These results reveal the predominant role of ultra-rare variants in low LD on the heritability of CAD. Moreover, they highlight several functional processes including cell type-specific regulatory mechanisms as key drivers of CAD genetic risk.
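The abstract reports heritability "assuming a prevalence of 8.2%" but does not spell out how the prevalence enters the estimate. A common convention for case-control data is to convert an observed-scale estimate to the liability scale with the standard liability-threshold transformation. The sketch below illustrates that conversion only. It is not the authors' pipeline, the function name is hypothetical, and the observed-scale input value is made up for illustration.

```python
from statistics import NormalDist


def liability_scale_h2(h2_observed, prevalence, case_fraction):
    """Convert observed-scale (0/1 phenotype) heritability to the liability scale.

    h2_liab = h2_obs * [K(1-K)]^2 / (z^2 * P(1-P)), where K is the population
    prevalence, P the fraction of cases in the sample, and z the standard
    normal density at the liability threshold.
    """
    normal = NormalDist()
    threshold = normal.inv_cdf(1.0 - prevalence)  # liability threshold for disease
    z = normal.pdf(threshold)                     # normal density at the threshold
    k, p = prevalence, case_fraction
    return h2_observed * (k * (1.0 - k)) ** 2 / (z ** 2 * p * (1.0 - p))


# Sample sizes and prevalence come from the abstract; h2_observed is a
# hypothetical input, not a value reported by the study.
cases, controls = 4949, 17494
case_fraction = cases / (cases + controls)
example = liability_scale_h2(h2_observed=0.20, prevalence=0.082,
                             case_fraction=case_fraction)
print(f"case fraction = {case_fraction:.3f}, liability-scale h2 = {example:.3f}")
```

The conversion depends on both the assumed prevalence and the case fraction in the sample, so a stated heritability is only interpretable alongside the prevalence it assumes.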
Affiliation(s)
- Ghislain Rocheleau
  - The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Center for Genomic Data Analytics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Shoa L Clarke
  - Department of Medicine, Stanford Prevention Research Center, Stanford University School of Medicine, Stanford, CA, USA
  - Department of Medicine, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Gaëlle Auguste
  - Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
- Natalie R Hasbani
  - Department of Epidemiology, Human Genetics, and Environmental Sciences, Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Alanna C Morrison
  - Department of Epidemiology, Human Genetics, and Environmental Sciences, Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Adam S Heath
  - Department of Epidemiology, Human Genetics, and Environmental Sciences, Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Lawrence F Bielak
  - Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Kruthika R Iyer
  - Department of Medicine, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Erica P Young
  - Department of Medicine, Division of Cardiology, Washington University School of Medicine, Saint Louis, MO, USA
  - McDonnell Genome Institute, Washington University School of Medicine, Saint Louis, MO, USA
- Nathan O Stitziel
  - Department of Medicine, Division of Cardiology, Washington University School of Medicine, Saint Louis, MO, USA
  - McDonnell Genome Institute, Washington University School of Medicine, Saint Louis, MO, USA
  - Department of Genetics, Washington University School of Medicine, Saint Louis, MO, USA
- Goo Jun
  - Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Cecelia Laurie
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
- Jai G Broome
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
- Alyna T Khan
  - Department of Biostatistics, University of Washington, Seattle, WA, USA
- Donna K Arnett
  - College of Public Health, University of Kentucky, Lexington, KY, USA
- Lewis C Becker
  - GeneSTAR Research Program, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Joshua C Bis
  - Department of Medicine, Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA
- Eric Boerwinkle
  - Department of Epidemiology, Human Genetics, and Environmental Sciences, Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Donald W Bowden
  - Department of Biochemistry, Wake Forest University School of Medicine, Winston-Salem, NC, USA
- April P Carson
  - Department of Medicine, University of Mississippi Medical Center, Jackson, MS, USA
- Patrick T Ellinor
  - Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, USA
  - Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Boston, MA, USA
  - Demoulas Center for Cardiac Arrhythmias, Massachusetts General Hospital, Boston, MA, USA
- Myriam Fornage
  - Department of Epidemiology, Human Genetics, and Environmental Sciences, Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Nora Franceschini
  - Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina, Chapel Hill, NC, USA
- Barry I Freedman
  - Department of Internal Medicine, Section on Nephrology, Wake Forest University School of Medicine, Winston-Salem, NC, USA
- Nancy L Heard-Costa
  - National Heart, Lung, and Blood Institute and Boston University's Framingham Heart Study, Framingham, MA, USA
  - Department of Neurology, Boston University Chobanian & Avedisian School of Medicine, Boston, MA, USA
- Lifang Hou
  - Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Yii-Der Ida Chen
  - Department of Pediatrics, The Institute for Translational Genomics and Population Sciences, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
- Eimear E Kenny
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Institute for Genomic Health, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Charles Kooperberg
  - Division of Public Health Sciences, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Brian G Kral
  - GeneSTAR Research Program, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Ruth J F Loos
  - The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Science, University of Copenhagen, Copenhagen, Denmark
- Sharon M Lutz
  - Department of Population Medicine, Harvard Pilgrim Health Care, Boston, MA, USA
- JoAnn E Manson
  - Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Lisa W Martin
  - School of Medicine and Health Sciences, George Washington University, Washington, DC, USA
- Braxton D Mitchell
  - Department of Medicine, University of Maryland School of Medicine, Baltimore, MD, USA
- Rami Nassir
  - Department of Pathology, School of Medicine, Umm Al-Qura University, Mecca, Saudi Arabia
- Nicholette D Palmer
  - Department of Biochemistry, Wake Forest University School of Medicine, Winston-Salem, NC, USA
- Wendy S Post
  - Johns Hopkins Bloomberg School of Public Health, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Michael H Preuss
  - The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Bruce M Psaty
  - Department of Medicine, Cardiovascular Health Research Unit, University of Washington, Seattle, WA, USA
  - Department of Epidemiology, University of Washington, Seattle, WA, USA
  - Department of Health Systems and Population Health, University of Washington, Seattle, WA, USA
- Laura M Raffield
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Elizabeth A Regan
  - Department of Medicine, Division of Rheumatology, National Jewish Health, Denver, CO, USA
- Stephen S Rich
  - Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
- Jennifer A Smith
  - Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
  - Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI, USA
- Kent D Taylor
  - Department of Pediatrics, The Institute for Translational Genomics and Population Sciences, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
- Lisa R Yanek
  - GeneSTAR Research Program, Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Kendra A Young
  - Department of Epidemiology, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Catherine Tcheandjieu
  - Department of Medicine, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA
  - VA Palo Alto Health Care System, Palo Alto, CA, USA
  - Gladstone Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA
  - Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, USA
- Patricia A Peyser
  - Department of Epidemiology, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Ramachandran S Vasan
  - National Heart, Lung, and Blood Institute and Boston University's Framingham Heart Study, Framingham, MA, USA
  - Department of Medicine, Boston University School of Medicine, Boston, MA, USA
  - School of Public Health, University of Texas, San Antonio, TX, USA
- Jerome I Rotter
  - Department of Pediatrics, The Institute for Translational Genomics and Population Sciences, The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
- Clint L Miller
  - Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA
  - Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA, USA
  - Department of Public Health Sciences, University of Virginia, Charlottesville, VA, USA
- Themistocles L Assimes
  - Department of Medicine, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, CA, USA
  - VA Palo Alto Health Care System, Palo Alto, CA, USA
  - Department of Epidemiology and Population Health, Stanford University School of Medicine, Stanford, CA, USA
- Paul S de Vries
  - Department of Epidemiology, Human Genetics, and Environmental Sciences, Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Ron Do
  - The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Center for Genomic Data Analytics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
2
TANISAWA KUMPEI, TABATA HIROKI, NAKAMURA NOBUHIRO, KAWAKAMI RYOKO, USUI CHIYOKO, ITO TOMOKO, KAWAMURA TAKUJI, TORII SUGURU, ISHII KAORI, MURAOKA ISAO, SUZUKI KATSUHIKO, SAKAMOTO SHIZUO, HIGUCHI MITSURU, OKA KOICHIRO. Polygenic Risk Score, Cardiorespiratory Fitness, and Cardiometabolic Risk Factors: WASEDA'S Health Study. Med Sci Sports Exerc 2024; 56:2026-2038. [PMID: 38768052 PMCID: PMC11419280 DOI: 10.1249/mss.0000000000003477] [Indexed: 05/22/2024]
Abstract
PURPOSE: This study estimated an individual's genetic liability to cardiometabolic risk factors by polygenic risk score (PRS) construction and examined whether high cardiorespiratory fitness (CRF) modifies the association between PRS and cardiometabolic risk factors.
METHODS: This cross-sectional study enrolled 1296 Japanese adults aged ≥40 yr. The PRS for each cardiometabolic trait (blood lipids, glucose, hypertension, and obesity) was calculated using the LDpred2 and clumping and thresholding methods. Participants were divided into low-, intermediate-, and high-PRS groups according to PRS tertiles for each trait. CRF was quantified as peak oxygen uptake (V̇O2peak) per kilogram body weight. Participants were divided into low-, intermediate-, and high-CRF groups according to the tertile V̇O2peak value.
RESULTS: Linear regression analysis revealed a significant interaction between PRS for triglyceride (PRS_TG) and CRF groups on serum TG levels regardless of the PRS calculation method, and the association between PRS_TG and TG levels was attenuated in the high-CRF group. Logistic regression analysis revealed a significant sub-additive interaction between LDpred2 PRS_TG and CRF on the prevalence of high TG, indicating that high CRF attenuated the genetic predisposition to high TG. Furthermore, a significant sub-additive interaction between PRS for body mass index and CRF on obesity was detected regardless of the PRS calculation method. These significant interaction effects on high TG and obesity were diminished in the sensitivity analysis using V̇O2peak per kilogram fat-free mass as the CRF index. Effects of PRSs for other cardiometabolic traits were not significantly attenuated in the high-CRF group regardless of PRS calculation methods.
CONCLUSIONS: The findings of the present study suggest that individuals with high CRF overcome the genetic predisposition to high TG levels and obesity.
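The grouping step described in the methods can be illustrated with a minimal sketch. It assumes a thresholding-only score (a weighted sum of allele dosages over variants passing a p-value cutoff) followed by a tertile split. The clumping (LD pruning) step and LDpred2 are deliberately not shown, and all function names, effect sizes, p-values, and dosages below are hypothetical toy values rather than study data.

```python
from statistics import quantiles


def thresholded_prs(dosages, betas, pvalues, p_threshold):
    """Weighted sum of allele dosages over variants with p-value <= p_threshold."""
    return sum(beta * dosage
               for dosage, beta, p in zip(dosages, betas, pvalues)
               if p <= p_threshold)


def tertile_groups(scores):
    """Label each score as low, intermediate, or high using tertile cut points."""
    lower_cut, upper_cut = quantiles(scores, n=3)  # two cut points split into thirds
    labels = []
    for score in scores:
        if score <= lower_cut:
            labels.append("low")
        elif score > upper_cut:
            labels.append("high")
        else:
            labels.append("intermediate")
    return labels


# Hypothetical toy inputs: three variants and six participants.
betas = [0.10, 0.20, 0.30]
pvalues = [0.50, 0.01, 0.001]
dosages_by_person = [[0, 1, 2], [2, 0, 0], [1, 1, 1], [0, 2, 2], [2, 1, 0], [1, 0, 1]]

scores = [thresholded_prs(d, betas, pvalues, p_threshold=0.05) for d in dosages_by_person]
for score, label in zip(scores, tertile_groups(scores)):
    print(f"PRS = {score:.2f} -> {label}")
```

The same tertile helper could be applied to a fitness measure to form the CRF groups, after which the interaction between the two groupings would be tested in a regression model.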
Affiliation(s)
- KUMPEI TANISAWA
  - Faculty of Sport Sciences, Waseda University, Tokorozawa, Saitama, JAPAN
- HIROKI TABATA
  - Sportology Center, Juntendo University Graduate School of Medicine, Bunkyo-ku, Tokyo, JAPAN
  - Waseda Institute for Sport Sciences, Tokorozawa, Saitama, JAPAN
- NOBUHIRO NAKAMURA
  - Faculty of Sport Sciences, Waseda University, Tokorozawa, Saitama, JAPAN
- RYOKO KAWAKAMI
  - Waseda Institute for Sport Sciences, Tokorozawa, Saitama, JAPAN
  - Physical Fitness Research Institute, Meiji Yasuda Life Foundation of Health and Welfare, Hachioji, Tokyo, JAPAN
- CHIYOKO USUI
  - Waseda Institute for Sport Sciences, Tokorozawa, Saitama, JAPAN
  - Center for Liberal Education and Learning, Sophia University, Chiyoda-ku, Tokyo, JAPAN
- TOMOKO ITO
  - Waseda Institute for Sport Sciences, Tokorozawa, Saitama, JAPAN
  - Department of Food and Nutrition, Tokyo Kasei University, Itabashi-ku, Tokyo, JAPAN
- TAKUJI KAWAMURA
  - Waseda Institute for Sport Sciences, Tokorozawa, Saitama, JAPAN
  - Research Center for Molecular Exercise Science, Hungarian University of Sports Science, Budapest, HUNGARY
- SUGURU TORII
  - Faculty of Sport Sciences, Waseda University, Tokorozawa, Saitama, JAPAN
- KAORI ISHII
  - Faculty of Sport Sciences, Waseda University, Tokorozawa, Saitama, JAPAN
- ISAO MURAOKA
  - Faculty of Sport Sciences, Waseda University, Tokorozawa, Saitama, JAPAN
- KATSUHIKO SUZUKI
  - Faculty of Sport Sciences, Waseda University, Tokorozawa, Saitama, JAPAN
- SHIZUO SAKAMOTO
  - Faculty of Sport Sciences, Waseda University, Tokorozawa, Saitama, JAPAN
  - Faculty of Sport Science, Surugadai University, Hanno, Saitama, JAPAN
- MITSURU HIGUCHI
  - Faculty of Sport Sciences, Waseda University, Tokorozawa, Saitama, JAPAN
- KOICHIRO OKA
  - Faculty of Sport Sciences, Waseda University, Tokorozawa, Saitama, JAPAN
3
Capalbo A, de Wert G, Mertes H, Klausner L, Coonen E, Spinella F, Van de Velde H, Viville S, Sermon K, Vermeulen N, Lencz T, Carmi S. Screening embryos for polygenic disease risk: a review of epidemiological, clinical, and ethical considerations. Hum Reprod Update 2024; 30:529-557. [PMID: 38805697 PMCID: PMC11369226 DOI: 10.1093/humupd/dmae012] [Received: 01/10/2024] [Revised: 03/25/2024] [Indexed: 05/30/2024]
Abstract
BACKGROUND: The genetic composition of embryos generated by in vitro fertilization (IVF) can be examined with preimplantation genetic testing (PGT). Until recently, PGT was limited to detecting single-gene, high-risk pathogenic variants, large structural variants, and aneuploidy. Recent advances have made genome-wide genotyping of IVF embryos feasible and affordable, raising the possibility of screening embryos for their risk of polygenic diseases such as breast cancer, hypertension, diabetes, or schizophrenia. Despite a heated debate around this new technology, called polygenic embryo screening (PES; also PGT-P), it is already available to IVF patients in some countries. Several articles have studied epidemiological, clinical, and ethical perspectives on PES; however, a comprehensive, principled review of this emerging field is missing.
OBJECTIVE AND RATIONALE: This review has four main goals. First, given the interdisciplinary nature of PES studies, we aim to provide a self-contained educational background about PES to reproductive specialists interested in the subject. Second, we provide a comprehensive and critical review of arguments for and against the introduction of PES, crystallizing and prioritizing the key issues. We also cover the attitudes of IVF patients, clinicians, and the public towards PES. Third, we distinguish between possible future groups of PES patients, highlighting the benefits and harms pertaining to each group. Finally, our review, which is supported by ESHRE, is intended to aid healthcare professionals and policymakers in decision-making regarding whether to introduce PES in the clinic, and if so, how, and to whom.
SEARCH METHODS: We searched for PubMed-indexed articles published between 1/1/2003 and 1/3/2024 using the terms 'polygenic embryo screening', 'polygenic preimplantation', and 'PGT-P'. We limited the review to primary research papers in English whose main focus was PES for medical conditions. We also included papers that did not appear in the search but were deemed relevant.
OUTCOMES: The main theoretical benefit of PES is a reduction in lifetime polygenic disease risk for children born after screening. The magnitude of the risk reduction has been predicted based on statistical modelling, simulations, and sibling pair analyses. Results based on all methods suggest that under the best-case scenario, large relative risk reductions are possible for one or more diseases. However, as these models abstract several practical limitations, the realized benefits may be smaller, particularly due to a limited number of embryos and unclear future accuracy of the risk estimates. PES may negatively impact patients and their future children, as well as society. The main personal harms are an unindicated IVF treatment, a possible reduction in IVF success rates, and patient confusion, incomplete counselling, and choice overload. The main possible societal harms include discarded embryos, an increasing demand for 'designer babies', overemphasis of the genetic determinants of disease, unequal access, and lower utility in people of non-European ancestries. Benefits and harms will vary across the main potential patient groups, comprising patients already requiring IVF, fertile people with a history of a severe polygenic disease, and fertile healthy people. In the United States, the attitudes of IVF patients and the public towards PES seem positive, while healthcare professionals are cautious, sceptical about clinical utility, and concerned about patient counselling.
WIDER IMPLICATIONS: The theoretical potential of PES to reduce risk across multiple polygenic diseases requires further research into its benefits and harms. Given the large number of practical limitations and possible harms, particularly unnecessary IVF treatments and discarded viable embryos, PES should be offered only within a research context before further clarity is achieved regarding its balance of benefits and harms. The gap in attitudes between healthcare professionals and the public needs to be narrowed by expanding public and patient education and providing resources for informative and unbiased genetic counselling.
Affiliation(s)
- Antonio Capalbo
  - Juno Genetics, Department of Reproductive Genetics, Rome, Italy
  - Center for Advanced Studies and Technology (CAST), Department of Medical Genetics, “G. d’Annunzio” University of Chieti-Pescara, Chieti, Italy
- Guido de Wert
  - Department of Health, Ethics & Society, CAPHRI-School for Public Health and Primary Care and GROW School for Oncology and Reproduction, Maastricht University, Maastricht, The Netherlands
- Heidi Mertes
  - Department of Philosophy and Moral Sciences, Ghent University, Ghent, Belgium
  - Department of Public Health and Primary Care, Ghent University, Ghent, Belgium
- Liraz Klausner
  - Braun School of Public Health and Community Medicine, The Hebrew University of Jerusalem, Jerusalem, Israel
- Edith Coonen
  - Departments of Clinical Genetics and Reproductive Medicine, Maastricht University Medical Centre, Maastricht, The Netherlands
  - School for Oncology and Developmental Biology, GROW, Maastricht University, Maastricht, The Netherlands
- Francesca Spinella
  - Eurofins GENOMA Group Srl, Molecular Genetics Laboratories, Department of Scientific Communication, Rome, Italy
- Hilde Van de Velde
  - Research Group Genetics Reproduction and Development (GRAD), Vrije Universiteit Brussel, Brussel, Belgium
  - Brussels IVF, UZ Brussel, Brussel, Belgium
- Stephane Viville
  - Laboratoire de Génétique Médicale LGM, Institut de Génétique Médicale d’Alsace IGMA, INSERM UMR 1112, Université de Strasbourg, France
  - Laboratoire de Diagnostic Génétique, Unité de Génétique de l’infertilité (UF3472), Hôpitaux Universitaires de Strasbourg, Strasbourg, France
- Karen Sermon
  - Research Group Genetics Reproduction and Development (GRAD), Vrije Universiteit Brussel, Brussel, Belgium
- Todd Lencz
  - Institute of Behavioral Science, Feinstein Institutes for Medical Research, Manhasset, NY, USA
  - Departments of Psychiatry and Molecular Medicine, Zucker School of Medicine at Hofstra/Northwell, Hempstead, NY 11549, USA
- Shai Carmi
  - Braun School of Public Health and Community Medicine, The Hebrew University of Jerusalem, Jerusalem, Israel
4
Akamatsu K, Golzari S, Amariuta T. Powerful mapping of cis-genetic effects on gene expression across diverse populations reveals novel disease-critical genes. medRxiv 2024:2024.09.25.24314410. [PMID: 39399015 PMCID: PMC11469471 DOI: 10.1101/2024.09.25.24314410] [Indexed: 10/15/2024]
Abstract
While disease-associated variants identified by genome-wide association studies (GWAS) most likely regulate gene expression levels, linking variants to target genes is critical to determining the functional mechanisms of these variants. Genetic effects on gene expression have been extensively characterized by expression quantitative trait loci (eQTL) studies, yet data from non-European populations is limited. This restricts our understanding of disease to genes whose regulatory variants are common in European populations. While previous work has leveraged data from multiple populations to improve GWAS power and polygenic risk score (PRS) accuracy, multi-ancestry data has not yet been used to better estimate cis-genetic effects on gene expression. Here, we present a new method, Multi-Ancestry Gene Expression Prediction Regularized Optimization (MAGEPRO), which constructs robust genetic models of gene expression in understudied populations or cell types by fitting a regularized linear combination of eQTL summary data across diverse cohorts. In simulations, our tool generates more accurate models of gene expression than widely-used LASSO and the state-of-the-art multi-ancestry PRS method, PRS-CSx, adapted to gene expression prediction. We attribute this improvement to MAGEPRO's ability to more accurately estimate causal eQTL effect sizes (p < 3.98 × 10^-4, two-sided paired t-test). With real data, we applied MAGEPRO to 8 eQTL cohorts representing 3 ancestries (average n = 355) and consistently outperformed each of 6 competing methods in gene expression prediction tasks. Integration with GWAS summary statistics across 66 complex traits (representing 22 phenotypes and 3 ancestries) resulted in 2,331 new gene-trait associations, many of which replicate across multiple ancestries, including PHTF1 linked to white blood cell count, a gene which is overexpressed in leukemia patients. MAGEPRO also identified biologically plausible novel findings, such as PIGB, an essential component of GPI biosynthesis, associated with heart failure, which has been previously evidenced by clinical outcome data. Overall, MAGEPRO is a powerful tool to enhance inference of gene regulatory effects in underpowered datasets and has improved our understanding of population-specific and shared genetic effects on complex traits.
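The abstract describes the model only as "a regularized linear combination of eQTL summary data across diverse cohorts". As a rough illustration of that idea, and not MAGEPRO's actual implementation, the sketch below assumes a ridge penalty: each external cohort contributes one feature (expression predicted in the target samples from that cohort's effect sizes), and the mixing weights are fit against measured expression in the target cohort. The function name and all numbers are hypothetical.

```python
def ridge_combine(cohort_predictions, expression, penalty):
    """Fit weights alpha minimizing ||y - sum_k alpha_k x_k||^2 + penalty * ||alpha||^2.

    cohort_predictions: list of K lists, each holding one cohort's predicted
    expression for the n target samples. Solves (X^T X + penalty * I) alpha = X^T y.
    """
    k = len(cohort_predictions)
    a = [[sum(u * v for u, v in zip(cohort_predictions[i], cohort_predictions[j]))
          + (penalty if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    b = [sum(u * y for u, y in zip(cohort_predictions[i], expression)) for i in range(k)]

    # Gaussian elimination with partial pivoting on the K x K system.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, k):
            factor = a[row][col] / a[col][col]
            for c in range(col, k):
                a[row][c] -= factor * a[col][c]
            b[row] -= factor * b[col]

    # Back substitution.
    alpha = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(a[i][j] * alpha[j] for j in range(i + 1, k))
        alpha[i] = (b[i] - tail) / a[i][i]
    return alpha


# Hypothetical toy data: two external cohorts, four target samples.
predictions = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
measured = [2.0, 3.0, 2.0, 3.0]
print(ridge_combine(predictions, measured, penalty=2.0))
```

Increasing the penalty shrinks the weights toward zero, which is the usual motivation for regularization when the target cohort is small.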
Affiliation(s)
- Kai Akamatsu
  - School of Biological Sciences, UC San Diego, La Jolla, CA, USA
  - Department of Medicine, Division of Biomedical Informatics, UC San Diego, La Jolla, CA, USA
  - Halıcıoğlu Data Science Institute, UC San Diego, La Jolla, CA, USA
- Stephen Golzari
  - Department of Medicine, Division of Biomedical Informatics, UC San Diego, La Jolla, CA, USA
  - Halıcıoğlu Data Science Institute, UC San Diego, La Jolla, CA, USA
  - Shu Chien-Gene Lay Department of Bioengineering, UC San Diego, La Jolla, CA, USA
- Tiffany Amariuta
  - Department of Medicine, Division of Biomedical Informatics, UC San Diego, La Jolla, CA, USA
  - Halıcıoğlu Data Science Institute, UC San Diego, La Jolla, CA, USA
5
Akbari A, Barton AR, Gazal S, Li Z, Kariminejad M, Perry A, Zeng Y, Mittnik A, Patterson N, Mah M, Zhou X, Price AL, Lander ES, Pinhasi R, Rohland N, Mallick S, Reich D. Pervasive findings of directional selection realize the promise of ancient DNA to elucidate human adaptation. bioRxiv 2024:2024.09.14.613021. [PMID: 39314480 PMCID: PMC11419161 DOI: 10.1101/2024.09.14.613021] [Indexed: 09/25/2024]
Abstract
We present a method for detecting evidence of natural selection in ancient DNA time-series data that leverages an opportunity not utilized in previous scans: testing for a consistent trend in allele frequency change over time. By applying this to 8433 West Eurasians who lived over the past 14000 years and 6510 contemporary people, we find an order of magnitude more genome-wide significant signals than previous studies: 347 independent loci with >99% probability of selection. Previous work showed that classic hard sweeps driving advantageous mutations to fixation have been rare over the broad span of human evolution, but in the last ten millennia, many hundreds of alleles have been affected by strong directional selection. Discoveries include an increase from ~0% to ~20% in 4000 years for the major risk factor for celiac disease at HLA-DQB1; a rise from ~0% to ~8% in 6000 years of blood type B; and fluctuating selection at the TYK2 tuberculosis risk allele rising from ~2% to ~9% from ~5500 to ~3000 years ago before dropping to ~3%. We identify instances of coordinated selection on alleles affecting the same trait, with the polygenic score today predictive of body fat percentage decreasing by around a standard deviation over ten millennia, consistent with the "Thrifty Gene" hypothesis that a genetic predisposition to store energy during food scarcity became disadvantageous after farming. We also identify selection for combinations of alleles that are today associated with lighter skin color, lower risk for schizophrenia and bipolar disease, slower health decline, and increased measures related to cognitive performance (scores on intelligence tests, household income, and years of schooling). These traits are measured in modern industrialized societies, so what phenotypes were adaptive in the past is unclear. We estimate selection coefficients at 9.9 million variants, enabling study of how Darwinian forces couple to allelic effects and shape the genetic architecture of complex traits.
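To give a feel for the scale of a selection coefficient, the sketch below back-calculates one from a start and end allele frequency under a simplified textbook model in which the allele's odds are multiplied by (1 + s) each generation. This is not the authors' time-series method. It ignores drift, admixture, and sampling noise, and the 29-year generation time and function name are assumptions introduced here.

```python
def implied_selection_coefficient(freq_start, freq_end, years, years_per_generation=29.0):
    """Per-generation s under the deterministic model odds_t = odds_0 * (1 + s) ** t."""
    generations = years / years_per_generation
    odds_start = freq_start / (1.0 - freq_start)
    odds_end = freq_end / (1.0 - freq_end)
    return (odds_end / odds_start) ** (1.0 / generations) - 1.0


# Frequencies and dates quoted in the abstract for the TYK2 allele:
# roughly 2% to 9% between about 5500 and 3000 years ago.
s_tyk2 = implied_selection_coefficient(0.02, 0.09, years=5500 - 3000)
print(f"implied s per generation: {s_tyk2:.4f}")
```

The model cannot be applied to a starting frequency of exactly 0%, which is one reason the rounded "~0%" figures in the abstract are not used here.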
Affiliation(s)
- Ali Akbari
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Alison R Barton
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Steven Gazal
  - Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
  - Center for Genetic Epidemiology, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Zheng Li
  - Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Annabel Perry
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Yating Zeng
  - Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
  - Department of Biostatistics and Data Science, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Alissa Mittnik
  - Department of Archaeogenetics, Max Planck Institute for Evolutionary Anthropology, 04103 Leipzig, Germany
- Nick Patterson
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Matthew Mah
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
  - Howard Hughes Medical Institute, Harvard Medical School, Boston, MA, USA
- Xiang Zhou
  - Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI, USA
- Alkes L Price
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Eric S Lander
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Ron Pinhasi
  - Department of Biology, Massachusetts Institute of Technology (MIT), Cambridge, MA, USA
  - Department of Evolutionary Anthropology, University of Vienna, Vienna, Austria
- Nadin Rohland
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Howard Hughes Medical Institute, Harvard Medical School, Boston, MA, USA
- Swapan Mallick
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
- David Reich
  - Department of Genetics, Harvard Medical School, Boston, MA, USA
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Howard Hughes Medical Institute, Harvard Medical School, Boston, MA, USA
|
6
|
Wang JY, Lin N, Zietz M, Mares J, Narasimhan VM, Rathouz PJ, Harpak A. Three Open Questions in Polygenic Score Portability. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.08.20.608703. [PMID: 39229140 PMCID: PMC11370354 DOI: 10.1101/2024.08.20.608703] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2024]
Abstract
A major obstacle hindering the broad adoption of polygenic scores (PGS) is their lack of "portability" to people who differ, in genetic ancestry or other characteristics, from the GWAS samples in which genetic effects were estimated. Here, we use the UK Biobank to measure the change in PGS prediction accuracy as a continuous function of individuals' genome-wide genetic dissimilarity to the GWAS sample ("genetic distance"). Our results highlight three gaps in our understanding of PGS portability. First, prediction accuracy is extremely noisy at the individual level and not well predicted by genetic distance. In fact, variance in prediction accuracy is explained comparably well by socioeconomic measures. Second, trends of portability vary across traits. For several immunity-related traits, prediction accuracy drops to near zero quickly, even at intermediate levels of genetic distance. This quick drop may reflect GWAS associations being more ancestry-specific in immunity-related traits than in other traits. Third, we show that even qualitative trends of portability can depend on the measure of prediction accuracy used. For instance, for white blood cell count, a measure of prediction accuracy at the individual level (reduction in mean squared error) increases with genetic distance. Together, our results show that portability cannot be understood through global ancestry groupings alone. There are other, understudied factors influencing portability, such as the specifics of the evolution of the trait and its genetic architecture, social context, and the construction of the polygenic score. Addressing these gaps can aid in the development and application of PGS and inform more equitable genomic research.
Affiliation(s)
- Joyce Y. Wang
- Department of Integrative Biology, The University of Texas at Austin, Austin, TX
| | - Neeka Lin
- Department of Integrative Biology, The University of Texas at Austin, Austin, TX
| | - Michael Zietz
- Department of Biomedical Informatics, Columbia University, New York, NY
| | - Jason Mares
- Department of Neurology, Columbia University, New York, NY
| | - Vagheesh M. Narasimhan
- Department of Integrative Biology, The University of Texas at Austin, Austin, TX
- Department of Statistics and Data Science, The University of Texas at Austin, Austin, TX
| | - Paul J. Rathouz
- Department of Statistics and Data Science, The University of Texas at Austin, Austin, TX
- Department of Population Health, The University of Texas at Austin, Austin, TX
| | - Arbel Harpak
- Department of Integrative Biology, The University of Texas at Austin, Austin, TX
- Department of Population Health, The University of Texas at Austin, Austin, TX
|
7
|
Wu Y, Zheng Z, Thibaut L, Goddard ME, Wray NR, Visscher PM, Zeng J. Genome-wide fine-mapping improves identification of causal variants. RESEARCH SQUARE 2024:rs.3.rs-4759390. [PMID: 39149449 PMCID: PMC11326397 DOI: 10.21203/rs.3.rs-4759390/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2024]
Abstract
Fine-mapping refines genotype-phenotype association signals to identify causal variants underlying complex traits. However, current methods typically focus on individual genomic segments without considering the global genetic architecture. Here, we demonstrate the advantages of performing genome-wide fine-mapping (GWFM) and develop methods to facilitate GWFM. In simulations and real data analyses, GWFM outperforms current methods in error control, mapping power and precision, replication rate, and trans-ancestry phenotype prediction. For 48 well-powered traits in the UK Biobank, we identify causal variants that collectively explain 17% of the SNP-based heritability, and predict that fine-mapping 50% of that would require 2 million samples on average. We pinpoint a known causal variant, as proof-of-principle, at FTO for body mass index, unveil a hidden secondary variant with evolutionary conservation, and identify new missense causal variants for schizophrenia and Crohn's disease. Overall, we analyse 599 complex traits with 13 million SNPs, highlighting the efficacy of GWFM with functional annotations.
Affiliation(s)
- Yang Wu
- Institute of Rare Diseases, West China Hospital of Sichuan University, Chengdu, China
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
| | - Zhili Zheng
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
| | | | - Michael E. Goddard
- Faculty of Veterinary and Agricultural Science, University of Melbourne, Parkville, Victoria, Australia
- Biosciences Research Division, Department of Economic Development, Jobs, Transport and Resources, Bundoora, Victoria, Australia
| | - Naomi R. Wray
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
- Department of Psychiatry, University of Oxford, Oxford, UK
| | - Peter M. Visscher
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Population Health, University of Oxford, Oxford, UK
| | - Jian Zeng
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
|
8
|
Wu Y, Zheng Z, Thibaut L, Goddard ME, Wray NR, Visscher PM, Zeng J. Genome-wide fine-mapping improves identification of causal variants. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.07.18.24310667. [PMID: 39072021 PMCID: PMC11275676 DOI: 10.1101/2024.07.18.24310667] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/30/2024]
Abstract
Fine-mapping refines genotype-phenotype association signals to identify causal variants underlying complex traits. However, current methods typically focus on individual genomic segments without considering the global genetic architecture. Here, we demonstrate the advantages of performing genome-wide fine-mapping (GWFM) and develop methods to facilitate GWFM. In simulations and real data analyses, GWFM outperforms current methods in error control, mapping power and precision, replication rate, and trans-ancestry phenotype prediction. For 48 well-powered traits in the UK Biobank, we identify causal variants that collectively explain 17% of the SNP-based heritability, and predict that fine-mapping 50% of that would require 2 million samples on average. We pinpoint a known causal variant, as proof-of-principle, at FTO for body mass index, unveil a hidden secondary variant with evolutionary conservation, and identify new missense causal variants for schizophrenia and Crohn's disease. Overall, we analyse 599 complex traits with 13 million SNPs, highlighting the efficacy of GWFM with functional annotations.
Affiliation(s)
- Yang Wu
- Institute of Rare Diseases, West China Hospital of Sichuan University, Chengdu, China
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
| | - Zhili Zheng
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
- Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
| | - Loic Thibaut
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
| | - Michael E. Goddard
- Faculty of Veterinary and Agricultural Science, University of Melbourne, Parkville, Victoria, Australia
- Biosciences Research Division, Department of Economic Development, Jobs, Transport and Resources, Bundoora, Victoria, Australia
| | - Naomi R. Wray
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
- Department of Psychiatry, University of Oxford, Oxford, UK
| | - Peter M. Visscher
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Population Health, University of Oxford, Oxford, UK
| | - Jian Zeng
- Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
|
9
|
Yao D, Binan L, Bezney J, Simonton B, Freedman J, Frangieh CJ, Dey K, Geiger-Schuller K, Eraslan B, Gusev A, Regev A, Cleary B. Scalable genetic screening for regulatory circuits using compressed Perturb-seq. Nat Biotechnol 2024; 42:1282-1295. [PMID: 37872410 PMCID: PMC11035494 DOI: 10.1038/s41587-023-01964-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 01/05/2023] [Accepted: 08/22/2023] [Indexed: 10/25/2023]
Abstract
Pooled CRISPR screens with single-cell RNA sequencing readout (Perturb-seq) have emerged as a key technique in functional genomics, but they are limited in scale by cost and combinatorial complexity. In this study, we modified the design of Perturb-seq by incorporating algorithms applied to random, low-dimensional observations. Compressed Perturb-seq measures multiple random perturbations per cell or multiple cells per droplet and computationally decompresses these measurements by leveraging the sparse structure of regulatory circuits. Applied to 598 genes in the immune response to bacterial lipopolysaccharide, compressed Perturb-seq achieves the same accuracy as conventional Perturb-seq with an order of magnitude cost reduction and greater power to learn genetic interactions. We identified known and novel regulators of immune responses and uncovered evolutionarily constrained genes with downstream targets enriched for immune disease heritability, including many missed by existing genome-wide association studies. Our framework enables new scales of interrogation for a foundational method in functional genomics.
Affiliation(s)
- Douglas Yao
- Program in Systems, Synthetic, and Quantitative Biology, Harvard University, Cambridge, MA, USA
| | - Loic Binan
- Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Jon Bezney
- Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - Brooke Simonton
- Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Jahanara Freedman
- Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | - Chris J Frangieh
- Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Kushal Dey
- Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | | | | | - Alexander Gusev
- Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA
- Division of Genetics, Brigham and Women's Hospital, Boston, MA, USA
| | - Aviv Regev
- Klarman Cell Observatory, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Genentech, South San Francisco, CA, USA
| | - Brian Cleary
- Faculty of Computing and Data Sciences, Boston University, Boston, MA, USA.
- Department of Biology, Boston University, Boston, MA, USA.
- Department of Biomedical Engineering, Boston University, Boston, MA, USA.
- Program in Bioinformatics, Boston University, Boston, MA, USA.
- Biological Design Center, Boston University, Boston, MA, USA.
|
10
|
Marigorta UM, Millet O, Lu SC, Mato JM. Dysfunctional VLDL metabolism in MASLD. NPJ METABOLIC HEALTH AND DISEASE 2024; 2:16. [PMID: 39049993 PMCID: PMC11263124 DOI: 10.1038/s44324-024-00018-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2024] [Accepted: 06/22/2024] [Indexed: 07/27/2024]
Abstract
Lipidomics has unveiled the intricate human lipidome, emphasizing the extensive diversity within lipid classes in mammalian tissues critical for cellular functions. This diversity poses a challenge in maintaining a delicate balance between adaptability to recurring physiological changes and overall stability. Metabolic Dysfunction-Associated Steatotic Liver Disease (MASLD), linked to factors such as obesity and diabetes, stems from a compromise in the structural and functional stability of the liver within the complexities of lipid metabolism. This compromise inaccurately senses an increase in energy status, such as during fasting-feeding cycles or an upsurge in lipogenesis. Serum lipidomic studies have delineated three distinct metabolic phenotypes, or "metabotypes", in MASLD. MASLD-A is characterized by lower very low-density lipoprotein (VLDL) secretion and triglyceride (TG) levels, associated with a reduced risk of cardiovascular disease (CVD). In contrast, MASLD-C exhibits increased VLDL secretion and TG levels, correlating with elevated CVD risk. An intermediate subtype, with a blend of features, is designated as the MASLD-B metabotype. In this perspective, we examine recent findings that show the multifaceted regulation of VLDL secretion by S-adenosylmethionine, the primary cellular methyl donor. Furthermore, we explore the differential CVD and hepatic cancer risk across MASLD metabotypes and discuss the context and potential paths forward to gear the findings from genetic studies towards a better understanding of the observed heterogeneity in MASLD.
Affiliation(s)
- Urko M. Marigorta
- Integrative Genomics Lab, CIC bioGUNE, Basque Research and Technology Alliance (BRTA), 48160 Derio, Spain
- Ikerbasque, Basque Foundation for Science, 48013 Bilbao, Spain
| | - Oscar Millet
- Precision Medicine and Metabolism Lab, CIC bioGUNE, Basque Research and Technology Alliance (BRTA), CIBERehd, 48160 Derio, Spain
| | - Shelly C. Lu
- Karsh Division of Gastroenterology and Hepatology, Cedars-Sinai Medical Center, Los Angeles, CA 90048 USA
| | - José M. Mato
- Precision Medicine and Metabolism Lab, CIC bioGUNE, Basque Research and Technology Alliance (BRTA), CIBERehd, 48160 Derio, Spain
|
11
|
Zeng T, Spence JP, Mostafavi H, Pritchard JK. Bayesian estimation of gene constraint from an evolutionary model with gene features. Nat Genet 2024:10.1038/s41588-024-01820-9. [PMID: 38977852 DOI: 10.1038/s41588-024-01820-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Accepted: 05/29/2024] [Indexed: 07/10/2024]
Abstract
Measures of selective constraint on genes have been used for many applications, including clinical interpretation of rare coding variants, disease gene discovery and studies of genome evolution. However, widely used metrics are severely underpowered at detecting constraints for the shortest ~25% of genes, potentially causing important pathogenic mutations to be overlooked. Here we developed a framework combining a population genetics model with machine learning on gene features to enable accurate inference of an interpretable constraint metric, s_het. Our estimates outperform existing metrics for prioritizing genes important for cell essentiality, human disease and other phenotypes, especially for short genes. Our estimates of selective constraint should have wide utility for characterizing genes relevant to human disease. Finally, our inference framework, GeneBayes, provides a flexible platform that can improve the estimation of many gene-level properties, such as rare variant burden or gene expression differences.
Affiliation(s)
- Tony Zeng
- Department of Genetics, Stanford University, Stanford, CA, USA.
| | | | - Hakhamanesh Mostafavi
- Department of Genetics, Stanford University, Stanford, CA, USA
- Department of Population Health, New York University, New York, NY, USA
| | - Jonathan K Pritchard
- Department of Genetics, Stanford University, Stanford, CA, USA.
- Department of Biology, Stanford University, Stanford, CA, USA.
|
12
|
He J, Perera D, Wen W, Ping J, Li Q, Lyu L, Chen Z, Shu X, Long J, Cai Q, Shu XO, Zheng W, Long Q, Guo X. Enhancing Disease Risk Gene Discovery by Integrating Transcription Factor-Linked Trans-located Variants into Transcriptome-Wide Association Analyses. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2023.10.10.23295443. [PMID: 37873299 PMCID: PMC10593059 DOI: 10.1101/2023.10.10.23295443] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
Transcriptome-wide association studies (TWAS) have been successful in identifying disease susceptibility genes by integrating gene expression predicted from cis-variants with genome-wide association studies (GWAS) data. However, trans-located variants for predicting gene expression remain largely unexplored. Here, we introduce transTF-TWAS, which incorporates transcription factor (TF)-linked trans-located variants to enhance model building. Using data from the Genotype-Tissue Expression project, we predict gene expression and alternative splicing and apply these models to large GWAS datasets for breast, prostate, and lung cancers. We demonstrate that transTF-TWAS outperforms other existing TWAS approaches in both constructing gene prediction models and identifying disease-associated genes, as evidenced by simulations and real data analysis. Our transTF-TWAS approach significantly contributes to the discovery of disease risk genes. Findings from this study have shed new light on several genetically driven key regulators and their associated regulatory networks underlying disease susceptibility.
|
13
|
Nadig A, Replogle JM, Pogson AN, McCarroll SA, Weissman JS, Robinson EB, O’Connor LJ. Transcriptome-wide characterization of genetic perturbations. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.07.03.601903. [PMID: 39005298 PMCID: PMC11244993 DOI: 10.1101/2024.07.03.601903] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/16/2024]
Abstract
Single cell CRISPR screens such as Perturb-seq enable transcriptomic profiling of genetic perturbations at scale. However, the data produced by these screens are often noisy due to cost and technical constraints, limiting power to detect true effects with conventional differential expression analyses. Here, we introduce TRanscriptome-wide Analysis of Differential Expression (TRADE), a statistical framework which estimates the transcriptome-wide distribution of true differential expression effects from noisy gene-level measurements. Within TRADE, we derive multiple novel, interpretable statistical metrics, including the "transcriptome-wide impact", an estimator of the overall transcriptional effect of a perturbation which is stable across sampling depths. We analyze new and published large-scale Perturb-seq datasets to show that many true transcriptional effects are not statistically significant, but detectable in aggregate with TRADE. In a genome-scale Perturb-seq screen, we find that a typical gene perturbation affects an estimated 45 genes, whereas a typical essential gene perturbation affects over 500 genes. An advantage of our approach is its ability to compare the transcriptomic effects of genetic perturbations across contexts and dosages despite differences in power. We use this ability to identify perturbations with cell-type dependent effects and to find examples of perturbations where transcriptional responses are not only larger in magnitude, but also qualitatively different, as a function of dosage. Lastly, we expand our analysis to case/control comparison of gene expression for neuropsychiatric conditions, finding that transcriptomic effect correlations are greater than genetic correlations for these diagnoses. TRADE lays an analytic foundation for the systematic comparison of genetic perturbation atlases, as well as differential expression experiments more broadly.
Affiliation(s)
- Ajay Nadig
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Joseph M. Replogle
- Medical Scientist Training Program, University of California, San Francisco, San Francisco, CA, USA
- Whitehead Institute for Biomedical Research, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Angela N. Pogson
- Whitehead Institute for Biomedical Research, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Steven A McCarroll
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Genetics, Harvard Medical School, Boston, MA, USA
| | - Jonathan S. Weissman
- Whitehead Institute for Biomedical Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, MA, USA
- David H. Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Elise B. Robinson
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Luke J. O’Connor
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
|
14
|
Taylor CS, Lawson DJ. Heritability of complex traits in sub-populations experiencing bottlenecks and growth. J Hum Genet 2024; 69:329-335. [PMID: 38589509 PMCID: PMC11199143 DOI: 10.1038/s10038-024-01249-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2023] [Revised: 03/20/2024] [Accepted: 03/23/2024] [Indexed: 04/10/2024]
Abstract
Populations that have experienced a bottleneck are regularly used in Genome Wide Association Studies (GWAS) to investigate variants associated with complex traits. It is generally understood that these isolated sub-populations may carry otherwise rare variants of large effect size at high frequency, and therefore provide a unique opportunity to study such traits. However, the demographic history of the population under investigation affects all SNPs that determine the complex trait genome-wide, changing its heritability and genetic architecture. We use a simulation-based approach to identify the impact of the demographic processes of drift, expansion, and migration on the heritability of complex traits. We show that demography has considerable impact on complex traits. We then investigate the power to resolve heritability of complex traits in GWAS subjected to demographic effects. We find that demography is an important component for interpreting inference of complex traits and has a nuanced impact on the power of GWAS. We conclude that demographic histories need to be explicitly modelled to properly quantify the history of selection on a complex trait.
Affiliation(s)
| | - Daniel J Lawson
- School of Mathematics, University of Bristol, Bristol, UK.
- MRC Integrative Epidemiology Unit, Bristol Medical School, University of Bristol, Bristol, UK.
|
15
|
Strober BJ, Zhang MJ, Amariuta T, Rossen J, Price AL. Fine-mapping causal tissues and genes at disease-associated loci. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2023.11.01.23297909. [PMID: 37961337 PMCID: PMC10635248 DOI: 10.1101/2023.11.01.23297909] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Heritable diseases often manifest in a highly tissue-specific manner, with different disease loci mediated by genes in distinct tissues or cell types. We propose Tissue-Gene Fine-Mapping (TGFM), a fine-mapping method that infers the posterior probability (PIP) for each gene-tissue pair to mediate a disease locus by analyzing GWAS summary statistics (and in-sample LD) and leveraging eQTL data from diverse tissues to build cis-predicted expression models; TGFM also assigns PIPs to causal variants that are not mediated by gene expression in assayed genes and tissues. TGFM accounts for both co-regulation across genes and tissues and LD between SNPs (generalizing existing fine-mapping methods), and incorporates genome-wide estimates of each tissue's contribution to disease as tissue-level priors. TGFM was well-calibrated and moderately well-powered in simulations; unlike previous methods, TGFM was able to attain correct calibration by modeling uncertainty in cis-predicted expression models. We applied TGFM to 45 UK Biobank diseases/traits (average N = 316K) using eQTL data from 38 GTEx tissues. TGFM identified an average of 147 PIP > 0.5 causal genetic elements per disease/trait, of which 11% were gene-tissue pairs. Implicated gene-tissue pairs were concentrated in known disease-critical tissues, and causal genes were strongly enriched in disease-relevant gene sets. Causal gene-tissue pairs identified by TGFM recapitulated known biology (e.g., TPO-thyroid for Hypothyroidism), but also included biologically plausible novel findings (e.g., SLC20A2-artery aorta for Diastolic blood pressure). Further application of TGFM to single-cell eQTL data from 9 cell types in peripheral blood mononuclear cells (PBMC), analyzed jointly with GTEx tissues, identified 30 additional causal gene-PBMC cell type pairs at PIP > 0.5, primarily for autoimmune disease and blood cell traits, including the biologically plausible example of CD52 in classical monocyte cells for Monocyte count. In conclusion, TGFM is a robust and powerful method for fine-mapping causal tissues and genes at disease-associated loci.
Affiliation(s)
- Benjamin J. Strober
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Martin Jinye Zhang
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Tiffany Amariuta
- Halıcıoğlu Data Science Institute, University of California San Diego, La Jolla, CA, USA
- Department of Medicine, University of California San Diego, La Jolla, CA, USA
| | - Jordan Rossen
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Alkes L. Price
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
|
16
|
Patel RA, Weiß CL, Zhu H, Mostafavi H, Simons YB, Spence JP, Pritchard JK. Conditional frequency spectra as a tool for studying selection on complex traits in biobanks. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.06.15.599126. [PMID: 38948697 PMCID: PMC11212903 DOI: 10.1101/2024.06.15.599126] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/02/2024]
Abstract
Natural selection on complex traits is difficult to study in part due to the ascertainment inherent to genome-wide association studies (GWAS). The power to detect a trait-associated variant in GWAS is a function of frequency and effect size - but for traits under selection, the effect size of a variant determines the strength of selection against it, constraining its frequency. To account for GWAS ascertainment, we propose studying the joint distribution of allele frequencies across populations, conditional on the frequencies in the GWAS cohort. Before considering these conditional frequency spectra, we first characterized the impact of selection and non-equilibrium demography on allele frequency dynamics forwards and backwards in time. We then used these results to understand conditional frequency spectra under realistic human demography. Finally, we investigated empirical conditional frequency spectra for GWAS variants associated with 106 complex traits, finding compelling evidence for either stabilizing or purifying selection. Our results provide insight into polygenic score portability and other properties of variants ascertained with GWAS, highlighting the utility of conditional frequency spectra.
Affiliation(s)
- Roshni A. Patel
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA
- Clemens L. Weiß
  - Stanford Cancer Institute Core, Stanford University School of Medicine, Stanford, CA
- Huisheng Zhu
  - Department of Biology, Stanford University, Stanford, CA
- Hakhamanesh Mostafavi
  - Center for Human Genetics and Genomics, New York University School of Medicine, New York, NY
  - Division of Biostatistics, Department of Population Health, New York University School of Medicine, New York, NY
- Jeffrey P. Spence
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA
- Jonathan K. Pritchard
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA
  - Department of Biology, Stanford University, Stanford, CA
|
17
|
Zhao B, Zheng S, Zhu H. ON BLOCKWISE AND REFERENCE PANEL-BASED ESTIMATORS FOR GENETIC DATA PREDICTION IN HIGH DIMENSIONS. Ann Stat 2024; 52:948-965. [PMID: 39281348 PMCID: PMC11391480 DOI: 10.1214/24-aos2378] [Indexed: 09/18/2024]
Abstract
Genetic prediction holds immense promise for translating genetic discoveries into medical advances. As the high-dimensional covariance matrix (or the linkage disequilibrium (LD) pattern) of genetic variants often presents a block-diagonal structure, numerous methods account for the dependence among variants in predetermined local LD blocks. Moreover, due to privacy considerations and data protection concerns, genetic variant dependence in each LD block is typically estimated from external reference panels rather than the original training data set. This paper presents a unified analysis of blockwise and reference panel-based estimators in a high-dimensional prediction framework without sparsity restrictions. We find that, surprisingly, even when the covariance matrix has a block-diagonal structure with well-defined boundaries, blockwise estimation methods adjusting for local dependence can be substantially less accurate than methods controlling for the whole covariance matrix. Further, estimation methods built on the original training data set and external reference panels are likely to have varying performance in high dimensions, which may reflect the cost of having access only to summary-level data from the training data set. This analysis is based on novel results in random matrix theory for block-diagonal covariance matrices. We numerically evaluate our results using extensive simulations and real data analysis in the UK Biobank.
Affiliation(s)
- Bingxin Zhao
  - Department of Statistics and Data Science, University of Pennsylvania
- Shurong Zheng
  - School of Mathematics and Statistics, Northeast Normal University
- Hongtu Zhu
  - Department of Biostatistics, University of North Carolina at Chapel Hill
|
18
|
Eastwood SV, Hemani G, Watkins SH, Scally A, Davey Smith G, Chaturvedi N. Ancestry, ethnicity, and race: explaining inequalities in cardiometabolic disease. Trends Mol Med 2024; 30:541-551. [PMID: 38677980 DOI: 10.1016/j.molmed.2024.04.002] [Received: 01/04/2024] [Revised: 03/30/2024] [Accepted: 04/03/2024] [Indexed: 04/29/2024]
Abstract
Population differences in cardiometabolic disease remain unexplained. Misleading assumptions over genetic explanations are partly due to terminology used to distinguish populations, specifically ancestry, race, and ethnicity. These terms differentially implicate environmental and biological causal pathways, which should inform their use. Genetic variation alone accounts for a limited fraction of population differences in cardiometabolic disease. Research effort should focus on societally driven, lifelong environmental determinants of population differences in disease. Rather than pursuing population stratifiers to personalize medicine, we advocate removing socioeconomic barriers to receipt of and adherence to healthcare interventions, which will have markedly greater impact on improving cardiometabolic outcomes. This requires multidisciplinary collaboration and public and policymaker engagement to address inequalities driven by society rather than biology per se.
Affiliation(s)
- Sophie V Eastwood
  - MRC Unit for Lifelong Health and Ageing at UCL Population Sciences and Experimental Medicine, Institute of Cardiovascular Sciences Faculty of Population Health Sciences, University College London, London, UK
- Gibran Hemani
  - Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK; MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Sarah H Watkins
  - Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK; MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Aylwyn Scally
  - Department of Genetics, University of Cambridge, Downing Street, Cambridge, UK
- George Davey Smith
  - Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK; MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Nishi Chaturvedi
  - MRC Unit for Lifelong Health and Ageing at UCL Population Sciences and Experimental Medicine, Institute of Cardiovascular Sciences Faculty of Population Health Sciences, University College London, London, UK.
|
19
|
Davis CN, Jinwala Z, Hatoum AS, Toikumo S, Agrawal A, Rentsch CT, Edenberg HJ, Baurley JW, Hartwell EE, Crist RC, Gray JC, Justice AC, Gelernter J, Kember RL, Kranzler HR. Candidate Genes from an FDA-Approved Algorithm Fail to Predict Opioid Use Disorder Risk in Over 450,000 Veterans. medRxiv 2024:2024.05.16.24307486. [PMID: 38798430 PMCID: PMC11118646 DOI: 10.1101/2024.05.16.24307486] [Indexed: 05/29/2024]
Abstract
Importance: Recently, the Food and Drug Administration gave pre-marketing approval to an algorithm based on its purported ability to identify genetic risk for opioid use disorder. However, the clinical utility of the candidate genes comprising the algorithm has not been independently demonstrated. Objective: To assess the utility of 15 variants in candidate genes from an algorithm intended to predict opioid use disorder risk. Design: This case-control study examined the association of 15 candidate genetic variants with risk of opioid use disorder using available electronic health record data from December 20, 1992 to September 30, 2022. Setting: Electronic health record data, including pharmacy records, from Million Veteran Program participants across the United States. Participants: Participants were opioid-exposed individuals enrolled in the Million Veteran Program (n = 452,664). Opioid use disorder cases were identified using International Classification of Disease diagnostic codes, and controls were individuals with no opioid use disorder diagnosis. Exposures: Number of risk alleles present across 15 candidate genetic variants. Main Outcome and Measures: Predictive performance of 15 genetic variants for opioid use disorder risk assessed via logistic regression and machine learning models. Results: Opioid-exposed individuals (n = 33,669 cases) were on average 61.15 (SD = 13.37) years old, 90.46% male, and had varied genetic similarity to global reference panels. Collectively, the 15 candidate genetic variants accounted for 0.4% of variation in opioid use disorder risk. The accuracy of the ensemble machine learning model using the 15 genes as predictors was 52.8% (95% CI = 52.1 - 53.6%) in an independent testing sample. Conclusions and Relevance: Candidate genes that comprise the approved algorithm do not meet reasonable standards of efficacy in predicting opioid use disorder risk. Given the algorithm's limited predictive accuracy, its use in clinical care would lead to high rates of false positive and negative findings. More clinically useful models are needed to identify individuals at risk of developing opioid use disorder.
Affiliation(s)
- Christal N. Davis
  - Mental Illness Research, Education and Clinical Center, Crescenz Veterans Affairs Medical Center, Philadelphia, PA, USA
  - Department of Psychiatry, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Zeal Jinwala
  - Mental Illness Research, Education and Clinical Center, Crescenz Veterans Affairs Medical Center, Philadelphia, PA, USA
  - Department of Psychiatry, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Alexander S. Hatoum
  - Department of Psychological and Brain Sciences, Washington University School of Medicine, St. Louis, MO, USA
- Sylvanus Toikumo
  - Mental Illness Research, Education and Clinical Center, Crescenz Veterans Affairs Medical Center, Philadelphia, PA, USA
  - Department of Psychiatry, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Arpana Agrawal
  - Department of Psychiatry, Washington University, St. Louis, MO, USA
- Christopher T. Rentsch
  - Veterans Affairs Connecticut Healthcare System, West Haven, CT, USA
  - Department of Internal Medicine, Yale School of Medicine, New Haven, CT, USA
  - Faculty of Epidemiology and Population Health, London School of Hygiene & Tropical Medicine, London, UK
- Howard J. Edenberg
  - Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, IN, USA
- Emily E. Hartwell
  - Mental Illness Research, Education and Clinical Center, Crescenz Veterans Affairs Medical Center, Philadelphia, PA, USA
  - Department of Psychiatry, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Richard C. Crist
  - Mental Illness Research, Education and Clinical Center, Crescenz Veterans Affairs Medical Center, Philadelphia, PA, USA
  - Department of Psychiatry, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Joshua C. Gray
  - Department of Medical and Clinical Psychology, Uniformed Services University, Bethesda, MD, USA
- Amy C. Justice
  - Veterans Affairs Connecticut Healthcare System, West Haven, CT, USA
  - Department of Internal Medicine, Yale School of Medicine, New Haven, CT, USA
- Joel Gelernter
  - Veterans Affairs Connecticut Healthcare System, West Haven, CT, USA
  - Department of Psychiatry, Yale University School of Medicine, New Haven, CT, USA
  - Departments of Genetics and Neuroscience, Yale University School of Medicine, New Haven, CT, USA
- Rachel L. Kember
  - Mental Illness Research, Education and Clinical Center, Crescenz Veterans Affairs Medical Center, Philadelphia, PA, USA
  - Department of Psychiatry, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Henry R. Kranzler
  - Mental Illness Research, Education and Clinical Center, Crescenz Veterans Affairs Medical Center, Philadelphia, PA, USA
  - Department of Psychiatry, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
|
20
|
Rossen J, Shi H, Strober BJ, Zhang MJ, Kanai M, McCaw ZR, Liang L, Weissbrod O, Price AL. MultiSuSiE improves multi-ancestry fine-mapping in All of Us whole-genome sequencing data. medRxiv 2024:2024.05.13.24307291. [PMID: 38798542 PMCID: PMC11118590 DOI: 10.1101/2024.05.13.24307291] [Indexed: 05/29/2024]
Abstract
Leveraging data from multiple ancestries can greatly improve fine-mapping power due to differences in linkage disequilibrium and allele frequencies. We propose MultiSuSiE, an extension of the sum of single effects model (SuSiE) to multiple ancestries that allows causal effect sizes to vary across ancestries based on a multivariate normal prior informed by empirical data. We evaluated MultiSuSiE via simulations and analyses of 14 quantitative traits leveraging whole-genome sequencing data in 47k African-ancestry and 94k European-ancestry individuals from All of Us. In simulations, MultiSuSiE applied to Afr47k+Eur47k was well-calibrated and attained higher power than SuSiE applied to Eur94k; interestingly, higher causal variant PIPs in Afr47k compared to Eur47k were entirely explained by differences in the extent of LD quantified by LD 4th moments. Compared to very recently proposed multi-ancestry fine-mapping methods, MultiSuSiE attained higher power and/or much lower computational costs, making the analysis of large-scale All of Us data feasible. In real trait analyses, MultiSuSiE applied to Afr47k+Eur94k identified 579 fine-mapped variants with PIP > 0.5, and MultiSuSiE applied to Afr47k+Eur47k identified 44% more fine-mapped variants with PIP > 0.5 than SuSiE applied to Eur94k. We validated MultiSuSiE results for real traits via functional enrichment of fine-mapped variants. We highlight several examples where MultiSuSiE implicates well-studied or biologically plausible fine-mapped variants that were not implicated by other methods.
|
21
|
Agrawal S, Buyan A, Severin J, Koido M, Alam T, Abugessaisa I, Chang HY, Dostie J, Itoh M, Kere J, Kondo N, Li Y, Makeev VJ, Mendez M, Okazaki Y, Ramilowski JA, Sigorskikh AI, Strug LJ, Yagi K, Yasuzawa K, Yip CW, Hon CC, Hoffman MM, Terao C, Kulakovskiy IV, Kasukawa T, Shin JW, Carninci P, de Hoon MJL. Annotation of nuclear lncRNAs based on chromatin interactions. PLoS One 2024; 19:e0295971. [PMID: 38709794 PMCID: PMC11073715 DOI: 10.1371/journal.pone.0295971] [Received: 06/19/2023] [Accepted: 12/02/2023] [Indexed: 05/08/2024] Open
Abstract
The human genome is pervasively transcribed and produces a wide variety of long non-coding RNAs (lncRNAs), constituting the majority of transcripts across human cell types. Some specific nuclear lncRNAs have been shown to be important regulatory components acting locally. As RNA-chromatin interaction and Hi-C chromatin conformation data showed that chromatin interactions of nuclear lncRNAs are determined by the local chromatin 3D conformation, we used Hi-C data to identify potential target genes of lncRNAs. RNA-protein interaction data suggested that nuclear lncRNAs act as scaffolds to recruit regulatory proteins to target promoters and enhancers. Nuclear lncRNAs may therefore play a role in directing regulatory factors to locations spatially close to the lncRNA gene. We provide the analysis results through an interactive visualization web portal at https://fantom.gsc.riken.jp/zenbu/reports/#F6_3D_lncRNA.
Affiliation(s)
- Saumya Agrawal
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Andrey Buyan
  - Autosome.org, Russia
  - FANTOM Consortium, Dolgoprudny, Russia
- Jessica Severin
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Masaru Koido
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Institute of Medical Science, The University of Tokyo, Tokyo, Japan
- Tanvir Alam
  - College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Howard Y. Chang
  - Center for Personal Dynamic Regulome, Stanford University, Stanford, California, United States of America
- Josée Dostie
  - Department of Biochemistry, Rosalind and Morris Goodman Cancer Research Center, McGill University, Montréal, Québec, Canada
- Masayoshi Itoh
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - RIKEN Preventive Medicine and Diagnosis Innovation Program, Wako, Japan
- Juha Kere
  - Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
  - Stem Cells and Metabolism Research Program, University of Helsinki and Folkhälsan Research Center, Helsinki, Finland
- Naoto Kondo
  - RIKEN Center for Life Science Technologies, Yokohama, Japan
- Yunjing Li
  - Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
- Mickaël Mendez
  - Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Yasushi Okazaki
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Jordan A. Ramilowski
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Advanced Medical Research Center, Yokohama City University, Yokohama, Japan
- Lisa J. Strug
  - Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
  - Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
  - Department of Statistical Sciences, University of Toronto, Ontario, Canada
  - The Centre for Applied Genomics and Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, Ontario, Canada
- Ken Yagi
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Kayoko Yasuzawa
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Chi Wai Yip
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Chung Chau Hon
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Michael M. Hoffman
  - Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
  - Princess Margaret Cancer Centre, Toronto, Ontario, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
  - Vector Institute, Toronto, Ontario, Canada
- Chikashi Terao
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Takeya Kasukawa
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Jay W. Shin
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Republic of Singapore
- Piero Carninci
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Human Technopole, Milan, Italy
|
22
|
Siraj L, Castro RI, Dewey H, Kales S, Nguyen TTL, Kanai M, Berenzy D, Mouri K, Wang QS, McCaw ZR, Gosai SJ, Aguet F, Cui R, Vockley CM, Lareau CA, Okada Y, Gusev A, Jones TR, Lander ES, Sabeti PC, Finucane HK, Reilly SK, Ulirsch JC, Tewhey R. Functional dissection of complex and molecular trait variants at single nucleotide resolution. bioRxiv 2024:2024.05.05.592437. [PMID: 38766054 PMCID: PMC11100724 DOI: 10.1101/2024.05.05.592437] [Indexed: 05/22/2024]
Abstract
Identifying the causal variants and mechanisms that drive complex traits and diseases remains a core problem in human genetics. The majority of these variants have individually weak effects and lie in non-coding gene-regulatory elements where we lack a complete understanding of how single nucleotide alterations modulate transcriptional processes to affect human phenotypes. To address this, we measured the activity of 221,412 trait-associated variants that had been statistically fine-mapped using a Massively Parallel Reporter Assay (MPRA) in 5 diverse cell-types. We show that MPRA is able to discriminate between likely causal variants and controls, identifying 12,025 regulatory variants with high precision. Although the effects of these variants largely agree with orthogonal measures of function, only 69% can plausibly be explained by the disruption of a known transcription factor (TF) binding motif. We dissect the mechanisms of 136 variants using saturation mutagenesis and assign impacted TFs for 91% of variants without a clear canonical mechanism. Finally, we provide evidence that epistasis is prevalent for variants in close proximity and identify multiple functional variants on the same haplotype at a small, but important, subset of trait-associated loci. Overall, our study provides a systematic functional characterization of likely causal common variants underlying complex and molecular human traits, enabling new insights into the regulatory grammar underlying disease risk.
Affiliation(s)
- Layla Siraj
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Program in Biophysics, Harvard Graduate School of Arts and Sciences, Boston, MA, USA
  - Harvard-Massachusetts Institute of Technology MD/PhD Program, Harvard Medical School, Boston, MA, USA
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
- Masahiro Kanai
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA USA
  - Center for Computational and Integrative Biology, Massachusetts General Hospital, Boston, MA, USA
- Qingbo S. Wang
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA USA
  - Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan
  - Department of Genome Informatics, Graduate School of Medicine, the University of Tokyo, Tokyo, Japan
- Sager J. Gosai
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Program in Biological and Biomedical Sciences, Harvard Medical School, Boston, MA, USA
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
- François Aguet
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
- Ran Cui
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA USA
  - The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Caleb A. Lareau
  - Program in Computational and Systems Biology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Yukinori Okada
  - Department of Statistical Genetics, Osaka University Graduate School of Medicine, Suita, Japan
  - Department of Genome Informatics, Graduate School of Medicine, the University of Tokyo, Tokyo, Japan
  - Laboratory for Systems Genetics, RIKEN Center for Integrative Medical Sciences, Kanagawa, Japan
- Alexander Gusev
  - Harvard Medical School and Dana-Farber Cancer Institute, Boston, MA, USA
- Thouis R. Jones
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Eric S. Lander
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Department of Biology, MIT, Cambridge, MA, USA
  - Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Pardis C. Sabeti
  - Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
  - Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Hilary K. Finucane
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA USA
- Steven K. Reilly
  - Department of Genetics, Yale School of Medicine, New Haven, CT, USA
  - Wu Tsai Institute, Yale University, New Haven, CT, USA
- Jacob C. Ulirsch
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA USA
  - Program in Biological and Biomedical Sciences, Harvard Medical School, Boston, MA, USA
  - Illumina Artificial Intelligence Laboratory, Illumina, San Diego, CA, USA
- Ryan Tewhey
  - The Jackson Laboratory, Bar Harbor, ME, USA
  - Graduate School of Biomedical Sciences and Engineering, University of Maine, Orono, ME, USA
  - Graduate School of Biomedical Sciences, Tufts University School of Medicine, Boston, MA, USA
|
23
|
Zheng Z, Liu S, Sidorenko J, Wang Y, Lin T, Yengo L, Turley P, Ani A, Wang R, Nolte IM, Snieder H, Yang J, Wray NR, Goddard ME, Visscher PM, Zeng J. Leveraging functional genomic annotations and genome coverage to improve polygenic prediction of complex traits within and between ancestries. Nat Genet 2024; 56:767-777. [PMID: 38689000 PMCID: PMC11096109 DOI: 10.1038/s41588-024-01704-y] [Received: 10/01/2022] [Accepted: 03/05/2024] [Indexed: 05/02/2024]
Abstract
We develop a method, SBayesRC, that integrates genome-wide association study (GWAS) summary statistics with functional genomic annotations to improve polygenic prediction of complex traits. Our method is scalable to whole-genome variant analysis and refines signals from functional annotations by allowing them to affect both causal variant probability and causal effect distribution. We analyze 50 complex traits and diseases using ∼7 million common single-nucleotide polymorphisms (SNPs) and 96 annotations. SBayesRC improves prediction accuracy by 14% in European ancestry and up to 34% in cross-ancestry prediction compared to the baseline method SBayesR, which does not use annotations, and outperforms other methods, including LDpred2, LDpred-funct, MegaPRS, PolyPred-S and PRS-CSx. Investigation of factors affecting prediction accuracy identifies a significant interaction between SNP density and annotation information, suggesting whole-genome sequence variants with annotations may further improve prediction. Functional partitioning analysis highlights a major contribution of evolutionary constrained regions to prediction accuracy and the largest per-SNP contribution from nonsynonymous SNPs.
Affiliation(s)
- Zhili Zheng
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia.
  - Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA.
  - Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA, USA.
- Shouye Liu
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
- Julia Sidorenko
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
- Ying Wang
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
- Tian Lin
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
- Loic Yengo
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
- Patrick Turley
  - Center for Economic and Social Research, University of Southern California, Los Angeles, CA, USA
  - Department of Economics, University of Southern California, Los Angeles, CA, USA
- Alireza Ani
  - Department of Epidemiology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
  - Department of Bioinformatics, Isfahan University of Medical Sciences, Isfahan, Iran
- Rujia Wang
  - Department of Epidemiology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
- Ilja M Nolte
  - Department of Epidemiology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
- Harold Snieder
  - Department of Epidemiology, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
- Jian Yang
  - School of Life Sciences, Westlake University, Hangzhou, Zhejiang, China
  - Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, China
- Naomi R Wray
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
  - Department of Psychiatry, University of Oxford, Oxford, UK
- Michael E Goddard
  - Faculty of Veterinary and Agricultural Science, University of Melbourne, Parkville, Victoria, Australia
  - Biosciences Research Division, Department of Economic Development, Jobs, Transport and Resources, Bundoora, Victoria, Australia
- Peter M Visscher
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Jian Zeng
  - Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, Australia.
|
24
|
Jeong R, Bulyk ML. Chromatin accessibility variation provides insights into missing regulation underlying immune-mediated diseases. bioRxiv 2024:2024.04.12.589213. [PMID: 38659802 PMCID: PMC11042205 DOI: 10.1101/2024.04.12.589213] [Indexed: 04/26/2024]
Abstract
Most genetic loci associated with complex traits and diseases through genome-wide association studies (GWAS) are noncoding, suggesting that the causal variants likely have gene regulatory effects. However, only a small number of loci have been linked to expression quantitative trait loci (eQTLs) detected currently. To better understand the potential reasons for many trait-associated loci lacking eQTL colocalization, we investigated whether chromatin accessibility QTLs (caQTLs) in lymphoblastoid cell lines (LCLs) explain immune-mediated disease associations that eQTLs in LCLs did not. The power to detect caQTLs was greater than that of eQTLs and was less affected by the distance from the transcription start site of the associated gene. Meta-analyzing LCL eQTL data to increase the sample size to over a thousand led to additional loci with eQTL colocalization, demonstrating that insufficient statistical power is still likely to be a factor. Moreover, further eQTL colocalization loci were uncovered by surveying eQTLs of other immune cell types. Altogether, insufficient power and context-specificity of eQTLs both contribute to the 'missing regulation.'
Affiliation(s)
- Raehoon Jeong
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Bioinformatics and Integrative Genomics Graduate Program, Harvard University, Cambridge, MA 02138, USA
- Martha L. Bulyk
- Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
- Bioinformatics and Integrative Genomics Graduate Program, Harvard University, Cambridge, MA 02138, USA
- Department of Pathology, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA 02115, USA
25
Zeng T, Spence JP, Mostafavi H, Pritchard JK. Bayesian estimation of gene constraint from an evolutionary model with gene features. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2023.05.19.541520. [PMID: 37292653 PMCID: PMC10245655 DOI: 10.1101/2023.05.19.541520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Measures of selective constraint on genes have been used for many applications including clinical interpretation of rare coding variants, disease gene discovery, and studies of genome evolution. However, widely-used metrics are severely underpowered at detecting constraint for the shortest ∼25% of genes, potentially causing important pathogenic mutations to be overlooked. We developed a framework combining a population genetics model with machine learning on gene features to enable accurate inference of an interpretable constraint metric, s_het. Our estimates outperform existing metrics for prioritizing genes important for cell essentiality, human disease, and other phenotypes, especially for short genes. Our new estimates of selective constraint should have wide utility for characterizing genes relevant to human disease. Finally, our inference framework, GeneBayes, provides a flexible platform that can improve estimation of many gene-level properties, such as rare variant burden or gene expression differences.
Affiliation(s)
- Tony Zeng
- Department of Genetics, Stanford University, Stanford CA
- Jonathan K. Pritchard
- Department of Genetics, Stanford University, Stanford CA
- Department of Biology, Stanford University, Stanford CA
26
Li Q, Bian J, Qian Y, Kossinna P, Gau C, Gordon PMK, Zhou X, Guo X, Yan J, Wu J, Long Q. An expression-directed linear mixed model discovering low-effect genetic variants. Genetics 2024; 226:iyae018. [PMID: 38314848 DOI: 10.1093/genetics/iyae018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 11/29/2023] [Accepted: 01/05/2024] [Indexed: 02/07/2024] Open
Abstract
Detecting genetic variants with low-effect sizes using a moderate sample size is difficult, hindering downstream efforts to learn pathology and estimate heritability. In this work, by utilizing informative weights learned from training genetically predicted gene expression models, we formed an alternative approach to estimate the polygenic term in a linear mixed model. Our linear mixed model estimates the genetic background by incorporating their relevance to gene expression. Our protocol, expression-directed linear mixed model, enables the discovery of subtle signals of low-effect variants using moderate sample size. By applying expression-directed linear mixed model to cohorts of around 5,000 individuals with either binary (WTCCC) or quantitative (NFBC1966) traits, we demonstrated its power gain at the low-effect end of the genetic etiology spectrum. In aggregate, the additional low-effect variants detected by expression-directed linear mixed model substantially improved estimation of missing heritability. Expression-directed linear mixed model moves precision medicine forward by accurately detecting the contribution of low-effect genetic variants to human diseases.
Affiliation(s)
- Qing Li
- Department of Biochemistry & Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
- Jiayi Bian
- Department of Mathematics and Statistics, University of Calgary, Calgary T2N 1N4, Canada
- Yanzhao Qian
- Department of Mathematics and Statistics, University of Calgary, Calgary T2N 1N4, Canada
- Pathum Kossinna
- Department of Biochemistry & Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
- Cooper Gau
- Department of Mathematics and Statistics, University of Calgary, Calgary T2N 1N4, Canada
- Paul M K Gordon
- Alberta Children's Hospital Research Institute, University of Calgary, Calgary T2N 1N4, Canada
- Xiang Zhou
- School of Public Health, University of Michigan, Ann Arbor 48109, USA
- Xingyi Guo
- Department of Medicine & Biomedical Informatics, Vanderbilt University Medical Center, Nashville 37203, USA
- Jun Yan
- Physiology and Pharmacology, University of Calgary, Calgary T2N 1N4, Canada
- Hotchkiss Brain Institute, University of Calgary, Calgary T2N 1N4, Canada
- Jingjing Wu
- Department of Mathematics and Statistics, University of Calgary, Calgary T2N 1N4, Canada
- Quan Long
- Department of Biochemistry & Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
- Department of Mathematics and Statistics, University of Calgary, Calgary T2N 1N4, Canada
- Alberta Children's Hospital Research Institute, University of Calgary, Calgary T2N 1N4, Canada
- Hotchkiss Brain Institute, University of Calgary, Calgary T2N 1N4, Canada
- Department of Medical Genetics, University of Calgary, Calgary T2N 1N4, Canada
27
He XY, Wu BS, Yang L, Guo Y, Deng YT, Li ZY, Fei CJ, Liu WS, Ge YJ, Kang J, Feng J, Cheng W, Dong Q, Yu JT. Genetic associations of protein-coding variants in venous thromboembolism. Nat Commun 2024; 15:2819. [PMID: 38561338 PMCID: PMC10984941 DOI: 10.1038/s41467-024-47178-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Accepted: 03/19/2024] [Indexed: 04/04/2024] Open
Abstract
Previous genetic studies of venous thromboembolism (VTE) have been largely limited to common variants, leaving the genetic determinants relatively incomplete. We performed an exome-wide association study of VTE among 14,723 cases and 334,315 controls. Fourteen known and four novel genes (SRSF6, PHPT1, CGN, and MAP3K2) were identified through protein-coding variants, with broad replication in the FinnGen cohort. Most genes we discovered exhibited the potential to predict future VTE events in longitudinal analysis. Notably, we provide evidence for the additive contribution of rare coding variants to known genome-wide polygenic risk in shaping VTE risk. The identified genes were enriched in pathways affecting coagulation and platelet activation, along with liver-specific expression. The pleiotropic effects of these genes indicated the potential involvement of coagulation factors, blood cell traits, liver function, and immunometabolic processes in VTE pathogenesis. In conclusion, our study unveils the valuable contribution of protein-coding variants in VTE etiology and sheds new light on its risk stratification.
Affiliation(s)
- Xiao-Yu He
- Department of Neurology and National Center for Neurological Disorders, Huashan Hospital, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Shanghai Medical College, Fudan University, Shanghai, China
- Bang-Sheng Wu
- Department of Neurology and National Center for Neurological Disorders, Huashan Hospital, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Shanghai Medical College, Fudan University, Shanghai, China
- Liu Yang
- Department of Neurology and National Center for Neurological Disorders, Huashan Hospital, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Shanghai Medical College, Fudan University, Shanghai, China
- Yu Guo
- Department of Neurology and National Center for Neurological Disorders, Huashan Hospital, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Shanghai Medical College, Fudan University, Shanghai, China
- Yue-Ting Deng
- Department of Neurology and National Center for Neurological Disorders, Huashan Hospital, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Shanghai Medical College, Fudan University, Shanghai, China
- Ze-Yu Li
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China
- Chen-Jie Fei
- Department of Neurology and National Center for Neurological Disorders, Huashan Hospital, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Shanghai Medical College, Fudan University, Shanghai, China
- Wei-Shi Liu
- Department of Neurology and National Center for Neurological Disorders, Huashan Hospital, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Shanghai Medical College, Fudan University, Shanghai, China
- Yi-Jun Ge
- Department of Neurology and National Center for Neurological Disorders, Huashan Hospital, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Shanghai Medical College, Fudan University, Shanghai, China
- Jujiao Kang
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China
- Jianfeng Feng
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China
- Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Fudan University, Ministry of Education, Shanghai, China
- Department of Computer Science, University of Warwick, Coventry, UK
- Wei Cheng
- Department of Neurology and National Center for Neurological Disorders, Huashan Hospital, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Shanghai Medical College, Fudan University, Shanghai, China.
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China.
- Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Fudan University, Ministry of Education, Shanghai, China.
- Department of Computer Science, University of Warwick, Coventry, UK.
- Qiang Dong
- Department of Neurology and National Center for Neurological Disorders, Huashan Hospital, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Shanghai Medical College, Fudan University, Shanghai, China.
- Jin-Tai Yu
- Department of Neurology and National Center for Neurological Disorders, Huashan Hospital, State Key Laboratory of Medical Neurobiology and MOE Frontiers Center for Brain Science, Shanghai Medical College, Fudan University, Shanghai, China.
28
Lappalainen T, Li YI, Ramachandran S, Gusev A. Genetic and molecular architecture of complex traits. Cell 2024; 187:1059-1075. [PMID: 38428388 DOI: 10.1016/j.cell.2024.01.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Revised: 12/20/2023] [Accepted: 01/16/2024] [Indexed: 03/03/2024]
Abstract
Human genetics has emerged as one of the most dynamic areas of biology, with a broadening societal impact. In this review, we discuss recent achievements, ongoing efforts, and future challenges in the field. Advances in technology, statistical methods, and the growing scale of research efforts have all provided many insights into the processes that have given rise to the current patterns of genetic variation. Vast maps of genetic associations with human traits and diseases have allowed characterization of their genetic architecture. Finally, studies of molecular and cellular effects of genetic variants have provided insights into biological processes underlying disease. Many outstanding questions remain, but the field is well poised for groundbreaking discoveries as it increases the use of genetic data to understand both the history of our species and its applications to improve human health.
Affiliation(s)
- Tuuli Lappalainen
- New York Genome Center, New York, NY, USA; Science for Life Laboratory, Department of Gene Technology, KTH Royal Institute of Technology, Stockholm, Sweden.
- Yang I Li
- Section of Genetic Medicine, University of Chicago, Chicago, IL, USA; Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Sohini Ramachandran
- Ecology, Evolution and Organismal Biology, Center for Computational Molecular Biology, and the Data Science Institute, Brown University, Providence, RI 029129, USA
- Alexander Gusev
- Harvard Medical School and Dana-Farber Cancer Institute, Boston, MA, USA
29
He J, Li Q, Zhang Q. rvTWAS: identifying gene-trait association using sequences by utilizing transcriptome-directed feature selection. Genetics 2024; 226:iyad204. [PMID: 38001381 DOI: 10.1093/genetics/iyad204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Revised: 11/14/2023] [Accepted: 11/16/2023] [Indexed: 11/26/2023] Open
Abstract
Toward the identification of genetic basis of complex traits, transcriptome-wide association study (TWAS) is successful in integrating transcriptome data. However, TWAS is only applicable for common variants, excluding rare variants in exome or whole-genome sequences. This is partly because of the inherent limitation of TWAS protocols that rely on predicting gene expressions. Our previous research has revealed the insight into TWAS: the 2 steps in TWAS, building and applying the expression prediction models, are essentially genetic feature selection and aggregations that do not have to involve predictions. Based on this insight disentangling TWAS, rare variants' inability of predicting expression traits is no longer an obstacle. Herein, we developed "rare variant TWAS," or rvTWAS, that first uses a Bayesian model to conduct expression-directed feature selection and then uses a kernel machine to carry out feature aggregation, forming a model leveraging expressions for association mapping including rare variants. We demonstrated the performance of rvTWAS by thorough simulations and real data analysis in 3 psychiatric disorders, namely schizophrenia, bipolar disorder, and autism spectrum disorder. We confirmed that rvTWAS outperforms existing TWAS protocols and revealed additional genes underlying psychiatric disorders. Particularly, we formed a hypothetical mechanism in which zinc finger genes impact all 3 disorders through transcriptional regulations. rvTWAS will open a door for sequence-based association mappings integrating gene expressions.
Affiliation(s)
- Jingni He
- Department of Biochemistry and Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
- Qing Li
- Department of Biochemistry and Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
- Qingrun Zhang
- Department of Biochemistry and Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
- Department of Mathematics and Statistics, University of Calgary, Calgary T2N 1N4, Canada
- Alberta Children's Hospital Research Institute, University of Calgary, Calgary T2N 1N4, Canada
- Arnie Charbonneau Cancer Institute, University of Calgary, Calgary T2N 1N4, Canada
30
Peyrot WJ, Panagiotaropoulou G, Olde Loohuis LM, Adams MJ, Awasthi S, Ge T, McIntosh AM, Mitchell BL, Mullins N, O'Connell KS, Penninx BWJH, Posthuma D, Ripke S, Ruderfer DM, Uffelmann E, Vilhjalmsson BJ, Zhu Z, Smoller JW, Price AL. Distinguishing different psychiatric disorders using DDx-PRS. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.02.02.24302228. [PMID: 38352307 PMCID: PMC10862992 DOI: 10.1101/2024.02.02.24302228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/24/2024]
Abstract
Despite great progress on methods for case-control polygenic prediction (e.g. schizophrenia vs. control), there remains an unmet need for a method that genetically distinguishes clinically related disorders (e.g. schizophrenia (SCZ) vs. bipolar disorder (BIP) vs. depression (MDD) vs. control); such a method could have important clinical value, especially at disorder onset when differential diagnosis can be challenging. Here, we introduce a method, Differential Diagnosis-Polygenic Risk Score (DDx-PRS), that jointly estimates posterior probabilities of each possible diagnostic category (e.g. SCZ=50%, BIP=25%, MDD=15%, control=10%) by modeling variance/covariance structure across disorders, leveraging case-control polygenic risk scores (PRS) for each disorder (computed using existing methods) and prior clinical probabilities for each diagnostic category. DDx-PRS uses only summary-level training data and does not use tuning data, facilitating implementation in clinical settings. In simulations, DDx-PRS was well-calibrated (whereas a simpler approach that analyzes each disorder marginally was poorly calibrated), and effective in distinguishing each diagnostic category vs. the rest. We then applied DDx-PRS to Psychiatric Genomics Consortium SCZ/BIP/MDD/control data, including summary-level training data from 3 case-control GWAS (N = 41,917-173,140 cases; total N = 1,048,683) and held-out test data from different cohorts with equal numbers of each diagnostic category (total N = 11,460). DDx-PRS was well-calibrated and well-powered relative to these training sample sizes, attaining AUCs of 0.66 for SCZ vs. rest, 0.64 for BIP vs. rest, 0.59 for MDD vs. rest, and 0.68 for control vs. rest. DDx-PRS produced comparable results to methods that leverage tuning data, confirming that DDx-PRS is an effective method.
True diagnosis probabilities in top deciles of predicted diagnosis probabilities were considerably larger than prior baseline probabilities, particularly in projections to larger training sample sizes, implying considerable potential for clinical utility under certain circumstances. In conclusion, DDx-PRS is an effective method for distinguishing clinically related disorders.
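The idea of combining prior clinical probabilities with per-disorder PRS to obtain a posterior over diagnostic categories can be illustrated with a small Bayes-rule calculation. The sketch below is not the DDx-PRS implementation: it assumes only two PRS, a bivariate normal likelihood with a covariance shared across categories, and made-up means, correlation, and priors. The function names `bivariate_normal_logpdf` and `posterior_probabilities` are hypothetical.

```python
import math


def bivariate_normal_logpdf(x, mean, sd, rho):
    """Log density of a bivariate normal with correlation rho."""
    z1 = (x[0] - mean[0]) / sd[0]
    z2 = (x[1] - mean[1]) / sd[1]
    one_minus_r2 = 1.0 - rho * rho
    quad = (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / one_minus_r2
    log_norm = math.log(2.0 * math.pi * sd[0] * sd[1] * math.sqrt(one_minus_r2))
    return -0.5 * quad - log_norm


def posterior_probabilities(x, priors, means, sd, rho):
    """Bayes rule: posterior is proportional to prior times likelihood."""
    log_post = {
        c: math.log(priors[c]) + bivariate_normal_logpdf(x, means[c], sd, rho)
        for c in priors
    }
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_post.values())
    weights = {c: math.exp(v - m) for c, v in log_post.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


# Illustrative values only (assumptions, not estimates from the paper):
# equal priors, and the mean of (SCZ PRS, BIP PRS) in each category.
priors = {"SCZ": 0.25, "BIP": 0.25, "MDD": 0.25, "control": 0.25}
means = {
    "SCZ": (0.8, 0.4),
    "BIP": (0.4, 0.8),
    "MDD": (0.2, 0.2),
    "control": (0.0, 0.0),
}
post = posterior_probabilities((1.0, 0.3), priors, means, sd=(1.0, 1.0), rho=0.5)
for category, p in post.items():
    print(f"{category}: {p:.3f}")
```

The posteriors sum to one by construction, and changing `priors` shifts them in the way a clinical prior would; the correlation `rho` is the only place where covariance between the two scores enters this toy version.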
31
Tandon R, Nasrallah H, Akbarian S, Carpenter WT, DeLisi LE, Gaebel W, Green MF, Gur RE, Heckers S, Kane JM, Malaspina D, Meyer-Lindenberg A, Murray R, Owen M, Smoller JW, Yassin W, Keshavan M. The schizophrenia syndrome, circa 2024: What we know and how that informs its nature. Schizophr Res 2024; 264:1-28. [PMID: 38086109 DOI: 10.1016/j.schres.2023.11.015] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Received: 07/06/2023] [Revised: 11/23/2023] [Accepted: 11/29/2023] [Indexed: 03/01/2024]
Abstract
With new data about different aspects of schizophrenia being continually generated, it becomes necessary to periodically revisit exactly what we know. Along with a need to review what we currently know about schizophrenia, there is an equal imperative to evaluate the construct itself. With these objectives, we undertook an iterative, multi-phase process involving fifty international experts in the field, with each step building on learnings from the prior one. This review assembles currently established findings about schizophrenia (construct, etiology, pathophysiology, clinical expression, treatment) and posits what they reveal about its nature. Schizophrenia is a heritable, complex, multi-dimensional syndrome with varying degrees of psychotic, negative, cognitive, mood, and motor manifestations. The illness exhibits a remitting and relapsing course, with varying degrees of recovery among affected individuals with most experiencing significant social and functional impairment. Genetic risk factors likely include thousands of common genetic variants that each have a small impact on an individual's risk and a plethora of rare gene variants that have a larger individual impact on risk. Their biological effects are concentrated in the brain and many of the same variants also increase the risk of other psychiatric disorders such as bipolar disorder, autism, and other neurodevelopmental conditions. Environmental risk factors include but are not limited to urban residence in childhood, migration, older paternal age at birth, cannabis use, childhood trauma, antenatal maternal infection, and perinatal hypoxia. Structural, functional, and neurochemical brain alterations implicate multiple regions and functional circuits. Dopamine D-2 receptor antagonists and partial agonists improve psychotic symptoms and reduce risk of relapse. Certain psychological and psychosocial interventions are beneficial. Early intervention can reduce treatment delay and improve outcomes. 
Schizophrenia is increasingly considered to be a heterogeneous syndrome and not a singular disease entity. There is no necessary or sufficient etiology, pathology, set of clinical features, or treatment that fully circumscribes this syndrome. A single, common pathophysiological pathway appears unlikely. The boundaries of schizophrenia remain fuzzy, suggesting the absence of a categorical fit and need to reconceptualize it as a broader, multi-dimensional and/or spectrum construct.
Affiliation(s)
- Rajiv Tandon
- Department of Psychiatry, WMU Homer Stryker School of Medicine, Kalamazoo, MI 49008, United States of America.
- Henry Nasrallah
- Department of Psychiatry, University of Cincinnati College of Medicine Cincinnati, OH 45267, United States of America
- Schahram Akbarian
- Department of Psychiatry, Icahn School of Medicine at Mt. Sinai, New York, NY 10029, United States of America
- William T Carpenter
- Department of Psychiatry, University of Maryland School of Medicine, Baltimore, MD 21201, United States of America
- Lynn E DeLisi
- Department of Psychiatry, Cambridge Health Alliance and Harvard Medical School, Cambridge, MA 02139, United States of America
- Wolfgang Gaebel
- Department of Psychiatry and Psychotherapy, LVR-Klinikum Dusseldorf, Heinrich-Heine University, Dusseldorf, Germany
- Michael F Green
- Department of Psychiatry and Biobehavioral Sciences, Jane and Terry Semel Institute of Neuroscience and Human Behavior, UCLA, Los Angeles, CA 90024, United States of America; Greater Los Angeles Veterans' Administration Healthcare System, United States of America
- Raquel E Gur
- Department of Psychiatry, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, United States of America
- Stephan Heckers
- Department of Psychiatry, Vanderbilt University Medical Center, Nashville, TN 37232, United States of America
- John M Kane
- Department of Psychiatry, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Glen Oaks, NY 11004, United States of America
- Dolores Malaspina
- Department of Psychiatry, Neuroscience, Genetics, and Genomics, Icahn School of Medicine at Mt. Sinai, New York, NY 10029, United States of America
- Andreas Meyer-Lindenberg
- Department of Psychiatry and Psychotherapy, Central Institute of Mental Health, Mannheim/Heidelberg University, Mannheim, Germany
- Robin Murray
- Institute of Psychiatry, Psychology, and Neuroscience, Kings College, London, UK
- Michael Owen
- Centre for Neuropsychiatric Genetics and Genomics, and Division of Psychological Medicine and Clinical Neurosciences, Cardiff University, Cardiff, UK
- Jordan W Smoller
- Center for Precision Psychiatry, Department of Psychiatry, Psychiatric and Neurodevelopmental Unit, Center for Genomic Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, United States of America
- Walid Yassin
- Department of Psychiatry, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA 02115, United States of America
- Matcheri Keshavan
- Department of Psychiatry, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA 02115, United States of America
32
Yao S, Harder A, Darki F, Chang YW, Li A, Nikouei K, Volpe G, Lundström JN, Zeng J, Wray N, Lu Y, Sullivan PF, Leffler JH. Connecting genomic results for psychiatric disorders to human brain cell types and regions reveals convergence with functional connectivity. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2024:2024.01.18.24301478. [PMID: 38410450 PMCID: PMC10896415 DOI: 10.1101/2024.01.18.24301478] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/28/2024]
Abstract
Understanding the temporal and spatial brain locations etiological for psychiatric disorders is essential for targeted neurobiological research. Integration of genomic insights from genome-wide association studies with single-cell transcriptomics is a powerful approach although past efforts have necessarily relied on mouse atlases. Leveraging a comprehensive atlas of the adult human brain, we prioritized cell types via the enrichment of SNP-heritabilities for brain diseases, disorders, and traits, progressing from individual cell types to brain regions. Our findings highlight specific neuronal clusters significantly enriched for the SNP-heritabilities for schizophrenia, bipolar disorder, and major depressive disorder along with intelligence, education, and neuroticism. Extrapolation of cell-type results to brain regions reveals important patterns for schizophrenia with distinct subregions in the hippocampus and amygdala exhibiting the highest significance. Cerebral cortical regions display similar enrichments despite the known prefrontal dysfunction in those with schizophrenia highlighting the importance of subcortical connectivity. Using functional MRI connectivity from cases with schizophrenia and neurotypical controls, we identified brain networks that distinguished cases from controls that also confirmed involvement of the central and lateral amygdala, hippocampal body, and prefrontal cortex. Our findings underscore the value of single-cell transcriptomics in decoding the polygenicity of psychiatric disorders and offer a promising convergence of genomic, transcriptomic, and brain imaging modalities toward common biological targets.
Affiliation(s)
- Shuyang Yao
- Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Arvid Harder
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Fahimeh Darki
- Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden
- Yu-Wei Chang
- Department of Physics, University of Gothenburg, Gothenburg, Sweden
- Ang Li
- Institute for Molecular Bioscience, University of Queensland, Brisbane, Australia
- Kasra Nikouei
- Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden
- Giovanni Volpe
- Department of Physics, University of Gothenburg, Gothenburg, Sweden
- Johan N Lundström
- Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden
- Monell Chemical Senses Center, Philadelphia, PA, USA
- Jian Zeng
- Institute for Molecular Bioscience, University of Queensland, Brisbane, Australia
- Naomi Wray
- Institute for Molecular Bioscience, University of Queensland, Brisbane, Australia
- Department of Psychiatry, University of Oxford, Oxford, UK
- Yi Lu
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Patrick F Sullivan
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Departments of Genetics and Psychiatry, University of North Carolina, Chapel Hill, NC, USA
- Jens Hjerling Leffler
- Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden
33
Janivara R, Hazra U, Pfennig A, Harlemon M, Kim MS, Eaaswarkhanth M, Chen WC, Ogunbiyi A, Kachambwa P, Petersen LN, Jalloh M, Mensah JE, Adjei AA, Adusei B, Joffe M, Gueye SM, Aisuodionoe-Shadrach OI, Fernandez PW, Rohan TE, Andrews C, Rebbeck TR, Adebiyi AO, Agalliu I, Lachance J. Uncovering the genetic architecture and evolutionary roots of androgenetic alopecia in African men. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.01.12.575396. [PMID: 38293167 PMCID: PMC10827056 DOI: 10.1101/2024.01.12.575396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2024]
Abstract
Androgenetic alopecia is a highly heritable trait. However, much of our understanding about the genetics of male pattern baldness comes from individuals of European descent. Here, we examined a novel dataset comprising 2,136 men from Ghana, Nigeria, Senegal, and South Africa that were genotyped using a custom array. We first tested how genetic predictions of baldness generalize from Europe to Africa, finding that polygenic scores from European GWAS yielded AUC statistics that ranged from 0.513 to 0.546, indicating that genetic predictions of baldness in African populations performed notably worse than in European populations. Subsequently, we conducted the first African GWAS of androgenetic alopecia, focusing on self-reported baldness patterns at age 45. After correcting for present age, population structure, and study site, we identified 266 moderately significant associations, 51 of which were independent (p-value < 10^-5, r^2 < 0.2). Most baldness associations were autosomal, and the X chromosome does not appear to have a large impact on baldness in African men. Finally, we examined the evolutionary causes of continental differences in genetic architecture. Although Neanderthal alleles have previously been associated with skin and hair phenotypes, we did not find evidence that European-ascertained baldness hits were enriched for signatures of ancient introgression. Most loci that are associated with androgenetic alopecia are evolving neutrally. However, multiple baldness-associated SNPs near the EDA2R and AR genes have large allele frequency differences between continents. Collectively, our findings illustrate how evolutionary history contributes to the limited portability of genetic predictions across ancestries.
Affiliation(s)
- Rohini Janivara
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, USA
- Ujani Hazra
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, USA
- Aaron Pfennig
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, USA
- Maxine Harlemon
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, USA
- Department of Biology, Morgan State University, Baltimore, Maryland, USA
- Michelle S Kim
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, USA
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan, USA
- Wenlong C Chen
- Strengthening Oncology Services Research Unit, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
- Sydney Brenner Institute for Molecular Bioscience, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
- National Cancer Registry, National Institute for Communicable Diseases a Division of the National Health Laboratory Service, Johannesburg, South Africa
| | | | - Paidamoyo Kachambwa
- Centre for Proteomic and Genomic Research, Cape Town, South Africa
- Mediclinic Precise Southern Africa, Cape Town, South Africa
| | - Lindsay N Petersen
- Centre for Proteomic and Genomic Research, Cape Town, South Africa
- Mediclinic Precise Southern Africa, Cape Town, South Africa
| | - Mohamed Jalloh
- Université Cheikh Anta Diop de Dakar, Dakar, Senegal
- Université Iba Der Thiam de Thiès, Thiès, Senegal
| | - James E Mensah
- Korle-Bu Teaching Hospital and University of Ghana Medical School, Accra, Ghana
| | - Andrew A Adjei
- Department of Pathology, University of Ghana Medical School, Accra, Ghana
| | | | - Maureen Joffe
- Strengthening Oncology Services Research Unit, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
| | | | - Oseremen I Aisuodionoe-Shadrach
- College of Health Sciences, University of Abuja, University of Abuja Teaching Hospital and Cancer Science Centre, Abuja, Nigeria
| | - Pedro W Fernandez
- Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
| | - Thomas E Rohan
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, New York, USA
| | | | - Timothy R Rebbeck
- Dana-Farber Cancer Institute, Boston, Massachusetts, USA
- Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
| | | | - Ilir Agalliu
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, New York, USA
| | - Joseph Lachance
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, USA
| |
Collapse
|
34
|
Duncan L, Deisseroth K. Are novel treatments for brain disorders hiding in plain sight? Neuropsychopharmacology 2024; 49:276-281. [PMID: 37422511 PMCID: PMC10700299 DOI: 10.1038/s41386-023-01636-x] [Received: 03/31/2023] [Revised: 05/11/2023] [Accepted: 05/19/2023] [Indexed: 07/10/2023]
Abstract
Psychiatric diseases are strongly influenced by genetics, but genetically guided treatments have been slow to develop, and precise molecular mechanisms remain mysterious. Although individual locations in the genome tend to not contribute powerfully to psychiatric disease incidence, genome-wide association studies (GWAS) have now successfully linked hundreds of specific genetic loci to psychiatric disorders [1-3]. Here, building upon results from well-powered GWAS of four phenotypes relevant to psychiatry, we motivate an exploratory workflow leading from GWAS screening, through causal testing in animal models using methods such as optogenetics, to new therapies in human beings. We focus on schizophrenia and the dopamine D2 receptor (DRD2), hot flashes and the neurokinin B receptor (TACR3), cigarette smoking and receptors bound by nicotine (CHRNA5, CHRNA3, CHRNB4), and alcohol use and enzymes that help to break down alcohol (ADH1B, ADH1C, ADH7). A single genomic locus may not powerfully determine disease at the level of the population, but the same locus may nevertheless represent a potent treatment target suitable for population-wide therapeutic approaches.
Affiliation(s)
- Laramie Duncan
  - Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA.
- Karl Deisseroth
  - Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA.
  - Department of Bioengineering and Howard Hughes Medical Institute, Stanford University, Stanford, CA, USA.
|
35
|
Cheng S, Jacobs CGC, Mogollón Pérez EA, Chen D, van de Sanden JT, Bretscher KM, Verweij F, Bosman JS, Hackmann A, Merks RMH, van den Heuvel J, van der Zee M. A life-history allele of large effect shortens developmental time in a wild insect population. Nat Ecol Evol 2024; 8:70-82. [PMID: 37957313 DOI: 10.1038/s41559-023-02246-y] [Received: 04/26/2023] [Accepted: 10/04/2023] [Indexed: 11/15/2023]
Abstract
Developmental time is a key life-history trait with large effects on Darwinian fitness. In many insects, developmental time is currently under strong selection to minimize ecological mismatches in seasonal timing induced by climate change. The genetic basis of responses to such selection, however, is poorly understood. To address this problem, we set up a long-term evolve-and-resequence experiment in the beetle Tribolium castaneum and selected replicate, outbred populations for fast or slow embryonic development. The response to this selection was substantial and embryonic developmental timing of the selection lines started to diverge during dorsal closure. Pooled whole-genome resequencing, gene expression analysis and an RNAi screen pinpoint a 222 bp deletion containing binding sites for Broad and Tramtrack upstream of the ecdysone degrading enzyme Cyp18a1 as a main target of selection. Using CRISPR/Cas9 to reconstruct this allele in the homogenous genetic background of a laboratory strain, we unravel how this single deletion advances the embryonic ecdysone peak inducing dorsal closure and show that this allele accelerates larval development but causes a trade-off with fecundity. Our study uncovers a life-history allele of large effect and reveals the evolvability of developmental time in a natural insect population.
Affiliation(s)
- Shixiong Cheng
  - Institute of Biology, Leiden University, Leiden, the Netherlands
- Chris G C Jacobs
  - Institute of Biology, Leiden University, Leiden, the Netherlands
- Elisa A Mogollón Pérez
  - Institute of Biology, Leiden University, Leiden, the Netherlands
  - Faculty of Biology, Ludwig-Maximilians-Universität München, Planegg-Martinsried, Germany
- Daipeng Chen
  - Mathematical Institute, Leiden University, Leiden, the Netherlands
- Joep T van de Sanden
  - Institute of Biology, Leiden University, Leiden, the Netherlands
  - Department of Radiology and Nuclear Medicine, Erasmus MC, Rotterdam, the Netherlands
- Femke Verweij
  - Institute of Biology, Leiden University, Leiden, the Netherlands
- Jelle S Bosman
  - Institute of Biology, Leiden University, Leiden, the Netherlands
- Amke Hackmann
  - Institute of Biology, Leiden University, Leiden, the Netherlands
- Roeland M H Merks
  - Institute of Biology, Leiden University, Leiden, the Netherlands
  - Mathematical Institute, Leiden University, Leiden, the Netherlands
- Joost van den Heuvel
  - Laboratory of Genetics, Wageningen University and Research, Wageningen, the Netherlands
|
36
|
Zhang MJ, Durvasula A, Chiang C, Koch EM, Strober BJ, Shi H, Barton AR, Kim SS, Weissbrod O, Loh PR, Gazal S, Sunyaev S, Price AL. Pervasive correlations between causal disease effects of proximal SNPs vary with functional annotations and implicate stabilizing selection. Research Square 2023:rs.3.rs-3707248. [PMID: 38168385 PMCID: PMC10760228 DOI: 10.21203/rs.3.rs-3707248/v1] [Indexed: 01/05/2024]
Abstract
The genetic architecture of human diseases and complex traits has been extensively studied, but little is known about the relationship of causal disease effect sizes between proximal SNPs, which have largely been assumed to be independent. We introduce a new method, LD SNP-pair effect correlation regression (LDSPEC), to estimate the correlation of causal disease effect sizes of derived alleles between proximal SNPs, depending on their allele frequencies, LD, and functional annotations; LDSPEC produced robust estimates in simulations across various genetic architectures. We applied LDSPEC to 70 diseases and complex traits from the UK Biobank (average N=306K), meta-analyzing results across diseases/traits. We detected significantly nonzero effect correlations for proximal SNP pairs (e.g., -0.37±0.09 for low-frequency positive-LD 0-100bp SNP pairs) that decayed with distance (e.g., -0.07±0.01 for low-frequency positive-LD 1-10kb), varied with allele frequency (e.g., -0.15±0.04 for common positive-LD 0-100bp), and varied with LD between SNPs (e.g., +0.12±0.05 for common negative-LD 0-100bp) (because we consider derived alleles, positive-LD and negative-LD SNP pairs may yield very different results). We further determined that SNP pairs with shared functions had stronger effect correlations that spanned longer genomic distances, e.g., -0.37±0.08 for low-frequency positive-LD same-gene promoter SNP pairs (average genomic distance of 47kb (due to alternative splicing)) and -0.32±0.04 for low-frequency positive-LD H3K27ac 0-1kb SNP pairs. Consequently, SNP-heritability estimates were substantially smaller than estimates of the sum of causal effect size variances across all SNPs (ratio of 0.87±0.02 across diseases/traits), particularly for certain functional annotations (e.g., 0.78±0.01 for common Super enhancer SNPs), even though these quantities are widely assumed to be equal. We recapitulated our findings via forward simulations with an evolutionary model involving stabilizing selection, implicating the action of linkage masking, whereby haplotypes containing linked SNPs with opposite effects on disease have reduced effects on fitness and escape negative selection.
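The gap between SNP-heritability and the sum of causal effect size variances can be illustrated with a short identity. This is a sketch under assumptions that are not stated in the abstract: genotypes are standardized with LD matrix R = (r_jk) and r_jj = 1, the phenotype has unit variance, and the causal effects β_j are zero-mean random variables. The notation is illustrative only.

```latex
h^2_{\mathrm{SNP}}
  = \mathbb{E}\!\left[\beta^{\top} R\, \beta\right]
  = \sum_{j} \operatorname{Var}(\beta_j)
  + \sum_{j \neq k} r_{jk}\, \operatorname{Cov}(\beta_j, \beta_k)
```

Under these assumptions, if SNP pairs in positive LD (r_jk > 0) tend to have negatively correlated effects, the cross term is negative and the SNP-heritability falls below the sum of effect variances, which is the direction of the reported ratio of 0.87.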
Affiliation(s)
- Martin Jinye Zhang
  - Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Arun Durvasula
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California
- Colby Chiang
  - Department of Pediatrics, Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA
- Evan M. Koch
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Benjamin J. Strober
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Huwenbo Shi
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Alison R. Barton
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Samuel S. Kim
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Omer Weissbrod
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Po-Ru Loh
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Steven Gazal
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California
  - Department of Quantitative and Computational Biology, University of Southern California
  - Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California
- Shamil Sunyaev
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Division of Genetics, Department of Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Alkes L. Price
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
|
37
|
Matkarimov BT, Saparbaev MK. Chargaff's second parity rule lies at the origin of additive genetic interactions in quantitative traits to make omnigenic selection possible. PeerJ 2023; 11:e16671. [PMID: 38107580 PMCID: PMC10725672 DOI: 10.7717/peerj.16671] [Received: 07/21/2023] [Accepted: 11/22/2023] [Indexed: 12/19/2023] Open
Abstract
Background Francis Crick's central dogma provides a residue-by-residue mechanistic explanation of the flow of genetic information in living systems. However, this principle may not be sufficient for explaining how random mutations cause continuous variation of quantitative highly polygenic complex traits. Chargaff's second parity rule (CSPR), also referred to as intrastrand DNA symmetry, defined as near-exact equalities G ≈ C and A ≈ T within a single DNA strand, is a statistical property of cellular genomes. The phenomenon of intrastrand DNA symmetry was discovered more than 50 years ago; at present, it remains unclear what its biological role is, what the mechanisms are that force cellular genomes to comply strictly with CSPR, and why genomes of certain noncellular organisms have broken intrastrand DNA symmetry. The present work is aimed at studying a possible link between intrastrand DNA symmetry and the origin of genetic interactions in quantitative traits. Methods Computational analysis of single-nucleotide polymorphisms in human and mouse populations and of nucleotide composition biases at different codon positions in bacterial and human proteomes. Results The analysis of mutation spectra inferred from single-nucleotide polymorphisms observed in murine and human populations revealed near-exact equalities of numbers of reverse complementary mutations, indicating that random genetic variations obey CSPR. Furthermore, nucleotide compositions of coding sequences proved to be statistically interwoven via CSPR because pyrimidine bias at the 3rd codon position compensates purine bias at the 1st and 2nd positions. Conclusions According to Fisher's infinitesimal model, we propose that accumulation of reverse complementary mutations results in a continuous phenotypic variation due to small additive effects of statistically interwoven genetic variations. Therefore, additive genetic interactions can be inferred as a statistical entanglement of nucleotide compositions of separate genetic loci. CSPR challenges the neutral theory of molecular evolution, because all random mutations participate in variation of a trait, and provides an alternative solution to Haldane's dilemma by making a gene function diffuse. We propose that CSPR is symmetry of Fisher's infinitesimal model and that genetic information can be transferred in an implicit contactless manner.
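The definition of CSPR given in this abstract (G ≈ C and A ≈ T within a single strand) can be checked on any sequence with a few lines of code. The sketch below is illustrative only: the function names and the toy sequence are hypothetical, and it measures single-base parity rather than reproducing the mutation-spectrum analysis described in the paper.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def base_counts(strand):
    """Count A, C, G and T in a single DNA strand (other symbols ignored)."""
    counts = Counter(strand.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}


def parity_deviation(strand):
    """Relative A/T and G/C imbalance within one strand; 0.0 is exact parity."""
    c = base_counts(strand)

    def rel(x, y):
        return abs(x - y) / (x + y) if (x + y) else 0.0

    return {"AT": rel(c["A"], c["T"]), "GC": rel(c["G"], c["C"])}


def reverse_complement(strand):
    """Reverse complement of a strand containing only A, C, G and T."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


# Toy sequence, invented for illustration: appending its reverse complement
# yields a single strand that satisfies the parity rule exactly.
seed = "AAGGC"
print(parity_deviation(seed))
print(parity_deviation(seed + reverse_complement(seed)))
```

The second call shows one way a strand can satisfy the rule: any segment paired with its reverse complement on the same strand balances A against T and G against C.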
Affiliation(s)
- Bakhyt T. Matkarimov
  - National Laboratory Astana, Nazarbayev University, Astana, Kazakhstan
  - L.N.Gumilev Eurasian National University, Astana, Kazakhstan
- Murat K. Saparbaev
  - Groupe «Mechanisms of DNA Repair and Carcinogenesis», CNRS UMR9019, Gustave Roussy Cancer Campus, Université Paris-Saclay, Villejuif, France
  - Al-Farabi Kazakh National University, Almaty, Kazakhstan
|
38
|
Privé F, Albiñana C, Arbel J, Pasaniuc B, Vilhjálmsson BJ. Inferring disease architecture and predictive ability with LDpred2-auto. Am J Hum Genet 2023; 110:2042-2055. [PMID: 37944514 PMCID: PMC10716363 DOI: 10.1016/j.ajhg.2023.10.010] [Received: 06/14/2023] [Revised: 10/15/2023] [Accepted: 10/17/2023] [Indexed: 11/12/2023] Open
Abstract
LDpred2 is a widely used Bayesian method for building polygenic scores (PGSs). LDpred2-auto can infer the two parameters from the LDpred model, the SNP heritability h² and polygenicity p, so that it does not require an additional validation dataset to choose best-performing parameters. The main aim of this paper is to properly validate the use of LDpred2-auto for inferring multiple genetic parameters. Here, we present a new version of LDpred2-auto that adds an optional third parameter α to its model, for modeling negative selection. We then validate the inference of these three parameters (or two, when using the previous model). We also show that LDpred2-auto provides per-variant probabilities of being causal that are well calibrated and can therefore be used for fine-mapping purposes. We also introduce a formula to infer the out-of-sample predictive performance r² of the resulting PGS directly from the Gibbs sampler of LDpred2-auto. Finally, we extend the set of HapMap3 variants recommended to use with LDpred2 with 37% more variants to improve the coverage of this set, and we show that this new set of variants captures 12% more heritability and provides 6% more predictive performance, on average, in UK Biobank analyses.
Affiliation(s)
- Florian Privé
  - National Centre for Register-based Research, Aarhus University, Aarhus, Denmark.
- Clara Albiñana
  - National Centre for Register-based Research, Aarhus University, Aarhus, Denmark
- Julyan Arbel
  - University Grenoble Alpes, Inria, CNRS, Grenoble INP, LJK, Grenoble, France
- Bogdan Pasaniuc
  - Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA, USA; Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA; Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA; Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Bjarni J Vilhjálmsson
  - National Centre for Register-based Research, Aarhus University, Aarhus, Denmark; Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark; Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute, Cambridge, MA, USA
|
39
|
Zhang MJ, Durvasula A, Chiang C, Koch EM, Strober BJ, Shi H, Barton AR, Kim SS, Weissbrod O, Loh PR, Gazal S, Sunyaev S, Price AL. Pervasive correlations between causal disease effects of proximal SNPs vary with functional annotations and implicate stabilizing selection. medRxiv 2023:2023.12.04.23299391. [PMID: 38106023 PMCID: PMC10723494 DOI: 10.1101/2023.12.04.23299391] [Indexed: 12/19/2023]
Abstract
The genetic architecture of human diseases and complex traits has been extensively studied, but little is known about the relationship of causal disease effect sizes between proximal SNPs, which have largely been assumed to be independent. We introduce a new method, LD SNP-pair effect correlation regression (LDSPEC), to estimate the correlation of causal disease effect sizes of derived alleles between proximal SNPs, depending on their allele frequencies, LD, and functional annotations; LDSPEC produced robust estimates in simulations across various genetic architectures. We applied LDSPEC to 70 diseases and complex traits from the UK Biobank (average N=306K), meta-analyzing results across diseases/traits. We detected significantly nonzero effect correlations for proximal SNP pairs (e.g., -0.37±0.09 for low-frequency positive-LD 0-100bp SNP pairs) that decayed with distance (e.g., -0.07±0.01 for low-frequency positive-LD 1-10kb), varied with allele frequency (e.g., -0.15±0.04 for common positive-LD 0-100bp), and varied with LD between SNPs (e.g., +0.12±0.05 for common negative-LD 0-100bp) (because we consider derived alleles, positive-LD and negative-LD SNP pairs may yield very different results). We further determined that SNP pairs with shared functions had stronger effect correlations that spanned longer genomic distances, e.g., -0.37±0.08 for low-frequency positive-LD same-gene promoter SNP pairs (average genomic distance of 47kb (due to alternative splicing)) and -0.32±0.04 for low-frequency positive-LD H3K27ac 0-1kb SNP pairs. Consequently, SNP-heritability estimates were substantially smaller than estimates of the sum of causal effect size variances across all SNPs (ratio of 0.87±0.02 across diseases/traits), particularly for certain functional annotations (e.g., 0.78±0.01 for common Super enhancer SNPs), even though these quantities are widely assumed to be equal. We recapitulated our findings via forward simulations with an evolutionary model involving stabilizing selection, implicating the action of linkage masking, whereby haplotypes containing linked SNPs with opposite effects on disease have reduced effects on fitness and escape negative selection.
Affiliation(s)
- Martin Jinye Zhang
  - Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Arun Durvasula
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California
- Colby Chiang
  - Department of Pediatrics, Division of Genetics and Genomics, Boston Children's Hospital, Boston, MA
- Evan M Koch
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Benjamin J Strober
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Huwenbo Shi
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Alison R Barton
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Samuel S Kim
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Omer Weissbrod
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Po-Ru Loh
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Steven Gazal
  - Center for Genetic Epidemiology, Department of Population and Public Health Sciences, Keck School of Medicine, University of Southern California
  - Department of Quantitative and Computational Biology, University of Southern California
  - Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California
- Shamil Sunyaev
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Alkes L Price
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
|
40
|
Li H, Mazumder R, Lin X. Accurate and efficient estimation of local heritability using summary statistics and the linkage disequilibrium matrix. Nat Commun 2023; 14:7954. [PMID: 38040712 PMCID: PMC10692177 DOI: 10.1038/s41467-023-43565-9] [Received: 04/17/2023] [Accepted: 11/14/2023] [Indexed: 12/03/2023] Open
Abstract
Existing SNP-heritability estimators that leverage summary statistics from genome-wide association studies (GWAS) are much less efficient (i.e., have larger standard errors) than the restricted maximum likelihood (REML) estimators which require access to individual-level data. We introduce a new method for local heritability estimation, Heritability Estimation with high Efficiency using LD and association Summary Statistics (HEELS), that significantly improves the statistical efficiency of summary-statistics-based heritability estimators and attains statistical efficiency comparable to REML (with a relative statistical efficiency >92%). Moreover, we propose representing the empirical LD matrix as the sum of a low-rank matrix and a banded matrix. We show that this way of modeling the LD can not only reduce the storage and memory cost, but also improve the computational efficiency of heritability estimation. We demonstrate the statistical efficiency of HEELS and the advantages of our proposed LD approximation strategies both in simulations and through empirical analyses of the UK Biobank data.
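The idea of writing an LD matrix as a banded matrix plus a low-rank matrix can be sketched with NumPy. This is a generic illustration, not the HEELS implementation: the function name, the choice to take the band first and then a truncated eigendecomposition of the remainder, and the toy matrix are all assumptions made for this example.

```python
import numpy as np


def banded_plus_low_rank(ld, bandwidth, rank):
    """Split a symmetric LD matrix into a banded part and a low-rank part.

    The banded part keeps entries with |i - j| <= bandwidth. The low-rank
    part is a truncated eigendecomposition of what remains, keeping the
    `rank` eigenvalues that are largest in absolute value.
    """
    m = ld.shape[0]
    idx = np.arange(m)
    in_band = np.abs(idx[:, None] - idx[None, :]) <= bandwidth
    banded = np.where(in_band, ld, 0.0)

    residual = ld - banded  # symmetric, zero inside the band
    eigvals, eigvecs = np.linalg.eigh(residual)
    top = np.argsort(np.abs(eigvals))[::-1][:rank]
    low_rank = (eigvecs[:, top] * eigvals[top]) @ eigvecs[:, top].T
    return banded, low_rank


# Toy LD matrix with correlation decaying as 0.5 ** |i - j| (made up).
m = 6
ld = np.array([[0.5 ** abs(i - j) for j in range(m)] for i in range(m)])
banded, low_rank = banded_plus_low_rank(ld, bandwidth=1, rank=2)
print(np.linalg.norm(ld - (banded + low_rank)))  # Frobenius approximation error
```

The storage argument follows from the shapes: the band needs on the order of m × bandwidth numbers and the low-rank factor m × rank numbers, instead of m² for the dense matrix.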
Affiliation(s)
- Hui Li
  - Harvard T.H. Chan School of Public Health, Department of Biostatistics, Boston, MA, USA
- Rahul Mazumder
  - Massachusetts Institute of Technology, Operations Research and Statistics group, Cambridge, MA, USA
- Xihong Lin
  - Harvard T.H. Chan School of Public Health, Department of Biostatistics, Boston, MA, USA.
  - Harvard University, Department of Statistics, Cambridge, MA, USA.
|
41
|
Zhou D, Zhou Y, Xu Y, Meng R, Gamazon ER. A phenome-wide scan reveals convergence of common and rare variant associations. Genome Med 2023; 15:101. [PMID: 38017547 PMCID: PMC10683189 DOI: 10.1186/s13073-023-01253-9] [Received: 07/04/2023] [Accepted: 11/08/2023] [Indexed: 11/30/2023] Open
Abstract
BACKGROUND Common and rare variants contribute to the etiology of complex traits. However, the extent to which the phenotypic effects of common and rare variants involve shared molecular mediators remains poorly understood. The question is essential to the basic and translational goals of the science of genomics, with critical basic-science, methodological, and clinical consequences. METHODS Leveraging the latest release of whole-exome sequencing (WES, for rare variants) and genome-wide association study (GWAS, for common variants) data from the UK Biobank, we developed a metric, the COmmon variant and RAre variant Convergence (CORAC) signature, to quantify the convergence for a broad range of complex traits. We characterized the relationship between CORAC and effective sample size across phenome-wide association studies. RESULTS We found that the signature is positively correlated with effective sample size (Spearman ρ = 0.594, P < 2.2e-16), indicating increased functional convergence of trait-associated genetic variation, across the allele frequency spectrum, with increased power. Sensitivity analyses, including accounting for heteroskedasticity and varying the number of detected association signals, further strengthened the validity of the finding. In addition, consistent with empirical data, extensive simulations showed that negative selection, in line with enhancing polygenicity, has a dampening effect on the convergence signature. Methodologically, leveraging the convergence leads to enhanced association analysis. CONCLUSIONS The presented framework for the convergence signature has important implications for fine-mapping strategies and drug discovery efforts. In addition, our study provides a blueprint for the expectation from future large-scale whole-genome sequencing (WGS)/WES and sheds methodological light on post-GWAS studies.
Affiliation(s)
- Dan Zhou
  - School of Public Health and the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China.
  - Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA.
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA.
  - The Key Laboratory of Intelligent Preventive Medicine of Zhejiang Province, Hangzhou, China.
- Yuan Zhou
  - Department of Biostatistics and Center for Quantitative Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
- Yue Xu
  - School of Public Health and the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
  - The Key Laboratory of Intelligent Preventive Medicine of Zhejiang Province, Hangzhou, China
- Ran Meng
  - School of Public Health and the Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China
- Eric R Gamazon
  - Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA.
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA.
  - Data Science Institute, Vanderbilt University Medical Center, Nashville, TN, USA.
|
42
|
Majumdar A, Pasaniuc B. A Bayesian method for estimating gene-level polygenicity under the framework of transcriptome-wide association study. Stat Med 2023; 42:4867-4885. [PMID: 37643728 DOI: 10.1002/sim.9892] [Received: 06/25/2022] [Revised: 06/03/2023] [Accepted: 08/10/2023] [Indexed: 08/31/2023]
Abstract
Polygenicity refers to the phenomenon that multiple genetic variants have a nonzero effect on a complex trait. It is defined as the proportion of genetic variants with a nonzero effect on the trait. Evaluation of polygenicity can provide valuable insights into the genetic architecture of the trait. Several recent works have attempted to estimate polygenicity at the single nucleotide polymorphism level. However, evaluating polygenicity at the gene level can be biologically more meaningful. We propose the notion of gene-level polygenicity, defined as the proportion of genes having a nonzero effect on the trait under the framework of a transcriptome-wide association study. We introduce a Bayesian approach, genepoly, to estimate this quantity for a trait. The method is based on a spike-and-slab prior and simultaneously estimates the subset of non-null genes. Our simulation study shows that genepoly efficiently estimates gene-level polygenicity. The method produces a downward bias for small choices of trait heritability due to a non-null gene, which diminishes rapidly with an increase in the genome-wide association study (GWAS) sample size. While identifying the subset of non-null genes, genepoly offers a high level of specificity and an overall good level of sensitivity; the sensitivity increases as the sample size of the reference panel expression and GWAS data increase. We applied the method to seven phenotypes in the UK Biobank, integrating expression data. We find height to be the most polygenic and asthma to be the least polygenic.
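The definition of polygenicity as a proportion of nonzero effects, together with the spike-and-slab prior mentioned in this abstract, can be made concrete with a small simulation. The sketch below only draws effects from a generic spike-and-slab distribution and counts the nonzero ones; it is not the genepoly inference procedure, and the function names and parameter values are hypothetical.

```python
import random


def draw_spike_and_slab(n_genes, p_nonnull, slab_sd, seed=0):
    """Draw gene effects: exactly 0 with probability 1 - p_nonnull (spike),
    otherwise a normal draw with standard deviation slab_sd (slab)."""
    rng = random.Random(seed)
    return [
        rng.gauss(0.0, slab_sd) if rng.random() < p_nonnull else 0.0
        for _ in range(n_genes)
    ]


def polygenicity(effects):
    """Proportion of effects that are nonzero."""
    return sum(1 for e in effects if e != 0.0) / len(effects)


# Hypothetical settings chosen only for illustration.
effects = draw_spike_and_slab(n_genes=20000, p_nonnull=0.1, slab_sd=1.0)
print(polygenicity(effects))
```

In real data the true effects are unobserved, so the proportion has to be inferred from noisy association statistics, which is the problem a Bayesian method of this kind addresses.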
Affiliation(s)
- Arunabha Majumdar
- Department of Mathematics, Indian Institute of Technology Hyderabad, Kandi, Telangana, India
- Bogdan Pasaniuc
- Department of Pathology and Laboratory Medicine, University of California, Los Angeles, Los Angeles, California
|
43
|
Ratajczak F, Joblin M, Hildebrandt M, Ringsquandl M, Falter-Braun P, Heinig M. Speos: an ensemble graph representation learning framework to predict core gene candidates for complex diseases. Nat Commun 2023; 14:7206. [PMID: 37938585 PMCID: PMC10632370 DOI: 10.1038/s41467-023-42975-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Accepted: 10/27/2023] [Indexed: 11/09/2023] Open
Abstract
Understanding phenotype-to-genotype relationships is a grand challenge of 21st century biology with translational implications. The recently proposed "omnigenic" model postulates that effects of genetic variation on traits are mediated by core-genes and -proteins whose activities mechanistically influence the phenotype, whereas peripheral genes encode a regulatory network that indirectly affects phenotypes via core gene products. Here, we develop a positive-unlabeled graph representation-learning ensemble-approach based on a nested cross-validation to predict core-like genes for diverse diseases using Mendelian disorder genes for training. Employing mouse knockout phenotypes for external validations, we demonstrate that core-like genes display several key properties of core genes: Mouse knockouts of genes corresponding to our most confident predictions give rise to relevant mouse phenotypes at rates on par with the Mendelian disorder genes, and all candidates exhibit core gene properties like transcriptional deregulation in disease and loss-of-function intolerance. Moreover, as predicted for core genes, our candidates are enriched for drug targets and druggable proteins. In contrast to Mendelian disorder genes the new core-like genes are enriched for druggable yet untargeted gene products, which are therefore attractive targets for drug development. Interpretation of the underlying deep learning model suggests plausible explanations for our core gene predictions in form of molecular mechanisms and physical interactions. Our results demonstrate the potential of graph representation learning for the interpretation of biological complexity and pave the way for studying core gene properties and future drug development.
Affiliation(s)
- Florin Ratajczak
- Institute of Network Biology (INET), Molecular Targets and Therapeutics Center (MTTC), Helmholtz Munich, Neuherberg, Germany
- Pascal Falter-Braun
- Institute of Network Biology (INET), Molecular Targets and Therapeutics Center (MTTC), Helmholtz Munich, Neuherberg, Germany
- Microbe-Host Interactions, Faculty of Biology, Ludwig-Maximilians-Universität München, Planegg-Martinsried, Germany
- Matthias Heinig
- Institute of Computational Biology (ICB), Helmholtz Munich, Neuherberg, Germany
- Department of Computer Science, TUM School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
- German Centre for Cardiovascular Research (DZHK), Munich Heart Association, Partner Site Munich, Berlin, Germany
|
44
|
Amariuta T. The power paradox of detecting disease-associated and gene-expression-associated variants. Nat Genet 2023; 55:1782-1783. [PMID: 37857936 DOI: 10.1038/s41588-023-01525-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2023]
Affiliation(s)
- Tiffany Amariuta
- Halıcıoğlu Data Science Institute, University of California, San Diego, La Jolla, CA, USA
- Department of Medicine, University of California, San Diego, La Jolla, CA, USA
|
45
|
Yubero P, Lavin AA, Poyatos JF. The limitations of phenotype prediction in metabolism. PLoS Comput Biol 2023; 19:e1011631. [PMID: 37948461 PMCID: PMC10664875 DOI: 10.1371/journal.pcbi.1011631] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2023] [Revised: 11/22/2023] [Accepted: 10/24/2023] [Indexed: 11/12/2023] Open
Abstract
Phenotype prediction is at the center of many questions in biology. Prediction is often achieved by determining statistical associations between genetic and phenotypic variation, ignoring the exact processes that cause the phenotype. Here, we present a framework based on genome-scale metabolic reconstructions to reveal the mechanisms behind the associations. We calculated a polygenic score (PGS) that identifies a set of enzymes as predictors of growth, the phenotype. This set arises from the synergy of the functional mode of metabolism in a particular setting and its evolutionary history, and is suitable to infer the phenotype across a variety of conditions. We also find that there is optimal genetic variation for predictability and demonstrate how the linear PGS can still explain phenotypes generated by the underlying nonlinear biochemistry. Therefore, the explicit model interprets the black box statistical associations of the genotype-to-phenotype map and helps to discover what limits the prediction in metabolism.
Affiliation(s)
- Pablo Yubero
- Logic of Genomic Systems Lab, CNB-CSIC, Madrid, Spain
|
46
|
Mostafavi H, Spence JP, Naqvi S, Pritchard JK. Systematic differences in discovery of genetic effects on gene expression and complex traits. Nat Genet 2023; 55:1866-1875. [PMID: 37857933 DOI: 10.1038/s41588-023-01529-1] [Citation(s) in RCA: 67] [Impact Index Per Article: 67.0] [Received: 05/15/2022] [Accepted: 09/14/2023] [Indexed: 10/21/2023]
Abstract
Most signals in genome-wide association studies (GWAS) of complex traits implicate noncoding genetic variants with putative gene regulatory effects. However, currently identified regulatory variants, notably expression quantitative trait loci (eQTLs), explain only a small fraction of GWAS signals. Here, we show that GWAS and cis-eQTL hits are systematically different: eQTLs cluster strongly near transcription start sites, whereas GWAS hits do not. Genes near GWAS hits are enriched in key functional annotations, are under strong selective constraint and have complex regulatory landscapes across different tissue/cell types, whereas genes near eQTLs are depleted of most functional annotations, show relaxed constraint, and have simpler regulatory landscapes. We describe a model to understand these observations, including how natural selection on complex traits hinders discovery of functionally relevant eQTLs. Our results imply that GWAS and eQTL studies are systematically biased toward different types of variant, and support the use of complementary functional approaches alongside the next generation of eQTL studies.
Affiliation(s)
- Sahin Naqvi
- Department of Genetics, Stanford University, Stanford, CA, USA
- Department of Chemical and Systems Biology, Stanford University, Stanford, CA, USA
- Jonathan K Pritchard
- Department of Genetics, Stanford University, Stanford, CA, USA
- Department of Biology, Stanford University, Stanford, CA, USA
|
47
|
Liang Y, Nyasimi F, Im HK. On the problem of inflation in transcriptome-wide association studies. bioRxiv 2023:2023.10.17.562831. [PMID: 37904952 PMCID: PMC10614931 DOI: 10.1101/2023.10.17.562831] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 11/01/2023]
Abstract
Hundreds of thousands of loci have been associated with complex traits via genome-wide association studies (GWAS), but an understanding of the mechanistic connection between GWAS loci and disease remains elusive. Genetic predictors of molecular traits are useful for identifying the mediating roles of molecular traits and prioritizing actionable targets for intervention, as demonstrated in transcriptome-wide association studies (TWAS) and related studies. Given the widespread polygenicity of complex traits, it is imperative to understand the effect of polygenicity on the validity of these mediator-trait association tests. We found that for highly polygenic target traits, the standard test based on linear regression is inflated ($E[\chi^2_{\mathrm{TWAS}}] > 1$). This inflation has implications for all TWAS and related methods where the complex trait can be highly polygenic, even if the mediating trait is sparse. We derive an asymptotic expression of the inflation, estimate the inflation for gene expression, metabolites, and brain image derived features, and propose a solution to correct the inflation.
Affiliation(s)
- Yanyu Liang
- Section of Genetic Medicine, University of Chicago, Chicago, Illinois, United States of America
- Festus Nyasimi
- Section of Genetic Medicine, University of Chicago, Chicago, Illinois, United States of America
- Hae Kyung Im
- Section of Genetic Medicine, University of Chicago, Chicago, Illinois, United States of America
- Computing Environment and Life Sciences Directorate, Argonne National Laboratory, Argonne, Illinois, United States of America
|
48
|
Owen MJ, Legge SE, Rees E, Walters JTR, O'Donovan MC. Genomic findings in schizophrenia and their implications. Mol Psychiatry 2023; 28:3638-3647. [PMID: 37853064 PMCID: PMC10730422 DOI: 10.1038/s41380-023-02293-8] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 04/28/2023] [Revised: 10/02/2023] [Accepted: 10/03/2023] [Indexed: 10/20/2023]
Abstract
There has been substantial progress in understanding the genetics of schizophrenia over the past 15 years. This has revealed a highly polygenic condition with the majority of the currently explained heritability coming from common alleles of small effect but with additional contributions from rare copy number and coding variants. Many specific genes and loci have been implicated that provide a firm basis upon which mechanistic research can proceed. These point to disturbances in neuronal, and particularly synaptic, functions that are not confined to a small number of brain regions and circuits. Genetic findings have also revealed the nature of schizophrenia's close relationship to other conditions, particularly bipolar disorder and childhood neurodevelopmental disorders, and provided an explanation for how common risk alleles persist in the population in the face of reduced fecundity. Current genomic approaches only potentially explain around 40% of heritability, but only a small proportion of this is attributable to robustly identified loci. The extreme polygenicity poses challenges for understanding biological mechanisms. The high degree of pleiotropy points to the need for more transdiagnostic research and the shortcomings of current diagnostic criteria as means of delineating biologically distinct strata. It also poses challenges for inferring causality in observational and experimental studies in both humans and model systems. Finally, the Eurocentric bias of genomic studies needs to be rectified to maximise benefits and ensure these are felt across diverse communities. Further advances are likely to come through the application of new and emerging technologies, such as whole-genome and long-read sequencing, to large and diverse samples. Substantive progress in biological understanding will require parallel advances in functional genomics and proteomics applied to the brain across developmental stages. 
For these efforts to succeed in identifying disease mechanisms and defining novel strata they will need to be combined with sufficiently granular phenotypic data.
Affiliation(s)
- Michael J Owen
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, Cardiff University, Cardiff, UK
- Sophie E Legge
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, Cardiff University, Cardiff, UK
- Elliott Rees
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, Cardiff University, Cardiff, UK
- James T R Walters
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, Cardiff University, Cardiff, UK
- Michael C O'Donovan
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, Cardiff University, Cardiff, UK
|
49
|
Gusev A. Germline mechanisms of immunotherapy toxicities in the era of genome-wide association studies. Immunol Rev 2023; 318:138-156. [PMID: 37515388 PMCID: PMC11472697 DOI: 10.1111/imr.13253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Accepted: 06/29/2023] [Indexed: 07/30/2023]
Abstract
Cancer immunotherapy has revolutionized the treatment of advanced cancers and is quickly becoming an option for early-stage disease. By reactivating the host immune system, immunotherapy harnesses patients' innate defenses to eradicate the tumor. By putatively similar mechanisms, immunotherapy can also substantially increase the risk of toxicities or immune-related adverse events (irAEs). Severe irAEs can lead to hospitalization, treatment discontinuation, lifelong immune complications, or even death. Many irAEs present with similar symptoms to heritable autoimmune diseases, suggesting that germline genetics may contribute to their onset. Recently, genome-wide association studies (GWAS) of irAEs have identified common germline associations and putative mechanisms, lending support to this hypothesis. A wide range of well-established GWAS methods can potentially be harnessed to understand the etiology of irAEs specifically and immunotherapy outcomes broadly. This review summarizes current findings regarding germline effects on immunotherapy outcomes and discusses opportunities and challenges for leveraging germline genetics to understand, predict, and treat irAEs.
Affiliation(s)
- Alexander Gusev
- Division of Population Sciences, Dana-Farber Cancer Institute and Harvard Medical School, Boston, Massachusetts, USA
- Division of Genetics, Brigham & Women's Hospital, Boston, Massachusetts, USA
- The Broad Institute, Cambridge, Massachusetts, USA
|
50
|
De Lillo A, Wendt FR, Pathak GA, Polimanti R. Characterizing the polygenic architecture of complex traits in populations of East Asian and European descent. Hum Genomics 2023; 17:67. [PMID: 37475089 PMCID: PMC10360343 DOI: 10.1186/s40246-023-00514-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Accepted: 07/14/2023] [Indexed: 07/22/2023] Open
Abstract
To investigate the polygenicity of complex traits in populations of East Asian (EAS) and European (EUR) descents, we leveraged genome-wide data from Biobank Japan, UK Biobank, and FinnGen cohorts. Specifically, we analyzed up to 215 outcomes related to 18 health domains, assessing their polygenic architecture via descriptive statistics, such as the proportion of susceptibility SNPs per trait (πc). While we did not observe EAS-EUR differences in the overall distribution of polygenicity parameters across the phenotypes investigated, there were ancestry-specific patterns in the polygenicity differences between health domains. In EAS, pairwise comparisons across health domains showed enrichment for πc differences related to hematological and metabolic traits (hematological fold-enrichment = 4.45, p = 2.15 × 10⁻⁷; metabolic fold-enrichment = 4.05, p = 4.01 × 10⁻⁶). For both categories, the proportion of susceptibility SNPs was lower than that observed for several other health domains (EAS-hematological median πc = 0.15%, EAS-metabolic median πc = 0.18%) with the strongest πc difference with respect to respiratory traits (EAS-respiratory median πc = 0.50%; hematological p = 2.26 × 10⁻³; metabolic p = 3.48 × 10⁻³). In EUR, pairwise comparisons showed multiple πc differences related to the endocrine category (fold-enrichment = 5.83, p = 4.76 × 10⁻⁶), where these traits showed a low proportion of susceptibility SNPs (EUR-endocrine median πc = 0.01%) with the strongest difference with respect to psychiatric phenotypes (EUR-psychiatric median πc = 0.50%; p = 1.19 × 10⁻⁴). Simulating sample sizes of 1,000,000 and 5,000,000 individuals, we also showed that ancestry-specific polygenicity patterns translate into differences across health domains in the genetic variance explained by susceptibility SNPs projected to be genome-wide significant (e.g., EAS hematological-neoplasm p = 2.18 × 10⁻⁴; EUR endocrine-gastrointestinal p = 6.80 × 10⁻⁴).
These findings highlight that traits related to the same health domains may present ancestry-specific variability in their polygenicity.
Affiliation(s)
- Antonella De Lillo
- Department of Psychiatry, Yale University School of Medicine, 60 Temple, Suite 7A, New Haven, CT, 06510, USA
- Department of Biology, University of Rome "Tor Vergata", Rome, Italy
- Frank R Wendt
- Department of Psychiatry, Yale University School of Medicine, 60 Temple, Suite 7A, New Haven, CT, 06510, USA
- Department of Anthropology, University of Toronto, Mississauga, ON, Canada
- Biostatistics Division, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada
- Gita A Pathak
- Department of Psychiatry, Yale University School of Medicine, 60 Temple, Suite 7A, New Haven, CT, 06510, USA
- VA CT Healthcare Center, West Haven, CT, USA
- Renato Polimanti
- Department of Psychiatry, Yale University School of Medicine, 60 Temple, Suite 7A, New Haven, CT, 06510, USA
- VA CT Healthcare Center, West Haven, CT, USA
- Wu Tsai Institute, Yale University, New Haven, CT, USA
|