1
Wen C, Margolis M, Dai R, Zhang P, Przytycki PF, Vo DD, Bhattacharya A, Matoba N, Tang M, Jiao C, Kim M, Tsai E, Hoh C, Aygün N, Walker RL, Chatzinakos C, Clarke D, Pratt H, Peters MA, Gerstein M, Daskalakis NP, Weng Z, Jaffe AE, Kleinman JE, Hyde TM, Weinberger DR, Bray NJ, Sestan N, Geschwind DH, Roeder K, Gusev A, Pasaniuc B, Stein JL, Love MI, Pollard KS, Liu C, Gandal MJ. Cross-ancestry atlas of gene, isoform, and splicing regulation in the developing human brain. Science 2024; 384:eadh0829. [PMID: 38781368 DOI: 10.1126/science.adh0829] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Accepted: 03/07/2024] [Indexed: 05/25/2024]
Abstract
Neuropsychiatric genome-wide association studies (GWASs), including those for autism spectrum disorder and schizophrenia, show strong enrichment for regulatory elements in the developing brain. However, prioritizing risk genes and mechanisms is challenging without a unified regulatory atlas. Across 672 diverse developing human brains, we identified 15,752 genes harboring gene, isoform, and/or splicing quantitative trait loci, mapping 3739 to cellular contexts. Gene expression heritability drops during development, likely reflecting both increasing cellular heterogeneity and the intrinsic properties of neuronal maturation. Isoform-level regulation, particularly in the second trimester, mediated the largest proportion of GWAS heritability. Through colocalization, we prioritized mechanisms for about 60% of GWAS loci across five disorders, exceeding adult brain findings. Finally, we contextualized results within gene and isoform coexpression networks, revealing the comprehensive landscape of transcriptome regulation in development and disease.
Affiliation(s)
- Cindy Wen
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Michael Margolis
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Rujia Dai
- Department of Psychiatry, SUNY Upstate Medical University, Syracuse, NY 13210, USA
| | - Pan Zhang
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Pawel F Przytycki
- Gladstone Institute of Data Science and Biotechnology, San Francisco, CA 94158, USA
| | - Daniel D Vo
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Lifespan Brain Institute, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Arjun Bhattacharya
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Institute for Quantitative and Computational Biosciences, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Nana Matoba
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Miao Tang
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Lifespan Brain Institute, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Chuan Jiao
- Department of Psychiatry, SUNY Upstate Medical University, Syracuse, NY 13210, USA
- Université Paris Cité, Institute of Psychiatry and Neuroscience of Paris (IPNP), INSERM U1266, Team Krebs, 75014 Paris, France
| | - Minsoo Kim
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Ellen Tsai
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Celine Hoh
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Nil Aygün
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Rebecca L Walker
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Christos Chatzinakos
- Department of Psychiatry, Harvard Medical School, Boston, MA 02215, USA
- McLean Hospital, Belmont, MA 02478, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Declan Clarke
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
| | - Henry Pratt
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Mette A Peters
- CNS Data Coordination Group, Sage Bionetworks, Seattle, WA 98109, USA
| | - Mark Gerstein
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
- Department of Computer Science, Yale University, New Haven, CT 06520, USA
- Department of Statistics and Data Science, Yale University, New Haven, CT 06520, USA
| | - Nikolaos P Daskalakis
- Department of Psychiatry, Harvard Medical School, Boston, MA 02215, USA
- McLean Hospital, Belmont, MA 02478, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Andrew E Jaffe
- Lieber Institute for Brain Development, Baltimore, MD 21205, USA
- Department of Psychiatry & Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Department of Mental Health, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA
- Neumora Therapeutics, Watertown, MA 02472, USA
| | - Joel E Kleinman
- Lieber Institute for Brain Development, Baltimore, MD 21205, USA
- Department of Psychiatry & Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
| | - Thomas M Hyde
- Lieber Institute for Brain Development, Baltimore, MD 21205, USA
- Department of Psychiatry & Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
| | - Daniel R Weinberger
- Lieber Institute for Brain Development, Baltimore, MD 21205, USA
- Department of Psychiatry & Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
| | - Nicholas J Bray
- MRC Centre for Neuropsychiatric Genetics & Genomics, Division of Psychological Medicine & Clinical Neurosciences, Cardiff University School of Medicine, Cardiff CF24 4HQ, UK
| | - Nenad Sestan
- Department of Comparative Medicine, Yale University School of Medicine, New Haven, CT 06520, USA
- Department of Neuroscience, Yale University School of Medicine, New Haven, CT 06520, USA
| | - Daniel H Geschwind
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Institute for Precision Health, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Kathryn Roeder
- Department of Statistics & Data Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Alexander Gusev
- Department of Medical Oncology, Division of Population Sciences, Dana-Farber Cancer Institute, Boston, MA 02215, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Harvard Medical School, Boston, MA 02215, USA
- Division of Genetics, Brigham and Women's Hospital, Boston, MA 02215, USA
| | - Bogdan Pasaniuc
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Institute for Precision Health, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Jason L Stein
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Michael I Love
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| | - Katherine S Pollard
- Gladstone Institute of Data Science and Biotechnology, San Francisco, CA 94158, USA
- Department of Epidemiology & Biostatistics, University of California, San Francisco, San Francisco, CA 94158, USA
- Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
| | - Chunyu Liu
- Department of Psychiatry, SUNY Upstate Medical University, Syracuse, NY 13210, USA
- Center for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan 410008, China
| | - Michael J Gandal
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
- Lifespan Brain Institute, The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
2
Moors J, Krishnan M, Sumpter N, Takei R, Bixley M, Cadzow M, Major TJ, Phipps-Green A, Topless R, Merriman M, Rutledge M, Morgan B, Carlson JC, Zhang JZ, Russell EM, Sun G, Cheng H, Weeks DE, Naseri T, Reupena MS, Viali S, Tuitele J, Hawley NL, Deka R, McGarvey ST, de Zoysa J, Murphy R, Dalbeth N, Stamp L, Taumoepeau M, King F, Wilcox P, Rapana N, McCormick S, Minster RL, Merriman TR, Leask M. A Polynesian-specific missense CETP variant alters the lipid profile. HGG Advances 2023; 4:100204. [PMID: 37250494 PMCID: PMC10209881 DOI: 10.1016/j.xhgg.2023.100204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Revised: 05/01/2023] [Accepted: 05/02/2023] [Indexed: 05/31/2023] Open
Abstract
Identifying population-specific genetic variants associated with disease and disease-predisposing traits is important to provide insights into the genetic determinants of health and disease between populations, as well as furthering genomic justice. Various common pan-population polymorphisms at CETP associate with serum lipid profiles and cardiovascular disease. Here, sequencing of CETP identified a missense variant rs1597000001 (p.Pro177Leu) specific to Māori and Pacific people that associates with higher HDL-C and lower LDL-C levels. Each copy of the minor allele associated with higher HDL-C by 0.236 mmol/L and lower LDL-C by 0.133 mmol/L. The rs1597000001 effect on HDL-C is comparable with CETP Mendelian loss-of-function mutations that result in CETP deficiency, consistent with our data, which shows that rs1597000001 lowers CETP activity by 27.9%. This study highlights the potential of population-specific genetic analyses for improving equity in genomics and health outcomes for population groups underrepresented in genomic studies.
Affiliation(s)
- Jaye Moors
- Department of Biochemistry, University of Otago, Dunedin, New Zealand
| | - Mohanraj Krishnan
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
| | - Nick Sumpter
- Division of Clinical Rheumatology and Immunology, University of Alabama at Birmingham, Birmingham, AL, USA
| | - Riku Takei
- Division of Clinical Rheumatology and Immunology, University of Alabama at Birmingham, Birmingham, AL, USA
| | - Matt Bixley
- Department of Biochemistry, University of Otago, Dunedin, New Zealand
| | - Murray Cadzow
- Department of Biochemistry, University of Otago, Dunedin, New Zealand
| | - Tanya J. Major
- Department of Biochemistry, University of Otago, Dunedin, New Zealand
| | - Ruth Topless
- Department of Biochemistry, University of Otago, Dunedin, New Zealand
| | - Marilyn Merriman
- Department of Biochemistry, University of Otago, Dunedin, New Zealand
| | - Malcolm Rutledge
- Department of Biochemistry, University of Otago, Dunedin, New Zealand
| | - Ben Morgan
- Department of Biochemistry, University of Otago, Dunedin, New Zealand
| | - Jenna C. Carlson
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
- Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
| | - Jerry Z. Zhang
- Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
| | - Emily M. Russell
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
| | - Guangyun Sun
- Department of Environmental and Public Health Sciences, College of Medicine, University of Cincinnati, Cincinnati, OH, USA
| | - Hong Cheng
- Department of Environmental and Public Health Sciences, College of Medicine, University of Cincinnati, Cincinnati, OH, USA
| | - Daniel E. Weeks
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
- Department of Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
| | - Take Naseri
- Ministry of Health, Apia, Samoa
- International Health Institute, Department of Epidemiology, School of Public Health, Brown University, Providence, RI, USA
| | - John Tuitele
- Department of Public Health, Lyndon B. Johnson Tropical Medical Center, Faga’alu, American Samoa, USA
| | - Nicola L. Hawley
- Department of Chronic Disease Epidemiology, Yale School of Public Health, New Haven, CT, USA
| | - Ranjan Deka
- Department of Environmental and Public Health Sciences, College of Medicine, University of Cincinnati, Cincinnati, OH, USA
| | - Stephen T. McGarvey
- International Health Institute, Department of Epidemiology, School of Public Health, Brown University, Providence, RI, USA
| | - Janak de Zoysa
- Department of Medicine, University of Auckland, Auckland, New Zealand
| | - Rinki Murphy
- Department of Medicine, University of Auckland, Auckland, New Zealand
| | - Nicola Dalbeth
- Department of Medicine, University of Auckland, Auckland, New Zealand
| | - Lisa Stamp
- Department of Medicine, University of Otago, Christchurch, New Zealand
| | - Mele Taumoepeau
- Department of Psychology, University of Otago, Dunedin, New Zealand
| | - Frances King
- Ngāti Porou Hauora, Te Puia Springs, New Zealand
| | - Phillip Wilcox
- Department of Mathematics and Statistics, University of Otago, Dunedin, New Zealand
| | - Nuku Rapana
- Pukapukan Community Centre, Māngere, Auckland, New Zealand
| | - Sally McCormick
- Department of Biochemistry, University of Otago, Dunedin, New Zealand
| | - Ryan L. Minster
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
| | - Tony R. Merriman
- Department of Biochemistry, University of Otago, Dunedin, New Zealand
- Division of Clinical Rheumatology and Immunology, University of Alabama at Birmingham, Birmingham, AL, USA
| | - Megan Leask
- Department of Biochemistry, University of Otago, Dunedin, New Zealand
- Division of Clinical Rheumatology and Immunology, University of Alabama at Birmingham, Birmingham, AL, USA
3
Krishnan M, Phipps-Green A, Russell EM, Major TJ, Cadzow M, Stamp LK, Dalbeth N, Hindmarsh JH, Qasim M, Watson H, Liu S, Carlson JC, Minster RL, Hawley NL, Naseri T, Reupena MS, Deka R, McGarvey ST, Merriman TR, Murphy R, Weeks DE. Association of rs9939609 in FTO with BMI among Polynesian peoples living in Aotearoa New Zealand and other Pacific nations. J Hum Genet 2023; 68:463-468. [PMID: 36864286 PMCID: PMC10313811 DOI: 10.1038/s10038-023-01141-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2022] [Revised: 01/30/2023] [Accepted: 02/19/2023] [Indexed: 03/04/2023]
Abstract
The fat mass and obesity associated (FTO) locus consistently associates with higher body mass index (BMI) across diverse ancestral groups. However, previous small studies of people of Polynesian ancestries have failed to replicate the association. In this study, we used Bayesian meta-analysis to test rs9939609, the most replicated FTO variant, for association with BMI with a large sample (n = 6095) of Aotearoa New Zealanders of Polynesian (Māori and Pacific) ancestry and of Samoan people living in the Independent State of Samoa and in American Samoa. We did not observe statistically significant association within each separate Polynesian subgroup. Bayesian meta-analysis of the Aotearoa New Zealand Polynesian and Samoan samples resulted in a posterior mean effect size estimate of +0.21 kg/m², with a 95% credible interval [+0.03 kg/m², +0.39 kg/m²]. While the Bayes Factor (BF) of 0.77 weakly favors the null, the BF = 1.4 Bayesian support interval is [+0.04, +0.20]. These results suggest that rs9939609 in FTO may have a similar effect on mean BMI in people of Polynesian ancestries as previously observed in other ancestral groups.
Affiliation(s)
- Mohanraj Krishnan
- Department of Human Genetics, School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
| | - Emily M Russell
- Department of Human Genetics, School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
| | - Tanya J Major
- Department of Biochemistry, University of Otago, Dunedin, New Zealand
| | - Murray Cadzow
- Department of Biochemistry, University of Otago, Dunedin, New Zealand
| | - Lisa K Stamp
- Department of Medicine, University of Otago, Christchurch, New Zealand
| | - Nicola Dalbeth
- Department of Medicine, Faculty of Medical and Health Sciences, University of Auckland, Auckland, New Zealand
- Maurice Wilkins Centre, Auckland, New Zealand
| | - Jennie Harré Hindmarsh
- Ngāti Porou Hauora Charitable Trust, Te Puia Springs, Tairāwhiti East Coast, New Zealand
| | - Muhammad Qasim
- Ngāti Porou Hauora Charitable Trust, Te Puia Springs, Tairāwhiti East Coast, New Zealand
| | - Huti Watson
- Ngāti Porou Hauora Charitable Trust, Te Puia Springs, Tairāwhiti East Coast, New Zealand
| | - Shuwei Liu
- Department of Human Genetics, School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
| | - Jenna C Carlson
- Department of Human Genetics, School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
- Department of Biostatistics, School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
| | - Ryan L Minster
- Department of Human Genetics, School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
| | - Nicola L Hawley
- Department of Chronic Disease Epidemiology, School of Public Health, Yale University, New Haven, CT, USA
| | - Take Naseri
- Ministry of Health, Government of Samoa, Apia, Samoa
- International Health Institute, Department of Epidemiology, School of Public Health, Brown University, Providence, RI, USA
| | - Ranjan Deka
- Department of Environmental and Public Health Sciences, College of Medicine, University of Cincinnati, Cincinnati, OH, USA
| | - Stephen T McGarvey
- International Health Institute, Department of Epidemiology, School of Public Health, Brown University, Providence, RI, USA
- Department of Anthropology, Brown University, Providence, RI, USA
| | - Tony R Merriman
- Department of Biochemistry, University of Otago, Dunedin, New Zealand
| | - Rinki Murphy
- Department of Medicine, Faculty of Medical and Health Sciences, University of Auckland, Auckland, New Zealand
- Maurice Wilkins Centre, Auckland, New Zealand
| | - Daniel E Weeks
- Department of Human Genetics, School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
- Department of Biostatistics, School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
4
Colbran LL, Ramos-Almodovar FC, Mathieson I. A gene-level test for directional selection on gene expression. Genetics 2023; 224:iyad060. [PMID: 37036411 PMCID: PMC10213495 DOI: 10.1093/genetics/iyad060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Revised: 01/11/2023] [Accepted: 03/31/2023] [Indexed: 04/11/2023] Open
Abstract
Most variants identified in human genome-wide association studies and scans for selection are noncoding. Interpretation of their effects and the way in which they contribute to phenotypic variation and adaptation in human populations is therefore limited by our understanding of gene regulation and the difficulty of confidently linking noncoding variants to genes. To overcome this, we developed a gene-wise test for population-specific selection based on combinations of regulatory variants. Specifically, we use the QX statistic to test for polygenic selection on cis-regulatory variants based on whether the variance across populations in the predicted expression of a particular gene is higher than expected under neutrality. We then applied this approach to human data, testing for selection on 17,388 protein-coding genes in 26 populations from the Thousand Genomes Project. We identified 45 genes with significant evidence (FDR<0.1) for selection, including FADS1, KHK, SULT1A2, ITGAM, and several genes in the HLA region. We further confirm that these signals correspond to plausible population-level differences in predicted expression. While the small number of significant genes (0.2%) is consistent with most cis-regulatory variation evolving under genetic drift or stabilizing selection, it remains possible that there are effects not captured in this study. Our gene-level QX score is independent of standard genomic tests for selection, and may therefore be useful in combination with traditional selection scans to specifically identify selection on regulatory variation. Overall, our results demonstrate the utility of combining population-level genomic data with functional data to understand the evolution of gene expression.
Affiliation(s)
- Laura L Colbran
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
5
Chu BB, Ko S, Zhou JJ, Jensen A, Zhou H, Sinsheimer JS, Lange K. Multivariate genome-wide association analysis by iterative hard thresholding. Bioinformatics 2023; 39:btad193. [PMID: 37067496 PMCID: PMC10133532 DOI: 10.1093/bioinformatics/btad193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2022] [Revised: 04/07/2023] [Accepted: 04/13/2023] [Indexed: 04/18/2023] Open
Abstract
MOTIVATION: In a genome-wide association study, analyzing multiple correlated traits simultaneously is potentially superior to analyzing the traits one by one. Standard methods for multivariate genome-wide association study operate marker-by-marker and are computationally intensive. RESULTS: We present a sparsity constrained regression algorithm for multivariate genome-wide association study based on iterative hard thresholding and implement it in a convenient Julia package MendelIHT.jl. In simulation studies with up to 100 quantitative traits, iterative hard thresholding exhibits similar true positive rates, smaller false positive rates, and faster execution times than GEMMA's linear mixed models and mv-PLINK's canonical correlation analysis. On UK Biobank data with 470 228 variants, MendelIHT completed a three-trait joint analysis (n=185 656) in 20 h and an 18-trait joint analysis (n=104 264) in 53 h with an 80 GB memory footprint. In short, MendelIHT enables geneticists to fit a single regression model that simultaneously considers the effect of all SNPs and dozens of traits. AVAILABILITY AND IMPLEMENTATION: Software, documentation, and scripts to reproduce our results are available from https://github.com/OpenMendel/MendelIHT.jl.
Affiliation(s)
- Benjamin B Chu
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095-1554, United States
| | - Seyoon Ko
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095-1554, United States
- Department of Biostatistics, Fielding School of Public Health at UCLA, Los Angeles, CA 90095-1554, United States
| | - Jin J Zhou
- Department of Biostatistics, Fielding School of Public Health at UCLA, Los Angeles, CA 90095-1554, United States
- Department of Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095-1554, United States
| | - Aubrey Jensen
- Department of Biostatistics, Fielding School of Public Health at UCLA, Los Angeles, CA 90095-1554, United States
| | - Hua Zhou
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095-1554, United States
- Department of Biostatistics, Fielding School of Public Health at UCLA, Los Angeles, CA 90095-1554, United States
| | - Janet S Sinsheimer
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095-1554, United States
- Department of Biostatistics, Fielding School of Public Health at UCLA, Los Angeles, CA 90095-1554, United States
- Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095-1554, United States
| | - Kenneth Lange
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095-1554, United States
- Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095-1554, United States
- Department of Statistics at UCLA, Los Angeles, CA 90095-1554, United States
6
Wen C, Margolis M, Dai R, Zhang P, Przytycki PF, Vo DD, Bhattacharya A, Matoba N, Jiao C, Kim M, Tsai E, Hoh C, Aygün N, Walker RL, Chatzinakos C, Clarke D, Pratt H, Consortium P, Peters MA, Gerstein M, Daskalakis NP, Weng Z, Jaffe AE, Kleinman JE, Hyde TM, Weinberger DR, Bray NJ, Sestan N, Geschwind DH, Roeder K, Gusev A, Pasaniuc B, Stein JL, Love MI, Pollard KS, Liu C, Gandal MJ. Cross-ancestry, cell-type-informed atlas of gene, isoform, and splicing regulation in the developing human brain. medRxiv 2023:2023.03.03.23286706. [PMID: 36945630 PMCID: PMC10029021 DOI: 10.1101/2023.03.03.23286706] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 03/23/2023]
Abstract
Genomic regulatory elements active in the developing human brain are notably enriched in genetic risk for neuropsychiatric disorders, including autism spectrum disorder (ASD), schizophrenia, and bipolar disorder. However, prioritizing the specific risk genes and candidate molecular mechanisms underlying these genetic enrichments has been hindered by the lack of a single unified large-scale gene regulatory atlas of human brain development. Here, we uniformly process and systematically characterize gene, isoform, and splicing quantitative trait loci (xQTLs) in 672 fetal brain samples from unique subjects across multiple ancestral populations. We identify 15,752 genes harboring a significant xQTL and map 3,739 eQTLs to a specific cellular context. We observe a striking drop in gene expression and splicing heritability as the human brain develops. Isoform-level regulation, particularly in the second trimester, mediates the greatest proportion of heritability across multiple psychiatric GWAS, compared with eQTLs. Via colocalization and TWAS, we prioritize biological mechanisms for ~60% of GWAS loci across five neuropsychiatric disorders, nearly two-fold that observed in the adult brain. Finally, we build a comprehensive set of developmentally regulated gene and isoform co-expression networks capturing unique genetic enrichments across disorders. Together, this work provides a comprehensive view of genetic regulation across human brain development as well as the stage- and cell type-informed mechanistic underpinnings of neuropsychiatric disorders.
Affiliation(s)
- Cindy Wen
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
| | - Michael Margolis
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
| | - Rujia Dai
- Department of Psychiatry, SUNY Upstate Medical University; Syracuse, NY, 13210, USA
| | - Pan Zhang
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
| | - Pawel F Przytycki
- Gladstone Institute of Data Science and Biotechnology; San Francisco, CA, 94158, USA
| | - Daniel D Vo
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania; Philadelphia, PA, 19104, USA
- Lifespan Brain Institute, The Children's Hospital of Philadelphia; Philadelphia, PA, 19104, USA
| | - Arjun Bhattacharya
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Institute for Quantitative and Computational Biosciences, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
| | - Nana Matoba
- Department of Genetics, University of North Carolina at Chapel Hill; Chapel Hill, NC, 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill; Chapel Hill, NC, 27599, USA
| | - Chuan Jiao
- Department of Psychiatry, SUNY Upstate Medical University; Syracuse, NY, 13210, USA
| | - Minsoo Kim
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
| | - Ellen Tsai
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
| | - Celine Hoh
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
| | - Nil Aygün
- Department of Genetics, University of North Carolina at Chapel Hill; Chapel Hill, NC, 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill; Chapel Hill, NC, 27599, USA
| | - Rebecca L Walker
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
| | - Christos Chatzinakos
- Department of Psychiatry, Harvard Medical School; Boston, MA, 02215, USA
- McLean Hospital; Belmont, MA, 02478, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard; Cambridge, MA, 02142, USA
| | - Declan Clarke
- Department of Molecular Biophysics and Biochemistry, Yale University; New Haven, CT, 06520, USA
| | - Henry Pratt
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School; Worcester, MA, 01605, USA
| | - PsychENCODE Consortium
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Psychiatry, SUNY Upstate Medical University; Syracuse, NY, 13210, USA
- Gladstone Institute of Data Science and Biotechnology; San Francisco, CA, 94158, USA
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania; Philadelphia, PA, 19104, USA
- Lifespan Brain Institute, The Children's Hospital of Philadelphia; Philadelphia, PA, 19104, USA
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Institute for Quantitative and Computational Biosciences, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Genetics, University of North Carolina at Chapel Hill; Chapel Hill, NC, 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill; Chapel Hill, NC, 27599, USA
- Department of Psychiatry, Harvard Medical School; Boston, MA, 02215, USA
- McLean Hospital; Belmont, MA, 02478, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard; Cambridge, MA, 02142, USA
- Department of Molecular Biophysics and Biochemistry, Yale University; New Haven, CT, 06520, USA
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School; Worcester, MA, 01605, USA
- CNS Data Coordination Group, Sage Bionetworks; Seattle, WA, 98109, USA
- Program in Computational Biology and Bioinformatics, Yale University; New Haven, CT, 06520, USA
- Department of Computer Science, Yale University; New Haven, CT, 06520, USA
- Department of Statistics and Data Science, Yale University; New Haven, CT, 06520, USA
- Lieber Institute for Brain Development; Baltimore, MD, 21205, USA
- Department of Psychiatry & Behavioral Sciences, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
- Department of Neuroscience, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
- Department of Genetic Medicine, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
- Department of Mental Health, Johns Hopkins Bloomberg School of Public Health; Baltimore, MD, 21205, USA
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health; Baltimore, MD, 21205, USA
- Neumora Therapeutics; Watertown, MA, 02472, USA
- Department of Neurology, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
- MRC Centre for Neuropsychiatric Genetics & Genomics, Division of Psychological Medicine & Clinical Neurosciences, Cardiff University School of Medicine; Cardiff, CF24 4HQ, UK
- Department of Comparative Medicine, Yale University School of Medicine; New Haven, CT, 06520, USA
- Department of Neuroscience, Yale University School of Medicine; New Haven, CT, 06520, USA
- Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Institute for Precision Health, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Statistics & Data Science, Carnegie Mellon University; Pittsburgh, PA, 15213, USA
- Computational Biology Department, Carnegie Mellon University; Pittsburgh, PA, 15213, USA
- Department of Medical Oncology, Division of Population Sciences, Dana-Farber Cancer Institute; Boston, MA, 02215, USA
- Broad Institute of MIT and Harvard; Cambridge, MA, 02142, USA
- Harvard Medical School; Boston, MA, 02215, USA
- Division of Genetics, Brigham and Women's Hospital; Boston, MA, 02215, USA
- Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Biostatistics, University of North Carolina at Chapel Hill; Chapel Hill, NC, 27599, USA
- Department of Epidemiology & Biostatistics, University of California, San Francisco; San Francisco, CA, 94158, USA
- Chan Zuckerberg Biohub; San Francisco, CA, 94158, USA
- Center for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University; Changsha, Hunan, 410008, China
| | - Mette A Peters
- CNS Data Coordination Group, Sage Bionetworks; Seattle, WA, 98109, USA
| | - Mark Gerstein
- Department of Molecular Biophysics and Biochemistry, Yale University; New Haven, CT, 06520, USA
- Program in Computational Biology and Bioinformatics, Yale University; New Haven, CT, 06520, USA
- Department of Computer Science, Yale University; New Haven, CT, 06520, USA
- Department of Statistics and Data Science, Yale University; New Haven, CT, 06520, USA
| | - Nikolaos P Daskalakis
- Department of Psychiatry, Harvard Medical School; Boston, MA, 02215, USA
- McLean Hospital; Belmont, MA, 02478, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard; Cambridge, MA, 02142, USA
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School; Worcester, MA, 01605, USA
| | - Andrew E Jaffe
- Lieber Institute for Brain Development; Baltimore, MD, 21205, USA
- Department of Psychiatry & Behavioral Sciences, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
- Department of Neuroscience, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
- Department of Genetic Medicine, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
- Department of Mental Health, Johns Hopkins Bloomberg School of Public Health; Baltimore, MD, 21205, USA
- Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health; Baltimore, MD, 21205, USA
- Neumora Therapeutics; Watertown, MA, 02472, USA
| | - Joel E Kleinman
- Lieber Institute for Brain Development; Baltimore, MD, 21205, USA
- Department of Psychiatry & Behavioral Sciences, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
| | - Thomas M Hyde
- Lieber Institute for Brain Development; Baltimore, MD, 21205, USA
- Department of Psychiatry & Behavioral Sciences, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
- Department of Neurology, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
| | - Daniel R Weinberger
- Lieber Institute for Brain Development; Baltimore, MD, 21205, USA
- Department of Psychiatry & Behavioral Sciences, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
- Department of Neuroscience, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
- Department of Genetic Medicine, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
- Department of Neurology, Johns Hopkins University School of Medicine; Baltimore, MD, 21205, USA
| | - Nicholas J Bray
- MRC Centre for Neuropsychiatric Genetics & Genomics, Division of Psychological Medicine & Clinical Neurosciences, Cardiff University School of Medicine; Cardiff, CF24 4HQ, UK
| | - Nenad Sestan
- Department of Comparative Medicine, Yale University School of Medicine; New Haven, CT, 06520, USA
- Department of Neuroscience, Yale University School of Medicine; New Haven, CT, 06520, USA
| | - Daniel H Geschwind
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Institute for Precision Health, University of California, Los Angeles; Los Angeles, CA, 90095, USA
| | - Kathryn Roeder
- Department of Statistics & Data Science, Carnegie Mellon University; Pittsburgh, PA, 15213, USA
- Computational Biology Department, Carnegie Mellon University; Pittsburgh, PA, 15213, USA
| | - Alexander Gusev
- Department of Medical Oncology, Division of Population Sciences, Dana-Farber Cancer Institute; Boston, MA, 02215, USA
- Broad Institute of MIT and Harvard; Cambridge, MA, 02142, USA
- Harvard Medical School; Boston, MA, 02215, USA
- Division of Genetics, Brigham and Women's Hospital; Boston, MA, 02215, USA
| | - Bogdan Pasaniuc
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Institute for Precision Health, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
| | - Jason L Stein
- Department of Genetics, University of North Carolina at Chapel Hill; Chapel Hill, NC, 27599, USA
- UNC Neuroscience Center, University of North Carolina at Chapel Hill; Chapel Hill, NC, 27599, USA
| | - Michael I Love
- Department of Genetics, University of North Carolina at Chapel Hill; Chapel Hill, NC, 27599, USA
- Department of Biostatistics, University of North Carolina at Chapel Hill; Chapel Hill, NC, 27599, USA
| | - Katherine S Pollard
- Gladstone Institute of Data Science and Biotechnology; San Francisco, CA, 94158, USA
- Department of Epidemiology & Biostatistics, University of California, San Francisco; San Francisco, CA, 94158, USA
- Chan Zuckerberg Biohub; San Francisco, CA, 94158, USA
| | - Chunyu Liu
- Department of Psychiatry, SUNY Upstate Medical University; Syracuse, NY, 13210, USA
- Center for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University; Changsha, Hunan, 410008, China
| | - Michael J Gandal
- Interdepartmental Program in Bioinformatics, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles; Los Angeles, CA, 90095, USA
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania; Philadelphia, PA, 19104, USA
- Lifespan Brain Institute, The Children's Hospital of Philadelphia; Philadelphia, PA, 19104, USA
7
Ko S, Chu BB, Peterson D, Okenwa C, Papp JC, Alexander DH, Sobel EM, Zhou H, Lange KL. Unsupervised discovery of ancestry-informative markers and genetic admixture proportions in biobank-scale datasets. Am J Hum Genet 2023; 110:314-325. [PMID: 36610401 PMCID: PMC9943729 DOI: 10.1016/j.ajhg.2022.12.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2022] [Accepted: 12/12/2022] [Indexed: 01/09/2023]
Abstract
Admixture estimation plays a crucial role in ancestry inference and genome-wide association studies (GWASs). Computer programs such as ADMIXTURE and STRUCTURE are commonly employed to estimate the admixture proportions of sample individuals. However, these programs can be overwhelmed by the computational burdens imposed by the 10⁵ to 10⁶ samples and millions of markers commonly found in modern biobanks. An attractive strategy is to run these programs on a set of ancestry-informative SNP markers (AIMs) that exhibit substantially different frequencies across populations. Unfortunately, existing methods for identifying AIMs require knowing ancestry labels for a subset of the sample. This supervised learning approach creates a chicken-and-egg scenario. In this paper, we present an unsupervised, scalable framework that seamlessly carries out AIM selection and likelihood-based estimation of admixture proportions. Our simulated and real data examples show that this approach is scalable to modern biobank datasets. OpenADMIXTURE, our Julia implementation of the method, is open source and available for free.
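To make the notion of an AIM concrete, the sketch below ranks SNPs by the absolute difference in allele frequency between two labeled groups. This is the supervised baseline the abstract describes as requiring ancestry labels, not the unsupervised OpenADMIXTURE procedure; the function names and the toy genotype matrices are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: a supervised AIM ranking by allele-frequency
# difference. This is NOT the unsupervised method of the paper; all names
# and data here are hypothetical.

def allele_frequencies(genotypes):
    """Frequency of the counted allele per SNP.

    genotypes: list of rows (one per individual), each a list of 0/1/2
    allele counts (one per SNP).
    """
    n_individuals = len(genotypes)
    n_snps = len(genotypes[0])
    return [
        sum(row[j] for row in genotypes) / (2 * n_individuals)
        for j in range(n_snps)
    ]

def rank_aims(pop_a, pop_b, top_k):
    """Return indices of the top_k SNPs with the largest frequency gap."""
    freq_a = allele_frequencies(pop_a)
    freq_b = allele_frequencies(pop_b)
    delta = [abs(a - b) for a, b in zip(freq_a, freq_b)]
    order = sorted(range(len(delta)), key=lambda j: delta[j], reverse=True)
    return order[:top_k], delta

# Toy data: three individuals per group, three SNPs.
pop_a = [[2, 0, 1], [2, 1, 1], [1, 0, 1]]
pop_b = [[0, 0, 1], [0, 1, 1], [1, 0, 1]]

top, delta = rank_aims(pop_a, pop_b, top_k=1)
print(top, delta)
```

Only the first SNP differs in frequency between the two toy groups, so it is the one selected. The dependence on `pop_a` and `pop_b` being labeled in advance is exactly the circularity the abstract points out.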
Affiliation(s)
- Seyoon Ko
- Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA; Department of Biostatistics, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Benjamin B. Chu
- Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA; Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA
| | - Daniel Peterson
- Department of Mathematics, Brigham Young University, Provo, UT 84602, USA
| | - Chidera Okenwa
- Department of Mathematics, University of California, Berkeley, Berkeley, CA 94720, USA
| | - Jeanette C. Papp
- Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | | | - Eric M. Sobel
- Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA; Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA 90095, USA (corresponding author)
| | - Hua Zhou
- Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA; Department of Biostatistics, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Kenneth L. Lange
- Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA; Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA 90095, USA; Department of Statistics, University of California, Los Angeles, Los Angeles, CA 90095, USA
8
Kim M, Vo DD, Kumagai ME, Jops CT, Gandal MJ. GeneticsMakie.jl: a versatile and scalable toolkit for visualizing locus-level genetic and genomic data. Bioinformatics 2023; 39:6887175. [PMID: 36495218 PMCID: PMC9825774 DOI: 10.1093/bioinformatics/btac786] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/18/2022] [Revised: 10/29/2022] [Accepted: 12/09/2022] [Indexed: 12/14/2022]
Abstract
SUMMARY With the continued deluge of results from genome-wide association and functional genomic studies, it has become increasingly imperative to quickly combine and visualize different layers of genetic and genomic data within a given locus to facilitate exploratory and integrative data analyses. While several tools have been developed to visualize locus-level genetic results, the limited speed, scalability and flexibility of current approaches remain a significant bottleneck. Here, we present a Julia package for high-performance genetics and genomics-related data visualization that enables fast, simultaneous plotting of hundreds of association results along with multiple relevant genomic annotations. Leveraging the powerful plotting and layout utilities from Makie.jl facilitates the customization and extensibility of every component of a plot, enabling generation of publication-ready figures. AVAILABILITY AND IMPLEMENTATION The GeneticsMakie.jl package is open source and distributed under the MIT license via GitHub (https://github.com/mmkim1210/GeneticsMakie.jl). The GitHub repository contains installation instructions as well as examples and documentation for built-in functions. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Minsoo Kim
- To whom correspondence should be addressed.
9
Eilertsen EM, Cheesman R, Ayorech Z, Røysamb E, Pingault J, Njølstad PR, Andreassen OA, Havdahl A, McAdams TA, Torvik FA, Ystrøm E. On the importance of parenting in externalizing disorders: an evaluation of indirect genetic effects in families. J Child Psychol Psychiatry 2022; 63:1186-1195. [PMID: 35778910 PMCID: PMC9796091 DOI: 10.1111/jcpp.13654] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Accepted: 05/08/2022] [Indexed: 12/30/2022]
Abstract
BACKGROUND Theoretical models of the development of childhood externalizing disorders emphasize the role of parents. Empirical studies have not been able to identify specific aspects of parental behaviors explaining a considerable proportion of the observed individual differences in externalizing problems. The problem is complicated by the contribution of genetic factors to externalizing problems, as parents provide both genes and environments to their children. We studied the joint contributions of direct genetic effects of children and the indirect genetic effects of parents through the environment on externalizing problems. METHODS The study used genome-wide single nucleotide polymorphism data from 9,675 parent-offspring trios participating in the Norwegian Mother, Father and Child Cohort Study. Based on genomic relatedness matrices, we estimated the contribution of direct genetic effects and indirect maternal and paternal genetic effects on ADHD, conduct and disruptive behaviors at 8 years of age. RESULTS Models including indirect parental genetic effects were preferred for the ADHD symptoms of inattention and hyperactivity, and conduct problems, but not oppositional defiant behaviors. Direct genetic effects accounted for 11% to 24% of the variance, whereas indirect parental genetic effects accounted for 0% to 16% in ADHD symptoms and conduct problems. The correlation between direct and indirect genetic effects, or gene-environment correlations, decreased the variance by 16% and 13% for conduct and inattention problems, and increased the variance by 6% for hyperactivity problems. CONCLUSIONS This study provides empirical support for the notion that parents have a significant role in the development of childhood externalizing behaviors. The parental contribution to decrease in variation of inattention and conduct problems by gene-environment correlations would limit the number of children reaching clinical ranges in symptoms. Not accounting for indirect parental genetic effects can lead to both positive and negative bias when identifying genetic variants for childhood externalizing behaviors.
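The way a gene-environment correlation can raise or lower trait variance follows from the identity Var(d + i) = Var(d) + Var(i) + 2 Cov(d, i). The sketch below checks that identity numerically. It is a simplification: it uses a single indirect component rather than separate maternal and paternal effects, and the values are invented for illustration, not taken from the study.

```python
# Illustrative sketch only: variance of a sum of direct and indirect
# genetic effects. The numbers are made up and a single indirect
# component stands in for the maternal and paternal effects.
from statistics import pvariance

def covariance(x, y):
    """Population covariance of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / len(x)

# Hypothetical per-child effects; indirect is negatively correlated
# with direct, so the covariance term reduces the total variance.
direct = [1.0, -1.0, 2.0, -2.0]
indirect = [-0.5, 0.5, -1.0, 1.0]

total = [d + i for d, i in zip(direct, indirect)]
var_total = pvariance(total)
var_sum_of_parts = pvariance(direct) + pvariance(indirect)
cov_term = 2 * covariance(direct, indirect)

print(var_total, var_sum_of_parts, cov_term)
```

With a negative covariance the total variance is smaller than the sum of the two component variances, which is the direction reported for conduct and inattention problems; a positive covariance would have the opposite effect.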
Affiliation(s)
- Espen M. Eilertsen
- Department of Psychology, PROMENTA Research Center, University of Oslo, Oslo, Norway; Centre for Fertility and Health, Norwegian Institute of Public Health, Oslo, Norway
| | - Rosa Cheesman
- Department of Psychology, PROMENTA Research Center, University of Oslo, Oslo, Norway
| | - Ziada Ayorech
- Department of Psychology, PROMENTA Research Center, University of Oslo, Oslo, Norway
| | - Espen Røysamb
- Department of Psychology, PROMENTA Research Center, University of Oslo, Oslo, Norway
| | - Jean‐Baptiste Pingault
- Division of Psychology and Language Sciences, University College London, London, UK; MRC Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, King's College, London, UK
| | - Pål R. Njølstad
- Department of Clinical Science, Center for Diabetes Research, University of Bergen, Bergen, Norway; Children and Youth Clinic, Haukeland University Hospital, Bergen, Norway
| | - Ole A. Andreassen
- Division of Mental Health and Addiction, NORMENT, Oslo University Hospital, Oslo, Norway; Institute of Clinical Medicine, University of Oslo, Oslo, Norway
| | - Alexandra Havdahl
- Department of Psychology, PROMENTA Research Center, University of Oslo, Oslo, Norway; Department of Mental Disorders, Norwegian Institute of Public Health, Oslo, Norway; Nic Waals Institute, Lovisenberg Diaconal Hospital, Oslo, Norway
| | - Tom A. McAdams
- Department of Psychology, PROMENTA Research Center, University of Oslo, Oslo, Norway; Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, UK
| | - Fartein A. Torvik
- Department of Psychology, PROMENTA Research Center, University of Oslo, Oslo, Norway; Centre for Fertility and Health, Norwegian Institute of Public Health, Oslo, Norway
| | - Eivind Ystrøm
- Department of Psychology, PROMENTA Research Center, University of Oslo, Oslo, Norway; Department of Mental Disorders, Norwegian Institute of Public Health, Oslo, Norway; School of Pharmacy, University of Oslo, Oslo, Norway
10
Russell EM, Carlson JC, Krishnan M, Hawley NL, Sun G, Cheng H, Naseri T, Reupena MS, Viali S, Tuitele J, Major TJ, Miljkovic I, Merriman TR, Deka R, Weeks DE, McGarvey ST, Minster RL. CREBRF missense variant rs373863828 has both direct and indirect effects on type 2 diabetes and fasting glucose in Polynesian peoples living in Samoa and Aotearoa New Zealand. BMJ Open Diabetes Res Care 2022; 10:e002275. [PMID: 35144939 PMCID: PMC8845200 DOI: 10.1136/bmjdrc-2021-002275] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/15/2021] [Accepted: 12/07/2021] [Indexed: 11/16/2022]
Abstract
INTRODUCTION The minor allele of a missense variant, rs373863828, in CREBRF is associated with higher body mass index (BMI), lower fasting glucose, and lower odds of type 2 diabetes. rs373863828 is common in Pacific Island populations (minor allele frequency (MAF) 0.096-0.259) but rare in non-Pacific Island populations (MAF <0.001). We examined the cross-sectional associations between BMI and rs373863828 in type 2 diabetes and fasting glucose with a large sample of adults of Polynesian ancestries from Samoa, American Samoa, and Aotearoa New Zealand, and estimated the direct and indirect (via BMI) effects of rs373863828 on type 2 diabetes and fasting glucose. RESEARCH DESIGN AND METHODS We regressed type 2 diabetes and fasting glucose on BMI and rs373863828 stratified by obesity, regressed type 2 diabetes and fasting glucose on BMI stratified by rs373863828 genotype, and assessed the effects of rs373863828 on type 2 diabetes and fasting glucose with path analysis. The regression analyses were completed separately in four samples that were recruited during different time periods between 1990 and 2010 and then the results were meta-analyzed. All samples were pooled for the path analysis. RESULTS Association of BMI with type 2 diabetes and fasting glucose may be greater in those without obesity (OR=7.77, p=0.015 and β=0.213, p=9.53×10⁻⁵, respectively) than in those with obesity (OR=5.01, p=1.12×10⁻⁹ and β=0.162, p=5.63×10⁻⁶, respectively). We did not observe evidence of differences in the association of BMI with type 2 diabetes or fasting glucose by genotype. In the path analysis, the minor allele has direct negative (lower odds of type 2 diabetes and fasting glucose) and indirect positive (higher odds of type 2 diabetes and fasting glucose) effects on type 2 diabetes risk and fasting glucose, with the indirect effects mediated through a direct positive effect of rs373863828 on BMI. CONCLUSIONS There may be a stronger effect of BMI on fasting glucose in Polynesian individuals without obesity than in those with obesity. Carrying the rs373863828 minor allele does not decouple higher BMI from higher odds of type 2 diabetes.
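The direct and indirect effects described in the path analysis can be illustrated with the standard product-of-coefficients decomposition for a linear mediation model: the indirect effect is the variant-to-BMI coefficient times the BMI-to-outcome coefficient, and the total effect is the sum of direct and indirect parts. The sketch below assumes a continuous outcome such as fasting glucose; the function name and all coefficient values are hypothetical and are not estimates from the study.

```python
# Illustrative sketch only: product-of-coefficients mediation for a
# linear path model (variant -> BMI -> outcome). The coefficients are
# invented placeholders, not results from the paper.

def decompose_effect(a, b, c_direct):
    """Split a total effect into direct and indirect (mediated) parts.

    a:        effect of the variant on the mediator (BMI)
    b:        effect of the mediator on the outcome, adjusted for the variant
    c_direct: effect of the variant on the outcome, adjusted for the mediator
    """
    indirect = a * b
    return {
        "direct": c_direct,
        "indirect": indirect,
        "total": c_direct + indirect,
    }

# Hypothetical values: the variant raises BMI (a > 0), higher BMI raises
# the outcome (b > 0), and the variant lowers the outcome directly.
effects = decompose_effect(a=1.5, b=0.02, c_direct=-0.10)
print(effects)
```

With these placeholder values the direct effect is negative and the indirect effect is positive, mirroring the opposing signs reported in the RESULTS; whether the total is negative or positive depends on the relative magnitudes.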
Affiliation(s)
- Emily M Russell
- Department of Human Genetics, University of Pittsburgh Graduate School of Public Health, Pittsburgh, Pennsylvania, USA
| | - Jenna C Carlson
- Department of Human Genetics, University of Pittsburgh Graduate School of Public Health, Pittsburgh, Pennsylvania, USA
- Department of Biostatistics, University of Pittsburgh Graduate School of Public Health, Pittsburgh, Pennsylvania, USA
| | - Mohanraj Krishnan
- Department of Human Genetics, University of Pittsburgh Graduate School of Public Health, Pittsburgh, Pennsylvania, USA
| | - Nicola L Hawley
- Department of Chronic Disease Epidemiology, Yale University School of Public Health, New Haven, Connecticut, USA
| | - Guangyun Sun
- Department of Environmental and Public Health Sciences, University of Cincinnati College of Medicine, Cincinnati, Ohio, USA
| | - Hong Cheng
- Department of Environmental and Public Health Sciences, University of Cincinnati College of Medicine, Cincinnati, Ohio, USA
| | - Take Naseri
- Ministry of Health, Government of Samoa, Apia, Samoa
| | | | | | - John Tuitele
- Department of Public Health, Lyndon B Johnson Tropical Medical Center, Faga'alu, American Samoa
| | - Tanya J Major
- Department of Biochemistry, University of Otago, Dunedin, New Zealand
| | - Iva Miljkovic
- Department of Epidemiology, University of Pittsburgh Graduate School of Public Health, Pittsburgh, Pennsylvania, USA
| | - Tony R Merriman
- Department of Biochemistry, University of Otago, Dunedin, New Zealand
- Division of Clinical Immunology and Rheumatology, The University of Alabama at Birmingham, Birmingham, Alabama, USA
| | - Ranjan Deka
- Department of Environmental and Public Health Sciences, University of Cincinnati College of Medicine, Cincinnati, Ohio, USA
| | - Daniel E Weeks
- Department of Human Genetics, University of Pittsburgh Graduate School of Public Health, Pittsburgh, Pennsylvania, USA
- Department of Biostatistics, University of Pittsburgh Graduate School of Public Health, Pittsburgh, Pennsylvania, USA
| | - Stephen T McGarvey
- International Health Institute and Department of Epidemiology, Brown University School of Public Health, Providence, Rhode Island, USA
- Department of Anthropology, Brown University, Providence, Rhode Island, USA
| | - Ryan L Minster
- Department of Human Genetics, University of Pittsburgh Graduate School of Public Health, Pittsburgh, Pennsylvania, USA
11
Chu BB, Sobel EM, Wasiolek R, Ko S, Sinsheimer JS, Zhou H, Lange K. A fast data-driven method for genotype imputation, phasing, and local ancestry inference: MendelImpute.jl. Bioinformatics 2021; 37:4756-4763. [PMID: 34289008 PMCID: PMC8665755 DOI: 10.1093/bioinformatics/btab489] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2021] [Revised: 05/18/2021] [Accepted: 07/19/2021] [Indexed: 11/12/2022]
Abstract
MOTIVATION Current methods for genotype imputation and phasing exploit the volume of data in haplotype reference panels and rely on hidden Markov models. Existing programs all have essentially the same imputation accuracy, are computationally intensive, and generally require pre-phasing the typed markers. RESULTS We introduce a novel data-mining method for genotype imputation and phasing that substitutes highly efficient linear algebra routines for hidden Markov model calculations. This strategy, embodied in our Julia program MendelImpute.jl, avoids explicit assumptions about recombination and population structure while delivering similar prediction accuracy, better memory usage, and an order of magnitude or better run-times compared to the fastest competing method. MendelImpute operates on both dosage data and unphased genotype data and simultaneously imputes missing genotypes and phase at both the typed and untyped SNPs. Finally, MendelImpute naturally extends to global and local ancestry estimation and lends itself to new strategies for data compression and hence faster data transport and sharing. AVAILABILITY Software, documentation, and scripts to reproduce our results are available from https://github.com/OpenMendel/MendelImpute.jl. SUPPLEMENTARY INFORMATION Supplementary data are available from Bioinformatics online.
Affiliation(s)
- Benjamin B Chu
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, Los Angeles, USA
- Eric M Sobel
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, Los Angeles, USA
- Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, USA
- Rory Wasiolek
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, Los Angeles, USA
- Seyoon Ko
- Department of Biostatistics, Fielding School of Public Health at UCLA, Los Angeles, USA
- Janet S Sinsheimer
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, Los Angeles, USA
- Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, USA
- Department of Biostatistics, Fielding School of Public Health at UCLA, Los Angeles, USA
- Hua Zhou
- Department of Biostatistics, Fielding School of Public Health at UCLA, Los Angeles, USA
- Kenneth Lange
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, Los Angeles, USA
- Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, USA
12
Ji SS, German CA, Lange K, Sinsheimer JS, Zhou H, Zhou J, Sobel EM. Modern simulation utilities for genetic analysis. BMC Bioinformatics 2021; 22:228. [PMID: 33941078; PMCID: PMC8091532; DOI: 10.1186/s12859-021-04086-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2020] [Accepted: 03/17/2021] [Indexed: 11/10/2022] [Open access]
Abstract
BACKGROUND: Statistical geneticists employ simulation to estimate the power of proposed studies, test new analysis tools, and evaluate properties of causal models. Although there are existing trait simulators, there is ample room for modernization. For example, most phenotype simulators are limited to Gaussian traits or traits transformable to normality, while ignoring qualitative traits and realistic, non-normal trait distributions. Also, modern computer languages, such as Julia, that accommodate parallelization and cloud-based computing are now mainstream but rarely used in older applications. To meet the challenges of contemporary big studies, it is important for geneticists to adopt new computational tools. RESULTS: We present TraitSimulation, an open-source Julia package that makes it trivial to quickly simulate phenotypes under a variety of genetic architectures. This package is integrated into our OpenMendel suite for easy downstream analyses. Julia was purpose-built for scientific programming and provides tremendous speed and memory efficiency, easy access to multi-CPU and GPU hardware, and to distributed and cloud-based parallelization. TraitSimulation is designed to encourage flexible trait simulation, including via the standard devices of applied statistics, generalized linear models (GLMs) and generalized linear mixed models (GLMMs). TraitSimulation also accommodates many study designs: unrelateds, sibships, pedigrees, or a mixture of all three. (Of course, for data with pedigrees or cryptic relationships, the simulation process must include the genetic dependencies among the individuals.) We consider an assortment of trait models and study designs to illustrate integrated simulation and analysis pipelines. Step-by-step instructions for these analyses are available in our electronic Jupyter notebooks on Github. These interactive notebooks are ideal for reproducible research. CONCLUSION: The TraitSimulation package has three main advantages. (1) It leverages the computational efficiency and ease of use of Julia to provide extremely fast, straightforward simulation of even the most complex genetic models, including GLMs and GLMMs. (2) It can be operated entirely within, but is not limited to, the integrated analysis pipeline of OpenMendel. And finally (3), by allowing a wider range of more realistic phenotype models, TraitSimulation brings power calculations and diagnostic tools closer to what investigators might see in real-world analyses.
Affiliation(s)
- Sarah S. Ji
- Department of Biostatistics, University of California, Los Angeles, 90095 USA
- Kenneth Lange
- Department of Computational Medicine, University of California, Los Angeles, 90095 USA
- Department of Human Genetics, University of California, Los Angeles, 90095 USA
- Janet S. Sinsheimer
- Department of Biostatistics, University of California, Los Angeles, 90095 USA
- Department of Computational Medicine, University of California, Los Angeles, 90095 USA
- Department of Human Genetics, University of California, Los Angeles, 90095 USA
- Hua Zhou
- Department of Biostatistics, University of California, Los Angeles, 90095 USA
- Jin Zhou
- Departments of Epidemiology and Biostatistics, University of Arizona, Tucson, 85721 USA
- Eric M. Sobel
- Department of Computational Medicine, University of California, Los Angeles, 90095 USA
- Department of Human Genetics, University of California, Los Angeles, 90095 USA
13
Chu BB, Keys KL, German CA, Zhou H, Zhou JJ, Sobel EM, Sinsheimer JS, Lange K. Iterative hard thresholding in genome-wide association studies: Generalized linear models, prior weights, and double sparsity. Gigascience 2020; 9:giaa044. [PMID: 32491161; PMCID: PMC7268817; DOI: 10.1093/gigascience/giaa044] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 11/19/2019] [Revised: 02/27/2020] [Accepted: 04/14/2020] [Indexed: 11/17/2022] [Open access]
Abstract
BACKGROUND: Consecutive testing of single nucleotide polymorphisms (SNPs) is usually employed to identify genetic variants associated with complex traits. Ideally one should model all covariates in unison, but most existing analysis methods for genome-wide association studies (GWAS) perform only univariate regression. RESULTS: We extend and efficiently implement iterative hard thresholding (IHT) for multiple regression, treating all SNPs simultaneously. Our extensions accommodate generalized linear models, prior information on genetic variants, and grouping of variants. In our simulations, IHT recovers up to 30% more true predictors than SNP-by-SNP association testing and exhibits a 2-3 orders of magnitude decrease in false-positive rates compared with lasso regression. We also test IHT on the UK Biobank hypertension phenotypes and the Northern Finland Birth Cohort of 1966 cardiovascular phenotypes. We find that IHT scales to the large datasets of contemporary human genetics and recovers the plausible genetic variants identified by previous studies. CONCLUSIONS: Our real data analysis and simulation studies suggest that IHT can (i) recover highly correlated predictors, (ii) avoid over-fitting, (iii) deliver better true-positive and false-positive rates than either marginal testing or lasso regression, (iv) recover unbiased regression coefficients, (v) exploit prior information and group-sparsity, and (vi) be used with biobank-sized datasets. Although these advances are studied for genome-wide association studies inference, our extensions are pertinent to other regression problems with large numbers of predictors.
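As a rough illustration of the core idea named in this abstract, the sketch below runs iterative hard thresholding for ordinary least squares: take a gradient step on the residual sum of squares, then keep only the k largest-magnitude coefficients. It is not the authors' implementation and omits the generalized linear model, prior-weight, and group-sparsity extensions the abstract describes. The function names, the fixed step size, and the toy orthogonal design are all assumptions made for this example.

```python
def hard_threshold(beta, k):
    """Keep the k largest-magnitude entries of beta and set the rest to zero."""
    order = sorted(range(len(beta)), key=lambda j: abs(beta[j]), reverse=True)
    keep = set(order[:k])
    return [b if j in keep else 0.0 for j, b in enumerate(beta)]


def iht_least_squares(X, y, k, step, iters=100):
    """Hypothetical helper: IHT for min 0.5 * ||y - X beta||^2 with at most k nonzeros."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        # Residuals r = y - X beta
        resid = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
        # X^T r is the negative gradient of the loss, so adding it is a descent step
        xtr = [sum(X[i][j] * resid[i] for i in range(n)) for j in range(p)]
        # Gradient step followed by projection onto k-sparse vectors
        beta = hard_threshold([beta[j] + step * xtr[j] for j in range(p)], k)
    return beta


# Toy design (an assumption): 8 samples, 4 mutually orthogonal +/-1 predictors.
X = [
    [1, 1, 1, 1],
    [-1, 1, -1, 1],
    [1, -1, -1, 1],
    [-1, -1, 1, 1],
    [1, 1, 1, -1],
    [-1, 1, -1, -1],
    [1, -1, -1, -1],
    [-1, -1, 1, -1],
]
true_beta = [2.0, 0.1, 0.0, -3.0]  # two strong effects, one weak, one null
y = [sum(x * b for x, b in zip(row, true_beta)) for row in X]

# Step size 0.5 / n is safe here because X^T X = n * I for this design.
beta_hat = iht_least_squares(X, y, k=2, step=0.5 / len(X))
print(beta_hat)
```

Because the toy columns are orthogonal, the iterates converge geometrically to the two strongest effects, and the weak third effect is discarded by the projection. Real genotype matrices are correlated, so a fixed step size like this one should not be expected to carry over.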
Affiliation(s)
- Benjamin B Chu
- Department of Computational Medicine, University of California, Los Angeles, 621 Charles E Young Dr S, Los Angeles, CA, 90095, USA
- Kevin L Keys
- Department of Medicine, University of California, San Francisco, 1701 Divisadero St, San Francisco, CA, 94115, USA
- Berkeley Institute of Data Science, University of California, Berkeley, 190 Doe Library, Berkeley, CA 94720, USA
- Christopher A German
- Department of Biostatistics, University of California, Los Angeles, 650 Charles E Young Dr S, Los Angeles, CA, 90095, USA
- Hua Zhou
- Department of Biostatistics, University of California, Los Angeles, 650 Charles E Young Dr S, Los Angeles, CA, 90095, USA
- Jin J Zhou
- Division of Epidemiology and Biostatistics, University of Arizona, 1295 N. Martin Ave. Tucson, AZ, 85724, USA
- Eric M Sobel
- Department of Computational Medicine, University of California, Los Angeles, 621 Charles E Young Dr S, Los Angeles, CA, 90095, USA
- Department of Human Genetics, University of California, Los Angeles, 695 Charles E Young Dr S, Los Angeles, CA, 90095 USA
- Janet S Sinsheimer
- Department of Computational Medicine, University of California, Los Angeles, 621 Charles E Young Dr S, Los Angeles, CA, 90095, USA
- Department of Biostatistics, University of California, Los Angeles, 650 Charles E Young Dr S, Los Angeles, CA, 90095, USA
- Department of Human Genetics, University of California, Los Angeles, 695 Charles E Young Dr S, Los Angeles, CA, 90095 USA
- Kenneth Lange
- Department of Computational Medicine, University of California, Los Angeles, 621 Charles E Young Dr S, Los Angeles, CA, 90095, USA
- Department of Human Genetics, University of California, Los Angeles, 695 Charles E Young Dr S, Los Angeles, CA, 90095 USA
14
vonHoldt BM, DeCandia AL, Heppenheimer E, Janowitz-Koch I, Shi R, Zhou H, German CA, Brzeski KE, Cassidy KA, Stahler DR, Sinsheimer JS. Heritability of interpack aggression in a wild pedigreed population of North American grey wolves. Mol Ecol 2020; 29:1764-1775. [PMID: 31905256; DOI: 10.1111/mec.15349] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 04/24/2018] [Revised: 12/16/2019] [Accepted: 12/17/2019] [Indexed: 12/24/2022]
Abstract
Aggression is a quantitative trait deeply entwined with individual fitness. Mapping the genomic architecture underlying such traits is complicated by complex inheritance patterns, social structure, pedigree information and gene pleiotropy. Here, we leveraged the pedigree of a reintroduced population of grey wolves (Canis lupus) in Yellowstone National Park, Wyoming, USA, to examine the heritability of and the genetic variation associated with aggression. Since their reintroduction, many ecological and behavioural aspects have been documented, providing unmatched records of aggressive behaviour across multiple generations of a wild population of wolves. Using a linear mixed model, a robust genetic relationship matrix, 12,288 single nucleotide polymorphisms (SNPs) and 111 wolves, we estimated the SNP-based heritability of aggression to be 37% and an additional 14% of the phenotypic variation explained by shared environmental exposures. We identified 598 SNP genotypes from 425 grey wolves to resolve a consensus pedigree that was included in a heritability analysis of 141 individuals with SNP genotype, metadata and aggression data. The pedigree-based heritability estimate for aggression is 14%, and an additional 16% of the phenotypic variation was explained by shared environmental exposures. We find strong effects of breeding status and relative pack size on aggression. Through an integrative approach, these results provide a framework for understanding the genetic architecture of a complex trait that influences individual fitness, with linkages to reproduction, in a social carnivore. Along with a few other studies, we show here the incredible utility of a pedigreed natural population for dissecting a complex, fitness-related behavioural trait.
Affiliation(s)
- Ruoyao Shi
- BioKnow Health Informatics Lab, College of Life Sciences, Jilin University, Changchun, China
- Hua Zhou
- Department of Biostatistics, UCLA Fielding School of Public Health, University of California, Los Angeles, CA, USA
- Christopher A German
- Department of Biostatistics, UCLA Fielding School of Public Health, University of California, Los Angeles, CA, USA
- Kristin E Brzeski
- College of Forest Resources and Environmental Science, Michigan Technological University, Houghton, MI, USA
- Kira A Cassidy
- Yellowstone Center for Resources, National Park Service, Yellowstone National Park, WY, USA
- Daniel R Stahler
- Yellowstone Center for Resources, National Park Service, Yellowstone National Park, WY, USA
- Janet S Sinsheimer
- Department of Biostatistics, UCLA Fielding School of Public Health, University of California, Los Angeles, CA, USA
- Department of Human Genetics and Computational Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
15
Caliebe A, Nothnagel M. Special issue on 'Genetic epidemiology of complex diseases: impact of population history and modelling assumptions'. Hum Genet 2020; 139:1-3. [PMID: 31664516; DOI: 10.1007/s00439-019-02074-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/17/2022]
Affiliation(s)
- Amke Caliebe
- Institute of Medical Informatics and Statistics, Kiel University, Kiel, Germany
- University Medical Centre Schleswig-Holstein, Kiel, Germany
- Michael Nothnagel
- Cologne Center for Genomics, University of Cologne, Cologne, Germany
- University Hospital Cologne, Cologne, Germany
16
German CA, Sinsheimer JS, Klimentidis YC, Zhou H, Zhou JJ. Ordered multinomial regression for genetic association analysis of ordinal phenotypes at Biobank scale. Genet Epidemiol 2019; 44:248-260. [PMID: 31879980; DOI: 10.1002/gepi.22276] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 08/03/2019] [Revised: 10/23/2019] [Accepted: 11/25/2019] [Indexed: 12/23/2022]
Abstract
Logistic regression is the primary analysis tool for binary traits in genome-wide association studies (GWAS). Multinomial regression extends logistic regression to multiple categories. However, many phenotypes more naturally take ordered, discrete values. Examples include (a) subtypes defined from multiple sources of clinical information and (b) derived phenotypes generated by specific phenotyping algorithms for electronic health records (EHR). GWAS of ordinal traits have been problematic. Dichotomizing can lead to a range of arbitrary cutoff values, generating inconsistent, hard to interpret results. Using multinomial regression ignores trait value hierarchy and potentially loses power. Treating ordinal data as quantitative can lead to misleading inference. To address these issues, we analyze ordinal traits with an ordered, multinomial model. This approach increases power and leads to more interpretable results. We derive efficient algorithms for computing test statistics, making ordinal trait GWAS computationally practical for Biobank scale data. Our method is available as a Julia package OrdinalGWAS.jl. Application to a COPDGene study confirms previously found signals based on binary case-control status, but with more significance. Additionally, we demonstrate the capability of our package to run on UK Biobank data by analyzing hypertension as an ordinal trait.
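An ordered multinomial model of the kind mentioned in this abstract is commonly written in cumulative-logit (proportional odds) form, where P(Y <= j | x) = logistic(theta_j - eta) for increasing cutpoints theta_j and a linear predictor eta. The sketch below assumes that form, which the abstract does not spell out, and shows how a single linear predictor yields a probability for every ordered category. The function names and numeric values are illustrative and are not taken from OrdinalGWAS.jl.

```python
import math


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def ordinal_category_probs(eta, cutpoints):
    """Hypothetical helper: category probabilities under a cumulative-logit model.

    eta       -- linear predictor (covariates times coefficients) for one individual
    cutpoints -- strictly increasing thresholds; J - 1 cutpoints give J categories
    """
    if any(a >= b for a, b in zip(cutpoints, cutpoints[1:])):
        raise ValueError("cutpoints must be strictly increasing")
    # Cumulative probabilities P(Y <= j); the last category always reaches 1.
    cumulative = [logistic(theta - eta) for theta in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(Y = j) = P(Y <= j) - P(Y <= j - 1)
        previous = c
    return probs


# Illustrative values only: four ordered categories, two linear predictors.
cutpoints = [-1.0, 0.5, 2.0]
low = ordinal_category_probs(eta=0.0, cutpoints=cutpoints)
high = ordinal_category_probs(eta=1.0, cutpoints=cutpoints)
print(low)
print(high)
```

Under this parameterization, a larger linear predictor shifts probability mass toward the higher categories while the ordering of the categories is preserved, which is the property that dichotomizing or unordered multinomial models give up.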
Affiliation(s)
- Christopher A German
- Department of Biostatistics, UCLA Fielding School of Public Health, Los Angeles, California
- Janet S Sinsheimer
- Department of Biostatistics, UCLA Fielding School of Public Health, Los Angeles, California
- Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, California
- Department of Computational Medicine, David Geffen School of Medicine at UCLA, Los Angeles, California
- Yann C Klimentidis
- Department of Epidemiology and Biostatistics, Mel and Enid Zuckerman College of Public Health, University of Arizona, Tucson, Arizona
- Hua Zhou
- Department of Biostatistics, UCLA Fielding School of Public Health, Los Angeles, California
- Jin J Zhou
- Department of Epidemiology and Biostatistics, Mel and Enid Zuckerman College of Public Health, University of Arizona, Tucson, Arizona