Reference Citation Analysis

For: Hou K, Xu Z, Ding Y, Harpak A, Pasaniuc B. Calibrated prediction intervals for polygenic scores across diverse contexts. medRxiv 2023:2023.07.24.23293056. [PMID: 37546999 PMCID: PMC10402211 DOI: 10.1101/2023.07.24.23293056] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 08/08/2023]

Cited by Other Article(s)

Tsuo K, Shi Z, Ge T, Mandla R, Hou K, Ding Y, Pasaniuc B, Wang Y, Martin AR. All of Us diversity and scale improve polygenic prediction contextually with greatest improvements for under-represented populations. bioRxiv 2024:2024.08.06.606846. [PMID: 39149254 PMCID: PMC11326295 DOI: 10.1101/2024.08.06.606846] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2024]

Abstract

Recent studies have demonstrated that polygenic risk scores (PRS) trained on multi-ancestry data can improve prediction accuracy in groups historically underrepresented in genomic studies, but the availability of linked health and genetic data from large-scale diverse cohorts representative of a wide spectrum of human diversity remains limited. To address this need, the All of Us research program (AoU) generated whole-genome sequences of 245,388 individuals who collectively reflect the diversity of the USA. Leveraging this resource and another widely used population-scale biobank, the UK Biobank (UKB), with half a million participants, we developed PRS trained on multi-ancestry and multi-biobank data with up to ~750,000 participants for 32 common, complex traits and diseases across a range of genetic architectures. We then compared the effects of ancestry, PRS methodology, and genetic architecture on PRS accuracy across a held-out subset of ancestrally diverse AoU participants. Due to the more heterogeneous study design of AoU, we found lower heritability on average compared to UKB (0.075 vs. 0.165), which limited the maximal achievable PRS accuracy in AoU. Overall, we found that the increased diversity of AoU significantly improved PRS performance in some participants in AoU, especially underrepresented individuals, across multiple phenotypes. Notably, maximizing sample size by combining discovery data across AoU and UKB is not the optimal approach for predicting some phenotypes in African ancestry populations; rather, using data from only AoU for these traits resulted in the greatest accuracy. This was especially true for less polygenic traits with large ancestry-enriched effects, such as neutrophil count (R²: 0.055 vs. 0.035 using AoU vs. cross-biobank meta-analysis, respectively, due to, e.g., DARC). Lastly, we calculated individual-level PRS accuracies rather than grouping by continental ancestry, a critical step towards interpretability in precision medicine. Individualized PRS accuracy decayed linearly as a function of ancestry divergence, but the slope was smaller using multi-ancestry GWAS compared to using European GWAS. Our results highlight the potential of biobanks with more balanced representations of human diversity to facilitate more accurate PRS for the individuals least represented in genomic studies.

Affiliation(s)

Kristin Tsuo: Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA; Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
Zhuozheng Shi: Interdepartmental Program in Bioinformatics, University of California, Los Angeles, Los Angeles, CA 90095, USA
Tian Ge: Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Department of Psychiatry, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA; Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA; Center for Precision Psychiatry, Massachusetts General Hospital, Boston, MA, USA
Ravi Mandla: Interdepartmental Program in Bioinformatics, University of California, Los Angeles, Los Angeles, CA 90095, USA
Kangcheng Hou: Interdepartmental Program in Bioinformatics, University of California, Los Angeles, Los Angeles, CA 90095, USA
Yi Ding: Interdepartmental Program in Bioinformatics, University of California, Los Angeles, Los Angeles, CA 90095, USA
Bogdan Pasaniuc: Interdepartmental Program in Bioinformatics, University of California, Los Angeles, Los Angeles, CA 90095, USA; Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA; Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA; Institute for Precision Health, University of California, Los Angeles, Los Angeles, CA 90095, USA; Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
Ying Wang: Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA; Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
Alicia R Martin: Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA; Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA

Tubbs JD, Chen Y, Duan R, Huang H, Ge T. Real-time dynamic polygenic prediction for streaming data. medRxiv 2024:2024.07.12.24310357. [PMID: 39040195 PMCID: PMC11261927 DOI: 10.1101/2024.07.12.24310357] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/24/2024]

Boye C, Nirmalan S, Ranjbaran A, Luca F. Genotype × environment interactions in gene regulation and complex traits. Nat Genet 2024;56:1057-1068. [PMID: 38858456 DOI: 10.1038/s41588-024-01776-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Accepted: 04/25/2024] [Indexed: 06/12/2024]

Hong SC, Muyas F, Cortés-Ciriano I, Hormoz S. scAI-SNP: a method for inferring ancestry from single-cell data. bioRxiv 2024:2024.05.14.594208. [PMID: 38798590 PMCID: PMC11118306 DOI: 10.1101/2024.05.14.594208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]

Peyrot WJ, Panagiotaropoulou G, Olde Loohuis LM, Adams MJ, Awasthi S, Ge T, McIntosh AM, Mitchell BL, Mullins N, O'Connell KS, Penninx BWJH, Posthuma D, Ripke S, Ruderfer DM, Uffelmann E, Vilhjalmsson BJ, Zhu Z, Smoller JW, Price AL. Distinguishing different psychiatric disorders using DDx-PRS. medRxiv 2024:2024.02.02.24302228. [PMID: 38352307 PMCID: PMC10862992 DOI: 10.1101/2024.02.02.24302228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/24/2024]

Abstract

Despite great progress on methods for case-control polygenic prediction (e.g. schizophrenia vs. control), there remains an unmet need for a method that genetically distinguishes clinically related disorders (e.g. schizophrenia (SCZ) vs. bipolar disorder (BIP) vs. depression (MDD) vs. control); such a method could have important clinical value, especially at disorder onset when differential diagnosis can be challenging. Here, we introduce a method, Differential Diagnosis-Polygenic Risk Score (DDx-PRS), that jointly estimates posterior probabilities of each possible diagnostic category (e.g. SCZ=50%, BIP=25%, MDD=15%, control=10%) by modeling variance/covariance structure across disorders, leveraging case-control polygenic risk scores (PRS) for each disorder (computed using existing methods) and prior clinical probabilities for each diagnostic category. DDx-PRS uses only summary-level training data and does not use tuning data, facilitating implementation in clinical settings. In simulations, DDx-PRS was well-calibrated (whereas a simpler approach that analyzes each disorder marginally was poorly calibrated), and effective in distinguishing each diagnostic category vs. the rest. We then applied DDx-PRS to Psychiatric Genomics Consortium SCZ/BIP/MDD/control data, including summary-level training data from 3 case-control GWAS (N = 41,917-173,140 cases; total N = 1,048,683) and held-out test data from different cohorts with equal numbers of each diagnostic category (total N = 11,460). DDx-PRS was well-calibrated and well-powered relative to these training sample sizes, attaining AUCs of 0.66 for SCZ vs. rest, 0.64 for BIP vs. rest, 0.59 for MDD vs. rest, and 0.68 for control vs. rest. DDx-PRS produced comparable results to methods that leverage tuning data, confirming that DDx-PRS is an effective method. True diagnosis probabilities in top deciles of predicted diagnosis probabilities were considerably larger than prior baseline probabilities, particularly in projections to larger training sample sizes, implying considerable potential for clinical utility under certain circumstances. In conclusion, DDx-PRS is an effective method for distinguishing clinically related disorders.
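
The general idea of combining prior clinical probabilities with a vector of per-disorder PRS to obtain posterior probabilities can be illustrated with plain Bayes' rule. The sketch below is not the DDx-PRS algorithm: it assumes, purely for illustration, that the PRS vector is multivariate normal within each diagnostic category with a shared covariance matrix, and every number in it (priors, category means, covariance, the individual's PRS) is hypothetical. The function names `gaussian_logpdf` and `posterior_probabilities` are likewise made up for this example.

```python
import numpy as np


def gaussian_logpdf(x, mean, cov):
    """Log density of a multivariate normal with the given mean and covariance."""
    x = np.asarray(x, dtype=float)
    diff = x - np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)  # squared Mahalanobis distance
    return -0.5 * (len(x) * np.log(2.0 * np.pi) + logdet + maha)


def posterior_probabilities(prs, priors, means, cov):
    """Bayes' rule: posterior is proportional to prior times likelihood."""
    labels = list(priors)
    log_post = np.array(
        [np.log(priors[k]) + gaussian_logpdf(prs, means[k], cov) for k in labels]
    )
    log_post -= log_post.max()  # subtract the max for numerical stability
    post = np.exp(log_post)
    post /= post.sum()  # normalize so the probabilities sum to 1
    return dict(zip(labels, post))


# Hypothetical inputs: standardized PRS for (SCZ, BIP, MDD), in that order.
priors = {"SCZ": 0.25, "BIP": 0.25, "MDD": 0.25, "control": 0.25}
means = {
    "SCZ": [0.5, 0.3, 0.1],
    "BIP": [0.3, 0.5, 0.15],
    "MDD": [0.1, 0.15, 0.3],
    "control": [0.0, 0.0, 0.0],
}
# Shared covariance of the three PRS (assumed; reflects correlated disorders).
cov = [[1.0, 0.6, 0.3], [0.6, 1.0, 0.3], [0.3, 0.3, 1.0]]

individual_prs = [0.9, 0.4, 0.0]
for label, p in posterior_probabilities(individual_prs, priors, means, cov).items():
    print(f"{label}: {p:.3f}")
```

Using the full covariance matrix rather than treating each PRS separately is the point of the sketch: the abstract notes that an approach analyzing each disorder marginally was poorly calibrated in simulations.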

Uffelmann E, Price AL, Posthuma D, Peyrot WJ. Estimating Disorder Probability Based on Polygenic Prediction Using the BPC Approach. medRxiv 2024:2024.01.12.24301157. [PMID: 38260678 PMCID: PMC10802765 DOI: 10.1101/2024.01.12.24301157] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]

Veller C, Przeworski M, Coop G. Causal interpretations of family GWAS in the presence of heterogeneous effects. bioRxiv 2023:2023.11.13.566950. [PMID: 38014124 PMCID: PMC10680648 DOI: 10.1101/2023.11.13.566950] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 11/29/2023]

Abstract

Family-based genome-wide association studies (GWAS) have emerged as a gold standard for assessing causal effects of alleles and polygenic scores. Notably, family studies are often claimed to provide an unbiased estimate of the average causal effect (or average treatment effect; ATE) of an allele, on the basis of an analogy between the random transmission of alleles from parents to children and a randomized controlled trial. Here, we show that this interpretation does not hold in general. Because Mendelian segregation only randomizes alleles among children of heterozygotes, the effects of alleles in the children of homozygotes are not observable. Consequently, if an allele has different average effects in the children of homozygotes and heterozygotes, as can arise in the presence of gene-by-environment interactions, gene-by-gene interactions, or differences in LD patterns, family studies provide a biased estimate of the average effect in the sample. At a single locus, family-based association studies can be thought of as providing an unbiased estimate of the average effect in the children of heterozygotes (i.e., a local average treatment effect; LATE). This interpretation does not extend to polygenic scores, however, because different sets of SNPs are heterozygous in each family. Therefore, other than under specific conditions, the within-family regression slope of a PGS cannot be assumed to provide an unbiased estimate for any subset or weighted average of families. Instead, family-based studies can be reinterpreted as enabling an unbiased estimate of the extent to which Mendelian segregation at loci in the PGS contributes to the population-level variance in the trait. Because this estimate does not include the between-family variance, however, this interpretation applies to only (roughly) half of the sample PGS variance. 
In practice, the potential biases of a family-based GWAS are likely smaller than those arising from confounding in a standard, population-based GWAS, and so family studies remain important for the dissection of genetic contributions to phenotypic variation. Nonetheless, the causal interpretation of family-based GWAS estimates is less straightforward than has been widely appreciated.
