1
Sun WX, Chang XY, Chen Y, Zhao Q, Zhang YM. The integration of quantile regression with 3VmrMLM identifies more QTNs and QTN-by-environment interactions using SNP- and haplotype-based markers. Plant Communications 2025; 6:101196. [PMID: 39580620 PMCID: PMC11956104 DOI: 10.1016/j.xplc.2024.101196] [Received: 07/16/2024] [Revised: 10/11/2024] [Accepted: 11/20/2024] [Indexed: 11/26/2024]
Abstract
Current methods used in genome-wide association studies frequently lack power owing to their inability to detect heterogeneous associations and rare and multiallelic variants. To address these issues, we integrated quantile regression with a three (compressed) variance component multi-locus random-SNP-effect mixed linear model (3VmrMLM) to propose q3VmrMLM for detecting heterogeneous quantitative trait nucleotides (QTNs) and QTN-by-environment interactions (QEIs), and then designed a haplotype-based version (q3VmrMLM-Hap) for identifying multiallelic haplotypes and rare variants. In Monte Carlo simulation studies, q3VmrMLM had higher power than 3VmrMLM, sequence kernel association test (SKAT), and integrated quantile rank test (iQRAT). In a re-analysis of 10 traits in 1439 rice hybrids, 261 known genes were identified only by q3VmrMLM and q3VmrMLM-Hap, whereas 175 known genes were detected by both the new and existing methods. Of all the significant QTNs with known genes, q3VmrMLM (179: 140 variance heterogeneity and 157 quantile effect heterogeneity) found more heterogeneous QTNs than 3VmrMLM (123), SKAT (27), and iQRAT (29); q3VmrMLM-Hap (121) mapped more low-frequency (<0.05) QTNs than q3VmrMLM (51), 3VmrMLM (43), SKAT (11), and iQRAT (12); and q3VmrMLM-Hap (12), q3VmrMLM (16), and 3VmrMLM (12) had similar power in identifying gene-by-environment interactions. All significant and suggested QTNs achieved the highest predictive accuracy (r = 0.9045). In conclusion, this study describes a new and complementary approach to mining genes and unraveling the genetic architecture of complex traits in crops.
Affiliation(s)
- Wen-Xian Sun
  - College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Xiao-Yu Chang
  - College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Ying Chen
  - College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Qiong Zhao
  - College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Yuan-Ming Zhang
  - College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
2
Wang C, Wang T, Kiryluk K, Wei Y, Aschard H, Ionita-Laza I. Genome-wide discovery for biomarkers using quantile regression at biobank scale. Nat Commun 2024; 15:6460. [PMID: 39085219 PMCID: PMC11291931 DOI: 10.1038/s41467-024-50726-x] [Received: 07/06/2023] [Accepted: 07/18/2024] [Indexed: 08/02/2024]
Abstract
Genome-wide association studies (GWAS) for biomarkers important for clinical phenotypes can lead to clinically relevant discoveries. Conventional GWAS for quantitative traits are based on simplified regression models that treat the conditional mean of a phenotype as a linear function of genotype. We draw attention here to an alternative, lesser-known approach, namely quantile regression, which naturally extends linear regression to the analysis of the entire conditional distribution of a phenotype of interest. Quantile regression can be applied efficiently at biobank scale, while having some unique advantages such as (1) identifying variants with heterogeneous effects across quantiles of the phenotype distribution; (2) accommodating a wide range of phenotype distributions including non-normal distributions, with invariance of results to trait transformations; and (3) providing more detailed information about genotype-phenotype associations even for those associations identified by conventional GWAS. We show in simulations that quantile regression is powerful across both homogeneous and various heterogeneous models. Applications to 39 quantitative traits in the UK Biobank demonstrate that quantile regression can be a helpful complement to linear regression in GWAS and can identify variants with larger effects on high-risk subgroups of individuals but with lower or no contribution overall.
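The notion of a variant whose effect differs across quantiles of the phenotype can be illustrated with a small sketch. It is not the authors' pipeline: the data are simulated, the genotype is simplified to a 0/1 carrier indicator, and the helper names (`pinball_loss`, `fit_constant_quantile`, `quantile_effect_binary`, `simulate`) are hypothetical. With a single binary covariate the quantile regression slope reduces to a difference of group quantiles, which keeps the example free of any solver.

```python
import random


def pinball_loss(residual, tau):
    """Check (pinball) loss: rho_tau(u) = u * (tau - 1[u < 0])."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def fit_constant_quantile(values, tau):
    """Return the data point that minimizes the summed pinball loss.

    A minimizer of this loss is always attained at an observed value,
    so searching over the data is enough for a small sample.
    """
    return min(values, key=lambda c: sum(pinball_loss(v - c, tau) for v in values))


def quantile_effect_binary(y, g, tau):
    """Quantile-regression slope for a 0/1 genotype indicator.

    With an intercept and one binary covariate the model is saturated,
    so the slope equals the difference of the two group quantiles.
    """
    q0 = fit_constant_quantile([yi for yi, gi in zip(y, g) if gi == 0], tau)
    q1 = fit_constant_quantile([yi for yi, gi in zip(y, g) if gi == 1], tau)
    return q1 - q0


def simulate(n_per_group=500, seed=0):
    """Assumed toy data: carriers (g = 1) share the mean but have twice the spread."""
    rng = random.Random(seed)
    y = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    y += [rng.gauss(0.0, 2.0) for _ in range(n_per_group)]
    g = [0] * n_per_group + [1] * n_per_group
    return y, g


y, g = simulate()
effects = {tau: quantile_effect_binary(y, g, tau) for tau in (0.1, 0.5, 0.9)}
for tau, beta in effects.items():
    print(f"tau={tau}: estimated genotype effect = {beta:.3f}")
```

In this toy setting the effect at the median should be near zero while the effects at the lower and upper quantiles should have opposite signs, which is the kind of heterogeneity a mean-based test would tend to miss.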
Affiliation(s)
- Chen Wang
  - Department of Biostatistics, Columbia University, New York, NY, USA
  - Division of Nephrology, Department of Medicine, Vagelos College of Physicians and Surgeons, Columbia University, New York, NY, USA
- Krzysztof Kiryluk
  - Division of Nephrology, Department of Medicine, Vagelos College of Physicians and Surgeons, Columbia University, New York, NY, USA
- Ying Wei
  - Department of Biostatistics, Columbia University, New York, NY, USA
- Hugues Aschard
  - Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris, France
- Iuliana Ionita-Laza
  - Department of Biostatistics, Columbia University, New York, NY, USA
  - Department of Statistics, Lund University, Lund, Sweden
3
Yang Y, Wang C, Liu L, Buxbaum J, He Z, Ionita-Laza I. KnockoffTrio: A knockoff framework for the identification of putative causal variants in genome-wide association studies with trio design. Am J Hum Genet 2022; 109:1761-1776. [PMID: 36150388 PMCID: PMC9606389 DOI: 10.1016/j.ajhg.2022.08.013] [Received: 03/11/2022] [Accepted: 08/24/2022] [Indexed: 01/25/2023]
Abstract
Family-based designs can eliminate confounding due to population substructure and can distinguish direct from indirect genetic effects, but these designs are underpowered due to limited sample sizes. Here, we propose KnockoffTrio, a statistical method to identify putative causal genetic variants for father-mother-child trio design built upon a recently developed knockoff framework in statistics. KnockoffTrio controls the false discovery rate (FDR) in the presence of arbitrary correlations among tests and is less conservative and thus more powerful than the conventional methods that control the family-wise error rate via Bonferroni correction. Furthermore, KnockoffTrio is not restricted to family-based association tests and can be used in conjunction with more powerful, potentially nonlinear models to improve the power of standard family-based tests. We show, using empirical simulations, that KnockoffTrio can prioritize causal variants over associations due to linkage disequilibrium and can provide protection against confounding due to population stratification. In applications to 14,200 trios from three study cohorts for autism spectrum disorders (ASDs), including AGP, SPARK, and SSC, we show that KnockoffTrio can identify multiple significant associations that are missed by conventional tests applied to the same data. In particular, we replicate known ASD association signals with variants in several genes such as MACROD2, NRXN1, PRKAR1B, CADM2, PCDH9, and DOCK4 and identify additional associations with variants in other genes including ARHGEF10, SLC28A1, ZNF589, and HINT1 at FDR 10%.
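The FDR-controlled selection mentioned above can be sketched with the generic knockoff+ selection rule from the knockoff literature. This is only an illustration of that final thresholding step under stated assumptions: it does not reproduce how KnockoffTrio builds trio knockoffs or its feature statistics, the `W` values below are made up, and the function names are hypothetical. Each `W[j]` is assumed to be a feature statistic that is large and positive when variant `j` looks more important than its knockoff copy.

```python
def knockoff_plus_threshold(w, q):
    """Smallest t with (1 + #{W_j <= -t}) / max(1, #{W_j >= t}) <= q.

    Returns infinity when no candidate threshold meets the target FDR q.
    """
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        negatives = sum(1 for x in w if x <= -t)
        positives = sum(1 for x in w if x >= t)
        if (1 + negatives) / max(1, positives) <= q:
            return t
    return float("inf")


def knockoff_select(w, q):
    """Indices of variants whose statistic reaches the knockoff+ threshold."""
    t = knockoff_plus_threshold(w, q)
    return [j for j, x in enumerate(w) if x >= t]


# Made-up feature statistics for ten variants (illustrative only).
W = [5.0, 4.0, 3.5, 3.0, 2.5, 2.0, -1.0, 0.8, -0.6, 0.4]
selected = knockoff_select(W, q=0.2)
print("threshold:", knockoff_plus_threshold(W, q=0.2))
print("selected variant indices:", selected)
```

The negative statistics act as an internal estimate of how many false positives sit above a given threshold, which is why no per-test p-value correction such as Bonferroni is needed in this step.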
Affiliation(s)
- Yi Yang
  - Department of Biostatistics, Columbia University, New York, NY 10032, USA; Department of Biostatistics, City University of Hong Kong, Hong Kong SAR, China; School of Data Science, City University of Hong Kong, Hong Kong SAR, China
- Chen Wang
  - Department of Biostatistics, Columbia University, New York, NY 10032, USA
- Linxi Liu
  - Department of Statistics, University of Pittsburgh, Pittsburgh, PA 15260, USA
- Joseph Buxbaum
  - Departments of Psychiatry, Neuroscience, and Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Zihuai He
  - Quantitative Sciences Unit, Department of Medicine, Stanford University, Stanford, CA 94305, USA; Department of Neurology and Neurological Sciences, Stanford University, Stanford, CA 94305, USA
4
Miao J, Lin Y, Wu Y, Zheng B, Schmitz LL, Fletcher JM, Lu Q. A quantile integral linear model to quantify genetic effects on phenotypic variability. Proc Natl Acad Sci U S A 2022; 119:e2212959119. [PMID: 36122202 PMCID: PMC9522331 DOI: 10.1073/pnas.2212959119] [Received: 08/02/2022] [Accepted: 08/22/2022] [Indexed: 11/18/2022]
Abstract
Detecting genetic variants associated with the variance of complex traits, that is, variance quantitative trait loci (vQTLs), can provide crucial insights into the interplay between genes and environments and how they jointly shape human phenotypes in the population. We propose a quantile integral linear model (QUAIL) to estimate genetic effects on trait variability. Through extensive simulations and analyses of real data, we demonstrate that QUAIL provides computationally efficient and statistically powerful vQTL mapping that is robust to non-Gaussian phenotypes and confounding effects on phenotypic variability. Applied to UK Biobank (n = 375,791), QUAIL identified 11 vQTLs for body mass index (BMI) that have not been previously reported. Top vQTL findings showed substantial enrichment for interactions with physical activities and sedentary behavior. Furthermore, variance polygenic scores (vPGSs) based on QUAIL effect estimates showed superior predictive performance on both population-level and within-individual BMI variability compared to existing approaches. Overall, QUAIL is a unified framework to quantify genetic effects on the phenotypic variability at both single-variant and vPGS levels. It addresses critical limitations in existing approaches and may have broad applications in future gene-environment interaction studies.
Affiliation(s)
- Jiacheng Miao
  - Department of Biostatistics and Medical Informatics, University of Wisconsin–Madison, WI 53706
- Yupei Lin
  - Baylor College of Medicine, Houston, TX 77030
- Yuchang Wu
  - Department of Biostatistics and Medical Informatics, University of Wisconsin–Madison, WI 53706
- Boyan Zheng
  - Department of Sociology, University of Wisconsin–Madison, Madison, WI 53706
- Lauren L. Schmitz
  - Robert M. La Follette School of Public Affairs, University of Wisconsin–Madison, Madison, WI 53706
  - Center for Demography of Health and Aging, University of Wisconsin–Madison, Madison, WI 53706
- Jason M. Fletcher
  - Department of Sociology, University of Wisconsin–Madison, Madison, WI 53706
  - Robert M. La Follette School of Public Affairs, University of Wisconsin–Madison, Madison, WI 53706
  - Center for Demography of Health and Aging, University of Wisconsin–Madison, Madison, WI 53706
- Qiongshi Lu
  - Department of Biostatistics and Medical Informatics, University of Wisconsin–Madison, WI 53706
  - Center for Demography of Health and Aging, University of Wisconsin–Madison, Madison, WI 53706
  - Department of Statistics, University of Wisconsin–Madison, Madison, WI 53706