1
Ren J, Pan W. Statistical inference with large-scale trait imputation. Stat Med 2024; 43:625-641. [PMID: 38038193 PMCID: PMC10848238 DOI: 10.1002/sim.9975] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Revised: 09/26/2023] [Accepted: 11/17/2023] [Indexed: 12/02/2023]
Abstract
Recently, a nonparametric method called LS-imputation has been proposed for large-scale trait imputation based on a GWAS summary dataset and a large set of genotyped individuals. The imputed trait values, along with the genotypes, can be treated as an individual-level dataset for downstream genetic analyses, including those that cannot be done with GWAS summary data. However, since the covariance matrix of the imputed trait values is often too large to calculate, the current method imposes a working assumption that the imputed trait values are independent and identically distributed, which is in fact incorrect. Here we propose a "divide and conquer/combine" strategy to estimate and account for the covariance matrix of the imputed trait values via batches, thus relaxing the incorrect working assumption. Applications of the methods to the UK Biobank data for marginal association analysis showed some improvement by the new method in some cases, but overall the original method performed well, which was explained by nearly constant variances of and mostly weak correlations among imputed trait values.
Affiliation(s)
- Jingchen Ren
  - School of Statistics, University of Minnesota, Minneapolis, MN, 55455
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, 55455
- Wei Pan
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, 55455
2
Ren J, Lin Z, He R, Shen X, Pan W. Using GWAS summary data to impute traits for genotyped individuals. HGG ADVANCES 2023; 4:100197. [PMID: 37181332 PMCID: PMC10173780 DOI: 10.1016/j.xhgg.2023.100197] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2022] [Accepted: 04/07/2023] [Indexed: 05/16/2023] [Open Access]
Abstract
Genome-wide association study (GWAS) summary data have become extremely useful in daily routine data analysis, largely facilitating new methods development and new applications. However, a severe limitation with the current use of GWAS summary data is its exclusive restriction to only linear single nucleotide polymorphism (SNP)-trait association analyses. To further expand the use of GWAS summary data, along with a large sample of individual-level genotypes, we propose a nonparametric method for large-scale imputation of the genetic component of the trait for the given genotypes. The imputed individual-level trait values, along with the individual-level genotypes, make it possible to conduct any analysis as with individual-level GWAS data, including nonlinear SNP-trait associations and predictions. We use the UK Biobank data to highlight the usefulness and effectiveness of the proposed method in three applications that currently cannot be done with only GWAS summary data (for SNP-trait associations): marginal SNP-trait association analysis under non-additive genetic models, detection of SNP-SNP interactions, and genetic prediction of a trait using a nonlinear model of SNPs.
Affiliation(s)
- Jingchen Ren
  - School of Statistics, University of Minnesota, Minneapolis, MN 55455, USA
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455, USA
- Zhaotong Lin
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455, USA
- Ruoyu He
  - School of Statistics, University of Minnesota, Minneapolis, MN 55455, USA
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455, USA
- Xiaotong Shen
  - School of Statistics, University of Minnesota, Minneapolis, MN 55455, USA
- Wei Pan
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455, USA
  - Corresponding author
3
Zhang J, Luo Q, Hou J, Xiao W, Long P, Hu Y, Chen X, Wang H. Fatty acids and risk of dilated cardiomyopathy: A two-sample Mendelian randomization study. Front Nutr 2023; 10:1068050. [PMID: 36875854 PMCID: PMC9980906 DOI: 10.3389/fnut.2023.1068050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2022] [Accepted: 01/30/2023] [Indexed: 02/18/2023] [Open Access]
Abstract
Background: Previous observational studies have shown close associations between fatty acids (FAs) and dilated cardiomyopathy (DCM). However, because observational epidemiological studies are subject to confounding factors and reverse causal associations, their etiological explanations are not reliable. Objective: To exclude possible confounding factors and reverse causal associations found in observational epidemiological studies, we used two-sample Mendelian randomization (MR) analysis to verify the causal relationship between FAs and DCM risk. Method: All data on 54 FAs were downloaded from the genome-wide association studies (GWAS) catalog, and the summary statistics for DCM were extracted from the HF Molecular Epidemiology for Therapeutic Targets Consortium GWAS. Two-sample MR analysis was conducted to evaluate the causal effect of FAs on DCM risk through several analytical methods, including MR-Egger, inverse variance weighting (IVW), maximum likelihood, weighted median estimator (WME), and the MR pleiotropy residual sum and outlier test (MR-PRESSO). Directionality tests using MR-Steiger were performed to assess the possibility of reverse causation. Results: Our analysis identified two FAs, oleic acid and fatty acid (18:1)-OH, that may have a significant causal effect on DCM. MR analyses indicated that oleic acid was suggestively associated with a heightened risk of DCM (OR = 1.291, 95%CI: 1.044-1.595, P = 0.018). As a probable metabolite of oleic acid, fatty acid (18:1)-OH has a suggestive association with a lower risk of DCM (OR = 0.402, 95%CI: 0.167-0.966, P = 0.041). The results of the directionality test suggested that there was no reverse causality between exposure and outcome (P < 0.001). In contrast, the other 52 available FAs were found to have no significant causal relationships with DCM (P > 0.05). Conclusion: Our findings suggest that oleic acid and fatty acid (18:1)-OH may have causal relationships with DCM, indicating that the risk of DCM from oleic acid may be decreased by encouraging the conversion of oleic acid to fatty acid (18:1)-OH.
Affiliation(s)
- Jiexin Zhang
  - Department of Laboratory Medicine, Affiliated Hospital of Southwest Jiaotong University, The Third People's Hospital of Chengdu, Chengdu, Sichuan, China
  - Central Laboratory, The General Hospital of Western Theater Command, Chengdu, Sichuan, China
- Qiang Luo
  - Department of Laboratory Medicine, Affiliated Hospital of Southwest Jiaotong University, The Third People's Hospital of Chengdu, Chengdu, Sichuan, China
- Jun Hou
  - Department of Laboratory Medicine, Affiliated Hospital of Southwest Jiaotong University, The Third People's Hospital of Chengdu, Chengdu, Sichuan, China
- Wenjing Xiao
  - Central Laboratory, The General Hospital of Western Theater Command, Chengdu, Sichuan, China
- Pan Long
  - Central Laboratory, The General Hospital of Western Theater Command, Chengdu, Sichuan, China
- Yonghe Hu
  - Central Laboratory, The General Hospital of Western Theater Command, Chengdu, Sichuan, China
- Xin Chen
  - Department of Laboratory Medicine, Affiliated Hospital of Southwest Jiaotong University, The Third People's Hospital of Chengdu, Chengdu, Sichuan, China
- Han Wang
  - Department of Cardiology, Affiliated Hospital of Southwest Jiaotong University, The Third People's Hospital of Chengdu, Chengdu, Sichuan, China
4
Armstrong ND, Srinivasasainagendra V, Patki A, Tanner RM, Hidalgo BA, Tiwari HK, Limdi NA, Lange EM, Lange LA, Arnett DK, Irvin MR. Genetic Contributors of Incident Stroke in 10,700 African Americans With Hypertension: A Meta-Analysis From the Genetics of Hypertension Associated Treatments and Reasons for Geographic and Racial Differences in Stroke Studies. Front Genet 2022; 12:781451. [PMID: 34992631 PMCID: PMC8724550 DOI: 10.3389/fgene.2021.781451] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 09/22/2021] [Accepted: 11/23/2021] [Indexed: 11/25/2022] [Open Access]
Abstract
Background: African Americans (AAs) suffer a higher stroke burden due to hypertension. Identifying genetic contributors to stroke among AAs with hypertension is critical to understanding the genetic basis of the disease, as well as detecting at-risk individuals. Methods: In a population comprising over 10,700 AAs treated for hypertension from the Genetics of Hypertension Associated Treatments (GenHAT) and Reasons for Geographic and Racial Differences in Stroke (REGARDS) studies, we performed an inverse variance-weighted meta-analysis of incident stroke. Additionally, we tested the predictive accuracy of a polygenic risk score (PRS) derived from a European ancestral population in both GenHAT and REGARDS AAs, aiming to evaluate cross-ethnic performance. Results: We identified 10 statistically significant (p < 5.00E-08) and 90 additional suggestive (p < 1.00E-06) variants associated with incident stroke in the meta-analysis. Six of the top 10 variants were located in an intergenic region on chromosome 18 (LINC01443-LOC644669). Additional variants of interest were located in or near the COL12A1, SNTG1, PCDH7, TMTC1, and NTM genes. Replication was conducted in the Warfarin Pharmacogenomics Cohort (WPC), and while none of the variants were directly validated, seven intronic variants of NTM proximal to our target variants had a p-value < 5.00E-04 in the WPC. The inclusion of the PRS did not improve the prediction accuracy compared to a reference model adjusting for age, sex, and genetic ancestry in either study and had lower predictive accuracy compared to models accounting for established stroke risk factors. These results demonstrate the necessity for PRS derivation in AAs, particularly for diseases that affect AAs disproportionately. Conclusion: This study highlights biologically plausible genetic determinants for incident stroke in hypertensive AAs. Ultimately, a better understanding of genetic risk factors for stroke in AAs may give new insight into stroke burden and potential clinical tools for those at the highest risk.
Affiliation(s)
- Nicole D Armstrong
  - Department of Epidemiology, University of Alabama at Birmingham, Birmingham, AL, United States
- Amit Patki
  - Department of Biostatistics, University of Alabama at Birmingham, Birmingham, AL, United States
- Rikki M Tanner
  - Department of Epidemiology, University of Alabama at Birmingham, Birmingham, AL, United States
- Bertha A Hidalgo
  - Department of Epidemiology, University of Alabama at Birmingham, Birmingham, AL, United States
- Hemant K Tiwari
  - Department of Biostatistics, University of Alabama at Birmingham, Birmingham, AL, United States
- Nita A Limdi
  - Department of Neurology, University of Alabama at Birmingham, Birmingham, AL, United States
- Ethan M Lange
  - Division of Biomedical Informatics and Personalized Medicine, Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, United States
- Leslie A Lange
  - Division of Biomedical Informatics and Personalized Medicine, Department of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, United States
- Donna K Arnett
  - College of Public Health, University of Kentucky, Lexington, KY, United States
- Marguerite R Irvin
  - Department of Epidemiology, University of Alabama at Birmingham, Birmingham, AL, United States
5
Machine learning approaches for classification of colorectal cancer with and without feature selection method on microarray data. GENE REPORTS 2021. [DOI: 10.1016/j.genrep.2021.101419] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 01/16/2023]