1
Zhang Q, Yang Z, Yang J. Dissecting the colocalized GWAS and eQTLs with mediation analysis for high-dimensional exposures and confounders. Biometrics 2024; 80:ujae050. [PMID: 38801257 DOI: 10.1093/biomtc/ujae050] [Received: 11/13/2023] [Revised: 03/14/2024] [Accepted: 05/14/2024] [Indexed: 05/29/2024]
Abstract
To leverage the advancements in genome-wide association studies (GWAS) and quantitative trait loci (QTL) mapping for traits and molecular phenotypes to gain a mechanistic understanding of genetic regulation, biological researchers often investigate the expression QTLs (eQTLs) that colocalize with QTL or GWAS peaks. Our research is inspired by 2 such studies. One aims to identify the causal single nucleotide polymorphisms that are responsible for the phenotypic variation and whose effects can be explained by their impacts at the transcriptomic level in maize. The other study, in mouse, focuses on uncovering the cis-driver genes that induce phenotypic changes by regulating trans-regulated genes. Both studies can be formulated as mediation problems with potentially high-dimensional exposures, confounders, and mediators that seek to estimate the overall indirect effect (IE) for each exposure. In this paper, we propose MedDiC, a novel procedure to estimate the overall IE based on the difference-in-coefficients approach. Our simulation studies find that MedDiC offers valid inference for the IE with higher power, shorter confidence intervals, and faster computing time than competing methods. We apply MedDiC to the 2 aforementioned motivating datasets and find that MedDiC yields reproducible outputs across the analysis of closely related traits, with results supported by external biological evidence. The code and additional information are available on our GitHub page (https://github.com/QiZhangStat/MedDiC).
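As a rough illustration of the difference-in-coefficients idea named in the abstract, the sketch below estimates the overall IE of a single exposure as the total effect minus the direct effect, using ordinary least squares on simulated low-dimensional data. This is not the MedDiC procedure, which targets high-dimensional exposures, confounders, and mediators; the function names (`ols_coefs`, `indirect_effect_diff`), the simulated effect sizes, and the use of plain OLS are all assumptions made for illustration.

```python
import numpy as np


def ols_coefs(design, y):
    """Least-squares coefficients for y ~ design (design includes an intercept column)."""
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs


def indirect_effect_diff(x, mediators, y):
    """Overall indirect effect as (total effect) - (direct effect) of x on y."""
    ones = np.ones((x.shape[0], 1))
    # Total effect: coefficient of x when the mediators are left out.
    total = ols_coefs(np.hstack([ones, x[:, None]]), y)[1]
    # Direct effect: coefficient of x after adjusting for all mediators.
    direct = ols_coefs(np.hstack([ones, x[:, None], mediators]), y)[1]
    return total - direct


# Simulated data (hypothetical effect sizes, chosen only for this sketch).
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
alpha = np.array([0.8, -0.5, 0.0])   # exposure -> mediator effects
beta = np.array([1.0, 0.5, 0.3])     # mediator -> outcome effects
mediators = x[:, None] * alpha + rng.normal(size=(n, 3))
y = 0.4 * x + mediators @ beta + rng.normal(size=n)

ie_hat = indirect_effect_diff(x, mediators, y)
true_ie = float(alpha @ beta)        # sum of alpha_j * beta_j in the simulation
print(f"estimated IE = {ie_hat:.3f}, simulated IE = {true_ie:.3f}")
```

In this OLS setting the difference-in-coefficients estimate coincides algebraically with the sum of per-mediator products of coefficients; with many mediators relative to the sample size, plain OLS is no longer usable and a regularized or debiased procedure is needed instead.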
Affiliation(s)
- Qi Zhang
  - Department of Mathematics and Statistics, University of New Hampshire, Durham, NH 03824, United States
- Zhikai Yang
  - Complex Biosystems Program and Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68583, United States
- Jinliang Yang
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68583, United States
2
Yang Z, Zhao T, Cheng H, Yang J. Microbiome-enabled genomic selection improves prediction accuracy for nitrogen-related traits in maize. G3 (Bethesda, Md.) 2024; 14:jkad286. [PMID: 38113533 PMCID: PMC11090461 DOI: 10.1093/g3journal/jkad286] [Received: 05/19/2023] [Revised: 05/19/2023] [Accepted: 12/05/2023] [Indexed: 12/21/2023]
Abstract
Root-associated microbiomes in the rhizosphere (rhizobiomes) are increasingly known to play an important role in nutrient acquisition, stress tolerance, and disease resistance of plants. However, it remains largely unclear to what extent these rhizobiomes contribute to trait variation for different genotypes and if their inclusion in the genomic selection protocol can enhance prediction accuracy. To address these questions, we developed a microbiome-enabled genomic selection method that incorporated host SNPs and amplicon sequence variants from plant rhizobiomes in a maize diversity panel under high and low nitrogen (N) field conditions. Our cross-validation results showed that the microbiome-enabled genomic selection model significantly outperformed the conventional genomic selection model for nearly all time-series traits related to plant growth and N responses, with an average relative improvement of 3.7%. The improvement was more pronounced under low N conditions (8.4-40.2% of relative improvement), consistent with the view that some beneficial microbes can enhance N nutrient uptake, particularly in low N fields. However, our study could not definitively rule out the possibility that the observed improvement is partially due to the amplicon sequence variants being influenced by microenvironments. Using a high-dimensional mediation analysis method, our study has also identified microbial mediators that establish a link between plant genotype and phenotype. Some of the detected mediator microbes were previously reported to promote plant growth. The enhanced prediction accuracy of the microbiome-enabled genomic selection models, demonstrated in a single environment, serves as a proof-of-concept for the potential application of microbiome-enabled plant breeding for sustainable agriculture.
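The abstract reports gains as relative improvement in prediction accuracy. A minimal sketch of that arithmetic follows, assuming the usual definition (new accuracy minus baseline accuracy, divided by baseline accuracy); the abstract does not state the exact formula, and the function name and the accuracy values below are hypothetical, not taken from the study.

```python
def relative_improvement(baseline_accuracy: float, new_accuracy: float) -> float:
    """Relative change of new_accuracy over baseline_accuracy, as a fraction."""
    if baseline_accuracy == 0:
        raise ValueError("baseline accuracy must be non-zero")
    return (new_accuracy - baseline_accuracy) / abs(baseline_accuracy)


# Hypothetical cross-validation accuracies for one trait (not from the paper).
conventional_gs = 0.50   # genomic selection with host SNPs only
microbiome_gs = 0.52     # genomic selection with SNPs plus rhizobiome features
gain = relative_improvement(conventional_gs, microbiome_gs)
print(f"relative improvement: {gain:.1%}")
```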
Affiliation(s)
- Zhikai Yang
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE 68583, USA
- Tianjing Zhao
  - Department of Animal Science, University of Nebraska-Lincoln, Lincoln, NE 68583, USA
  - Department of Animal Science, University of California Davis, Davis, CA 95616, USA
- Hao Cheng
  - Department of Animal Science, University of California Davis, Davis, CA 95616, USA
- Jinliang Yang
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE 68583, USA
3
Clark-Boucher D, Zhou X, Du J, Liu Y, Needham BL, Smith JA, Mukherjee B. Methods for mediation analysis with high-dimensional DNA methylation data: Possible choices and comparisons. PLoS Genet 2023; 19:e1011022. [PMID: 37934796 PMCID: PMC10655967 DOI: 10.1371/journal.pgen.1011022] [Received: 05/23/2023] [Revised: 11/17/2023] [Accepted: 10/18/2023] [Indexed: 11/09/2023]
Abstract
Epigenetic researchers often evaluate DNA methylation as a potential mediator of the effect of social/environmental exposures on a health outcome. Modern statistical methods for jointly evaluating many mediators have not been widely adopted. We compare seven methods for high-dimensional mediation analysis with continuous outcomes through both diverse simulations and analysis of DNAm data from a large multi-ethnic cohort in the United States, while providing an R package for their seamless implementation and adoption. Among the considered choices, the best-performing methods for detecting active mediators in simulations are the Bayesian sparse linear mixed model (BSLMM) and high-dimensional mediation analysis (HDMA); while the preferred methods for estimating the global mediation effect are high-dimensional linear mediation analysis (HILMA) and principal component mediation analysis (PCMA). We provide guidelines for epigenetic researchers on choosing the best method in practice and offer suggestions for future methodological development.
Affiliation(s)
- Dylan Clark-Boucher
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health; Boston, Massachusetts, United States of America
- Xiang Zhou
  - Department of Biostatistics, University of Michigan; Ann Arbor, Michigan, United States of America
- Jiacong Du
  - Department of Biostatistics, University of Michigan; Ann Arbor, Michigan, United States of America
- Yongmei Liu
  - Department of Medicine, Divisions of Cardiology and Neurology, Duke University Medical Center; Durham, North Carolina, United States of America
- Belinda L. Needham
  - Department of Epidemiology, University of Michigan; Ann Arbor, Michigan, United States of America
- Jennifer A. Smith
  - Department of Epidemiology, University of Michigan; Ann Arbor, Michigan, United States of America
  - Survey Research Center, Institute for Social Research, University of Michigan; Ann Arbor, Michigan, United States of America
- Bhramar Mukherjee
  - Department of Biostatistics, University of Michigan; Ann Arbor, Michigan, United States of America
  - Department of Epidemiology, University of Michigan; Ann Arbor, Michigan, United States of America
4
Clark-Boucher D, Zhou X, Du J, Liu Y, Needham BL, Smith JA, Mukherjee B. Methods for Mediation Analysis with High-Dimensional DNA Methylation Data: Possible Choices and Comparison. medRxiv 2023:2023.02.10.23285764. [PMID: 36824903 PMCID: PMC9949196 DOI: 10.1101/2023.02.10.23285764] [Indexed: 02/16/2023]
Abstract
Epigenetic researchers often evaluate DNA methylation as a mediator between social/environmental exposures and disease, but modern statistical methods for jointly evaluating many mediators have not been widely adopted. We compare seven methods for high-dimensional mediation analysis with continuous outcomes through both diverse simulations and analysis of DNAm data from a large national cohort in the United States, while providing an R package for their implementation. Among the considered choices, the best-performing methods for detecting active mediators in simulations are the Bayesian sparse linear mixed model by Song et al. (2020) and high-dimensional mediation analysis by Gao et al. (2019); while the superior methods for estimating the global mediation effect are high-dimensional linear mediation analysis by Zhou et al. (2021) and principal component mediation analysis by Huang and Pan (2016). We provide guidelines for epigenetic researchers on choosing the best method in practice and offer suggestions for future methodological development.
Affiliation(s)
- Dylan Clark-Boucher
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA
- Xiang Zhou
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI
- Jiacong Du
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI
- Yongmei Liu
  - Department of Medicine, Divisions of Cardiology and Neurology, Duke University Medical Center, Durham, NC
- Jennifer A Smith
  - Department of Epidemiology, University of Michigan, Ann Arbor, MI
  - Survey Research Center, Institute for Social Research, University of Michigan, Ann Arbor, MI
- Bhramar Mukherjee
  - Department of Biostatistics, University of Michigan, Ann Arbor, MI
  - Department of Epidemiology, University of Michigan, Ann Arbor, MI
5
A cognitive neurogenetic approach to uncovering the structure of executive functions. Nat Commun 2022; 13:4588. [PMID: 35933428 PMCID: PMC9357028 DOI: 10.1038/s41467-022-32383-0] [Received: 01/18/2022] [Accepted: 07/27/2022] [Indexed: 11/08/2022]
Abstract
One central mission of cognitive neuroscience is to understand the ontology of complex cognitive functions. We addressed this question with a cognitive neurogenetic approach using a large-scale dataset of executive functions (EFs), whole-brain resting-state functional connectivity, and genetic polymorphisms. We found that the bifactor model with common and shifting-specific components not only was parsimonious but also showed maximal dissociations among the EF components at behavioral, neural, and genetic levels. In particular, the genes with enhanced expression in the middle frontal gyrus (MFG) and the subcallosal cingulate gyrus (SCG) showed enrichment for the common and shifting-specific components, respectively. Finally, high-dimensional mediation models further revealed that the functional connectivity patterns significantly mediated the genetic effect on the common EF component. Our study not only reveals insights into the ontology of EFs and their neurogenetic basis, but also provides useful tools to uncover the structure of complex constructs of human cognition.
6
Yang Z, Xu G, Zhang Q, Obata T, Yang J. Genome-wide mediation analysis: an empirical study to connect phenotype with genotype via intermediate transcriptomic data in maize. Genetics 2022; 221:6572813. [PMID: 35460234 PMCID: PMC9157066 DOI: 10.1093/genetics/iyac057] [Received: 02/07/2022] [Accepted: 04/04/2022] [Indexed: 11/13/2022]
Abstract
Mapping genotype to phenotype is an essential topic in genetics and genomics research. As Omics data become increasingly available, 2-variable methods have been widely applied to associate genotype with the phenotype (genome-wide association study), gene expression with the phenotype (transcriptome-wide association study), and genotype with gene expression. However, signals detected by these 2-variable association methods suffer from low mapping resolution or inexplicit causality between genotype and phenotype, making it challenging to interpret and validate the molecular mechanisms of the underlying genomic variations and the candidate genes. In the context of genetics research, we hypothesized a causal chain from genotype to phenotype partially mediated by intermediate molecular processes, i.e. gene expression. To test this hypothesis, we applied high-dimensional mediation analysis, a class of causal inference methods with an assumed causal chain from the exposure to the mediator to the outcome, and implemented it with a maize association panel (N = 280 lines). Using 40 publicly available agronomy traits, 66 newly generated metabolite traits, and published RNA-seq data from 7 different tissues, our empirical study detected 736 unique mediating genes. Notably, 83/736 (11%) genes were identified as mediating more than 1 trait, suggesting the prevalence of pleiotropic mediating effects. We demonstrated that several identified mediating genes are consistent with their known functions. In addition, our results provided explicit hypotheses for functional validation and suggested that mediation analysis is a powerful tool to integrate Omics data to connect genotype to phenotype.
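The causal chain described in the abstract (exposure to mediator to outcome) is commonly written as a pair of linear models. The display below is a generic sketch of that standard formulation, not necessarily the exact model used in the study; the symbols $X$ (genotype), $M_j$ (expression of gene $j$), $Y$ (phenotype), and the coefficients $\alpha_j$, $\beta_j$, $\gamma$ are notation assumed here for illustration.

```latex
\begin{align}
M_j &= \alpha_j X + e_j, \qquad j = 1, \dots, p, \\
Y   &= \gamma X + \sum_{j=1}^{p} \beta_j M_j + \varepsilon, \\
\mathrm{IE}_j &= \alpha_j \beta_j, \qquad
\mathrm{IE} = \sum_{j=1}^{p} \alpha_j \beta_j .
\end{align}
```

Under this formulation, gene $j$ acts as a mediator when both $\alpha_j$ and $\beta_j$ are non-zero, and $\gamma$ is the direct effect of genotype on phenotype that does not pass through expression.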
Affiliation(s)
- Zhikai Yang
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE 68583, USA
- Gen Xu
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68588, USA
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE 68583, USA
- Qi Zhang
  - Department of Mathematics and Statistics, University of New Hampshire, Durham, NH 03824, USA
- Toshihiro Obata
  - Center for Plant Science Innovation, University of Nebraska-Lincoln, Lincoln, NE 68583, USA
  - Department of Biochemistry, University of Nebraska-Lincoln, Lincoln, NE 68583, USA
- Jinliang Yang
  - Corresponding author: Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68588, USA.