1
Momen M, Ayatollahi Mehrgardi A, Amiri Roudbar M, Kranis A, Mercuri Pinto R, Valente BD, Morota G, Rosa GJM, Gianola D. Including Phenotypic Causal Networks in Genome-Wide Association Studies Using Mixed Effects Structural Equation Models. Front Genet 2018; 9:455. [PMID: 30356716 PMCID: PMC6189326 DOI: 10.3389/fgene.2018.00455] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 06/21/2018] [Accepted: 09/18/2018] [Indexed: 12/21/2022] Open
Abstract
Network-based statistical models that account for putative causal relationships among multiple phenotypes can be used to infer single-nucleotide polymorphism (SNP) effects that are transmitted through a given causal path in genome-wide association studies (GWAS). In GWAS with multiple phenotypes, reconstructing underlying causal structures among traits and SNPs using a single statistical framework is essential for understanding the entirety of genotype-phenotype maps. A structural equation model (SEM) can be used for such purposes. We applied SEM to GWAS (SEM-GWAS) in chickens, taking into account putative causal relationships among breast meat (BM), body weight (BW), hen-house production (HHP), and SNPs. We assessed the performance of SEM-GWAS by comparing the model results with those obtained from traditional multi-trait association analyses (MTM-GWAS). Three different putative causal path diagrams were inferred from highest posterior density (HPD) intervals of 0.75, 0.85, and 0.95 using the inductive causation algorithm. A positive path coefficient was estimated for BM → BW, and negative values were obtained for BM → HHP and BW → HHP in all implemented scenarios. Further, the application of SEM-GWAS enabled the decomposition of SNP effects into direct, indirect, and total effects, identifying whether a SNP effect acts directly or indirectly on a given trait. In contrast, MTM-GWAS captured only overall genetic effects on traits, which is equivalent to combining the direct and indirect SNP effects from SEM-GWAS. Although MTM-GWAS and SEM-GWAS use similar probabilistic models, we provide evidence that SEM-GWAS captures complex relationships in terms of causal meaning and mediation and delivers a more comprehensive understanding of SNP effects than MTM-GWAS. Our results showed that SEM-GWAS provides important insight into the mechanism by which identified SNPs control traits by partitioning them into direct, indirect, and total SNP effects.
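The decomposition into direct, indirect, and total SNP effects can be illustrated with a small sketch. It assumes a recursive (acyclic) path diagram BM → BW, BM → HHP, BW → HHP, as described in the abstract, and computes each trait's total effect as its direct effect plus the path-weighted total effects of its parent traits. The function name `decompose_snp_effects` and every numeric value are hypothetical; only the signs of the path coefficients follow the abstract, and this is not the authors' implementation.

```python
def decompose_snp_effects(direct, paths, order):
    """Split SNP effects into indirect and total parts on a recursive path diagram.

    direct: dict trait -> direct SNP effect on that trait
    paths:  dict (parent, child) -> path coefficient for parent -> child
    order:  traits listed so that every parent precedes its children
    """
    total = {}
    for trait in order:
        # Total effect = direct effect + sum over parents of
        # (path coefficient * total effect on the parent).
        mediated = sum(
            coef * total[parent]
            for (parent, child), coef in paths.items()
            if child == trait
        )
        total[trait] = direct[trait] + mediated
    # Indirect effect = whatever reaches the trait through other traits.
    indirect = {trait: total[trait] - direct[trait] for trait in order}
    return indirect, total


# Hypothetical values for illustration only (not estimates from the paper).
direct = {"BM": 0.5, "BW": 0.2, "HHP": -0.1}
paths = {("BM", "BW"): 0.4, ("BM", "HHP"): -0.3, ("BW", "HHP"): -0.2}
indirect, total = decompose_snp_effects(direct, paths, ["BM", "BW", "HHP"])

for trait in ("BM", "BW", "HHP"):
    print(trait, direct[trait], round(indirect[trait], 4), round(total[trait], 4))
```

In this sketch the `total` values play the role of the overall effects that a multi-trait analysis would report, while the `direct` and `indirect` values are the two parts that only the structural model separates.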
Affiliation(s)
- Mehdi Momen
- Department of Animal Science, Faculty of Agriculture, Shahid Bahonar University of Kerman, Kerman, Iran
- Mahmoud Amiri Roudbar
- Department of Animal Science, Faculty of Agriculture, Shahid Bahonar University of Kerman, Kerman, Iran
- Andreas Kranis
- Roslin Institute, University of Edinburgh, Midlothian, United Kingdom
- Renan Mercuri Pinto
- Department of Exact Sciences, University of São Paulo-Escola Superior de Agricultura Luiz de Queiroz, Piracicaba, Brazil; Department of Animal Sciences, University of Wisconsin, Madison, WI, United States
- Bruno D Valente
- Department of Animal Sciences, University of Wisconsin, Madison, WI, United States
- Gota Morota
- Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, United States
- Guilherme J M Rosa
- Department of Animal Sciences, University of Wisconsin, Madison, WI, United States; Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI, United States
- Daniel Gianola
- Department of Animal Sciences, University of Wisconsin, Madison, WI, United States; Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, WI, United States; Department of Dairy Science, University of Wisconsin, Madison, WI, United States
3
Igolkina AA, Armoskus C, Newman JRB, Evgrafov OV, McIntyre LM, Nuzhdin SV, Samsonova MG. Analysis of Gene Expression Variance in Schizophrenia Using Structural Equation Modeling. Front Mol Neurosci 2018; 11:192. [PMID: 29942251 PMCID: PMC6004421 DOI: 10.3389/fnmol.2018.00192] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 01/31/2018] [Accepted: 05/15/2018] [Indexed: 01/02/2023] Open
Abstract
Schizophrenia (SCZ) is a psychiatric disorder of unknown etiology. There is evidence suggesting that aberrations in neurodevelopment are a significant attribute of schizophrenia pathogenesis and progression. To identify biologically relevant molecular abnormalities affecting neurodevelopment in SCZ, we used cultured neural progenitor cells derived from olfactory neuroepithelium (CNON cells). Here, we tested the hypothesis that variance in gene expression differs between individuals from SCZ and control groups. In CNON cells, variance in gene expression was significantly higher in SCZ samples than in control samples. Variance in gene expression was enriched in five molecular pathways: serine biosynthesis, PI3K-Akt, MAPK, neurotrophin, and focal adhesion. More than 14% of variance in disease status was explained within the logistic regression model (C-value = 0.70) by predictors accounting for gene expression in 69 genes from these five pathways. Structural equation modeling (SEM) was applied to explore how the structure of these five pathways was altered between SCZ patients and controls. Four out of five pathways showed differences in the estimated relationships among genes: between KRAS and NF1, and KRAS and SOS1 in the MAPK pathway; between PSPH and SHMT2 in serine biosynthesis; between AKT3 and TSC2 in the PI3K-Akt signaling pathway; and between CRK and RAPGEF1 in the focal adhesion pathway. Our analysis provides evidence that variance in gene expression is an important characteristic of SCZ, and that SEM is a promising method for uncovering altered relationships between specific genes, thus suggesting affected gene regulation associated with the disease. We identified altered gene-gene interactions in pathways enriched for genes with increased variance in expression in SCZ. These pathways and loci were previously implicated in SCZ, providing further support for the hypothesis that gene expression variance plays an important role in the etiology of SCZ.
Affiliation(s)
- Anna A Igolkina
- Institute of Applied Mathematics and Mechanics, Peter the Great St. Petersburg Polytechnic University, St. Petersburg, Russia
- Chris Armoskus
- Zilkha Neurogenetic Institute, Keck School of Medicine, University of Southern California, Los Angeles, CA, United States
- Jeremy R B Newman
- Department of Molecular Genetics & Microbiology, Genetics Institute, University of Florida, Gainesville, FL, United States
- Oleg V Evgrafov
- Department of Cell Biology, SUNY Downstate Medical Center, Brooklyn, NY, United States
- Lauren M McIntyre
- Department of Molecular Genetics & Microbiology, Genetics Institute, University of Florida, Gainesville, FL, United States
- Sergey V Nuzhdin
- Institute of Applied Mathematics and Mechanics, Peter the Great St. Petersburg Polytechnic University, St. Petersburg, Russia; Molecular and Computation Biology, University of Southern California, Los Angeles, CA, United States
- Maria G Samsonova
- Institute of Applied Mathematics and Mechanics, Peter the Great St. Petersburg Polytechnic University, St. Petersburg, Russia
4
Romdhani H, Hwang H, Paradis G, Roy-Gagnon MH, Labbe A. Pathway-based association study of multiple candidate genes and multiple traits using structural equation models. Genet Epidemiol 2014; 39:101-13. [PMID: 25558046 DOI: 10.1002/gepi.21872] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 06/06/2014] [Revised: 11/05/2014] [Accepted: 11/05/2014] [Indexed: 11/07/2022]
Abstract
There is increasing interest in the joint analysis of multiple genetic variants from multiple genes and multiple correlated quantitative traits in association studies. The classical approach involves testing univariate associations between genotypes and phenotypes and then correcting for multiple testing, which results in a loss of power to detect associations. In this paper, we propose modeling complex relationships between genetic variants in candidate genes and measured correlated traits using structural equation models (SEM), taking advantage of prior knowledge on clinical and genetic pathways. We adopt generalized structured component analysis (GSCA) as an approach to SEM and develop a single association test between multiple genetic variants in a gene and a set of correlated traits, taking into account all available data from other genes and other traits. The performance of this test is investigated by simulations. We apply the proposed method to the Quebec Child and Adolescent Health and Social Survey (1999) data to investigate genetic associations with cardiovascular disease-related traits.
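The loss of power from multiple-testing correction in the classical univariate approach can be made concrete with a short arithmetic sketch. It assumes a Bonferroni correction, which the abstract does not name and is used here only as one common choice, and it assumes one test per variant-trait pair. The function name `bonferroni_threshold` and the counts are hypothetical.

```python
def bonferroni_threshold(alpha, n_variants, n_traits):
    """Per-test significance threshold when every variant is tested against every trait.

    Assumes a Bonferroni correction (an illustrative choice, not taken from the paper).
    """
    n_tests = n_variants * n_traits  # one univariate test per variant-trait pair
    return alpha / n_tests, n_tests


# Hypothetical study size: 50 variants in candidate genes, 4 correlated traits.
threshold, n_tests = bonferroni_threshold(0.05, 50, 4)
print(f"{n_tests} tests, per-test threshold {threshold:.5f}")
```

The per-test threshold shrinks in proportion to the number of variant-trait pairs, which is the cost that a single joint test per gene is meant to avoid.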
Affiliation(s)
- Hela Romdhani
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, Quebec, Canada
5
Tintle N, Aschard H, Hu I, Nock N, Wang H, Pugh E. Inflated type I error rates when using aggregation methods to analyze rare variants in the 1000 Genomes Project exon sequencing data in unrelated individuals: summary results from Group 7 at Genetic Analysis Workshop 17. Genet Epidemiol 2011; 35 Suppl 1:S56-60. [PMID: 22128060 PMCID: PMC3249221 DOI: 10.1002/gepi.20650] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Indexed: 01/24/2023]
Abstract
As part of Genetic Analysis Workshop 17 (GAW17), our group considered the application of novel and standard approaches to the analysis of genotype-phenotype association in next-generation sequencing data. Our group identified a major issue in the analysis of the GAW17 next-generation sequencing data: type I error and false-positive report probability rates higher than those expected based on empirical type I error levels (as high as 90%). Two main causes emerged: population stratification and long-range correlation (gametic phase disequilibrium) between rare variants. Population stratification was expected because of the diverse sample. Correlation between rare variants was attributable to both random causes (e.g., nearly 10,000 of 25,000 markers were private variants, and the sample size was small [n = 697]) and nonrandom causes (more correlation was observed than was expected by random chance). Principal components analysis was used to control for population structure and helped to minimize type I errors, but this was at the expense of identifying fewer causal variants. A novel multiple regression approach showed promise to handle correlation between markers. Further work is needed, first, to identify best practices for the control of type I errors in the analysis of sequencing data and then to explore and compare the many promising new aggregating approaches for identifying markers associated with disease phenotypes.
Affiliation(s)
- Nathan Tintle
- Department of Mathematics, Statistics, and Computer Science, Dordt College, Sioux Center, IA 51250, USA.