1
|
A powerful method for pleiotropic analysis under composite null hypothesis identifies novel shared loci between Type 2 Diabetes and Prostate Cancer. PLoS Genet 2020; 16:e1009218. [PMID: 33290408 PMCID: PMC7748289 DOI: 10.1371/journal.pgen.1009218] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 04/18/2020] [Revised: 12/18/2020] [Accepted: 10/22/2020] [Indexed: 12/24/2022]
Abstract
There is increasing evidence that pleiotropy, the association of multiple traits with the same genetic variants/loci, is a very common phenomenon. Cross-phenotype association tests are often used to jointly analyze multiple traits from a genome-wide association study (GWAS). The underlying methods, however, are often designed to test the global null hypothesis that there is no association of a genetic variant with any of the traits, the rejection of which does not implicate pleiotropy. In this article, we propose a new statistical approach, PLACO, for specifically detecting pleiotropic loci between two traits by considering an underlying composite null hypothesis that a variant is associated with none or only one of the traits. We propose testing the null hypothesis based on the product of the Z-statistics of the genetic variants across two studies and derive a null distribution of the test statistic in the form of a mixture distribution that allows for fractions of variants to be associated with none or only one of the traits. We borrow approaches from the statistical literature on mediation analysis that allow asymptotic approximation of the null distribution avoiding estimation of nuisance parameters related to mixture proportions and variance components. Simulation studies demonstrate that the proposed method can maintain type I error and can achieve major power gain over alternative simpler methods that are typically used for testing pleiotropy. PLACO allows correlation in summary statistics between studies that may arise due to sharing of controls between disease traits. Application of PLACO to publicly available summary data from two large case-control GWAS of Type 2 Diabetes and of Prostate Cancer implicated a number of novel shared genetic regions: 3q23 (ZBTB38), 6q25.3 (RGS17), 9p22.1 (HAUS6), 9p13.3 (UBAP2), 11p11.2 (RAPSN), 14q12 (AKAP6), 15q15 (KNL1) and 18q23 (ZNF236). 
We propose a new approach, PLACO, that uses aggregate-level genotype-phenotype association statistics (commonly referred to as GWAS summary statistics) to identify genetic variants that influence risk of two traits or diseases. It allows correlation in summary statistics between studies that may arise due to sharing of controls between disease traits. We demonstrate that PLACO can achieve major power gain over alternative methods that are typically used. We applied PLACO to Type 2 Diabetes and Prostate Cancer summary data from two large case-control studies. Many previous studies have reported an inverse association between these two chronic diseases, suggesting shared risk factors; however, the shared genetic mechanisms underlying this association are poorly understood. PLACO identified a number of novel shared genetic regions that are not detected by individual trait analysis. Many of the loci implicated by PLACO increase risk for one disease while decreasing risk for the other. PLACO can similarly be used on other traits to shed light on shared genetic risk factors.
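The product-of-Z-statistics idea in the abstract above can be illustrated with a minimal sketch. It is not the PLACO implementation: it covers only the simplest component of the composite null (the variant is associated with neither trait), and it assumes the two studies are independent with standard normal Z-statistics. PLACO's mixture null, its variance components, and its handling of correlated summary statistics are not reproduced here. The function names `product_normal_two_sided_tail` and `product_z_pvalue` are hypothetical.

```python
import math


def normal_pdf(z):
    """Density of the standard normal distribution at z."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def product_normal_two_sided_tail(t, upper=12.0, steps=20000):
    """Approximate P(|Z1 * Z2| >= |t|) for independent standard normals.

    Conditioning on Z1 = z gives P(|Z2| >= |t| / |z|) = erfc(|t| / (|z| * sqrt(2))),
    so the tail equals 2 * integral_0^inf pdf(z) * erfc(|t| / (z * sqrt(2))) dz.
    The integral is approximated with a midpoint rule on (0, upper].
    """
    t = abs(t)
    if t == 0.0:
        return 1.0
    h = upper / steps
    total = 0.0
    for i in range(steps):
        z = (i + 0.5) * h  # midpoint, always strictly positive
        total += normal_pdf(z) * math.erfc(t / (z * math.sqrt(2.0)))
    return min(1.0, 2.0 * total * h)


def product_z_pvalue(z1, z2):
    """Two-sided p-value for the product statistic z1 * z2.

    Assumption: valid only under the global null with independent studies;
    this is one ingredient of a composite-null test, not the full test.
    """
    return product_normal_two_sided_tail(z1 * z2)


if __name__ == "__main__":
    # A variant with strong signals in both studies yields a small p-value;
    # a variant with a signal in only one study yields a much larger one.
    print(product_z_pvalue(4.0, 4.0))
    print(product_z_pvalue(4.0, 0.2))
```

The product statistic is large only when both Z-statistics are far from zero, which is why it targets association with both traits rather than with at least one. The composite null in the abstract additionally requires the null distribution to account for variants associated with exactly one trait, which this sketch omits.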
|
2
|
Ray D, Boehnke M. Methods for meta-analysis of multiple traits using GWAS summary statistics. Genet Epidemiol 2017; 42:134-145. [PMID: 29226385 DOI: 10.1002/gepi.22105] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Received: 07/06/2017] [Revised: 10/27/2017] [Accepted: 11/08/2017] [Indexed: 12/21/2022]
Abstract
Genome-wide association studies (GWAS) for complex diseases have focused primarily on single-trait analyses for disease status and disease-related quantitative traits. For example, GWAS on risk factors for coronary artery disease analyze genetic associations of plasma lipids such as total cholesterol, LDL-cholesterol, HDL-cholesterol, and triglycerides (TGs) separately. However, traits are often correlated and a joint analysis may yield increased statistical power for association over multiple univariate analyses. Recently several multivariate methods have been proposed that require individual-level data. Here, we develop metaUSAT (where USAT is unified score-based association test), a novel unified association test of a single genetic variant with multiple traits that uses only summary statistics from existing GWAS. Although the existing methods either perform well when most correlated traits are affected by the genetic variant in the same direction or are powerful when only a few of the correlated traits are associated, metaUSAT is designed to be robust to the association structure of correlated traits. metaUSAT does not require individual-level data and can test genetic associations of categorical and/or continuous traits. One can also use metaUSAT to analyze a single trait over multiple studies, appropriately accounting for overlapping samples, if any. metaUSAT provides an approximate asymptotic P-value for association and is computationally efficient for implementation at a genome-wide level. Simulation experiments show that metaUSAT maintains proper type-I error at low error levels. It has similar and sometimes greater power to detect association across a wide array of scenarios compared to existing methods, which are usually powerful for some specific association scenarios only. 
When applied to plasma lipids summary data from the METSIM and the T2D-GENES studies, metaUSAT detected genome-wide significant loci beyond the ones identified by univariate analyses. Evidence from larger studies suggests that the variants additionally detected by our test are, indeed, associated with lipid levels in humans. In summary, metaUSAT can provide novel insights into the genetic architecture of common diseases or traits.
Affiliation(s)
- Debashree Ray
  - Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan, United States of America
- Michael Boehnke
  - Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan, United States of America
|
3
|
Ray D, Basu S. A novel association test for multiple secondary phenotypes from a case-control GWAS. Genet Epidemiol 2017; 41:413-426. [PMID: 28393390 DOI: 10.1002/gepi.22045] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 10/27/2016] [Revised: 12/22/2016] [Accepted: 02/05/2017] [Indexed: 12/13/2022]
Abstract
In the past decade, many genome-wide association studies (GWASs) have been conducted to explore association of single nucleotide polymorphisms (SNPs) with complex diseases using a case-control design. These GWASs not only collect information on the disease status (primary phenotype, D) and the SNPs (genotypes, X), but also collect extensive data on several risk factors and traits. Recent literature and grant proposals point toward a trend in reusing existing large case-control data for exploring genetic associations of some additional traits (secondary phenotypes, Y) collected during the study. These secondary phenotypes may be correlated, and a proper analysis warrants a multivariate approach. Commonly used multivariate methods are not equipped to properly account for the non-random sampling scheme. Current ad hoc practices include analyses without any adjustment, and analyses with D adjusted as a covariate. Our theoretical and empirical studies suggest that the type I error for testing genetic association of secondary traits can be substantial when X as well as Y are associated with D, even when there is no association between X and Y in the underlying (target) population. Whether using D as a covariate helps maintain type I error depends heavily on the disease mechanism and the underlying causal structure (which is often unknown). To avoid grossly incorrect inference, we propose a proportional odds model adjusted for propensity score (POM-PS). It uses a proportional odds logistic regression of X on Y and includes the estimated conditional probability of being diseased as a covariate. We demonstrate the validity and advantage of POM-PS, and compare it to some existing methods in extensive simulation experiments mimicking plausible scenarios of dependency among Y, X, and D. Finally, we use POM-PS to jointly analyze four adiposity traits using a type 2 diabetes (T2D) case-control sample from the population-based Metabolic Syndrome in Men (METSIM) study. Only POM-PS analysis of the T2D case-control sample seems to provide valid association signals.
Affiliation(s)
- Debashree Ray
  - Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, Michigan, United States of America
- Saonli Basu
  - Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, United States of America
|
4
|
O'Brien JA, Vega A, Bouguyon E, Krouk G, Gojon A, Coruzzi G, Gutiérrez RA. Nitrate Transport, Sensing, and Responses in Plants. MOLECULAR PLANT 2016; 9:837-56. [PMID: 27212387 DOI: 10.1016/j.molp.2016.05.004] [Citation(s) in RCA: 272] [Impact Index Per Article: 34.0] [Received: 04/01/2016] [Revised: 05/16/2016] [Accepted: 05/16/2016] [Indexed: 05/20/2023]
Abstract
Nitrogen (N) is an essential macronutrient that affects plant growth and development. N is an important component of chlorophyll, amino acids, nucleic acids, and secondary metabolites. Nitrate is one of the most abundant N sources in the soil. Because nitrate and other N nutrients are often limiting, plants have developed sophisticated mechanisms to ensure adequate supply of nutrients in a variable environment. Nitrate is absorbed in the root and mobilized to other organs by nitrate transporters. Nitrate sensing activates signaling pathways that impinge upon molecular, metabolic, physiological, and developmental responses locally and at the whole plant level. With the advent of genomics technologies and genetic tools, important advances in our understanding of nitrate and other N nutrient responses have been achieved in the past decade. Furthermore, techniques that take advantage of natural polymorphisms present in divergent individuals from a single species have been essential in uncovering new components. However, there are still gaps in our understanding of how nitrate signaling affects biological processes in plants. Moreover, we still lack an integrated view of how all the regulatory factors identified interact or crosstalk to orchestrate the myriad N responses plants typically exhibit. In this review, we provide an updated overview of mechanisms by which nitrate is sensed and transported throughout the plant. We discuss signaling components and how nitrate sensing crosstalks with hormonal pathways for developmental responses locally and globally in the plant. Understanding how nitrate affects plant metabolism, physiology, growth, and development is key to improving crops for sustainable agriculture.
Affiliation(s)
- José A O'Brien
  - Departamento de Genética Molecular y Microbiología, FONDAP Center for Genome Regulation, Millennium Nucleus Center for Plant Systems and Synthetic Biology, Pontificia Universidad Católica de Chile, 8331150, Chile; Departamento de Fruticultura y Enología, Pontificia Universidad Católica de Chile, Santiago, 7820436, Chile
- Andrea Vega
  - Departamento de Ciencias Vegetales, Pontificia Universidad Católica de Chile, Santiago, 7820436, Chile
- Eléonore Bouguyon
  - Department of Biology, Center for Genomics and Systems Biology, New York University, New York, NY 10003, USA; Laboratoire de Biochimie et Physiologie Moléculaire des Plantes, Institut de Biologie Intégrative des Plantes 'Claude Grignon', UMR CNRS, INRA, SupAgro, UM, 2 Place Viala, 34060 Montpellier Cedex, France
- Gabriel Krouk
  - Laboratoire de Biochimie et Physiologie Moléculaire des Plantes, Institut de Biologie Intégrative des Plantes 'Claude Grignon', UMR CNRS, INRA, SupAgro, UM, 2 Place Viala, 34060 Montpellier Cedex, France
- Alain Gojon
  - Laboratoire de Biochimie et Physiologie Moléculaire des Plantes, Institut de Biologie Intégrative des Plantes 'Claude Grignon', UMR CNRS, INRA, SupAgro, UM, 2 Place Viala, 34060 Montpellier Cedex, France
- Gloria Coruzzi
  - Department of Biology, Center for Genomics and Systems Biology, New York University, New York, NY 10003, USA
- Rodrigo A Gutiérrez
  - Departamento de Genética Molecular y Microbiología, FONDAP Center for Genome Regulation, Millennium Nucleus Center for Plant Systems and Synthetic Biology, Pontificia Universidad Católica de Chile, 8331150, Chile
|