Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Pahl R, Schäfer H. PERMORY: an LD-exploiting permutation test algorithm for powerful genome-wide association testing. Bioinformatics 2010;26:2093-100. [PMID: 20605926 DOI: 10.1093/bioinformatics/btq399] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Indexed: 11/14/2022]

Cited by Other Article(s)

Duangjan C, Arpawong TE, Spatola BN, Curran SP. Hepatic WDR23 proteostasis mediates insulin homeostasis by regulating insulin-degrading enzyme capacity. GeroScience 2024:10.1007/s11357-024-01196-y. [PMID: 38767782 DOI: 10.1007/s11357-024-01196-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Accepted: 05/08/2024] [Indexed: 05/22/2024]

Powell NR, Geck RC, Lai D, Shugg T, Skaar TC, Dunham M. Functional Analysis of G6PD Variants Associated With Low G6PD Activity in the All of Us Research Program. medRxiv 2024:2024.04.12.24305393. [PMID: 38645242 PMCID: PMC11030488 DOI: 10.1101/2024.04.12.24305393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]

Abstract

Glucose-6-phosphate dehydrogenase (G6PD) protects red blood cells against oxidative damage through regeneration of NADPH. Individuals with G6PD polymorphisms (variants) that produce an impaired G6PD enzyme are usually asymptomatic, but at risk of hemolytic anemia from oxidative stressors, including certain drugs and foods. Prevention of G6PD deficiency-related hemolytic anemia is achievable through G6PD genetic testing or whole-genome sequencing (WGS) to identify affected individuals who should avoid hemolytic triggers. However, accurately predicting the clinical consequence of G6PD variants is limited by over 800 G6PD variants which remain of uncertain significance. There also remains significant variability in which deficiency-causing variants are included in pharmacogenomic testing arrays across institutions: many panels only include c.202G>A, even though dozens of other variants can also cause G6PD deficiency. Here, we seek to improve G6PD genotype interpretation using data available in the All of Us Research Program and using a yeast functional assay. We confirm that G6PD coding variants are the main contributor to decreased G6PD activity, and that 13% of individuals in the All of Us data with deficiency-causing variants would be missed if only the c.202G>A variant were tested for. We expand clinical interpretation for G6PD variants of uncertain significance; reporting that c.595A>G, known as G6PD Dagua or G6PD Açores, and the newly identified variant c.430C>G, reduce activity sufficiently to lead to G6PD deficiency. We also provide evidence that five missense variants of uncertain significance are unlikely to lead to G6PD deficiency, since they were seen in hemi- or homozygous individuals without a reduction in G6PD activity. We also applied the new WHO guidelines and were able to classify two synonymous variants as WHO class C. 
We anticipate these results will improve the accuracy, and prompt increased use, of G6PD genetic tests through a more complete clinical interpretation of G6PD variants. As the All of Us data increases from 245,000 to 1 million participants, and additional functional assays are carried out, we expect this research to serve as a template to enable complete characterization of G6PD deficiency genotypes. With an increased number of interpreted variants, genetic testing of G6PD will be more informative for preemptively identifying individuals at risk for drug- or food-induced hemolytic anemia.

Shi Y, Shi W, Wang M, Lee JH, Kang H, Jiang H. Accurate and fast small p-value estimation for permutation tests in high-throughput genomic data analysis with the cross-entropy method. Stat Appl Genet Mol Biol 2023;22:sagmb-2021-0067. [PMID: 37622330 DOI: 10.1515/sagmb-2021-0067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2021] [Accepted: 06/23/2023] [Indexed: 08/26/2023]

Villa O, Stuhr NL, Yen CA, Crimmins EM, Arpawong TE, Curran SP. Genetic variation in ALDH4A1 is associated with muscle health over the lifespan and across species. eLife 2022;11:74308. [PMID: 35470798 PMCID: PMC9106327 DOI: 10.7554/elife.74308] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 09/29/2021] [Accepted: 04/13/2022] [Indexed: 11/13/2022]

Abstract

The influence of genetic variation on the aging process, including the incidence and severity of age-related diseases, is complex. Here, we define the evolutionarily conserved mitochondrial enzyme ALH-6/ALDH4A1 as a predictive biomarker for age-related changes in muscle health by combining Caenorhabditis elegans genetics and a gene-wide association scanning (GeneWAS) from older human participants of the US Health and Retirement Study (HRS). In a screen for mutations that activate oxidative stress responses, specifically in the muscle of C. elegans, we identified 96 independent genetic mutants harboring loss-of-function alleles of alh-6, exclusively. Each of these genetic mutations mapped to the ALH-6 polypeptide and led to the age-dependent loss of muscle health. Intriguingly, genetic variants in ALDH4A1 show associations with age-related muscle-related function in humans. Taken together, our work uncovers mitochondrial alh-6/ALDH4A1 as a critical component to impact normal muscle aging across species and a predictive biomarker for muscle health over the lifespan.

Ageing is inevitable, but what makes one person ‘age well’ and another decline more quickly remains largely unknown. While many aspects of ageing are clearly linked to genetics, the specific genes involved often remain unidentified.

Sarcopenia is an age-related condition affecting the muscles. It involves a gradual loss of muscle mass that becomes faster with age, and is associated with loss of mobility, decreased quality of life, and increased risk of death. Around half of all people aged 80 and over suffer from sarcopenia. Several lifestyle factors, especially poor diet and lack of exercise, are associated with the condition, but genetics is also involved: the condition accelerates more quickly in some people than others, and even fit, physically active individuals can be affected.

To study the genetics of conditions like sarcopenia, researchers often use animals like flies or worms, which have short generation times but share genetic similarities with humans. For example, the worm Caenorhabditis elegans has equivalents of several human muscle genes, including the gene alh-6. In worms, alh-6 is important for maintaining energy supply to the muscles, and mutating it not only leads to muscle damage but also to premature ageing. Given this insight, Villa, Stuhr, Yen et al. wanted to determine if variation in the human version of alh-6, ALDH4A1, also contributes to individual differences in muscle ageing and decline in humans.

Evaluating variation in this gene required a large amount of genetic data from older adults. These were taken from a continuous study that follows >35,000 older adults. Importantly, the study collects not only information on gene sequences but also measures of muscle health and performance over time for each individual. Analysis of these genetic data revealed specific small variations in the DNA of ALDH4A1, all of which associated with reduced muscle health.

Follow-up experiments in worms used genetic engineering techniques to test how variation in the worm alh-6 gene could influence age-related health. The resulting mutant worms developed muscle problems much earlier than their normal counterparts, supporting the role of alh-6/ALDH4A1 in determining muscle health across the lifespan of both worms and humans.

These results have identified a key influencer of muscle health during ageing in worms, and emphasize the importance of validating effects of genetic variation among humans during this process. Villa, Stuhr, Yen et al. hope that this study will help researchers find more genetic ‘markers’ of muscle health, and ultimately allow us to predict an individual’s risk of sarcopenia based on their genetic make-up.

Asif H, Alliey-Rodriguez N, Keedy S, Tamminga CA, Sweeney JA, Pearlson G, Clementz BA, Keshavan MS, Buckley P, Liu C, Neale B, Gershon ES. GWAS significance thresholds for deep phenotyping studies can depend upon minor allele frequencies and sample size. Mol Psychiatry 2021;26:2048-2055. [PMID: 32066829 PMCID: PMC7429341 DOI: 10.1038/s41380-020-0670-3] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 05/28/2019] [Revised: 01/28/2020] [Accepted: 01/29/2020] [Indexed: 02/01/2023]

Liu L, He J, Lu X, Yuan Y, Jiang D, Xiao H, Lin S, Xu L, Chen Y. Association of Myopia and Genetic Variants of TGFB2-AS1 and TGFBR1 in the TGF-β Signaling Pathway: A Longitudinal Study in Chinese School-Aged Children. Front Cell Dev Biol 2021;9:628182. [PMID: 33996791 PMCID: PMC8115727 DOI: 10.3389/fcell.2021.628182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2020] [Accepted: 04/06/2021] [Indexed: 11/17/2022]

Abstract

Background

Myopia is a complex multifactorial condition which involves several overlapping signaling pathways mediated by distinct genes. This prospective cohort study evaluated the associations of two genetic variants in the TGF-β signaling pathway with the onset and progression of myopia and ocular biometric parameters in Chinese school-aged children.

Methods

A total of 556 second grade children were examined and followed up for 3.5 years. Non-cycloplegic refraction and ocular biometric parameters were measured annually. Multivariate regression analysis was used to assess the effect of the TGFBR1 rs10760673 and TGFB2-AS1 rs7550232 variants on the occurrence and progression of myopia. A 10,000 permutations test was used to correct for multiple testing. Functional annotation of single nucleotide polymorphisms (SNPs) was performed using RegulomeDB, HaploReg, and rVarBase.

Results

A total of 448 children were included in the analysis. After adjustments for gender, age, near work time and outdoor time with 10,000 permutations, the results indicated that the C allele and the AC or CC genotypes of rs7550232 adjacent to TGFB2-AS1 were associated with a significantly increased risk of the onset of myopia in two genetic models (additive: P’ = 0.022; dominant: P’ = 0.025). Additionally, the A allele and the AA or AG genotypes of rs10760673 of TGFBR1 were associated with a significant myopic shift (additive: P’ = 0.008; dominant: P’ = 0.028; recessive: P’ = 0.027). Furthermore, rs10760673 was associated with an increase in axial length (AL) (P’ = 0.013, β = 0.03) and a change in the ratio of AL to the corneal radius of curvature (AL/CRC) (P’ = 0.031, β = 0.003). Analysis using RegulomeDB, HaploReg, and rVarBase indicated that rs7550232 is likely to affect transcription factor binding, any motif, DNase footprint, and DNase peak.

Conclusion

The present study indicated that rs10760673 and rs7550232 may represent susceptibility loci for the progression and onset of myopia, respectively, in school-aged children. Associations of the variants of the TGFBR1 and TGFB2-AS1 genes with myopia may be mediated by the TGF-β signaling pathway; this hypothesis requires validation in functional studies. This trial was registered as ChiCTR1900020584 at www.Chictr.org.cn.

Kunert-Graf JM, Sakhanenko NA, Galas DJ. Optimized permutation testing for information theoretic measures of multi-gene interactions. BMC Bioinformatics 2021;22:180. [PMID: 33827420 PMCID: PMC8028212 DOI: 10.1186/s12859-021-04107-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/01/2020] [Accepted: 03/29/2021] [Indexed: 11/17/2022]

Abstract

Background

Permutation testing is often considered the “gold standard” for multi-test significance analysis, as it is an exact test requiring few assumptions about the distribution being computed. However, it can be computationally very expensive, particularly in its naive form in which the full analysis pipeline is re-run after permuting the phenotype labels. This can become intractable in multi-locus genome-wide association studies (GWAS), in which the number of potential interactions to be tested is combinatorially large.

Results

In this paper, we develop an approach for permutation testing in multi-locus GWAS, specifically focusing on SNP–SNP-phenotype interactions using multivariable measures that can be computed from frequency count tables, such as those based in Information Theory. We find that the computational bottleneck in this process is the construction of the count tables themselves, and that this step can be eliminated at each iteration of the permutation testing by transforming the count tables directly. This leads to a speed-up by a factor of over 10³ for a typical permutation test compared to the naive approach. Additionally, this approach is insensitive to the number of samples making it suitable for datasets with large number of samples.

Conclusions

The proliferation of large-scale datasets with genotype data for hundreds of thousands of individuals enables new and more powerful approaches for the detection of multi-locus genotype-phenotype interactions. Our approach significantly improves the computational tractability of permutation testing for these studies. Moreover, our approach is insensitive to the large number of samples in these modern datasets. The code for performing these computations and replicating the figures in this paper is freely available at https://github.com/kunert/permute-counts.
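
The naive baseline mentioned in the Background, in which the statistic is recomputed from scratch after every shuffle of the phenotype labels, might be sketched as follows. This is not the authors' optimized count-table transformation. The function names, the choice of mutual information as the count-table statistic, and the add-one p-value estimate are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) computed from the joint count table."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # the count table, rebuilt on every call
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def naive_permutation_pvalue(genotypes, phenotypes, n_perm=1000, seed=0):
    """Shuffle the phenotype labels and recompute the statistic each time."""
    rng = random.Random(seed)
    observed = mutual_information(genotypes, phenotypes)
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if mutual_information(genotypes, shuffled) >= observed:
            hits += 1
    # Add-one estimate so the reported p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)
```

In this sketch the count table is rebuilt inside every call to `mutual_information`, which is the step the abstract identifies as the bottleneck.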

Hao Z, Jiang L, Gao J, Ye J, Zhao J, Li S, Yang R. Quick approximation of threshold values for genome-wide association studies. Brief Bioinform 2020;20:2217-2223. [PMID: 30219836 DOI: 10.1093/bib/bby082] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/08/2018] [Revised: 08/10/2018] [Accepted: 08/14/2018] [Indexed: 11/13/2022]

Leem S, Huh I, Park T. Enhanced Permutation Tests via Multiple Pruning. Front Genet 2020;11:509. [PMID: 32670346 PMCID: PMC7330123 DOI: 10.3389/fgene.2020.00509] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/24/2019] [Accepted: 04/27/2020] [Indexed: 11/25/2022]

Abstract

Big multi-omics data in bioinformatics often consists of a huge number of features and relatively small numbers of samples. In addition, features from multi-omics data have their own specific characteristics depending on whether they are from genomics, proteomics, metabolomics, etc. Due to these distinct characteristics, standard statistical analyses using parametric-based assumptions may sometimes fail to provide exact asymptotic results. To resolve this issue, permutation tests can be a way to exactly analyze multi-omics data because they are distribution-free and flexible to use. In permutation tests, p-values are evaluated by estimating the locations of test statistics in an empirical null distribution generated by random shuffling. However, the permutation approach can be infeasible when the number of features increases, because more stringent control of type I error is needed for multiple hypothesis testing, and consequently, much larger numbers of permutations are required to reach significance. To address this problem, we propose a well-organized strategy, “ENhanced Permutation tests via multiple Pruning (ENPP).” ENPP prunes the features in every permutation round if they are determined to be non-significant. In other words, if the feature statistics from the permuted datasets exceed the feature statistics from the original dataset, beyond a predetermined threshold, the feature is determined to be non-significant. If so, ENPP removes the feature and iterates the process without the feature in the next permutation round. Our simulation study showed that the ENPP method could remove about 50% of the features at the first permutation round, and, by the 100th permutation round, 98% of the features had been removed and only 7.4% of the computation time with the original unpruned permutation approach had elapsed. 
In addition, we applied this approach to a real data set (Korea Association REsource: KARE) of 327,872 SNPs to find association with a non-normally distributed phenotype (fasting plasma glucose), interpreted the results, and discussed the feasibility and advantages of the approach.
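
The pruning idea described in this abstract can be illustrated with a short sketch. It is a simplified reading, not the ENPP implementation: the statistic (absolute difference in group means), the function names, and the interpretation of the "predetermined threshold" as a maximum number of exceedances (`max_exceed`) are all assumptions.

```python
import random


def mean_diff(values, labels):
    """Absolute difference between the mean of group 1 and the mean of group 0."""
    g1 = [v for v, l in zip(values, labels) if l == 1]
    g0 = [v for v, l in zip(values, labels) if l == 0]
    return abs(sum(g1) / len(g1) - sum(g0) / len(g0))


def pruned_permutation_test(features, labels, n_perm=200, max_exceed=10, seed=0):
    """Permutation test that stops evaluating features judged non-significant.

    A feature is pruned once its permuted statistic has matched or exceeded
    the observed statistic more than `max_exceed` times (hypothetical rule).
    Returns the indices of surviving features and the exceedance counts.
    """
    rng = random.Random(seed)
    observed = [mean_diff(f, labels) for f in features]
    exceed = [0] * len(features)
    active = set(range(len(features)))
    perm = list(labels)
    for _ in range(n_perm):
        if not active:
            break                      # nothing left to test
        rng.shuffle(perm)              # one shared label shuffle per round
        for j in list(active):
            if mean_diff(features[j], perm) >= observed[j]:
                exceed[j] += 1
                if exceed[j] > max_exceed:
                    active.discard(j)  # skip this feature in later rounds
    return active, exceed
```

Because pruned features are skipped in all later rounds, the cost of each round shrinks as the run proceeds, which is the source of the time savings the abstract reports.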

George AW, Verbyla A, Bowden J. Eagle: multi-locus association mapping on a genome-wide scale made routine. Bioinformatics 2020;36:1509-1516. [PMID: 31596455 DOI: 10.1093/bioinformatics/btz759] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/07/2019] [Revised: 08/19/2019] [Accepted: 10/02/2019] [Indexed: 12/24/2022]

Rojano E, Seoane P, Ranea JAG, Perkins JR. Regulatory variants: from detection to predicting impact. Brief Bioinform 2019;20:1639-1654. [PMID: 29893792 PMCID: PMC6917219 DOI: 10.1093/bib/bby039] [Citation(s) in RCA: 65] [Impact Index Per Article: 13.0] [Received: 03/13/2018] [Revised: 04/18/2018] [Indexed: 02/01/2023]

Abstract

Variants within non-coding genomic regions can greatly affect disease. In recent years, increasing focus has been given to these variants, and how they can alter regulatory elements, such as enhancers, transcription factor binding sites and DNA methylation regions. Such variants can be considered regulatory variants. Concurrently, much effort has been put into establishing international consortia to undertake large projects aimed at discovering regulatory elements in different tissues, cell lines and organisms, and probing the effects of genetic variants on regulation by measuring gene expression. Here, we describe methods and techniques for discovering disease-associated non-coding variants using sequencing technologies. We then explain the computational procedures that can be used for annotating these variants using the information from the aforementioned projects, and prediction of their putative effects, including potential pathogenicity, based on rule-based and machine learning approaches. We provide the details of techniques to validate these predictions, by mapping chromatin-chromatin and chromatin-protein interactions, and introduce Clustered Regularly Interspaced Short Palindromic Repeats-Associated Protein 9 (CRISPR-Cas9) technology, which has already been used in this field and is likely to have a big impact on its future evolution. We also give examples of regulatory variants associated with multiple complex diseases. This review is aimed at bioinformaticians interested in the characterization of regulatory variants, molecular biologists and geneticists interested in understanding more about the nature and potential role of such variants from a functional point of view, and clinicians who may wish to learn about variants in non-coding genomic regions associated with a given disease and find out what to do next to uncover how they impact on the underlying mechanisms.

Zhang M, Wang J, Wang Y, Wu S, Sandford AJ, Luo J, He JQ. Association of the TLR1 variant rs5743557 with susceptibility to tuberculosis. J Thorac Dis 2019;11:583-594. [PMID: 30963003 DOI: 10.21037/jtd.2019.01.74] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 02/05/2023]

Abstract

Background

Toll-like receptor 1 (TLR1) and TLR6 play important roles in the innate immune response against Mycobacterium tuberculosis (M.TB) via interactions with TIR domain-containing adaptor protein (TIRAP) and myeloid differentiation primary response 88 (MYD88). The aim of this study was to investigate the relationship of TLR1, TLR6, MYD88 and TIRAP polymorphisms with susceptibility to latent tuberculosis infection (LTBI) and tuberculosis (TB).

Methods

In total, 204 uninfected healthy controls (HC), 201 individuals with LTBI and 209 TB patients were enrolled. Two interferon-γ release assays were used to differentiate individuals with LTBI from uninfected controls. TagSNPs of the four genes were genotyped by the SNPscan™ Kit. The Haploview 4.2 and SHEsis software packages were combined to perform linkage disequilibrium (LD) and haplotype analyses. Multifactor dimensionality reduction (MDR) software was used to investigate gene-gene interaction. The Stata 12.0 software was used to perform meta-analysis of the relationship between rs5743557 and TB susceptibility.

Results

The AA genotype of rs5743557 was associated with reduced TB risk (P=0.006) and the AA/GA genotypes of TLR1 rs5743604 were associated with increased TB risk (P=0.017) when the LTBI group was compared with the TB group. The frequency of TLR1 haplotype rs4833095-rs5743604 CG was significantly higher in the LTBI group than in the TB group (P=0.019877). However, only the relationship between rs5743557 and TB susceptibility remained significant after 1000-fold permutation testing (P=0.023). The meta-analysis suggested that rs5743557_A was associated with decreased TB risk in the Chinese adult population (P<0.001, OR 0.80, 95% CI: 0.72-0.88). No significant gene-gene interactions were found.

Conclusions

The results of our study suggest that the tagSNP rs5743557 of TLR1 is associated with the risk of TB.

Yu F, Liang K, Zhang Z, Du D, Zhang X, Zhao H, Ui Haq B, Qiu F. Dissecting the genetic architecture of waterlogging stress-related traits uncovers a key waterlogging tolerance gene in maize. Theor Appl Genet 2018;131:2299-2310. [PMID: 30062652 DOI: 10.1007/s00122-018-3152-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 04/16/2018] [Accepted: 07/23/2018] [Indexed: 06/08/2023]

Al-Rawi MS, Freitas A, Duarte JV, Cunha JP, Castelo-Branco M. Permutations of functional magnetic resonance imaging classification may not be normally distributed. Stat Methods Med Res 2017;26:2567-2585. [PMID: 29251253 DOI: 10.1177/0962280215601707] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/15/2022]

Abstract

A fundamental question that often occurs in statistical tests is the normality of distributions. Countless distributions exist in science and life, but one distribution that is obtained via permutations, usually referred to as permutation distribution, is interesting. Although a permutation distribution should behave in accord with the central limit theorem, if both the independence condition and the identical distribution condition are fulfilled, no studies have corroborated this concurrence in functional magnetic resonance imaging data. In this work, we used Anderson-Darling test to evaluate the accordance level of permutation distributions of classification accuracies to normality expected under central limit theorem. A simulation study has been carried out using functional magnetic resonance imaging data collected, while human subjects responded to visual stimulation paradigms. Two scrambling schemes are evaluated: the first based on permuting both the training and the testing sets and the second on permuting only the testing set. The results showed that, while a normal distribution does not adequately fit to permutation distributions most of the times, it tends to be quite well acceptable when mean classification accuracies averaged over a set of different classifiers is considered. The results also showed that permutation distributions can be probabilistically affected by performing motion correction to functional magnetic resonance imaging data, and thus may weaken the approximation of permutation distributions to a normal law. Such findings, however, have no relation to univariate/univoxel analysis of functional magnetic resonance imaging data. Overall, the results revealed a strong dependence across the folds of cross-validation and across functional magnetic resonance imaging runs and that may hinder the reliability of using cross-validation. 
The obtained p-values and the drawn confidence level intervals exhibited beyond doubt that different permutation schemes may beget different permutation distributions as well as different levels of accord with central limit theorem. We also found that different permutation schemes can lead to different permutation distributions and that may lead to different assessment of the statistical significance of classification accuracy.

Segal BD, Braun T, Elliott MR, Jiang H. Fast approximation of small p-values in permutation tests by partitioning the permutations. Biometrics 2017. [PMID: 29542118 DOI: 10.1111/biom.12731] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 11/26/2022]

Lakhal-Chaieb L, Oualkacha K, Richards BJ, Greenwood CM. A rare variant association test in family-based designs and non-normal quantitative traits. Stat Med 2015;35:905-21. [DOI: 10.1002/sim.6750] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 11/18/2014] [Revised: 09/04/2015] [Accepted: 09/05/2015] [Indexed: 12/13/2022]

Lee BY, Lee KN, Lee T, Park JH, Kim SM, Lee HS, Chung DS, Shim HS, Lee HK, Kim H. Bovine Genome-wide Association Study for Genetic Elements to Resist the Infection of Foot-and-mouth Disease in the Field. Asian-Australas J Anim Sci 2015;28:166-70. [PMID: 25557811 PMCID: PMC4283160 DOI: 10.5713/ajas.14.0383] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 05/21/2014] [Revised: 08/06/2014] [Accepted: 08/21/2014] [Indexed: 12/29/2022]

Yang G, Jiang W, Yang Q, Yu W. PBOOST: a GPU-based tool for parallel permutation tests in genome-wide association studies. Bioinformatics 2014;31:1460-2. [PMID: 25535244 DOI: 10.1093/bioinformatics/btu840] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 10/08/2014] [Accepted: 12/16/2014] [Indexed: 11/13/2022]

Abstract

MOTIVATION

The importance of testing associations allowing for interactions has been demonstrated by Marchini et al. (2005). A fast method detecting associations allowing for interactions has been proposed by Wan et al. (2010a). The method is based on likelihood ratio test with the assumption that the statistic follows the χ² distribution. Many single nucleotide polymorphism (SNP) pairs with significant associations allowing for interactions have been detected using their method. However, the assumption of χ² test requires the expected values in each cell of the contingency table to be at least five. This assumption is violated in some identified SNP pairs. In this case, likelihood ratio test may not be applicable any more. Permutation test is an ideal approach to checking the P-values calculated in likelihood ratio test because of its non-parametric nature. The P-values of SNP pairs having significant associations with disease are always extremely small. Thus, we need a huge number of permutations to achieve correspondingly high resolution for the P-values. In order to investigate whether the P-values from likelihood ratio tests are reliable, a fast permutation tool to accomplish large number of permutations is desirable.

RESULTS

We developed a permutation tool named PBOOST. It is based on GPU with highly reliable P-value estimation. By using simulation data, we found that the P-values from likelihood ratio tests will have relative error of >100% when 50% cells in the contingency table have expected count less than five or when there is zero expected count in any of the contingency table cells. In terms of speed, PBOOST completed 10⁷ permutations for a single SNP pair from the Wellcome Trust Case Control Consortium (WTCCC) genome data (Wellcome Trust Case Control Consortium, 2007) within 1 min on a single Nvidia Tesla M2090 device, while it took 60 min in a single CPU Intel Xeon E5-2650 to finish the same task. More importantly, when simultaneously testing 256 SNP pairs for 10⁷ permutations, our tool took only 5 min, while the CPU program took 10 h. By permuting on a GPU cluster consisting of 40 nodes, we completed 10¹² permutations for all 280 SNP pairs reported with P-values smaller than 1.6 × 10⁻¹² in the WTCCC datasets in 1 week.

AVAILABILITY AND IMPLEMENTATION

The source code and sample data are available at http://bioinformatics.ust.hk/PBOOST.zip.

CONTACT

gyang@ust.hk; eeyu@ust.hk

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
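
The expected-count condition mentioned in the Motivation (every cell of the contingency table should have an expected value of at least five for the χ² approximation) can be checked with a few lines of code. The sketch below is a generic two-way-table check, not part of PBOOST. The function names and the example tables are hypothetical.

```python
def expected_counts(table):
    """Expected cell counts under independence for a two-way contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count of cell (i, j) = row_total_i * col_total_j / n
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi2_approximation_ok(table, min_expected=5.0):
    """True when every expected cell count reaches the rule-of-thumb minimum."""
    return all(e >= min_expected for row in expected_counts(table) for e in row)


# Illustrative tables (made-up counts): rows are genotype groups, columns are case/control.
balanced = [[20, 30], [30, 20]]
sparse = [[1, 2], [3, 94]]
print(chi2_approximation_ok(balanced))  # every expected count is 25.0
print(chi2_approximation_ok(sparse))    # smallest expected count is 0.12
```

When the check fails, a permutation test of the kind described in this abstract is one way to verify a P-value obtained from the χ² approximation.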

Che R, Jack JR, Motsinger-Reif AA, Brown CC. An adaptive permutation approach for genome-wide association study: evaluation and recommendations for use. BioData Min 2014;7:9. [PMID: 24976866 PMCID: PMC4070098 DOI: 10.1186/1756-0381-7-9] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.6] [Received: 09/24/2013] [Accepted: 06/02/2014] [Indexed: 11/10/2022]

Abstract

BACKGROUND

Permutation testing is a robust and popular approach for significance testing in genomic research, which has the broad advantage of estimating significance non-parametrically, thereby safeguarding against inflated type I error rates. However, the computational efficiency remains a challenging issue that limits its wide application, particularly in genome-wide association studies (GWAS). Because of this, adaptive permutation strategies can be employed to make permutation approaches feasible. While these approaches have been used in practice, there is little research into the statistical properties of these approaches, and little guidance into the proper application of such a strategy for accurate p-value estimation at the GWAS level.

METHODS

In this work, we advocate an adaptive permutation procedure that is statistically valid as well as computationally feasible in GWAS. We perform extensive simulation experiments to evaluate the robustness of the approach to violations of modeling assumptions and compare the power of the adaptive approach versus standard approaches. We also evaluate the parameter choices in implementing the adaptive permutation approach to provide guidance on proper implementation in real studies. Additionally, we provide an example of the application of adaptive permutation testing on real data.

RESULTS

The results provide sufficient evidence that the adaptive test is robust to violations of modeling assumptions. In addition, even when modeling assumptions are correct, the power achieved by adaptive permutation is identical to the parametric approach over a range of significance thresholds and effect sizes under the alternative. A framework for proper implementation of the adaptive procedure is also generated.

CONCLUSIONS

While the adaptive permutation approach presented here is not novel, the current study provides evidence of the validity of the approach, and importantly provides guidance on the proper implementation of such a strategy. Additionally, tools are made available to aid investigators in implementing these approaches.
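
The abstract above does not spell out its stopping rule, so the following is only a sketch of one common form of adaptive permutation (a sequential Monte Carlo rule that stops once a fixed number of permuted statistics reach the observed one). It is not necessarily the procedure the authors evaluated. The names `mean_diff` and `adaptive_permutation_pvalue` and the defaults `max_perms` and `min_exceed` are hypothetical.

```python
import random


def mean_diff(genotypes, labels):
    """Absolute difference in mean genotype between cases (1) and controls (0)."""
    cases = [g for g, l in zip(genotypes, labels) if l == 1]
    controls = [g for g, l in zip(genotypes, labels) if l == 0]
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


def adaptive_permutation_pvalue(genotypes, labels, stat=mean_diff,
                                max_perms=10000, min_exceed=10, seed=0):
    """Return (estimated p-value, number of permutations actually run).

    Labels are shuffled repeatedly. Permuting stops early once `min_exceed`
    permuted statistics are at least as large as the observed one, so clearly
    non-significant markers use few permutations.
    """
    rng = random.Random(seed)
    observed = stat(genotypes, labels)
    permuted = list(labels)
    exceed = 0
    for n in range(1, max_perms + 1):
        rng.shuffle(permuted)
        if stat(genotypes, permuted) >= observed:
            exceed += 1
            if exceed >= min_exceed:
                # Early stop: estimate is exceedances / permutations run.
                return exceed / n, n
    # Budget exhausted: add-one estimate keeps the p-value above zero.
    return (exceed + 1) / (max_perms + 1), max_perms
```

With this rule, the permutation budget is spent mostly on markers with small p-values, which is what makes permutation testing tractable at GWAS scale.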

MCPerm: a Monte Carlo permutation method for accurately correcting the multiple testing in a meta-analysis of genetic association studies. PLoS One 2014;9:e89212. [PMID: 24586601 PMCID: PMC3931718 DOI: 10.1371/journal.pone.0089212] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/17/2013] [Accepted: 01/17/2014] [Indexed: 02/01/2023] Open

De R, Bush WS, Moore JH. Bioinformatics challenges in genome-wide association studies (GWAS). Methods Mol Biol 2014;1168:63-81. [PMID: 24870131 DOI: 10.1007/978-1-4939-0847-9_5] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/23/2022]

Rajabli F, Inan G, Ilk O. Power analysis of C-TDT for small sample size genome-wide association studies by the joint use of case-parent trios and pairs. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2013;2013:235825. [PMID: 23737858 PMCID: PMC3659481 DOI: 10.1155/2013/235825] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/02/2013] [Revised: 04/08/2013] [Accepted: 04/13/2013] [Indexed: 11/18/2022]

Sheikh H, Kryski K, Smith H, Hayden E, Singh S. Corticotropin-releasing hormone system polymorphisms are associated with children’s cortisol reactivity. Neuroscience 2013;229:1-11. [DOI: 10.1016/j.neuroscience.2012.10.056] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/12/2012] [Revised: 10/26/2012] [Accepted: 10/29/2012] [Indexed: 11/26/2022]

Chapter 11: Genome-wide association studies. PLoS Comput Biol 2012;8:e1002822. [PMID: 23300413 PMCID: PMC3531285 DOI: 10.1371/journal.pcbi.1002822] [Citation(s) in RCA: 621] [Impact Index Per Article: 51.8] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/04/2023] Open

Steiß V, Letschert T, Schäfer H, Pahl R. PERMORY-MPI: a program for high-speed parallel permutation testing in genome-wide association studies. Bioinformatics 2012;28:1168-9. [PMID: 22345620 DOI: 10.1093/bioinformatics/bts086] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Rapid and robust resampling-based multiple-testing correction with application in a genome-wide expression quantitative trait loci study. Genetics 2012;190:1511-20. [PMID: 22298711 DOI: 10.1534/genetics.111.137737] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Li MX, Yeung JMY, Cherny SS, Sham PC. Evaluating the effective numbers of independent tests and significant p-value thresholds in commercial genotyping arrays and public imputation reference datasets. Hum Genet 2011;131:747-56. [PMID: 22143225 PMCID: PMC3325408 DOI: 10.1007/s00439-011-1118-2] [Citation(s) in RCA: 557] [Impact Index Per Article: 42.8] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/23/2011] [Accepted: 11/13/2011] [Indexed: 11/25/2022]

Association of milk protein genes with fertilization rate and early embryonic development in Holstein dairy cattle. J DAIRY RES 2011;79:47-52. [DOI: 10.1017/s0022029911000744] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]

Abstract

Concomitant with intensive selection for increased milk yield, reproductive performance of dairy cows has declined in the last decades, in part due to an unfavourable genetic relationship between these traits. Given that the six main milk protein genes (i.e. whey proteins and caseins) are directly involved in milk production and hence have been a target of the strong selection aimed at improving milk yield in dairy cattle, we hypothesized that these genes could show selection footprints associated with fertility traits. In this study, we used an in-vitro fertilization (IVF) system to test genetic association between 66 single nucleotide polymorphisms (SNPs) in the four caseins (αS1-casein, αS2-casein, β-casein and κ-casein) and the two whey protein genes (α-lactalbumin and β-lactoglobulin) with fertilization rate and early embryonic development in the Holstein breed. A total of 6893 in-vitro fertilizations were performed and a total of 4661 IVF embryos were produced using oocytes from 399 ovaries and semen samples from 12 bulls. Associations between SNPs and fertility traits were analysed using a mixed linear model with genotype as fixed effect and ovary and bull as random effects. A multiple testing correction approach was used to account for the correlation between SNPs due to linkage disequilibrium. After correction, polymorphisms in the LALBA and LGB genes showed significant associations with fertilization success and blastocyst rate. No significant associations were detected between SNPs located in the casein region and IVF fertility traits. Although the molecular mechanisms underlying the association between whey protein genes and fertility have not yet been characterized, this study provides the first evidence of association between these genes and fertility traits. Furthermore, these results could shed light on the antagonistic relationship that exists between milk yield and fertility in dairy cattle.

Gui H, Li M, Sham PC, Cherny SS. Comparisons of seven algorithms for pathway analysis using the WTCCC Crohn's Disease dataset. BMC Res Notes 2011;4:386. [PMID: 21981765 PMCID: PMC3199264 DOI: 10.1186/1756-0500-4-386] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2011] [Accepted: 10/07/2011] [Indexed: 12/20/2022] Open

Abstract

BACKGROUND

Though rooted in genomic expression studies, pathway analysis for genome-wide association studies (GWAS) has gained increasing popularity, since it has the potential to discover hidden disease pathogenic mechanisms by combining statistical methods with biological knowledge. Generally, recently proposed algorithms and programs can be categorized by type of input data, null hypothesis, or number of analysis stages. Due to the complexity caused by SNP, gene and pathway relationships, re-sampling strategies like permutation are always utilized to derive an empirical distribution of the test statistics for evaluating the significance of candidate pathways. However, these algorithms need to be evaluated on real GWAS datasets and real biological pathway databases before we apply them widely with confidence.

FINDINGS

Two algorithms that use summary statistics from GWAS as input were implemented in KGG, a novel and user-friendly software tool for GWAS pathway analysis. These two algorithms and five other selected algorithms were compared by analyzing the WTCCC Crohn's Disease dataset using the MSigDB canonical pathways. Because permutation was used to obtain empirical p-values, most of these methods controlled the Type I error rate well, although some were conservative. However, the methods varied greatly in terms of power and running time, with the PLINK truncated set-based test being the most powerful and KGG being the fastest.

CONCLUSIONS

Raw data-based algorithms, such as those implemented in PLINK, are preferable for GWAS pathway analysis as long as computational capacity is available. It may be worthwhile to apply two or more pathway analysis algorithms on the same GWAS dataset, since the methods differ greatly in their outputs and might provide complementary findings for the studied complex disease.

Genes involved in vasoconstriction and vasodilation system affect salt-sensitive hypertension. PLoS One 2011;6:e19620. [PMID: 21573014 PMCID: PMC3090407 DOI: 10.1371/journal.pone.0019620] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/20/2010] [Accepted: 04/12/2011] [Indexed: 01/11/2023] Open

Abstract

The importance of excess salt intake in the pathogenesis of hypertension is widely recognized. Blood pressure is controlled primarily by salt and water balance because of the infinite gain property of the kidney to rapidly eliminate excess fluid and salt. Up to fifty percent of patients with essential hypertension are salt-sensitive, as manifested by a rise in blood pressure with salt loading. We conducted a two-stage genetic analysis in hypertensive patients very accurately phenotyped for their salt-sensitivity. All newly discovered, never-treated essential hypertensives underwent an acute salt load to monitor the simultaneous changes in blood pressure and renal sodium excretion. The first stage consisted of an association analysis of genotyping data derived from a genome-wide array on 329 subjects. Principal Component Analysis demonstrated that this population was homogeneous. Among the strongest results, we detected a cluster of SNPs located in the first introns of the PRKG1 gene (rs7897633, p = 2.34E-05) associated with variation in diastolic blood pressure after acute salt load. We further focused on two genetic loci, SLC24A3 and SLC8A1 (plasma membrane sodium/calcium exchange proteins, NCKX3 and NCX1, respectively), which have a functional relationship with the previous gene and are associated with variations in systolic blood pressure (the imputed rs3790261, p = 4.55E-06; and rs434082, p = 4.7E-03). In stage 2, we characterized 159 more patients for the SNPs in PRKG1, SLC24A3 and SLC8A1. Combined analysis showed an epistatic interaction of SNPs in SLC24A3 and SLC8A1 on the pressure-natriuresis (p interaction = 1.55E-04, p model = 3.35E-05), supporting their pathophysiological link in cellular calcium homeostasis. In conclusion, these findings point to a clear association between body sodium-blood pressure relations and molecules modulating the contractile state of vascular cells through an increase in cytoplasmic calcium concentration.

Johnson RC, Nelson GW, Troyer JL, Lautenberger JA, Kessing BD, Winkler CA, O'Brien SJ. Accounting for multiple comparisons in a genome-wide association study (GWAS). BMC Genomics 2010;11:724. [PMID: 21176216 PMCID: PMC3023815 DOI: 10.1186/1471-2164-11-724] [Citation(s) in RCA: 189] [Impact Index Per Article: 13.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/19/2010] [Accepted: 12/22/2010] [Indexed: 11/20/2022] Open