1
Liao SY, Tan YD. Sister haplotypes and recombination disequilibrium: a new approach to identify associations of haplotypes with complex diseases. Front Genet 2024; 14:1295327. [PMID: 38292437 PMCID: PMC10825010 DOI: 10.3389/fgene.2023.1295327] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2023] [Accepted: 12/13/2023] [Indexed: 02/01/2024]
Abstract
Haplotype-based association analysis has several advantages over single-SNP association analysis. However, to date, haplotype-disease association studies have not excluded recombination interference among multiple loci, and hence some results might be confounded by it. Here, association of sister haplotypes with a complex disease, based on recombination disequilibrium (RD), is presented. Sister haplotypes can be determined by translating the notation of DNA base haplotypes into the notation of genetic genotypes. Sister haplotypes provide the haplotype pairs available for haplotype-disease association analysis. After performing RD tests in control and case cohorts, a two-by-two contingency table can be constructed from a sister haplotype pair and the case-control pair. With this standard two-by-two table, one can perform a classical chi-square test to detect statistical haplotype-disease association. Applying this method to a haplotype dataset of Alzheimer disease (AD), an association of sister haplotypes containing ApoE3/4 with risk for AD was identified under no RD. Haplotypes within gene IL-13 were not associated with risk for breast cancer in the case of no RD, and no association of haplotypes in gene IL-17A with risk for coronary artery disease was detected without RD. The previously reported associations of haplotypes within these genes with risk for these diseases might be due to strong RD and/or inappropriate haplotype pairs.
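The final step described in this abstract, a classical chi-square test on a two-by-two table of haplotype pair by case-control status, can be sketched as follows. This is a minimal illustration and not the authors' code: the function name and the counts are hypothetical, no continuity correction is applied, and the RD tests that precede this step in the paper are not shown.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 df, no continuity correction).

    Table layout (an assumption made for this sketch):
                    haplotype 1   haplotype 2 (its sister)
        cases            a             b
        controls         c             d
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    # Shortcut formula for the 2x2 Pearson statistic.
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(chi2_1 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical haplotype counts, for illustration only.
chi2, p = chi_square_2x2(60, 40, 45, 55)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```

Only the standard library is used so the sketch stays self-contained; a statistics package would normally be preferred for real analyses, especially when expected cell counts are small.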
Affiliation(s)
- Shun-Yao Liao
- Institute of Gerontology, Center for Genetics, Sichuan Academy & Sichuan Provincial People Hospital, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Yuan-De Tan
- Inflammatory Bowel and Immunobiology Research Institute, Cedars-Sinai Medical Center, Los Angeles, CA, United States
2
Labib JR, Ibrahem SK, Sleem HM, Ismail MM, Abd El Fatah SA, Salem MR, Abdelaal AA, Al-hanafi H. Diagnostic indicator of acute lung injury for pediatric critically ill patients at a tertiary pediatric hospital. Medicine (Baltimore) 2018; 97:e9929. [PMID: 29517700 PMCID: PMC5882441 DOI: 10.1097/md.0000000000009929] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 12/01/2022]
Abstract
Early identification of acute lung injury (ALI) in pediatric patients at risk of mortality is important for improving outcome. The aim was to assess the soluble form of the receptor for advanced glycation end products (sRAGE) as a valid biomarker for diagnosis of ALI among critically ill pediatric patients, and to correlate sRAGE levels with the different outcomes of those patients. A hospital-based case-control study was conducted in pediatric intensive care units (PICUs) at Cairo University Hospital over a period of 6 months. A total of 68 pediatric patients meeting the inclusion criteria were classified into four groups: patients with ALI; with both ALI and sepsis; with sepsis; and control patients. They were prospectively followed, and their laboratory and immunological workup (at days 1 and 9) was done to measure serum sRAGE levels and detect sRAGE genotypes. The age of the included children ranged from 8 to 84 months. Plasma levels of sRAGE were significantly higher in patients with ALI, regardless of associated sepsis. Plasma sRAGE levels were positively correlated with lung injury score. When assessing sRAGE genotypes, TA and TT genotypes were significant in most of the ALI patients with and without sepsis. Monitoring levels of sRAGE and genotypes can significantly affect the survival of children with ALI.
Affiliation(s)
- Amaal A. Abdelaal
- Clinical and Chemical Pathology Department, Faculty of Medicine, Cairo University, Egypt
- Hadeel Al-hanafi
- Clinical and Chemical Pathology Department, Faculty of Medicine, Cairo University, Egypt
3
Peng T, Wang L, Li G. The analysis of APOL1 genetic variation and haplotype diversity provided by 1000 Genomes project. BMC Nephrol 2017; 18:267. [PMID: 28800731 PMCID: PMC5553676 DOI: 10.1186/s12882-017-0675-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 12/12/2016] [Accepted: 07/19/2017] [Indexed: 11/20/2022]
Abstract
Background APOL1 gene variants have been shown to be associated with an increased risk of multiple kinds of diseases, particularly in African Americans, but not in Caucasians and Asians. In this study, we explored the single nucleotide polymorphism (SNP) and haplotype diversity of the APOL1 gene in different races using data from the 1000 Genomes Project. Methods Variants of the APOL1 gene in the 1000 Genomes Project were obtained, and SNPs located in the regulatory region or coding region were selected for genetic variation analysis. A total of 2504 individuals from 26 populations were classified into four groups: African, European, Asian and Admixed populations. Tag SNPs were selected to evaluate the haplotype diversity in the four populations using HaploStats software. Results The APOL1 gene was surrounded by some of the most polymorphic genes in the human genome. Variation in the APOL1 gene was common, with up to 613 SNPs reported by the 1000 Genomes Project, 99 of them (16.2%) with MAF ≥ 1%. There were 79 SNPs in the URR and 92 SNPs in the 3'UTR; 12 SNPs in the URR and 24 SNPs in the 3'UTR were common variants with MAF ≥ 1%. Notably, URR-1 showed lower frequencies in European populations, while the other three haplotypes showed the opposite pattern. The 3'UTR contained several high-frequency variation sites in a short segment, and the differences in its haplotypes among populations were significant (P < 0.01): UTR-1 and UTR-5 showed much higher frequencies in the African population, while UTR-2, UTR-3 and UTR-4 were much lower. In the coding region, the two higher-frequency SNPs of G1 pulled down the frequency of haplotype H-1 when all populations were pooled, and the diversity among the four populations was widened by the two G1 mutations (P1 = 3.33E-4 vs P2 = 3.61E-30).
Conclusions The distributions of APOL1 gene variants and haplotypes differed significantly among populations, in both regulatory and coding regions. These findings could provide clues for future genetic studies of APOL1-related diseases.
Affiliation(s)
- Ting Peng
- Renal Division and Institute of Nephrology, Sichuan Academy of Medical Sciences and Sichuan Provincial People's Hospital, School of Medicine, University of Electronic Science and Technology of China, No. 32, West 2nd Duan, 1st Circle Road, Qingyang District, Chengdu, Sichuan, People's Republic of China, 610072
- Li Wang
- Renal Division and Institute of Nephrology, Sichuan Academy of Medical Sciences and Sichuan Provincial People's Hospital, School of Medicine, University of Electronic Science and Technology of China, No. 32, West 2nd Duan, 1st Circle Road, Qingyang District, Chengdu, Sichuan, People's Republic of China, 610072
- Guisen Li
- Renal Division and Institute of Nephrology, Sichuan Academy of Medical Sciences and Sichuan Provincial People's Hospital, School of Medicine, University of Electronic Science and Technology of China, No. 32, West 2nd Duan, 1st Circle Road, Qingyang District, Chengdu, Sichuan, People's Republic of China, 610072.
4
Albuquerque D, Manco L, González LM, Gervasini G, Benito GM, González JR, Rodríguez-López R. Polymorphisms in the SNRPN gene are associated with obesity susceptibility in a Spanish population. J Gene Med 2017; 19. [PMID: 28387446 DOI: 10.1002/jgm.2956] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 01/10/2017] [Revised: 03/15/2017] [Accepted: 04/04/2017] [Indexed: 12/12/2022]
Abstract
BACKGROUND SNRPN, which codes for the RNA-binding SmN protein, is a candidate gene for Prader-Willi syndrome. One characteristic of this neuroendocrine disorder is hyperphagia resulting in extreme obesity later in life. In the present study, we aimed to assess whether variability within this gene could be implicated in obesity susceptibility. METHODS A case-control study was performed including 265 unrelated patients with nonsyndromic, early-onset severe obesity, belonging to high-risk obesity families of Spanish ancestry; 184 sex-matched healthy control individuals representative of the same genetic background were also included. Forty-nine single nucleotide polymorphisms (SNPs) spanning the entire SNRPN gene were selected and genotyped using the Sequenom MassARRAY platform (Sequenom Inc., San Diego, CA, USA). RESULTS Four SNPs, rs12905653, rs752874, rs1391516 and rs2047433, were found to be nominally associated with obesity (p < 0.03). Analysis of the haplotype distribution among cases and controls identified the combination rs12905653-T/rs8028366-A/rs4028395-T as being strongly and inversely associated with obesity (odds ratio = 0.49; p = 0.0006). A genetic risk score was built based on the rs12905653, rs1391516 and rs2047433 SNPs, and each unit increase in the genetic risk score increased the obesity risk by 49% (odds ratio = 1.49, 95% confidence interval = 1.24-1.80). CONCLUSIONS To our knowledge, this is the first study reporting an association between variability in the SNRPN gene and the risk of being obese. Interestingly, it was the major allele of each SNP that was found to be associated with the risk of weight gain. Further studies analyzing this locus and the possible additive deleterious capability of SNP combinations could be useful for demonstrating the development of obesity.
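To make the per-unit odds ratio in this abstract concrete, the sketch below shows how a score-based odds ratio compounds across units. It rests on two assumptions that the abstract does not state: that the score is an unweighted count of risk alleles, and that the effect is log-additive (as in a logistic regression), so odds ratios multiply per unit. Both function names are hypothetical.

```python
def unweighted_risk_score(risk_allele_counts):
    """Sum risk-allele counts (0, 1 or 2 per SNP) into one score.

    Assumption for illustration: an unweighted score; the abstract does not
    say how the published score was constructed.
    """
    for count in risk_allele_counts:
        if count not in (0, 1, 2):
            raise ValueError("each SNP contributes 0, 1 or 2 risk alleles")
    return sum(risk_allele_counts)


def odds_ratio_for_score_increase(per_unit_or, units):
    """Odds ratio for a `units` increase under a log-additive model."""
    return per_unit_or ** units


# Hypothetical genotype at three SNPs: 2, 1 and 0 risk alleles.
score = unweighted_risk_score([2, 1, 0])
# Reported per-unit odds ratio of 1.49, compounded over the score.
print(score, odds_ratio_for_score_increase(1.49, score))
```

Under these assumptions a two-unit difference in score corresponds to an odds ratio of 1.49 squared, not to a doubling of the 49% figure.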
Affiliation(s)
- David Albuquerque
- Research Center for Anthropology and Health (CIAS), University of Coimbra, Coimbra, Portugal; Genomics Group, Fundación Investigación Hospital General Universitario de Valencia, Valencia, Spain
- Licínio Manco
- Research Center for Anthropology and Health (CIAS), University of Coimbra, Coimbra, Portugal
- Luz M González
- Genomics Group, Fundación Investigación Hospital General Universitario de Valencia, Valencia, Spain
- Guillermo Gervasini
- Department of Medical & Surgical Therapeutics, Division of Pharmacology, Medical School, University of Extremadura, Badajoz, Spain
- Goitzane Marcaida Benito
- Genomics Group, Fundación Investigación Hospital General Universitario de Valencia, Valencia, Spain; Laboratory of Molecular Genetics, Clinical Analysis Service, Hospital Universitario General de Valencia, Valencia, Spain
- Juan R González
- ISGlobal, Centre for Research in Environmental Epidemiology (CREAL), Barcelona, Spain; CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Raquel Rodríguez-López
- Genomics Group, Fundación Investigación Hospital General Universitario de Valencia, Valencia, Spain; Laboratory of Molecular Genetics, Clinical Analysis Service, Hospital Universitario General de Valencia, Valencia, Spain
5
Li J, Lange LA, Sabourin J, Duan Q, Valdar W, Willis MS, Li Y, Wilson JG, Lange EM. Genome- and exome-wide association study of serum lipoprotein (a) in the Jackson Heart Study. J Hum Genet 2015; 60:755-61. [DOI: 10.1038/jhg.2015.107] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Received: 02/05/2015] [Revised: 07/17/2015] [Accepted: 07/21/2015] [Indexed: 11/09/2022]
6
Wen SH, Tsai MY. Haplotype association analysis of combining unrelated case-control and triads with consideration of population stratification. Front Genet 2014; 5:103. [PMID: 24860592 PMCID: PMC4028876 DOI: 10.3389/fgene.2014.00103] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 01/27/2014] [Accepted: 04/09/2014] [Indexed: 12/27/2022]
Abstract
Combining data collected under different study designs, such as family trios and unrelated case-control samples, yields more power and is more cost-effective than analyzing each data set separately. However, a potential concern is population stratification (PS) among unrelated case-control samples, and analyses integrating data should address this confounding effect. In this paper, we develop a simpler method, the haplotype generalized linear model (HGLM), that tests and estimates haplotype effects on disease risk and allows for adjustment for PS when combining data. We propose to combine information across aggregations of haplotype weighted-counts estimated from population case-control data and trio data separately, and to perform subsequent GLM analysis. Furthermore, we present a framework of analysis of variance based on haplotype weighted-counts for detecting whether it is appropriate to combine two data sources, as well as the modified HGLM with clustering methods for addressing PS. We evaluate the statistical properties in terms of the accuracy, false positive rate (FPR) and empirical power using simulated data with regard to various disease risks, sample sizes, multi-SNP haplotypes and the presence of PS. Our simulation results indicate that HGLM performs comparably well with the likelihood-based haplotype association analysis, particularly when the haplotype effects are moderate, but may not perform well when dealing with lengthy haplotypes for small sample sizes. In the presence of PS, the modified HGLM remains valid and has satisfactory nominal level and small bias. Overall, HGLM appears to be successful in combining data and is simple to implement in standard statistical software.
Affiliation(s)
- Shu-Hui Wen
- Department of Public Health, College of Medicine, Tzu-Chi University, Hualien, Taiwan
- Miao-Yu Tsai
- Institute of Statistics and Information Science, National Changhua University of Education, Chang-Hua, Taiwan
7
Dahgam S, Modig L, Torinsson Naluai Å, Olin AC, Nyberg F. Haplotypes of the inducible nitric oxide synthase gene are strongly associated with exhaled nitric oxide levels in adults: a population-based study. J Med Genet 2014; 51:449-54. [PMID: 24729625 DOI: 10.1136/jmedgenet-2013-101897] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 12/29/2022]
Abstract
BACKGROUND Previous genetic association studies have reported evidence for association of single-nucleotide polymorphisms (SNPs) in the NOS2 gene, encoding inducible nitric oxide synthase (iNOS), to variation in levels of fractional exhaled nitric oxide (FENO) in children and adults. In this study, we evaluated 10 SNPs in the region of chromosome 17 from 26.07 Mb to 26.13 Mb to further understand the contribution of NOS2 to variation in levels of FENO. METHODS In a cohort of 5912 adults 25-75 years of age, we investigated the relationship between NOS2 haplotypes and FENO, and effect modification by asthma. RESULTS Seven common (frequency ≥5%) haplotypes (H1-H7) were inferred from all possible haplotype combinations. One haplotype (H3) was significantly associated with lower levels of FENO: -5.8% (95% CI -9.8 to -1.7; p=0.006) compared with the most common baseline haplotype H1. Two haplotypes (H5 and H6) were significantly associated with higher levels of FENO: +10.7% (95% CI 5.0 to 16.7; p=0.0002) and +14.9% (95% CI 10.6 to 19.3; p=7.8×10^-13), respectively. The effect of haplotype H3 was mainly seen in subjects with asthma (-21.6% (95% CI -33.5 to -5.9)) and was not significant in subjects without asthma (-4.2% (95% CI -8.4 to 0.2)). The p value for interaction between H3 and asthma status was 0.004. CONCLUSIONS Our findings suggest that several common haplotypes in the NOS2 gene contribute to variation in FENO in adults. We also saw some evidence of effect modification by asthma status on haplotype H3.
Affiliation(s)
- Santosh Dahgam
- Occupational and Environmental Medicine, Sahlgrenska Academy at University of Gothenburg, Gothenburg, Sweden
- Lars Modig
- Department of Public Health and Clinical Medicine, Occupational and Environmental Medicine, University of Umeå, Umeå, Sweden
- Åsa Torinsson Naluai
- Department of Medical and Clinical Genetics, Sahlgrenska Academy at University of Gothenburg, Gothenburg, Sweden
- Anna-Carin Olin
- Occupational and Environmental Medicine, Sahlgrenska Academy at University of Gothenburg, Gothenburg, Sweden
- Fredrik Nyberg
- Occupational and Environmental Medicine, Sahlgrenska Academy at University of Gothenburg, Gothenburg, Sweden; AstraZeneca R&D, Mölndal, Sweden
8
Combined genotype and haplotype tests for region-based association studies. BMC Genomics 2013; 14:569. [PMID: 23964661 PMCID: PMC3852120 DOI: 10.1186/1471-2164-14-569] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 04/09/2013] [Accepted: 08/13/2013] [Indexed: 12/13/2022]
Abstract
Background Although single-SNP analysis has proven to be useful in identifying many disease-associated loci, region-based analysis has several advantages. Empirically, it has been shown that region-based genotype and haplotype approaches may possess much higher power than single-SNP statistical tests. Both high quality haplotypes and genotypes may be available for analysis given the development of next generation sequencing technologies and haplotype assembly algorithms. Results As generally it is unknown whether genotypes or haplotypes are more relevant for identifying an association, we propose to use both of them with the purpose of preserving high power under both genotype and haplotype disease scenarios. We suggest two approaches for a combined association test and investigate the performance of these two approaches based on a theoretical model, population genetics simulations and analysis of a real data set. Conclusions Based on a theoretical model, population genetics simulations and analysis of a central corneal thickness (CCT) Genome Wide Association Study (GWAS) data set we have shown that combined genotype and haplotype approach has a high potential utility for applications in association studies.
9
Oldmeadow C, Riveros C, Holliday EG, Scott R, Moscato P, Wang JJ, Mitchell P, Buitendijk GHS, Vingerling JR, Klaver CCW, Klein R, Attia J. Sifting the wheat from the chaff: prioritizing GWAS results by identifying consistency across analytical methods. Genet Epidemiol 2011; 35:745-54. [PMID: 22125219 DOI: 10.1002/gepi.20622] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/06/2022]
Abstract
The curse of multiple testing has led to the adoption of a stringent Bonferroni threshold for declaring genome-wide statistical significance for any one SNP as standard practice. Although justified in avoiding false positives, this conservative approach has the potential to miss true associations, as most studies are drastically underpowered. As an alternative to increasing sample size, we compare results from a typical SNP-by-SNP analysis with three other methods that incorporate regional information in order to boost or dampen an otherwise noisy signal: the haplotype method (Schaid et al. [2002] Am J Hum Genet 70:425-434), the gene-based method (Liu et al. [2010] Am J Hum Genet 87:139-145), and a new method (interaction count) that uses genome-wide screening of pairwise SNP interactions. Using a modestly sized case-control study, we conduct a genome-wide association study (GWAS) of age-related macular degeneration, and find striking agreement across all methods in regions of known associated variants. We also find strong evidence of novel associated variants in two regions (Chromosome 2p25 and Chromosome 10p15) in which the individual SNP P-values are only suggestive, but where there are very high levels of agreement between all methods. We propose that consistency between different analysis methods may be an alternative to increasingly larger sample sizes in sifting true signals from noise in GWAS.
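The Bonferroni threshold mentioned at the start of this abstract is the nominal significance level divided by the number of tests. The sketch below illustrates that rule only; the function names, the default alpha of 0.05 and the p-values are assumptions for illustration and are not taken from the study.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold that keeps the family-wise error
    rate at or below `alpha` across `n_tests` tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


def significant_hits(p_values, alpha=0.05):
    """Indices of p-values that fall below the Bonferroni threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [i for i, p in enumerate(p_values) if p < threshold]


# With one million SNP tests, 0.05 / 1e6 gives the familiar order of
# magnitude used for genome-wide significance.
print(bonferroni_threshold(0.05, 1_000_000))

# Hypothetical p-values for four SNPs: the threshold is 0.05 / 4 = 0.0125.
print(significant_hits([0.001, 0.02, 0.2, 0.0001]))
```

The second call shows why the approach is conservative: a p-value of 0.02 would pass a single-test cutoff of 0.05 but fails once four tests are accounted for.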
Affiliation(s)
- Christopher Oldmeadow
- School of Medicine and Public Health, University of Newcastle, Newcastle upon Tyne, United Kingdom.
10
Wang Q, Yu D, Pan Y. Association test between haplotypes and longitudinal traits in complex pedigrees. J Anim Breed Genet 2011; 128:376-85. [PMID: 21906183 DOI: 10.1111/j.1439-0388.2011.00931.x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
Evaluating the association of candidate genes with longitudinal traits would be a useful method to study the genetic basis of complex traits. Haplotypes incorporate more information about the underlying polymorphisms than do genotypes for individual SNP, and have been considered as a more informative format of data in association analysis. In this study, we extended the random regression model to allow analysing haplotype effects in a longitudinal framework and then performed a hierarchical Bayesian method to estimate parameter values. We assessed the performance of the proposed approach and demonstrated its validity and power with simulation. The power of our method was also demonstrated by an example of Meishan pigs, in which one haplotype affecting the total number of piglets born was detected using our method, whereas it cannot be detected using the conventional single SNP-based model. Additionally, the model is flexible to be extended to model a complex network of genetic regulation that includes the interactions between different haplotypes and between haplotypes and environments.
Affiliation(s)
- Q Wang
- School of Agriculture and Biology, Shanghai Jiao Tong University, China
11
Jacobsson JA, Almén MS, Benedict C, Hedberg LA, Michaëlsson K, Brooks S, Kullberg J, Axelsson T, Johansson L, Ahlström H, Fredriksson R, Lind L, Schiöth HB. Detailed analysis of variants in FTO in association with body composition in a cohort of 70-year-olds suggests a weakened effect among elderly. PLoS One 2011; 6:e20158. [PMID: 21637715 PMCID: PMC3103532 DOI: 10.1371/journal.pone.0020158] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 01/05/2011] [Accepted: 04/26/2011] [Indexed: 11/19/2022]
Abstract
Background The rs9939609 single-nucleotide polymorphism (SNP) in the fat mass and obesity (FTO) gene has previously been associated with higher BMI levels in children and young adults. In contrast, this association was not found in elderly men. BMI is a measure of overweight in relation to the individuals' height, but offers no insight into the regional body fat composition or distribution. Objective To examine whether the FTO gene is associated with overweight and body composition-related phenotypes rather than BMI, we measured waist circumference, total fat mass, trunk fat mass, leg fat mass, visceral and subcutaneous adipose tissue, and daily energy intake in 985 humans (493 women) at the age of 70 years. In total, 733 SNPs located in the FTO gene were genotyped in order to examine whether rs9939609 alone or the other SNPs, or their combinations, are linked to obesity-related measures in elderly humans. Design Cross-sectional analysis of the Prospective Investigation of the Vasculature in Uppsala Seniors (PIVUS) cohort. Results Neither a single SNP, such as rs9939609, nor a SNP combination was significantly linked to overweight, body composition-related measures, or daily energy intake in elderly humans. Of note, these observations hold both among men and women. Conclusions Due to the diversity of measurements included in the study, our findings strengthen the view that the effect of FTO on body composition appears to be less profound in later life compared to younger ages and that this is seemingly independent of gender.
Affiliation(s)
- Josefin A. Jacobsson
- Department of Neuroscience, Functional Pharmacology, Uppsala University, Uppsala, Sweden
- Markus Sällman Almén
- Department of Neuroscience, Functional Pharmacology, Uppsala University, Uppsala, Sweden
- Christian Benedict
- Department of Neuroscience, Functional Pharmacology, Uppsala University, Uppsala, Sweden
- Lilia A. Hedberg
- Science for Life Laboratory, Royal Institute of Technology (KTH), School of Biotechnology, Solna, Sweden
- Karl Michaëlsson
- Department of Surgical Sciences, Uppsala University, Uppsala, Sweden
- Samantha Brooks
- Department of Neuroscience, Functional Pharmacology, Uppsala University, Uppsala, Sweden
- Joel Kullberg
- Department of Oncology, Radiology and Clinical Immunology, Uppsala University, Uppsala, Sweden
- Tomas Axelsson
- Department of Medical Sciences, Uppsala University, Uppsala, Sweden
- Lars Johansson
- Department of Oncology, Radiology and Clinical Immunology, Uppsala University, Uppsala, Sweden
- AstraZeneca R&D Mölndal, Mölndal, Sweden
- Håkan Ahlström
- Department of Oncology, Radiology and Clinical Immunology, Uppsala University, Uppsala, Sweden
- Robert Fredriksson
- Department of Neuroscience, Functional Pharmacology, Uppsala University, Uppsala, Sweden
- Lars Lind
- Department of Medical Sciences, Uppsala University, Uppsala, Sweden
- Helgi B. Schiöth
- Department of Neuroscience, Functional Pharmacology, Uppsala University, Uppsala, Sweden
12
Amos W, Driscoll E, Hoffman JI. Candidate genes versus genome-wide associations: which are better for detecting genetic susceptibility to infectious disease? Proc Biol Sci 2010; 278:1183-8. [PMID: 20926441 DOI: 10.1098/rspb.2010.1920] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.9] [Indexed: 01/20/2023]
Abstract
Technological developments allow increasing numbers of markers to be deployed in case-control studies searching for genetic factors that influence disease susceptibility. However, with vast numbers of markers, true 'hits' may become lost in a sea of false positives. This problem may be particularly acute for infectious diseases, where the control group may contain unexposed individuals with susceptible genotypes. To explore this effect, we used a series of stochastic simulations to model a scenario based loosely on bovine tuberculosis. We find that a candidate gene approach tends to have greater statistical power than studies that use large numbers of single nucleotide polymorphisms (SNPs) in genome-wide association tests, almost regardless of the number of SNPs deployed. Both approaches struggle to detect genetic effects when these are either weak or if an appreciable proportion of individuals are unexposed to the disease when modest sample sizes (250 each of cases and controls) are used, but these issues are largely mitigated if sample sizes can be increased to 2000 or more of each class. We conclude that the power of any genotype-phenotype association test will be improved if the sampling strategy takes account of exposure heterogeneity, though this is not necessarily easy to do.
Affiliation(s)
- W Amos
- Department of Zoology, University of Cambridge, Cambridge CB2 3EJ, UK.
13
MacLeod IM, Hayes BJ, Savin KW, Chamberlain AJ, McPartlan HC, Goddard ME. Power of a genome scan to detect and locate quantitative trait loci in cattle using dense single nucleotide polymorphisms. J Anim Breed Genet 2010; 127:133-42. [PMID: 20433522 DOI: 10.1111/j.1439-0388.2009.00831.x] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Indexed: 11/28/2022]
Abstract
There is increasing use of dense single nucleotide polymorphisms (SNPs) for whole-genome association studies (WGAS) in livestock to map and identify quantitative trait loci (QTL). These studies rely on linkage disequilibrium (LD) to detect an association between SNP genotypes and phenotypes. The power and precision of these WGAS are unknown, and will depend on the extent of LD in the experimental population. One complication for WGAS in livestock populations is that they typically consist of many paternal half-sib families, and in some cases full-sib families; unless this subtle population stratification is accounted for, many spurious associations may be reported. Our aim was to investigate the power, precision and false discovery rates of WGAS for QTL discovery, with a commercial SNP array, given existing patterns of LD in cattle. We also tested the efficiency of selective genotyping animals. A total of 365 cattle were genotyped for 9232 SNPs. We simulated a QTL effect as well as polygenic and environmental effects for all animals. One QTL was simulated on a randomly chosen SNP and accounted for 5%, 10% or 18% of the total variance. The power to detect a moderate-sized additive QTL (5% of the phenotypic variance) with 365 animals genotyped was 37% (p < 0.001). Most importantly, if pedigree structure was not accounted for, the number of false positives significantly increased above those expected by chance alone. Selective genotyping also resulted in a significant increase in false positives, even when pedigree structure was accounted for.
Affiliation(s)
- I M MacLeod
- Cooperative Research Centre for Beef Genetic Technologies, Armidale, NSW, Australia.
14
Tong L, Yang J, Cooper RS. Efficient calculation of P-value and power for quadratic form statistics in multilocus association testing. Ann Hum Genet 2010; 74:275-85. [PMID: 20529017 DOI: 10.1111/j.1469-1809.2010.00574.x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/26/2022]
Abstract
We address the asymptotic and approximate distributions of a large class of test statistics with quadratic forms used in association studies. The statistics of interest take the general form D = X^T A X, where A is a general similarity matrix which may or may not be positive semi-definite, and X follows the multivariate normal distribution with mean μ and variance matrix Σ, where Σ may or may not be singular. We show that D can be written as a linear combination of independent χ² random variables with a shift. Furthermore, its distribution can be approximated by a χ² or the difference of two χ² distributions. In the setting of association testing, our methods are especially useful in two situations. First, when the required significance level is much smaller than 0.05 such as in a genome scan, the estimation of p-values using permutation procedures can be challenging. Second, when an EM algorithm is required to infer haplotype frequencies from un-phased genotype data, the computation can be intensive for a permutation procedure. In either situation, an efficient and accurate estimation procedure would be useful. Our method can be applied to any quadratic form statistic and therefore should be of general interest.
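The decomposition stated in this abstract can be illustrated with the standard textbook argument for the simplest case. The derivation below is a sketch under assumptions that are narrower than the paper's setting: A is taken to be symmetric and the variance matrix nonsingular, so that it factors as L L^T. The symbols L, P, z, u, b and the eigenvalues are introduced here for illustration; the singular case treated in the paper, which gives rise to the shift term, is not covered.

```latex
% Assumptions for this sketch: A symmetric, \Sigma = L L^{\top} nonsingular,
% X \in \mathbb{R}^{k}.
\[
X = \mu + L z, \qquad z \sim N(0, I_k), \qquad
L^{\top} A L = P \Lambda P^{\top}, \quad
\Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_k), \quad P^{\top} P = I_k .
\]
\[
D = X^{\top} A X
  = (z + L^{-1}\mu)^{\top} \, L^{\top} A L \, (z + L^{-1}\mu)
  = \sum_{i=1}^{k} \lambda_i \, (u_i + b_i)^2 ,
\qquad u = P^{\top} z \sim N(0, I_k), \quad b = P^{\top} L^{-1} \mu .
\]
% Each (u_i + b_i)^2 is an independent noncentral \chi^2 variable with
% 1 degree of freedom and noncentrality b_i^2; negative \lambda_i are
% possible when A is not positive semi-definite.
```

Because the eigenvalues can be negative when A is not positive semi-definite, the sum can contain terms of both signs, which is consistent with the abstract's remark that the distribution may be approximated by a difference of two χ² distributions.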
Affiliation(s)
- Liping Tong
- Department of Mathematics and Statistics, Loyola University Chicago, Chicago, IL 60660, USA.
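As a rough illustration of the representation mentioned in the abstract above, the sketch below treats a quadratic form as a weighted sum of independent one-degree-of-freedom chi-square variables. It is not the authors' approximation: it assumes a zero mean (so no shift term), takes the weights as given (in practice they would be eigenvalues derived from the similarity and variance matrices), and estimates the tail probability by plain Monte Carlo. Both function names are hypothetical.

```python
import random


def quadratic_form_moments(weights):
    """Mean and variance of D = sum_i w_i * Z_i**2, Z_i independent N(0, 1).

    Uses E[Z**2] = 1 and Var[Z**2] = 2; weights may be negative, which is
    the case when the similarity matrix is not positive semi-definite.
    """
    mean = sum(weights)
    variance = 2.0 * sum(w * w for w in weights)
    return mean, variance


def monte_carlo_p_value(weights, observed, n_draws=20000, seed=0):
    """Estimate P(D >= observed) by simulating the weighted chi-square sum."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        d = sum(w * rng.gauss(0.0, 1.0) ** 2 for w in weights)
        if d >= observed:
            hits += 1
    return hits / n_draws
```

For example, `monte_carlo_p_value([1.0], 3.841)` should land close to 0.05, the familiar tail area of a single chi-square variable. Monte Carlo becomes impractical at genome-scan significance levels, which is the motivation the abstract gives for an analytic approximation.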
|
15
|
Paus T. Population neuroscience: why and how. Hum Brain Mapp 2010; 31:891-903. [PMID: 20496380 PMCID: PMC6871127 DOI: 10.1002/hbm.21069] [Citation(s) in RCA: 90] [Impact Index Per Article: 6.4] [Received: 02/08/2010] [Revised: 03/09/2010] [Accepted: 03/10/2010] [Indexed: 12/25/2022] Open
Abstract
Population neuroscience endeavours to identify environmental and genetic factors that shape the function and structure of the human brain; it uses tools and knowledge of genetics, epidemiology, and cognitive neuroscience. Here, I focus on the application of population neuroscience in studies of brain development. By describing in some detail four existing large-scale magnetic resonance (MR) imaging studies of typically developing children and adolescents, I provide an overview of their design, including population sampling and recruitment, assessments of environmental and genetic "exposures," and measurements of brain and behavior "outcomes." I then discuss challenges faced by investigators carrying out such MR-based studies, including quality assurance, quality control and intersite coordination, and provide a brief overview of the achievements made so far. I conclude by outlining future directions vis-à-vis population neuroscience, such as design strategies that can be used to evaluate the presence or absence of causality in associations discovered by observational studies.
Affiliation(s)
- Tomás Paus
- Rotman Research Institute, University of Toronto, Toronto, Ontario, Canada.
|
16
|
Lin WY, Schaid DJ. Power comparisons between similarity-based multilocus association methods, logistic regression, and score tests for haplotypes. Genet Epidemiol 2009; 33:183-97. [PMID: 18814307 DOI: 10.1002/gepi.20364] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.8] [Indexed: 11/07/2022]
Abstract
Recently, a genomic distance-based regression for multilocus associations was proposed (Wessel and Schork [2006] Am. J. Hum. Genet. 79:792-806) in which either locus or haplotype scoring can be used to measure genetic distance. Although it allows various measures of genomic similarity and simultaneous analyses of multiple phenotypes, its power relative to other methods for case-control analyses is not well known. We compare the power of traditional methods with this new distance-based approach, for both locus-scoring and haplotype-scoring strategies. We discuss the relative power of these association methods with respect to five properties: (1) the marker informativity; (2) the number of markers; (3) the causal allele frequency; (4) the preponderance of the most common high-risk haplotype; (5) the correlation between the causal single-nucleotide polymorphism (SNP) and its flanking markers. We found that locus-based logistic regression and the global score test for haplotypes suffered from power loss when many markers were included in the analyses, due to many degrees of freedom. In contrast, the distance-based approach was not as vulnerable to more markers or more haplotypes. A genotype counting measure was more sensitive to the marker informativity and the correlation between the causal SNP and its flanking markers. After examining the impact of the five properties on power, we found that on average, the genomic distance-based regression that uses a matching measure for diplotypes was the most powerful and robust method among the seven methods we compared.
Affiliation(s)
- Wan-Yu Lin
- Institute of Epidemiology, National Taiwan University, Taipei, Taiwan.
|
17
|
Gao L, Barnes KC. Recent advances in genetic predisposition to clinical acute lung injury. Am J Physiol Lung Cell Mol Physiol 2009; 296:L713-25. [PMID: 19218355 DOI: 10.1152/ajplung.90269.2008] [Citation(s) in RCA: 84] [Impact Index Per Article: 5.6] [Indexed: 12/25/2022] Open
Abstract
It has been well established that acute lung injury (ALI), and the more severe presentation of acute respiratory distress syndrome (ARDS), constitute complex traits characterized by a multigenic and multifactorial etiology. Identification and validation of genetic variants contributing to disease susceptibility and severity have been hampered by the profound heterogeneity of the clinical phenotype and the role of environmental factors, which include treatment, on outcome. The critical nature of ALI and ARDS, compounded by the impact of phenotypic heterogeneity, has rendered the amassing of sufficiently powered studies especially challenging. Nevertheless, progress has been made in the identification of genetic variants in select candidate genes, which has enhanced our understanding of the specific pathways involved in disease manifestation. Identification of novel candidate genes for which genetic association studies have confirmed a role in disease has been greatly aided by the powerful tool of high-throughput expression profiling. This article will review these studies to date, summarizing candidate genes associated with ALI and ARDS, acknowledging those that have been replicated in independent populations, with a special focus on the specific pathways for which candidate genes identified so far can be clustered.
Affiliation(s)
- Li Gao
- The Johns Hopkins Asthma and Allergy Center, Baltimore, MD 21224, USA
|
18
|
Fisher SA, Lewis CM. Power of genetic association studies in the presence of linkage disequilibrium and allelic heterogeneity. Hum Hered 2008; 66:210-22. [PMID: 18612206 DOI: 10.1159/000143404] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 07/13/2007] [Accepted: 10/30/2007] [Indexed: 12/21/2022] Open
Abstract
OBJECTIVES: The calculation of the power and sample size required for association studies is essential, particularly for follow-up of genome-wide association studies, where much genotyping is required to replicate the original finding and identify the true disease susceptibility mutation. METHODS: In this paper, we derive equations for estimation of sample sizes for the transmission disequilibrium test (TDT) and for case-control studies, in the presence of allelic heterogeneity and indirect association - where the genotyped tagging SNP is in linkage disequilibrium (LD) with the true mutation. Using data from NOD2 and PTPN22, we show that the sample sizes required to detect association may be incorrect when calculated under the assumption of a single mutation and complete LD with the genotyped marker. RESULTS: The true sample sizes may be lower when allelic heterogeneity acts in a recessive model across mutations, or higher when mutations lie on different alleles of a common tagging SNP. CONCLUSION: Calculating power and sample size under a range of realistic models of LD and allelic heterogeneity is essential to ensure that association studies have sufficient power to detect mutations.
Affiliation(s)
- Sheila A Fisher
- Division of Genetics and Molecular Medicine, Institute of Psychiatry, King's College London, London, UK
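To make the cost of indirect association concrete, the sketch below applies the widely used first-order rule that a tagging SNP in LD with a single causal variant needs roughly 1/r² times the sample size of a direct test. This rule is a swapped-in simplification, not the equations derived in the paper above, and it does not cover allelic heterogeneity. The function name and its parameters are hypothetical.

```python
import math


def indirect_sample_size(direct_sample_size, r_squared):
    """Approximate sample size needed when testing a tagging SNP.

    direct_sample_size: size that would suffice if the causal variant
        itself were genotyped.
    r_squared: squared LD correlation between tag SNP and causal variant.
    Assumes a single causal variant (no allelic heterogeneity).
    """
    if not 0.0 < r_squared <= 1.0:
        raise ValueError("r_squared must be in (0, 1]")
    return math.ceil(direct_sample_size / r_squared)
```

For instance, a study sized at 1000 for a direct test would need about 2000 under this rule when r² is 0.5. The abstract's point is that with several mutations on different haplotype backgrounds, the true requirement can depart from this simple scaling in either direction.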
|
19
|
Gu CC, Yu K, Rao DC. Characterization of LD structures and the utility of HapMap in genetic association studies. Adv Genet 2008; 60:407-35. [PMID: 18358328 DOI: 10.1016/s0065-2660(07)00415-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 12/13/2022]
Abstract
The observed distribution of and variation in linkage disequilibrium (LD) with respect to the evolutionary history and disease transmission in a population is the driving force behind the current wave of genome-wide association (GWA) studies of complex human diseases. An extensive literature covers topics from haplotype analysis that utilizes local LD structures in candidate genes and regions to genome-wide organization of LD blocks (neighborhood) that led to the development of the International HapMap Project and panels of "tagSNPs" used by current GWA studies. In this chapter, we examine the scenarios where each of the major types of analysis methods may be applicable and where the current popular genotyping platforms for GWA might fall short. We discuss current association analysis methods by emphasizing their reliance on the local LD structures or the global organization of the LD structures, and highlight the need to consider individual marker information content in large-scale association mapping.
Affiliation(s)
- C Charles Gu
- Division of Biostatistics and Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
|
20
|
Abstract
Multi-locus association analyses, including haplotype-based analyses, can sometimes provide greater power than single-locus analyses for detecting disease susceptibility loci. This potential gain, however, can be compromised by the large number of degrees of freedom caused by irrelevant markers. Exhaustive search for the optimal set of markers might be possible for a small number of markers, yet it is computationally inefficient. In this paper, we present a sequential haplotype scan method to search for combinations of adjacent markers that are jointly associated with disease status. When evaluating each marker, we add markers close to it in a sequential manner: a marker is added if its contribution to the haplotype association with disease is warranted, conditional on current haplotypes. This conditional evaluation is based on the well-known Mantel-Haenszel statistic. We propose two permutation-based methods to evaluate the growing haplotypes: a haplotype method for the combined markers, and a summary method that sums conditional statistics. We compared our proposed methods, the single-locus method, and a sliding window method using simulated data. We also applied our sequential haplotype scan algorithm to experimental data for CYP2D6. The results indicate that the sequential scan procedure can identify a set of adjacent markers whose haplotypes might have strong genetic effects or be in linkage disequilibrium with disease-predisposing variants. As a result, our methods can achieve greater power than the single-locus method, yet are much more computationally efficient than sliding window methods.
Affiliation(s)
- Zhaoxia Yu
- Division of Biostatistics, Department of Health Sciences Research, Mayo Clinic College of Medicine, Rochester, Minnesota 55905, USA
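The Mantel-Haenszel statistic named in the abstract above can be sketched as follows for a set of stratified 2x2 tables. This is the textbook form without continuity correction, not the authors' implementation. Reading each stratum as one current haplotype, with rows as the alleles of the candidate marker and columns as case and control counts, is an interpretation of the abstract; the function name is hypothetical.

```python
import math


def mantel_haenszel(strata):
    """Mantel-Haenszel chi-square (1 df, no continuity correction).

    strata: iterable of 2x2 tables ((a, b), (c, d)), one per stratum,
        e.g. rows = candidate-marker allele, columns = case / control.
    Returns (statistic, p_value).
    """
    sum_diff = 0.0
    sum_var = 0.0
    for (a, b), (c, d) in strata:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum this small carries no information
        expected = (a + b) * (a + c) / n
        variance = ((a + b) * (c + d) * (a + c) * (b + d)
                    / (n * n * (n - 1)))
        sum_diff += a - expected
        sum_var += variance
    if sum_var == 0.0:
        return 0.0, 1.0
    statistic = sum_diff ** 2 / sum_var
    # Upper tail of a chi-square with 1 df: P(Z**2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value
```

A call such as `mantel_haenszel([((10, 20), (20, 10)), ((10, 20), (20, 10))])` pools evidence across two strata. In the paper, significance is assessed by permutation rather than by the asymptotic p-value returned here.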
|
21
|
Liang KH, Wu YJ. Prediction of complex traits based on the epistasis of multiple haplotypes. J Hum Genet 2007; 52:456-463. [PMID: 17427028 DOI: 10.1007/s10038-007-0140-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 11/21/2006] [Accepted: 03/07/2007] [Indexed: 11/28/2022]
Abstract
Analysis of epistasis, or gene-gene interactions, is of particular importance for revealing the molecular mechanisms of complex human diseases. Multiple genes, each of which has a moderate effect, might interact and produce a complex phenotypic trait. In this paper, we present a novel method of epistasis analysis, utilizing multiple phase-resolved haplotypes residing in different genomic regions. Prediction models can then be derived from the epistasis to indicate the susceptibility of a person to a dichotomous phenotypic trait. The simulation results showed that the prediction accuracy of this method depends on the penetrance rate of the underlying model. The computation cost, on the other hand, depends on the number of genomic regions involved in the complex phenotypic trait.
Affiliation(s)
- Kung-Hao Liang
- Vita Genomics, Inc., 7F, No.6, Sec.1, Jungshing Rd., Wugu Shiang, Taipei County, 248, Taiwan.
- Ying-Jye Wu
- Vita Genomics, Inc., 7F, No.6, Sec.1, Jungshing Rd., Wugu Shiang, Taipei County, 248, Taiwan
|
22
|
Lavebratt C, Sengul S. Single nucleotide polymorphism (SNP) allele frequency estimation in DNA pools using Pyrosequencing™. Nat Protoc 2007; 1:2573-82. [PMID: 17406511 DOI: 10.1038/nprot.2006.442] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.2] [Indexed: 11/08/2022]
Abstract
Identifying the genetic variation underlying complex disease requires analysis of many single nucleotide polymorphisms (SNPs) in a large number of samples. Several high-throughput SNP genotyping techniques are available; however, their cost promotes the use of association screening with pooled DNA. This protocol describes the estimation of SNP allele frequencies in pools of DNA using the quantitative sequencing method Pyrosequencing (PSQ). PSQ is a relatively recently described high-throughput method for genotyping, allele frequency estimation and DNA methylation analysis based on the detection of real-time pyrophosphate release during synthesis of the complementary strand to a PCR product. The protocol involves the following steps: (i) quantity and quality assessment of individual DNA samples; (ii) DNA pooling, which may be undertaken at the pre- or post-PCR stage; (iii) PCR amplification of PSQ template containing the variable sequence region of interest; and (iv) PSQ to determine the frequency of alleles at a particular SNP site. Once the quantity and quality of individual DNA samples have been assessed, the protocol usually requires a few days for setting up pre-PCR pools, depending on sample number. After PCR amplification, preparation and analysis of PCR amplicon by PSQ takes 1 h per plate.
Affiliation(s)
- Catharina Lavebratt
- Karolinska Institutet, Neurogenetics Unit, Center for Molecular Medicine (CMM), Karolinska Hospital, 171 76 Stockholm, Sweden.
|
23
|
Hsieh CH, Liang KH, Hung YJ, Huang LC, Pei D, Liao YT, Kuo SW, Bey MSJ, Chen JL, Chen EY. Analysis of epistasis for diabetic nephropathy among type 2 diabetic patients. Hum Mol Genet 2006; 15:2701-8. [PMID: 16893912 DOI: 10.1093/hmg/ddl203] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 11/13/2022] Open
Abstract
Diabetic nephropathy (DN) is one of the most serious complications of diabetes, accounting for the majority of patients with end-stage renal disease. The molecular pathogenesis of DN involves multiple pathways in a complex, partially resolved manner. The paper presents an exploratory epistatic study for DN. Association analyses were performed on 231 SNP loci in a cohort of 264 type 2 diabetes patients, followed by epistasis analysis using multifactor dimensionality reduction and a genetic algorithm with Boolean algebra. A two-locus epistatic effect of EGFR and RXRG was identified, with a cross-validation consistency of 91.7%.
Affiliation(s)
- Chang-Hsun Hsieh
- Division of Endocrinology and Metabolism, Department of Internal Medicine, Tri-Service General Hospital, Taipei, Taiwan
|
24
|
Lemire M. SUP: an extension to SLINK to allow a larger number of marker loci to be simulated in pedigrees conditional on trait values. BMC Genet 2006; 7:40. [PMID: 16803631 PMCID: PMC1524809 DOI: 10.1186/1471-2156-7-40] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Received: 03/23/2006] [Accepted: 07/03/2006] [Indexed: 12/22/2022] Open
Abstract
Background: With the recent advances in high-throughput genotyping technologies that allow for large-scale association mapping of human complex traits, promising statistical designs and methods have been emerging. Efficient simulation software is a key element for the evaluation of the properties of new statistical tests. SLINK is a flexible simulation tool that has been widely used to generate the segregation and recombination processes of markers linked to, and possibly associated with, a trait locus, conditional on trait values in arbitrary pedigrees. In practice, its most serious limitation is the small number of loci that can be simulated, since the complexity of the algorithm scales exponentially with this number. Results: I describe the implementation of a two-step algorithm to be used in conjunction with SLINK to enable the simulation of a large number of marker loci linked to a trait locus and conditional on trait values in families, with the possibility for the loci to be in linkage disequilibrium. SLINK is used in the first step to simulate genotypes at the trait locus conditional on the observed trait values, and also to generate an indicator of the descent path of the simulated alleles. In the second step, marker alleles or haplotypes are generated in the founders, conditional on the trait locus genotypes simulated in the first step. Then the recombination process between the marker loci takes place conditionally on the descent path and on the trait locus genotypes. This two-step implementation is often computationally faster than other software that is designed to generate marker data linked to, and possibly associated with, a trait locus.
Conclusion: Because the proposed method uses SLINK to simulate the segregation process, it benefits from its flexibility: the trait may be qualitative with the possibility of defining different liability classes (which allows for the simulation of gene-environment interactions or even the simulation of multi-locus effects between unlinked susceptibility regions) or it may be quantitative and normally distributed. In particular, this implementation is the only one available that can generate a large number of marker loci conditional on the set of observed quantitative trait values in pedigrees.
Affiliation(s)
- Mathieu Lemire
- McGill University and Genome Quebec Innovation Centre, Montreal, Canada.
|