1
Li J, Lee M, Davis BW, Lamichhaney S, Dorshorst BJ, Siegel PB, Andersson L. Mutations Upstream of the TBX5 and PITX1 Transcription Factor Genes Are Associated with Feathered Legs in the Domestic Chicken. Mol Biol Evol 2021; 37:2477-2486. [PMID: 32344431 PMCID: PMC7475036 DOI: 10.1093/molbev/msaa093] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 12/12/2022]
Abstract
Feathered leg is a trait in domestic chickens that has undergone intense selection by fancy breeders. Previous studies have shown that two major loci controlling feathered leg are located on chromosomes 13 and 15. Here, we present genetic evidence for the identification of candidate causal mutations at these loci. This was accomplished by combining classical linkage mapping using an experimental cross segregating for feathered leg and high-resolution identical-by-descent mapping using whole-genome sequence data from 167 samples of chicken with or without feathered legs. The first predicted causal mutation is a single-base change located 25 kb upstream of the gene for the forelimb-specific transcription factor TBX5 on chromosome 15. The second is a 17.7-kb deletion located ∼200 kb upstream of the gene for the hindlimb-specific transcription factor PITX1 on chromosome 13. These mutations are predicted to activate TBX5 and repress PITX1 expression, respectively. The study reveals a remarkable convergence in the evolution of the feathered-leg phenotype in domestic chickens and domestic pigeons, as this phenotype is caused by noncoding mutations upstream of the same two genes. Furthermore, the PITX1 causal variants are large overlapping deletions, 17.7 kb in chicken and 44 kb in pigeons. The results of the present study are consistent with the previously proposed model for pigeon that feathered leg is caused by reduced PITX1 expression and ectopic expression of TBX5 in hindlimb buds resulting in a shift of limb identity from hindlimb to more forelimb-like identity.
Affiliation(s)
- Jingyi Li
- Department of Veterinary Integrative Biosciences, College of Veterinary Medicine and Biomedical Sciences, Texas A&M University, College Station, TX
- Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA
- MiOk Lee
- Department of Veterinary Integrative Biosciences, College of Veterinary Medicine and Biomedical Sciences, Texas A&M University, College Station, TX
- Brian W Davis
- Department of Veterinary Integrative Biosciences, College of Veterinary Medicine and Biomedical Sciences, Texas A&M University, College Station, TX
- Sangeet Lamichhaney
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Ben J Dorshorst
- Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA
- Paul B Siegel
- Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA
- Leif Andersson
- Department of Veterinary Integrative Biosciences, College of Veterinary Medicine and Biomedical Sciences, Texas A&M University, College Station, TX
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Uppsala, Sweden
2
Li J, Bed’hom B, Marthey S, Valade M, Dureux A, Moroldo M, Péchoux C, Coville J, Gourichon D, Vieaud A, Dorshorst B, Andersson L, Tixier‐Boichard M. A missense mutation in TYRP1 causes the chocolate plumage color in chicken and alters melanosome structure. Pigment Cell Melanoma Res 2018; 32:381-390. [DOI: 10.1111/pcmr.12753] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 03/04/2017] [Revised: 09/19/2018] [Accepted: 10/02/2018] [Indexed: 12/30/2022]
Affiliation(s)
- Jingyi Li
- Department of Animal and Poultry Sciences Virginia Tech Blacksburg Virginia
- Department of Veterinary Integrative Biosciences, College of Veterinary Medicine and Biomedical Sciences Texas A&M University College Station Texas
- Bertrand Bed’hom
- GABI, AgroParisTech, INRA Université Paris‐Saclay Jouy‐en‐Josas France
- Sylvain Marthey
- GABI, AgroParisTech, INRA Université Paris‐Saclay Jouy‐en‐Josas France
- Mathieu Valade
- GABI, AgroParisTech, INRA Université Paris‐Saclay Jouy‐en‐Josas France
- Audrey Dureux
- GABI, AgroParisTech, INRA Université Paris‐Saclay Jouy‐en‐Josas France
- Marco Moroldo
- GABI, AgroParisTech, INRA Université Paris‐Saclay Jouy‐en‐Josas France
- Christine Péchoux
- GABI, AgroParisTech, INRA Université Paris‐Saclay Jouy‐en‐Josas France
- Jean‐Luc Coville
- GABI, AgroParisTech, INRA Université Paris‐Saclay Jouy‐en‐Josas France
- Agathe Vieaud
- GABI, AgroParisTech, INRA Université Paris‐Saclay Jouy‐en‐Josas France
- Ben Dorshorst
- Department of Animal and Poultry Sciences Virginia Tech Blacksburg Virginia
- Leif Andersson
- Department of Veterinary Integrative Biosciences, College of Veterinary Medicine and Biomedical Sciences Texas A&M University College Station Texas
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology Uppsala University Uppsala Sweden
- Department of Animal Breeding and Genetics Swedish University of Agricultural Sciences Uppsala Sweden
3
Fowdar JY, Grealy R, Lu Y, Griffiths LR. A genome-wide association study of essential hypertension in an Australian population using a DNA pooling approach. Mol Genet Genomics 2016; 292:307-324. [PMID: 27866268 DOI: 10.1007/s00438-016-1274-0] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 06/15/2016] [Accepted: 11/10/2016] [Indexed: 01/11/2023]
Abstract
Despite the success of genome-wide association studies (GWAS) in detecting genetic loci involved in complex traits, few susceptibility genes have been detected for essential hypertension (EH). We aimed to use a pooled DNA GWAS approach to identify and validate novel genomic loci underlying EH susceptibility in an Australian case-control population. Blood samples and questionnaires detailing medical history, blood pressure, and prescribed medications were collected for 409 hypertensives and 409 age-, sex- and ethnicity-matched normotensive controls. Case and control DNA were pooled in quadruplicate and hybridized to Illumina 1M-Duo arrays. Allele frequencies agreed with those reported in reference data, and known EH association signals were represented in the top-ranked SNPs more frequently than expected by chance. Validation showed that pooled DNA GWAS gave reliable estimates of case and control allele frequencies. Although no markers reached Bonferroni-corrected genome-wide significance levels (5.0 × 10⁻⁸), the top marker rs34870220 near ASGR1 approached significance (p = 4.32 × 10⁻⁷), as did several candidate loci (p < 1 × 10⁻⁶) on chromosomes 2, 4, 6, 9, 12, and 17. Four markers (located in or near genes NHSL1, NFKB1, GLI2, and LRRC10) from the top ten ranked SNPs were individually genotyped in pool samples and were tested for association between cases and controls using the χ² test. Of these, rs1599961 (NFKB1) and rs12711538 (GLI2) showed a significant difference between cases and controls (p < 0.01). Additionally, four top-ranking markers within NFKB1 were found to be in LD, suggesting a single strong association signal for this gene.
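The single-marker test mentioned in this abstract can be illustrated with a minimal sketch: a Pearson χ² statistic on a 2 × 2 table of allele counts, plus a Bonferroni threshold. This is not the authors' pipeline. The function names `allelic_chi2` and `bonferroni_threshold`, the allele counts, and the figure of one million tests are hypothetical, chosen only so that 0.05 divided by the number of tests gives the genome-wide level quoted in the abstract.

```python
import math


def allelic_chi2(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table."""
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


# Hypothetical allele counts: 409 cases and 409 controls give 818 alleles per group.
chi2, p = allelic_chi2(case_alt=300, case_ref=518, ctrl_alt=240, ctrl_ref=578)
print(round(chi2, 2))  # 9.95
print(p < 0.01)        # True
# Assumed number of tests (one million), not a figure taken from the abstract.
print(bonferroni_threshold(0.05, 1_000_000))
```

With these made-up counts the marker would pass a nominal p < 0.01 cut-off while remaining far from the Bonferroni-corrected level, which mirrors the situation the abstract describes.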
Affiliation(s)
- Javed Y Fowdar
- School of Medical Science, Griffith University, Gold Coast, Australia
- Rebecca Grealy
- School of Medical Science, Griffith University, Gold Coast, Australia
- Yi Lu
- Genetic Epidemiology Department, Queensland Institute of Medical Research, Brisbane, Australia
- Lyn R Griffiths
- Genomics Research Centre, Institute of Health and Biomedical Innovation, School of Biomedical Sciences, Queensland University of Technology, 60 Musk Ave, Kelvin Grove, Brisbane, QLD, 4059, Australia
4
Mafra F, Mazzotti D, Pellegrino R, Bianco B, Barbosa CP, Hakonarson H, Christofolini D. Copy number variation analysis reveals additional variants contributing to endometriosis development. J Assist Reprod Genet 2016; 34:117-124. [PMID: 27817035 DOI: 10.1007/s10815-016-0822-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 05/24/2016] [Accepted: 09/22/2016] [Indexed: 01/21/2023]
Abstract
PURPOSE: Endometriosis is a gynecological disease influenced by multiple genetic and environmental factors. The aim of the current study was to use SNP-array technology to identify genomic aberrations that may contribute to the development of endometriosis. METHODS: We performed SNP-array genotyping of pooled DNA samples from both patients (n = 100) and controls (n = 50). Copy number variation (CNV) calling and association analyses were performed using PennCNV software. MLPA and TaqMan Copy-Number assays were used to validate the CNVs discovered. RESULTS: We detected 49 CNV loci that were present in patients with endometriosis and absent in the control group. After validation procedures, we confirmed six CNV loci in the subtelomeric regions, including 1p36.33, 16p13.3, 19p13.3, and 20p13, representing gains, while 17q25.3 and 20q13.33 showed losses. Among the intrachromosomal regions, our results revealed duplication at 19q13.1 within the FCGBP gene (p = 0.007). CONCLUSIONS: We identified CNVs previously associated with endometriosis, together with six suggestive novel loci possibly involved in this disease. The intergenic locus on chromosome 19q13.1 shows strong association with endometriosis and is under further functional investigation.
Affiliation(s)
- Fernanda Mafra
- Collective Health Department, Division of Sexual and Reproductive Health Care and Population Genetics, Faculdade de Medicina do ABC, Santo André, SP, Brazil
- Center for Applied Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Diego Mazzotti
- Center for Applied Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Renata Pellegrino
- Center for Applied Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Bianca Bianco
- Collective Health Department, Division of Sexual and Reproductive Health Care and Population Genetics, Faculdade de Medicina do ABC, Santo André, SP, Brazil
- Caio Parente Barbosa
- Collective Health Department, Division of Sexual and Reproductive Health Care and Population Genetics, Faculdade de Medicina do ABC, Santo André, SP, Brazil
- Hakon Hakonarson
- Center for Applied Genomics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Denise Christofolini
- Collective Health Department, Division of Sexual and Reproductive Health Care and Population Genetics, Faculdade de Medicina do ABC, Santo André, SP, Brazil
5
Wein SA, Laviano A, Wolffram S. Quercetin induces hepatic γ-glutamyl hydrolase expression in rats by suppressing hepatic microRNA rno-miR-125b-3p. J Nutr Biochem 2015; 26:1660-3. [PMID: 26432773 DOI: 10.1016/j.jnutbio.2015.08.010] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 04/09/2015] [Revised: 08/05/2015] [Accepted: 08/06/2015] [Indexed: 12/17/2022]
Abstract
Exogenous factors such as food components including the flavonoid quercetin are suspected to influence microRNA (miRNA) concentrations and thus possibly target enzymes involved in xenobiotic metabolism. This study therefore investigates the influence of orally administered quercetin on hepatic miRNA and the identification of enzyme target mRNAs relevant in drug metabolism. Male Wistar rats (n=16) were fed either a diet without (C) or with (Q) the addition of 100-ppm quercetin for 7 weeks and subsequently euthanized at the end of the dark phase. To avoid strong effects of food deprivation on hepatic metabolism, food was not removed until 5 h prior to the procedure. Liver was immediately dissected and snap-frozen in liquid nitrogen. Concentrations of 352 hepatic miRNAs were measured in pooled samples of each dietary group (n=8) using the RT² miRNA PCR Array System. Differential expression of miRNAs was assumed with fold changes ≥3. Target genes of differentially expressed miRNAs were identified using the database TargetScan. Because rno-miR-125b-3p showed the most prominent fold change (−9), we further analyzed the expression of its top predicted target gene gamma-glutamyl hydrolase (GGH) by quantitative real-time PCR using hypoxanthine phosphoribosyltransferase 1 (hprt1) as endogenous control. Compared to controls, 23 miRNAs were differentially expressed in rats fed quercetin. A ninefold reduction in hepatic miRNA rno-miR-125b-3p was paralleled by significant induction of GGH mRNA in the liver of quercetin-fed rats. Because increased GGH expression has repeatedly been associated with resistance to methotrexate, concomitant intake with quercetin should be monitored carefully.
Affiliation(s)
- Silvia Anette Wein
- Institute of Animal Nutrition & Physiology, Christian-Albrechts-University of Kiel, Hermann-Rodewald-Str. 9, 24118 Kiel, Germany
- Alessandro Laviano
- Department of Clinical Medicine, Sapienza University, Viale del Policlinico 155, 00161 Rome, Italy
- Siegfried Wolffram
- Institute of Animal Nutrition & Physiology, Christian-Albrechts-University of Kiel, Hermann-Rodewald-Str. 9, 24118 Kiel, Germany
6
Valverde G, Zhou H, Lippold S, de Filippo C, Tang K, López Herráez D, Li J, Stoneking M. A novel candidate region for genetic adaptation to high altitude in Andean populations. PLoS One 2015; 10:e0125444. [PMID: 25961286 PMCID: PMC4427407 DOI: 10.1371/journal.pone.0125444] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Received: 12/02/2014] [Accepted: 03/12/2015] [Indexed: 02/07/2023]
Abstract
Humans living at high altitude (≥2,500 meters above sea level) have acquired unique abilities to survive the associated extreme environmental conditions, including hypoxia, cold temperature, limited food availability and high levels of free radicals and oxidants. Long-term inhabitants of the most elevated regions of the world have undergone extensive physiological and/or genetic changes, particularly in the regulation of respiration and circulation, when compared to lowland populations. Genome scans have identified candidate genes involved in altitude adaptation in the Tibetan Plateau and the Ethiopian highlands, in contrast to populations from the Andes, which have not been as intensively investigated. In the present study, we focused on three indigenous populations from Bolivia: two groups of Andean natives, Aymara and Quechua, and the low-altitude control group of Guarani from the Gran Chaco lowlands. Using pooled samples, we identified a number of SNPs exhibiting large allele frequency differences over 900,000 genotyped SNPs. A region in chromosome 10 (within the cytogenetic bands q22.3 and q23.1) was significantly differentiated between highland and lowland groups. We resequenced ~1.5 Mb surrounding the candidate region and identified strong signals of positive selection in the highland populations. A composite-of-multiple-signals-like test localized the signal to FAM213A and a related enhancer; the product of this gene acts as an antioxidant to lower oxidative stress and may help to maintain bone mass. The results suggest that positive selection on the enhancer might increase the expression of this antioxidant, and thereby prevent oxidative damage. In addition, the most significant signal in a relative extended haplotype homozygosity analysis was localized to the SFTPD gene, which encodes a surfactant pulmonary-associated protein involved in normal respiration and innate host defense. Our study thus identifies two novel candidate genes and associated pathways that may be involved in high-altitude adaptation in Andean populations.
Affiliation(s)
- Guido Valverde
- Australian Centre for Ancient DNA, School of Earth & Environmental Sciences, The University of Adelaide, Adelaide, Australia
- Hang Zhou
- Department of Computational Regulatory Genomics, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Shanghai, China
- Sebastian Lippold
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Cesare de Filippo
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Kun Tang
- Department of Computational Regulatory Genomics, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Shanghai, China
- David López Herráez
- Department Effect-Directed Analysis, Helmholtz Centre for Environmental Research—UFZ, Leipzig, Germany
- Jing Li
- Department of Computational Regulatory Genomics, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Shanghai, China
- Mark Stoneking
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
7
Lu LQ, Liao W. Screening and functional pathway analysis of genes associated with pediatric allergic asthma using a DNA microarray. Mol Med Rep 2015; 11:4197-203. [PMID: 25633562 PMCID: PMC4394950 DOI: 10.3892/mmr.2015.3277] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 04/21/2014] [Accepted: 01/02/2015] [Indexed: 12/29/2022]
Abstract
The present study aimed to identify differentially expressed genes (DEGs) associated with pediatric allergic asthma, and to analyze the functional pathways of the selected target genes, in order to explore the pathogenesis of the disease. The GSE18965 gene expression profile was downloaded from the Gene Expression Omnibus database and was preprocessed. This gene expression profile consisted of seven normal samples and nine samples from patients with pediatric allergic asthma. The DEGs between the normal and pediatric allergic asthma samples were screened using the limma package in R, and the cut‑off value was set at false discovery rate <0.05 and log fold change >1. Following hierarchical clustering of the DEGs based on the expression profiles, the up‑ and downregulated genes underwent a functional enrichment analysis by topological approach (P<0.05), using the Database for Annotation, Visualization and Integrated Discovery. A total of 127 DEGs were identified between the normal and pediatric allergic asthma samples. The up‑ and downregulated genes were significantly enriched in the actin filament‑based process and the monosaccharide metabolic process, respectively. Seven downregulated DEGs (M6PR, TPP1, GLB1, NEU1, ACP2, LAMP1 and HGSNAT) were identified in the lysosomal pathway, with P = 6.4 × 10⁻⁹. These results suggested that variation in lysosomal function, triggered by the seven downregulated genes, may lead to aberrant functioning of the T lymphocytes, resulting in asthma. Further research regarding the treatment of pediatric allergic asthma through targeting lysosomal function is required.
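The screening cut-off quoted in this abstract (false discovery rate below 0.05 and log fold change above 1) can be sketched as a filter over per-gene statistics. The study used the limma package in R. The Python below is only an illustration of the filtering step and does not reproduce limma's moderated statistics. The function names, gene labels and input values are hypothetical, and the sketch assumes that the false discovery rate is the Benjamini-Hochberg adjusted p-value and that the fold-change cut-off applies to the absolute log fold change.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, p_values, log_fc, fdr_cutoff=0.05, lfc_cutoff=1.0):
    """Keep genes with adjusted p < fdr_cutoff and |logFC| > lfc_cutoff.

    Using the absolute log fold change is an assumption made here so that
    both up- and downregulated genes can pass the filter.
    """
    fdr = benjamini_hochberg(p_values)
    return [
        gene
        for gene, q, lfc in zip(genes, fdr, log_fc)
        if q < fdr_cutoff and abs(lfc) > lfc_cutoff
    ]


# Made-up example values for four hypothetical genes.
genes = ["g1", "g2", "g3", "g4"]
p_values = [0.001, 0.01, 0.03, 0.5]
log_fc = [1.5, -2.0, 0.4, 3.0]
print(select_degs(genes, p_values, log_fc))  # ['g1', 'g2']
```

In this toy input, g3 fails on fold change and g4 fails on the adjusted p-value, so only g1 and g2 remain.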
Affiliation(s)
- Li-Qun Lu
- Department of Pediatrics, First Hospital Affiliated to Chengdu Medical College, Chengdu, Sichuan 610500, P.R. China
- Wei Liao
- Department of Pediatrics, Southwest Hospital, The Third Military Medical University, Chongqing 400038, P.R. China
8
Guo Y, Cai Q, Li C, Li J, Courtney R, Zheng W, Long J. An evaluation of allele frequency estimation accuracy using pooled sequencing data. Int J Comput Biol Drug Des 2013; 6:279-93. [PMID: 24088264 DOI: 10.1504/ijcbdd.2013.056709] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/21/2022]
Abstract
Next generation sequencing technology has matured, and with its current affordability, will replace the SNP chip as the genotyping tool of choice. Even with the current affordability of NGS, large scale studies will require careful study design to reduce cost. In this study, we designed an experiment to assess the accuracy of allele frequency estimated from pooled sequencing data. We compared the allele frequency estimated from sequencing data with the allele frequency estimated from individual SNP chip data and observed high correlations between them. However, by calculating error rate, we found that many SNPs had their allele frequency estimated from sequencing data significantly different from allele frequency estimated from SNP chip data. In conclusion, we found correlation is not an ideal measurement for comparing allele frequencies. And for the purpose of estimating allele frequency, we do not recommend using pooling with NGS as a cheaper alternative to genotype each sample individually.
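The abstract's point that a high correlation can coexist with large per-SNP errors can be seen in a small made-up example. The sketch below is not the authors' analysis. The function names and both frequency lists are hypothetical, and "relative error" is taken here to mean the absolute difference divided by the reference frequency.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def relative_errors(reference, estimate):
    """Per-SNP |estimate - reference| / reference."""
    return [abs(e - r) / r for r, e in zip(reference, estimate)]


# Hypothetical allele frequencies: chip values as reference, and pooled
# sequencing estimates that are all shifted upward by 0.03.
chip = [0.05, 0.10, 0.20, 0.30, 0.40]
pooled = [0.08, 0.13, 0.23, 0.33, 0.43]

print(pearson(chip, pooled) > 0.999)             # True
print(max(relative_errors(chip, pooled)) > 0.5)  # True
```

A constant offset leaves the correlation essentially perfect, yet the rarest allele in this toy data is overestimated by about 60%.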
Affiliation(s)
- Yan Guo
- Department of Cancer Biology, Vanderbilt University, Nashville TN 37232, USA
9
Evaluation of allele frequency estimation using pooled sequencing data simulation. ScientificWorldJournal 2013; 2013:895496. [PMID: 23476151 PMCID: PMC3582166 DOI: 10.1155/2013/895496] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 11/15/2012] [Accepted: 12/30/2012] [Indexed: 11/17/2022]
Abstract
Next-generation sequencing (NGS) technology has provided researchers with opportunities to study the genome in unprecedented detail. In particular, NGS is applied to disease association studies. Unlike genotyping chips, NGS is not limited to a fixed set of SNPs. Prices for NGS are now comparable to the SNP chip, although for large studies the cost can be substantial. Pooling techniques are often used to reduce the overall cost of large-scale studies. In this study, we designed a rigorous simulation model to test the practicability of estimating allele frequency from pooled sequencing data. We took crucial factors into consideration, including pool size, overall depth, average depth per sample, pooling variation, and sampling variation. We used real data to demonstrate and measure reference allele preference in DNAseq data and implemented this bias in our simulation model. We found that pooled sequencing data can introduce high levels of relative error rate (defined as error rate divided by targeted allele frequency) and that the error rate is more severe for low minor allele frequency SNPs than for high minor allele frequency SNPs. In order to overcome the error introduced by pooling, we recommend a large pool size and high average depth per sample.
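The kind of simulation this abstract describes can be sketched in a few lines, under strong simplifications. The code below models only two of the listed factors, sampling variation (which chromosomes enter the pool) and read sampling at a given depth. It omits pooling variation and reference allele preference. It is not the authors' model: the function names, the parameter values and the choice of absolute deviation as the "error" are assumptions made for illustration.

```python
import random


def simulate_pooled_estimate(true_freq, pool_size, depth, rng):
    """One simulated pooled-sequencing estimate of an allele frequency."""
    # Sampling variation: draw 2 * pool_size chromosomes (diploid samples).
    n_chrom = 2 * pool_size
    alt_in_pool = sum(rng.random() < true_freq for _ in range(n_chrom))
    pool_freq = alt_in_pool / n_chrom
    # Read sampling: each read carries the alternate allele with
    # probability equal to the allele frequency within the pool.
    alt_reads = sum(rng.random() < pool_freq for _ in range(depth))
    return alt_reads / depth


def mean_relative_error(true_freq, pool_size, depth, n_reps=500, seed=0):
    """Average of |estimate - true_freq| / true_freq over n_reps simulations."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_reps):
        estimate = simulate_pooled_estimate(true_freq, pool_size, depth, rng)
        total += abs(estimate - true_freq) / true_freq
    return total / n_reps


# Hypothetical settings: a rare and a common allele, small and large designs.
rare_small = mean_relative_error(0.02, pool_size=50, depth=200)
common_small = mean_relative_error(0.40, pool_size=50, depth=200)
rare_large = mean_relative_error(0.02, pool_size=500, depth=2000)
print(rare_small > common_small)  # True
print(rare_large < rare_small)    # True
```

Even this reduced model shows the two qualitative trends reported in the abstract: relative error is larger for low-frequency alleles, and it shrinks as pool size and depth grow.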
10
The efficacy of detecting variants with small effects on the Affymetrix 6.0 platform using pooled DNA. Hum Genet 2011; 130:607-21. [PMID: 21424828 DOI: 10.1007/s00439-011-0974-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 11/09/2010] [Accepted: 03/06/2011] [Indexed: 01/10/2023]
Abstract
Genome-wide genotyping of a cohort using pools rather than individual samples has long been proposed as a cost-saving alternative for performing genome-wide association (GWA) studies. However, successful disease gene mapping using pooled genotyping has thus far been limited to detecting common variants with large effect sizes, which tend not to exist for many complex common diseases or traits. Therefore, for DNA pooling to be a viable strategy for conducting GWA studies, it is important to determine whether commonly used genome-wide SNP array platforms such as the Affymetrix 6.0 array can reliably detect common variants of small effect sizes using pooled DNA. Taking obesity and age at menarche as examples of human complex traits, we assessed the feasibility of genome-wide genotyping of pooled DNA as a single-stage design for phenotype association. By individually genotyping the top associations identified by pooling, we obtained a 14- to 16-fold enrichment of SNPs nominally associated with the phenotype, but we likely missed the top true associations. In addition, we assessed whether genotyping pooled DNA can serve as an inexpensive screen as the second stage of a multi-stage design with a large number of samples by comparing the most cost-effective 3-stage designs with 80% power to detect common variants with genotypic relative risk of 1.1, with and without pooling. Given the current state of the specific technology we employed and the associated genotyping costs, we showed through simulation that a design involving pooling would be 1.07 times more expensive than a design without pooling. Thus, while a significant amount of information exists within the data from pooled DNA, our analysis does not support genotyping pooled DNA as a means to efficiently identify common variants contributing small effects to phenotypes of interest. While our conclusions were based on the specific technology and study design we employed, the approach presented here will be useful for evaluating the utility of other or future genome-wide genotyping platforms in pooled DNA studies.
11
Ricci G, Astolfi A, Remondini D, Cipriani F, Formica S, Dondi A, Pession A. Pooled genome-wide analysis to identify novel risk loci for pediatric allergic asthma. PLoS One 2011; 6:e16912. [PMID: 21359210 PMCID: PMC3040188 DOI: 10.1371/journal.pone.0016912] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Received: 09/15/2010] [Accepted: 01/03/2011] [Indexed: 11/22/2022]
Abstract
Background: Genome-wide association studies of pooled DNA samples have been shown to be a valuable tool for identifying candidate SNPs associated with a phenotype. No such study has so far been applied to childhood allergic asthma, even though the very high complexity of asthma genetics makes it an appropriate field in which to explore the potential of the pooled GWAS approach. Methodology/Principal Findings: We performed a pooled GWAS and individual genotyping in 269 children with allergic respiratory diseases, comparing allergic children with and without asthma. We used a modular approach to identify the most significant loci associated with asthma by combining silhouette statistics and a physical distance method with cluster-adapted thresholding. We found 97% concordance between pooled GWAS and individual genotyping, with 36 out of 37 top-scoring SNPs significant at the individual genotyping level. The most significant SNP is located inside the coding sequence of C5, an already identified asthma susceptibility gene, while the other loci regulate functions that are relevant to bronchial physiopathology, such as immune- or inflammation-mediated mechanisms and airway smooth muscle contraction. Integration with gene expression data showed that almost half of the putative susceptibility genes are differentially expressed in experimental asthma mouse models. Conclusion/Significance: Combined silhouette statistics and cluster-adapted physical distance threshold analysis of pooled GWAS data is an efficient method for identifying candidate SNPs associated with asthma development in an allergic pediatric population.
Affiliation(s)
- Giampaolo Ricci
- Pediatric Unit, Department of Gynecologic, Obstetric and Pediatric Sciences, University of Bologna, Bologna, Italy
12
Identification of a common variant affecting human episodic memory performance using a pooled genome-wide association approach: a case study of disease gene identification. Methods Mol Biol 2011; 700:261-9. [PMID: 21204039 DOI: 10.1007/978-1-61737-954-3_17] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 12/31/2022]
Abstract
Genome-wide association studies (GWAS) are an important tool for discovering novel genes associated with disease or traits. Careful design of case-control groups greatly facilitates the efficacy of these studies. Here we describe a pooled GWAS study undertaken to find novel genes associated with human episodic memory performance. A genomic locus for the WW and C2 domain-containing 1 protein, KIBRA (also known as WWC1), was found to be associated with memory performance in three cognitively normal cohorts from Switzerland and the USA. This result was further supported by correlation of KIBRA genotype and differences in hippocampal activation as measured by functional magnetic resonance imaging (fMRI). These findings provide an excellent example of the application of GWAS using a pooled genomic DNA approach to successfully identify a locus with strong effects on human memory.
13
Rueppell O, Metheny JD, Linksvayer T, Fondrk MK, Page RE, Amdam GV. Genetic architecture of ovary size and asymmetry in European honeybee workers. Heredity (Edinb) 2010; 106:894-903. [PMID: 21048673 DOI: 10.1038/hdy.2010.138] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Indexed: 11/09/2022]
Abstract
The molecular basis of complex traits is increasingly understood but a remaining challenge is to identify their co-regulation and inter-dependence. Pollen hoarding (pln) in honeybees is a complex trait associated with a well-characterized suite of linked behavioral and physiological traits. In European honeybee stocks bidirectionally selected for pln, worker (sterile helper) ovary size is pleiotropically affected by quantitative trait loci that were initially identified for their effect on foraging behavior. To gain a better understanding of the genetic architecture of worker ovary size in this model system, we analyzed a series of crosses between the selected strains. The crossing results were heterogeneous and suggested non-additive effects. Three significant and three suggestive quantitative trait loci of relatively large effect sizes were found in two reciprocal backcrosses. These loci are not located in genome regions of known effects on foraging behavior but contain several interesting candidate genes that may specifically affect worker-ovary size. Thus, the genetic architecture of this life history syndrome may be comprised of pleiotropic, central regulators that influence several linked traits and other genetic factors that may be downstream and trait specific.
Affiliation(s)
- O Rueppell
- Department of Biology, University of North Carolina at Greensboro, 1000 Spring Garden Street, Greensboro, NC 27403, USA
14
|
DNA methylation profiling using bisulfite-based epityping of pooled genomic DNA. Methods 2010; 52:255-8. [DOI: 10.1016/j.ymeth.2010.06.017] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Received: 05/19/2010] [Revised: 06/24/2010] [Accepted: 06/25/2010] [Indexed: 12/16/2022] Open
|
15
|
Li HH, Zhang LY, Wang JK. Analysis and Answers to Frequently Asked Questions in Quantitative Trait Locus Mapping. Zuowu Xuebao 2010. [DOI: 10.3724/sp.j.1006.2010.00918] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Indexed: 11/25/2022]
|
16
|
Yang HC, Lin HC, Huang MC, Li LH, Pan WH, Wu JY, Chen YT. A new analysis tool for individual-level allele frequency for genomic studies. BMC Genomics 2010; 11:415. [PMID: 20602748 PMCID: PMC2996943 DOI: 10.1186/1471-2164-11-415] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 03/01/2010] [Accepted: 07/05/2010] [Indexed: 01/23/2023] Open
Abstract
Background Allele frequency is one of the most important population indices and has been broadly applied to genetic/genomic studies. Estimation of allele frequency using genotypes is convenient but may lose data information and be sensitive to genotyping errors. Results This study utilizes a unified intensity-measuring approach to estimating individual-level allele frequencies for 1,104 and 1,270 samples genotyped with the single-nucleotide-polymorphism arrays of the Affymetrix Human Mapping 100K and 500K Sets, respectively. Allele frequencies of all samples are estimated and adjusted by coefficients of preferential amplification/hybridization (CPA), and large ethnicity-specific and cross-ethnicity databases of CPA and allele frequency are established. The results show that using the CPA significantly improves the accuracy of allele frequency estimates; moreover, this paramount factor is insensitive to the time of data acquisition, effect of laboratory site, type of gene chip, and phenotypic status. Based on accurate allele frequency estimates, analytic methods based on individual-level allele frequencies are developed and successfully applied to discover genomic patterns of allele frequencies, detect chromosomal abnormalities, classify sample groups, identify outlier samples, and estimate the purity of tumor samples. The methods are packaged into a new analysis tool, ALOHA (Allele-frequency/Loss-of-heterozygosity/Allele-imbalance). Conclusions This is the first time that these important genetic/genomic applications have been simultaneously conducted by the analyses of individual-level allele frequencies estimated by a unified intensity-measuring approach. We expect that additional practical applications for allele frequency analysis will be found. The developed databases and tools provide useful resources for human genome analysis via high-throughput single-nucleotide-polymorphism arrays. 
The ALOHA software was written in R and R GUI and can be downloaded at http://www.stat.sinica.edu.tw/hsinchou/genetics/aloha/ALOHA.htm.
Affiliation(s)
- Hsin-Chou Yang
- Institute of Statistical Science, Academia Sinica, Taipei 115, Taiwan.
|
17
|
Viding E, Hanscombe KB, Curtis CJC, Davis OSP, Meaburn EL, Plomin R. In search of genes associated with risk for psychopathic tendencies in children: a two-stage genome-wide association study of pooled DNA. J Child Psychol Psychiatry 2010; 51:780-8. [PMID: 20345837 DOI: 10.1111/j.1469-7610.2010.02236.x] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.7] [Indexed: 11/26/2022]
Abstract
BACKGROUND Quantitative genetic data from our group indicate that antisocial behaviour (AB) is strongly heritable when coupled with psychopathic, callous-unemotional (CU) personality traits. We have also demonstrated that the genetic influences for AB and CU overlap considerably. We conducted a genome-wide association scan that capitalises on these findings in an attempt to identify quantitative trait loci (QTLs) that may increase risk for psychopathic tendencies (AB+/CU+). METHODS Teacher ratings at age 7 were used to screen 8374 twins with available DNA samples for individuals who were high vs. low on both AB and CU. In Stage 1, we screened for allele frequency differences in 642,432 autosomal single-nucleotide polymorphisms (SNPs) using the Affymetrix 6.0 GeneChip with pooled DNA for high-scoring (AB+/CU+) versus low-scoring children (N = approximately 300/group). In Stage 2, we tested the 3000 most strongly associated SNPs from Stage 1 for association in the same direction in a second sample of high- versus low-scoring children from the same twin study (18% co-twins). RESULTS Using allele frequencies estimated from pooled DNA, we found suggestive evidence for enrichment of association in the second stage of our two-stage genome-wide association design and focus on reporting the 30 top-ranking SNPs nominally associated with psychopathic tendencies. These SNPs include neurodevelopmental genes such as ROBO2. CONCLUSIONS Although none of the SNPs reached genome-wide statistical significance, we have generated a list of SNPs that are potentially associated with psychopathic tendencies, which we believe warrant verification and replication in large independent and clinical samples.
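The two-stage screen described in this abstract (rank SNPs by pooled allele frequency difference in Stage 1, then check for a difference in the same direction in Stage 2) can be sketched roughly as follows. This is an illustration only, not the authors' pipeline: the function name `same_direction_hits`, the dictionary inputs, and the toy differences are all hypothetical.

```python
def same_direction_hits(stage1_diff, stage2_diff, top_n):
    """Return stage 1 top-ranked SNPs whose stage 2 difference has the same sign.

    stage1_diff, stage2_diff: dict mapping SNP id -> (high group minus low group)
    allele frequency difference estimated from pooled DNA (hypothetical inputs).
    """
    # Rank SNPs by the size of the stage 1 difference, largest first.
    ranked = sorted(stage1_diff, key=lambda snp: abs(stage1_diff[snp]), reverse=True)
    carried_forward = ranked[:top_n]
    # Keep a SNP only if stage 2 was measured and agrees in direction.
    return [
        snp
        for snp in carried_forward
        if snp in stage2_diff and stage1_diff[snp] * stage2_diff[snp] > 0
    ]


# Toy data: SNP ids and differences are invented for illustration.
stage1 = {"snpA": 0.08, "snpB": -0.06, "snpC": 0.01, "snpD": 0.05}
stage2 = {"snpA": 0.03, "snpB": 0.02, "snpD": 0.04}
hits = same_direction_hits(stage1, stage2, top_n=3)
```

A real analysis would also account for pooling and array measurement error when ranking SNPs; the sketch only shows the selection logic.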
Affiliation(s)
- Essi Viding
- Division of Psychology and Language Sciences, University College London, UK.
|
18
|
Rapid assessment of genetic ancestry in populations of unknown origin by genome-wide genotyping of pooled samples. PLoS Genet 2010; 6:e1000866. [PMID: 20221249 PMCID: PMC2832667 DOI: 10.1371/journal.pgen.1000866] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Received: 10/06/2009] [Accepted: 01/30/2010] [Indexed: 01/04/2023] Open
Abstract
As we move forward from the current generation of genome-wide association (GWA) studies, additional cohorts of different ancestries will be studied to increase power, fine map association signals, and generalize association results to additional populations. Knowledge of genetic ancestry as well as population substructure will become increasingly important for GWA studies in populations of unknown ancestry. Here we propose genotyping pooled DNA samples using genome-wide SNP arrays as a viable option to efficiently and inexpensively estimate admixture proportion and identify ancestry informative markers (AIMs) in populations of unknown origin. We constructed DNA pools from African American, Native Hawaiian, Latina, and Jamaican samples and genotyped them using the Affymetrix 6.0 array. Aided by individual genotype data from the African American cohort, we established quality control filters to remove poorly performing SNPs and estimated allele frequencies for the remaining SNPs in each panel. We then applied a regression-based method to estimate the proportion of admixture in each cohort using the allele frequencies estimated from pooling and populations from the International HapMap Consortium as reference panels, and identified AIMs unique to each population. In this study, we demonstrated that genotyping pooled DNA samples yields estimates of admixture proportion that are both consistent with our knowledge of population history and similar to those obtained by genotyping known AIMs. Furthermore, through validation by individual genotyping, we demonstrated that pooling is quite effective for identifying SNPs with large allele frequency differences (i.e., AIMs) and that these AIMs are able to differentiate two closely related populations (HapMap JPT and CHB).
Many association studies have been published looking for genetic variants contributing to a variety of human traits such as obesity, diabetes, and height. Because the frequency of genetic variants can differ across populations, it is important to have estimates of genetic ancestry in the individuals being studied. In this study, we were able to measure genetic ancestry in populations of mixed ancestry by genotyping pooled, rather than individual, DNA samples. This represents a rapid and inexpensive means for modeling genetic ancestry and thus could facilitate future association or population-genetic studies in populations of unknown ancestry for which whole-genome data do not already exist.
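One simple way to picture a regression-based admixture estimate is a two-population least-squares fit, in which each pooled allele frequency is modeled as a mixture of two reference-panel frequencies. The sketch below assumes exactly that model; it is not necessarily the method used in the paper, and the function name `admixture_proportion` and the toy frequencies are hypothetical.

```python
def admixture_proportion(pooled, ref_a, ref_b):
    """Least-squares estimate of alpha in: pooled = alpha * ref_a + (1 - alpha) * ref_b.

    pooled, ref_a, ref_b: equal-length sequences of allele frequencies per SNP.
    Assumes a two-way admixture model with no intercept (an illustrative choice).
    """
    if not (len(pooled) == len(ref_a) == len(ref_b)):
        raise ValueError("all frequency lists must have the same length")
    # Regress (pooled - ref_b) on (ref_a - ref_b) through the origin.
    numerator = sum((m - b) * (a - b) for m, a, b in zip(pooled, ref_a, ref_b))
    denominator = sum((a - b) ** 2 for a, b in zip(ref_a, ref_b))
    if denominator == 0:
        raise ValueError("reference panels have identical frequencies at every SNP")
    return numerator / denominator


# Toy data: simulate a pool that is 80% population A and 20% population B.
ref_a = [0.9, 0.2, 0.5, 0.7]
ref_b = [0.1, 0.6, 0.5, 0.3]
pooled = [0.8 * a + 0.2 * b for a, b in zip(ref_a, ref_b)]
alpha_hat = admixture_proportion(pooled, ref_a, ref_b)
```

SNPs with large frequency differences between the reference panels dominate the fit, which fits the abstract's emphasis on ancestry informative markers.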
|
19
|
Docherty SJ, Davis OSP, Kovas Y, Meaburn EL, Dale PS, Petrill SA, Schalkwyk LC, Plomin R. A genome-wide association study identifies multiple loci associated with mathematics ability and disability. Genes Brain Behav 2010; 9:234-47. [PMID: 20039944 PMCID: PMC2855870 DOI: 10.1111/j.1601-183x.2009.00553.x] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.2] [Received: 07/01/2009] [Revised: 09/08/2009] [Accepted: 11/02/2009] [Indexed: 12/01/2022]
Abstract
Numeracy is as important as literacy and exhibits a similar frequency of disability. Although its etiology is relatively poorly understood, quantitative genetic research has demonstrated mathematical ability to be moderately heritable. In this first genome-wide association study (GWAS) of mathematical ability and disability, 10 out of 43 single nucleotide polymorphism (SNP) associations nominated from two high- vs. low-ability (n = 600 10-year-olds each) scans of pooled DNA were validated (P < 0.05) in an individually genotyped sample of (*)2356 individuals spanning the entire distribution of mathematical ability, as assessed by teacher reports and online tests. Although the effects are of the modest sizes now expected for complex traits and require further replication, interesting candidate genes are implicated such as NRCAM which encodes a neuronal cell adhesion molecule. When combined into a set, the 10 SNPs account for 2.9% (F = 56.85; df = 1 and 1881; P = 7.277e-14) of the phenotypic variance. The association is linear across the distribution consistent with a quantitative trait locus (QTL) hypothesis; the third of children in our sample who harbour 10 or more of the 20 risk alleles identified are nearly twice as likely (OR = 1.96; df = 1; P = 3.696e-07) to be in the lowest performing 15% of the distribution. Our results correspond with those of quantitative genetic research in indicating that mathematical ability and disability are influenced by many genes generating small effects across the entire spectrum of ability, implying that more highly powered studies will be needed to detect and replicate these QTL associations.
Affiliation(s)
- S J Docherty
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, King's College London, UK.
|
20
|
Anantharaman R, Chew FT. Validation of pooled genotyping on the Affymetrix 500 k and SNP6.0 genotyping platforms using the polynomial-based probe-specific correction. BMC Genet 2009; 10:82. [PMID: 20003400 PMCID: PMC2806376 DOI: 10.1186/1471-2156-10-82] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 11/04/2008] [Accepted: 12/14/2009] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The use of pooled DNA on SNP microarrays (SNP-MaP) has been shown to be a cost effective and rapid manner to perform whole-genome association evaluations. While the accuracy of SNP-MaP was extensively evaluated on the early Affymetrix 10 k and 100 k platforms, there have not been as many similarly comprehensive studies on more recent platforms. In the present study, we used the data generated from the full Affymetrix 500 k SNP set together with the polynomial-based probe-specific correction (PPC) to derive allele frequency estimates. These estimates were compared to genotyping results of the same individuals on the same platform, as the basis to evaluate the reliability and accuracy of pooled genotyping on these high-throughput platforms. We subsequently extended this comparison to the new SNP6.0 platform capable of genotyping 1.8 million genetic variants. RESULTS We showed that pooled genotyping on the 500 k platform performed as well as those previously shown on the relatively lower throughput 10 k and 100 k array sets, with high levels of accuracy (correlation coefficient: 0.988) and low median error (0.036) in allele frequency estimates. Similar results were also obtained from the SNP6.0 array set. A novel pooling strategy of overlapping sub-pools was attempted and comparison of estimated allele frequencies showed this strategy to be as reliable as replicate pools. The importance of an appropriate reference genotyping data set for the application of the PPC algorithm was also evaluated; reference samples with similar ethnic background to the pooled samples were found to improve estimation of allele frequencies. CONCLUSION We conclude that use of the PPC algorithm to estimate allele frequencies obtained from pooled genotyping on the high throughput 500 k and SNP6.0 platforms is highly accurate and reproducible especially when a suitable reference sample set is used to estimate the beta values for PPC.
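The two accuracy summaries quoted in this abstract, a correlation coefficient and a median error between pooled estimates and individually genotyped frequencies, can be computed as in the sketch below. It assumes the error is the absolute difference per SNP; the function names and the toy frequencies are hypothetical and are not data from the study.

```python
import math
import statistics


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def median_abs_error(estimates, truth):
    """Median of per-SNP absolute differences (assumed error definition)."""
    return statistics.median(abs(e - t) for e, t in zip(estimates, truth))


# Toy data: pooled estimates vs. frequencies from individual genotypes.
pooled_freq = [0.10, 0.52, 0.88, 0.33]
genotyped_freq = [0.12, 0.50, 0.90, 0.30]
r = pearson_r(pooled_freq, genotyped_freq)
err = median_abs_error(pooled_freq, genotyped_freq)
```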
Affiliation(s)
- Ramani Anantharaman
- Department of Biological Sciences, National University of Singapore, Science Drive 4, Singapore 117543
- Fook Tim Chew
- Department of Biological Sciences, National University of Singapore, Science Drive 4, Singapore 117543
|
21
|
Ronald A, Butcher LM, Docherty S, Davis OSP, Schalkwyk LC, Craig IW, Plomin R. A genome-wide association study of social and non-social autistic-like traits in the general population using pooled DNA, 500 K SNP microarrays and both community and diagnosed autism replication samples. Behav Genet 2009; 40:31-45. [PMID: 20012890 PMCID: PMC2797846 DOI: 10.1007/s10519-009-9308-6] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.5] [Received: 09/21/2009] [Accepted: 10/14/2009] [Indexed: 10/28/2022]
Abstract
Two separate genome-wide association studies were conducted to identify single nucleotide polymorphisms (SNPs) associated with social and non-social autistic-like traits. We predicted that we would find SNPs associated with social and non-social autistic-like traits and that different SNPs would be associated with the social and the non-social traits. In Stage 1, each study screened for allele frequency differences in approximately 430,000 autosomal SNPs using pooled DNA on microarrays in high-scoring versus low-scoring boys from a general population sample (N = approximately 400/group). In Stage 2, 22 and 20 SNPs in the social and non-social studies, respectively, were tested for QTL association by individually genotyping an independent community sample of 1,400 boys. One SNP (rs11894053) was nominally associated (P < .05, uncorrected for multiple testing) with social autistic-like traits. When the sample was increased by adding females, 2 additional SNPs were nominally significant (P < .05). These 3 SNPs, however, showed no significant association in transmission disequilibrium analyses of diagnosed ASD families.
Affiliation(s)
- Angelica Ronald
- Social Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, De Crespigny Park, London SE5 8AF, UK.
|
22
|
Thomas DC, Casey G, Conti DV, Haile RW, Lewinger JP, Stram DO. Methodological Issues in Multistage Genome-wide Association Studies. Stat Sci 2009; 24:414-429. [PMID: 20607129 DOI: 10.1214/09-sts288] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.7] [Indexed: 11/19/2022]
Abstract
Because of the high cost of commercial genotyping chip technologies, many investigations have used a two-stage design for genome-wide association studies, using part of the sample for an initial discovery of "promising" SNPs at a less stringent significance level and the remainder in a joint analysis of just these SNPs using custom genotyping. Typical cost savings of about 50% are possible with this design to obtain comparable levels of overall type I error and power by using about half the sample for stage I and carrying about 0.1% of SNPs forward to the second stage, the optimal design depending primarily upon the ratio of costs per genotype for stages I and II. However, with the rapidly declining costs of the commercial panels, the generally low observed ORs of current studies, and many studies aiming to test multiple hypotheses and multiple endpoints, many investigators are abandoning the two-stage design in favor of simply genotyping all available subjects using a standard high-density panel. Concern is sometimes raised about the absence of a "replication" panel in this approach, as required by some high-profile journals, but it must be appreciated that the two-stage design is not a discovery/replication design but simply a more efficient design for discovery using a joint analysis of the data from both stages. Once a subset of highly-significant associations has been discovered, a truly independent "exact replication" study is needed in a similar population of the same promising SNPs using similar methods. This can then be followed by (1) "generalizability" studies to assess the full scope of replicated associations across different races, different endpoints, different interactions, etc.; (2) fine-mapping or re-sequencing to try to identify the causal variant; and (3) experimental studies of the biological function of these genes. 
Multistage sampling designs may be more useful at this stage, say for selecting subsets of subjects for deep re-sequencing of regions identified in the GWAS.
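The "about 50%" cost saving quoted in this abstract can be illustrated with a deliberately simple cost model. The sketch assumes that total genotyping cost is proportional to (subjects × SNPs × cost per genotype) in each stage and ignores fixed costs; the function name, parameter names, and the cost ratio of 20 are hypothetical choices for illustration, not values from the paper.

```python
def two_stage_relative_cost(pi_sample, pi_marker, stage2_cost_ratio):
    """Genotyping cost of a two-stage design relative to genotyping everyone on the full panel.

    pi_sample: fraction of subjects genotyped on the full panel in stage I.
    pi_marker: fraction of SNPs carried forward to stage II.
    stage2_cost_ratio: stage II cost per genotype divided by stage I cost per
        genotype (custom genotyping is assumed to cost more per genotype).
    """
    for name, value in (("pi_sample", pi_sample), ("pi_marker", pi_marker)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    stage1 = pi_sample  # full panel on a fraction of subjects
    stage2 = (1.0 - pi_sample) * pi_marker * stage2_cost_ratio  # few SNPs on the rest
    return stage1 + stage2


# Half the sample in stage I, 0.1% of SNPs carried forward (as in the abstract),
# with an assumed 20-fold higher per-genotype cost in stage II.
relative_cost = two_stage_relative_cost(0.5, 0.001, 20.0)
```

Under these assumptions the second stage adds very little, so the relative cost stays close to the stage I sample fraction.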
Affiliation(s)
- Duncan C Thomas
- Department of Preventive Medicine, University of Southern California
|
23
|
Kirov G, Zaharieva I, Georgieva L, Moskvina V, Nikolov I, Cichon S, Hillmer A, Toncheva D, Owen MJ, O'Donovan MC. A genome-wide association study in 574 schizophrenia trios using DNA pooling. Mol Psychiatry 2009; 14:796-803. [PMID: 18332876 DOI: 10.1038/mp.2008.33] [Citation(s) in RCA: 113] [Impact Index Per Article: 7.5] [Indexed: 11/09/2022]
Abstract
The cost of genome-wide association (GWA) studies can be prohibitively high when large samples are genotyped. We conducted a GWA study on schizophrenia (SZ) and, to reduce the cost, we used DNA pooling. We used a parent-offspring trios design to avoid the potential problems of population stratification. We constructed pools from 605 unaffected controls, 574 SZ patients and a third pool from all the parents of the patients. We hybridized each pool eight times on Illumina HumanHap550 arrays. We estimated the allele frequencies of each pool from the averaged intensities of the arrays. The significance level of results in the trios sample was estimated on the basis of the allele frequencies in cases and non-transmitted pseudocontrols, taking into account the technical variability of the data. We selected the highest ranked SNPs for individual genotyping, after excluding poorly performing SNPs and those that showed a trend in the opposite direction in the control pool. We genotyped 63 SNPs in 574 trios and analysed the results with the transmission disequilibrium test. Forty of those were significant at P < 0.05, with the best result at P = 1.2 × 10⁻⁶ for rs11064768. This SNP is within the gene CCDC60, a coiled-coil domain gene. The third best SNP (P = 0.00016) is rs893703, within RBP1, a candidate gene for schizophrenia.
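For readers unfamiliar with the transmission disequilibrium test mentioned in this abstract, the standard statistic compares how often heterozygous parents transmit versus do not transmit an allele to affected offspring: chi-square = (b − c)² / (b + c) with one degree of freedom. The sketch below implements that textbook form; the function name and the counts are hypothetical, and the study's own analysis may have used software with further refinements.

```python
import math


def tdt(transmitted, untransmitted):
    """Transmission disequilibrium test (McNemar-type chi-square, 1 df).

    transmitted: times heterozygous parents passed the test allele to an affected child.
    untransmitted: times heterozygous parents passed the other allele instead.
    Returns (chi_square, p_value).
    """
    b, c = transmitted, untransmitted
    if b < 0 or c < 0 or b + c == 0:
        raise ValueError("counts must be non-negative and not both zero")
    chi2 = (b - c) ** 2 / (b + c)
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Toy counts, invented for illustration.
chi2, p_value = tdt(60, 40)
```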
Affiliation(s)
- G Kirov
- Department of Psychological Medicine, Cardiff University, Henry Wellcome Building, Cardiff, UK.
|
24
|
Identification of a genetic variant associated with abdominal aortic aneurysms on chromosome 3p12.3 by genome wide association. J Vasc Surg 2009; 49:1525-31. [DOI: 10.1016/j.jvs.2009.01.041] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.5] [Received: 10/29/2008] [Revised: 01/12/2009] [Accepted: 01/18/2009] [Indexed: 11/19/2022]
|
25
|
Docherty SJ, Davis OSP, Haworth CMA, Plomin R, Mill J. Bisulfite-based epityping on pooled genomic DNA provides an accurate estimate of average group DNA methylation. Epigenetics Chromatin 2009; 2:3. [PMID: 19284538 PMCID: PMC2657899 DOI: 10.1186/1756-8935-2-3] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.8] [Received: 12/24/2008] [Accepted: 03/10/2009] [Indexed: 12/15/2022] Open
Abstract
Background DNA methylation plays a vital role in normal cellular function, with aberrant methylation signatures being implicated in a growing number of human pathologies and complex human traits. Methods based on the modification of genomic DNA with sodium bisulfite are considered the 'gold-standard' for DNA methylation profiling on genomic DNA; however, they require relatively large amounts of DNA and may be prohibitively expensive when used on the large sample sizes necessary to detect small effects. We propose that a high-throughput DNA pooling approach will facilitate the use of emerging methylomic profiling techniques in large samples. Results Compared with data generated from 89 individual samples, our analysis of 205 CpG sites spanning nine independent regions of the genome demonstrates that DNA pools can be used to provide an accurate and reliable quantitative estimate of average group DNA methylation. Comparison of data generated from the pooled DNA samples with results averaged across the individual samples comprising each pool revealed highly significant correlations for individual CpG sites across all nine regions, with an average overall correlation across all regions and pools of 0.95 (95% bootstrapped confidence intervals: 0.94 to 0.96). Conclusion In this study we demonstrate the validity of using pooled DNA samples to accurately assess group DNA methylation averages. Such an approach can be readily applied to the assessment of disease phenotypes reducing the time, cost and amount of DNA starting material required for large-scale epigenetic analyses.
Affiliation(s)
- Sophia J Docherty
- Social Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, King's College London, De Crespigny Park, Denmark Hill, London, SE5 8AF, UK.
|
26
|
Zhao Y, Wang S. Optimal DNA pooling-based two-stage designs in case-control association studies. Hum Hered 2008; 67:46-56. [PMID: 18931509 DOI: 10.1159/000164398] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 05/07/2007] [Accepted: 01/03/2007] [Indexed: 11/19/2022] Open
Abstract
Study cost remains the major limiting factor for genome-wide association studies due to the necessity of genotyping a large number of SNPs for a large number of subjects. Both DNA pooling strategies and two-stage designs have been proposed to reduce genotyping costs. In this study, we propose a cost-effective, two-stage approach with a DNA pooling strategy. During stage I, all markers are evaluated on a subset of individuals using DNA pooling. The most promising set of markers is then evaluated with individual genotyping for all individuals during stage II. The goal is to determine the optimal parameters (π(p)(sample), the proportion of samples used during stage I with DNA pooling; and π(p)(marker), the proportion of markers evaluated during stage II with individual genotyping) that minimize the cost of a two-stage DNA pooling design while maintaining a desired overall significance level and achieving a level of power similar to that of a one-stage individual genotyping design. We considered the effects of three factors on optimal two-stage DNA pooling designs. Our results suggest that, under most scenarios considered, the optimal two-stage DNA pooling design may be much more cost-effective than the optimal two-stage individual genotyping design, which uses individual genotyping during both stages.
Affiliation(s)
- Yihong Zhao
- Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, N.Y., USA
|
27
|
Abstract
The analysis of genome wide variation offers the possibility of unravelling the genes involved in the pathogenesis of disease. Genome wide association studies are also particularly useful for identifying and validating targets for therapeutic intervention as well as for detecting markers for drug efficacy and side effects. The cost of such large-scale genetic association studies may be reduced substantially by the analysis of pooled DNA from multiple individuals. However, experimental errors inherent in pooling studies lead to a potential increase in the false positive rate and a loss in power compared to individual genotyping. Here we quantify various sources of experimental error using empirical data from typical pooling experiments and corresponding individual genotyping counts using two statistical methods. We provide analytical formulas for calculating these different errors in the absence of complete information, such as replicate pool formation, and for adjusting for the errors in the statistical analysis. We demonstrate that DNA pooling has the potential of estimating allele frequencies accurately, and adjusting the pooled allele frequency estimates for differential allelic amplification considerably improves accuracy. Estimates of the components of error show that differential allelic amplification is the most important contributor to the error variance in absolute allele frequency estimation, followed by allele frequency measurement and pool formation errors. Our results emphasise the importance of minimising experimental errors and obtaining correct error estimates in genetic association studies.
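The adjustment for differential allelic amplification mentioned in this abstract is commonly done with a correction factor k, the mean ratio of the two allele signals in heterozygous individuals, so that the frequency of allele A is estimated as A / (A + k·B). The sketch below uses that widely used form as an assumption; it is not taken from the paper's formulas, and the function name and signal values are hypothetical.

```python
def corrected_allele_frequency(signal_a, signal_b, k=1.0):
    """Estimate the pool frequency of allele A from two allele signals.

    k: mean A/B signal ratio measured in heterozygous individuals, where the
       true allele ratio is 1:1 (k = 1 means both alleles amplify equally).
    """
    if signal_a < 0 or signal_b < 0 or k <= 0:
        raise ValueError("signals must be non-negative and k must be positive")
    total = signal_a + k * signal_b
    if total == 0:
        raise ValueError("both allele signals are zero")
    return signal_a / total


# Toy signals: allele A amplifies 1.5 times as strongly as allele B.
raw = corrected_allele_frequency(60.0, 40.0)           # uncorrected estimate
adjusted = corrected_allele_frequency(60.0, 40.0, 1.5)  # corrected estimate
```

In this toy case the uncorrected estimate overstates the frequency of the better-amplifying allele, which is the bias the abstract identifies as the largest contributor to error variance.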
Affiliation(s)
- A Jawaid
- Research & Development Genetics, AstraZeneca Pharmaceuticals, Macclesfield, Cheshire SK10 4TG, UK.
|
28
|
Molecular genetics of adult ADHD: converging evidence from genome-wide association and extended pedigree linkage studies. J Neural Transm (Vienna) 2008; 115:1573-85. [PMID: 18839057 DOI: 10.1007/s00702-008-0119-3] [Citation(s) in RCA: 277] [Impact Index Per Article: 17.3] [Received: 08/12/2008] [Accepted: 08/25/2008] [Indexed: 12/25/2022]
Abstract
A genome-wide association (GWA) study with pooled DNA in adult attention-deficit/hyperactivity disorder (ADHD) employing approximately 500K SNP markers identifies novel risk genes and reveals remarkable overlap with findings from recent GWA scans in substance use disorders. Comparison with results from our previously reported high-resolution linkage scan in extended pedigrees confirms several chromosomal loci, including 16q23.1-24.3 which also reached genome-wide significance in a recent meta-analysis of seven linkage studies (Zhou et al. in Am J Med Genet Part B, 2008). The findings provide additional support for a common effect of genes coding for cell adhesion molecules (e.g., CDH13, ASTN2) and regulators of synaptic plasticity (e.g., CTNNA2, KALRN) despite the complex multifactorial etiologies of adult ADHD and addiction vulnerability.
|
29
|
Abraham R, Moskvina V, Sims R, Hollingworth P, Morgan A, Georgieva L, Dowzell K, Cichon S, Hillmer AM, O'Donovan MC, Williams J, Owen MJ, Kirov G. A genome-wide association study for late-onset Alzheimer's disease using DNA pooling. BMC Med Genomics 2008; 1:44. [PMID: 18823527 PMCID: PMC2570675 DOI: 10.1186/1755-8794-1-44] [Citation(s) in RCA: 129] [Impact Index Per Article: 8.1] [Received: 04/10/2008] [Accepted: 09/29/2008] [Indexed: 02/05/2023] Open
Abstract
Background Late-onset Alzheimer's disease (LOAD) is an age-related neurodegenerative disease with a high prevalence that places major demands on healthcare resources in societies with increasingly aged populations. The only extensively replicable genetic risk factor for LOAD is the apolipoprotein E gene. In order to identify additional genetic risk loci we have conducted a genome-wide association (GWA) study in a large LOAD case-control sample, reducing costs through the use of DNA pooling. Methods DNA samples were collected from 1,082 individuals with LOAD and 1,239 control subjects. Age at onset ranged from 60 to 95 and controls were matched for age (mean = 76.53 years, SD = 33), gender and ethnicity. Equimolar amounts of each DNA sample were added to either a case or control pool. The pools were genotyped using Illumina HumanHap300 and Illumina Sentrix HumanHap240S arrays testing 561,494 SNPs. 114 of our best hit SNPs from the pooling data were identified and then individually genotyped in the case-control sample used to construct the pools. Results Highly significant association with LOAD was observed at the APOE locus, confirming the validity of the pooled genotyping approach. For 109 SNPs outside the APOE locus, we obtained uncorrected p-values ≤ 0.05 for 74 after individual genotyping. To further test these associations, we added control data from 1400 subjects from the 1958 Birth Cohort, with the evidence for association increasing to 3.4 × 10⁻⁶ for our strongest finding, rs727153. rs727153 lies 13 kb from the start of transcription of lecithin retinol acyltransferase (phosphatidylcholine-retinol O-acyltransferase, LRAT). Five of seven tag SNPs chosen to cover LRAT showed significant association with LOAD, with a SNP in intron 2 of LRAT showing greatest evidence of association (rs201825, p-value = 6.1 × 10⁻⁷).
Conclusion We have validated the pooling method for GWA studies by both identifying the APOE locus and by observing a strong enrichment for significantly associated SNPs. We provide evidence for LRAT as a novel candidate gene for LOAD. LRAT plays a prominent role in the Vitamin A cascade, a system that has been previously implicated in LOAD.
Affiliation(s)
- Richard Abraham
- Department of Psychological Medicine, Cardiff University School of Medicine, Heath Park, Cardiff, CF14 4XN, UK.
|
30
|
Slate J, Gratten J, Beraldi D, Stapley J, Hale M, Pemberton JM. Gene mapping in the wild with SNPs: guidelines and future directions. Genetica 2008; 136:97-107. [PMID: 18780148 DOI: 10.1007/s10709-008-9317-z] [Citation(s) in RCA: 107] [Impact Index Per Article: 6.7] [Received: 04/21/2008] [Accepted: 08/18/2008] [Indexed: 10/21/2022]
Abstract
One of the biggest challenges facing evolutionary biologists is to identify and understand loci that explain fitness variation in natural populations. This review describes how genetic (linkage) mapping with single nucleotide polymorphism (SNP) markers can lead to great progress in this area. Strategies for SNP discovery and SNP genotyping are described and an overview of how to model SNP genotype information in mapping studies is presented. Finally, the opportunity afforded by new generation sequencing and typing technologies to map fitness genes by genome-wide association studies is discussed.
Affiliation(s)
- Jon Slate
- Department of Animal & Plant Sciences, University of Sheffield, Sheffield S10 2TN, UK.
|
31
|
Valdes AM, Loughlin J, Timms KM, van Meurs JJ, Southam L, Wilson SG, Doherty S, Lories RJ, Luyten FP, Gutin A, Abkevich V, Ge D, Hofman A, Uitterlinden AG, Hart DJ, Zhang F, Zhai G, Egli RJ, Doherty M, Lanchbury J, Spector TD. Genome-wide association scan identifies a prostaglandin-endoperoxide synthase 2 variant involved in risk of knee osteoarthritis. Am J Hum Genet 2008; 82:1231-40. [PMID: 18471798 PMCID: PMC2427208 DOI: 10.1016/j.ajhg.2008.04.006] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.5] [Received: 02/29/2008] [Revised: 04/11/2008] [Accepted: 04/21/2008] [Indexed: 10/22/2022] Open
Abstract
Osteoarthritis (OA), the most prevalent form of arthritis in the elderly, is characterized by the degradation of articular cartilage and has a strong genetic component. Our aim was to identify genetic variants involved in risk of knee OA in women. A pooled genome-wide association scan with the Illumina550 Duo array was performed in 255 controls and 387 cases. Twenty-eight variants with p < 1 × 10⁻⁵ were estimated to have probabilities of being false positives ≤ 0.5 and were genotyped individually in the original samples and in replication cohorts from the UK and the U.S. (599 and 272 cases, 1530 and 258 controls, respectively). The top seven associations were subsequently tested in samples from the Netherlands (306 cases and 584 controls). rs4140564 on chromosome 1, mapping 5' to both the PTGS2 and PLA2G4A genes, was associated with risk of knee OA in all the cohorts studied (overall odds ratio OR(mh) = 1.55, 95% CI 1.30-1.85, p < 6.9 × 10⁻⁷). Differential allelic expression analysis of PTGS2 with mRNA extracted from the cartilage of joint-replacement surgery OA patients revealed a significant difference in allelic expression (p < 1.0 × 10⁻⁶). These results suggest the existence of cis-acting regulatory polymorphisms that are in, or near to, PTGS2 and in modest linkage disequilibrium with rs4140564. Our results and previous studies on the role of the cyclooxygenase 2 enzyme encoded by PTGS2 underscore the importance of this signaling pathway in the pathogenesis of knee OA.
|
32
|
Butcher LM, Davis OSP, Craig IW, Plomin R. Genome-wide quantitative trait locus association scan of general cognitive ability using pooled DNA and 500K single nucleotide polymorphism microarrays. Genes Brain Behav 2008; 7:435-46. [PMID: 18067574 PMCID: PMC2408663 DOI: 10.1111/j.1601-183x.2007.00368.x] [Citation(s) in RCA: 113] [Impact Index Per Article: 7.1] [Indexed: 01/01/2023]
Abstract
General cognitive ability (g), which refers to what cognitive abilities have in common, is an important target for molecular genetic research because multivariate quantitative genetic analyses have shown that the same set of genes affects diverse cognitive abilities as well as learning disabilities. In this first autosomal genome-wide association scan of g, we used a two-stage quantitative trait locus (QTL) design with pooled DNA to screen more than 500 000 single nucleotide polymorphisms (SNPs) on microarrays, selecting from a sample of 7000 7-year-old children. In stage 1, we screened for allele frequency differences between groups pooled for low and high g. In stage 2, 47 SNPs nominated in stage 1 were tested by individually genotyping an independent sample of 3195 individuals, representative of the entire distribution of g scores in the full 7000 7-year-old children. Six SNPs yielded significant associations across the normal distribution of g, although only one SNP remained significant after a false discovery rate of 0.05 was imposed. However, none of these SNPs accounted for more than 0.4% of the variance of g, despite 95% power to detect associations of that size. It is likely that QTL effect sizes, even for highly heritable traits such as cognitive abilities and disabilities, are much smaller than previously assumed. Nonetheless, an aggregated ‘SNP set’ of the six SNPs correlated 0.11 (P < 0.00000003) with g. This shows that future SNP sets that will incorporate many more SNPs could be useful for predicting genetic risk and for investigating functional systems of effects from genes to brain to behavior.
Affiliation(s)
- L M Butcher
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, King's College London, London, UK
|
33
|
Sebastiani P, Zhao Z, Abad-Grau MM, Riva A, Hartley SW, Sedgewick AE, Doria A, Montano M, Melista E, Terry D, Perls TT, Steinberg MH, Baldwin CT. A hierarchical and modular approach to the discovery of robust associations in genome-wide association studies from pooled DNA samples. BMC Genet 2008; 9:6. [PMID: 18194558 PMCID: PMC2248205 DOI: 10.1186/1471-2156-9-6] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Received: 09/06/2007] [Accepted: 01/14/2008] [Indexed: 11/17/2022] Open
Abstract
BACKGROUND One of the challenges in the analysis of pooling-based genome-wide association studies is to identify authentic associations among potentially thousands of false positive associations.
RESULTS We present a hierarchical and modular approach to the analysis of genome-wide genotype data that incorporates quality control, linkage disequilibrium, physical distance and gene ontology to identify authentic associations among those found by statistical association tests. The method is developed for the allelic association analysis of pooled DNA samples, but it can be easily generalized to the analysis of individually genotyped samples. We evaluate the approach using data sets from diverse genome-wide association studies, including fetal hemoglobin levels in sickle cell anemia and a sample of centenarians, and show that the approach is highly reproducible and allows for discovery at different levels of synthesis.
CONCLUSION Results from the integration of Bayesian tests and other machine learning techniques with linkage disequilibrium data suggest that overly stringent thresholds are not needed to reduce the number of false positive associations. This method yields increased power even with relatively small samples. In fact, our evaluation shows that the method can reach almost 70% sensitivity with samples of only 100 subjects.
Affiliation(s)
- Paola Sebastiani
- Department of Biostatistics, Boston University School of Public Health, Boston 02118 MA, USA
- Zhenming Zhao
- Department of Biostatistics, Boston University School of Public Health, Boston 02118 MA, USA
- Maria M Abad-Grau
- Department of Software Engineering, University of Granada, Granada 18071, Spain
- Alberto Riva
- Department of Molecular Genetics, University of Florida at Gainesville, Gainesville 32611 FL, USA
- Stephen W Hartley
- Department of Biostatistics, Boston University School of Public Health, Boston 02118 MA, USA
- Amanda E Sedgewick
- Bioinformatics Program, Boston University School of Engineering, Boston 02116 MA, USA
- Alessandro Doria
- Joslin Diabetes Center, Harvard Medical School, Boston 02215 MA, USA
- Monty Montano
- Department of Medicine, Boston University School of Medicine, Boston 02118 MA, USA
- Efthymia Melista
- Department of Medicine, Boston University School of Medicine, Boston 02118 MA, USA
- Dellara Terry
- Geriatric Section, Boston Medical Center, Boston 02118 MA, USA
- Thomas T Perls
- Geriatric Section, Boston Medical Center, Boston 02118 MA, USA
- Martin H Steinberg
- Department of Medicine, Boston University School of Medicine, Boston 02118 MA, USA
- Clinton T Baldwin
- Department of Medicine, Boston University School of Medicine, Boston 02118 MA, USA
|