1
Valverde G, Zhou H, Lippold S, de Filippo C, Tang K, López Herráez D, Li J, Stoneking M. A novel candidate region for genetic adaptation to high altitude in Andean populations. PLoS One 2015; 10:e0125444. [PMID: 25961286 PMCID: PMC4427407 DOI: 10.1371/journal.pone.0125444] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 12/02/2014] [Accepted: 03/12/2015] [Indexed: 02/07/2023]
Abstract
Humans living at high altitude (≥2,500 meters above sea level) have acquired unique abilities to survive the associated extreme environmental conditions, including hypoxia, cold temperature, limited food availability and high levels of free radicals and oxidants. Long-term inhabitants of the most elevated regions of the world have undergone extensive physiological and/or genetic changes, particularly in the regulation of respiration and circulation, when compared to lowland populations. Genome scans have identified candidate genes involved in altitude adaptation in the Tibetan Plateau and the Ethiopian highlands, in contrast to populations from the Andes, which have not been as intensively investigated. In the present study, we focused on three indigenous populations from Bolivia: two groups of Andean natives, Aymara and Quechua, and the low-altitude control group of Guarani from the Gran Chaco lowlands. Using pooled samples, we identified a number of SNPs exhibiting large allele frequency differences over 900,000 genotyped SNPs. A region in chromosome 10 (within the cytogenetic bands q22.3 and q23.1) was significantly differentiated between highland and lowland groups. We resequenced ~1.5 Mb surrounding the candidate region and identified strong signals of positive selection in the highland populations. A composite of multiple signals (CMS)-like test localized the signal to FAM213A and a related enhancer; the product of this gene acts as an antioxidant to lower oxidative stress and may help to maintain bone mass. The results suggest that positive selection on the enhancer might increase the expression of this antioxidant, and thereby prevent oxidative damage. In addition, the most significant signal in a relative extended haplotype homozygosity analysis was localized to the SFTPD gene, which encodes a surfactant pulmonary-associated protein involved in normal respiration and innate host defense. Our study thus identifies two novel candidate genes and associated pathways that may be involved in high-altitude adaptation in Andean populations.
Affiliation(s)
- Guido Valverde
- Australian Centre for Ancient DNA, School of Earth & Environmental Sciences, The University of Adelaide, Adelaide, Australia
- Hang Zhou
- Department of Computational Regulatory Genomics, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Shanghai, China
- Sebastian Lippold
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Cesare de Filippo
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Kun Tang
- Department of Computational Regulatory Genomics, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Shanghai, China
- David López Herráez
- Department Effect-Directed Analysis, Helmholtz Centre for Environmental Research—UFZ, Leipzig, Germany
- * E-mail: (DLH); (JL); (MS)
- Jing Li
- Department of Computational Regulatory Genomics, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Shanghai, China
- * E-mail: (DLH); (JL); (MS)
- Mark Stoneking
- Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- * E-mail: (DLH); (JL); (MS)
2
Ozerov M, Vasemägi A, Wennevik V, Niemelä E, Prusov S, Kent M, Vähä JP. Cost-effective genome-wide estimation of allele frequencies from pooled DNA in Atlantic salmon (Salmo salar L.). BMC Genomics 2013; 14:12. [PMID: 23324082 PMCID: PMC3575319 DOI: 10.1186/1471-2164-14-12] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 06/27/2012] [Accepted: 01/02/2013] [Indexed: 12/13/2022]
Abstract
Background New sequencing technologies have tremendously increased the number of known molecular markers (single nucleotide polymorphisms; SNPs) in a variety of species. Concurrently, improvements to genotyping technology have now made it possible to efficiently genotype large numbers of genome-wide distributed SNPs, enabling genome-wide association studies (GWAS). However, genotyping significant numbers of individuals for large numbers of SNPs remains prohibitively expensive for many research groups. A possible solution to this problem is to determine allele frequencies from pooled DNA samples; such ‘allelotyping’ has been presented as a cost-effective alternative to individual genotyping and has become popular in human GWAS. In this article we have tested the effectiveness of DNA pooling to obtain accurate allele frequency estimates for Atlantic salmon (Salmo salar L.) populations using an Illumina SNP-chip. Results In total, 56 Atlantic salmon DNA pools from 14 populations were analyzed on an Atlantic salmon SNP-chip containing probes for 5568 SNP markers, 3928 of which were bi-allelic. We developed an efficient quality control filter which enables exclusion of loci showing high error rate and minor allele frequency (MAF) close to zero. After applying multiple quality control filters we obtained allele frequency estimates for 3631 bi-allelic loci. We observed high concordance (r > 0.99) between allele frequency estimates derived from individual genotyping and DNA pools. Our results also indicate that even relatively small DNA pools (35 individuals) can provide accurate allele frequency estimates for a given sample. Conclusions Despite the higher level of variation associated with array replicates compared with pool construction, we suggest that both sources of variation should be taken into account. This study demonstrates that DNA pooling allows fast and high-throughput determination of allele frequencies in Atlantic salmon, enabling cost-efficient identification of informative markers for discrimination of populations at various geographical scales, as well as identification of loci controlling ecologically and economically important traits.
Affiliation(s)
- Mikhail Ozerov
- Kevo Subarctic Research Institute, University of Turku, Turku 20014, Finland
3
Bercovici S, Geiger D. Admixture Aberration Analysis: Application to Mapping in Admixed Population Using Pooled DNA. J Comput Biol 2011; 18:237-49. [DOI: 10.1089/cmb.2010.0250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Affiliation(s)
- Sivan Bercovici
- Computer Science Department, Technion–Israel Institute of Technology, Haifa, Israel
- Dan Geiger
- Computer Science Department, Technion–Israel Institute of Technology, Haifa, Israel
4
Schosser A, Pirlo K, Gaysina D, Cohen-Woods S, Schalkwyk LC, Elkin A, Korszun A, Gunasinghe C, Gray J, Jones L, Meaburn E, Farmer AE, Craig IW, McGuffin P. Utility of the pooling approach as applied to whole genome association scans with high-density Affymetrix microarrays. BMC Res Notes 2010; 3:274. [PMID: 21040578 PMCID: PMC2984392 DOI: 10.1186/1756-0500-3-274] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 06/14/2010] [Accepted: 11/01/2010] [Indexed: 11/20/2022]
Abstract
Background We report an attempt to extend the previously successful approach of combining SNP (single nucleotide polymorphism) microarrays and DNA pooling (SNP-MaP) employing high-density microarrays. Whereas earlier studies employed a range of Affymetrix SNP microarrays comprising from 10 K to 500 K SNPs, this most recent investigation used the 6.0 chip which displays 906,600 SNP probes and 946,000 probes for the interrogation of CNVs (copy number variations). The genotyping assay using the Affymetrix SNP 6.0 array is highly demanding on sample quality due to the small feature size, low redundancy, and lack of mismatch probes. Findings In the first study published so far using this microarray on pooled DNA, we found that pooled cheek swab DNA could not accurately predict real allele frequencies of the samples that comprised the pools. In contrast, the allele frequency estimates using blood DNA pools were reasonable, although inferior compared to those obtained with previously employed Affymetrix microarrays. However, it might be possible to improve performance by developing improved analysis methods. Conclusions Despite the decreasing costs of genome-wide individual genotyping, the pooling approach may have applications in very large-scale case-control association studies. In such cases, our study suggests that high-quality DNA preparations and lower density platforms should be preferred.
Affiliation(s)
- Alexandra Schosser
- MRC SGDP Centre, Institute of Psychiatry, King's College London, London, UK.
5
Anantharaman R, Chew FT. Validation of pooled genotyping on the Affymetrix 500 k and SNP6.0 genotyping platforms using the polynomial-based probe-specific correction. BMC Genet 2009; 10:82. [PMID: 20003400 PMCID: PMC2806376 DOI: 10.1186/1471-2156-10-82] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 11/04/2008] [Accepted: 12/14/2009] [Indexed: 11/10/2022]
Abstract
BACKGROUND The use of pooled DNA on SNP microarrays (SNP-MaP) has been shown to be a cost effective and rapid manner to perform whole-genome association evaluations. While the accuracy of SNP-MaP was extensively evaluated on the early Affymetrix 10 k and 100 k platforms, there have not been as many similarly comprehensive studies on more recent platforms. In the present study, we used the data generated from the full Affymetrix 500 k SNP set together with the polynomial-based probe-specific correction (PPC) to derive allele frequency estimates. These estimates were compared to genotyping results of the same individuals on the same platform, as the basis to evaluate the reliability and accuracy of pooled genotyping on these high-throughput platforms. We subsequently extended this comparison to the new SNP6.0 platform capable of genotyping 1.8 million genetic variants. RESULTS We showed that pooled genotyping on the 500 k platform performed as well as those previously shown on the relatively lower throughput 10 k and 100 k array sets, with high levels of accuracy (correlation coefficient: 0.988) and low median error (0.036) in allele frequency estimates. Similar results were also obtained from the SNP6.0 array set. A novel pooling strategy of overlapping sub-pools was attempted and comparison of estimated allele frequencies showed this strategy to be as reliable as replicate pools. The importance of an appropriate reference genotyping data set for the application of the PPC algorithm was also evaluated; reference samples with similar ethnic background to the pooled samples were found to improve estimation of allele frequencies. CONCLUSION We conclude that use of the PPC algorithm to estimate allele frequencies obtained from pooled genotyping on the high throughput 500 k and SNP6.0 platforms is highly accurate and reproducible especially when a suitable reference sample set is used to estimate the beta values for PPC.
Affiliation(s)
- Ramani Anantharaman
- Department of Biological Sciences, National University of Singapore, Science Drive 4, Singapore 117543
- Fook Tim Chew
- Department of Biological Sciences, National University of Singapore, Science Drive 4, Singapore 117543
6
Kirov G, Zaharieva I, Georgieva L, Moskvina V, Nikolov I, Cichon S, Hillmer A, Toncheva D, Owen MJ, O'Donovan MC. A genome-wide association study in 574 schizophrenia trios using DNA pooling. Mol Psychiatry 2009; 14:796-803. [PMID: 18332876 DOI: 10.1038/mp.2008.33] [Citation(s) in RCA: 113] [Impact Index Per Article: 7.1] [Indexed: 11/09/2022]
Abstract
The cost of genome-wide association (GWA) studies can be prohibitively high when large samples are genotyped. We conducted a GWA study on schizophrenia (SZ) and to reduce the cost, we used DNA pooling. We used a parent-offspring trios design to avoid the potential problems of population stratification. We constructed pools from 605 unaffected controls, 574 SZ patients and a third pool from all the parents of the patients. We hybridized each pool eight times on Illumina HumanHap550 arrays. We estimated the allele frequencies of each pool from the averaged intensities of the arrays. The significance level of results in the trios sample was estimated on the basis of the allele frequencies in cases and non-transmitted pseudocontrols, taking into account the technical variability of the data. We selected the highest ranked SNPs for individual genotyping, after excluding poorly performing SNPs and those that showed a trend in the opposite direction in the control pool. We genotyped 63 SNPs in 574 trios and analysed the results with the transmission disequilibrium test. Forty of those were significant at P < 0.05, with the best result at P = 1.2 × 10⁻⁶ for rs11064768. This SNP is within the gene CCDC60, a coiled-coil domain gene. The third best SNP (P = 0.00016) is rs893703, within RBP1, a candidate gene for schizophrenia.
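The abstract names the transmission disequilibrium test (TDT) without giving its formula. As a rough illustration only, the standard McNemar-type TDT statistic for a biallelic marker can be sketched as below. The function name and the transmission counts are hypothetical and are not taken from the study; the p-value uses the identity between the 1-d.f. chi-square tail and the complementary error function.

```python
import math


def tdt(transmitted: int, untransmitted: int) -> tuple[float, float]:
    """Return (chi-square, p-value) for the standard TDT.

    transmitted   -- number of heterozygous parents who passed the allele
                     of interest to the affected child (often written b)
    untransmitted -- number who passed the other allele (often written c)
    """
    informative = transmitted + untransmitted
    if informative == 0:
        raise ValueError("no informative (heterozygous-parent) transmissions")
    chi2 = (transmitted - untransmitted) ** 2 / informative
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts, chosen only to exercise the function.
chi2, p_value = tdt(transmitted=60, untransmitted=40)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

Only heterozygous parents contribute to the statistic, which is why the trio design is robust to population stratification.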
Affiliation(s)
- G Kirov
- Department of Psychological Medicine, Cardiff University, Henry Wellcome Building, Cardiff, UK.
7
Davis OSP, Plomin R, Schalkwyk LC. The SNPMaP package for R: a framework for genome-wide association using DNA pooling on microarrays. Bioinformatics 2008; 25:281-3. [PMID: 19008252 PMCID: PMC2639010 DOI: 10.1093/bioinformatics/btn587] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.4] [Indexed: 01/11/2023]
Abstract
Summary: Large-scale genome-wide association (GWA) studies using thousands of high-density SNP microarrays are becoming an essential tool in the search for loci related to heritable variation in many phenotypes. However, the cost of GWA remains beyond the reach of many researchers. Fortunately, the majority of statistical power can still be obtained by estimating allele frequencies from DNA pools, reducing the cost to that of tens, rather than thousands of arrays. We present a set of software tools for processing SNPMaP (SNP microarrays and pooling) data from CEL files to Relative Allele Scores in the rich R statistical computing environment. Availability: The SNPMaP package is available from http://cran.r-project.org/ under the GNU General Public License version 3 or later. Contact: snpmap@iop.kcl.ac.uk Supplementary information: Additional resources and test datasets are available at http://sgdp.iop.kcl.ac.uk/snpmap/
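To make the idea of a Relative Allele Score concrete, here is a minimal Python sketch. It assumes the commonly used definition RAS = A / (A + B), where A and B are the intensities measured for the two alleles of a SNP. It is not the SNPMaP package's code or API: the function names and intensity values are hypothetical, and a real pipeline would first read and normalise probe intensities from CEL files.

```python
from statistics import mean


def relative_allele_score(a: float, b: float) -> float:
    """RAS for one array: share of total signal attributed to allele A."""
    total = a + b
    if total <= 0:
        raise ValueError("allele intensities must sum to a positive value")
    return a / total


def pool_ras(replicates: list[tuple[float, float]]) -> float:
    """Average RAS over replicate arrays hybridised with the same pool."""
    return mean(relative_allele_score(a, b) for a, b in replicates)


# Hypothetical (A, B) intensities for one SNP on two replicate arrays per pool.
case_pool = [(1500.0, 500.0), (1400.0, 600.0)]
control_pool = [(1000.0, 1000.0), (1100.0, 900.0)]

ras_difference = pool_ras(case_pool) - pool_ras(control_pool)
print(f"case-control RAS difference: {ras_difference:.3f}")
```

Ranking SNPs by the size of such a difference between pools is the kind of initial screen the pooling approach is meant to provide; top-ranked SNPs would then be confirmed by individual genotyping.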
Affiliation(s)
- Oliver S P Davis
- Social, Genetic & Developmental Psychiatry Centre, Institute of Psychiatry, King's College London, London, UK.
8
Abstract
The analysis of genome wide variation offers the possibility of unravelling the genes involved in the pathogenesis of disease. Genome wide association studies are also particularly useful for identifying and validating targets for therapeutic intervention as well as for detecting markers for drug efficacy and side effects. The cost of such large-scale genetic association studies may be reduced substantially by the analysis of pooled DNA from multiple individuals. However, experimental errors inherent in pooling studies lead to a potential increase in the false positive rate and a loss in power compared to individual genotyping. Here we quantify various sources of experimental error using empirical data from typical pooling experiments and corresponding individual genotyping counts using two statistical methods. We provide analytical formulas for calculating these different errors in the absence of complete information, such as replicate pool formation, and for adjusting for the errors in the statistical analysis. We demonstrate that DNA pooling has the potential of estimating allele frequencies accurately, and adjusting the pooled allele frequency estimates for differential allelic amplification considerably improves accuracy. Estimates of the components of error show that differential allelic amplification is the most important contributor to the error variance in absolute allele frequency estimation, followed by allele frequency measurement and pool formation errors. Our results emphasise the importance of minimising experimental errors and obtaining correct error estimates in genetic association studies.
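The adjustment for differential allelic amplification mentioned above can be illustrated with one widely used form of the correction, which is not necessarily the exact formula derived in this paper. The correction factor k is estimated as the mean A/B signal ratio in individually typed heterozygotes, where the true ratio is 1:1, and the pooled estimate becomes A / (A + k·B). All names and numbers in this sketch are hypothetical.

```python
from statistics import mean


def amplification_factor(heterozygote_signals: list[tuple[float, float]]) -> float:
    """Estimate k as the mean A/B signal ratio in known heterozygotes."""
    return mean(a / b for a, b in heterozygote_signals)


def corrected_frequency(a: float, b: float, k: float = 1.0) -> float:
    """Pooled frequency of allele A, adjusted for unequal amplification.

    With k = 1 this reduces to the unadjusted estimate A / (A + B).
    """
    return a / (a + k * b)


# Hypothetical heterozygote signals: allele A gives about 1.5x the signal of B.
k = amplification_factor([(150.0, 100.0), (300.0, 200.0)])

# Hypothetical pooled signals for the same SNP.
raw = corrected_frequency(600.0, 400.0)          # unadjusted estimate
adjusted = corrected_frequency(600.0, 400.0, k)  # adjusted for k
print(f"k = {k:.2f}, raw = {raw:.2f}, adjusted = {adjusted:.2f}")
```

In this toy case the unadjusted estimate overstates the frequency of the allele that amplifies more strongly, which is the bias the abstract identifies as the largest contributor to error in absolute allele frequency estimation.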
Affiliation(s)
- A Jawaid
- Research & Development Genetics, AstraZeneca Pharmaceuticals, Macclesfield Cheshire SK104TG, UK.
9
Yang HC, Huang MC, Li LH, Lin CH, Yu ALT, Diccianni MB, Wu JY, Chen YT, Fann CSJ. MPDA: microarray pooled DNA analyzer. BMC Bioinformatics 2008; 9:196. [PMID: 18412951 PMCID: PMC2387178 DOI: 10.1186/1471-2105-9-196] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 04/19/2007] [Accepted: 04/15/2008] [Indexed: 11/10/2022]
Abstract
BACKGROUND Microarray-based pooled DNA experiments that combine the merits of DNA pooling and gene chip technology constitute a pivotal advance in biotechnology. This new technique uses pooled DNA, thereby reducing costs associated with the typing of DNA from numerous individuals. Moreover, use of an oligonucleotide gene chip reduces costs related to processing various DNA segments (e.g., primers, reagents). Thus, the technique provides an overall cost-effective solution for large-scale genomic/genetic research. However, few publicly shared tools are available to systematically analyze the rapidly accumulating volume of whole-genome pooled DNA data. RESULTS We propose a generalized concept of pooled DNA and present a user-friendly tool named Microarray Pooled DNA Analyzer (MPDA) that we developed to analyze hybridization intensity data from microarray-based pooled DNA experiments. MPDA enables whole-genome DNA preferential amplification/hybridization analysis, allele frequency estimation, association mapping, allelic imbalance detection, and permits integration with shared data resources online. Graphic and numerical outputs from MPDA support global and detailed inspection of large amounts of genomic data. Four whole-genome data analyses are used to illustrate the major functionalities of MPDA. The first analysis shows that MPDA can characterize genomic patterns of preferential amplification/hybridization and provide calibration information for pooled DNA data analysis. The second analysis demonstrates that MPDA can accurately estimate allele frequencies. The third analysis indicates that MPDA is cost-effective and reliable for association mapping. The final analysis shows that MPDA can identify regions of chromosomal aberration in cancer without paired-normal tissue. CONCLUSION MPDA, the software that integrates pooled DNA association analysis and allelic imbalance analysis, provides a convenient analysis system for extensive whole-genome pooled DNA data analysis. 
The software, user manual and illustrated examples are freely available online at the MPDA website listed in the Availability and requirements section.
Affiliation(s)
- Hsin-Chou Yang
- Institute of Statistical Science, Academia Sinica, Taipei, Taiwan.
10
Cross J, Peters G, Wu Z, Brohede J, Hannan GN. Resolution of trisomic mosaicism in prenatal diagnosis: estimated performance of a 50K SNP microarray. Prenat Diagn 2008; 27:1197-204. [PMID: 17994637 DOI: 10.1002/pd.1884] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.4] [Indexed: 01/15/2023]
Abstract
OBJECTIVE To evaluate the ability of a DNA single nucleotide polymorphism (SNP) microarray to detect chromosome mosaicism for trisomy in prenatal samples in order to compare this with conventional cytogenetics. METHOD We created a dilution series of mock mosaic samples, by mixing measured amounts of fibroblast cells containing trisomy 8 from a male with aliquots of cells with a normal female karyotype. DNAs were extracted from these mosaic mixtures, then analysed on the Affymetrix 50K Xba SNP chip. Duplicate aliquots of each mosaic sample were probed using interphase FISH, with centromeric probes for chromosomes X, Y and 8, to estimate independently the proportion of male trisomy 8 in each sample. Data from the arrays were analysed using publicly available analysis tools. Statistical calculations were then performed using a Student's t-test to determine if there was a significant difference between the copy numbers of each chromosome. RESULTS These experiments using the Affymetrix 50K Xba SNP microarray showed mosaicism to be obvious at 20% and with additional statistical calculations, the lower limit for detection is about 10%. CONCLUSION The SNP microarray platform tested can detect mosaicism for trisomy in prenatal samples at levels comparable with conventional cytogenetic techniques in routine use.
Affiliation(s)
- Jillian Cross
- Department of Cytogenetics, Children's Hospital at Westmead, NSW, Australia.
11
Sebastiani P, Zhao Z, Abad-Grau MM, Riva A, Hartley SW, Sedgewick AE, Doria A, Montano M, Melista E, Terry D, Perls TT, Steinberg MH, Baldwin CT. A hierarchical and modular approach to the discovery of robust associations in genome-wide association studies from pooled DNA samples. BMC Genet 2008; 9:6. [PMID: 18194558 PMCID: PMC2248205 DOI: 10.1186/1471-2156-9-6] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 09/06/2007] [Accepted: 01/14/2008] [Indexed: 11/17/2022]
Abstract
BACKGROUND One of the challenges of the analysis of pooling-based genome wide association studies is to identify authentic associations among potentially thousands of false positive associations. RESULTS We present a hierarchical and modular approach to the analysis of genome wide genotype data that incorporates quality control, linkage disequilibrium, physical distance and gene ontology to identify authentic associations among those found by statistical association tests. The method is developed for the allelic association analysis of pooled DNA samples, but it can be easily generalized to the analysis of individually genotyped samples. We evaluate the approach using data sets from diverse genome wide association studies including fetal hemoglobin levels in sickle cell anemia and a sample of centenarians and show that the approach is highly reproducible and allows for discovery at different levels of synthesis. CONCLUSION Results from the integration of Bayesian tests and other machine learning techniques with linkage disequilibrium data suggest that we do not need to use too stringent thresholds to reduce the number of false positive associations. This method yields increased power even with relatively small samples. In fact, our evaluation shows that the method can reach almost 70% sensitivity with samples of only 100 subjects.
Affiliation(s)
- Paola Sebastiani
- Department of Biostatistics, Boston University School of Public Health, Boston 02118 MA, USA
- Zhenming Zhao
- Department of Biostatistics, Boston University School of Public Health, Boston 02118 MA, USA
- Maria M Abad-Grau
- Department of Software Engineering, University of Granada, Granada 18071, Spain
- Alberto Riva
- Department of Molecular Genetics, University of Florida at Gainesville, Gainesville 32611 FL, USA
- Stephen W Hartley
- Department of Biostatistics, Boston University School of Public Health, Boston 02118 MA, USA
- Amanda E Sedgewick
- Bioinformatics Program, Boston University School of Engineering, Boston 02116 MA, USA
- Alessandro Doria
- Joslin Diabetes Center, Harvard Medical School, Boston 02215 MA, USA
- Monty Montano
- Department of Medicine, Boston University School of Medicine, Boston 02118 MA, USA
- Efthymia Melista
- Department of Medicine, Boston University School of Medicine, Boston 02118 MA, USA
- Dellara Terry
- Geriatric Section, Boston Medical Center, Boston 02118 MA, USA
- Thomas T Perls
- Geriatric Section, Boston Medical Center, Boston 02118 MA, USA
- Martin H Steinberg
- Department of Medicine, Boston University School of Medicine, Boston 02118 MA, USA
- Clinton T Baldwin
- Department of Medicine, Boston University School of Medicine, Boston 02118 MA, USA
12
Cambien F, Tiret L. Genetics of cardiovascular diseases: from single mutations to the whole genome. Circulation 2007; 116:1714-24. [PMID: 17923582 DOI: 10.1161/circulationaha.106.661751] [Citation(s) in RCA: 70] [Impact Index Per Article: 3.9] [Indexed: 12/16/2022]
Affiliation(s)
- François Cambien
- INSERM UMR S 525 and Université Pierre et Marie Curie, Paris, France.
13
Docherty SJ, Butcher LM, Schalkwyk LC, Plomin R. Applicability of DNA pools on 500 K SNP microarrays for cost-effective initial screens in genomewide association studies. BMC Genomics 2007; 8:214. [PMID: 17610740 PMCID: PMC1925094 DOI: 10.1186/1471-2164-8-214] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.1] [Received: 01/31/2007] [Accepted: 07/04/2007] [Indexed: 01/02/2023]
Abstract
Background Genetic influences underpinning complex traits are thought to involve multiple quantitative trait loci (QTLs) of small effect size. Detection of such QTL associations requires systematic screening of large numbers of DNA markers within large sample populations. Using pooled DNA on SNP microarrays to screen for allelic frequency differences between groups such as cases and controls (called SNP Microarray and Pooling, or SNP-MaP) has been validated as an efficient solution on both 10 k and 100 k platforms. We demonstrate that this approach can be effectively applied to the truly genomewide Affymetrix GeneChip® Mapping 500 K Array. Results In comparisons between five independent DNA pools (N ~200 per pool) on separate Affymetrix GeneChip® Mapping 500 K Array sets, we show that, for SNPs with minor allele frequencies > 0.05, the reliability of the rank order of estimated allele frequencies, assessed as the average correlation between allele frequency estimates across the DNA pools, was 0.948 (average mean difference across the five pools = 0.069). Similarly, validity of the SNP-MaP approach was demonstrated by a rank-order correlation of 0.937 (average mean difference = 0.095) between the average DNA pool allele frequency estimates and the allele frequencies of an independent (CEPH) sample of 60 unrelated individually genotyped subjects. Conclusion We conclude that SNP-MaP can be extended for use on the Affymetrix GeneChip® Mapping 500 K Array, providing a cost-effective, reliable and valid initial screen of 500 K SNP microarrays in genomewide association scans.
Affiliation(s)
- Sophia J Docherty
- Social, Genetic and Developmental Psychiatry Centre, Box Number P082, Institute of Psychiatry, DeCrispigny Park, London, SE5 8AF, UK
- Lee M Butcher
- Social, Genetic and Developmental Psychiatry Centre, Box Number P082, Institute of Psychiatry, DeCrispigny Park, London, SE5 8AF, UK
- Leonard C Schalkwyk
- Social, Genetic and Developmental Psychiatry Centre, Box Number P082, Institute of Psychiatry, DeCrispigny Park, London, SE5 8AF, UK
- Robert Plomin
- Social, Genetic and Developmental Psychiatry Centre, Box Number P082, Institute of Psychiatry, DeCrispigny Park, London, SE5 8AF, UK