1
Arca M, Mary-Huard T, Gouesnard B, Bérard A, Bauland C, Combes V, Madur D, Charcosset A, Nicolas SD. Deciphering the Genetic Diversity of Landraces With High-Throughput SNP Genotyping of DNA Bulks: Methodology and Application to the Maize 50k Array. Frontiers in Plant Science 2021; 11:568699. [PMID: 33488638 PMCID: PMC7817617 DOI: 10.3389/fpls.2020.568699] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/01/2020] [Accepted: 11/12/2020] [Indexed: 05/13/2023]
Abstract
Genebanks harbor original landraces carrying many original favorable alleles for mitigating biotic and abiotic stresses. Their genetic diversity remains, however, poorly characterized because of their large within-landrace genetic diversity. We developed a high-throughput, cheap, and labor-saving DNA bulk approach based on the single-nucleotide polymorphism (SNP) Illumina Infinium HD array to genotype landraces. Samples were gathered for each landrace by mixing equal weights of young leaves, from which DNA was extracted. We then estimated allelic frequencies in each DNA bulk based on the fluorescent intensity ratio (FIR) between the two alleles at each SNP using a two-step approach. We first tested whether the DNA bulk was monomorphic or polymorphic according to the two FIR distributions of individuals homozygous for allele A or B, respectively. If the DNA bulk was polymorphic, we estimated its allelic frequency by using a predictive equation calibrated on FIR from DNA bulks with known allelic frequencies. Our approach (i) gives accurate allelic frequency estimations that are highly reproducible across laboratories and (ii) protects against false detection of allele fixation within landraces. We estimated allelic frequencies of 23,412 SNPs in 156 landraces representing American and European maize diversity. Modified Rogers' genetic distances between the 156 landraces estimated from the 23,412 SNPs and from 17 simple sequence repeats using the same DNA bulks were highly correlated, suggesting that the ascertainment bias is low. Our approach is affordable, easy to implement, and does not require specific bioinformatics support or laboratory equipment, and therefore should be highly relevant for large-scale characterization of genebanks for a wide range of species.
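The two-step estimation described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name, the `z` cutoff on the homozygote FIR distributions, and the straight-line form of the calibration equation are all assumptions made for illustration, since the abstract does not specify the test or the predictive equation.

```python
import numpy as np

def estimate_bulk_frequency(fir_bulk, fir_hom_a, fir_hom_b,
                            calib_fir, calib_freq, z=3.0):
    """Estimate the frequency of allele B in one DNA bulk at one SNP.

    Hypothetical sketch: thresholds and the linear calibration are assumptions.
    """
    fir_hom_a = np.asarray(fir_hom_a, dtype=float)
    fir_hom_b = np.asarray(fir_hom_b, dtype=float)

    # Step 1: call the bulk monomorphic if its FIR falls inside the range
    # expected for AA or BB homozygous individuals (mean +/- z * sd).
    upper_a = fir_hom_a.mean() + z * fir_hom_a.std(ddof=1)
    lower_b = fir_hom_b.mean() - z * fir_hom_b.std(ddof=1)
    if fir_bulk <= upper_a:
        return 0.0  # allele A treated as fixed
    if fir_bulk >= lower_b:
        return 1.0  # allele B treated as fixed

    # Step 2: polymorphic bulk. Predict the frequency from a line fitted on
    # bulks of known allelic frequency (assumed linear for this sketch).
    slope, intercept = np.polyfit(calib_fir, calib_freq, 1)
    return float(np.clip(slope * fir_bulk + intercept, 0.0, 1.0))
```

Testing for fixation before applying the calibration is what guards against reporting a small nonzero frequency for a landrace that is actually fixed, or the reverse.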
Affiliation(s)
- Mariangela Arca
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE – Le Moulon, Gif-sur-Yvette, France
- Tristan Mary-Huard
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE – Le Moulon, Gif-sur-Yvette, France
- Brigitte Gouesnard
- AGAP, Univ Montpellier, CIRAD, INRAE, Institut Agro, Montpellier, France
- Aurélie Bérard
- Université Paris-Saclay, INRAE, Etude du Polymorphisme des Génomes Végétaux, Evry-Courcouronnes, France
- Cyril Bauland
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE – Le Moulon, Gif-sur-Yvette, France
- Valérie Combes
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE – Le Moulon, Gif-sur-Yvette, France
- Delphine Madur
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE – Le Moulon, Gif-sur-Yvette, France
- Alain Charcosset
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE – Le Moulon, Gif-sur-Yvette, France
- Stéphane D. Nicolas
- Université Paris-Saclay, INRAE, CNRS, AgroParisTech, GQE – Le Moulon, Gif-sur-Yvette, France
2
Teumer A, Ernst FD, Wiechert A, Uhr K, Nauck M, Petersmann A, Völzke H, Völker U, Homuth G. Comparison of genotyping using pooled DNA samples (allelotyping) and individual genotyping using the Affymetrix Genome-Wide Human SNP Array 6.0. BMC Genomics 2013; 14:506. [PMID: 23885805 PMCID: PMC3727995 DOI: 10.1186/1471-2164-14-506] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 12/19/2012] [Accepted: 07/23/2013] [Indexed: 12/26/2022]
Abstract
Background Genome-wide association studies (GWAS) using array-based genotyping technology are widely used to identify genetic loci associated with complex diseases or other phenotypes. The costs of GWAS projects based on individual genotyping are still comparatively high and increase with the size of study populations. Genotyping using pooled DNA samples, also referred to as the allelotyping approach, offers an alternative at affordable costs. In the present study, data from 100 DNA samples individually genotyped with the Affymetrix Genome-Wide Human SNP Array 6.0 were used to estimate the error of the pooling approach by comparing the results with those obtained using the same array type but DNA pools each composed of 50 of the same samples. Newly developed and established methods for signal intensity correction were applied. Furthermore, the relative allele intensity signals (RAS) obtained by allelotyping were compared to the corresponding values derived from individual genotyping. Similarly, differences in RAS values between pools were determined and compared. Results Regardless of the intensity correction method applied, the pooling-specific error of the pool intensity values was larger for single pools than for the comparison of the intensity values of two pools, which reflects the scenario of a case–control study. Using 50 pooled samples and analyzing 10,000 SNPs with a minor allele frequency of >1% and applying the best correction method for the corresponding type of comparison, the 90% quantile (median) of the pooling-specific absolute error of the RAS values for single sub-pools and the SNP-specific difference in allele frequency comparing two pools was 0.064 (0.026) and 0.056 (0.021), respectively. Conclusions Correction of the RAS values reduced the error of the RAS values when analyzing single pool intensities. We developed a new correction method with high accuracy but low computational costs. Correction of RAS, however, only marginally reduced the error of true differences between two sample groups and those obtained by allelotyping. Exclusion of SNPs with a minor allele frequency of ≤1% notably reduced the pooling-specific error. Our findings allow for improving the estimation of the pooling-specific error and may help in designing allelotyping studies using the Affymetrix Genome-Wide Human SNP Array 6.0.
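The error summary reported in this abstract (median and 90% quantile of the absolute pooling error) can be sketched in Python as follows. The definition RAS = A / (A + B) is the form commonly used in the pooling literature and is an assumption here, as are the function names; the abstract does not spell out the formula or the correction methods.

```python
import numpy as np

def relative_allele_signal(intensity_a, intensity_b):
    """Uncorrected RAS per SNP, assumed here to be A / (A + B)."""
    a = np.asarray(intensity_a, dtype=float)
    b = np.asarray(intensity_b, dtype=float)
    return a / (a + b)

def pooling_error_summary(ras_pool, freq_individual):
    """Median and 90% quantile of |pool RAS - allele frequency from genotypes|."""
    err = np.abs(np.asarray(ras_pool, dtype=float)
                 - np.asarray(freq_individual, dtype=float))
    return float(np.median(err)), float(np.quantile(err, 0.9))
```

For the two-pool (case-control style) comparison, the same summary would be applied to the difference in RAS between pools against the difference in genotype-derived frequencies.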
Affiliation(s)
- Alexander Teumer
- Interfaculty Institute for Genetics and Functional Genomics, University Medicine Greifswald, 17487 Greifswald, Germany.
3
Chailurkit L, Chanprasertyothin S, Charoenkiatkul S, Krisnamara N, Rajatanavin R, Ongphiphadhanakul B. Malic enzyme gene polymorphism is associated with responsiveness in circulating parathyroid hormone after long-term calcium supplementation. J Nutr Health Aging 2012; 16:246-51. [PMID: 22456781 DOI: 10.1007/s12603-011-0343-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/15/2022]
Abstract
OBJECTIVE To identify genetic variations associated with parathyroid hormone (PTH) suppression after long-term calcium supplementation. DESIGN AND PARTICIPANTS For high throughput SNP screening, subjects consisted of 171 postmenopausal women without osteoporosis at the lumbar spine. A separate group of 19 premenopausal women was recruited for a calcium absorption study. Postmenopausal women in the screening group were given 500 mg/day calcium supplementation. SETTING Bangkok, Thailand. MEASUREMENTS Parathyroid hormone (PTH) and bone mineral density (BMD) were measured at baseline and 2 years after calcium supplementation. High throughput single-nucleotide polymorphism (SNP) screening was performed by comparing estimated allele frequencies derived from hybridization signal intensities of pooled DNA samples on Affymetrix's 10K SNP genotyping microarrays based on responsiveness in PTH after calcium supplementation. Genotyping of SNP rs1112482 in the malic enzyme 1 (ME1) gene, a SNP among those with the highest odds ratio of being related to PTH suppression after calcium, was performed in all postmenopausal subjects in the screening group and in premenopausal women in the calcium absorption study group, in which fractional calcium absorption was assessed by stable isotope dilution. Data were expressed as mean ± SEM. RESULTS PTH significantly decreased after 2 years of calcium supplementation (4.7 ± 1.9 vs. 4.4 ± 1.6 pmol/L, P < 0.01). There was a significant increase in lumbar spine BMD (1.03 ± 0.01 vs. 1.01 ± 0.01 g/cm², P < 0.001) but not femoral neck BMD. In 108 subjects whose PTH levels decreased after calcium, the suppression of PTH was higher in those with at least one C allele in rs1112482 of the ME1 gene (-26.3 ± 2.1 vs. -16.9 ± 1.4%, P < 0.001). Fractional calcium absorption also tended to be higher in subjects in the calcium absorption study group with at least one C allele (n = 6) compared to those without the C allele (n = 13) (58.0 ± 4.9 vs. 49.3 ± 2.8%, P = 0.054). CONCLUSION Cytosolic malic enzyme 1 gene polymorphism is associated with the degree of suppression of parathyroid hormone after long-term calcium supplementation. The effect is probably mediated through an increase in intestinal calcium absorption.
Affiliation(s)
- L Chailurkit
- Division of Endocrinology and Metabolism, Department of Medicine, Faculty of Medicine, Ramathibodi Hospital, Mahidol University, Bangkok, Thailand.
4
The role of genetic variation near interferon-kappa in systemic lupus erythematosus. J Biomed Biotechnol 2010; 2010. [PMID: 20706608 PMCID: PMC2914299 DOI: 10.1155/2010/706825] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Received: 01/15/2010] [Revised: 04/21/2010] [Accepted: 05/13/2010] [Indexed: 11/29/2022]
Abstract
Systemic lupus erythematosus (SLE) is a systemic autoimmune disease characterized by increased type I interferons (IFNs) and multiorgan inflammation frequently targeting the skin. IFN-kappa is a type I IFN expressed in skin. A pooled genome-wide scan implicated the IFNK locus in SLE susceptibility. We studied IFNK single nucleotide polymorphisms (SNPs) in 3982 SLE cases and 4275 controls, composed of European (EA), African-American (AA), and Asian ancestry. rs12553951C was associated with SLE in EA males (odds ratio = 1.93, P = 2.5 × 10⁻⁴), but not females. Suggestive associations with skin phenotypes in EA and AA females were found, and these were also sex-specific. IFNK SNPs were associated with increased serum type I IFN in EA and AA SLE patients. Our data suggest a sex-dependent association between IFNK SNPs and SLE and skin phenotypes. The serum IFN association suggests that IFNK variants could influence type I IFN producing plasmacytoid dendritic cells in affected skin.
5
Yang HC, Lin HC, Huang MC, Li LH, Pan WH, Wu JY, Chen YT. A new analysis tool for individual-level allele frequency for genomic studies. BMC Genomics 2010; 11:415. [PMID: 20602748 PMCID: PMC2996943 DOI: 10.1186/1471-2164-11-415] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 03/01/2010] [Accepted: 07/05/2010] [Indexed: 01/23/2023]
Abstract
Background Allele frequency is one of the most important population indices and has been broadly applied to genetic/genomic studies. Estimation of allele frequency using genotypes is convenient but may lose data information and be sensitive to genotyping errors. Results This study utilizes a unified intensity-measuring approach to estimating individual-level allele frequencies for 1,104 and 1,270 samples genotyped with the single-nucleotide-polymorphism arrays of the Affymetrix Human Mapping 100K and 500K Sets, respectively. Allele frequencies of all samples are estimated and adjusted by coefficients of preferential amplification/hybridization (CPA), and large ethnicity-specific and cross-ethnicity databases of CPA and allele frequency are established. The results show that using the CPA significantly improves the accuracy of allele frequency estimates; moreover, this paramount factor is insensitive to the time of data acquisition, effect of laboratory site, type of gene chip, and phenotypic status. Based on accurate allele frequency estimates, analytic methods based on individual-level allele frequencies are developed and successfully applied to discover genomic patterns of allele frequencies, detect chromosomal abnormalities, classify sample groups, identify outlier samples, and estimate the purity of tumor samples. The methods are packaged into a new analysis tool, ALOHA (Allele-frequency/Loss-of-heterozygosity/Allele-imbalance). Conclusions This is the first time that these important genetic/genomic applications have been simultaneously conducted by the analyses of individual-level allele frequencies estimated by a unified intensity-measuring approach. We expect that additional practical applications for allele frequency analysis will be found. The developed databases and tools provide useful resources for human genome analysis via high-throughput single-nucleotide-polymorphism arrays. 
The ALOHA software was written in R and R GUI and can be downloaded at http://www.stat.sinica.edu.tw/hsinchou/genetics/aloha/ALOHA.htm.
Affiliation(s)
- Hsin-Chou Yang
- Institute of Statistical Science, Academia Sinica, Taipei 115, Taiwan.
6
Rapid assessment of genetic ancestry in populations of unknown origin by genome-wide genotyping of pooled samples. PLoS Genet 2010; 6:e1000866. [PMID: 20221249 PMCID: PMC2832667 DOI: 10.1371/journal.pgen.1000866] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Received: 10/06/2009] [Accepted: 01/30/2010] [Indexed: 01/04/2023]
Abstract
As we move forward from the current generation of genome-wide association (GWA) studies, additional cohorts of different ancestries will be studied to increase power, fine map association signals, and generalize association results to additional populations. Knowledge of genetic ancestry as well as population substructure will become increasingly important for GWA studies in populations of unknown ancestry. Here we propose genotyping pooled DNA samples using genome-wide SNP arrays as a viable option to efficiently and inexpensively estimate admixture proportion and identify ancestry informative markers (AIMs) in populations of unknown origin. We constructed DNA pools from African American, Native Hawaiian, Latina, and Jamaican samples and genotyped them using the Affymetrix 6.0 array. Aided by individual genotype data from the African American cohort, we established quality control filters to remove poorly performing SNPs and estimated allele frequencies for the remaining SNPs in each panel. We then applied a regression-based method to estimate the proportion of admixture in each cohort using the allele frequencies estimated from pooling and populations from the International HapMap Consortium as reference panels, and identified AIMs unique to each population. In this study, we demonstrated that genotyping pooled DNA samples yields estimates of admixture proportion that are both consistent with our knowledge of population history and similar to those obtained by genotyping known AIMs. Furthermore, through validation by individual genotyping, we demonstrated that pooling is quite effective for identifying SNPs with large allele frequency differences (i.e., AIMs) and that these AIMs are able to differentiate two closely related populations (HapMap JPT and CHB). Many association studies have been published looking for genetic variants contributing to a variety of human traits such as obesity, diabetes, and height. 
Because the frequency of genetic variants can differ across populations, it is important to have estimates of genetic ancestry in the individuals being studied. In this study, we were able to measure genetic ancestry in populations of mixed ancestry by genotyping pooled, rather than individual, DNA samples. This represents a rapid and inexpensive means for modeling genetic ancestry and thus could facilitate future association or population-genetic studies in populations of unknown ancestry for which whole-genome data do not already exist.
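As an illustration of how a regression on pooled allele frequencies can yield an admixture proportion, here is a minimal Python sketch. It assumes a simple two-source model, p_mix ≈ α·p_ref1 + (1 − α)·p_ref2, solved by least squares across SNPs; the abstract only says the method is "regression-based", so the model form and the function name are assumptions, not the authors' procedure.

```python
import numpy as np

def admixture_proportion(p_mix, p_ref1, p_ref2):
    """Least-squares estimate of the fraction of ancestry from reference 1.

    Hypothetical two-source model: p_mix = alpha * p_ref1 + (1 - alpha) * p_ref2.
    """
    p_mix = np.asarray(p_mix, dtype=float)
    p_ref1 = np.asarray(p_ref1, dtype=float)
    p_ref2 = np.asarray(p_ref2, dtype=float)

    # Rearranged model: (p_mix - p_ref2) = alpha * (p_ref1 - p_ref2),
    # a regression through the origin with a single coefficient.
    diff = p_ref1 - p_ref2
    alpha = np.dot(p_mix - p_ref2, diff) / np.dot(diff, diff)
    return float(np.clip(alpha, 0.0, 1.0))
```

SNPs where the two reference panels have similar frequencies contribute almost nothing to the estimate, which is why markers with large frequency differences (AIMs) are the informative ones.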
7
Anantharaman R, Chew FT. Validation of pooled genotyping on the Affymetrix 500 k and SNP6.0 genotyping platforms using the polynomial-based probe-specific correction. BMC Genet 2009; 10:82. [PMID: 20003400 PMCID: PMC2806376 DOI: 10.1186/1471-2156-10-82] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 11/04/2008] [Accepted: 12/14/2009] [Indexed: 11/10/2022]
Abstract
BACKGROUND The use of pooled DNA on SNP microarrays (SNP-MaP) has been shown to be a cost effective and rapid manner to perform whole-genome association evaluations. While the accuracy of SNP-MaP was extensively evaluated on the early Affymetrix 10 k and 100 k platforms, there have not been as many similarly comprehensive studies on more recent platforms. In the present study, we used the data generated from the full Affymetrix 500 k SNP set together with the polynomial-based probe-specific correction (PPC) to derive allele frequency estimates. These estimates were compared to genotyping results of the same individuals on the same platform, as the basis to evaluate the reliability and accuracy of pooled genotyping on these high-throughput platforms. We subsequently extended this comparison to the new SNP6.0 platform capable of genotyping 1.8 million genetic variants. RESULTS We showed that pooled genotyping on the 500 k platform performed as well as those previously shown on the relatively lower throughput 10 k and 100 k array sets, with high levels of accuracy (correlation coefficient: 0.988) and low median error (0.036) in allele frequency estimates. Similar results were also obtained from the SNP6.0 array set. A novel pooling strategy of overlapping sub-pools was attempted and comparison of estimated allele frequencies showed this strategy to be as reliable as replicate pools. The importance of an appropriate reference genotyping data set for the application of the PPC algorithm was also evaluated; reference samples with similar ethnic background to the pooled samples were found to improve estimation of allele frequencies. CONCLUSION We conclude that use of the PPC algorithm to estimate allele frequencies obtained from pooled genotyping on the high throughput 500 k and SNP6.0 platforms is highly accurate and reproducible especially when a suitable reference sample set is used to estimate the beta values for PPC.
Affiliation(s)
- Ramani Anantharaman
- Department of Biological Sciences, National University of Singapore, Science Drive 4, Singapore 117543
- Fook Tim Chew
- Department of Biological Sciences, National University of Singapore, Science Drive 4, Singapore 117543
8
Ronald A, Butcher LM, Docherty S, Davis OSP, Schalkwyk LC, Craig IW, Plomin R. A genome-wide association study of social and non-social autistic-like traits in the general population using pooled DNA, 500 K SNP microarrays and both community and diagnosed autism replication samples. Behav Genet 2009; 40:31-45. [PMID: 20012890 PMCID: PMC2797846 DOI: 10.1007/s10519-009-9308-6] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.5] [Received: 09/21/2009] [Accepted: 10/14/2009] [Indexed: 10/28/2022]
Abstract
Two separate genome-wide association studies were conducted to identify single nucleotide polymorphisms (SNPs) associated with social and non-social autistic-like traits. We predicted that we would find SNPs associated with social and non-social autistic-like traits and that different SNPs would be associated with social and non-social traits. In Stage 1, each study screened for allele frequency differences in approximately 430,000 autosomal SNPs using pooled DNA on microarrays in high-scoring versus low-scoring boys from a general population sample (N = approximately 400/group). In Stage 2, 22 and 20 SNPs in the social and non-social studies, respectively, were tested for QTL association by individually genotyping an independent community sample of 1,400 boys. One SNP (rs11894053) was nominally associated (P < .05, uncorrected for multiple testing) with social autistic-like traits. When the sample was increased by adding females, 2 additional SNPs were nominally significant (P < .05). These 3 SNPs, however, showed no significant association in transmission disequilibrium analyses of diagnosed ASD families.
Affiliation(s)
- Angelica Ronald
- Social Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, De Crespigny Park, London SE5 8AF, UK.
9
Knight J, Saccone SF, Zhang Z, Ballinger DG, Rice JP. A comparison of association statistics between pooled and individual genotypes. Hum Hered 2009; 67:219-25. [PMID: 19172081 DOI: 10.1159/000194975] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 03/18/2008] [Accepted: 07/25/2008] [Indexed: 11/19/2022]
Abstract
BACKGROUND Markers for individual genotyping can be selected using quantitative genotyping of pooled DNA. This strategy saves time and money. METHODS To determine the efficacy of this approach, we investigated the bivariate distribution of association test statistics from pooled and individual genotypes. We used a set of approximately 1,000 samples with individual and pooled genotyping on 40,000 SNPs. RESULTS AND CONCLUSIONS We found that the distribution of the joint test statistics can be modelled as a mixture of two bivariate normal distributions. One distribution has a correlation of zero, and is probably due to SNPs whose pooled genotyping was unsuccessful. The other distribution has a correlation of approximately 0.65 in our data. This latter distribution is probably accounted for by SNPs whose pooled genotyping accurately predicts the underlying allele frequency. Approximately 87% of the data belong to this distribution. We also derived a method to investigate the effect of both the correlation and selection cut-off on the relative power of pooling studies. We demonstrate that pooled genotyping has good power to detect SNPs that are truly associated with disease-causing variants for SNPs showing good correlation between pooled and individual genotyping. Therefore, this approach is a cost-effective tool for association studies.
Affiliation(s)
- Jo Knight
- Social Genetic & Developmental Psychiatry MRC Centre, Institute of Psychiatry, Kings College London, London, UK.
10
Yin BC, Li H, Ye BC. Microarray-based estimation of SNP allele-frequency in pooled DNA using the Langmuir kinetic model. BMC Genomics 2008; 9:605. [PMID: 19087310 PMCID: PMC2640397 DOI: 10.1186/1471-2164-9-605] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 07/26/2008] [Accepted: 12/16/2008] [Indexed: 11/20/2022]
Abstract
Background High throughput genotyping of single nucleotide polymorphisms (SNPs) for genome-wide association requires technologies for generating millions of genotypes with relative ease but also at a reasonable cost and with high accuracy. In this work, we have developed a theoretical approach to estimate allele frequency in pooled DNA samples, based on the physical principles of DNA immobilization and hybridization on solid surface using the Langmuir kinetic model and quantitative analysis of the allelic signals. Results This method can successfully distinguish allele frequencies differing by 0.01 in the actual pool of clinical samples, and detect alleles with a frequency as low as 2%. The accuracy of measuring known allele frequencies is very high, with the strength of correlation between measured and actual frequencies having an r² = 0.9992. These results demonstrated that this method could allow the accurate estimation of absolute allele frequencies in pooled samples of DNA in a feasible and inexpensive way. Conclusion We conclude that this novel strategy for quantitative analysis of the ratio of SNP allelic sequences in DNA pools is an inexpensive and feasible alternative for detecting polymorphic differences in candidate gene association studies and genome-wide linkage disequilibrium scans.
Affiliation(s)
- Bin-Cheng Yin
- Laboratory of Biosystems and Microanalysis, State Key Laboratory of Bioreactor Engineering, East China University of Science & Technology, Shanghai, PR China.
11
Davis OSP, Plomin R, Schalkwyk LC. The SNPMaP package for R: a framework for genome-wide association using DNA pooling on microarrays. Bioinformatics 2008; 25:281-3. [PMID: 19008252 PMCID: PMC2639010 DOI: 10.1093/bioinformatics/btn587] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.5] [Indexed: 01/11/2023]
Abstract
Summary: Large-scale genome-wide association (GWA) studies using thousands of high-density SNP microarrays are becoming an essential tool in the search for loci related to heritable variation in many phenotypes. However, the cost of GWA remains beyond the reach of many researchers. Fortunately, the majority of statistical power can still be obtained by estimating allele frequencies from DNA pools, reducing the cost to that of tens, rather than thousands, of arrays. We present a set of software tools for processing SNPMaP (SNP microarrays and pooling) data from CEL files to Relative Allele Scores in the rich R statistical computing environment. Availability: The SNPMaP package is available from http://cran.r-project.org/ under the GNU General Public License version 3 or later. Contact: snpmap@iop.kcl.ac.uk Supplementary information: Additional resources and test datasets are available at http://sgdp.iop.kcl.ac.uk/snpmap/
Affiliation(s)
- Oliver S P Davis
- Social, Genetic & Developmental Psychiatry Centre, Institute of Psychiatry, King's College London, London, UK.
12
Simpson CL, Lemmens R, Miskiewicz K, Broom WJ, Hansen VK, van Vught PWJ, Landers JE, Sapp P, Van Den Bosch L, Knight J, Neale BM, Turner MR, Veldink JH, Ophoff RA, Tripathi VB, Beleza A, Shah MN, Proitsi P, Van Hoecke A, Carmeliet P, Horvitz HR, Leigh PN, Shaw CE, van den Berg LH, Sham PC, Powell JF, Verstreken P, Brown RH, Robberecht W, Al-Chalabi A. Variants of the elongator protein 3 (ELP3) gene are associated with motor neuron degeneration. Hum Mol Genet 2008; 18:472-81. [PMID: 18996918 PMCID: PMC2638803 DOI: 10.1093/hmg/ddn375] [Citation(s) in RCA: 205] [Impact Index Per Article: 12.8] [Indexed: 12/12/2022]
Abstract
Amyotrophic lateral sclerosis (ALS) is a spontaneous, relentlessly progressive motor neuron disease, usually resulting in death from respiratory failure within 3 years. Variation in the genes SOD1 and TARDBP accounts for a small percentage of cases, and other genes have shown association in both candidate gene and genome-wide studies, but the genetic causes remain largely unknown. We have performed two independent parallel studies, both implicating the RNA polymerase II component, ELP3, in axonal biology and neuronal degeneration. In the first, an association study of 1884 microsatellite markers, allelic variants of ELP3 were associated with ALS in three human populations comprising 1483 people (P = 1.96 × 10⁻⁹). In the second, an independent mutagenesis screen in Drosophila for genes important in neuronal communication and survival identified two different loss of function mutations, both in ELP3 (R475K and R456K). Furthermore, knock down of ELP3 protein levels using antisense morpholinos in zebrafish embryos resulted in dose-dependent motor axonal abnormalities [Pearson correlation: −0.49, P = 1.83 × 10⁻¹² (start codon morpholino) and −0.46, P = 4.05 × 10⁻⁹ (splice-site morpholino)], and in humans, risk-associated ELP3 genotypes correlated with reduced brain ELP3 expression (P = 0.01). These findings add to the growing body of evidence implicating the RNA processing pathway in neurodegeneration and suggest a critical role for ELP3 in neuron biology and of ELP3 variants in ALS.
Affiliation(s)
- Claire L Simpson
- Department of Neurology, King's College London, London SE5 8AF, UK
13
Chen HH, Jou YS, Lee WJ, Pan WH. Applying polynomial standard curve method to correct bias encountered in estimating allele frequencies using DNA pooling strategy. Genomics 2008; 92:429-35. [PMID: 18793711 DOI: 10.1016/j.ygeno.2008.08.010] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 03/17/2008] [Revised: 08/15/2008] [Accepted: 08/18/2008] [Indexed: 11/25/2022]
Abstract
The DNA pooling approach is a cost-saving strategy that is crucial for multiple-SNP association studies, particularly for laboratories with limited budgets. However, the bias in allele frequency estimates cannot be completely abolished by kappa correction. Using the SNaPshot™ platform, we systematically examined the relations between actual minor allele frequency (AMiAF) levels and estimates obtained from the pooling process for all six types of SNPs. We applied the polynomial standard curve method (PSCM) to produce allele frequency estimates in pooled DNA samples and compared it with the kappa method. The results showed that estimates derived from the PSCM were in general closer to AMiAFs than those from the kappa method, particularly for C/G and G/T polymorphisms in the AMiAF range of 20-40%. We demonstrated that applying PSCM on the SNaPshot™ platform is suitable for multiple-SNP association studies using a pooling strategy, owing to its cost effectiveness and estimation accuracy.
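The two corrections compared in this abstract can be contrasted with a short Python sketch. The kappa form p = A / (A + k·B), with k taken as the mean A/B signal ratio in heterozygotes, is the version commonly described in the pooling literature; the standard-curve functions simply fit a polynomial that maps raw pooled estimates to known frequencies. Function names, the polynomial degree, and both formulas are assumptions for illustration, since the abstract gives neither.

```python
import numpy as np

def kappa_corrected_frequency(signal_a, signal_b, k):
    """Kappa-corrected frequency of allele A; k = mean A/B ratio in heterozygotes."""
    return signal_a / (signal_a + k * signal_b)

def fit_standard_curve(raw_estimates, known_freqs, degree=2):
    """Fit a polynomial mapping raw pooled estimates to known frequencies."""
    return np.polyfit(raw_estimates, known_freqs, degree)

def apply_standard_curve(coeffs, raw_estimate):
    """Correct a raw estimate with a fitted standard curve, kept within [0, 1]."""
    return float(np.clip(np.polyval(coeffs, raw_estimate), 0.0, 1.0))
```

A single k rescales one allele's signal uniformly, whereas a standard curve built from pools of known composition can absorb nonlinearity that varies over the frequency range, which is consistent with the abstract's observation that the gain was largest at intermediate frequencies.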
Affiliation(s)
- Hsin-Hung Chen
- Institute of Biomedical Sciences, Academia Sinica, Taiwan, ROC
14
Meaburn EL, Harlaar N, Craig IW, Schalkwyk LC, Plomin R. Quantitative trait locus association scan of early reading disability and ability using pooled DNA and 100K SNP microarrays in a sample of 5760 children. Mol Psychiatry 2008; 13:729-40. [PMID: 17684495 DOI: 10.1038/sj.mp.4002063] [Citation(s) in RCA: 88] [Impact Index Per Article: 5.5] [Indexed: 11/09/2022]
Abstract
Quantitative genetic research suggests that reading disability is the quantitative extreme of the same genetic and environmental factors responsible for normal variation in reading ability. This finding warrants a quantitative trait locus (QTL) strategy that compares low versus high extremes of the normal distribution of reading in the search for QTLs associated with variation throughout the distribution. A low reading ability group (N=755) and a high reading ability group (N=747) were selected from a representative UK sample of 7-year-olds assessed on two measures of reading that we have shown to be highly heritable and highly genetically correlated. The low and high reading ability groups were each divided into 10 independent DNA pools and the 20 pools were assayed on 100 K single nucleotide polymorphism (SNP) microarrays to screen for the largest allele frequency differences between the low and high reading ability groups. Seventy-five of these nominated SNPs were individually genotyped in an independent sample of low (N=452) and high (N=452) reading ability children selected from a second sample of 4258 7-year-olds. Nine of the seventy-five SNPs were nominally significant (P<0.05) in the predicted direction. These 9 SNPs and 14 other SNPs showing low versus high allele frequency differences in the predicted direction were genotyped in the rest of the second sample to test the QTL hypothesis. Ten SNPs yielded nominally significant linear associations in the expected direction across the distribution of reading ability. However, none of these SNP associations accounted for more than 0.5% of the variance of reading ability, despite 99% power to detect them. We conclude that QTL effect sizes, even for highly heritable common disorders and quantitative traits such as early reading disability and ability, might be much smaller than previously considered.
Collapse
Affiliation(s)
- E L Meaburn
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, King's College, London, UK.
|
15
|
Yang HC, Huang MC, Li LH, Lin CH, Yu ALT, Diccianni MB, Wu JY, Chen YT, Fann CSJ. MPDA: microarray pooled DNA analyzer. BMC Bioinformatics 2008; 9:196. [PMID: 18412951 PMCID: PMC2387178 DOI: 10.1186/1471-2105-9-196] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/19/2007] [Accepted: 04/15/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Microarray-based pooled DNA experiments that combine the merits of DNA pooling and gene chip technology constitute a pivotal advance in biotechnology. This new technique uses pooled DNA, thereby reducing costs associated with the typing of DNA from numerous individuals. Moreover, use of an oligonucleotide gene chip reduces costs related to processing various DNA segments (e.g., primers, reagents). Thus, the technique provides an overall cost-effective solution for large-scale genomic/genetic research. However, few publicly shared tools are available to systematically analyze the rapidly accumulating volume of whole-genome pooled DNA data. RESULTS We propose a generalized concept of pooled DNA and present a user-friendly tool named Microarray Pooled DNA Analyzer (MPDA) that we developed to analyze hybridization intensity data from microarray-based pooled DNA experiments. MPDA enables whole-genome DNA preferential amplification/hybridization analysis, allele frequency estimation, association mapping, allelic imbalance detection, and permits integration with shared data resources online. Graphic and numerical outputs from MPDA support global and detailed inspection of large amounts of genomic data. Four whole-genome data analyses are used to illustrate the major functionalities of MPDA. The first analysis shows that MPDA can characterize genomic patterns of preferential amplification/hybridization and provide calibration information for pooled DNA data analysis. The second analysis demonstrates that MPDA can accurately estimate allele frequencies. The third analysis indicates that MPDA is cost-effective and reliable for association mapping. The final analysis shows that MPDA can identify regions of chromosomal aberration in cancer without paired-normal tissue. CONCLUSION MPDA, the software that integrates pooled DNA association analysis and allelic imbalance analysis, provides a convenient analysis system for extensive whole-genome pooled DNA data analysis. 
The software, user manual and illustrated examples are freely available online at the MPDA website listed in the Availability and requirements section.
Affiliation(s)
- Hsin-Chou Yang
- Institute of Statistical Science, Academia Sinica, Taipei, Taiwan.
|
16
|
Jongjaroenprasert W, Chanprasertyotin S, Butadej S, Nakasatien S, Charatcharoenwitthaya N, Himathongkam T, Ongphiphadhanakul B. Association of genetic variants in GABRA3 gene and thyrotoxic hypokalaemic periodic paralysis in Thai population. Clin Endocrinol (Oxf) 2008; 68:646-51. [PMID: 17970773 DOI: 10.1111/j.1365-2265.2007.03083.x] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
Abstract
BACKGROUND Genetic predisposition has been suggested to play a role in the pathogenesis of thyrotoxic hypokalaemic periodic paralysis (THPP). OBJECTIVES In this study, we assessed differences in single-nucleotide polymorphism (SNP) allele frequencies between THPP patients and well-characterized controls in order to find susceptibility variants related to THPP, using microarray-based assessments of pooled DNA. METHODS Fifty cases of THPP and 50 male hyperthyroid patients without hypokalaemia as controls were recruited. Equal amounts of individual genomic DNA were pooled from each group. Estimated allele frequencies of SNPs were derived by averaging the relative allele signal scores obtained with Affymetrix GeneChip® Mapping 10K Arrays. RESULTS Sixty-nine loci that displayed robust allele frequency differences between THPP and controls were identified. SNP rs750841 (A > T) in intron 3 of the gamma-aminobutyric acid (GABA) receptor alpha3 subunit (GABRA3) gene showed the most significant difference in allele frequency (27% in THPP cases and 5% in controls, P = 0.007). Actual allele frequencies obtained by genotyping each individual were very similar to the frequencies estimated from the pools (28% in THPP and 2% in controls, P = 0.0002). Nearby DNA sequences of GABRA3 were sequenced and two additional SNPs were found (A > C at exon 1 and G > T of rs12688128). Allele A of rs750841 and allele G of rs12688128 in intron 3 were predominantly found in THPP, with a significant genetic relative risk of 19 (P < 0.0002; 95% CI 2.4-151.6). CONCLUSIONS Whole-genome scanning on pooled DNA provides an accurate, useful screening tool for elucidating the genetic underpinnings of THPP. SNPs in intron 3 of GABRA3 were found to be associated with THPP.
|
17
|
Butcher LM, Plomin R. The nature of nurture: a genomewide association scan for family chaos. Behav Genet 2008; 38:361-71. [PMID: 18360741 PMCID: PMC2480594 DOI: 10.1007/s10519-008-9198-z] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2007] [Accepted: 02/25/2008] [Indexed: 11/25/2022]
Abstract
Widely used measures of the environment, especially the family environment of children, show genetic influence in dozens of twin and adoption studies. This phenomenon is known as gene-environment correlation in which genetically driven influences of individuals affect their environments. We conducted the first genome-wide association (GWA) analysis of an environmental measure. We used a measure called CHAOS which assesses 'environmental confusion' in the home, a measure that is more strongly associated with cognitive development in childhood than any other environmental measure. CHAOS was assessed by parental report when the children were 3 years and again when the children were 4 years; a composite CHAOS measure was constructed across the 2 years. We screened 490,041 autosomal single-nucleotide polymorphisms (SNPs) in a two-stage design in which children in low chaos families (N = 469) versus high chaos families (N = 369) from 3,000 families of 4-year-old twins were screened in Stage 1 using pooled DNA. In Stage 2, following SNP quality control procedures, 41 nominated SNPs were tested for association with family chaos by individual genotyping an independent representative sample of 3,529. Despite having 99% power to detect associations that account for more than 0.5% of the variance, none of the 41 nominated SNPs met conservative criteria for replication. Similar to GWA analyses of other complex traits, it is likely that most of the heritable variation in environmental measures such as family chaos is due to many genes of very small effect size.
Affiliation(s)
- Lee M Butcher
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, Box Number P082, De Crespigny Park, London, UK.
|
18
|
Butcher LM, Davis OSP, Craig IW, Plomin R. Genome-wide quantitative trait locus association scan of general cognitive ability using pooled DNA and 500K single nucleotide polymorphism microarrays. GENES BRAIN AND BEHAVIOR 2008; 7:435-46. [PMID: 18067574 PMCID: PMC2408663 DOI: 10.1111/j.1601-183x.2007.00368.x] [Citation(s) in RCA: 113] [Impact Index Per Article: 7.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 01/01/2023]
Abstract
General cognitive ability (g), which refers to what cognitive abilities have in common, is an important target for molecular genetic research because multivariate quantitative genetic analyses have shown that the same set of genes affects diverse cognitive abilities as well as learning disabilities. In this first autosomal genome-wide association scan of g, we used a two-stage quantitative trait locus (QTL) design with pooled DNA to screen more than 500 000 single nucleotide polymorphisms (SNPs) on microarrays, selecting from a sample of 7000 7-year-old children. In stage 1, we screened for allele frequency differences between groups pooled for low and high g. In stage 2, 47 SNPs nominated in stage 1 were tested by individually genotyping an independent sample of 3195 individuals, representative of the entire distribution of g scores in the full 7000 7-year-old children. Six SNPs yielded significant associations across the normal distribution of g, although only one SNP remained significant after a false discovery rate of 0.05 was imposed. However, none of these SNPs accounted for more than 0.4% of the variance of g, despite 95% power to detect associations of that size. It is likely that QTL effect sizes, even for highly heritable traits such as cognitive abilities and disabilities, are much smaller than previously assumed. Nonetheless, an aggregated ‘SNP set’ of the six SNPs correlated 0.11 (P < 0.00000003) with g. This shows that future SNP sets that will incorporate many more SNPs could be useful for predicting genetic risk and for investigating functional systems of effects from genes to brain to behavior.
Affiliation(s)
- L M Butcher
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, King's College London, London, UK
|
19
|
Abstract
The genetic dissection of complex disorders via genetic marker data has gained popularity in the postgenome era. Methods for typing genetic markers on human chromosomes continue to improve. Compared with the popular individual genotyping experiment, a pooled-DNA experiment (allelotyping experiment) is a more cost-effective way to carry out genetic typing. This chapter provides an overview of association mapping using pooled DNA and describes a five-stage study design comprising the preliminary calibration of peak intensities, estimation of allele frequency, single-locus association mapping, multilocus association mapping, and a confirmation study. Software and an analysis of authentic data are presented. The strengths and weaknesses of pooled-DNA analyses, as well as possible future applications for this method, are discussed.
Affiliation(s)
- Hsin-Chou Yang
- Institute of Biomedical Sciences, Academia Sinica, Nankang, Taipei, Taiwan
|
20
|
Docherty SJ, Butcher LM, Schalkwyk LC, Plomin R. Applicability of DNA pools on 500 K SNP microarrays for cost-effective initial screens in genomewide association studies. BMC Genomics 2007; 8:214. [PMID: 17610740 PMCID: PMC1925094 DOI: 10.1186/1471-2164-8-214] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2007] [Accepted: 07/04/2007] [Indexed: 01/02/2023] Open
Abstract
Background Genetic influences underpinning complex traits are thought to involve multiple quantitative trait loci (QTLs) of small effect size. Detection of such QTL associations requires systematic screening of large numbers of DNA markers within large sample populations. Using pooled DNA on SNP microarrays to screen for allelic frequency differences between groups such as cases and controls (called SNP Microarray and Pooling, or SNP-MaP) has been validated as an efficient solution on both 10 k and 100 k platforms. We demonstrate that this approach can be effectively applied to the truly genomewide Affymetrix GeneChip® Mapping 500 K Array. Results In comparisons between five independent DNA pools (N ~200 per pool) on separate Affymetrix GeneChip® Mapping 500 K Array sets, we show that, for SNPs with minor allele frequencies > 0.05, the reliability of the rank order of estimated allele frequencies, assessed as the average correlation between allele frequency estimates across the DNA pools, was 0.948 (average mean difference across the five pools = 0.069). Similarly, validity of the SNP-MaP approach was demonstrated by a rank-order correlation of 0.937 (average mean difference = 0.095) between the average DNA pool allele frequency estimates and the allele frequencies of an independent (CEPH) sample of 60 unrelated individually genotyped subjects. Conclusion We conclude that SNP-MaP can be extended for use on the Affymetrix GeneChip® Mapping 500 K Array, providing a cost-effective, reliable and valid initial screen of 500 K SNP microarrays in genomewide association scans.
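The reliability figure reported above (the average correlation between allele frequency estimates across pools, with the mean difference alongside it) can be illustrated with a short sketch. This is not the authors' code: the function names, the use of Spearman rank correlation and the toy input are all assumptions made for illustration, and the MAF > 0.05 filter is omitted.

```python
from itertools import combinations


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Rank-order correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def mean_pairwise_reliability(pools):
    """Average rank-order correlation over all pairs of pools (assumed metric)."""
    pairs = list(combinations(pools, 2))
    return sum(spearman(a, b) for a, b in pairs) / len(pairs)


def mean_abs_difference(x, y):
    """Mean absolute difference between two sets of per-SNP estimates."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)
```

Each inner list passed to `mean_pairwise_reliability` holds the per-SNP allele frequency estimates of one pool, in the same SNP order. The same `spearman` helper could be applied to pooled estimates versus individually genotyped frequencies for a validity check.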
Affiliation(s)
- Sophia J Docherty
- Social, Genetic and Developmental Psychiatry Centre, Box Number P082, Institute of Psychiatry, De Crespigny Park, London, SE5 8AF, UK
- Lee M Butcher
- Social, Genetic and Developmental Psychiatry Centre, Box Number P082, Institute of Psychiatry, De Crespigny Park, London, SE5 8AF, UK
- Leonard C Schalkwyk
- Social, Genetic and Developmental Psychiatry Centre, Box Number P082, Institute of Psychiatry, De Crespigny Park, London, SE5 8AF, UK
- Robert Plomin
- Social, Genetic and Developmental Psychiatry Centre, Box Number P082, Institute of Psychiatry, De Crespigny Park, London, SE5 8AF, UK
|
21
|
Coon KD, Dunckley TL, Stephan DA. A generic research paradigm for identification and validation of early molecular diagnostics and new therapeutics in common disorders. Mol Diagn Ther 2007; 11:1-14. [PMID: 17286446 DOI: 10.1007/bf03256218] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
Genetically complex disorders continue to confound investigators because of their many underlying factors, both genetic and environmental. In order to tease apart the heritable from the non-heritable contributions to disease, clinicians are relying on researchers in the rapidly expanding fields of high-throughput genomics to identify surrogate clinical endpoints, called biomarkers, that provide a measure of the probability that an individual will succumb to the disease in question. The goals of current biomedical research into complex disorders are to identify and utilize these biomarkers, not only for early detection, but also for personalized treatment with knowledge-guided therapeutics. As the identification of these biomarkers is basically a problem of discovery, we discuss new insights into biomarker detection utilizing the most current genomic technologies available. Additionally, we present here a generic paradigm for the validation of such molecular diagnostics as well as new treatment modalities for complex and increasingly common diseases. Lastly, we delve into the ways genomic biomarkers might be implemented in a clinical setting to allow the subsequent application of targeted therapeutics, which can help the ever expanding groups of individuals experiencing these insidious diseases.
Affiliation(s)
- Keith D Coon
- Neurogenomics Division, The Translational Genomics Research Institute, Phoenix, AZ 85004, USA
|
22
|
Wilkening S, Chen B, Wirtenberger M, Burwinkel B, Försti A, Hemminki K, Canzian F. Allelotyping of pooled DNA with 250 K SNP microarrays. BMC Genomics 2007; 8:77. [PMID: 17367522 PMCID: PMC1839100 DOI: 10.1186/1471-2164-8-77] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/13/2007] [Accepted: 03/16/2007] [Indexed: 12/02/2022] Open
Abstract
Background Genotyping technologies for whole genome association studies are now available. To perform such studies at an affordable price, pooled DNA can be used. Recent studies have shown that GeneChip Human Mapping 10 K and 50 K arrays are suitable for the estimation of the allele frequency in pooled DNA. In the present study, we tested the accuracy of the 250 K Nsp array, which is part of the 500 K array set representing 500,568 SNPs. Furthermore, we compared different algorithms to estimate allele frequencies of pooled DNA. Results We could confirm that the polynomial based probe specific correction (PPC) was the most accurate method for allele frequency estimation. However, a simple k-correction, using the relative allele signal (RAS) of heterozygous individuals, performed only slightly worse and provided results for more SNPs. Using four replicates of the 250 K array and the k-correction using heterozygous RAS values, we obtained results for 104,141 SNPs. The correlation between estimated and real allele frequency was 0.983 and the average error was 0.046, which was comparable to the results obtained with the 10 K array. Furthermore, we could show how the estimation accuracy depended on the SNP type (average error for A/T SNPs: 0.043 and for G/C SNPs: 0.052). Conclusion The combination of DNA pooling and analysis of single nucleotide polymorphisms (SNPs) on high density microarrays is a promising tool for whole genome association studies.
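The k-correction mentioned above can be sketched as follows. The abstract does not spell out the formula, so this uses the commonly cited form as an assumption: with RAS = A / (A + B) and k taken as the A/B signal ratio in heterozygotes, the corrected frequency is RAS / (RAS + k(1 - RAS)). All function names are hypothetical.

```python
def k_from_heterozygotes(het_ras_values):
    """Estimate k as the A/B signal ratio implied by the mean heterozygote RAS.

    Assumes RAS = A / (A + B), so A/B = RAS / (1 - RAS).
    """
    mean_ras = sum(het_ras_values) / len(het_ras_values)
    return mean_ras / (1.0 - mean_ras)


def k_corrected_frequency(pool_ras, k):
    """Correct a pooled RAS for unequal allele signal using k (assumed formula)."""
    return pool_ras / (pool_ras + k * (1.0 - pool_ras))


def pooled_estimate(replicate_ras, het_ras_values):
    """Average the pool RAS over replicate arrays, then apply the k-correction."""
    mean_pool_ras = sum(replicate_ras) / len(replicate_ras)
    return k_corrected_frequency(mean_pool_ras, k_from_heterozygotes(het_ras_values))
```

A quick sanity check on the design: a pool whose RAS equals the heterozygote RAS is mapped to a frequency of 0.5, and RAS values of 0 and 1 are left unchanged.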
Affiliation(s)
- Stefan Wilkening
- Department of Molecular Genetic Epidemiology, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- Bowang Chen
- Department of Molecular Genetic Epidemiology, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- Michael Wirtenberger
- Department of Molecular Genetic Epidemiology, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- Barbara Burwinkel
- Department of Molecular Genetic Epidemiology, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- Helmholtz University Group Molecular Epidemiology, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- Asta Försti
- Department of Molecular Genetic Epidemiology, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- Center for Family Medicine, Karolinska Institute, SE-14183 Huddinge, Sweden
- Kari Hemminki
- Department of Molecular Genetic Epidemiology, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
- Center for Family Medicine, Karolinska Institute, SE-14183 Huddinge, Sweden
- Federico Canzian
- Department of Molecular Genetic Epidemiology, German Cancer Research Center (DKFZ), 69120 Heidelberg, Germany
|
23
|
Pearson JV, Huentelman MJ, Halperin RF, Tembe WD, Melquist S, Homer N, Brun M, Szelinger S, Coon KD, Zismann VL, Webster JA, Beach T, Sando SB, Aasly JO, Heun R, Jessen F, Kolsch H, Tsolaki M, Daniilidou M, Reiman EM, Papassotiropoulos A, Hutton ML, Stephan DA, Craig DW. Identification of the genetic basis for complex disorders by use of pooling-based genomewide single-nucleotide-polymorphism association studies. Am J Hum Genet 2007; 80:126-39. [PMID: 17160900 PMCID: PMC1785308 DOI: 10.1086/510686] [Citation(s) in RCA: 126] [Impact Index Per Article: 7.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/21/2006] [Accepted: 11/07/2006] [Indexed: 01/06/2023] Open
Abstract
We report the development and validation of experimental methods, study designs, and analysis software for pooling-based genomewide association (GWA) studies that use high-throughput single-nucleotide-polymorphism (SNP) genotyping microarrays. We first describe a theoretical framework for establishing the effectiveness of pooling genomic DNA as a low-cost alternative to individually genotyping thousands of samples on high-density SNP microarrays. Next, we describe software called "GenePool," which directly analyzes SNP microarray probe intensity data and ranks SNPs by increased likelihood of being genetically associated with a trait or disorder. Finally, we apply these methods to experimental case-control data and demonstrate successful identification of published genetic susceptibility loci for a rare monogenic disease (sudden infant death with dysgenesis of the testes syndrome), a rare complex disease (progressive supranuclear palsy), and a common complex disease (Alzheimer disease) across multiple SNP genotyping platforms. On the basis of these theoretical calculations and their experimental validation, our results suggest that pooling-based GWA studies are a logical first step for determining whether major genetic associations exist in diseases with high heritability.
Affiliation(s)
- John V Pearson
- Translational Genomics Research Institute, Phoenix, AZ, 85004, USA
|
24
|
Johnson T. Bayesian method for gene detection and mapping, using a case and control design and DNA pooling. Biostatistics 2006; 8:546-65. [PMID: 16984977 DOI: 10.1093/biostatistics/kxl028] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
Association mapping studies aim to determine the genetic basis of a trait. A common experimental design uses a sample of unrelated individuals classified into 2 groups, for example cases and controls. If the trait has a complex genetic basis, consisting of many quantitative trait loci (QTLs), each group needs to be large. Each group must be genotyped at marker loci covering the region of interest; for dense coverage of a large candidate region, or a whole-genome scan, the number of markers will be very large. The total amount of genotyping required for such a study is formidable. A technique called DNA pooling, which is efficient in terms of laboratory effort, could reduce the amount of genotyping required, but the data generated are less informative and require novel methods for efficient analysis. In this paper, a Bayesian statistical analysis of the classic model of McPeek and Strahs is proposed. In contrast to previous work on this model, I assume that data are collected using DNA pooling, so individual genotypes are not directly observed, and also account for experimental errors. A complete analysis can be performed using analytical integration, a propagation algorithm for a hidden Markov model, and quadrature. The method developed here is both statistically and computationally efficient. It allows simultaneous detection and mapping of a QTL, in a large-scale association mapping study, using data from pooled DNA. The method is shown to perform well on data sets simulated under a realistic coalescent-with-recombination model, and is shown to outperform classical single-point methods. The method is illustrated on data consisting of 27 markers in an 880-kb region around the CYP2D6 gene.
Affiliation(s)
- Toby Johnson
- School of Biological Sciences, The University of Edinburgh, Edinburgh EH9 3JT, UK.
|
25
|
Yang HC, Liang YJ, Huang MC, Li LH, Lin CH, Wu JY, Chen YT, Fann C. A genome-wide study of preferential amplification/hybridization in microarray-based pooled DNA experiments. Nucleic Acids Res 2006; 34:e106. [PMID: 16931491 PMCID: PMC1616968 DOI: 10.1093/nar/gkl446] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/29/2006] [Revised: 05/05/2006] [Accepted: 06/09/2006] [Indexed: 01/27/2023] Open
Abstract
Microarray-based pooled DNA methods overcome the cost bottleneck of simultaneously genotyping more than 100 000 markers for numerous study individuals. The success of such methods relies on the proper adjustment of preferential amplification/hybridization to ensure accurate and reliable allele frequency estimation. We performed a hybridization-based genome-wide single nucleotide polymorphisms (SNPs) genotyping analysis to dissect preferential amplification/hybridization. The majority of SNPs had less than 2-fold signal amplification or suppression, and the lognormal distributions adequately modeled preferential amplification/hybridization across the human genome. Comparative analyses suggested that the distributions of preferential amplification/hybridization differed among genotypes and the GC content. Patterns among different ethnic populations were similar; nevertheless, there were striking differences for a small proportion of SNPs, and a slight ethnic heterogeneity was observed. To fulfill appropriate and gratuitous adjustments, databases of preferential amplification/hybridization for African Americans, Caucasians and Asians were constructed based on the Affymetrix GeneChip Human Mapping 100 K Set. The robustness of allele frequency estimation using this database was validated by a pooled DNA experiment. This study provides a genome-wide investigation of preferential amplification/hybridization and suggests guidance for the reliable use of the database. Our results constitute an objective foundation for theoretical development of preferential amplification/hybridization and provide important information for future pooled DNA analyses.
Affiliation(s)
- H.-C. Yang
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
- Y.-J. Liang
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
- M.-C. Huang
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
- L.-H. Li
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
- C.-H. Lin
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
- J.-Y. Wu
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
- Y.-T. Chen
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
- C.S.J. Fann
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
|
26
|
Yang HC, Pan CC, Lin CY, Fann CSJ. PDA: Pooled DNA analyzer. BMC Bioinformatics 2006; 7:233. [PMID: 16643673 PMCID: PMC1539032 DOI: 10.1186/1471-2105-7-233] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/03/2005] [Accepted: 04/28/2006] [Indexed: 11/19/2022] Open
Abstract
Background Association mapping using abundant single nucleotide polymorphisms is a powerful tool for identifying disease susceptibility genes for complex traits and exploring possible genetic diversity. Genotyping large numbers of SNPs individually is performed routinely but is cost prohibitive for large-scale genetic studies. DNA pooling is a reliable and cost-saving alternative genotyping method. However, no software has been developed for complete pooled-DNA analyses, including data standardization, allele frequency estimation, and single/multipoint DNA pooling association tests. This motivated the development of the software 'PDA' (Pooled DNA Analyzer) to analyze pooled DNA data. Results We developed the software, PDA, for the analysis of pooled-DNA data. PDA was originally implemented in the MATLAB® language, but it can also be executed on a Windows system without installing MATLAB®. PDA provides estimates of the coefficient of preferential amplification and allele frequency. PDA considers an extended single-point association test, which can compare allele frequencies between two DNA pools constructed under different experimental conditions. Moreover, PDA also provides novel chromosome-wide multipoint association tests based on p-value combinations and a sliding-window concept. This new multipoint testing procedure overcomes a computational bottleneck of conventional haplotype-oriented multipoint methods in DNA pooling analyses and can handle data sets having a large pool size and/or large numbers of polymorphic markers. All of the PDA functions are illustrated with four bona fide examples. Conclusion PDA is simple to operate and does not require that users have a strong statistical background. The software is available at .
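The idea of a multipoint test built from p-value combinations over a sliding window can be illustrated with a minimal sketch. This is not PDA's algorithm: the abstract does not say which combination rule is used, so Fisher's method is swapped in here as an assumption, and the function names are hypothetical. Fisher's method also assumes independent tests, which neighbouring SNPs in linkage disequilibrium would violate.

```python
import math


def chi2_sf_even_df(x, df):
    """Survival function of a chi-square variable with an even number of df.

    Uses the closed form exp(-x/2) * sum_{i < df/2} (x/2)^i / i!.
    """
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combined_p(p_values):
    """Combine p-values (each in (0, 1]) with Fisher's method."""
    stat = -2.0 * sum(math.log(p) for p in p_values)
    return chi2_sf_even_df(stat, 2 * len(p_values))


def sliding_window_p(p_values, window):
    """Combined p-value for every run of `window` adjacent markers."""
    if not 1 <= window <= len(p_values):
        raise ValueError("window must be between 1 and the number of markers")
    return [
        fisher_combined_p(p_values[i:i + window])
        for i in range(len(p_values) - window + 1)
    ]
```

With single-point p-values ordered by chromosomal position, `sliding_window_p(p_values, 3)` returns one combined value per window of three adjacent markers, which avoids any haplotype reconstruction.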
Affiliation(s)
- Hsin-Chou Yang
- Institute of Biomedical Sciences, Academia Sinica, Nankang, Taipei, 115, Taiwan
- Chia-Ching Pan
- Institute of Biomedical Sciences, Academia Sinica, Nankang, Taipei, 115, Taiwan
- Chin-Yu Lin
- Institute of Biomedical Sciences, Academia Sinica, Nankang, Taipei, 115, Taiwan
- Cathy SJ Fann
- Institute of Biomedical Sciences, Academia Sinica, Nankang, Taipei, 115, Taiwan
|
27
|
Macgregor S, Visscher PM, Montgomery G. Analysis of pooled DNA samples on high density arrays without prior knowledge of differential hybridization rates. Nucleic Acids Res 2006; 34:e55. [PMID: 16627870 PMCID: PMC1440945 DOI: 10.1093/nar/gkl136] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open
Abstract
Array based DNA pooling techniques facilitate genome-wide scale genotyping of large samples. We describe a structured analysis method for pooled data using internal replication information in large scale genotyping sets. The method takes advantage of information from single nucleotide polymorphisms (SNPs) typed in parallel on a high density array to construct a test statistic with desirable statistical properties. We utilize a general linear model to appropriately account for the structured multiple measurements available with array data. The method does not require the use of additional arrays for the estimation of unequal hybridization rates and hence scales readily to accommodate arrays with several hundred thousand SNPs. Tests for differences between cases and controls can be conducted with very few arrays. We demonstrate the method on 384 endometriosis cases and controls, typed using Affymetrix Genechip© HindIII 50 K arrays. For a subset of this data there were accurate measures of hybridization rates available. Assuming equal hybridization rates is shown to have a negligible effect upon the results. With a total of only six arrays, the method extracted one-third of the information (in terms of equivalent sample size) available with individual genotyping (requiring 768 arrays). With 20 arrays (10 for cases, 10 for controls), over half of the information could be extracted from this sample.
Affiliation(s)
- Stuart Macgregor
- Genetic Epidemiology, Queensland Institute of Medical Research, Brisbane, Australia.
|
28
|
Kirov G, Nikolov I, Georgieva L, Moskvina V, Owen MJ, O'Donovan MC. Pooled DNA genotyping on Affymetrix SNP genotyping arrays. BMC Genomics 2006; 7:27. [PMID: 16480507 PMCID: PMC1382214 DOI: 10.1186/1471-2164-7-27] [Citation(s) in RCA: 68] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/07/2005] [Accepted: 02/15/2006] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Genotyping technology has advanced such that genome-wide association studies of complex diseases based upon dense marker maps are now technically feasible. However, the cost of such projects remains high. Pooled DNA genotyping offers the possibility of applying the same technologies at a fraction of the cost, and there is some evidence that certain ultra-high throughput platforms also perform with an acceptable accuracy. However, thus far, this conclusion is based upon published data concerning only a small number of SNPs. RESULTS In the current study we prepared DNA pools from the parents and from the offspring of 30 parent-child trios that have been extensively genotyped by the HapMap project. We analysed the two pools with Affymetrix 10 K Xba 142 2.0 Arrays. The availability of the HapMap data allowed us to validate the performance of 6843 SNPs for which we had both complete individual and pooled genotyping data. Pooled analyses averaged over 5-6 microarrays resulted in highly reproducible results. Moreover, the accuracy of estimating differences in allele frequency between pools using this ultra-high throughput system was comparable with previous reports of pooling based upon lower throughput platforms, with an average error for the predicted allelic frequencies differences between the two pools of 1.37% and with 95% of SNPs showing an error of < 3.2%. CONCLUSION Genotyping thousands of SNPs with DNA pooling using Affymetrix microarrays produces highly accurate results and can be used for genome-wide association studies.
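The accuracy measure described above (the error in the predicted allele frequency difference between two pools, after averaging over replicate arrays) can be sketched as below. The data layout, function names and the simple empirical 95th percentile are illustrative assumptions, not the authors' procedure.

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def pool_frequency(replicate_arrays):
    """Average per-SNP estimates over replicate arrays of one pool.

    `replicate_arrays` is a list of arrays, each a list of per-SNP estimates
    in the same SNP order (an assumed layout).
    """
    n_snps = len(replicate_arrays[0])
    return [mean([array[s] for array in replicate_arrays]) for s in range(n_snps)]


def difference_errors(est_a, est_b, true_a, true_b):
    """Per-SNP absolute error of the estimated between-pool difference."""
    return [
        abs((ea - eb) - (ta - tb))
        for ea, eb, ta, tb in zip(est_a, est_b, true_a, true_b)
    ]


def summarize(errors, quantile=0.95):
    """Return the mean error and the error not exceeded by `quantile` of SNPs."""
    ordered = sorted(errors)
    index = max(0, math.ceil(quantile * len(ordered)) - 1)
    return mean(errors), ordered[index]
```

Here `true_a` and `true_b` stand for allele frequencies computed from individual genotypes of the pooled samples, such as the HapMap genotypes used in the study.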
Affiliation(s)
- George Kirov: Department of Psychological Medicine, Henry Wellcome Building, Cardiff University, Heath Park, Cardiff CF14 4XN, UK
- Ivan Nikolov: Department of Psychological Medicine, Henry Wellcome Building, Cardiff University, Heath Park, Cardiff CF14 4XN, UK
- Lyudmila Georgieva: Department of Psychological Medicine, Henry Wellcome Building, Cardiff University, Heath Park, Cardiff CF14 4XN, UK
- Valentina Moskvina: Department of Psychological Medicine, Henry Wellcome Building, Cardiff University, Heath Park, Cardiff CF14 4XN, UK
- Michael J Owen: Department of Psychological Medicine, Henry Wellcome Building, Cardiff University, Heath Park, Cardiff CF14 4XN, UK
- Michael C O'Donovan: Department of Psychological Medicine, Henry Wellcome Building, Cardiff University, Heath Park, Cardiff CF14 4XN, UK
|
29
|
Meaburn E, Butcher LM, Schalkwyk LC, Plomin R. Genotyping pooled DNA using 100K SNP microarrays: a step towards genomewide association scans. Nucleic Acids Res 2006; 34:e27. [PMID: 16478714 PMCID: PMC1368655 DOI: 10.1093/nar/gnj027] [Citation(s) in RCA: 79] [Impact Index Per Article: 4.4] [Indexed: 02/01/2023]
Abstract
The identification of quantitative trait loci (QTLs) of small effect size that underlie complex traits poses a particular challenge for geneticists due to the large sample sizes and large numbers of genetic markers required for genomewide association scans. An efficient solution for screening purposes is to combine single nucleotide polymorphism (SNP) microarrays and DNA pooling (SNP-MaP), an approach that has been shown to be valid, reliable and accurate in deriving relative allele frequency estimates from pooled DNA for groups such as cases and controls for 10K SNP microarrays. However, in order to conduct a genomewide association study many more SNP markers are needed. To this end, we assessed the validity and reliability of the SNP-MaP method using Affymetrix GeneChip® Mapping 100K Array set. Interpretable results emerged for 95% of the SNPs (nearly 110 000 SNPs). We found that SNP-MaP allele frequency estimates correlated 0.939 with allele frequencies for 97 605 SNPs that were genotyped individually in an independent population; the correlation was 0.971 for 26 SNPs that were genotyped individually for the 1028 individuals used to construct the DNA pools. We conclude that extending the SNP-MaP method to the Affymetrix GeneChip® Mapping 100K Array set provides a useful screen of >100 000 SNP markers for QTL association scans.
Affiliation(s)
- Emma Meaburn: Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, King's College London, De Crespigny Park, London, SE5 8AF, UK.
|
30
|
Wilkening S, Hemminki K, Thirumaran RK, Bermejo JL, Bonn S, Försti A, Kumar R. Determination of allele frequency in pooled DNA: comparison of three PCR-based methods. Biotechniques 2005; 39:853-8. [PMID: 16382903 DOI: 10.2144/000112027] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.8] [Indexed: 11/23/2022]
Abstract
Determination of allele frequency in pooled DNA samples is a powerful and efficient tool for large-scale association studies. In this study, we tested and compared three PCR-based methods for accuracy, reproducibility, cost, and convenience. The methods compared were: (i) real-time PCR with allele-specific primers, (ii) real-time PCR with allele-specific TaqMan® probes, and (iii) quantitative sequencing. Allele frequencies of three single nucleotide polymorphisms in three different genes were estimated from pooled DNA. The pools were made of genomic DNA samples from 96 cases with basal cell carcinoma of the skin and 96 healthy controls with known genotypes. In this study, the allele frequency estimation made by real-time PCR with allele-specific primers had the smallest median deviation (MD) from the real allele frequency with 1.12% (absolute percentage points) and was also the cheapest method. However, this method required the most time for optimization and showed the highest variation between replicates (SD = 6.47%). Quantitative sequencing, the simplest method, was found to have intermediate accuracies (MD = 1.44%, SD = 4.2%). Real-time PCR with TaqMan probes, a convenient but very expensive method, had an MD of 1.47% and the lowest variation between replicates (SD = 3.18%).
Affiliation(s)
- Stefan Wilkening: German Cancer Research Center, Molecular Genetic Epidemiology, Heidelberg, Germany.
|
31
|
Craig DW, Huentelman MJ, Hu-Lince D, Zismann VL, Kruer MC, Lee AM, Puffenberger EG, Pearson JM, Stephan DA. Identification of disease causing loci using an array-based genotyping approach on pooled DNA. BMC Genomics 2005; 6:138. [PMID: 16197552 PMCID: PMC1262713 DOI: 10.1186/1471-2164-6-138] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.1] [Received: 05/13/2005] [Accepted: 09/30/2005] [Indexed: 11/10/2022]
Abstract
BACKGROUND Pooling genomic DNA samples within clinical classes of disease followed by genotyping on whole-genome SNP microarrays, allows for rapid and inexpensive genome-wide association studies. Key to the success of these studies is the accuracy of the allelic frequency calculations, the ability to identify false-positives arising from assay variability and the ability to better resolve association signals through analysis of neighbouring SNPs. RESULTS We report the accuracy of allelic frequency measurements on pooled genomic DNA samples by comparing these measurements to the known allelic frequencies as determined by individual genotyping. We describe modifications to the calculation of k-correction factors from relative allele signal (RAS) values that remove biases and result in more accurate allelic frequency predictions. Our results show that the least accurate SNPs, those most likely to give false-positives in an association study, are identifiable by comparing their frequencies to both those from a known database of individual genotypes and those of the pooled replicates. In a disease with a previously identified genetic mutation, we demonstrate that one can identify the disease locus through the comparison of the predicted allelic frequencies in case and control pools. Furthermore, we demonstrate improved resolution of association signals using the mean of individual test-statistics for consecutive SNPs windowed across the genome. A database of k-correction factors for predicting allelic frequencies for each SNP, derived from several thousand individually genotyped samples, is provided. Lastly, a Perl script for calculating RAS values for the Affymetrix platform is provided. CONCLUSION Our results illustrate that pooling of DNA samples is an effective initial strategy to identify a genetic locus. 
However, it is important to eliminate inaccurate SNPs prior to analysis by comparing them to a database of individually genotyped samples as well as by comparing them to replicates of the pool. Lastly, detection of association signals can be improved by incorporating data from neighbouring SNPs.
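The abstract refers to k-correction factors derived from relative allele signal (RAS) values. A minimal sketch of the commonly used, unmodified form of k-correction follows. It assumes RAS is the A-allele signal divided by the total signal and that k is estimated from the RAS of heterozygous individuals. It does not reproduce the modifications the authors describe, and the function names are hypothetical.

```python
def k_from_heterozygote_ras(ras_het: float) -> float:
    """Estimate k as the A:B signal ratio seen in heterozygotes.

    For an unbiased assay ras_het is 0.5 and k is 1.
    """
    if not 0.0 < ras_het < 1.0:
        raise ValueError("heterozygote RAS must lie strictly between 0 and 1")
    return ras_het / (1.0 - ras_het)


def k_corrected_frequency(ras_pool: float, k: float) -> float:
    """Correct a pooled RAS for unequal allele representation.

    Uses f_A = A / (A + k * B) with A = RAS and B = 1 - RAS.
    """
    if not 0.0 <= ras_pool <= 1.0:
        raise ValueError("pool RAS must lie between 0 and 1")
    if k <= 0.0:
        raise ValueError("k must be positive")
    return ras_pool / (ras_pool + k * (1.0 - ras_pool))


# Hypothetical SNP whose heterozygotes read RAS = 0.6 instead of 0.5.
k = k_from_heterozygote_ras(0.6)
corrected = k_corrected_frequency(0.6, k)
```

With this form, a pool that reads the same RAS as a heterozygote is mapped back to a frequency of 0.5, while RAS values of 0 and 1 are left unchanged.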
Affiliation(s)
- David W Craig: Neurogenomics Division, Translational Genomics Research Institute (TGen) Phoenix, Arizona 85004, USA
- Matthew J Huentelman: Neurogenomics Division, Translational Genomics Research Institute (TGen) Phoenix, Arizona 85004, USA
- Diane Hu-Lince: Neurogenomics Division, Translational Genomics Research Institute (TGen) Phoenix, Arizona 85004, USA
- Victoria L Zismann: Neurogenomics Division, Translational Genomics Research Institute (TGen) Phoenix, Arizona 85004, USA
- Michael C Kruer: Neurogenomics Division, Translational Genomics Research Institute (TGen) Phoenix, Arizona 85004, USA
- Anne M Lee: Neurogenomics Division, Translational Genomics Research Institute (TGen) Phoenix, Arizona 85004, USA
- John M Pearson: Neurogenomics Division, Translational Genomics Research Institute (TGen) Phoenix, Arizona 85004, USA
- Dietrich A Stephan: Neurogenomics Division, Translational Genomics Research Institute (TGen) Phoenix, Arizona 85004, USA
|
32
|
Brohede J, Dunne R, McKay JD, Hannan GN. PPC: an algorithm for accurate estimation of SNP allele frequencies in small equimolar pools of DNA using data from high density microarrays. Nucleic Acids Res 2005; 33:e142. [PMID: 16199750 PMCID: PMC1240117 DOI: 10.1093/nar/gni142] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.7] [Indexed: 02/02/2023]
Abstract
Robust estimation of allele frequencies in pools of DNA has the potential to reduce genotyping costs and/or increase the number of individuals contributing to a study where hundreds of thousands of genetic markers need to be genotyped in very large populations sample sets, such as genome wide association studies. In order to make accurate allele frequency estimations from pooled samples a correction for unequal allele representation must be applied. We have developed the polynomial based probe specific correction (PPC) which is a novel correction algorithm for accurate estimation of allele frequencies in data from high-density microarrays. This algorithm was validated through comparison of allele frequencies from a set of 10 individually genotyped DNA's and frequencies estimated from pools of these 10 DNAs using GeneChip 10K Mapping Xba 131 arrays. Our results demonstrate that when using the PPC to correct for allelic biases the accuracy of the allele frequency estimates increases dramatically.
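The abstract does not spell out the PPC algorithm, so the following is only a loosely related sketch of what a polynomial, SNP-specific correction could look like. It assumes three calibration readings per SNP (the relative A-allele signal observed in individually genotyped BB, AB and AA samples) and fits the quadratic that maps them to frequencies 0, 0.5 and 1. The function name, the calibration scheme and the numbers are assumptions, not the published method.

```python
def quadratic_correction(ras_bb: float, ras_ab: float, ras_aa: float):
    """Return a function mapping a raw signal ratio to an allele frequency.

    The mapping is the unique quadratic through (ras_bb, 0.0),
    (ras_ab, 0.5) and (ras_aa, 1.0), built by Lagrange interpolation.
    """
    xs = (ras_bb, ras_ab, ras_aa)
    ys = (0.0, 0.5, 1.0)
    if len(set(xs)) < 3:
        raise ValueError("calibration readings must be distinct")

    def correct(ras: float) -> float:
        total = 0.0
        for i in range(3):
            term = ys[i]
            for j in range(3):
                if j != i:
                    term *= (ras - xs[j]) / (xs[i] - xs[j])
            total += term
        # Clamp, because a quadratic can leave [0, 1] away from the anchors.
        return min(1.0, max(0.0, total))

    return correct


# Hypothetical SNP with a biased heterozygote reading of 0.6.
correct = quadratic_correction(ras_bb=0.1, ras_ab=0.6, ras_aa=0.9)
frequency = correct(0.6)
```

A quadratic through three points is not guaranteed to be monotone, so a production version would need to check the fitted curve over the observed signal range.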
Affiliation(s)
- Jesper Brohede: CSIRO Preventative Health National Research Flagship, Sydney, Australia; CSIRO Molecular and Health Technologies, Sydney, Australia
- Rob Dunne: CSIRO Preventative Health National Research Flagship, Sydney, Australia; CSIRO Mathematical and Information Sciences, Sydney, Australia
- James D. McKay: Menzies Research Institute, University of Tasmania, Hobart, Australia; International Agency for Research on Cancer, Lyon, France
- Garry N. Hannan: CSIRO Preventative Health National Research Flagship, Sydney, Australia; CSIRO Molecular and Health Technologies, Sydney, Australia. To whom correspondence should be addressed. Tel. +61 2 9490 5054; Fax +61 2 9490 5010;
|
33
|
Meaburn E, Butcher LM, Liu L, Fernandes C, Hansen V, Al-Chalabi A, Plomin R, Craig I, Schalkwyk LC. Genotyping DNA pools on microarrays: tackling the QTL problem of large samples and large numbers of SNPs. BMC Genomics 2005; 6:52. [PMID: 15811185 PMCID: PMC1079828 DOI: 10.1186/1471-2164-6-52] [Citation(s) in RCA: 55] [Impact Index Per Article: 2.9] [Received: 10/15/2004] [Accepted: 04/05/2005] [Indexed: 11/10/2022]
Abstract
Background Quantitative trait locus (QTL) theory predicts that genetic influence on complex traits involves multiple genes of small effect size. To detect QTL associations of small effect size, large samples and systematic screens of thousands of DNA markers are required. An efficient solution is to genotype case and control DNA pools using SNP microarrays. We demonstrate that this is practical using DNA pools of 100 individuals. Results Using standard microarray protocols for the Affymetrix GeneChip® Mapping 10 K Array Xba 131, we show that relative allele signal (RAS) values provide a quantitative index of allele frequencies in pooled DNA that correlate 0.986 with allele frequencies for 104 SNPs that were genotyped individually for 100 individuals. The sensitivity of the assay was demonstrated empirically in a spiking experiment in which 15% and 20% of one individual's DNA was added to a DNA pool. Conclusion We conclude that this approach, which we call SNP-MaP (SNP microarrays and pooling), is rapid, cost effective and promises to be a valuable initial screening method in the hunt for QTLs.
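The relative allele signal (RAS) mentioned in this abstract can be sketched as the share of the total signal attributable to one allele, averaged over replicate arrays. This is a simplified illustration: real array software derives the value from several probes per SNP, a step omitted here, and the function names and intensities are hypothetical.

```python
from statistics import mean


def relative_allele_signal(signal_a: float, signal_b: float) -> float:
    """RAS for allele A: its signal as a fraction of the total signal."""
    total = signal_a + signal_b
    if signal_a < 0 or signal_b < 0 or total <= 0:
        raise ValueError("signals must be non-negative with a positive sum")
    return signal_a / total


def pooled_ras(replicate_signals) -> float:
    """Average RAS over replicate arrays of the same DNA pool.

    `replicate_signals` is an iterable of (signal_a, signal_b) pairs.
    """
    return mean(relative_allele_signal(a, b) for a, b in replicate_signals)


# Hypothetical intensities from two replicate arrays of one pool.
ras = pooled_ras([(300.0, 100.0), (260.0, 140.0)])
```

The result is an index of allele frequency rather than a calibrated estimate, which is consistent with the abstract's description of RAS as a quantitative index that correlates with individually genotyped frequencies.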
Affiliation(s)
- Emma Meaburn: Social, Genetic and Developmental Psychiatry Centre, Box Number P082, Institute of Psychiatry, De Crespigny Park, London, SE5 8AF, UK
- Lee M Butcher: Social, Genetic and Developmental Psychiatry Centre, Box Number P082, Institute of Psychiatry, De Crespigny Park, London, SE5 8AF, UK
- Lin Liu: Social, Genetic and Developmental Psychiatry Centre, Box Number P082, Institute of Psychiatry, De Crespigny Park, London, SE5 8AF, UK
- Cathy Fernandes: Social, Genetic and Developmental Psychiatry Centre, Box Number P082, Institute of Psychiatry, De Crespigny Park, London, SE5 8AF, UK
- Valerie Hansen: Department of Neurology, Section of Neurogenetics, Box Number P043, Institute of Psychiatry, De Crespigny Park, London, SE5 8AF, UK
- Ammar Al-Chalabi: Department of Neurology, Section of Neurogenetics, Box Number P043, Institute of Psychiatry, De Crespigny Park, London, SE5 8AF, UK
- Robert Plomin: Social, Genetic and Developmental Psychiatry Centre, Box Number P082, Institute of Psychiatry, De Crespigny Park, London, SE5 8AF, UK
- Ian Craig: Social, Genetic and Developmental Psychiatry Centre, Box Number P082, Institute of Psychiatry, De Crespigny Park, London, SE5 8AF, UK
- Leonard C Schalkwyk: Social, Genetic and Developmental Psychiatry Centre, Box Number P082, Institute of Psychiatry, De Crespigny Park, London, SE5 8AF, UK
|
34
|
Butcher LM, Meaburn E, Knight J, Sham PC, Schalkwyk LC, Craig IW, Plomin R. SNPs, microarrays and pooled DNA: identification of four loci associated with mild mental impairment in a sample of 6000 children. Hum Mol Genet 2005; 14:1315-25. [PMID: 15800012 DOI: 10.1093/hmg/ddi142] [Citation(s) in RCA: 77] [Impact Index Per Article: 4.1] [Indexed: 11/14/2022]
Abstract
Mild mental impairment (MMI) represents the low extreme of the quantitative trait of general intelligence and is highly heritable. Quantitative trait loci (QTLs) conferring susceptibility to MMI, as for most complex traits, are likely to be of small effect size. Using a novel approach we call SNP-MaP (SNP Microarrays and Pooling), we have identified four loci associated with MMI. These four loci have been replicated in two SNP-MaP studies and verified by individual genotyping. The two SNP-MaP studies conducted were a case versus control comparison (n = 515 and n = 1028, respectively) and a low versus high general intelligence extremes group comparison (n = 503 and n = 505, respectively). Each of the four groups consisted of five independent 'subpools', with each subpool assayed on a separate microarray. Twelve loci showing the largest significant differences in both SNP-MaP studies were individually genotyped on 6154 children. Of the four loci positively associated with MMI, the minor allele of each conferred the greater risk for MMI. Two of the loci are close to known genes and may be in linkage disequilibrium with them. One of the loci is between the candidate genes KLF7 and CREB1, but given possible long-range effects on expression and the unknown importance of untranslated elements such as micro-RNAs, all four loci deserve attention as candidates. Although each SNP accounts for a small amount of variance, their effects are additive and they can be combined in a 'SNP set' that can be used as a genetic risk index for MMI in behavioral genomic analyses.
Affiliation(s)
- Lee M Butcher: Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, London, UK.
|