1
El-Kady AM, El-Amir MI, Hassan MH, Allemailem KS, Almatroudi A, Ahmad AA. Genetic Diversity of Schistosoma haematobium in Qena Governorate, Upper Egypt. Infect Drug Resist 2020; 13:3601-3611. [PMID: 33116680 PMCID: PMC7575065 DOI: 10.2147/idr.s266928] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/09/2020] [Accepted: 09/15/2020] [Indexed: 11/23/2022] Open
Abstract
Introduction: Schistosomiasis is an important neglected tropical disease (NTD) in several developing countries. Praziquantel is the principal and efficacious chemotherapeutic agent that has been used to treat schistosomiasis for decades. Unfortunately, emerging resistance to praziquantel with accompanying reduced efficacy is reported in some localities. Hence, genetic diversity among parasite populations is of significant interest in assessing the effects of selective pressure generated by praziquantel therapy, which might encourage the emergence of new genotypes that are either non-susceptible or drug-resistant. The present study aimed to investigate the genetic diversity of Schistosoma haematobium among human populations using the RAPD technique to help clarify disease epidemiology and transmission. Materials and Methods: S. haematobium eggs were isolated from 50 of 134 patients from four different localities in Qena Governorate, Upper Egypt. These patients complained of terminal hematuria and burning micturition. Samples were used for molecular analysis using RAPD-PCR primers (A02, A07, A09, A10). Results: Twenty S. haematobium isolates (40%) were amplified using the selected RAPD primers. Amplification patterns of these isolates showed distinct variation in the size and number of amplified fragments, indicating high genetic variation among these isolates. Conclusion: To the best of our knowledge, this study is the first to characterize the genetic diversity of S. haematobium in human populations in Upper Egypt. Future studies on a larger geographic scale involving many districts in Upper Egypt should be encouraged. Information from such a study would provide better insight into clonal lineages of S. haematobium in this endemic area. In turn, understanding transmission of the parasite may have a major role in establishing control strategies for urogenital schistosomiasis in Upper Egypt.
Affiliation(s)
- Asmaa M El-Kady
- Department of Medical Parasitology, Faculty of Medicine, South Valley University, Qena, Egypt
- Mostafa I El-Amir
- Department of Medical Microbiology and Immunology, Faculty of Medicine, South Valley University, Qena, Egypt
- Mohammed H Hassan
- Department of Medical Biochemistry, Faculty of Medicine, South Valley University, Qena, Egypt
- Khaled S Allemailem
- Department of Medical Laboratories, College of Applied Medical Sciences, Qassim University, Buraydah, Saudi Arabia
- Ahmad Almatroudi
- Department of Medical Laboratories, College of Applied Medical Sciences, Qassim University, Buraydah, Saudi Arabia
2
Holtgräwe D, Rosleff Soerensen T, Hausmann L, Pucker B, Viehöver P, Töpfer R, Weisshaar B. A Partially Phase-Separated Genome Sequence Assembly of the Vitis Rootstock 'Börner' (Vitis riparia × Vitis cinerea) and Its Exploitation for Marker Development and Targeted Mapping. Front Plant Sci 2020; 11:156. [PMID: 32194587 PMCID: PMC7064618 DOI: 10.3389/fpls.2020.00156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2019] [Accepted: 01/31/2020] [Indexed: 06/10/2023]
Abstract
Grapevine breeding has become highly relevant due to upcoming challenges like climate change, a decrease in the number of available fungicides, increasing public concern about plant protection, and the demand for a sustainable production. Downy mildew caused by Plasmopara viticola is one of the most devastating diseases worldwide of cultivated Vitis vinifera. In modern breeding programs, therefore, genetic marker technologies and genomic data are used to develop new cultivars with defined and stacked resistance loci. Potential sources of resistance are wild species of American or Asian origin. The interspecific hybrid of Vitis riparia Gm 183 x Vitis cinerea Arnold, available as the rootstock cultivar 'Börner,' carries several relevant resistance loci. We applied next-generation sequencing to enable the reliable identification of simple sequence repeats (SSR), and we also generated a draft genome sequence assembly of 'Börner' to access genome-wide sequence variations in a comprehensive and highly reliable way. These data were used to cover the 'Börner' genome with genetic marker positions. A subset of these marker positions was used for targeted mapping of the P. viticola resistance locus, Rpv14, to validate the marker position list. Based on the reference genome sequence PN40024, the position of this resistance locus can be narrowed down to less than 0.5 Mbp on chromosome 5.
Affiliation(s)
- Daniela Holtgräwe
- Faculty of Biology, Center for Biotechnology, Bielefeld University, Bielefeld, Germany
- Ludger Hausmann
- Institute for Grapevine Breeding Geilweilerhof, Julius Kuehn-Institute, Federal Research Centre for Cultivated Plants, Siebeldingen, Germany
- Boas Pucker
- Faculty of Biology, Center for Biotechnology, Bielefeld University, Bielefeld, Germany
- Prisca Viehöver
- Faculty of Biology, Center for Biotechnology, Bielefeld University, Bielefeld, Germany
- Reinhard Töpfer
- Institute for Grapevine Breeding Geilweilerhof, Julius Kuehn-Institute, Federal Research Centre for Cultivated Plants, Siebeldingen, Germany
- Bernd Weisshaar
- Faculty of Biology, Center for Biotechnology, Bielefeld University, Bielefeld, Germany
3
Blanton RE. Population Structure and Dynamics of Helminthic Infection: Schistosomiasis. Microbiol Spectr 2019; 7:10.1128/microbiolspec.ame-0009-2019. [PMID: 31325285 PMCID: PMC6650164 DOI: 10.1128/microbiolspec.ame-0009-2019] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 03/26/2019] [Indexed: 11/20/2022] Open
Abstract
While disease and outbreaks are mainly clonal for bacteria and other asexually reproducing organisms, sexual reproduction in schistosomes and other helminths usually results in unique individuals. For sexually reproducing organisms, the traits conserved in clones will instead be conserved in the group of organisms that tends to breed together, the population. While the same tools are applied to characterize DNA, how results are interpreted can be quite different at times (see another article in this collection, http://www.asmscience.org/content/journal/microbiolspec/10.1128/microbiolspec.AME-0002-2018). It is difficult to know what the real effect any control program has on the parasite population without assessing the health of this population, how they respond to the control measure, and how they recover, if they do. This review, part of the Microbiology Spectrum Curated Collection: Advances in Molecular Epidemiology of Infectious Diseases, concentrates on one approach using pooled samples to study schistosome populations and shows how this and other approaches have contributed to our understanding of this parasite family's biology and epidemiology. *This article is part of a curated collection.
Affiliation(s)
- Ronald E Blanton
- Center for Global Health and Diseases, Department of Pathology, Case Western Reserve University, Cleveland, OH 44120
4
A new approach based on targeted pooled DNA sequencing identifies novel mutations in patients with Inherited Retinal Dystrophies. Sci Rep 2018; 8:15457. [PMID: 30337596 PMCID: PMC6194132 DOI: 10.1038/s41598-018-33810-3] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 02/19/2018] [Accepted: 10/04/2018] [Indexed: 01/28/2023] Open
Abstract
Inherited retinal diseases (IRD) are a heterogeneous group of diseases that mainly affect the retina; more than 250 genes have been linked to the disease and more than 20 different clinical phenotypes have been described. This heterogeneity at both the clinical and genetic levels complicates the identification of causative mutations. Therefore, a detailed genetic characterization is important for genetic counselling and decisions regarding treatment. In this study, we developed a method consisting of pooled targeted next generation sequencing (NGS) that we applied to 316 eye disease related genes, followed by High Resolution Melting and copy number variation analysis. DNA from 115 unrelated test samples was pooled and samples with known mutations were used as positive controls to assess the sensitivity of our approach. Causal mutations for IRDs were found in 36 patients, achieving a detection rate of 31.3%. Overall, 49 likely causative mutations were identified in characterized patients, 14 of which were first described in this study (28.6%). Our study shows that this new approach is a cost-effective tool for detection of causative mutations in patients with inherited retinopathies.
5
Dougherty L, Singh R, Brown S, Dardick C, Xu K. Exploring DNA variant segregation types in pooled genome sequencing enables effective mapping of weeping trait in Malus. J Exp Bot 2018; 69:1499-1516. [PMID: 29361034 PMCID: PMC5888915 DOI: 10.1093/jxb/erx490] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 07/04/2017] [Accepted: 12/19/2017] [Indexed: 05/19/2023]
Abstract
To unlock the power of next generation sequencing-based bulked segregant analysis in allele discovery in out-crossing woody species, and to understand the genetic control of the weeping trait, an F1 population from the cross 'Cheal's Weeping' × 'Evereste' was used to create two genomic DNA pools 'weeping' (17 progeny) and 'standard' (16 progeny). Illumina pair-end (2 × 151 bp) sequencing of the pools to a 27.1× (weeping) and a 30.4× (standard) genome (742.3 Mb) coverage allowed detection of 84562 DNA variants specific to 'weeping', 92148 specific to 'standard', and 173169 common to both pools. A detailed analysis of the DNA variant genotypes in the pools predicted three informative segregation types of variants: (type I) in weeping pool-specific variants, and (type II) and (type III) in variants common to both pools, where the first allele is assumed to be weeping linked and the allele shown in bold is a variant in relation to the reference genome. Conducting variant allele frequency and density-based mappings revealed four genomic regions with a significant association with weeping: a major locus, Weeping (W), on chromosome 13 and others on chromosomes 10 (W2), 16 (W3), and 5 (W4). The results from type I variants were noisier and less certain than those from type II and type III variants, demonstrating that although type I variants are often the first choice, type II and type III variants represent an important source of DNA variants that can be exploited for genetic mapping in out-crossing woody species. Confirmation of the mapping of W and W2, investigation into their genetic interactions, and identification of expressed genes in the W and W2 regions provided insight into the genetic control of weeping and its expressivity in Malus.
Affiliation(s)
- Laura Dougherty
- Horticulture Section, School of Integrative Plant Science, Cornell University, USA
- Raksha Singh
- Horticulture Section, School of Integrative Plant Science, Cornell University, USA
- Susan Brown
- Horticulture Section, School of Integrative Plant Science, Cornell University, USA
- Kenong Xu
- Horticulture Section, School of Integrative Plant Science, Cornell University, USA
6
Smadja CM, Loire E, Caminade P, Thoma M, Latour Y, Roux C, Thoss M, Penn DJ, Ganem G, Boursot P. Seeking signatures of reinforcement at the genetic level: a hitchhiking mapping and candidate gene approach in the house mouse. Mol Ecol 2015; 24:4222-4237. [PMID: 26132782 DOI: 10.1111/mec.13301] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 04/02/2015] [Revised: 06/12/2015] [Accepted: 06/17/2015] [Indexed: 12/11/2022]
Abstract
Reinforcement is the process by which prezygotic isolation is strengthened as a response to selection against hybridization. Most empirical support for reinforcement comes from the observation of its possible phenotypic signature: an accentuated degree of prezygotic isolation in the hybrid zone as compared to allopatry. Here, we implemented a novel approach to this question by seeking the signature of reinforcement at the genetic level. In the house mouse, selection against hybrids and enhanced olfactory-based assortative mate preferences are observed in a hybrid zone between the two European subspecies Mus musculus musculus and M. m. domesticus, suggesting a possible recent reinforcement event. To test for the genetic signature of reinforcing selection and identify genes involved in sexual isolation, we adopted a hitchhiking mapping approach targeting genomic regions containing candidate genes for assortative mating in mice. We densely scanned these genomic regions in hybrid zone and allopatric samples using a large number of fast-evolving microsatellite loci that allow the detection of recent selection events. We found a handful of loci showing the expected pattern: a significant reduction in variability in populations close to the hybrid zone, which show assortative odour preference in mate choice experiments, as compared to populations further away, which display no such preference. These loci lie close to genes that we pinpoint as testable candidates for further investigation.
Affiliation(s)
- Carole M Smadja
- Institut des Sciences de l'Evolution UMR 5554 (Centre National de la Recherche Scientifique CNRS, Institut pour la Recherche et le Développement IRD, Université de Montpellier), cc065 Université de Montpellier, Campus Triolet, 34095 Montpellier, France
- Etienne Loire
- Institut des Sciences de l'Evolution UMR 5554 (Centre National de la Recherche Scientifique CNRS, Institut pour la Recherche et le Développement IRD, Université de Montpellier), cc065 Université de Montpellier, Campus Triolet, 34095 Montpellier, France
- Pierre Caminade
- Institut des Sciences de l'Evolution UMR 5554 (Centre National de la Recherche Scientifique CNRS, Institut pour la Recherche et le Développement IRD, Université de Montpellier), cc065 Université de Montpellier, Campus Triolet, 34095 Montpellier, France
- Marios Thoma
- Institut des Sciences de l'Evolution UMR 5554 (Centre National de la Recherche Scientifique CNRS, Institut pour la Recherche et le Développement IRD, Université de Montpellier), cc065 Université de Montpellier, Campus Triolet, 34095 Montpellier, France
- Yasmin Latour
- Institut des Sciences de l'Evolution UMR 5554 (Centre National de la Recherche Scientifique CNRS, Institut pour la Recherche et le Développement IRD, Université de Montpellier), cc065 Université de Montpellier, Campus Triolet, 34095 Montpellier, France
- Camille Roux
- Institut des Sciences de l'Evolution UMR 5554 (Centre National de la Recherche Scientifique CNRS, Institut pour la Recherche et le Développement IRD, Université de Montpellier), cc065 Université de Montpellier, Campus Triolet, 34095 Montpellier, France
- Michaela Thoss
- University of Veterinary Medicine Vienna, Konrad Lorenz Institute of Ethology, Department of Integrative Biology and Evolution, Vienna, Austria
- Dustin J Penn
- University of Veterinary Medicine Vienna, Konrad Lorenz Institute of Ethology, Department of Integrative Biology and Evolution, Vienna, Austria
- Guila Ganem
- Institut des Sciences de l'Evolution UMR 5554 (Centre National de la Recherche Scientifique CNRS, Institut pour la Recherche et le Développement IRD, Université de Montpellier), cc065 Université de Montpellier, Campus Triolet, 34095 Montpellier, France
- Pierre Boursot
- Institut des Sciences de l'Evolution UMR 5554 (Centre National de la Recherche Scientifique CNRS, Institut pour la Recherche et le Développement IRD, Université de Montpellier), cc065 Université de Montpellier, Campus Triolet, 34095 Montpellier, France
7
Ezeh C, Yin M, Li H, Zhang T, Xu B, Sacko M, Feng Z, Hu W. High genetic variability of Schistosoma haematobium in Mali and Nigeria. Korean J Parasitol 2015; 53:129-34. [PMID: 25748721 PMCID: PMC4384788 DOI: 10.3347/kjp.2015.53.1.129] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 06/26/2014] [Revised: 10/21/2014] [Accepted: 10/23/2014] [Indexed: 12/29/2022]
Abstract
Schistosoma haematobium is one of the most prevalent parasitic flatworms, infecting over 112 million people in Africa. However, little is known about the genetic diversity of natural S. haematobium populations from the human host because of the inaccessible location of adult worms in the host. We used 4 microsatellite loci to genotype individually pooled S. haematobium eggs directly from each patient sampled at 4 endemic locations in Africa. We found that the average allele number of individuals from Mali was significantly higher than that from Nigeria. In addition, no significant difference in allelic composition was detected among the populations within Nigeria; however, the allelic composition was significantly different between Mali and Nigeria populations. This study demonstrated a high level of genetic variability of S. haematobium in the populations from Mali and Nigeria, the 2 major African endemic countries, suggesting that geographical population differentiation may occur in the regions.
Affiliation(s)
- Charles Ezeh
- Key Laboratory of Parasite and Vector Biology, Ministry of Public Health, National Institute of Parasitic Diseases, Chinese Center for Disease Control and Prevention, Shanghai 200025, China
- Mingbo Yin
- School of Life Science, Fudan University, Shanghai 200433, China
- Hongyan Li
- School of Life Science, Fudan University, Shanghai 200433, China
- Ting Zhang
- Key Laboratory of Parasite and Vector Biology, Ministry of Public Health, National Institute of Parasitic Diseases, Chinese Center for Disease Control and Prevention, Shanghai 200025, China
- Bin Xu
- Key Laboratory of Parasite and Vector Biology, Ministry of Public Health, National Institute of Parasitic Diseases, Chinese Center for Disease Control and Prevention, Shanghai 200025, China
- Moussa Sacko
- Laboratory of Parasitology, Institut National de Recherche en Sante Publique, 1771, Bamako, Mali
- Zheng Feng
- Key Laboratory of Parasite and Vector Biology, Ministry of Public Health, National Institute of Parasitic Diseases, Chinese Center for Disease Control and Prevention, Shanghai 200025, China
- Wei Hu
- Key Laboratory of Parasite and Vector Biology, Ministry of Public Health, National Institute of Parasitic Diseases, Chinese Center for Disease Control and Prevention, Shanghai 200025, China ; School of Life Science, Fudan University, Shanghai 200433, China
8
Guo Y, Cai Q, Li C, Li J, Courtney R, Zheng W, Long J. An evaluation of allele frequency estimation accuracy using pooled sequencing data. Int J Comput Biol Drug Des 2013; 6:279-93. [PMID: 24088264 DOI: 10.1504/ijcbdd.2013.056709] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/21/2022]
Abstract
Next generation sequencing technology has matured and, with its current affordability, will replace the SNP chip as the genotyping tool of choice. Even with the current affordability of NGS, large-scale studies will require careful study design to reduce cost. In this study, we designed an experiment to assess the accuracy of allele frequency estimated from pooled sequencing data. We compared the allele frequency estimated from sequencing data with the allele frequency estimated from individual SNP chip data and observed high correlations between them. However, by calculating the error rate, we found that for many SNPs the allele frequency estimated from sequencing data differed significantly from the allele frequency estimated from SNP chip data. In conclusion, we found that correlation is not an ideal measure for comparing allele frequencies. For the purpose of estimating allele frequency, we do not recommend using pooling with NGS as a cheaper alternative to genotyping each sample individually.
Affiliation(s)
- Yan Guo
- Department of Cancer Biology, Vanderbilt University, Nashville TN 37232, USA
9
Cuenca J, Aleza P, Navarro L, Ollitrault P. Assignment of SNP allelic configuration in polyploids using competitive allele-specific PCR: application to citrus triploid progeny. Ann Bot 2013; 111:731-42. [PMID: 23422023 PMCID: PMC3605964 DOI: 10.1093/aob/mct032] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.5] [Received: 10/02/2012] [Accepted: 01/04/2013] [Indexed: 05/20/2023]
Abstract
Background: Polyploidy is a major component of eukaryote evolution. Estimation of allele copy numbers for molecular markers has long been considered a challenge for polyploid species, while this process is essential for most genetic research. With the increasing availability and whole-genome coverage of single nucleotide polymorphism (SNP) markers, it is essential to implement a versatile SNP genotyping method to assign allelic configuration efficiently in polyploids. Scope: This work evaluates the usefulness of the KASPar method, based on competitive allele-specific PCR, for the assignment of SNP allelic configuration. Citrus was chosen as a model because of its economic importance, the ongoing worldwide polyploidy manipulation projects for cultivar and rootstock breeding, and the increasing availability of SNP markers. Conclusions: Fifteen SNP markers were successfully designed that produced clear allele signals that were in agreement with previous genotyping results at the diploid level. The analysis of DNA mixes between two haploid lines (Clementine and pummelo) at 13 different ratios revealed a very high correlation (average = 0.9796; s.d. = 0.0094) between the allele ratio and two parameters [θ angle = arctan(y/x) and y' = y/(x + y)] derived from the two normalized allele signals (x and y) provided by KASPar. Separated cluster analysis and analysis of variance (ANOVA) from mixed DNA simulating triploid and tetraploid hybrids provided 99.71% correct allelic configuration. Moreover, triploid populations arising from 2n gametes and interploid crosses were easily genotyped and provided useful genetic information. This work demonstrates that the KASPar SNP genotyping technique is an efficient way to assign heterozygous allelic configurations within polyploid populations. This method is accurate, simple and cost-effective. Moreover, it may be useful for quantitative studies, such as relative allele-specific expression analysis and bulk segregant analysis.
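The two signal-derived parameters named in this abstract can be computed directly from a pair of normalized allele signals. The following is a minimal sketch only: the function name `kaspar_parameters` and the input validation are illustrative assumptions, not part of the KASPar software or the authors' analysis.

```python
import math


def kaspar_parameters(x, y):
    """Return (theta, y_prime) for two normalized allele signals x and y.

    theta   = arctan(y / x), in radians
    y_prime = y / (x + y)
    The function name and the checks below are hypothetical.
    """
    if x < 0 or y < 0 or x + y == 0:
        raise ValueError("signals must be non-negative and not both zero")
    # atan2(y, x) equals arctan(y / x) for x > 0 and stays defined at x == 0.
    theta = math.atan2(y, x)
    y_prime = y / (x + y)
    return theta, y_prime


if __name__ == "__main__":
    # Made-up signal pairs running from pure allele X to pure allele Y.
    for x, y in [(1.0, 0.0), (0.8, 0.4), (0.5, 0.5), (0.0, 1.0)]:
        theta, y_prime = kaspar_parameters(x, y)
        print(f"x={x:.1f} y={y:.1f} theta={theta:.3f} rad y'={y_prime:.3f}")
```

Both parameters increase monotonically as the share of the y allele grows, which is why either can serve as a proxy for the allele ratio in a DNA mix.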
Affiliation(s)
- José Cuenca
- Centro de Protección Vegetal y Biotecnología, Instituto Valenciano de Investigaciones Agrarias (IVIA), 46113 Moncada (Valencia), Spain
- Pablo Aleza
- Centro de Protección Vegetal y Biotecnología, Instituto Valenciano de Investigaciones Agrarias (IVIA), 46113 Moncada (Valencia), Spain
- Luis Navarro
- Centro de Protección Vegetal y Biotecnología, Instituto Valenciano de Investigaciones Agrarias (IVIA), 46113 Moncada (Valencia), Spain
- Patrick Ollitrault
- Centro de Protección Vegetal y Biotecnología, Instituto Valenciano de Investigaciones Agrarias (IVIA), 46113 Moncada (Valencia), Spain
- UMR AGAP, Centre de Coopération Internationale en Recherche Agronomique pour le Développement (CIRAD), TA A-108/02, 34398 Montpellier, Cedex 5, France
10
Evaluation of allele frequency estimation using pooled sequencing data simulation. ScientificWorldJournal 2013; 2013:895496. [PMID: 23476151 PMCID: PMC3582166 DOI: 10.1155/2013/895496] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 11/15/2012] [Accepted: 12/30/2012] [Indexed: 11/17/2022] Open
Abstract
Next-generation sequencing (NGS) technology has provided researchers with opportunities to study the genome in unprecedented detail. In particular, NGS is applied to disease association studies. Unlike genotyping chips, NGS is not limited to a fixed set of SNPs. Prices for NGS are now comparable to the SNP chip, although for large studies the cost can be substantial. Pooling techniques are often used to reduce the overall cost of large-scale studies. In this study, we designed a rigorous simulation model to test the practicability of estimating allele frequency from pooled sequencing data. We took crucial factors into consideration, including pool size, overall depth, average depth per sample, pooling variation, and sampling variation. We used real data to demonstrate and measure reference allele preference in DNAseq data and implemented this bias in our simulation model. We found that pooled sequencing data can introduce high levels of relative error rate (defined as error rate divided by targeted allele frequency) and that the error rate is more severe for low minor allele frequency SNPs than for high minor allele frequency SNPs. In order to overcome the error introduced by pooling, we recommend a large pool size and high average depth per sample.
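The relative error rate defined in this abstract (error rate divided by targeted allele frequency) explains why low-frequency SNPs fare worse. A minimal sketch follows; it assumes that "error rate" means the absolute difference between the estimated and targeted frequency, which the abstract does not spell out, and the function name is hypothetical.

```python
def relative_error_rate(estimated_af, target_af):
    """Absolute estimation error divided by the targeted allele frequency.

    Assumption: the "error rate" is |estimated - target|; the abstract
    does not give the exact definition used in the simulation.
    """
    if target_af <= 0:
        raise ValueError("target allele frequency must be positive")
    return abs(estimated_af - target_af) / target_af


if __name__ == "__main__":
    # The same absolute error of 0.01 at a rare and at a common allele.
    print(relative_error_rate(0.06, 0.05))  # roughly 0.2
    print(relative_error_rate(0.51, 0.50))  # roughly 0.02
```

The same absolute error is about ten times larger in relative terms for the rare allele, which is consistent with the abstract's observation about low minor allele frequency SNPs.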
11
Feder AF, Petrov DA, Bergland AO. LDx: estimation of linkage disequilibrium from high-throughput pooled resequencing data. PLoS One 2012; 7:e48588. [PMID: 23152785 PMCID: PMC3494690 DOI: 10.1371/journal.pone.0048588] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.0] [Received: 07/06/2012] [Accepted: 10/03/2012] [Indexed: 12/14/2022] Open
Abstract
High-throughput pooled resequencing offers significant potential for whole genome population sequencing. However, its main drawback is the loss of haplotype information. In order to regain some of this information, we present LDx, a computational tool for estimating linkage disequilibrium (LD) from pooled resequencing data. LDx uses an approximate maximum likelihood approach to estimate LD (r²) between pairs of SNPs that can be observed within and among single reads. LDx also reports r² estimates derived solely from observed genotype counts. We demonstrate that the LDx estimates are highly correlated with r² estimated from individually resequenced strains. We discuss the performance of LDx using more stringent quality conditions and infer via simulation the degree to which performance can improve based on read depth. Finally we demonstrate two possible uses of LDx with real and simulated pooled resequencing data. First, we use LDx to infer genomewide patterns of decay of LD with physical distance in D. melanogaster population resequencing data. Second, we demonstrate that r² estimates from LDx are capable of distinguishing alternative demographic models representing plausible demographic histories of D. melanogaster.
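For orientation, the standard count-based r² between two biallelic SNPs can be sketched from the numbers of reads that carry each two-site combination. This is the textbook formula only; it is not LDx's approximate maximum likelihood estimator, and the function name and argument layout are assumptions made for illustration.

```python
def r_squared(n_AB, n_Ab, n_aB, n_ab):
    """Textbook r^2 from counts of the four two-site haplotypes.

    Each count is the number of reads (or read pairs) spanning both SNPs
    that show the given allele combination. Hypothetical helper, not LDx.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no reads span both sites")
    p_A = (n_AB + n_Ab) / total  # frequency of allele A at the first SNP
    p_B = (n_AB + n_aB) / total  # frequency of allele B at the second SNP
    denominator = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denominator == 0:
        raise ValueError("both sites must be polymorphic in the reads")
    d = n_AB / total - p_A * p_B  # disequilibrium coefficient D
    return d * d / denominator


if __name__ == "__main__":
    print(r_squared(50, 0, 0, 50))    # alleles always co-occur
    print(r_squared(25, 25, 25, 25))  # alleles are independent
```

Only SNP pairs close enough to be covered by the same read or read pair contribute counts, which is why the abstract restricts LD estimation to pairs observed within and among single reads.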
Affiliation(s)
- Alison F Feder
- Department of Biology, Stanford University, Stanford, California, United States of America.
12
Danaher MR, Schisterman EF, Roy A, Albert PS. Estimation of gene-environment interaction by pooling biospecimens. Stat Med 2012; 31:3241-52. [PMID: 22859290 DOI: 10.1002/sim.5357] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/21/2011] [Accepted: 02/08/2012] [Indexed: 11/09/2022]
Abstract
Case-control studies are prone to low power for testing gene-environment interactions (GXE) given the need for a sufficient number of individuals in each stratum of disease, gene, and environment. We propose a new study design to increase power by strategically pooling biospecimens. Pooling biospecimens allows us to increase the number of subjects significantly, thereby providing a substantial increase in power. We focus on a special, although realistic, case where disease and environmental statuses are binary, and gene status is ordinal with each individual having 0, 1, or 2 minor alleles. Through pooling, we obtain an allele frequency for each level of disease and environmental status. Using the allele frequencies, we develop a new methodology for estimating and testing GXE that is comparable to the situation when we have complete data on gene status for each individual. We also explore the measurement process and its effect on the GXE estimator. Using an illustration, we show the effectiveness of pooling with an epidemiologic study, which tests an interaction for fiber and paraoxonase on anovulation. Through simulation, we show that taking 12 pooled measurements from 1000 individuals achieves more power than individually genotyping 500 individuals. Our findings suggest that strategic pooling should be considered when an investigator designs a pilot study to test for a GXE.
Affiliation(s)
- M R Danaher
- Division of Epidemiology, Statistics and Prevention Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Rockville, MD, U.S.A
13
Zhu Y, Bergland AO, González J, Petrov DA. Empirical validation of pooled whole genome population re-sequencing in Drosophila melanogaster. PLoS One 2012; 7:e41901. [PMID: 22848651 PMCID: PMC3406057 DOI: 10.1371/journal.pone.0041901] [Citation(s) in RCA: 73] [Impact Index Per Article: 6.1] [Received: 04/18/2012] [Accepted: 06/28/2012] [Indexed: 11/26/2022] Open
Abstract
The sequencing of pooled non-barcoded individuals is an inexpensive and efficient means of assessing genome-wide population allele frequencies, yet its accuracy has not been thoroughly tested. We assessed the accuracy of this approach on whole, complex eukaryotic genomes by resequencing pools of largely isogenic, individually sequenced Drosophila melanogaster strains. We called SNPs in the pooled data and estimated false positive and false negative rates using the SNPs called in individual strains as a reference. We also estimated allele frequency of the SNPs using “pooled” data and compared them with “true” frequencies taken from the estimates in the individual strains. We demonstrate that pooled sequencing provides a faithful estimate of population allele frequency with the error well approximated by binomial sampling, and is a reliable means of novel SNP discovery with low false positive rates. However, a sufficient number of strains should be used in the pooling because variation in the amount of DNA derived from individual strains is a substantial source of noise when the number of pooled strains is low. Our results and analysis confirm that pooled sequencing is a very powerful and cost-effective technique for assessing patterns of sequence variation in populations on genome-wide scales, and is applicable to any dataset where sequencing individuals or individual cells is impossible, difficult, time consuming, or expensive.
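The statement that the error is well approximated by binomial sampling can be illustrated with the usual binomial standard error of a frequency estimated from a given number of reads. This is a simplified sketch: it treats read depth as the only source of sampling noise and ignores the finite number of pooled strains and unequal DNA contributions, both of which the abstract flags as relevant; the function name is hypothetical.

```python
import math


def binomial_standard_error(allele_frequency, read_depth):
    """Standard error of an allele frequency estimated from read_depth reads.

    Assumption: each read is an independent draw from the pool, so the
    estimate follows a binomial distribution with variance p(1 - p) / n.
    """
    if not 0 <= allele_frequency <= 1:
        raise ValueError("allele frequency must lie in [0, 1]")
    if read_depth <= 0:
        raise ValueError("read depth must be positive")
    variance = allele_frequency * (1 - allele_frequency) / read_depth
    return math.sqrt(variance)


if __name__ == "__main__":
    for depth in (25, 100, 400):
        se = binomial_standard_error(0.2, depth)
        print(f"depth={depth:4d} standard error={se:.4f}")
```

Under this approximation the error shrinks with the square root of depth, so quadrupling coverage at a site halves the expected sampling error.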
Affiliation(s)
- Yuan Zhu
- Department of Genetics, Stanford University, Stanford, California, United States of America.
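The abstract for entry 13 states that the error of a pooled allele-frequency estimate is well approximated by binomial sampling. A minimal sketch of what that approximation implies is given here, assuming each read is an independent draw from the pool; the function name `binomial_se` and the example numbers are illustrative assumptions, not values from the paper.

```python
import math


def binomial_se(freq: float, depth: int) -> float:
    """Standard error of an allele-frequency estimate built from `depth`
    reads, treating each read as an independent binomial draw.

    This is only the textbook binomial approximation; it ignores unequal
    DNA contribution per strain, which the abstract flags as a noise
    source when few strains are pooled.
    """
    if not 0.0 <= freq <= 1.0:
        raise ValueError("freq must lie in [0, 1]")
    if depth <= 0:
        raise ValueError("depth must be positive")
    return math.sqrt(freq * (1.0 - freq) / depth)


if __name__ == "__main__":
    # Hypothetical example: a variant at 50% frequency covered by 100 reads.
    print(binomial_se(0.5, 100))
```

Under this approximation the standard error shrinks with the square root of read depth, so quadrupling coverage halves the expected sampling error.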
|
14
|
Design and Statistical Analysis of Pooled Next Generation Sequencing for Rare Variants. Journal of Probability and Statistics 2012. [DOI: 10.1155/2012/524724] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/17/2022] Open
Abstract
Next generation sequencing (NGS) is a revolutionary technology for biomedical research. One highly cost-efficient application of NGS is to detect disease association based on pooled DNA samples. However, several key issues need to be addressed for pooled NGS. One of them is the high sequencing error rate and its high variability across genomic positions and experiment runs, which, if not well considered in the experimental design and analysis, could lead to either inflated false positive rates or loss in statistical power. Another important issue is how to test association of a group of rare variants. To address the first issue, we proposed a new blocked pooling design in which multiple pools of DNA samples from cases and controls are sequenced together on the same NGS functional units. To address the second issue, we proposed a testing procedure that does not require individual genotypes but instead takes advantage of multiple DNA pools. Through a simulation study, we demonstrated that our approach provides good control of the type I error rate, and yields satisfactory power compared to the test based on individual genotypes. Our results also provide guidelines for designing an efficient pooled.
|
15
|
Wang T, Pradhan K, Ye K, Wong LJ, Rohan TE. Estimating allele frequency from next-generation sequencing of pooled mitochondrial DNA samples. Front Genet 2011; 2:51. [PMID: 22303347 PMCID: PMC3268604 DOI: 10.3389/fgene.2011.00051] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 06/28/2011] [Accepted: 07/25/2011] [Indexed: 01/08/2023] Open
Abstract
Background: Both common and rare mitochondrial DNA (mtDNA) variants may contribute to genetic susceptibility to some complex human diseases. Understanding of the role of mtDNA variants will provide valuable insights into the etiology of these diseases. However, to date, there have not been any large-scale, genome-wide association studies of complete mtDNA variants and disease risk. One reason for this might be the substantial cost of sequencing the large number of samples required for genetic epidemiology studies. Next-generation sequencing of pooled mtDNA samples will dramatically reduce the cost of such studies and may represent an appealing approach for large-scale genetic epidemiology studies. However, the performance of the different designs of sequencing pooled mtDNA has not been evaluated. Methods: We examined the approach of sequencing pooled mtDNA of multiple individuals for estimating allele frequency using the Illumina genome analyzer (GA) II sequencing system. In this study the pool included mtDNA samples of 20 subjects that had been sequenced previously using Sanger sequencing. Each pool was replicated once to assess variation of the sequencing error between pools. To reduce such variation, barcoding was used for sequencing different pools in the same lane of the flow cell. To evaluate the effect of different pooling strategies pooling was done at both the pre- and post-PCR amplification step. Results: The sequencing error rate was close to that expected based on the Phred score. When only reads with Phred ≥ 20 were considered, the average error rate was about 0.3%. However, there was significant variation of the base-calling errors for different types of bases or at different loci. Using the results of the Sanger sequencing as the standard, the sensitivity of single nucleotide polymorphism detection with post-PCR pooling (about 99%) was higher than that of the pre-PCR pooling (about 82%), while the two approaches had similar specificity (about 99%). 
Among a total of 298 variants in the sample, the allele frequencies of 293 variants (98%) were correctly estimated with post-PCR pooling, the correlation between the estimated and the true allele frequencies being >0.99, while only 206 allele frequencies (69%) were correctly estimated in the pre-PCR pooling, the correlation being 0.89. Conclusion: Sequencing of mtDNA pooled after PCR amplification is a viable tool for screening mitochondrial variants potentially related to human diseases.
Affiliation(s)
- Tao Wang
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine Bronx, NY, USA
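The abstract for entry 15 compares the observed sequencing error rate with the rate expected from the Phred score, and reports the share of correctly estimated allele frequencies for each pooling strategy. The sketch below, offered only as an illustration, applies the standard Phred definition (error probability = 10^(-Q/10)) and recomputes the two percentages from the counts quoted in the abstract; the function names are assumptions.

```python
def phred_to_error(q: float) -> float:
    """Convert a Phred quality score to its nominal base-call error probability."""
    return 10.0 ** (-q / 10.0)


def percent(count: int, total: int) -> float:
    """Share of `count` in `total`, as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


if __name__ == "__main__":
    # Phred 20 is the read-quality cutoff mentioned in the abstract.
    print(phred_to_error(20))
    # Counts quoted in the abstract: 293 and 206 of 298 variants.
    print(round(percent(293, 298)), round(percent(206, 298)))
```

A Phred cutoff of 20 corresponds to a nominal per-base error ceiling of 1%, which is the upper bound against which the reported average of about 0.3% can be read.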
|
16
|
Borge KS, Børresen-Dale AL, Lingaas F. Identification of genetic variation in 11 candidate genes of canine mammary tumour. Vet Comp Oncol 2011; 9:241-50. [DOI: 10.1111/j.1476-5829.2010.00250.x] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Indexed: 11/29/2022]
|
17
|
Wang T, Lin CY, Rohan TE, Ye K. Resequencing of pooled DNA for detecting disease associations with rare variants. Genet Epidemiol 2010; 34:492-501. [PMID: 20578089 DOI: 10.1002/gepi.20502] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Indexed: 11/08/2022]
Abstract
A combination of common and rare variants is thought to contribute to genetic susceptibility to complex diseases. Recently, next-generation sequencers have greatly lowered sequencing costs, providing an opportunity to identify rare disease variants in large genetic epidemiology studies. At present, it is still expensive and time consuming to resequence a large number of individual genomes. However, given that next-generation sequencing technology can provide accurate estimates of allele frequencies from pooled DNA samples, it is possible to detect associations of rare variants using pooled DNA sequencing. Current statistical approaches to the analysis of associations with rare variants are not designed for use with pooled next-generation sequencing data. Hence, they may not be optimal in terms of both validity and power. Therefore, we propose here a new statistical procedure to analyze the output of pooled sequencing data. The test statistic can be computed rapidly, making it feasible to test the association of a large number of variants with disease. By simulation, we compare this approach to Fisher's exact test based either on pooled or individual genotypic data. Our results demonstrate that the proposed method provides good control of the Type I error rate, while yielding substantially higher power than Fisher's exact test using pooled genotypic data for testing rare variants, and has similar or higher power than that of Fisher's exact test using individual genotypic data. Our results also provide guidelines on how various parameters of the pooled sequencing design affect the efficiency of detecting associations.
Affiliation(s)
- Tao Wang
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, New York 10461, USA.
|
18
|
Shental N, Amir A, Zuk O. Identification of rare alleles and their carriers using compressed se(que)nsing. Nucleic Acids Res 2010; 38:e179. [PMID: 20699269 PMCID: PMC2965256 DOI: 10.1093/nar/gkq675] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.1] [Received: 01/17/2010] [Revised: 06/20/2010] [Accepted: 07/19/2010] [Indexed: 11/29/2022] Open
Abstract
Identification of rare variants by resequencing is important both for detecting novel variations and for screening individuals for known disease alleles. New technologies enable low-cost resequencing of target regions, although it is still prohibitive to test more than a few individuals. We propose a novel pooling design that enables the recovery of novel or known rare alleles and their carriers in groups of individuals. The method is based on a Compressed Sensing (CS) approach, which is general, simple and efficient. CS allows the use of generic algorithmic tools for simultaneous identification of multiple variants and their carriers. We model the experimental procedure and show via computer simulations that it enables the recovery of rare alleles and their carriers in larger groups than were possible before. Our approach can also be combined with barcoding techniques to provide a feasible solution based on current resequencing costs. For example, when targeting a small enough genomic region (∼100 bp) and using only ∼10 sequencing lanes and ∼10 distinct barcodes per lane, one recovers the identity of 4 rare allele carriers out of a population of over 4000 individuals. We demonstrate the performance of our approach over several publicly available experimental data sets.
Affiliation(s)
- Noam Shental
- Department of Computer Science, The Open University of Israel, Raanana 43107, Department of Physics of Complex Systems, Weizmann Institute of Science, Rehovot 76100, Israel and Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Amnon Amir
- Department of Computer Science, The Open University of Israel, Raanana 43107, Department of Physics of Complex Systems, Weizmann Institute of Science, Rehovot 76100, Israel and Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Or Zuk
- Department of Computer Science, The Open University of Israel, Raanana 43107, Department of Physics of Complex Systems, Weizmann Institute of Science, Rehovot 76100, Israel and Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
|
19
|
Blank WA, Reis EAG, Thiong'o FW, Braghiroli JF, Santos JM, Melo PRS, Guimarães ICS, Silva LK, Carmo TMA, Reis MG, Blanton RE. Analysis of Schistosoma mansoni population structure using total fecal egg sampling. J Parasitol 2010; 95:881-9. [PMID: 20049994 DOI: 10.1645/ge-1895.1] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Indexed: 11/10/2022] Open
Abstract
Many parasite populations are difficult to sample because they are not uniformly distributed between several host species and are often not easily collected from the living host, thereby limiting sample size and possibly distorting the representation of the population. For the parasite Schistosoma mansoni, we investigated the use of eggs, in aggregate, from the stools of infected individuals as a simple and representative sample. Previously, we demonstrated that microsatellite allele frequencies can be accurately estimated from pooled DNA of cloned S. mansoni adults. Here, we show that genotyping of parasite populations from reproductively isolated laboratory strains can be used to identify these specific populations based on characteristic patterns of allele frequencies, as observed by polyacrylamide gel electrophoresis and automated sequencer analysis of fluorescently labeled PCR products. Microsatellites used to genotype aggregates of eggs collected from stools of infected individuals produced results consistent with the geographic distribution of the samples. Preferential amplification of smaller alleles, and stutter PCR products, had negligible effect on measurement of genetic differentiation. Direct analysis of total stool eggs can be an important approach to questions of population genetics for this parasite by increasing the sample size to thousands per infected individual and by reducing bias.
Affiliation(s)
- Walter A Blank
- Center for Global Health and Diseases, Case Western Reserve University, Cleveland, Ohio 44106, USA
|
20
|
Abstract
Resequencing genomic DNA from pools of individuals is an effective strategy to detect new variants in targeted regions and compare them between cases and controls. There are numerous ways to assign individuals to the pools on which they are to be sequenced. The naïve, disjoint pooling scheme (many individuals to one pool) in predominant use today offers insight into allele frequencies, but does not offer the identity of an allele carrier. We present a framework for overlapping pool design, where each individual sample is resequenced in several pools (many individuals to many pools). Upon discovering a variant, the set of pools where this variant is observed reveals the identity of its carrier. We formalize the mathematical framework for such pool designs and list the requirements from such designs. We specifically address three practical concerns for pooled resequencing designs: (1) false-positives due to errors introduced during amplification and sequencing; (2) false-negatives due to undersampling particular alleles aggravated by nonuniform coverage; and consequently, (3) ambiguous identification of individual carriers in the presence of errors. We build on theory of error-correcting codes to design pools that overcome these pitfalls. We show that in practical parameters of resequencing studies, our designs guarantee high probability of unambiguous singleton carrier identification while maintaining the features of naïve pools in terms of sensitivity, specificity, and the ability to estimate allele frequencies. We demonstrate the ability of our designs in extracting rare variations using short read data from the 1000 Genomes Pilot 3 project.
Affiliation(s)
- Snehit Prabhu
- Department of Computer Science, Columbia University, New York, New York 10025, USA.
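The abstract for entry 20 describes overlapping pools in which the set of pools showing a variant reveals the identity of its carrier. The toy sketch below illustrates only that core idea with a plain binary signature per individual. It is an assumption-laden simplification: it has no error tolerance, whereas the abstract states that the authors build on error-correcting codes, and the function names `build_pools` and `identify_carrier` are hypothetical.

```python
from typing import Iterable, List, Optional, Set


def build_pools(n_individuals: int, n_pools: int) -> List[Set[int]]:
    """Assign individual i the non-zero binary signature i + 1 and place
    it in every pool whose bit is set in that signature."""
    if n_individuals > 2 ** n_pools - 1:
        raise ValueError("not enough pools for unique non-zero signatures")
    pools: List[Set[int]] = [set() for _ in range(n_pools)]
    for individual in range(n_individuals):
        signature = individual + 1
        for pool_index in range(n_pools):
            if (signature >> pool_index) & 1:
                pools[pool_index].add(individual)
    return pools


def identify_carrier(positive_pools: Iterable[int], n_individuals: int) -> Optional[int]:
    """Decode the set of pools where a singleton variant was observed.
    Returns None when the pattern matches no individual."""
    signature = sum(1 << pool_index for pool_index in set(positive_pools))
    individual = signature - 1
    return individual if 0 <= individual < n_individuals else None


if __name__ == "__main__":
    pools = build_pools(n_individuals=10, n_pools=4)
    # Simulate a variant carried only by individual 6.
    observed = {j for j, members in enumerate(pools) if 6 in members}
    print(sorted(observed), identify_carrier(observed, 10))
```

With this naive encoding a single false-positive or false-negative pool changes the decoded signature, which is the kind of ambiguity the abstract says the error-correcting designs are meant to overcome.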
|
21
|
Applications of Linkage Disequilibrium and Association Mapping in Maize. Molecular Genetic Approaches to Maize Improvement 2008. [DOI: 10.1007/978-3-540-68922-5_13] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.3] [Indexed: 12/24/2022]
|
22
|
Abstract
The analysis of genome wide variation offers the possibility of unravelling the genes involved in the pathogenesis of disease. Genome wide association studies are also particularly useful for identifying and validating targets for therapeutic intervention as well as for detecting markers for drug efficacy and side effects. The cost of such large-scale genetic association studies may be reduced substantially by the analysis of pooled DNA from multiple individuals. However, experimental errors inherent in pooling studies lead to a potential increase in the false positive rate and a loss in power compared to individual genotyping. Here we quantify various sources of experimental error using empirical data from typical pooling experiments and corresponding individual genotyping counts using two statistical methods. We provide analytical formulas for calculating these different errors in the absence of complete information, such as replicate pool formation, and for adjusting for the errors in the statistical analysis. We demonstrate that DNA pooling has the potential of estimating allele frequencies accurately, and adjusting the pooled allele frequency estimates for differential allelic amplification considerably improves accuracy. Estimates of the components of error show that differential allelic amplification is the most important contributor to the error variance in absolute allele frequency estimation, followed by allele frequency measurement and pool formation errors. Our results emphasise the importance of minimising experimental errors and obtaining correct error estimates in genetic association studies.
Affiliation(s)
- A Jawaid
- Research & Development Genetics, AstraZeneca Pharmaceuticals, Macclesfield Cheshire SK104TG, UK.
|
23
|
Chen HH, Jou YS, Lee WJ, Pan WH. Applying polynomial standard curve method to correct bias encountered in estimating allele frequencies using DNA pooling strategy. Genomics 2008; 92:429-35. [PMID: 18793711 DOI: 10.1016/j.ygeno.2008.08.010] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 03/17/2008] [Revised: 08/15/2008] [Accepted: 08/18/2008] [Indexed: 11/25/2022]
Abstract
The DNA pooling approach is a cost-saving strategy that is crucial for multiple-SNP association studies, particularly for laboratories with limited budgets. However, biased allele frequency estimates cannot be completely eliminated by kappa correction. Using SNaPshot™, we systematically examined the relations between actual minor allele frequency (AMiAF) levels and estimates obtained from the pooling process for all six types of SNPs. We applied the polynomial standard curve method (PSCM) to produce allele frequency estimates in pooled DNA samples and compared it with the kappa method. The results showed that estimates derived from the PSCM were in general closer to AMiAFs than those from the kappa method, particularly for C/G and G/T polymorphisms with AMiAFs in the range of 20-40%. We demonstrated that applying PSCM on the SNaPshot™ platform is suitable for multiple-SNP association studies using a pooling strategy, owing to its cost effectiveness and estimation accuracy.
Affiliation(s)
- Hsin-Hung Chen
- Institute of Biomedical Sciences, Academia Sinica, Taiwan, ROC
|
24
|
Yang HC, Huang MC, Li LH, Lin CH, Yu ALT, Diccianni MB, Wu JY, Chen YT, Fann CSJ. MPDA: microarray pooled DNA analyzer. BMC Bioinformatics 2008; 9:196. [PMID: 18412951 PMCID: PMC2387178 DOI: 10.1186/1471-2105-9-196] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 04/19/2007] [Accepted: 04/15/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Microarray-based pooled DNA experiments that combine the merits of DNA pooling and gene chip technology constitute a pivotal advance in biotechnology. This new technique uses pooled DNA, thereby reducing costs associated with the typing of DNA from numerous individuals. Moreover, use of an oligonucleotide gene chip reduces costs related to processing various DNA segments (e.g., primers, reagents). Thus, the technique provides an overall cost-effective solution for large-scale genomic/genetic research. However, few publicly shared tools are available to systematically analyze the rapidly accumulating volume of whole-genome pooled DNA data. RESULTS We propose a generalized concept of pooled DNA and present a user-friendly tool named Microarray Pooled DNA Analyzer (MPDA) that we developed to analyze hybridization intensity data from microarray-based pooled DNA experiments. MPDA enables whole-genome DNA preferential amplification/hybridization analysis, allele frequency estimation, association mapping, allelic imbalance detection, and permits integration with shared data resources online. Graphic and numerical outputs from MPDA support global and detailed inspection of large amounts of genomic data. Four whole-genome data analyses are used to illustrate the major functionalities of MPDA. The first analysis shows that MPDA can characterize genomic patterns of preferential amplification/hybridization and provide calibration information for pooled DNA data analysis. The second analysis demonstrates that MPDA can accurately estimate allele frequencies. The third analysis indicates that MPDA is cost-effective and reliable for association mapping. The final analysis shows that MPDA can identify regions of chromosomal aberration in cancer without paired-normal tissue. CONCLUSION MPDA, the software that integrates pooled DNA association analysis and allelic imbalance analysis, provides a convenient analysis system for extensive whole-genome pooled DNA data analysis. 
The software, user manual and illustrated examples are freely available online at the MPDA website listed in the Availability and requirements section.
Affiliation(s)
- Hsin-Chou Yang
- Institute of Statistical Science, Academia Sinica, Taipei, Taiwan.
|
25
|
Abstract
The genetic dissection of complex disorders via genetic marker data has gained popularity in the postgenome era. Methods for typing genetic markers on human chromosomes continue to improve. Compared with the popular individual genotyping experiment, a pooled-DNA experiment (alleotyping experiment) is more cost effective when carrying out genetic typing. This chapter provides an overview of association mapping using pooled DNA and describes a five-stage study design including the preliminary calibration of peak intensities, estimation of allele frequency, single-locus association mapping, multilocus association mapping, and a confirmation study. Software and an analysis of authentic data are presented. The strengths and weaknesses of pooled-DNA analyses, as well as possible future applications for this method, are discussed.
Affiliation(s)
- Hsin-Chou Yang
- Institute of Biomedical Sciences, Academia Sinica, Nankang, Taipei, Taiwan
|
26
|
Abstract
In the past, to study Mendelian diseases, segregating families have been carefully ascertained for segregation analysis, followed by collecting extended multiplex families for linkage analysis. This would then be followed by association studies, using independent case-control samples and/or additional family data. Recently, for complex diseases, the initial sampling has been for a genome-wide linkage analysis, often using independent sib-pairs or nuclear families, to identify candidate regions for follow-up with association studies, again using case-control samples and/or additional family data. We now have the ability to conduct genome-wide association studies using 100,000-500,000 diallelic genetic markers. For such studies we focus especially on efficient two-stage association sampling designs, which can retain nearly optimal statistical power at about half the genotyping cost. Similarly, beginning an association study by genotyping pooled samples may also be a viable option if the cost of accurately pooling DNA samples outweighs genotyping costs. Finally, we note that the sampling of family data for linkage analysis is not a practice that should be automatically discontinued.
Affiliation(s)
- Robert C Elston
- Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, Ohio 44106, USA.
|
27
|
Wang D, Sun F. Sample sizes for the transmission disequilibrium tests: tdt, s-tdt and 1-tdt. COMMUN STAT-THEOR M 2007. [DOI: 10.1080/03610920008832535] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 10/23/2022]
|
28
|
Xavier GM, de Sá AR, Guimarães ALS, da Silva TA, Gomez RS. Investigation of functional gene polymorphisms interleukin-1β, interleukin-6, interleukin-10 and tumor necrosis factor in individuals with oral lichen planus. J Oral Pathol Med 2007; 36:476-81. [PMID: 17686006 DOI: 10.1111/j.1600-0714.2007.00560.x] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.8] [Indexed: 01/24/2023]
Abstract
Oral lichen planus (OLP) is a chronic inflammatory oral mucosal disease. Some studies in the literature demonstrate an association between cytokine gene polymorphisms and susceptibility to some immune-mediated conditions. The purpose of this study was to investigate cytokine gene polymorphisms in a sample of Brazilian patients with OLP. Fifty-three patients with OLP (mean age = 43.1 years; range 20-68 years) and 53 healthy volunteers (mean age = 42.9 years; range 21-67 years) were genotyped for IL-1beta +3954 (C/T), IL-6 -174 (G/C), IL-10 -1082 (G/A) and TNFA -308 (G/A) gene polymorphisms. Statistical analysis was based on logistic regression (P-values below 0.05 were considered significant). IL-6 and TNFA homozygous genotypes were detected significantly more often in OLP patients. These genotypes were associated with an increased risk of OLP development (OR 6.89 and 13.04, respectively). IL-1beta and IL-10 gene polymorphisms were not related to OLP development. Our findings clearly demonstrate an association between inheritance of IL-6 and TNFA gene polymorphisms and OLP occurrence, thus giving additional support for a genetic basis of this disease.
Affiliation(s)
- Guilherme Machado Xavier
- Department of Oral Surgery and Pathology, School of Dentistry, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
|
29
|
Yatsu K, Mizuki N, Hirawa N, Oka A, Itoh N, Yamane T, Ogawa M, Shiwa T, Tabara Y, Ohno S, Soma M, Hata A, Nakao K, Ueshima H, Ogihara T, Tomoike H, Miki T, Kimura A, Mano S, Kulski JK, Umemura S, Inoko H. High-resolution mapping for essential hypertension using microsatellite markers. Hypertension 2007; 49:446-52. [PMID: 17242298 DOI: 10.1161/01.hyp.0000257256.77680.02] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 11/16/2022]
Abstract
During the past decade, considerable efforts and resources have been devoted to elucidating the multiple genetic and environmental determinants responsible for hypertension and its associated cardiovascular diseases. The success of positional cloning, fine mapping, and linkage analysis based on whole-genome screening, however, has been limited in identifying multiple genetic determinants affecting diseases, suggesting that new research strategies for genome-wide typing may be helpful. Disease association (case-control) studies using microsatellite markers, distributed every 150 kb across the human genome, may have some advantages over linkage, candidate, and single nucleotide polymorphism typing methods in terms of statistical power and linkage disequilibrium for finding genomic regions harboring candidate disease genes, although it is not proven. We have carried out genome-wide mapping using 18,977 microsatellite markers in a Japanese population composed of 385 hypertensive patients and 385 normotensive control subjects. Pooled sample analysis was conducted in a 3-stage genomic screen of 3 independent case-control populations, and 54 markers were extracted from the original 18,977 microsatellite markers. As a final step, each single positive marker was confirmed by individual typing, and only 19 markers passed this test. We identified 19 allelic loci that were significantly different between the cases of essential hypertension and the controls.
Affiliation(s)
- Keisuke Yatsu
- Department of Medical Science and Cardiorenal Medicine, Yokohama City University School of Medicine, Yokohama, Japan
|
30
|
Yang HC, Liang YJ, Huang MC, Li LH, Lin CH, Wu JY, Chen YT, Fann C. A genome-wide study of preferential amplification/hybridization in microarray-based pooled DNA experiments. Nucleic Acids Res 2006; 34:e106. [PMID: 16931491 PMCID: PMC1616968 DOI: 10.1093/nar/gkl446] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 03/29/2006] [Revised: 05/05/2006] [Accepted: 06/09/2006] [Indexed: 01/27/2023] Open
Abstract
Microarray-based pooled DNA methods overcome the cost bottleneck of simultaneously genotyping more than 100 000 markers for numerous study individuals. The success of such methods relies on the proper adjustment of preferential amplification/hybridization to ensure accurate and reliable allele frequency estimation. We performed a hybridization-based genome-wide single nucleotide polymorphisms (SNPs) genotyping analysis to dissect preferential amplification/hybridization. The majority of SNPs had less than 2-fold signal amplification or suppression, and the lognormal distributions adequately modeled preferential amplification/hybridization across the human genome. Comparative analyses suggested that the distributions of preferential amplification/hybridization differed among genotypes and the GC content. Patterns among different ethnic populations were similar; nevertheless, there were striking differences for a small proportion of SNPs, and a slight ethnic heterogeneity was observed. To fulfill appropriate and gratuitous adjustments, databases of preferential amplification/hybridization for African Americans, Caucasians and Asians were constructed based on the Affymetrix GeneChip Human Mapping 100 K Set. The robustness of allele frequency estimation using this database was validated by a pooled DNA experiment. This study provides a genome-wide investigation of preferential amplification/hybridization and suggests guidance for the reliable use of the database. Our results constitute an objective foundation for theoretical development of preferential amplification/hybridization and provide important information for future pooled DNA analyses.
Affiliation(s)
- H.-C. Yang
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
- Y.-J. Liang
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
- M.-C. Huang
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
- L.-H. Li
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
- C.-H. Lin
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
- J.-Y. Wu
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
- Y.-T. Chen
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
- C.S.J. Fann
- Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
|
31
|
Chowdari KV, Northup A, Pless L, Wood J, Joo YH, Mirnics K, Lewis DA, Levitt PR, Bacanu SA, Nimgaonkar VL. DNA pooling: a comprehensive, multi-stage association analysis of ACSL6 and SIRT5 polymorphisms in schizophrenia. Genes Brain and Behavior 2006; 6:229-39. [PMID: 16827919 DOI: 10.1111/j.1601-183x.2006.00251.x] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/30/2022]
Abstract
Many candidate gene association studies have evaluated incomplete, unrepresentative sets of single nucleotide polymorphisms (SNPs), producing non-significant results that are difficult to interpret. Using a rapid, efficient strategy designed to investigate all common SNPs, we tested associations between schizophrenia and two positional candidate genes: ACSL6 (Acyl-Coenzyme A synthetase long-chain family member 6) and SIRT5 (silent mating type information regulation 2 homologue 5). We initially evaluated the utility of DNA sequencing traces to estimate SNP allele frequencies in pooled DNA samples. The mean variances for the DNA sequencing estimates were acceptable and were comparable to other published methods (mean variance: 0.0008, range 0-0.0119). Using pooled DNA samples from cases with schizophrenia/schizoaffective disorder (Diagnostic and Statistical Manual of Mental Disorders edition IV criteria) and controls (n=200, each group), we next sequenced all exons, introns and flanking upstream/downstream sequences for ACSL6 and SIRT5. Among 69 identified SNPs, case-control allele frequency comparisons revealed nine suggestive associations (P<0.2). Each of these SNPs was next genotyped in the individual samples composing the pools. A suggestive association with rs 11743803 at ACSL6 remained (allele-wise P=0.02), with diminished evidence in an extended sample (448 cases, 554 controls, P=0.062). In conclusion, we propose a multi-stage method for comprehensive, rapid, efficient and economical genetic association analysis that enables simultaneous SNP detection and allele frequency estimation in large samples. This strategy may be particularly useful for research groups lacking access to high throughput genotyping facilities. Our analyses did not yield convincing evidence for associations of schizophrenia with ACSL6 or SIRT5.
Affiliation(s)
- K V Chowdari
- Department of Psychiatry, University of Pittsburgh, Pittsburgh, PA 15213, USA
|
32
|
Yu A, Geng H, Zhou X. Quantify single nucleotide polymorphism (SNP) ratio in pooled DNA based on normalized fluorescence real-time PCR. BMC Genomics 2006; 7:143. [PMID: 16764712 PMCID: PMC1552069 DOI: 10.1186/1471-2164-7-143] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 10/27/2005] [Accepted: 06/09/2006] [Indexed: 12/02/2022] Open
Abstract
Background Conventional real-time PCR to quantify the allele ratio in pooled DNA mainly depends on PCR amplification efficiency determination and Ct value, which is defined as the PCR cycle number at which the fluorescence emission exceeds the fixed threshold. Because of the nature of exponential calculation, slight errors are multiplied and the variations of the results seem too large. We have developed a new PCR data point analysis strategy for allele ratio quantification based on normalized fluorescence ratio. Results In our method, initial reaction background fluorescence was determined based upon fitting of raw fluorescence data to four-parametric sigmoid function. After that, each fluorescence data point was first subtracted by respective background fluorescence and then each subtracted fluorescence data point was divided by the specific background fluorescence to get normalized fluorescence. By relating the normalized fluorescence ratio to the premixed known allele ratio of two alleles in standard samples, standard linear regression equation was generated, from which unknown specimens allele ratios were extrapolated using the measured normalized fluorescence ratio. In this article, we have compared the results of the proposed method with those of baseline subtracted fluorescence ratio method and conventional Ct method. Conclusion Results demonstrated that the proposed method could improve the reliability, precision, and repeatability for quantifying allele ratios. At the same time, it has the potential of fully automatic allelic ratio quantification.
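The normalization and calibration steps in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the background fluorescence is taken as a given input (the abstract obtains it from a four-parameter sigmoid fit, which is omitted here), the regression direction (known allele ratio on normalized fluorescence ratio) is an assumption, and all function names and numbers are hypothetical.

```python
def normalized_fluorescence(raw, background):
    """Subtract the background, then divide by it: (F - F0) / F0."""
    return (raw - background) / background


def normalized_ratio(raw_a, bg_a, raw_b, bg_b):
    """Ratio of the normalized fluorescence of allele A to that of allele B."""
    return normalized_fluorescence(raw_a, bg_a) / normalized_fluorescence(raw_b, bg_b)


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical standards: measured normalized fluorescence ratios for
# premixed samples with known allele ratios.
standard_signal = [0.26, 0.49, 1.02, 1.98, 4.05]
standard_allele_ratio = [0.25, 0.5, 1.0, 2.0, 4.0]
slope, intercept = fit_line(standard_signal, standard_allele_ratio)

# Hypothetical unknown specimen: raw and background fluorescence per allele.
signal = normalized_ratio(raw_a=250.0, bg_a=100.0, raw_b=200.0, bg_b=100.0)
print(round(slope * signal + intercept, 2))
```

Because the calibration is linear in the fluorescence ratio rather than exponential in a cycle threshold, a small measurement error shifts the estimated allele ratio proportionally instead of being amplified, which is the motivation the abstract gives for avoiding the Ct calculation.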
Affiliation(s)
- Airong Yu
- Department of Biology, Huaiyin Teachers College, 71 Jiao tong Road, Huai'an, Jiangsu Province, 223001, P.R. China
- Haifeng Geng
- Center of Marine Biotechnology, University of Maryland Biotechnology Institute, MD, 21202, USA
- Xuerui Zhou
- Department of Biology, Huaiyin Teachers College, 71 Jiao tong Road, Huai'an, Jiangsu Province, 223001, P.R. China
|
33
|
Gasbarra D, Sillanpää MJ. Constructing the parental linkage phase and the genetic map over distances <1 cM using pooled haploid DNA. Genetics 2005; 172:1325-35. [PMID: 16301209 PMCID: PMC1456229 DOI: 10.1534/genetics.105.044271] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 12/27/2022] Open
Abstract
A new statistical approach for construction of the genetic linkage map and estimation of the parental linkage phase based on allele frequency data from pooled gametic (sperm or egg) samples is introduced. This method can be applied for estimation of recombination fractions (over distances <1 cM) and ordering of large numbers (even hundreds) of closely linked markers. This method should be extremely useful in species with a long generation interval and a large genome size such as in dairy cattle or in forest trees; the conifer species have haploid tissues available in megagametophytes. According to Mendelian expectation, two parental alleles should occur in gametes in 1:1 proportions, if segregation distortion does not occur. However, due to mere sampling variation, the observed proportions may deviate from their expected value in practice. These deviations and their dependence along the chromosome can provide information on the parental linkage phase and on the genetic linkage map. Usefulness of the method is illustrated with simulations. The role of segregation distortion as a source of these deviations is also discussed. The software implementing this method is freely available for research purposes from the authors.
|
34
|
Silva LK, Liu S, Blanton RE. Microsatellite analysis of pooled Schistosoma mansoni DNA: an approach for studies of parasite populations. Parasitology 2005; 132:331-8. [PMID: 16255835 DOI: 10.1017/s0031182005009066] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.4] [Received: 04/01/2005] [Revised: 06/18/2005] [Accepted: 08/25/2005] [Indexed: 02/07/2023]
Abstract
Human parasites are often distributed in metapopulations, which makes random sampling for genetic epidemiology difficult. The typical approach to sampling Schistosoma mansoni involves laboratory passage to obtain individual worms with small sample size and selection bias as a consequence. By contrast, the naturally pooled samples from egg output in stool or urine directly represent the genetic composition of current populations. To test whether pooled samples could be used to estimate population allele frequencies, DNA from individual cloned parasites was pooled and amplified by PCR for 7 microsatellites. By polyacrylamide gel analysis, the relative band intensities of the products from the major alleles in the pooled samples differed by 0-6% from the summed intensities of the individual clones (mean = 2.1%+/-2.1% S.D.). The number of PCR cycles (25-40) did not influence the accuracy of the estimate. Varying the frequency of 1 allele in pooled samples from 32 to 69% likewise did not affect accuracy. Allele frequency estimates from aggregate samples such as eggs will be a better foundation for studies of parasite population dynamics as well as the basis for large-scale association studies of host and parasite characteristics.
Affiliation(s)
- L K Silva
- Center for Global Health and Diseases, 2103 Cornell Road, Case University, Cleveland, OH 44106-7286, USA
|
35
|
Abel K, Reneland R, Kammerer S, Mah S, Hoyal C, Cantor CR, Nelson MR, Braun A. Genome-wide SNP association: identification of susceptibility alleles for osteoarthritis. Autoimmun Rev 2005; 5:258-63. [PMID: 16697966 DOI: 10.1016/j.autrev.2005.07.005] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.6] [Indexed: 11/20/2022]
Abstract
The successful identification of genes involved in common human disorders is dependent upon availability of informative sample sets, validated marker panels, a high-throughput scoring technology, and a strategy for combining these resources. We have developed a universal platform based on mass spectrometry (MassARRAY) for analyzing nucleic acids with high precision and accuracy. To fuel this technology we have generated more than 100,000 validated assays for single nucleotide polymorphisms (SNPs) covering virtually all known and predicted human genes, and a large DNA sample bank from more than 50,000 consented diseased (case) and healthy (control) individuals. Taking advantage of MassARRAY's capability for quantitative analysis of nucleic acids, allele frequencies are estimated in sample pools containing large numbers of individual DNAs. Comparing frequencies between case and control pools as a first-pass filtering step is a tremendous advantage in throughput and cost over individual genotyping. We have employed this approach in numerous genome-wide association studies to identify genes implicated in common complex diseases, including osteoarthritis (OA). Access to additional patient samples through collaborations allows us to conduct replication studies that validate true disease genes. These discoveries will expand our understanding of genetic disease predisposition, and our capabilities for early diagnosis and improved therapeutic approaches.
Affiliation(s)
- Kenneth Abel
- SEQUENOM, Inc., 3595 John Hopkins Court, San Diego, CA, USA 92121
|
36
|
Tamiya G, Shinya M, Imanishi T, Ikuta T, Makino S, Okamoto K, Furugaki K, Matsumoto T, Mano S, Ando S, Nozaki Y, Yukawa W, Nakashige R, Yamaguchi D, Ishibashi H, Yonekura M, Nakami Y, Takayama S, Endo T, Saruwatari T, Yagura M, Yoshikawa Y, Fujimoto K, Oka A, Chiku S, Linsen SEV, Giphart MJ, Kulski JK, Fukazawa T, Hashimoto H, Kimura M, Hoshina Y, Suzuki Y, Hotta T, Mochida J, Minezaki T, Komai K, Shiozawa S, Taniguchi A, Yamanaka H, Kamatani N, Gojobori T, Bahram S, Inoko H. Whole genome association study of rheumatoid arthritis using 27 039 microsatellites. Hum Mol Genet 2005; 14:2305-21. [PMID: 16000323 DOI: 10.1093/hmg/ddi234] [Citation(s) in RCA: 95] [Impact Index Per Article: 5.0] [Indexed: 01/06/2023] Open
Abstract
A major goal of current human genome-wide studies is to identify the genetic basis of complex disorders. However, the availability of an unbiased, reliable, cost efficient and comprehensive methodology to analyze the entire genome for complex disease association is still largely lacking or problematic. Therefore, we have developed a practical and efficient strategy for whole genome association studies of complex diseases by charting the human genome at 100 kb intervals using a collection of 27,039 microsatellites and the DNA pooling method in three successive genomic screens of independent case-control populations. The final step in our methodology consists of fine mapping of the candidate susceptible DNA regions by single nucleotide polymorphisms (SNPs) analysis. This approach was validated upon application to rheumatoid arthritis, a destructive joint disease affecting up to 1% of the population. A total of 47 candidate regions were identified. The top seven loci, withstanding the most stringent statistical tests, were dissected down to individual genes and/or SNPs on four chromosomes, including the previously known 6p21.3-encoded Major Histocompatibility Complex gene, HLA-DRB1. Hence, microsatellite-based genome-wide association analysis complemented by end stage SNP typing provides a new tool for genetic dissection of multifactorial pathologies including common diseases.
Affiliation(s)
- Gen Tamiya
- Department of Molecular Life Science, Course of Basic Medical Science and Molecular Medicine, Tokai University School of Medicine, Bohseidai, Kanagawa, Japan
|
37
|
Abstract
The genetic dissection of complex human diseases requires large-scale association studies which explore the population associations between genetic variants and disease phenotypes. DNA pooling can substantially reduce the cost of genotyping assays in these studies, and thus enables one to examine a large number of genetic variants on a large number of subjects. The availability of pooled genotype data instead of individual data poses considerable challenges in the statistical inference, especially in the haplotype-based analysis because of increased phase uncertainty. Here we present a general likelihood-based approach to making inferences about haplotype-disease associations based on possibly pooled DNA data. We consider cohort and case-control studies of unrelated subjects, and allow arbitrary and unequal pool sizes. The phenotype can be discrete or continuous, univariate or multivariate. The effects of haplotypes on disease phenotypes are formulated through flexible regression models, which allow a variety of genetic hypotheses and gene-environment interactions. We construct appropriate likelihood functions for various designs and phenotypes, accommodating Hardy-Weinberg disequilibrium. The corresponding maximum likelihood estimators are approximately unbiased, normally distributed, and statistically efficient. We develop simple and efficient numerical algorithms for calculating the maximum likelihood estimators and their variances, and implement these algorithms in a freely available computer program. We assess the performance of the proposed methods through simulation studies, and provide an application to the Finland-United States Investigation of NIDDM Genetics Study. The results show that DNA pooling is highly efficient in studying haplotype-disease associations. As a by-product, this work provides valid and efficient methods for estimating haplotype-disease associations with unpooled DNA samples.
Affiliation(s)
- D Zeng
- Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina 27599-7420, USA
|
38
|
Schnack HG, Bakker SC, van 't Slot R, Groot BM, Sinke RJ, Kahn RS, Pearson PL. Accurate determination of microsatellite allele frequencies in pooled DNA samples. Eur J Hum Genet 2005; 12:925-34. [PMID: 15305176 DOI: 10.1038/sj.ejhg.5201234] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 11/09/2022] Open
Abstract
Pooling of DNA samples instead of individual genotyping can speed up genetic association studies. However, for microsatellite markers, the electrophoretic pattern of DNA pools can be complex, and procedures for deriving allele frequencies are often confounded by PCR-induced stutter artefacts. We have developed a mathematical procedure to remove stutter noise and accurately determine allele frequencies in pools. A stutter correction model can be reliably derived from one standard 'training set' of the same 10 individual DNA samples for each marker, which can also include heterozygous patterns with partially overlapping peaks. Compared with earlier methods, this reduces the number of genotypes needed in the training set considerably, and allows standardization of analyses for different markers. Moreover, the use of a procedure that fits all data simultaneously makes the method less sensitive to aberrant data. The model was tested with 34 markers, 18 of which were newly defined from human sequence data. Allele frequencies derived from stutter-corrected DNA pool patterns were compared with the summed individual genotyping results of all the individuals in the pools (n = 109 and n = 64). We show that the model is robust and accurately extracts allele frequencies from pooled DNA samples for 32 of the 34 microsatellite markers tested. Finally, we performed a case-control study in celiac disease and found that weakly associated disease alleles, identified by individual genotyping, were only detectable in pools after stutter correction. This efficient method for correcting stutter artefacts in microsatellite markers enables large-scale genetic association studies using DNA pools to be performed.
Affiliation(s)
- Hugo G Schnack
- Department of Psychiatry, University Medical Center Utrecht, The Netherlands.
|
39
|
Abstract
Disappointments in replicating initial findings in gene mapping for complex traits are often attributed to small sample sizes and inadequate techniques to determine the threshold value. This is clearly not the whole truth. More fundamental reasons lie in the inherent heterogeneity related to disease, including genetic heterogeneity, differences in allele frequencies, and context-dependency in genetic architecture. There are also other reasons related to the data collection and analysis. Replication may remain a source of frustration unless more emphasis is put on controlling these sources of heterogeneity between studies.
Affiliation(s)
- M J Sillanpää
- Rolf Nevanlinna Institute, Department of Mathematics and Statistics, P.O. Box 68, FIN-00014 University of Helsinki, Finland.
|
40
|
Satagopan JM, Venkatraman ES, Begg CB. Two-stage designs for gene-disease association studies with sample size constraints. Biometrics 2005; 60:589-97. [PMID: 15339280 PMCID: PMC8985053 DOI: 10.1111/j.0006-341x.2004.00207.x] [Citation(s) in RCA: 106] [Impact Index Per Article: 5.6] [Indexed: 11/26/2022]
Abstract
Gene-disease association studies based on case-control designs may often be used to identify candidate polymorphisms (markers) conferring disease risk. If a large number of markers are studied, genotyping all markers on all samples is inefficient in resource utilization. Here, we propose an alternative two-stage method to identify disease-susceptibility markers. In the first stage all markers are evaluated on a fraction of the available subjects. The most promising markers are then evaluated on the remaining individuals in Stage 2. This approach can be cost effective since markers unlikely to be associated with the disease can be eliminated in the first stage. Using simulations we show that, when the markers are independent and when they are correlated, the two-stage approach provides a substantial reduction in the total number of marker evaluations for a minimal loss of power. The power of the two-stage approach is evaluated when a single marker is associated with the disease, and in the presence of multiple disease-susceptibility markers. As a general guideline, the simulations over a wide range of parametric configurations indicate that evaluating all the markers on 50% of the individuals in Stage 1 and evaluating the most promising 10% of the markers on the remaining individuals in Stage 2 provides near-optimal power while resulting in a 45% decrease in the total number of marker evaluations.
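The 45% figure in the guideline above follows from simple counting, which the sketch below works through. The function name and the study size (1,000 markers, 1,000 subjects) are hypothetical; only the 50% and 10% fractions come from the abstract.

```python
def two_stage_evaluations(n_markers, n_subjects, stage1_fraction, promising_fraction):
    """Count marker evaluations (marker x subject genotypes) in a two-stage design."""
    # Stage 1: every marker on a fraction of the subjects.
    stage1 = n_markers * n_subjects * stage1_fraction
    # Stage 2: only the promising markers, on the remaining subjects.
    stage2 = (n_markers * promising_fraction) * (n_subjects * (1 - stage1_fraction))
    return stage1 + stage2


n_markers, n_subjects = 1000, 1000  # hypothetical study size
one_stage = n_markers * n_subjects  # every marker on every subject
two_stage = two_stage_evaluations(n_markers, n_subjects, 0.5, 0.1)
reduction = 1 - two_stage / one_stage
print(f"{reduction:.0%} fewer marker evaluations")
```

In general the two-stage design uses a share f + p(1 - f) of the one-stage evaluations, where f is the stage 1 subject fraction and p the fraction of markers carried forward; with f = 0.5 and p = 0.1 this is 0.55, independent of the study size.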
|
41
|
Castle PE, Schiffman M, Herrero R, Hildesheim A, Rodriguez AC, Bratti MC, Wacholder S, Kendal H, Breheny AM, Prior A, Pfeiffer R, Burk RD. PCR Testing of Pooled Longitudinally Collected Cervical Specimens of Women to Increase the Efficiency of Studying Human Papillomavirus Infection. Cancer Epidemiol Biomarkers Prev 2005. [DOI: 10.1158/1055-9965.256.14.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] Open
Abstract
In large active cohort studies of women investigating human papillomavirus (HPV) and cervical neoplasia, many women will be HPV-negative at all time points and testing of all their cervical specimens is an inefficient use of laboratory resources. The aim of this pilot study was to evaluate whether pooling cervical specimens from the same woman might provide a useful pretest of specimens from women unlikely to have high-grade cervical neoplasia or significant HPV exposure. We selected women (n = 187) participating in the Guanacaste Project for whom we already had HPV testing data on all their specimens from multiple visits (median = 8 visits), who were HPV DNA-negative at enrollment and at their 5- to 7-year exit from the cohort, and had no evidence of high-grade cervical neoplasia. Equal aliquots of cervical specimens from these women were pooled to create a proportional pooled specimen. Aliquots of pooled specimens were tested in a masked fashion by MY09/11 L1 consensus primer PCR. Second aliquots of some pooled specimens (n = 83) were included to assess the reliability of pooled testing. Results were compared with the predicted (expected) results based on the obtained test results of the individual specimens collected at interim visits. There was good overall agreement between observed and expected HPV DNA positivity, with a κ of 0.63 [95% confidence interval (95% CI), 0.51-0.75] and a percent agreement of 83.4% (95% CI, 77.3-88.5%) although the HPV DNA positivity in the pooled specimen was less than expected (P = 0.001). The agreement between observed and expected HPV DNA positivity was related to the number of aliquots pooled, suggesting that positivity was related to viral genome concentrations. The κ and percent agreement for intra-batch reliability of testing pooled specimens were 0.68 (95% CI, 0.53-0.84) and 84.3% (95% CI, 74.7-91.4%), respectively. 
We conclude that pooling specimens and testing by PCR may be useful for discriminating HPV DNA-positive from completely negative specimen sets in women who are likely to have been HPV DNA-negative.
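The agreement statistics reported in this abstract (percent agreement and κ) can be computed from a 2x2 table of observed versus expected positivity. The sketch below uses the standard Cohen's κ formula; the function name and the counts are hypothetical and are not the study's data.

```python
def agreement_stats(both_pos, obs_pos_only, exp_pos_only, both_neg):
    """Percent agreement and Cohen's kappa for a 2x2 observed-vs-expected table."""
    n = both_pos + obs_pos_only + exp_pos_only + both_neg
    observed_agreement = (both_pos + both_neg) / n
    # Chance agreement from the marginal totals of each rating.
    obs_pos = both_pos + obs_pos_only
    exp_pos = both_pos + exp_pos_only
    chance_agreement = (obs_pos * exp_pos + (n - obs_pos) * (n - exp_pos)) / n**2
    kappa = (observed_agreement - chance_agreement) / (1 - chance_agreement)
    return observed_agreement, kappa


# Hypothetical counts, chosen only to illustrate the calculation.
agreement, kappa = agreement_stats(both_pos=40, obs_pos_only=10, exp_pos_only=5, both_neg=45)
print(f"agreement = {agreement:.1%}, kappa = {kappa:.2f}")
```

κ is lower than raw agreement because it discounts the agreement expected by chance, which is why the abstract reports both numbers.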
Affiliation(s)
- Philip E. Castle
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, Department of Health and Human Services, Bethesda, Maryland
- Mark Schiffman
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, Department of Health and Human Services, Bethesda, Maryland
- Rolando Herrero
- Proyecto Epidemiologico Guanacaste, San Jose, Costa Rica
- Allan Hildesheim
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, Department of Health and Human Services, Bethesda, Maryland
- Andrew Prior
- Albert Einstein College of Medicine, Bronx, New York
- Ruth Pfeiffer
- Proyecto Epidemiologico Guanacaste, San Jose, Costa Rica
|
42
|
Jawaid A, Sham PC, Makoff AJ, Asherson PJ. Is haplotype tagging the panacea to association mapping studies? Eur J Hum Genet 2004; 12:259-62. [PMID: 14735159 DOI: 10.1038/sj.ejhg.5201146] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/09/2022] Open
Affiliation(s)
- Ansar Jawaid
- MRC Social Genetic Developmental Psychiatry Research Centre (SGDP), Institute of Psychiatry, Kings College London, UK
|
43
|
Kwok PY, Xiao M. SNP genotyping and molecular haplotyping of DNA pools. Cold Spring Harb Symp Quant Biol 2004; 68:65-7. [PMID: 15338604 DOI: 10.1101/sqb.2003.68.65] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/24/2022]
Affiliation(s)
- P Y Kwok
- Department of Dermatology and Cardiovascular Research Institute, University of California, San Francisco, California 94114, USA
|
44
|
Santos M, Pinto-Basto J, Rio ME, Sá MJ, Valença A, Sá A, Dinis J, Figueiredo J, Bigotte de Almeida L, Coelho I, Sawcer S, Setakis E, Compston A, Sequeiros J, Maciel P. A whole genome screen for association with multiple sclerosis in Portuguese patients. J Neuroimmunol 2003; 143:112-5. [PMID: 14575926 DOI: 10.1016/j.jneuroim.2003.08.023] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 11/15/2022]
Abstract
Multiple sclerosis (MS) is common in Europe affecting up to 1:500 people. In an effort to identify genes influencing susceptibility to the disease, we have performed a population-based whole genome screen for association. In this study, 6000 microsatellite markers were typed in separately pooled DNA samples from MS patients (n=188) and matched controls (n=188). Interpretable data was obtained from 4661 of these markers. Refining analysis of the most promising markers identified 10 showing potential evidence for association.
Collapse
Affiliation(s)
- M Santos
- UnIGENe-IBMC, University of Porto, Portugal
|
45
|
Bielecki B, Mycko MP, Tronczyńska E, Bieniek M, Sawcer S, Setakis E, Benediktsson K, Compston A, Selmaj KW. A whole genome screen for association in Polish multiple sclerosis patients. J Neuroimmunol 2003; 143:107-11. [PMID: 14575925 DOI: 10.1016/j.jneuroim.2003.08.022] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Indexed: 12/16/2022]
Abstract
We have performed the first systematic search for MS susceptibility genes completed in the Polish population. This screen was performed using 6000 microsatellite markers typed in pooled DNA from cases (n=200), controls (n=200) and trio families (n=129). Five associated markers are identified, one (D6S2444) from the HLA region and four are from novel regions not previously associated with MS, 2p16 (D2S2153), 3p13 (D3S3568), 7p22 (D7S2521) and 15q26 (D15S649).
Collapse
Affiliation(s)
- Bartosz Bielecki
- Department of Neurology, Medical University of Lodz, 22 Kopcinskiego Street, 90-153 Lodz, Poland
|
46
|
Harbo HF, Datta P, Oturai A, Ryder LP, Sawcer S, Setakis E, Akesson E, Celius EG, Modin H, Sandberg-Wollheim M, Myhr KM, Andersen O, Hillert J, Sorensen PS, Svejgaard A, Compston A, Vartdal F, Spurkland A. Two genome-wide linkage disequilibrium screens in Scandinavian multiple sclerosis patients. J Neuroimmunol 2003; 143:101-6. [PMID: 14575924 DOI: 10.1016/j.jneuroim.2003.08.021] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.6] [Indexed: 11/17/2022]
Abstract
We report the first two genome-wide screens for linkage disequilibrium between putative multiple sclerosis (MS) susceptibility genes and genetic markers performed in the genetically homogenous Scandinavian population, using 6000 microsatellite markers and DNA pools of approximately 200 MS cases and 200 controls in each screen. Usable data were achieved from the same 3331 markers in both screens. Nine markers from eight genomic regions (1p33, 3q13, 6p21, 6q14, 7p22, 9p21, 9q21 and Xq22) were identified as potentially associated with MS in both screens.
Collapse
Affiliation(s)
- Hanne F Harbo
- Institute of Immunology, Rikshospitalet University Hospital, 0027 Oslo, Norway.
|
47
|
Satagopan JM, Elston RC. Optimal two-stage genotyping in population-based association studies. Genet Epidemiol 2003; 25:149-57. [PMID: 12916023 PMCID: PMC8978311 DOI: 10.1002/gepi.10260] [Citation(s) in RCA: 122] [Impact Index Per Article: 5.8] [Indexed: 11/11/2022]
Abstract
We propose a cost-effective two-stage approach to investigate gene-disease associations when testing a large number of candidate markers using a case-control design. Under this approach, all the markers are genotyped and tested at stage 1 using a subset of affected cases and unaffected controls, and the most promising markers are genotyped on the remaining individuals and tested using all the individuals at stage 2. The sample size at stage 1 is chosen such that the power to detect the true markers of association is 1-beta(1) at significance level alpha(1). The most promising markers are tested at significance level alpha(2) at stage 2. In contrast, a one-stage approach would evaluate and test all the markers on all the cases and controls to identify the markers significantly associated with the disease. The goal is to determine the two-stage parameters (alpha(1), beta(1), alpha(2)) that minimize the cost of the study such that the desired overall significance is alpha and the desired power is close to 1-beta, the power of the one-stage approach. We provide analytic formulae to estimate the two-stage parameters. The properties of the two-stage approach are evaluated under various parametric configurations and compared with those of the corresponding one-stage approach. The optimal two-stage procedure does not depend on the signal of the markers associated with the study. Further, when there is a large number of markers, the optimal procedure is not substantially influenced by the total number of markers associated with the disease. The results show that, compared to a one-stage approach, a two-stage procedure typically halves the cost of the study.
Collapse
Affiliation(s)
- Jaya M Satagopan
- Department of Epidemiology and Biostatistics, Memorial Sloan-Kettering Cancer Center, New York, New York 10021, USA.
|
48
|
Yang Y, Zhang J, Hoh J, Matsuda F, Xu P, Lathrop M, Ott J. Efficiency of single-nucleotide polymorphism haplotype estimation from pooled DNA. Proc Natl Acad Sci U S A 2003; 100:7225-30. [PMID: 12777616 PMCID: PMC165857 DOI: 10.1073/pnas.1237858100] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.5] [Received: 12/23/2002] [Accepted: 04/24/2003] [Indexed: 11/18/2022] Open
Abstract
The efficiency of single-nucleotide polymorphism haplotype analysis may be increased by DNA pooling, which can dramatically reduce the number of genotyping assays. We develop a method for obtaining maximum likelihood estimates of haplotype frequencies for different pool sizes, assess the accuracy of these estimates, and show that pooling DNA samples is efficient in estimating haplotype frequencies. Although pooling K individuals increases ambiguities, at least for small pool size K and small numbers of loci, the uncertainty of estimation increases
Affiliation(s)
- Yaning Yang
- Laboratory of Statistical Genetics, The Rockefeller University, New York, NY 10021, USA.
|
49
|
Henrich KO, Sander AC, Wolters V, Dauber J. Isolation and characterization of microsatellite loci in the ant Myrmica scabrinodis. Mol Ecol Notes 2003. [DOI: 10.1046/j.1471-8286.2003.00433.x] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.5] [Indexed: 11/20/2022]
|
50
|
Hoh J, Matsuda F, Peng X, Markovic D, Lathrop MG, Ott J. SNP haplotype tagging from DNA pools of two individuals. BMC Bioinformatics 2003; 4:14. [PMID: 12709267 PMCID: PMC156884 DOI: 10.1186/1471-2105-4-14] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Received: 11/23/2002] [Accepted: 04/22/2003] [Indexed: 12/02/2022] Open
Abstract
BACKGROUND DNA pooling is a technique to reduce genotyping effort while incurring only minor losses in accuracy of allele frequency estimates for single nucleotide polymorphism (SNP) markers. RESULTS We present an algorithm for reconstructing haplotypes (alleles for multiple SNPs on same chromosome) from pools of two individual DNAs, in which Hardy-Weinberg equilibrium conditions or other assumptions are not required. The program outputs, in addition to inferred haplotypes, a minimal number of haplotype-tagging SNPs that are identified after an exhaustive search procedure. CONCLUSION Our method and algorithms lead to a significant reduction in genotyping effort, for example, in case-control disease association studies while maintaining the possibility of reconstructing haplotypes under very general conditions.
Affiliation(s)
- Josephine Hoh
- Laboratory of Statistical Genetics, Rockefeller University, New York, NY 10021, USA
- Xu Peng
- Centre National de Génotypage, 91057 Evry, France
- Daniela Markovic
- Laboratory of Statistical Genetics, Rockefeller University, New York, NY 10021, USA
- Jurg Ott
- Laboratory of Statistical Genetics, Rockefeller University, New York, NY 10021, USA
|