1
Yoosefzadeh-Najafabadi M, Eskandari M, Belzile F, Torkamaneh D. Genome-Wide Association Study Statistical Models: A Review. Methods Mol Biol 2022; 2481:43-62. [PMID: 35641758 DOI: 10.1007/978-1-0716-2237-7_4] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 06/15/2023]
Abstract
Statistical models are at the core of the genome-wide association study (GWAS). In this chapter, we provide an overview of single- and multilocus statistical models, Bayesian methods, and machine learning approaches for association studies in plants. These models are discussed in terms of their basic methodology, the cofactor adjustments they account for, statistical power, and computational efficiency. New statistical models and machine learning algorithms are both showing improved performance in detecting missed signals and rare mutations and in prioritizing causal genetic variants; nevertheless, further optimization and validation studies are required to maximize the power of GWAS.
Affiliation(s)
- Milad Eskandari
- Department of Plant Agriculture, University of Guelph, Guelph, ON, Canada
- François Belzile
- Département de Phytologie, Université Laval, Quebec City, QC, Canada
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Quebec City, QC, Canada
- Davoud Torkamaneh
- Département de Phytologie, Université Laval, Quebec City, QC, Canada
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Quebec City, QC, Canada
2
De La Torre AR, Sekhwal MK, Neale DB. Selective Sweeps and Polygenic Adaptation Drive Local Adaptation along Moisture and Temperature Gradients in Natural Populations of Coast Redwood and Giant Sequoia. Genes (Basel) 2021; 12:1826. [PMID: 34828432 PMCID: PMC8621000 DOI: 10.3390/genes12111826] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/23/2021] [Revised: 11/18/2021] [Accepted: 11/18/2021] [Indexed: 12/26/2022] Open
Abstract
Dissecting the genomic basis of local adaptation is a major goal in evolutionary biology and conservation science. Rapid changes in the climate pose significant challenges to the survival of natural populations, and the genomic basis of long-generation plant species is still poorly understood. Here, we investigated genome-wide climate adaptation in giant sequoia and coast redwood, two iconic and ecologically important tree species. We used a combination of univariate and multivariate genotype-environment association methods and a selective sweep analysis using non-overlapping sliding windows. We identified genomic regions of potential adaptive importance, showing strong associations to moisture variables and mean annual temperature. Our results found a complex architecture of climate adaptation in the species, with genomic regions showing signatures of selective sweeps, polygenic adaptation, or a combination of both, suggesting recent or ongoing climate adaptation along moisture and temperature gradients in giant sequoia and coast redwood. The results of this study provide a first step toward identifying genomic regions of adaptive significance in the species and will provide information to guide management and conservation strategies that seek to maximize adaptive potential in the face of climate change.
Affiliation(s)
- Amanda R. De La Torre
- School of Forestry, Northern Arizona University, 200 E. Pine Knoll, Flagstaff, AZ 86011, USA
- Manoj K. Sekhwal
- School of Forestry, Northern Arizona University, 200 E. Pine Knoll, Flagstaff, AZ 86011, USA
- David B. Neale
- Department of Plant Sciences, University of California-Davis, One Shields Avenue, Davis, CA 95616, USA
3
dos Santos BA, Pereira GL, Bussiman FDO, Paschoal VR, de Souza Júnior SM, Balieiro JCDC, Chardulo LAL, Curi RA. Genomic analysis of the population structure in horses of the Brazilian Mangalarga Marchador breed. Livest Sci 2019. [DOI: 10.1016/j.livsci.2019.09.010] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 10/26/2022]
4
Gene-dense autosomal chromosomes show evidence for increased selection. Heredity (Edinb) 2019; 123:774-783. [PMID: 31576017 DOI: 10.1038/s41437-019-0272-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 04/30/2019] [Accepted: 09/16/2019] [Indexed: 12/20/2022] Open
Abstract
Purifying selection tends to reduce nucleotide and haplotype diversity, leading to increased linkage disequilibrium. However, detection of evidence for selection is difficult as the signature is confounded by wide variation in the recombination rate, which has a complex relationship with selection. The effective bottleneck time (the ratio of the linkage disequilibrium map to the genetic map in Morgans) controls for variability in the recombination rate. Reduced effective bottleneck times indicate stronger residual linkage disequilibrium, consistent with increased selection. Using whole genome sequence data from one European and three Sub-Saharan African human populations we find, in the African samples, strong correlations between high gene densities and reduced effective bottleneck time for autosomal chromosomes. This suggests that gene-dense autosomes have been subject to increased purifying selection, reducing effective bottleneck times compared to gene-poor autosomes. Although previous studies have shown unusually strong linkage disequilibrium for the sex chromosomes, variation within the autosomes has not been recognised. The strongest relationship is between effective bottleneck time and the density of essential genes, which are likely targets of greater selective pressure (p = 0.006, for the 22 autosomes). The magnitude of the reduction in chromosome-specific effective bottleneck times from the least to the most gene-dense autosomes is ~17-21% for Sub-Saharan African populations. The effect size is greater in Sub-Saharan African populations, compared to a European sample, consistent with increased efficiency of selection in populations with larger effective population sizes which have not been subject to intense population bottlenecks as experienced by populations of European ancestry. The findings highlight the value of deeper analyses of selection within Sub-Saharan African populations.
5
Wallace AD, Wendt GA, Barcellos LF, de Smith AJ, Walsh KM, Metayer C, Costello JF, Wiemels JL, Francis SS. To ERV Is Human: A Phenotype-Wide Scan Linking Polymorphic Human Endogenous Retrovirus-K Insertions to Complex Phenotypes. Front Genet 2018; 9:298. [PMID: 30154825 PMCID: PMC6102640 DOI: 10.3389/fgene.2018.00298] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 04/27/2018] [Accepted: 07/16/2018] [Indexed: 12/13/2022] Open
Abstract
Approximately 8% of the human genome is comprised of endogenous retroviral insertions (ERVs) originating from historic retroviral integration into germ cells. The function of ERVs as regulators of gene expression is well established. Less well studied are insertional polymorphisms of ERVs and their contribution to the heritability of complex phenotypes. The most recent integration of ERV, HERV-K, is expressed in a range of complex human conditions from cancer to neurologic diseases. Using an in-house computational pipeline and whole-genome sequencing data from the diverse 1000 Genomes Phase 3 population (n = 2,504), we identified 46 polymorphic HERV-K insertions that are tagged by adjacent single nucleotide polymorphisms (SNPs). To test the potential role of polymorphic HERV-K in the heritability of complex diseases, existing databases were queried for enrichment of established relationships between the HERV-K insertion-associated SNPs (hiSNPs) and tissue-specific gene expression and disease phenotypes. Overall, hiSNPs for the 46 polymorphic HERV-K sites were statistically enriched (p < 1.0E-16) for eQTLs across 44 human tissues. Fifteen of the 46 HERV-K insertions had hiSNPs annotated in the EMBL-EBI GWAS Catalog and cumulatively associated with >100 phenotypes. Experimental factor ontology enrichment analysis suggests that polymorphic HERV-K insertions specifically contribute to neurologic and immunologic disease phenotypes, including traits related to intracranial volume (FDR 2.00E-09), Parkinson's disease (FDR 1.80E-09), and autoimmune diseases (FDR 1.80E-09). These results provide strong candidates for context-specific study of polymorphic HERV-K insertions in disease-related traits, serving as a roadmap for future studies of the heritability of complex disease.
Affiliation(s)
- Amelia D Wallace
- Division of Epidemiology, School of Public Health, University of California, Berkeley, Berkeley, CA, United States
- George A Wendt
- Division of Epidemiology, School of Community Health Sciences, University of Nevada, Reno, NV, United States
- Lisa F Barcellos
- Division of Epidemiology, School of Public Health, University of California, Berkeley, Berkeley, CA, United States
- Adam J de Smith
- Department of Epidemiology and Biostatistics, Helen Diller Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA, United States
- Kyle M Walsh
- Department of Neurosurgery, Duke University, Durham, NC, United States
- Catherine Metayer
- Division of Epidemiology, School of Public Health, University of California, Berkeley, Berkeley, CA, United States
- Joseph F Costello
- Department of Neurosurgery, Helen Diller Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA, United States
- Joseph L Wiemels
- Department of Epidemiology and Biostatistics, Helen Diller Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA, United States; Department of Neurosurgery, Helen Diller Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA, United States
- Stephen S Francis
- Division of Epidemiology, School of Community Health Sciences, University of Nevada, Reno, NV, United States; Department of Epidemiology and Biostatistics, Helen Diller Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA, United States
6
Kim S, Cheong HS, Shin HD, Lee SS, Roh HJ, Jeon DY, Cho CY. Genetic diversity and divergence among Korean cattle breeds assessed using a BovineHD single-nucleotide polymorphism chip. Asian-Australas J Anim Sci 2018; 31:1691-1699. [PMID: 30056676 PMCID: PMC6212751 DOI: 10.5713/ajas.17.0419] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 05/31/2017] [Accepted: 06/22/2018] [Indexed: 01/07/2023]
Abstract
Objective: In Korea, there are three main cattle breeds, which are distinguished by coat color: Brown Hanwoo (BH), Brindle Hanwoo (BRH), and Jeju Black (JB). In this study, we sought to compare the genetic diversity and divergence among these Korean cattle breeds using a BovineHD chip genotyping array. Methods: Sample data were collected from 168 cattle in three populations of BH (48 cattle), BRH (96 cattle), and JB (24 cattle). The single-nucleotide polymorphism (SNP) genotyping was performed using the Illumina BovineHD SNP 777K Bead chip. Results: Heterozygosity, used as a measure of within-breed genetic diversity, was higher in BH (0.293) and BRH (0.296) than in JB (0.266). Linkage disequilibrium decay was more rapid in BH and BRH than in JB, reaching an average r² value of 0.2 before 26 kb in BH and BRH, whereas the corresponding value was reached before 32 kb in JB. Intra-population, inter-population, and Fst analyses were used to identify candidate signatures of positive selection in the genome of a domestic Korean cattle population and 48, 11, and 11 loci were detected in the genomic region of the BRH breed, respectively. A Neighbor-Joining phylogenetic tree showed two main groups: a group comprising BH and BRH on one side and a group containing JB on the other. The runs of homozygosity analysis between Korean breeds indicated that the BRH and JB breeds have high inbreeding within breeds compared with BH. An analysis of differentiation based on a high-density SNP chip showed differences between Korean cattle breeds and the closeness of breeds corresponding to the geographic regions where they are evolving. Conclusion: Our results indicate that although the Korean cattle breeds have common features, they also show reliable breed diversity.
Affiliation(s)
- Seungchang Kim
- Animal Genetic Resources Center, National Institute of Animal Science, RDA, Namwon 55717, Korea
- Hyun Sub Cheong
- Department of Genetic Epidemiology, SNP Genetics, Inc., Seoul 04107, Korea
- Hyoung Doo Shin
- Department of Genetic Epidemiology, SNP Genetics, Inc., Seoul 04107, Korea; Department of Life Science, Sogang University, Seoul 04107, Korea
- Sung-Soo Lee
- Animal Genetic Resources Center, National Institute of Animal Science, RDA, Namwon 55717, Korea
- Hee-Jong Roh
- Animal Genetic Resources Center, National Institute of Animal Science, RDA, Namwon 55717, Korea
- Da-Yeon Jeon
- Animal Genetic Resources Center, National Institute of Animal Science, RDA, Namwon 55717, Korea
- Chang-Yeon Cho
- Animal Genetic Resources Center, National Institute of Animal Science, RDA, Namwon 55717, Korea
7
Bejarano D, Martínez R, Manrique C, Parra LM, Rocha JF, Gómez Y, Abuabara Y, Gallego J. Linkage disequilibrium levels and allele frequency distribution in Blanco Orejinegro and Romosinuano Creole cattle using medium density SNP chip data. Genet Mol Biol 2018; 41:426-433. [PMID: 30088613 PMCID: PMC6082240 DOI: 10.1590/1678-4685-gmb-2016-0310] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 12/20/2016] [Accepted: 09/20/2017] [Indexed: 11/22/2022] Open
Abstract
The linkage disequilibrium (LD) between molecular markers affects the accuracy of genome-wide association studies and genomic selection application. High-density genotyping platforms allow identifying the genotype of thousands of single nucleotide polymorphisms (SNPs) distributed throughout the animal genomes, which increases the resolution of LD evaluations. This study evaluated the distribution of minor allele frequencies (MAF) and the level of LD in the Colombian Creole cattle breeds Blanco Orejinegro (BON) and Romosinuano (ROMO) using a medium density SNP panel (BovineSNP50K_v2). The LD decay in these breeds was slower than that reported for other taurine breeds, achieving optimal LD values (r² ≥ 0.3) up to a distance of 70 kb in BON and 100 kb in ROMO, which is possibly associated with the conservation status of these cattle populations and their effective population size. The average MAF for both breeds was 0.27 ± 0.14, with a higher proportion of SNPs having high MAF values (≥ 0.3). The LD levels and distribution of allele frequencies found in this study suggest that it is possible to have adequate coverage throughout the genome of these breeds using the BovineSNP50K_v2, capturing the effect of most QTL related to productive traits, and ensuring an adequate prediction capacity in genomic analysis.
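The two quantities this abstract reports, MAF and pairwise r², can be illustrated with a minimal sketch. It assumes phased, biallelic haplotypes coded 0/1 with no missing data; the function names `minor_allele_frequency` and `ld_r2` are hypothetical and this is not the authors' pipeline, which worked from unphased SNP chip genotypes.

```python
def minor_allele_frequency(alleles):
    """Return the frequency of the rarer allele at one biallelic site.

    `alleles` is a list of 0/1 allele codes, one per haplotype.
    """
    if not alleles:
        raise ValueError("no alleles supplied")
    freq_one = sum(alleles) / len(alleles)
    return min(freq_one, 1.0 - freq_one)


def ld_r2(hap_a, hap_b):
    """Return r^2 between two sites from phased 0/1 haplotypes.

    Uses r^2 = D^2 / (pA (1 - pA) pB (1 - pB)), with D = pAB - pA * pB,
    where pAB is the frequency of haplotypes carrying allele 1 at both sites.
    """
    if len(hap_a) != len(hap_b) or not hap_a:
        raise ValueError("haplotype lists must be non-empty and equal in length")
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0.0:
        raise ValueError("r^2 is undefined when either site is monomorphic")
    return d * d / denom
```

Two sites whose alleles always co-occur give r² = 1, and two sites whose alleles are statistically independent give r² = 0. An LD decay curve such as the one described above is obtained by averaging r² over marker pairs binned by physical distance.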
Affiliation(s)
- Diego Bejarano
- Corporación Colombiana de Investigación Agropecuaria - Corpoica. Centro de Investigación Tibaitatá, Cundinamarca, Colombia
- Rodrigo Martínez
- Corporación Colombiana de Investigación Agropecuaria - Corpoica. Centro de Investigación Tibaitatá, Cundinamarca, Colombia
- Luis Miguel Parra
- Corporación Colombiana de Investigación Agropecuaria - Corpoica. Centro de Investigación Tibaitatá, Cundinamarca, Colombia
- Juan Felipe Rocha
- Corporación Colombiana de Investigación Agropecuaria - Corpoica. Centro de Investigación Obonuco, Nariño, Colombia
- Yolanda Gómez
- Corporación Colombiana de Investigación Agropecuaria - Corpoica. Centro de Investigación Tibaitatá, Cundinamarca, Colombia
- Yesid Abuabara
- Corporación Colombiana de Investigación Agropecuaria - Corpoica. Centro de Investigación Turipaná, Córdoba, Colombia
- Jaime Gallego
- Corporación Colombiana de Investigación Agropecuaria - Corpoica. Centro de Investigación El Nus, Antioquia, Colombia
8
Pig identification and meat traceability by multiallelic amplification fragments with multiple single nucleotide polymorphisms. Animal 2017; 12:1785-1791. [PMID: 29271334 DOI: 10.1017/s1751731117003482] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/07/2022] Open
Abstract
Compared with conventional identification methods, DNA-based genetic approaches such as single nucleotide polymorphisms (SNPs) and microsatellites are much more reliable for pig identification and meat traceability. In this study, multiallelic amplification fragments with multiple SNPs, incorporating the advantages of both SNPs and microsatellites, were explored for the first time for pig identification and meat traceability. Primer pairs for multiallelic fragments and their optimal SNPs were successfully selected and used for identification of individuals from Suzhong and Duroc populations. Meanwhile, the combined panel of the above-mentioned primer pairs together with their optimal SNPs for Suzhong and/or Duroc pigs was validated for identification of the hybrids (Suzhong×Duroc). Therefore, we have successfully selected multiallelic amplification fragments with multiple SNPs to identify pigs and their meat samples from Suzhong, Duroc or their hybrids. Our study demonstrates that our method is more powerful for pig identification or meat traceability than SNPs or microsatellites.
9
Wang MD, Dzama K, Hefer CA, Muchadeyi FC. Genomic population structure and prevalence of copy number variations in South African Nguni cattle. BMC Genomics 2015; 16:894. [PMID: 26531252 PMCID: PMC4632335 DOI: 10.1186/s12864-015-2122-z] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 06/21/2015] [Accepted: 10/22/2015] [Indexed: 12/21/2022] Open
Abstract
Background: Copy number variations (CNVs) are modifications in DNA structure comprising deletions, duplications, insertions and complex multi-site variants. Although CNVs are proven to be involved in a variety of phenotypic discrepancies, the full extent and consequence of CNVs is yet to be understood. To date, no such genomic characterization has been performed in indigenous South African Nguni cattle. Nguni cattle are recognized for their ability to sustain harsh environmental conditions while exhibiting enhanced resistance to disease and parasites and are thought to comprise up to nine different ecotypes. Methods: Illumina BovineSNP50 Beadchip data was utilized to investigate genomic population structure and the prevalence of CNVs in 492 South African Nguni cattle. PLINK, ADMIXTURE, R, gPLINK and Haploview software was utilized for quality control, population structure and haplotype block determination. PennCNV hidden Markov model identified CNVs and genes contained within and 10 Mb downstream from reported CNVs. PANTHER and Ensembl databases were subsequently utilized for gene annotation analyses. Results: Population structure analyses on Nguni cattle revealed 5 sub-populations with a possible sub-structure evident at K equal to 8. Four hundred and thirty three CNVs that formed 334 CNVRs ranging from 30 kb to 1 Mb in size are reported. Only 231 of the 492 animals demonstrated CNVRs. Two hundred and eighty nine genes were observed within CNVRs identified. Of these 149, 28, 44, 2 and 14 genes were unique to sub-populations A, B, C, D and E respectively. Gene ontology analyses demonstrated a number of pathways to be represented by respective genes, including immune response, response to abiotic stress and biological regulation processes. Conclusions: CNVs may explain part of the phenotypic diversity and the enhanced adaptation evident in Nguni cattle. Genes involved in a number of cellular components, biological processes and molecular functions are reported within CNVRs identified. The significance of such CNVRs and the possible effect thereof needs to be ascertained and may hold interesting insight into the functional and adaptive consequence of CNVs in cattle.
Affiliation(s)
- Magretha Diane Wang
- Department of Animal Sciences, University of Stellenbosch, Private Bag X1, Matieland, Stellenbosch, 7602, South Africa; Biotechnology Platform, Agricultural Research Council, Private Bag X5, Onderstepoort, 0110, South Africa
- Kennedy Dzama
- Department of Animal Sciences, University of Stellenbosch, Private Bag X1, Matieland, Stellenbosch, 7602, South Africa
- Charles A Hefer
- Biotechnology Platform, Agricultural Research Council, Private Bag X5, Onderstepoort, 0110, South Africa
- Farai C Muchadeyi
- Biotechnology Platform, Agricultural Research Council, Private Bag X5, Onderstepoort, 0110, South Africa
10
11
Pengelly RJ, Tapper W, Gibson J, Knut M, Tearle R, Collins A, Ennis S. Whole genome sequences are required to fully resolve the linkage disequilibrium structure of human populations. BMC Genomics 2015; 16:666. [PMID: 26335686 PMCID: PMC4558963 DOI: 10.1186/s12864-015-1854-0] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 03/25/2015] [Accepted: 08/17/2015] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND An understanding of linkage disequilibrium (LD) structures in the human genome underpins much of medical genetics and provides a basis for disease gene mapping and investigating biological mechanisms such as recombination and selection. Whole genome sequencing (WGS) provides the opportunity to determine LD structures at maximal resolution. RESULTS We compare LD maps constructed from WGS data with LD maps produced from the array-based HapMap dataset, for representative European and African populations. WGS provides up to 5.7-fold greater SNP density than array-based data and achieves much greater resolution of LD structure, allowing for identification of up to 2.8-fold more regions of intense recombination. The absence of ascertainment bias in variant genotyping improves the population representativeness of the WGS maps, and highlights the extent of uncaptured variation using array genotyping methodologies. The complete capture of LD patterns using WGS allows for higher genome-wide association study (GWAS) power compared to array-based GWAS, with WGS also allowing for the analysis of rare variation. The impact of marker ascertainment issues in arrays has been greatest for Sub-Saharan African populations where larger sample sizes and substantially higher marker densities are required to fully resolve the LD structure. CONCLUSIONS WGS provides the best possible resource for LD mapping due to the maximal marker density and lack of ascertainment bias. WGS LD maps provide a rich resource for medical and population genetics studies. The increasing availability of WGS data for large populations will allow for improved research utilising LD, such as GWAS and recombination biology studies.
Affiliation(s)
- Reuben J Pengelly
- Human Genetics & Genomic Medicine, Faculty of Medicine, University of Southampton, Duthie Building (MP 808), Tremona Road, Southampton, SO16 6YD, UK
- William Tapper
- Human Genetics & Genomic Medicine, Faculty of Medicine, University of Southampton, Duthie Building (MP 808), Tremona Road, Southampton, SO16 6YD, UK
- Jane Gibson
- Centre for Biological Sciences, Faculty of Natural & Environmental Sciences, University of Southampton, Southampton, UK
- Marcin Knut
- Human Genetics & Genomic Medicine, Faculty of Medicine, University of Southampton, Duthie Building (MP 808), Tremona Road, Southampton, SO16 6YD, UK
- Rick Tearle
- Complete Genomics, Inc., Mountain View, CA, USA
- Andrew Collins
- Human Genetics & Genomic Medicine, Faculty of Medicine, University of Southampton, Duthie Building (MP 808), Tremona Road, Southampton, SO16 6YD, UK
- Sarah Ennis
- Human Genetics & Genomic Medicine, Faculty of Medicine, University of Southampton, Duthie Building (MP 808), Tremona Road, Southampton, SO16 6YD, UK
12
Sehgal D, Skot L, Singh R, Srivastava RK, Das SP, Taunk J, Sharma PC, Pal R, Raj B, Hash CT, Yadav RS. Exploring potential of pearl millet germplasm association panel for association mapping of drought tolerance traits. PLoS One 2015; 10:e0122165. [PMID: 25970600 PMCID: PMC4430295 DOI: 10.1371/journal.pone.0122165] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.0] [Received: 11/10/2014] [Accepted: 02/07/2015] [Indexed: 11/19/2022] Open
Abstract
A pearl millet inbred germplasm association panel (PMiGAP) comprising 250 inbred lines, representative of cultivated germplasm from Africa and Asia, elite improved open-pollinated cultivars, hybrid parental inbreds and inbred mapping population parents, was recently established. This study presents the first report of genetic diversity in PMiGAP and its exploitation for association mapping of drought tolerance traits. For diversity and genetic structure analysis, PMiGAP was genotyped with 37 SSR and CISP markers representing all seven linkage groups. For association analysis, it was phenotyped for yield and yield components and morpho-physiological traits under both well-watered and drought conditions, and genotyped with SNPs and InDels from seventeen genes underlying a major validated drought tolerance (DT) QTL. The average gene diversity in PMiGAP was 0.54. The STRUCTURE analysis revealed six subpopulations within PMiGAP. Significant associations were obtained for 22 SNPs and 3 InDels from 13 genes under different treatments. Seven SNP associations from 5 genes were common under irrigated and one of the drought stress treatments. Most significantly, an important SNP in a putative acetyl CoA carboxylase gene showed constitutive association with grain yield, grain harvest index and panicle yield under all treatments. An InDel in a putative chlorophyll a/b binding protein gene was significantly associated with both stay-green and grain yield traits under drought stress. This can be used as a functional marker for selecting high yielding genotypes with 'stay green' phenotype under drought stress. The present study identified useful marker-trait associations of important agronomic traits under irrigated and drought stress conditions with genes underlying a major validated DT-QTL in pearl millet. Results suggest that PMiGAP is a useful panel for association mapping. Expression patterns of genes also shed light on some physiological mechanisms underlying pearl millet drought tolerance.
Affiliation(s)
- Deepmala Sehgal
- Institute of Biological, Environmental and Rural Sciences (IBERS), Aberystwyth University, Gogerddan, Aberystwyth, Ceredigion, United Kingdom
- Leif Skot
- Institute of Biological, Environmental and Rural Sciences (IBERS), Aberystwyth University, Gogerddan, Aberystwyth, Ceredigion, United Kingdom
- Richa Singh
- Institute of Biological, Environmental and Rural Sciences (IBERS), Aberystwyth University, Gogerddan, Aberystwyth, Ceredigion, United Kingdom
- Chaudhary Charan Singh Haryana Agricultural University (CCSHAU), Department of Molecular Biology and Biotechnology, Hisar, Haryana, India
- Rakesh Kumar Srivastava
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, Andhra Pradesh, India
- Sankar Prasad Das
- Institute of Biological, Environmental and Rural Sciences (IBERS), Aberystwyth University, Gogerddan, Aberystwyth, Ceredigion, United Kingdom
- ICAR Research Complex for NEH Region, Tripura Centre, Lembucherra, India
- Jyoti Taunk
- Institute of Biological, Environmental and Rural Sciences (IBERS), Aberystwyth University, Gogerddan, Aberystwyth, Ceredigion, United Kingdom
- Chaudhary Charan Singh Haryana Agricultural University (CCSHAU), Department of Molecular Biology and Biotechnology, Hisar, Haryana, India
- Parbodh C. Sharma
- Institute of Biological, Environmental and Rural Sciences (IBERS), Aberystwyth University, Gogerddan, Aberystwyth, Ceredigion, United Kingdom
- Central Soil Salinity Research Institute (CSSRI), Karnal, India
- Ram Pal
- Institute of Biological, Environmental and Rural Sciences (IBERS), Aberystwyth University, Gogerddan, Aberystwyth, Ceredigion, United Kingdom
- National Research Centre for Orchids, Darjeeling Campus, Darjeeling, India
- Bhasker Raj
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, Andhra Pradesh, India
- Rattan S. Yadav
- Institute of Biological, Environmental and Rural Sciences (IBERS), Aberystwyth University, Gogerddan, Aberystwyth, Ceredigion, United Kingdom
13
Linkage disequilibrium levels in Bos indicus and Bos taurus cattle using medium and high density SNP chip data and different minor allele frequency distributions. Livest Sci 2014. [DOI: 10.1016/j.livsci.2014.05.007] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.1] [Indexed: 11/17/2022]
14
Bhosale SU, Stich B, Rattunde HFW, Weltzien E, Haussmann BIG, Hash CT, Ramu P, Cuevas HE, Paterson AH, Melchinger AE, Parzies HK. Association analysis of photoperiodic flowering time genes in west and central African sorghum [Sorghum bicolor (L.) Moench]. BMC Plant Biol 2012; 12:32. [PMID: 22394582 PMCID: PMC3364917 DOI: 10.1186/1471-2229-12-32] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 06/17/2011] [Accepted: 03/07/2012] [Indexed: 05/18/2023]
Abstract
BACKGROUND Photoperiod-sensitive flowering is a key adaptive trait for sorghum (Sorghum bicolor) in West and Central Africa. In this study we performed an association analysis to investigate the effect of polymorphisms within the genes putatively related to variation in flowering time on photoperiod-sensitive flowering in sorghum. For this purpose a genetically characterized panel of 219 sorghum accessions from West and Central Africa was evaluated for their photoperiod response index (PRI) based on two sowing dates under field conditions. RESULTS Sorghum accessions used in our study were genotyped for single nucleotide polymorphisms (SNPs) in six genes putatively involved in the photoperiodic control of flowering time. Applying a mixed model approach and previously-determined population structure parameters to these candidate genes, we found significant associations between several SNPs with PRI for the genes CRYPTOCHROME 1 (CRY1-b1) and GIGANTEA (GI). CONCLUSIONS The negative values of Tajima's D, found for the genes of our study, suggested that purifying selection has acted on genes involved in photoperiodic control of flowering time in sorghum. The SNP markers of our study that showed significant associations with PRI can be used to create functional markers to serve as important tools for marker-assisted selection of photoperiod-sensitive cultivars in sorghum.
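For readers unfamiliar with the statistic named in the conclusions above, the following is a minimal sketch of Tajima's D using the standard 1989 formulation. It assumes aligned, equal-length sequences with no gaps or missing data; the function name `tajimas_d` is hypothetical, and this is not the authors' analysis code.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Return Tajima's D for a list of aligned, equal-length sequences."""
    n = len(sequences)
    if n < 4:
        # For n = 2 or 3 the variance constants are zero and D is undefined.
        raise ValueError("need at least four sequences")
    if len({len(seq) for seq in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) sites.
    s = sum(1 for site in zip(*sequences) if len(set(site)) > 1)
    if s == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of pairwise differences between sequences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)

    # Constants that depend only on the sample size n.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)

    # D contrasts pi with Watterson's estimator S / a1.
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))
```

An excess of rare variants makes the pairwise diversity smaller than Watterson's estimate and gives a negative D, which is the pattern the abstract interprets as purifying selection. A sample dominated by intermediate-frequency variants gives a positive D.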
Affiliation(s)
- Sankalp U Bhosale
  - Institute of Plant Breeding, Seed Science, and Population Genetics, University of Hohenheim, 70593 Stuttgart, Germany
- Benjamin Stich
  - Max Planck Institute for Plant Breeding Research, 50829 Köln, Germany
- H Frederick W Rattunde
  - International Crops Research Institute for the Semi-Arid Tropics (ICRISAT) - Bamako, BP 320 Bamako, Mali
- Eva Weltzien
  - International Crops Research Institute for the Semi-Arid Tropics (ICRISAT) - Bamako, BP 320 Bamako, Mali
- Bettina IG Haussmann
  - Institute of Plant Breeding, Seed Science, and Population Genetics, University of Hohenheim, 70593 Stuttgart, Germany
  - ICRISAT - Sadoré, BP 12404 Niamey, Niger
- C Thomas Hash
  - ICRISAT - Sadoré, BP 12404 Niamey, Niger
  - ICRISAT - Patancheru, Hyderabad 502324, Andhra Pradesh, India
- Punna Ramu
  - ICRISAT - Patancheru, Hyderabad 502324, Andhra Pradesh, India
- Hugo E Cuevas
  - Plant Genome Mapping Laboratory, University of Georgia, Athens GA 30602, USA
  - U.S. Dept. of Agriculture, Agricultural Research Service, Tropical Agriculture Research Station, 2200 P.A. Campos Ave., Mayaguez P.R. 00680, Puerto Rico
- Andrew H Paterson
  - Plant Genome Mapping Laboratory, University of Georgia, Athens GA 30602, USA
- Albrecht E Melchinger
  - Institute of Plant Breeding, Seed Science, and Population Genetics, University of Hohenheim, 70593 Stuttgart, Germany
- Heiko K Parzies
  - Institute of Plant Breeding, Seed Science, and Population Genetics, University of Hohenheim, 70593 Stuttgart, Germany
15
Uimari P, Kontkanen O, Visscher PM, Pirskanen M, Fuentes R, Salonen JT. Genome-Wide Linkage Disequilibrium from 100,000 SNPs in the East Finland Founder Population. Twin Res Hum Genet 2012. [DOI: 10.1375/twin.8.3.185] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/05/2022]
Abstract
Information about linkage disequilibrium (LD) is important in understanding the genome structure and has its applications in association studies. Here we present the first genome-wide LD study based on a founder population (East Finland). The LD data consist of 118 unrelated individuals and around 480,000 SNP pairs genotyped with the Affymetrix 100K genotyping assay. Using the minor allele frequency (MAF) limit of .05, the squared correlation coefficient between two loci (r²) was .48, .37, .28, and .20 for distances of 5, 10, 20, and 40 kb respectively. MAF had a significant effect on the mean r² so that the extent of useful LD (r² > .3) varied from 17 kb to 80 kb depending on the limit set for the MAF. For D' the effect of MAF was smaller but reflected the possible age of the mutation: SNPs with high MAF had lower D' than those with low MAF. The X chromosome showed higher D' values than autosomes and the extent of useful LD (r² > .3) was twice as long on the X chromosome than on the autosomes. Based on the results, LD varies across the genome and is correlated to local recombination rate between and within chromosomes. However, the recombination rate does not explain all the variation found in LD. We also report a number of long chromosomal regions where exceptionally high or low LD were detected.
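The two LD statistics contrasted in this abstract, r² and D', can be computed from one haplotype frequency and two allele frequencies. The following is a minimal sketch using the standard textbook definitions, not the authors' pipeline; the function name `ld_stats` and its argument names are hypothetical.

```python
def ld_stats(p_ab: float, p_a: float, p_b: float) -> tuple[float, float]:
    """Return (D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a, p_b: frequencies of allele A and allele B
    """
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        # A monomorphic locus carries no LD information.
        return 0.0, 0.0

    # Raw disequilibrium coefficient D.
    d = p_ab - p_a * p_b

    # D' rescales |D| by its maximum attainable value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the two allelic states.
    r2 = d * d / denom
    return d_prime, r2


if __name__ == "__main__":
    # A rare allele (frequency 0.1) found only on B-carrying haplotypes.
    print(ld_stats(p_ab=0.1, p_a=0.1, p_b=0.5))
```

In the usage example, D' is 1 while r² is only about 0.11, which illustrates why a MAF threshold can shift mean r² much more than it shifts D'.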
16
Perry RT, Dwivedi H, Aissani B. A Simple PCR-RFLP Method for Genetic Phase Determination in Compound Heterozygotes. Front Genet 2012; 2:108. [PMID: 22303402 PMCID: PMC3268647 DOI: 10.3389/fgene.2011.00108] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/25/2011] [Accepted: 12/22/2011] [Indexed: 11/13/2022] Open
Abstract
When susceptibility to diseases is caused by cis-effects of multiple alleles at adjacent polymorphic sites, it may be difficult to assess with confidence the genetic phase and identify individuals carrying the risk haplotype. Experimental assessment of genetic phase is still challenging and most population studies use statistical approaches to infer haplotypes given the observed genotypes. While these statistical approaches are powerful and have been proven very useful in large scale genetic population studies, they may be prone to errors in studies with small sample size, especially in the presence of compound heterozygotes. Here, we describe a simple and novel approach using the popular PCR-RFLP based strategy to assess the genetic phase in compound heterozygotes. We apply this method to two extensively studied SNPs in two clustered immune-related genes: The -308 (G > A) and the +252 (A > G) SNPs of the tumor necrosis factor (TNF) alpha and the lymphotoxin alpha (LTA) genes, respectively. Using this method, we successfully determined the genetic phase of these two SNPs in known compound heterozygous individuals and in every sample tested. We show that the A allele of TNF -308 is carried on the same chromosome as the LTA +252(G) allele.
Affiliation(s)
- Rodney T Perry
  - Department of Epidemiology, University of Alabama at Birmingham, Birmingham, AL, USA
17
Abstract
Dog domestication probably began very early, during the Upper Paleolithic period (~35,000 BP), well before any other animal or plant domestication. This early process, probably unconscious, is called proto-domestication to distinguish it from the real domestication process, which has been dated to around 14,000 BC. Genomic DNA analyses have recently shown that domestication started in the Middle East and rapidly expanded into all human populations. Nowadays, the dog population is fragmented into several hundred breeds, well characterized by their phenotypes, that offer a unique spectrum of polymorphism. More recent studies detect genetic signatures that will be useful to highlight breed history as well as the impact of domestication at the DNA level.
18
Abstract
In this chapter, mutation (specifically single-nucleotide polymorphisms, SNPs) and recombination will be covered in more detail, and the concepts of genotype and haplotype will be reviewed. Linkage disequilibrium (LD) describes the strength of a relationship between alleles at different loci. The definition for LD, its visual representation, and the calculation of statistics that measure LD will be presented. The power of genetic association studies to identify disease susceptibility alleles fundamentally relies on the genetic variants studied. A standard approach is to determine a set of tagging-SNPs (tSNPs) that capture the majority of genomic variation in regions of interest by exploiting local correlation structures. The concept of LD and how it is used to select tSNPs will be addressed, as well as specific procedures and algorithms that are practiced by researchers to determine these variants.
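As an illustration of how local correlation structure can be exploited to pick tagging SNPs, here is a minimal sketch of a greedy selection over a pairwise r² matrix. It is one common heuristic and is not necessarily the procedure the chapter describes; the function name `greedy_tag_snps`, the example matrix, and the 0.8 threshold are assumptions made for illustration.

```python
def greedy_tag_snps(r2: list[list[float]], threshold: float = 0.8) -> list[int]:
    """Pick tag SNPs so every SNP is a tag or has r^2 >= threshold with a tag.

    r2: symmetric matrix of pairwise r^2 values (hypothetical input).
    Returns the indices of the chosen tag SNPs in selection order.
    """
    untagged = set(range(len(r2)))
    tags: list[int] = []

    def captured_by(i: int) -> set[int]:
        # SNPs still untagged that SNP i would capture, including itself.
        return {j for j in untagged if j == i or r2[i][j] >= threshold}

    while untagged:
        # Choose the SNP capturing the most untagged SNPs; ties go to the lowest index.
        best = max(sorted(untagged), key=lambda i: len(captured_by(i)))
        tags.append(best)
        untagged -= captured_by(best)
    return tags


if __name__ == "__main__":
    # Four SNPs: SNP 0 is in strong LD with SNPs 1 and 2; SNP 3 is independent.
    example_r2 = [
        [1.00, 0.90, 0.85, 0.05],
        [0.90, 1.00, 0.50, 0.02],
        [0.85, 0.50, 1.00, 0.01],
        [0.05, 0.02, 0.01, 1.00],
    ]
    print(greedy_tag_snps(example_r2))
```

In the example, genotyping SNPs 0 and 3 is enough to capture all four markers at the chosen threshold.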
Affiliation(s)
- Karen Curtin
  - Genetic Epidemiology Division, Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, USA
19
Abstract
Meniere's disease remains a disorder of unknown origin despite the collective efforts to determine the pathogenesis, although experts have long recognized that disease development likely has some heritable component. Although genetic studies of Meniere's disease have been inconclusive, increasing knowledge of human genetic structure and mutation and investigative techniques have potential to further understanding of this disorder.
Affiliation(s)
- Jeffrey T Vrabec
  - Bobby R. Alford Department of Otolaryngology-Head and Neck Surgery, Baylor College of Medicine, 6550 Fannin Street, SM1727, Houston, TX 77030, USA.
20
Hanson RL, Millis MP, Young NJ, Kobes S, Nelson RG, Knowler WC, DiStefano JK. ELMO1 variants and susceptibility to diabetic nephropathy in American Indians. Mol Genet Metab 2010; 101:383-90. [PMID: 20826100 PMCID: PMC6542634 DOI: 10.1016/j.ymgme.2010.08.014] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Received: 07/06/2010] [Revised: 08/12/2010] [Accepted: 08/12/2010] [Indexed: 11/27/2022]
Abstract
Variants in the engulfment and cell motility 1 gene, ELMO1, have previously been associated with kidney disease attributed to type 2 diabetes. The Pima Indians of Arizona have high rates of diabetic nephropathy, which is strongly dependent on genetic determinants; thus, we sought to investigate the role of ELMO1 polymorphisms in mediating susceptibility to this disease in this population. Genotype distributions were compared among 141 individuals with nephropathy and 416 individuals without heavy proteinuria in a family study of 257 sibships, and 107 cases with diabetic ESRD and 108 controls with long duration diabetes and no nephropathy. We sequenced 17.4 kb of ELMO1 and identified 19 variants. We genotyped 12 markers, excluding those in 100% genotypic concordance with other variants or with a minor allele frequency <0.05, plus 21 additional markers showing association with ESRD in earlier studies. In the family study, the strongest evidence for association was with rs1345365 (odds ratio [OR]=2.42 per copy of A allele [1.35-4.32]; P=0.001) and rs10951509 (OR=2.42 per copy of A allele [1.31-4.48]; P=0.002), both of which are located in intron 13 and are in strong pairwise linkage disequilibrium (r² = 0.97). These associations were in the opposite direction from those observed in African Americans, which suggests that the relationship between diabetic kidney disease and ELMO1 variation may involve as yet undiscovered functional variants or complex interactions with other biological variables.
Affiliation(s)
- Robert L. Hanson
  - Diabetes Epidemiology and Clinical Research Section, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, 1550 East Indian School Road, Phoenix, AZ 85014
- Meredith P. Millis
  - Translational Genomics Research Institute, Diabetes, Cardiovascular and Metabolic Diseases Division, 445 North Fifth Street, Phoenix, AZ 85004
- Naomi J. Young
  - Translational Genomics Research Institute, Diabetes, Cardiovascular and Metabolic Diseases Division, 445 North Fifth Street, Phoenix, AZ 85004
- Sayuko Kobes
  - Diabetes Epidemiology and Clinical Research Section, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, 1550 East Indian School Road, Phoenix, AZ 85014
- Robert G. Nelson
  - Diabetes Epidemiology and Clinical Research Section, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, 1550 East Indian School Road, Phoenix, AZ 85014
- William C. Knowler
  - Diabetes Epidemiology and Clinical Research Section, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, 1550 East Indian School Road, Phoenix, AZ 85014
- Johanna K. DiStefano
  - Translational Genomics Research Institute, Diabetes, Cardiovascular and Metabolic Diseases Division, 445 North Fifth Street, Phoenix, AZ 85004
  - Corresponding author: Johanna K. DiStefano, Ph.D., Translational Genomics Research Institute, 445 North Fifth Street, Phoenix, AZ 85004, Tel: 602.343.8812, FAX: 602.343.8844
21
Stapley J, Birkhead TR, Burke T, Slate J. Pronounced inter- and intrachromosomal variation in linkage disequilibrium across the zebra finch genome. Genome Res 2010; 20:496-502. [PMID: 20357051 DOI: 10.1101/gr.102095.109] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Indexed: 01/30/2023]
Abstract
The extent of nonrandom association of alleles at two or more loci, termed linkage disequilibrium (LD), can reveal much about population demography, selection, and recombination rate, and is a key consideration when designing association mapping studies. Here, we describe a genome-wide analysis of LD in the zebra finch (Taeniopygia guttata) using 838 single nucleotide polymorphisms and present LD maps for all assembled chromosomes. We found that LD declined with physical distance approximately five times faster on the microchromosomes compared to macrochromosomes. The distribution of LD across individual macrochromosomes also varied in a distinct pattern. In the center of the macrochromosomes there were large blocks of markers, sometimes spanning tens of mega bases, in strong LD whereas on the ends of macrochromosomes LD declined more rapidly. Regions of high LD were not simply the result of suppressed recombination around the centromere and this pattern has not been observed previously in other taxa. We also found evidence that this pattern of LD has remained stable across many generations. The variability in LD between and within chromosomes has important implications for genome wide association studies in birds and for our understanding of the distribution of recombination events and the processes that govern them.
Affiliation(s)
- Jessica Stapley
  - Department of Animal and Plant Sciences, University of Sheffield, Sheffield S10 2TN, UK.
22
Xu H, Shen X, Zhou M, Luo C, Kang L, Liang Y, Zeng H, Nie Q, Zhang D, Zhang X. The dopamine D2 receptor gene polymorphisms associated with chicken broodiness. Poult Sci 2010; 89:428-38. [DOI: 10.3382/ps.2009-00428] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Indexed: 11/20/2022] Open
23
Little J, Higgins JPT, Ioannidis JPA, Moher D, Gagnon F, von Elm E, Khoury MJ, Cohen B, Davey-Smith G, Grimshaw J, Scheet P, Gwinn M, Williamson RE, Zou GY, Hutchings K, Johnson CY, Tait V, Wiens M, Golding J, van Duijn C, McLaughlin J, Paterson A, Wells G, Fortier I, Freedman M, Zecevic M, King R, Infante-Rivard C, Stewart A, Birkett N. STrengthening the REporting of Genetic Association Studies (STREGA)--an extension of the STROBE statement. Genet Epidemiol 2010; 33:581-98. [PMID: 19278015 DOI: 10.1002/gepi.20410] [Citation(s) in RCA: 185] [Impact Index Per Article: 13.2] [Indexed: 01/02/2023]
Abstract
Making sense of rapidly evolving evidence on genetic associations is crucial to making genuine advances in human genomics and the eventual integration of this information in the practice of medicine and public health. Assessment of the strengths and weaknesses of this evidence, and hence the ability to synthesize it, has been limited by inadequate reporting of results. The STrengthening the REporting of Genetic Association studies (STREGA) initiative builds on the STrengthening the Reporting of OBservational Studies in Epidemiology (STROBE) Statement and provides additions to 12 of the 22 items on the STROBE checklist. The additions concern population stratification, genotyping errors, modelling haplotype variation, Hardy-Weinberg equilibrium, replication, selection of participants, rationale for choice of genes and variants, treatment effects in studying quantitative traits, statistical methods, relatedness, reporting of descriptive and outcome data, and the volume of data issues that are important to consider in genetic association studies. The STREGA recommendations do not prescribe or dictate how a genetic association study should be designed but seek to enhance the transparency of its reporting, regardless of choices made during design, conduct, or analysis.
Affiliation(s)
- Julian Little
  - Department of Epidemiology and Community Medicine, University of Ottawa, Ottawa, Ontario, Canada.
24
Armengol L, Villatoro S, González JR, Pantano L, García-Aragonés M, Rabionet R, Cáceres M, Estivill X. Identification of copy number variants defining genomic differences among major human groups. PLoS One 2009; 4:e7230. [PMID: 19789632 PMCID: PMC2747275 DOI: 10.1371/journal.pone.0007230] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.9] [Received: 01/20/2009] [Accepted: 08/20/2009] [Indexed: 12/14/2022] Open
Abstract
Background Understanding the genetic contribution to phenotype variation of human groups is necessary to elucidate differences in disease predisposition and response to pharmaceutical treatments in different human populations. Methodology/Principal Findings We have investigated the genome-wide profile of structural variation on pooled samples from the three populations studied in the HapMap project by comparative genome hybridization (CGH) in different array platforms. We have identified and experimentally validated 33 genomic loci that show significant copy number differences from one population to the other. Interestingly, we found an enrichment of genes related to environment adaptation (immune response, lipid metabolism and extracellular space) within these regions and the study of expression data revealed that more than half of the copy number variants (CNVs) translate into gene-expression differences among populations, suggesting that they could have functional consequences. In addition, the identification of single nucleotide polymorphisms (SNPs) that are in linkage disequilibrium with the copy number alleles allowed us to detect evidences of population differentiation and recent selection at the nucleotide variation level. Conclusions Overall, our results provide a comprehensive view of relevant copy number changes that might play a role in phenotypic differences among major human populations, and generate a list of interesting candidates for future studies.
Affiliation(s)
- Lluís Armengol
  - Genetic Causes of Disease Group, Genes and Disease Program, Center for Genomic Regulation (CRG-UPF) and CIBERESP, Barcelona, Catalonia, Spain
  - Quantitative Genomic Medicine Laboratories (qGenomics), Barcelona, Catalonia, Spain
- Sergi Villatoro
  - Genetic Causes of Disease Group, Genes and Disease Program, Center for Genomic Regulation (CRG-UPF) and CIBERESP, Barcelona, Catalonia, Spain
- Juan R. González
  - Center for Research in Environmental Epidemiology (CREAL), Barcelona, Catalonia, Spain
- Lorena Pantano
  - Genetic Causes of Disease Group, Genes and Disease Program, Center for Genomic Regulation (CRG-UPF) and CIBERESP, Barcelona, Catalonia, Spain
- Manel García-Aragonés
  - Genetic Causes of Disease Group, Genes and Disease Program, Center for Genomic Regulation (CRG-UPF) and CIBERESP, Barcelona, Catalonia, Spain
- Raquel Rabionet
  - Genetic Causes of Disease Group, Genes and Disease Program, Center for Genomic Regulation (CRG-UPF) and CIBERESP, Barcelona, Catalonia, Spain
- Mario Cáceres
  - Genetic Causes of Disease Group, Genes and Disease Program, Center for Genomic Regulation (CRG-UPF) and CIBERESP, Barcelona, Catalonia, Spain
- Xavier Estivill
  - Genetic Causes of Disease Group, Genes and Disease Program, Center for Genomic Regulation (CRG-UPF) and CIBERESP, Barcelona, Catalonia, Spain
  - Genetics Unit, Department of Health and Experimental Life Sciences, Pompeu Fabra University (UPF), Barcelona, Catalonia, Spain
  - National Genotyping Center (CeGen) Barcelona Genotyping Node, Center for Genomic Regulation (CRG-UPF), Barcelona, Catalonia, Spain
25
Little J, Higgins JP, Ioannidis JP, Moher D, Gagnon F, von Elm E, Khoury MJ, Cohen B, Davey-Smith G, Grimshaw J, Scheet P, Gwinn M, Williamson RE, Zou GY, Hutchings K, Johnson CY, Tait V, Wiens M, Golding J, van Duijn C, McLaughlin J, Paterson A, Wells G, Fortier I, Freedman M, Zecevic M, King R, Infante-Rivard C, Stewart AF, Birkett N. Strengthening the reporting of genetic association studies (STREGA)--an extension of the strengthening the reporting of observational studies in epidemiology (STROBE) statement. J Clin Epidemiol 2009; 62:597-608.e4. [DOI: 10.1016/j.jclinepi.2008.12.004] [Citation(s) in RCA: 70] [Impact Index Per Article: 4.7] [Received: 04/09/2008] [Revised: 12/29/2008] [Accepted: 12/29/2008] [Indexed: 01/15/2023]
26
Little J, Higgins JPT, Ioannidis JPA, Moher D, Gagnon F, von Elm E, Khoury MJ, Cohen B, Davey-Smith G, Grimshaw J, Scheet P, Gwinn M, Williamson RE, Zou GY, Hutchings K, Johnson CY, Tait V, Wiens M, Golding J, van Duijn C, McLaughlin J, Paterson A, Wells G, Fortier I, Freedman M, Zecevic M, King R, Infante-Rivard C, Stewart A, Birkett N. STrengthening the REporting of Genetic Association studies (STREGA)--an extension of the STROBE statement. Eur J Clin Invest 2009; 39:247-66. [PMID: 19297801 PMCID: PMC2730482 DOI: 10.1111/j.1365-2362.2009.02125.x] [Citation(s) in RCA: 183] [Impact Index Per Article: 12.2] [Indexed: 12/12/2022]
Abstract
Making sense of rapidly evolving evidence on genetic associations is crucial to making genuine advances in human genomics and the eventual integration of this information in the practice of medicine and public health. Assessment of the strengths and weaknesses of this evidence, and hence the ability to synthesize it, has been limited by inadequate reporting of results. The STrengthening the REporting of Genetic Association studies (STREGA) initiative builds on the STrengthening the Reporting of OBservational Studies in Epidemiology (STROBE) Statement and provides additions to 12 of the 22 items on the STROBE checklist. The additions concern population stratification, genotyping errors, modelling haplotype variation, Hardy-Weinberg equilibrium, replication, selection of participants, rationale for choice of genes and variants, treatment effects in studying quantitative traits, statistical methods, relatedness, reporting of descriptive and outcome data and the volume of data issues that are important to consider in genetic association studies. The STREGA recommendations do not prescribe or dictate how a genetic association study should be designed, but seek to enhance the transparency of its reporting, regardless of choices made during design, conduct or analysis.
Affiliation(s)
- Julian Little
  - Canada Research Chair in Human Genome Epidemiology, Ottawa, ON, Canada.
27
Little J, Higgins JPT, Ioannidis JPA, Moher D, Gagnon F, von Elm E, Khoury MJ, Cohen B, Davey-Smith G, Grimshaw J, Scheet P, Gwinn M, Williamson RE, Zou GY, Hutchings K, Johnson CY, Tait V, Wiens M, Golding J, van Duijn C, McLaughlin J, Paterson A, Wells G, Fortier I, Freedman M, Zecevic M, King R, Infante-Rivard C, Stewart A, Birkett N. STrengthening the REporting of Genetic Association Studies (STREGA): an extension of the STROBE statement. PLoS Med 2009; 6:e22. [PMID: 19192942 PMCID: PMC2634792 DOI: 10.1371/journal.pmed.1000022] [Citation(s) in RCA: 315] [Impact Index Per Article: 21.0] [Indexed: 12/15/2022] Open
Abstract
Making sense of rapidly evolving evidence on genetic associations is crucial to making genuine advances in human genomics and the eventual integration of this information in the practice of medicine and public health. Assessment of the strengths and weaknesses of this evidence, and hence the ability to synthesize it, has been limited by inadequate reporting of results. The STrengthening the REporting of Genetic Association studies (STREGA) initiative builds on the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement and provides additions to 12 of the 22 items on the STROBE checklist. The additions concern population stratification, genotyping errors, modelling haplotype variation, Hardy-Weinberg equilibrium, replication, selection of participants, rationale for choice of genes and variants, treatment effects in studying quantitative traits, statistical methods, relatedness, reporting of descriptive and outcome data, and the volume of data issues that are important to consider in genetic association studies. The STREGA recommendations do not prescribe or dictate how a genetic association study should be designed but seek to enhance the transparency of its reporting, regardless of choices made during design, conduct, or analysis.
Affiliation(s)
- Julian Little
  - Canada Research Chair in Human Genome Epidemiology, University of Ottawa, Ottawa, Ontario, Canada.
28
Little J, Higgins JPT, Ioannidis JPA, Moher D, Gagnon F, von Elm E, Khoury MJ, Cohen B, Davey-Smith G, Grimshaw J, Scheet P, Gwinn M, Williamson RE, Zou GY, Hutchings K, Johnson CY, Tait V, Wiens M, Golding J, van Duijn C, McLaughlin J, Paterson A, Wells G, Fortier I, Freedman M, Zecevic M, King R, Infante-Rivard C, Stewart A, Birkett N. Strengthening the reporting of genetic association studies (STREGA): an extension of the STROBE statement. Eur J Epidemiol 2009; 24:37-55. [PMID: 19189221 PMCID: PMC2764094 DOI: 10.1007/s10654-008-9302-y] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.3] [Received: 03/08/2008] [Accepted: 11/04/2008] [Indexed: 02/02/2023]
Abstract
Making sense of rapidly evolving evidence on genetic associations is crucial to making genuine advances in human genomics and the eventual integration of this information in the practice of medicine and public health. Assessment of the strengths and weaknesses of this evidence, and hence the ability to synthesize it, has been limited by inadequate reporting of results. The STrengthening the REporting of Genetic Association studies (STREGA) initiative builds on the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) Statement and provides additions to 12 of the 22 items on the STROBE checklist. The additions concern population stratification, genotyping errors, modeling haplotype variation, Hardy–Weinberg equilibrium, replication, selection of participants, rationale for choice of genes and variants, treatment effects in studying quantitative traits, statistical methods, relatedness, reporting of descriptive and outcome data, and the volume of data issues that are important to consider in genetic association studies. The STREGA recommendations do not prescribe or dictate how a genetic association study should be designed but seek to enhance the transparency of its reporting, regardless of choices made during design, conduct, or analysis.
Affiliation(s)
- Julian Little
  - Canada Research Chair in Human Genome Epidemiology, Toronto, ON, Canada.
29
Little J, Higgins JPT, Ioannidis JPA, Moher D, Gagnon F, von Elm E, Khoury MJ, Cohen B, Davey-Smith G, Grimshaw J, Scheet P, Gwinn M, Williamson RE, Zou GY, Hutchings K, Johnson CY, Tait V, Wiens M, Golding J, van Duijn C, McLaughlin J, Paterson A, Wells G, Fortier I, Freedman M, Zecevic M, King R, Infante-Rivard C, Stewart A, Birkett N. Strengthening the reporting of genetic association studies (STREGA): an extension of the STROBE Statement. Hum Genet 2009; 125:131-51. [DOI: 10.1007/s00439-008-0592-7] [Citation(s) in RCA: 129] [Impact Index Per Article: 8.6] [Received: 03/20/2008] [Accepted: 11/09/2008] [Indexed: 12/21/2022]
30
Stracke S, Haseneyer G, Veyrieras JB, Geiger HH, Sauer S, Graner A, Piepho HP. Association mapping reveals gene action and interactions in the determination of flowering time in barley. Theor Appl Genet 2009; 118:259-73. [PMID: 18830577 DOI: 10.1007/s00122-008-0896-y] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.0] [Received: 02/08/2008] [Accepted: 09/09/2008] [Indexed: 05/18/2023]
Abstract
The interaction between members of a gene network has an important impact on the variation of quantitative traits, and can influence the outcome of phenotype/genotype association studies. Three genes (Ppd-H1, HvCO1, HvFT1) known to play an essential role in the regulation of flowering time under long days in barley were subjected to an analysis of nucleotide diversity in a collection of 220 spring barley accessions. The coding region of Ppd-H1 was highly diverse, while both HvCO1 and HvFT1 showed a rather limited level of diversity. Within all three genes, the extent of linkage disequilibrium was variable, but on average only moderate. Ppd-H1 is strongly associated with flowering time across four environments, showing a difference of five to ten days between the most extreme haplotypes. The association between flowering time and the variation at HvFT1 and HvCO1 was strongly dependent on the haplotype present at Ppd-H1. The interaction between HvCO1 and Ppd-H1 was statistically significant, but this association disappeared when the analysis was corrected for the geographical origin of the accessions. No association existed between flowering time and allelic variation at HvFT1. In contrast to Ppd-H1, functional variation at both HvCO1 and HvFT1 is limited in cultivated barley.
Affiliation(s)
- Silke Stracke
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), 06466, Gatersleben, Germany.
31
Gains in power for exhaustive analyses of haplotypes using variable-sized sliding window strategy: a comparison of association-mapping strategies. Eur J Hum Genet 2008; 17:785-92. [PMID: 19092774 DOI: 10.1038/ejhg.2008.244] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.6] [Indexed: 11/08/2022] Open
Abstract
Linkage disequilibrium (LD)-based association mapping is often performed by analyzing either individual SNPs or block-based multi-SNP haplotypes. Sliding windows of several fixed sizes (in terms of SNP numbers) were also applied to a few simulated or real data sets. In comparison, exhaustively testing based on variable-sized sliding windows (VSW) of all possible sizes of SNPs over a genomic region has the best chance to capture the optimum markers (single SNPs or haplotypes) that are most significantly associated with the traits under study. However, the cost is the increased number of multiple tests and computation. Here, a strategy of VSW of all possible sizes is proposed and its power is examined, in comparison with those using only haplotype blocks (BLK) or single SNP loci (SGL) tests. Critical values for statistical significance testing that account for multiple testing are simulated. We demonstrated that, over a wide range of parameters simulated, VSW increased power for the detection of disease variants by approximately 1-15% over the BLK and SGL approaches. The improved performance was more significant in regions with high recombination rates. In an empirical data set, VSW obtained the most significant signal and identified the LRP5 gene as strongly associated with osteoporosis. With the use of computational techniques such as parallel algorithms and clustering computing, it is feasible to apply VSW to large genomic regions or those regions preliminarily identified by traditional SGL/BLK methods.
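To make the multiple-testing cost of the variable-sized sliding window (VSW) strategy concrete, the sketch below enumerates every contiguous window over a region of SNPs. It only illustrates the window enumeration, not the authors' association tests or their simulated critical values; the function names are hypothetical.

```python
from collections.abc import Iterator


def variable_sized_windows(n_snps: int) -> Iterator[tuple[int, int]]:
    """Yield every contiguous SNP window as a half-open index pair (start, end).

    Windows of size 1 are the single-SNP tests; larger windows are
    multi-SNP haplotype candidates.
    """
    for start in range(n_snps):
        for end in range(start + 1, n_snps + 1):
            yield start, end


def count_windows(n_snps: int) -> int:
    """Closed-form count of all windows: n(n + 1) / 2."""
    return n_snps * (n_snps + 1) // 2


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, "SNPs ->", count_windows(n), "windows to test")
```

Because the number of windows grows quadratically with the number of SNPs, the abstract's remark about parallel and cluster computing follows directly from this count.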
|
32
|
Barroso I, Luan J, Wheeler E, Whittaker P, Wasson J, Zeggini E, Weedon MN, Hunt S, Venkatesh R, Frayling TM, Delgado M, Neuman RJ, Zhao J, Sherva R, Glaser B, Walker M, Hitman G, McCarthy MI, Hattersley AT, Permutt MA, Wareham NJ, Deloukas P. Population-specific risk of type 2 diabetes conferred by HNF4A P2 promoter variants: a lesson for replication studies. Diabetes 2008; 57:3161-5. [PMID: 18728231 PMCID: PMC2570416 DOI: 10.2337/db08-0719] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.2] [Indexed: 11/13/2022]
Abstract
OBJECTIVE Single nucleotide polymorphisms (SNPs) in the P2 promoter region of HNF4A were originally shown to be associated with predisposition for type 2 diabetes in Finnish, Ashkenazi, and, more recently, Scandinavian populations, but they generated conflicting results in additional populations. We aimed to investigate whether data from a large-scale mapping approach would replicate this association in novel Ashkenazi samples and in U.K. populations and whether these data would allow us to refine the association signal. RESEARCH DESIGN AND METHODS Using a dense linkage disequilibrium map of 20q, we selected SNPs from a 10-Mb interval centered on HNF4A. In a staged approach, we first typed 4,608 SNPs in case-control populations from four U.K. populations and an Ashkenazi population (n = 2,516). In phase 2, a subset of 763 SNPs was genotyped in 2,513 additional samples from the same populations. RESULTS Combined analysis of both phases demonstrated association between HNF4A P2 SNPs (rs1884613 and rs2144908) and type 2 diabetes in the Ashkenazim (n = 991; P < 1.6 × 10⁻⁶). Importantly, these associations are significant in a subset of Ashkenazi samples (n = 531) not previously tested for association with P2 SNPs (odds ratio [OR] approximately 1.7; P < 0.002), thus providing replication within the Ashkenazim. In the U.K. populations, this association was not significant (n = 4,022; P > 0.5), and the estimate for the OR was much smaller (OR 1.04; 95% CI 0.91-1.19). CONCLUSIONS These data indicate that the risk conferred by HNF4A P2 is significantly different between U.K. and Ashkenazi populations (P < 0.00007), suggesting that the underlying causal variant remains unidentified. Interactions with other genetic or environmental factors may also contribute to this difference in risk between populations.
Affiliation(s)
- Inês Barroso
- Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK.
|
33
|
Genovese LM, Geraci F, Pellegrini M. SpeedHap: an accurate heuristic for the single individual SNP haplotyping problem with many gaps, high reading error rate and low coverage. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2008; 5:492-502. [PMID: 18989037 DOI: 10.1109/tcbb.2008.67] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 05/27/2023]
Abstract
Single nucleotide polymorphism (SNP) is the most frequent form of DNA variation. The set of SNPs present in a chromosome (called the haplotype) is of interest in a wide range of applications in molecular biology and biomedicine, including diagnostics and medical therapy. In this paper we propose a new heuristic method for the problem of haplotype reconstruction for (portions of) a pair of homologous human chromosomes from a single individual (SIH). The problem is well known in the literature, and exact algorithms have been proposed for the case when no (or few) gaps are allowed in the input fragments. These algorithms, though exact and of polynomial complexity, are slow in practice. When gaps are considered, no exact method of polynomial complexity is known. The problem is also hard to approximate with guarantees. Therefore fast heuristics have been proposed. In this paper we describe SpeedHap, a new heuristic method that is able to tackle the case of many gapped fragments and retains its effectiveness even when the input fragments have a high rate of reading errors (up to 20%) and low coverage (as low as 3). We test SpeedHap on real data from the HapMap Project.
Affiliation(s)
- Loredana M Genovese
- Institute for Informatics and Telematics, Italian National Research Council, Via G. Moruzzi 1, 56124 Pisa, Italy.
|
34
|
Abstract
The set of possible postselection genotype frequencies in an infinite, randomly mating population is found. Geometric mean heterozygote frequency divided by geometric mean homozygote frequency equals two times the geometric mean heterozygote fitness divided by geometric mean homozygote fitness. The ratio of genotype frequencies provides a measure of genetic variation that is independent of allele frequencies. When this ratio does not equal two, either selection or population structure is present. Within-population HapMap data show population-specific patterns, while pooled data show an excess of homozygotes.
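For a single biallelic locus, the "ratio equals two" statement can be checked numerically. The sketch below is only an illustration under assumed Hardy-Weinberg proportions with no selection (all fitnesses equal); it is not the author's code, and the function names are hypothetical.

```python
import math
from typing import Tuple


def hardy_weinberg(p: float) -> Tuple[float, float, float]:
    """Genotype frequencies (AA, Aa, aa) under random mating for allele frequency p."""
    q = 1.0 - p
    return p * p, 2.0 * p * q, q * q


def het_hom_ratio(f_AA: float, f_Aa: float, f_aa: float) -> float:
    """Heterozygote frequency divided by the geometric mean of the two
    homozygote frequencies (biallelic case)."""
    return f_Aa / math.sqrt(f_AA * f_aa)


# Under Hardy-Weinberg the ratio is 2pq / sqrt(p^2 q^2) = 2 for any 0 < p < 1.
ratio_hwe = het_hom_ratio(*hardy_weinberg(0.3))

# A hypothetical sample with an excess of homozygotes gives a ratio below 2.
ratio_excess_hom = het_hom_ratio(0.5, 0.2, 0.3)
```

The allele frequency cancels out of the ratio, which is why the abstract describes it as a measure independent of allele frequencies.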
|
35
|
Pattaro C, Ruczinski I, Fallin DM, Parmigiani G. Haplotype block partitioning as a tool for dimensionality reduction in SNP association studies. BMC Genomics 2008; 9:405. [PMID: 18759977 PMCID: PMC2547855 DOI: 10.1186/1471-2164-9-405] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 10/25/2007] [Accepted: 08/29/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Identification of disease-related genes in association studies is challenged by the large number of SNPs typed. To address the dilution of power caused by high dimensionality, and to generate results that are biologically interpretable, it is critical to take into consideration spatial correlation of SNPs along the genome. With the goal of identifying true genetic associations, partitioning the genome according to spatial correlation can be a powerful and meaningful way to address this dimensionality problem. RESULTS We developed and validated an MCMC Algorithm To Identify blocks of Linkage DisEquilibrium (MATILDE) for clustering contiguous SNPs, and a statistical testing framework to detect association using partitions as units of analysis. We compared its ability to detect true SNP associations to that of the most commonly used algorithm for block partitioning, as implemented in the Haploview and HapBlock software. Simulations were based on artificially assigning phenotypes to individuals with SNPs corresponding to region 14q11 of the HapMap database. When block partitioning is performed using MATILDE, the ability to correctly identify a disease SNP is higher, especially for small effects, than it is with the alternatives considered. Advantages can be both in terms of true positive findings and limiting the number of false discoveries. Finer partitions provided by LD-based methods or by marker-by-marker analysis are efficient only for detecting big effects, or in presence of large sample sizes. The probabilistic approach we propose offers several additional advantages, including: a) adapting the estimation of blocks to the population, technology, and sample size of the study; b) probabilistic assessment of uncertainty about block boundaries and about whether any two SNPs are in the same block; c) user selection of the probability threshold for assigning SNPs to the same block. 
CONCLUSION We demonstrate that, in realistic scenarios, our adaptive, study-specific block partitioning approach is at least as efficient as currently available LD-based approaches in guiding the search for disease loci.
Affiliation(s)
- Cristian Pattaro
- Unit of Genetic Epidemiology and Biostatistics, Institute of Genetic Medicine, European Academy, Viale Druso 1, I-39100, Bolzano, Italy.
|
36
|
Prasad A, Schnabel RD, McKay SD, Murdoch B, Stothard P, Kolbehdari D, Wang Z, Taylor JF, Moore SS. Linkage disequilibrium and signatures of selection on chromosomes 19 and 29 in beef and dairy cattle. Anim Genet 2008; 39:597-605. [PMID: 18717667 PMCID: PMC2659388 DOI: 10.1111/j.1365-2052.2008.01772.x] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.3] [Indexed: 01/09/2023]
Abstract
The objective of this study was to quantify the extent of linkage disequilibrium (LD) on bovine chromosomes 19 and 29 and to study the pattern of selection signatures in beef and dairy breeds (Angus and Holstein) of Bos taurus. The extent of LD was estimated for 370 and 186 single nucleotide polymorphism markers on BTA19 and 29 respectively using the square of the correlation coefficient (r²) among alleles at pairs of loci. A comparison of the extent of LD found that the decline of LD followed a similar pattern in both breeds. We observed long-range LD and found that LD dissipates to background levels at a locus separation of about 20 Mb on both chromosomes. Along each chromosome, patterns of LD were variable in both breeds. We find that a minimum of 30 000 informative and evenly spaced markers would be required for whole-genome association studies in cattle. In addition, we have identified chromosomal regions that show some evidence of selection for economically important traits in Angus and Holstein cattle. The results of this study are of importance for the design and application of association studies.
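The pairwise statistic named in this abstract, the squared correlation between alleles at two loci, can be computed directly from phased haplotypes. This is a minimal sketch under the assumption of biallelic loci coded 0/1 and known phase; the function name `r_squared` is hypothetical and the study's own estimation procedure may differ.

```python
from typing import Sequence, Tuple


def r_squared(haplotypes: Sequence[Tuple[int, int]]) -> float:
    """Squared allelic correlation between two biallelic loci.

    Each haplotype is a pair (a, b) of 0/1 allele codes at locus A and locus B.
    r^2 = D^2 / (pA (1 - pA) pB (1 - pB)), with D = pAB - pA * pB.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either locus is monomorphic")
    return d * d / denom


# Alleles always co-occur: complete LD.
complete_ld = r_squared([(1, 1), (1, 1), (0, 0), (0, 0)])
# All four haplotypes equally frequent: linkage equilibrium.
equilibrium = r_squared([(1, 1), (1, 0), (0, 1), (0, 0)])
```

Plotting such values against the physical distance between marker pairs gives the LD decay curve from which a statement like "LD dissipates to background levels at about 20 Mb" is read.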
Affiliation(s)
- A Prasad
- Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, Alberta, Canada T6G 2P5
|
37
|
Kim Y, Feng S, Zeng ZB. Measuring and partitioning the high-order linkage disequilibrium by multiple order Markov chains. Genet Epidemiol 2008; 32:301-12. [PMID: 18330903 DOI: 10.1002/gepi.20305] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 11/06/2022]
Abstract
A map of the background levels of disequilibrium between nearby markers can be useful for association mapping studies. In order to assess the background levels of linkage disequilibrium (LD), multilocus LD measures are more advantageous than pairwise LD measures because the combined analysis of pairwise LD measures is not adequate to detect simultaneous allele associations among multiple markers. Various multilocus LD measures based on haplotypes have been proposed. However, most of these measures provide a single index of association among multiple markers and do not reveal the complex patterns and different levels of LD structure. In this paper, we employ non-homogeneous, multiple order Markov Chain models as a statistical framework to measure and partition the LD among multiple markers into components due to different orders of marker associations. Using a sliding window of multiple markers on phased haplotype data, we compute corresponding likelihoods for different Markov Chain (MC) orders in each window. The log-likelihood difference between the lowest MC order model (MC0) and the highest MC order model in each window is used as a measure of the total LD or the overall deviation from the gametic equilibrium for the window. Then, we partition the total LD into lower order disequilibria and estimate the effects from two-, three-, and higher order disequilibria. The relationship between different orders of LD and the log-likelihood difference involving two different orders of MC models is explored. By applying our method to the phased haplotype data in the ENCODE regions of the HapMap project, we are able to identify high/low multilocus LD regions. Our results reveal that most of the LD in the HapMap data is attributed to the LD between adjacent pairs of markers across the whole region. LD between adjacent pairs of markers appears to be more significant in high multilocus LD regions than in low multilocus LD regions. We also find that as the multilocus total LD increases, the effects of high-order LD tend to get weaker due to the lack of observed multilocus haplotypes. The overall estimates of first, second, third, and fourth order LD across the ENCODE regions are 64, 23, 9, and 3%.
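The log-likelihood comparison described in this abstract can be illustrated for the two lowest orders. The sketch below is a simplified assumption-laden example, not the authors' method: it fits position-specific (non-homogeneous) order-0 and order-1 models by maximum likelihood to one window of phased haplotypes given as equal-length strings, and stops at order 1, whereas the paper compares MC0 with the highest order in each window. The function names are hypothetical.

```python
import math
from collections import Counter
from typing import Sequence


def loglik_mc0(haps: Sequence[str]) -> float:
    """Maximised log-likelihood when every site is independent (order 0)."""
    n = len(haps)
    total = 0.0
    for j in range(len(haps[0])):
        counts = Counter(h[j] for h in haps)
        total += sum(c * math.log(c / n) for c in counts.values())
    return total


def loglik_mc1(haps: Sequence[str]) -> float:
    """Maximised log-likelihood when each site depends on the previous site
    (order 1), with separate transition probabilities at every position."""
    n = len(haps)
    first = Counter(h[0] for h in haps)
    total = sum(c * math.log(c / n) for c in first.values())
    for j in range(1, len(haps[0])):
        pairs = Counter((h[j - 1], h[j]) for h in haps)
        prev = Counter(h[j - 1] for h in haps)
        total += sum(c * math.log(c / prev[a]) for (a, _), c in pairs.items())
    return total


def adjacent_pair_ld(haps: Sequence[str]) -> float:
    """Log-likelihood gain of order 1 over order 0 for one window."""
    return loglik_mc1(haps) - loglik_mc0(haps)


linked = adjacent_pair_ld(["00", "00", "11", "11"])       # alleles fully coupled
unlinked = adjacent_pair_ld(["00", "01", "10", "11"])     # alleles independent
```

In this toy setting the gain is zero when adjacent markers are independent and grows with the strength of their association, which mirrors the idea of attributing part of the total LD to adjacent-pair disequilibrium.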
Affiliation(s)
- Yunjung Kim
- Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina 27695-7566, USA
|
38
|
Meaburn EL, Harlaar N, Craig IW, Schalkwyk LC, Plomin R. Quantitative trait locus association scan of early reading disability and ability using pooled DNA and 100K SNP microarrays in a sample of 5760 children. Mol Psychiatry 2008; 13:729-40. [PMID: 17684495 DOI: 10.1038/sj.mp.4002063] [Citation(s) in RCA: 88] [Impact Index Per Article: 5.5] [Indexed: 11/09/2022]
Abstract
Quantitative genetic research suggests that reading disability is the quantitative extreme of the same genetic and environmental factors responsible for normal variation in reading ability. This finding warrants a quantitative trait locus (QTL) strategy that compares low versus high extremes of the normal distribution of reading in the search for QTLs associated with variation throughout the distribution. A low reading ability group (N=755) and a high reading ability group (N=747) were selected from a representative UK sample of 7-year-olds assessed on two measures of reading that we have shown to be highly heritable and highly genetically correlated. The low and high reading ability groups were each divided into 10 independent DNA pools and the 20 pools were assayed on 100K single nucleotide polymorphism (SNP) microarrays to screen for the largest allele frequency differences between the low and high reading ability groups. Seventy-five of these nominated SNPs were individually genotyped in an independent sample of low (N=452) and high (N=452) reading ability children selected from a second sample of 4258 7-year-olds. Nine of the seventy-five SNPs were nominally significant (P<0.05) in the predicted direction. These 9 SNPs and 14 other SNPs showing low versus high allele frequency differences in the predicted direction were genotyped in the rest of the second sample to test the QTL hypothesis. Ten SNPs yielded nominally significant linear associations in the expected direction across the distribution of reading ability. However, none of these SNP associations accounted for more than 0.5% of the variance of reading ability, despite 99% power to detect them. We conclude that QTL effect sizes, even for highly heritable common disorders and quantitative traits such as early reading disability and ability, might be much smaller than previously considered.
Affiliation(s)
- E L Meaburn
- Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, King's College, London, UK.
|
39
|
Abstract
Genome-wide association studies provide a new and powerful approach to investigate the effect of inherited genetic variation on the risk of human disease. These studies rely on high throughput DNA microarray technology to genotype hundreds of thousands of genetic variants across the human genome. The first genome-wide association studies have identified previously unknown genetic risk factors that influence a range of diseases, including prostate cancer, breast cancer, myocardial infarction, age-related macular degeneration, diabetes, Crohn's disease and obesity. Many more studies are currently underway, including a number that will focus on other cancers (e.g., colorectal). Here we discuss the major issues involved in conducting genome-wide association studies and how these studies can be used to examine cancer phenotypes.
Affiliation(s)
- Eric Jorgenson
- University of California, Department of Epidemiology & Biostatistics , San Francisco, CA 94143-0794, USA.
|
40
|
Zhu L, Zhang Z, Feng F, Schweitzer P, Phavaphutanon J, Vernier-Singer M, Corey E, Friedenberg S, Mateescu R, Williams A, Lust G, Acland G, Todhunter R. Single nucleotide polymorphisms refine QTL intervals for hip joint laxity in dogs. Anim Genet 2008; 39:141-6. [PMID: 18261189 DOI: 10.1111/j.1365-2052.2007.01691.x] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 12/31/2022]
Abstract
Hip laxity is one characteristic of canine hip dysplasia (CHD), an inheritable disease that leads to hip osteoarthritis. Using a genome-wide screen with 250 microsatellites in a crossbreed pedigree of 159 dysplastic Labrador retrievers and unaffected greyhounds, we previously identified putative (P < 0.01) QTL on canine chromosomes 11 and 29 (CFA11 and CFA29). To refine these QTL locations, we have genotyped 257 dogs including 105 Labrador retrievers, seven greyhounds, four generations of their crossbreed offspring and three German shepherds for 111 and 171 SNPs on CFA11 and CFA29 respectively. The distraction index (DI, a measure of maximum hip laxity) was used as an intermediate phenotype that predicts whether a hip joint will or will not develop osteoarthritis. Using a multipoint linkage analysis, significant evidence (95% posterior probability) was found for QTL contributing to hip laxity in the 16.2-21 cM region on CFA11 that explained 15-18% of the total variance in DI. Evidence for an independent QTL on CFA29 was weaker than that on CFA11. Identification of the causative mutation(s) will lead to better understanding of biochemical pathways in both dogs and humans with hip laxity and dysplasia.
Affiliation(s)
- L Zhu
- Department of Statistics, Oklahoma State University, Stillwater, OK 74078, USA
|
41
|
Forensic STRs as potential disease markers: A study of VWA and von Willebrand's Disease. Forensic Sci Int Genet 2007; 1:253-61. [DOI: 10.1016/j.fsigen.2007.06.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 02/05/2007] [Revised: 05/11/2007] [Accepted: 06/01/2007] [Indexed: 11/22/2022]
|
42
|
Linkage disequilibrium maps and location databases. Methods Mol Biol 2007. [PMID: 17984536 DOI: 10.1007/978-1-59745-389-9_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
Effective application of association mapping for complex traits requires characterization of linkage disequilibrium (LD) patterns that reflect the dominant process of recombination and its duration in addition to the more subtle influences of mutation, selection, and genetic drift. Maps expressed in linkage disequilibrium units (LDUs) reflect the influences of these factors with the use of a modified version of Malecot's isolation-by-distance model. As a result, LDU maps are analogous to linkage maps in so far as their provision of an additive metric that is related to recombination and facilitates association-mapping studies. However, unlike linkage maps, LDUs also reflect the partly cumulative effects of multiple historical bottlenecks that account for substantial variations in LD patterns between populations. This chapter provides an overview of the data requirements and methodology used to construct LDU maps, their applications outside association mapping, and their integration into location databases.
|
43
|
Abstract
Background Bovine whole genome linkage disequilibrium maps were constructed for eight breeds of cattle. These data provide fundamental information concerning bovine genome organization which will allow the design of studies to associate genetic variation with economically important traits, and also provide background information concerning the extent of long range linkage disequilibrium in cattle. Results Linkage disequilibrium was assessed using r² among all pairs of syntenic markers within eight breeds of cattle from the Bos taurus and Bos indicus subspecies. Bos taurus breeds included Angus, Charolais, Dutch Black and White Dairy, Holstein, Japanese Black and Limousin while Bos indicus breeds included Brahman and Nelore. Approximately 2670 markers spanning the entire bovine autosomal genome were used to estimate pairwise r² values. We found that the extent of linkage disequilibrium is no more than 0.5 Mb in these eight breeds of cattle. Conclusion Linkage disequilibrium in cattle has previously been reported to extend several tens of centimorgans. Our results, based on a much larger sample of marker loci and across eight breeds of cattle, indicate that in cattle linkage disequilibrium persists over much more limited distances. Our findings suggest that 30,000–50,000 loci will be needed to conduct whole genome association studies in cattle.
|
44
|
Aerts J, Megens HJ, Veenendaal T, Ovcharenko I, Crooijmans R, Gordon L, Stubbs L, Groenen M. Extent of linkage disequilibrium in chicken. Cytogenet Genome Res 2007; 117:338-45. [PMID: 17675876 DOI: 10.1159/000103196] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.1] [Received: 07/11/2006] [Accepted: 10/25/2006] [Indexed: 01/27/2023] Open
Abstract
Many of the economically important traits in chicken are multifactorial and governed by multiple genes located at different quantitative trait loci (QTLs). The optimal marker density to identify these QTLs in linkage and association studies is largely determined by the extent of linkage disequilibrium (LD) around them. In this study, we investigated the extent of LD on two chromosomes in a white layer and two broiler chicken breeds. Pairwise levels of LD were calculated for 33 and 36 markers on chromosomes 10 and 28, respectively. We found that useful LD (i.e. an r² value higher than 0.3) in Nutreco chicken breed E5 (inbred) can extend to around 1 cM on chromosomes 10 and 28, although in a second region on chromosome 28 it extends to about 2.5 cM. The extent in breed Nutreco E3 (outbred) was very short in chromosome 10 (15 kb) but very much larger on chromosome 28, particularly in one region of depressed heterozygosity. The layer breed E2 (inbred) showed an extent of useful LD up to 4 cM on chromosome 10; the extent on chromosome 28 could not be assessed due to an erratic pattern of LD on that chromosome, although in one region LD appears to be in the order of 0.8 cM. This indicates that there may be very large differences in patterns of LD between different chicken breeds and different genomic regions.
Affiliation(s)
- J Aerts
- Animal Breeding and Genetics Group, Wageningen University, Wageningen, The Netherlands
|
45
|
Angius A, Hyland FCL, Persico I, Pirastu N, Woodage T, Pirastu M, De la Vega FM. Patterns of linkage disequilibrium between SNPs in a Sardinian population isolate and the selection of markers for association studies. Hum Hered 2007; 65:9-22. [PMID: 17652959 DOI: 10.1159/000106058] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Received: 09/27/2006] [Accepted: 04/30/2007] [Indexed: 11/19/2022] Open
Abstract
OBJECTIVE In isolated populations, 'background' linkage disequilibrium (LD) has been shown to extend over large genetic distances. This and their reduced environmental and genetic heterogeneity have stimulated interest in their potential for association mapping. We compared LD unit map distances with pair-wise measurements of LD in a dense single nucleotide polymorphism (SNP) set. METHODS We genotyped 771 SNPs in an 8 Mb segment of chromosome 22 on 101 individuals from the isolated village of Talana, Sardinia, and compared the results with outbred European populations. RESULTS Heterozygosity was remarkably similar in both populations. In contrast, the extent of LD observed was quite different. The decay of LD with distance is slower in the isolate. The differences in LD map lengths suggest that useful LD extends up to three times farther in the Sardinian population; smaller differences are seen with pairwise LD metrics. While LD map length slightly decreases with average relatedness, cryptic relatedness does not explain the decrease in LD map length. Haplotypes, block boundaries, and patterns of LD are similar in both populations, suggesting a shared distribution of recombination hotspots. CONCLUSIONS About 15% fewer haplotype tagging SNPs need to be genotyped in the isolate, and possibly 70% fewer if selecting SNPs evenly spaced on the metric LD map.
|
46
|
Montpetit A, Chagnon F. [The Haplotype Map of the human genome: a revolution in the genetics of complex diseases]. Med Sci (Paris) 2007; 22:1061-7. [PMID: 17156727 DOI: 10.1051/medsci/200622121061] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022] Open
Abstract
More than 99.9% of the sequence is identical when comparing the DNA from two individuals. The remaining 0.1% is responsible, along with other factors such as the environment, for the risk level of developing complex diseases (such as asthma, diabetes or cancer) or for the different pharmacological response to drugs. Despite the incredible advances in genomics in the past few years, identifying the variants involved remains difficult because of the prodigious amount of information to process. The recent completion of the Haplotype Map of the human genome has raised great hopes in the field as it is expected to help reduce the complexity of association studies and thus accelerate the discovery of genes associated with complex diseases. This review details the rationale behind the HapMap project, gives a summary of the results and also describes potential applications of the Haplotype Map.
Affiliation(s)
- Alexandre Montpetit
- Centre d'Innovation de Génome Québec et de l'Université McGill, 740 avenue Dr Penfield, Montréal, Québec, H3A 1A4 Canada.
|
47
|
Beiraghi S, Nath SK, Gaines M, Mandhyan DD, Hutchings D, Ratnamala U, McElreavey K, Bartoloni L, Antonarakis GS, Antonarakis SE, Radhakrishna U. Autosomal dominant nonsyndromic cleft lip and palate: significant evidence of linkage at 18q21.1. Am J Hum Genet 2007; 81:180-8. [PMID: 17564975 PMCID: PMC1950911 DOI: 10.1086/518944] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 12/05/2006] [Accepted: 04/05/2007] [Indexed: 01/10/2023] Open
Abstract
Nonsyndromic cleft lip with or without cleft palate (NSCL/P) is one of the most common congenital facial defects, with an incidence of 1 in 700-1,000 live births among individuals of European descent. Several linkage and association studies of NSCL/P have suggested numerous candidate genes and genomic regions. A genomewide linkage analysis of a large multigenerational family (UR410) with NSCL/P was performed using a single-nucleotide-polymorphism array. Nonparametric linkage (NPL) analysis provided significant evidence of linkage for marker rs728683 on chromosome 18q21.1 (NPL=43.33 and P=.000061; nonparametric LOD=3.97 and P=.00001). Parametric linkage analysis with a dominant mode of inheritance and reduced penetrance resulted in a maximum LOD score of 3.61 at position 47.4 Mb on chromosome 18q21.1. Haplotype analysis with informative crossovers defined a 5.7-Mb genomic region spanned by proximal marker rs1824683 (42,403,918 bp) and distal marker rs768206 (48,132,862 bp). Thus, a novel genomic region on 18q21.1 was identified that most likely harbors a high-risk variant for NSCL/P in this family; we propose to name this locus "OFC11" (orofacial cleft 11).
Affiliation(s)
- Soraya Beiraghi
- Division of Pediatric Dentistry, University of Minnesota, Minneapolis, MN, USA
|
48
|
Evaluation of sample size effect on the identification of haplotype blocks. BMC Bioinformatics 2007; 8:200. [PMID: 17567919 PMCID: PMC1913927 DOI: 10.1186/1471-2105-8-200] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 12/28/2006] [Accepted: 06/14/2007] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Genome-wide maps of linkage disequilibrium (LD) and haplotypes have been created for different populations. Substantial sharing of the boundaries and haplotypes among populations was observed, but haplotype variations have also been reported across populations. Conflicting observations on the extent and distribution of haplotypes require careful examination. The mechanisms that shape haplotypes have not been fully explored, although the effect of sample size has been implicated. We present a close examination of the effect of sample size on haplotype blocks using an original computational simulation. RESULTS A region spanning 19.31 Mb on chromosome 20q was genotyped for 1,147 SNPs in 725 Japanese subjects. One region of 445 kb exhibiting a single strong LD value (average |D'| = 0.94) was selected for the analysis of sample size effect on haplotype structure. Three different block definitions (recombination-based, LD-based, and diversity-based) were exploited to create simulations for block identification with the theta value from real genotyping data. As a result, it was quite difficult to estimate a haplotype block for data with fewer than 200 samples. Attainment of a reliable haplotype structure with 50 samples was not possible, although the simulation was repeated 10,000 times. CONCLUSION These analyses underscored the difficulties of estimating haplotype blocks. To acquire a reliable result, it would be necessary to increase the sample size beyond 725 and to repeat the simulation 3,000 times. Even in one genomic region showing a high LD value, the haplotype block might be fragile. We emphasize the importance of applying careful confidence measures when using the estimated haplotype structure in biomedical research.
|
49
|
Maniatis N, Collins A, Morton NE. Effects of single SNPs, haplotypes, and whole-genome LD maps on accuracy of association mapping. Genet Epidemiol 2007; 31:179-88. [PMID: 17285621 DOI: 10.1002/gepi.20199] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Indexed: 11/08/2022]
Abstract
We describe an association mapping approach that utilizes linkage disequilibrium (LD) maps in LD units (LDU). This method uses composite likelihood to combine information from all single marker tests, and applies a model with a parameter for the location of the causal polymorphism. Previous analyses of the poor drug metabolizer phenotype provided evidence of the substantial utility of LDU maps for disease gene association mapping. Using LDU locations for the 27 single nucleotide polymorphisms (SNPs) flanking the CYP2D6 gene on chromosome 22, the most common functional polymorphism within the gene was located at 15 kb from its true location. Here, we examine the performance of this mapping approach by exploiting the high-density LDU map constructed from the HapMap data. Expressing the locations of the 27 SNPs in LDU from the HapMap LDU map, analysis yielded an estimated location that is only 0.3 kb away from the CYP2D6 gene. This supports the use of the high marker density HapMap-derived LDU map for association mapping even though it is derived from a much smaller number of individuals compared to the CYP2D6 sample. We also examine the performance of 2-SNP haplotypes. Using the same modelling procedures and composite likelihood as for single SNPs, the haplotype data provided much poorer localization compared to single SNP analysis. Haplotypes generate more autocorrelation through multiple inclusions of the same SNPs, which could inflate significance in association studies. The results of the present study demonstrate the great potential of the genome HapMap LDU maps for high-resolution mapping of complex phenotypes.
Affiliation(s)
- Nikolas Maniatis
- Human Genetics Division, University of Southampton, Southampton General Hospital, Southampton, UK.
|
50
|
Zhu T, Salmeron J. High-definition genome profiling for genetic marker discovery. Trends in Plant Science 2007; 12:196-202. [PMID: 17416547 DOI: 10.1016/j.tplants.2007.03.013] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 08/31/2006] [Revised: 02/06/2007] [Accepted: 03/29/2007] [Indexed: 05/04/2023]
Abstract
Genetic mapping is a key step towards isolating genes and genetic markers associated with phenotypic traits by elucidating their genetic positions. The success of this approach depends on precision in pinpointing genetic positions and the effectiveness of the discovery process. Recent advances in microarray technology and the increasing availability of genomic information have provided an opportunity to use microarrays to scan effectively for genetic variations at the whole-genome scale, enabling the production of high-definition gene-based genetic maps, in combination with functional analyses and identification of trait-associated genetic marker candidates with high precision. In this review, we discuss the concept, process, tools and applications of microarray-based high-definition genetic analysis. This post-genomics approach should help to identify causative genetic variation by uniting genetic and functional information.
Affiliation(s)
- Tong Zhu
- Syngenta Biotechnology, Inc., 3054 East Cornwallis Road, Research Triangle Park, NC 27709, USA.
|