1801
Li H, Glusman G, Huff C, Caballero J, Roach JC. Accurate and robust prediction of genetic relationship from whole-genome sequences. PLoS One 2014; 9:e85437. [PMID: 24586241 PMCID: PMC3938395 DOI: 10.1371/journal.pone.0085437] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 07/23/2013] [Accepted: 11/27/2013] [Indexed: 12/02/2022] Open
Abstract
Computing the genetic relationship between two humans is important to studies in genetics, genomics, genealogy, and forensics. Relationship algorithms may be sensitive to noise, such as that arising from sequencing errors or imperfect reference genomes. We developed an algorithm for estimation of genetic relationship by averaged blocks (GRAB) that is designed for whole-genome sequencing (WGS) data. GRAB segments the genome into blocks, calculates the fraction of blocks sharing identity, and then uses a classification tree to infer 1st- to 5th-degree relationships and unrelated individuals. We evaluated GRAB on simulated and real sequenced families, and compared it with other software. GRAB achieves similar performance, and does not require knowledge of population background or phasing. GRAB can be used in workflows for identifying unreported relationships, validating reported relationships in family-based studies, and detection of sample-tracking errors or duplicate inclusion. The software is available at familygenomics.systemsbiology.net/grab.
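The block-averaging idea described in this abstract can be illustrated with a minimal Python sketch. This is not the GRAB implementation: the block size, the identity criterion (a block counts as shared when it contains no opposite-homozygote site), and all function names are assumptions made for the example, and the classification-tree step is left out.

```python
from typing import Sequence


def block_shares_identity(g1: Sequence[int], g2: Sequence[int]) -> bool:
    """Hypothetical criterion: genotypes are alternate-allele counts (0, 1, 2);
    a block is 'shared' if no site is homozygous for opposite alleles."""
    return all(abs(a - b) < 2 for a, b in zip(g1, g2))


def fraction_of_shared_blocks(
    g1: Sequence[int], g2: Sequence[int], block_size: int
) -> float:
    """Split two aligned genotype vectors into fixed-size blocks and return
    the fraction of blocks that pass the identity criterion above."""
    if len(g1) != len(g2):
        raise ValueError("genotype vectors must have the same length")
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    if not g1:
        raise ValueError("no sites to compare")
    starts = range(0, len(g1), block_size)
    shared = sum(
        block_shares_identity(g1[s:s + block_size], g2[s:s + block_size])
        for s in starts
    )
    return shared / len(starts)


# Toy data: the second block contains an opposite-homozygote site (0 vs 2).
person_a = [0, 1, 2, 2, 0, 0, 1, 2]
person_b = [0, 1, 1, 2, 2, 0, 1, 2]
fraction = fraction_of_shared_blocks(person_a, person_b, block_size=4)
print(fraction)
```

In a real workflow the resulting fraction would be one input to a classifier; the thresholds separating relationship degrees are not given in the abstract and are deliberately not guessed here.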
Affiliation(s)
- Hong Li
- Institute for Systems Biology, Seattle, Washington, United States of America
- Gustavo Glusman
- Institute for Systems Biology, Seattle, Washington, United States of America
- Chad Huff
- Department of Epidemiology, University of Texas M. D. Anderson Cancer Center, Houston, Texas, United States of America
- Juan Caballero
- Institute for Systems Biology, Seattle, Washington, United States of America
- Jared C. Roach
- Institute for Systems Biology, Seattle, Washington, United States of America
1802
Kim K, Bang SY, Lee HS, Cho SK, Choi CB, Sung YK, Kim TH, Jun JB, Yoo DH, Kang YM, Kim SK, Suh CH, Shim SC, Lee SS, Lee J, Chung WT, Choe JY, Shin HD, Lee JY, Han BG, Nath SK, Eyre S, Bowes J, Pappas DA, Kremer JM, Gonzalez-Gay MA, Rodriguez-Rodriguez L, Ärlestig L, Okada Y, Diogo D, Liao KP, Karlson EW, Raychaudhuri S, Rantapää-Dahlqvist S, Martin J, Klareskog L, Padyukov L, Gregersen PK, Worthington J, Greenberg JD, Plenge RM, Bae SC. High-density genotyping of immune loci in Koreans and Europeans identifies eight new rheumatoid arthritis risk loci. Ann Rheum Dis 2014; 74:e13. [PMID: 24532676 DOI: 10.1136/annrheumdis-2013-204749] [Citation(s) in RCA: 90] [Impact Index Per Article: 9.0] [Indexed: 12/29/2022]
Abstract
OBJECTIVE A highly polygenic aetiology and high degree of allele-sharing between ancestries have been well elucidated in genetic studies of rheumatoid arthritis. Recently, the high-density genotyping array Immunochip for immune disease loci identified 14 new rheumatoid arthritis risk loci among individuals of European ancestry. Here, we aimed to identify new rheumatoid arthritis risk loci using Korean-specific Immunochip data. METHODS We analysed Korean rheumatoid arthritis case-control samples using the Immunochip and genome-wide association studies (GWAS) array to search for new risk alleles of rheumatoid arthritis with anticitrullinated peptide antibodies. To increase power, we performed a meta-analysis of Korean data with previously published European Immunochip and GWAS data for a total sample size of 9299 Korean and 45,790 European case-control samples. RESULTS We identified eight new rheumatoid arthritis susceptibility loci (TNFSF4, LBH, EOMES, ETS1-FLI1, COG6, RAD51B, UBASH3A and SYNGR1) that passed a genome-wide significance threshold (p<5×10⁻⁸), with evidence for three independent risk alleles at 1q25/TNFSF4. The risk alleles from the seven new loci except for the TNFSF4 locus (monomorphic in Koreans), together with risk alleles from previously established RA risk loci, exhibited a high correlation of effect sizes between ancestries. Further, we refined the number of single nucleotide polymorphisms (SNPs) that represent potentially causal variants through a trans-ethnic comparison of densely genotyped SNPs. CONCLUSIONS This study demonstrates the advantage of dense-mapping and trans-ancestral analysis for identification of potentially causal SNPs. In addition, our findings support the importance of T cells in the pathogenesis and the fact of frequent overlap of risk loci among diverse autoimmune diseases.
Affiliation(s)
- Kwangwoo Kim
- Department of Rheumatology, Hanyang University Hospital for Rheumatic Diseases, Seoul, Republic of Korea; Division of Rheumatology, Immunology and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA; Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA; Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, USA
- So-Young Bang
- Department of Rheumatology, Hanyang University Hospital for Rheumatic Diseases, Seoul, Republic of Korea
- Hye-Soon Lee
- Department of Rheumatology, Hanyang University Hospital for Rheumatic Diseases, Seoul, Republic of Korea
- Soo-Kyung Cho
- Department of Rheumatology, Hanyang University Hospital for Rheumatic Diseases, Seoul, Republic of Korea
- Chan-Bum Choi
- Department of Rheumatology, Hanyang University Hospital for Rheumatic Diseases, Seoul, Republic of Korea
- Yoon-Kyoung Sung
- Department of Rheumatology, Hanyang University Hospital for Rheumatic Diseases, Seoul, Republic of Korea
- Tae-Hwan Kim
- Department of Rheumatology, Hanyang University Hospital for Rheumatic Diseases, Seoul, Republic of Korea
- Jae-Bum Jun
- Department of Rheumatology, Hanyang University Hospital for Rheumatic Diseases, Seoul, Republic of Korea
- Dae Hyun Yoo
- Department of Rheumatology, Hanyang University Hospital for Rheumatic Diseases, Seoul, Republic of Korea
- Young Mo Kang
- Division of Rheumatology, Department of Internal Medicine, Kyungpook National University School of Medicine, Daegu, Republic of Korea
- Seong-Kyu Kim
- Division of Rheumatology, Department of Internal Medicine, Arthritis & Autoimmunity Research Center, Catholic University of Daegu School of Medicine, Daegu, Republic of Korea
- Chang-Hee Suh
- Department of Rheumatology, Ajou University School of Medicine, Suwon, Republic of Korea
- Seung-Cheol Shim
- Division of Rheumatology, Daejeon Rheumatoid & Degenerative Arthritis Center, Chungnam National University Hospital, Daejeon, Republic of Korea
- Shin-Seok Lee
- Division of Rheumatology, Department of Internal Medicine, Chonnam National University Medical School and Hospital, Gwangju, Republic of Korea
- Jisoo Lee
- Division of Rheumatology, Department of Internal Medicine, Ewha Womans University School of Medicine, Seoul, Republic of Korea
- Won Tae Chung
- Division of Rheumatology, Department of Internal Medicine, Dong-A University, Busan, Republic of Korea
- Jung-Yoon Choe
- Division of Rheumatology, Department of Internal Medicine, Arthritis & Autoimmunity Research Center, Catholic University of Daegu School of Medicine, Daegu, Republic of Korea
- Hyoung Doo Shin
- Department of Life Science, Sogang University, Seoul, Republic of Korea
- Jong-Young Lee
- Center for Genome Science, Korea National Institute of Health, Osong Health Technology, Chungcheongbuk-do, Republic of Korea
- Bok-Ghee Han
- Center for Genome Science, Korea National Institute of Health, Osong Health Technology, Chungcheongbuk-do, Republic of Korea
- Swapan K Nath
- Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, Oklahoma, USA
- Steve Eyre
- Arthritis Research UK Epidemiology Unit, University of Manchester, Manchester Academic Health Sciences Centre, Manchester, UK
- John Bowes
- Arthritis Research UK Epidemiology Unit, University of Manchester, Manchester Academic Health Sciences Centre, Manchester, UK
- Dimitrios A Pappas
- Department of Medicine, Division of Rheumatology, Columbia University, New York, New York, USA
- Miguel A Gonzalez-Gay
- Department of Rheumatology, Hospital Marques de Valdecilla, IFIMAV, Santander, Spain
- Lisbeth Ärlestig
- Department of Clinical Medicine/Rheumatology, Umeå University, Umeå, Sweden
- Yukinori Okada
- Division of Rheumatology, Immunology and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA; Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA; Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, USA; Department of Human Genetics and Disease Diversity, Graduate School of Medical and Dental Sciences, Tokyo Medical and Dental University, Tokyo, Japan; Laboratory for Statistical Analysis, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Dorothée Diogo
- Division of Rheumatology, Immunology and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA; Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA; Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, USA
- Katherine P Liao
- Division of Rheumatology, Immunology and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Elizabeth W Karlson
- Division of Rheumatology, Immunology and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA
- Soumya Raychaudhuri
- Division of Rheumatology, Immunology and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA; Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA; Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, USA; Arthritis Research UK Epidemiology Unit, University of Manchester, Manchester Academic Health Sciences Centre, Manchester, UK
- Javier Martin
- Instituto de Parasitologia y Biomedicina Lopez-Neyra, CSIC, Granada, Spain
- Lars Klareskog
- Rheumatology Unit, Department of Medicine, Karolinska Institutet and Karolinska University Hospital Solna, Stockholm, Sweden
- Leonid Padyukov
- Rheumatology Unit, Department of Medicine, Karolinska Institutet and Karolinska University Hospital Solna, Stockholm, Sweden
- Peter K Gregersen
- The Feinstein Institute for Medical Research, North Shore-Long Island Jewish Health System, Manhasset, New York, USA
- Jane Worthington
- Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation, Oklahoma City, Oklahoma, USA
- Jeffrey D Greenberg
- Division of Rheumatology, New York University School of Medicine, New York, New York, USA
- Robert M Plenge
- Division of Rheumatology, Immunology and Allergy, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA; Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, USA; Program in Medical and Population Genetics, Broad Institute, Cambridge, Massachusetts, USA
- Sang-Cheol Bae
- Department of Rheumatology, Hanyang University Hospital for Rheumatic Diseases, Seoul, Republic of Korea
1803
Pierron D, Razafindrazaka H, Pagani L, Ricaut FX, Antao T, Capredon M, Sambo C, Radimilahy C, Rakotoarisoa JA, Blench RM, Letellier T, Kivisild T. Genome-wide evidence of Austronesian-Bantu admixture and cultural reversion in a hunter-gatherer group of Madagascar. Proc Natl Acad Sci U S A 2014; 111:936-41. [PMID: 24395773 PMCID: PMC3903192 DOI: 10.1073/pnas.1321860111] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.6] [Indexed: 12/29/2022] Open
Abstract
Linguistic and cultural evidence suggest that Madagascar was the final point of two major dispersals of Austronesian- and Bantu-speaking populations. Today, the Mikea are described as the last-known Malagasy population reported to be still practicing a hunter-gatherer lifestyle. It is unclear, however, whether the Mikea descend from a remnant population that existed before the arrival of Austronesian and Bantu agriculturalists or whether it is only their lifestyle that separates them from the other contemporary populations of South Madagascar. To address these questions we have performed a genome-wide analysis of >700,000 SNP markers on 21 Mikea, 24 Vezo, and 24 Temoro individuals, together with 50 individuals from Bajo and Lebbo populations from Indonesia. Our analyses of these data in the context of data available from other Southeast Asian and African populations reveal that all three Malagasy populations are derived from the same admixture event involving Austronesian and Bantu sources. In contrast to the fact that most of the vocabulary of the Malagasy speakers is derived from the Barito group of the Austronesian language family, we observe that only one-third of their genetic ancestry is related to the populations of the Java-Kalimantan-Sulawesi area. Because no additional ancestry components distinctive for the Mikea were found, it is likely that they have adopted their hunter-gatherer way of life through cultural reversion, and selection signals suggest a genetic adaptation to their new lifestyle.
Affiliation(s)
- Denis Pierron
- Laboratoire d'Anthropologie Moléculaire et Imagerie de Synthèse, Unité Mixte de Recherche 5288, Centre National de la Recherche Scientifique, Université de Toulouse, 31073 Toulouse, France
- Plateforme Technologique d'Innovation Biomédicale, Institut National de la Santé et de la Recherche Médicale, 33600 Pessac, France
- Harilanto Razafindrazaka
- Laboratoire d'Anthropologie Moléculaire et Imagerie de Synthèse, Unité Mixte de Recherche 5288, Centre National de la Recherche Scientifique, Université de Toulouse, 31073 Toulouse, France
- Plateforme Technologique d'Innovation Biomédicale, Institut National de la Santé et de la Recherche Médicale, 33600 Pessac, France
- Luca Pagani
- Division of Biological Anthropology, University of Cambridge, Cambridge CB2 3DZ, United Kingdom
- François-Xavier Ricaut
- Laboratoire d'Anthropologie Moléculaire et Imagerie de Synthèse, Unité Mixte de Recherche 5288, Centre National de la Recherche Scientifique, Université de Toulouse, 31073 Toulouse, France
- Tiago Antao
- Division of Biological Anthropology, University of Cambridge, Cambridge CB2 3DZ, United Kingdom
- Mélanie Capredon
- Laboratoire d'Anthropologie Moléculaire et Imagerie de Synthèse, Unité Mixte de Recherche 5288, Centre National de la Recherche Scientifique, Université de Toulouse, 31073 Toulouse, France
- Clément Sambo
- Ecole Normale Supérieure, Université de Toliara, Toliara 601, Madagascar
- Chantal Radimilahy
- Institut de Civilisations/Musée d'Art et d'Archéologie, Isoraka, Antananarivo 101, Madagascar
- Jean-Aimé Rakotoarisoa
- Institut de Civilisations/Musée d'Art et d'Archéologie, Isoraka, Antananarivo 101, Madagascar
- Roger M. Blench
- Kay Williamson Educational Foundation, Cambridge CB1 2AL, United Kingdom
- Thierry Letellier
- Laboratoire d'Anthropologie Moléculaire et Imagerie de Synthèse, Unité Mixte de Recherche 5288, Centre National de la Recherche Scientifique, Université de Toulouse, 31073 Toulouse, France
- Plateforme Technologique d'Innovation Biomédicale, Institut National de la Santé et de la Recherche Médicale, 33600 Pessac, France
- Toomas Kivisild
- Division of Biological Anthropology, University of Cambridge, Cambridge CB2 3DZ, United Kingdom
1804
Lee YW, Gould BA, Stinchcombe JR. Identifying the genes underlying quantitative traits: a rationale for the QTN programme. AOB PLANTS 2014; 6:plu004. [PMID: 24790125 PMCID: PMC4038433 DOI: 10.1093/aobpla/plu004] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 10/27/2013] [Accepted: 01/01/2014] [Indexed: 05/19/2023]
Abstract
The goal of identifying the genes or even nucleotides underlying quantitative and adaptive traits has been characterized as the 'QTN programme' and has recently come under severe criticism. Part of the reason for this criticism is that much of the QTN programme has asserted that finding the genes and nucleotides for adaptive and quantitative traits is a fundamental goal, without explaining why it is such a hallowed goal. Here we outline motivations for the QTN programme that offer general insight, regardless of whether QTNs are of large or small effect, and that aid our understanding of the mechanistic dynamics of adaptive evolution. We focus on five areas: (i) vertical integration of insight across different levels of biological organization, (ii) genetic parallelism and the role of pleiotropy in shaping evolutionary dynamics, (iii) understanding the forces maintaining genetic variation in populations, (iv) distinguishing between adaptation from standing variation and new mutation, and (v) the role of genomic architecture in facilitating adaptation. We argue that rather than abandoning the QTN programme, we should refocus our efforts on topics where molecular data will be the most effective for testing hypotheses about phenotypic evolution.
Affiliation(s)
- Young Wha Lee
- Department of Ecology & Evolutionary Biology, University of Toronto, Toronto, ON, Canada M5S 3B2
- Billie A. Gould
- Department of Ecology & Evolutionary Biology, University of Toronto, Toronto, ON, Canada M5S 3B2
- John R. Stinchcombe
- Department of Ecology & Evolutionary Biology, University of Toronto, Toronto, ON, Canada M5S 3B2
- Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, ON, Canada M5S 3B2
1805
Benton MC, Lea RA, Macartney-Coxson D, Carless MA, Göring HH, Bellis C, Hanna M, Eccles D, Chambers GK, Curran JE, Harper JL, Blangero J, Griffiths LR. Mapping eQTLs in the Norfolk Island genetic isolate identifies candidate genes for CVD risk traits. Am J Hum Genet 2013; 93:1087-99. [PMID: 24314549 DOI: 10.1016/j.ajhg.2013.11.004] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 07/12/2013] [Revised: 10/29/2013] [Accepted: 11/07/2013] [Indexed: 02/06/2023] Open
Abstract
Cardiovascular disease (CVD) affects millions of people worldwide and is influenced by numerous factors, including lifestyle and genetics. Expression quantitative trait loci (eQTLs) influence gene expression and are good candidates for CVD risk. Founder-effect pedigrees can provide additional power to map genes associated with disease risk. Therefore, we identified eQTLs in the genetic isolate of Norfolk Island (NI) and tested for associations between these and CVD risk factors. We measured genome-wide transcript levels of blood lymphocytes in 330 individuals and used pedigree-based heritability analysis to identify heritable transcripts. eQTLs were identified by genome-wide association testing of these transcripts. Testing for association between CVD risk factors (i.e., blood lipids, blood pressure, and body fat indices) and eQTLs revealed 1,712 heritable transcripts (p < 0.05) with heritability values ranging from 0.18 to 0.84. From these, we identified 200 cis-acting and 70 trans-acting eQTLs (p < 1.84 × 10⁻⁷). An eQTL-centric analysis of CVD risk traits revealed multiple associations, including 12 previously associated with CVD-related traits. Trait versus eQTL regression modeling identified four CVD risk candidates (NAAA, PAPSS1, NME1, and PRDX1), all of which have known biological roles in disease. In addition, we implicated several genes previously associated with CVD risk traits, including MTHFR and FN3KRP. We have successfully identified a panel of eQTLs in the NI pedigree and used this to implicate several genes in CVD risk. Future studies are required for further assessing the functional importance of these eQTLs and whether the findings here also relate to outbred populations.
Affiliation(s)
- Miles C Benton
- Genomics Research Centre, Griffith Health Institute, Griffith University, Southport, QLD 4222, Australia
1806
Recessive mutations in a distal PTF1A enhancer cause isolated pancreatic agenesis. Nat Genet 2013; 46:61-64. [PMID: 24212882 PMCID: PMC4131753 DOI: 10.1038/ng.2826] [Citation(s) in RCA: 196] [Impact Index Per Article: 17.8] [Received: 07/12/2013] [Accepted: 10/16/2013] [Indexed: 12/19/2022]
Abstract
The contribution of cis-regulatory mutations to human disease remains poorly understood. Whole genome sequencing can identify all non-coding variants, yet discrimination of causal regulatory mutations represents a formidable challenge. We used epigenomic annotation in hESC-derived embryonic pancreatic progenitor cells to guide the interpretation of whole genome sequences from patients with isolated pancreatic agenesis. This uncovered six different recessive mutations in a previously uncharacterized ~400bp sequence located 25kb downstream of PTF1A (pancreas-specific transcription factor 1a) in ten families with pancreatic agenesis. We show that this region acts as a developmental enhancer of PTF1A and that the mutations abolish enhancer activity. These mutations are the most common cause of isolated pancreatic agenesis. Integrating genome sequencing and epigenomic annotation in a disease-relevant cell type can uncover novel non-coding elements underlying human development and disease.
1807
Moltke I, Albrechtsen A. RelateAdmix: a software tool for estimating relatedness between admixed individuals. Bioinformatics 2013; 30:1027-8. [PMID: 24215025 DOI: 10.1093/bioinformatics/btt652] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.7] [Indexed: 11/12/2022]
Abstract
MOTIVATION Pairwise relatedness plays an important role in a range of genetic research fields. However, currently only few estimators exist for individuals that are admixed, i.e. have ancestry from more than one population, and these estimators fail in some situations. RESULTS We present a new software tool, RelateAdmix, for obtaining maximum likelihood estimates of pairwise relatedness from genetic data between admixed individuals. We show using simulated data that it gives rise to better estimates than three state-of-the-art software tools, REAP, KING and Plink, while still being fast enough to be applicable to large datasets. AVAILABILITY AND IMPLEMENTATION The software tool, implemented in C and R, is freely available from www.popgen.dk/software.
Affiliation(s)
- Ida Moltke
- Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA and Department of Biology, The Bioinformatics Centre, University of Copenhagen, 2200 Copenhagen N, Denmark
1808
Ridge PG, Mukherjee S, Crane PK, Kauwe JSK. Alzheimer's disease: analyzing the missing heritability. PLoS One 2013; 8:e79771. [PMID: 24244562 PMCID: PMC3820606 DOI: 10.1371/journal.pone.0079771] [Citation(s) in RCA: 210] [Impact Index Per Article: 19.1] [Received: 07/17/2013] [Accepted: 09/27/2013] [Indexed: 12/17/2022] Open
Abstract
Alzheimer's disease (AD) is a complex disorder influenced by environmental and genetic factors. Recent work has identified 11 AD markers in 10 loci. We used Genome-wide Complex Trait Analysis to analyze >2 million SNPs for 10,922 individuals from the Alzheimer's Disease Genetics Consortium to assess the phenotypic variance explained first by known late-onset AD loci, and then by all SNPs in the Alzheimer's Disease Genetics Consortium dataset. In all, 33% of total phenotypic variance is explained by all common SNPs. APOE alone explained 6% and other known markers 2%, meaning more than 25% of phenotypic variance remains unexplained by known markers, but is tagged by common SNPs included on genotyping arrays or imputed with HapMap genotypes. Novel AD markers that explain large amounts of phenotypic variance are likely to be rare and unidentifiable using genome-wide association studies. Based on our findings and the current direction of human genetics research, we suggest specific study designs for future studies to identify the remaining heritability of Alzheimer's disease.
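The variance partition quoted in this abstract can be checked with simple arithmetic. The short Python sketch below uses only the rounded percentages stated in the text; the variable names are illustrative, and because the inputs are rounded the result is only approximate (the abstract itself says "more than 25%").

```python
# Rounded percentages of total phenotypic variance, as quoted in the abstract.
explained_by_all_common_snps = 33      # all common SNPs together
explained_by_apoe = 6                  # APOE alone
explained_by_other_known_markers = 2   # other known AD markers

# Variance attributable to markers that were already known.
explained_by_known_markers = explained_by_apoe + explained_by_other_known_markers

# Variance tagged by common SNPs but not accounted for by known markers.
tagged_but_unexplained = explained_by_all_common_snps - explained_by_known_markers

print(explained_by_known_markers, tagged_but_unexplained)
```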
Affiliation(s)
- Perry G. Ridge
- Department of Biology, Brigham Young University, Provo, Utah, United States of America
- ARUP Institute for Clinical and Experimental Pathology, Salt Lake City, Utah, United States of America
- Shubhabrata Mukherjee
- Department of Medicine, University of Washington, Seattle, Washington, United States of America
- Paul K. Crane
- Department of Medicine, University of Washington, Seattle, Washington, United States of America
- John S. K. Kauwe
- Department of Biology, Brigham Young University, Provo, Utah, United States of America
1809
Christie MR. Bayesian parentage analysis reliably controls the number of false assignments in natural populations. Mol Ecol 2013; 22:5731-7. [DOI: 10.1111/mec.12528] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 01/25/2013] [Revised: 03/19/2013] [Accepted: 03/25/2013] [Indexed: 11/27/2022]
Affiliation(s)
- Mark R. Christie
- Department of Zoology; Oregon State University; Corvallis OR 97331-2914 USA
1810
A genome-wide association study of chronic otitis media with effusion and recurrent otitis media identifies a novel susceptibility locus on chromosome 2. J Assoc Res Otolaryngol 2013; 14:791-800. [PMID: 23974705 DOI: 10.1007/s10162-013-0411-2] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Received: 10/24/2012] [Accepted: 08/04/2013] [Indexed: 01/13/2023] Open
Abstract
Chronic otitis media with effusion (COME) and recurrent otitis media (ROM) have been shown to be heritable, but candidate gene and linkage studies to date have been equivocal. Our aim was to identify genetic susceptibility factors using a genome-wide association study (GWAS). We genotyped 602 subjects from 143 families with 373 COME/ROM subjects using the Illumina Human CNV370-Duo DNA Bead Chip (324,748 SNPs). We carried out the GWAS scan and imputed SNPs at the regions with the most significant associations. Replication genotyping in an independent family-based sample was conducted for 53 SNPs: the 41 most significant SNPs with P < 10⁻⁴ and 12 imputed SNPs with P < 10⁻⁴ on chromosome 15 (near the strongest signal). We replicated the association of rs10497394 (GWAS discovery P = 1.30 × 10⁻⁵) on chromosome 2 in the independent otitis media population (P = 4.7 × 10⁻⁵; meta-analysis P = 1.52 × 10⁻⁸). Three additional SNPs had replication P values < 0.10. Two were on chromosome 15q26.1 including rs1110060, the strongest association with COME/ROM in the primary GWAS (P = 3.4 × 10⁻⁷) in KIF7 intron 7 (P = 0.072), and rs10775247, a non-synonymous SNP in TICRR exon 2 (P = 0.075). The third SNP rs386057 was on chromosome 5 in TPPP intron 1 (P = 0.045). We have performed the first GWAS of COME/ROM and have identified a SNP, rs10497394 on chromosome 2, that is significantly associated with COME/ROM susceptibility. This SNP is within a 537 kb intergenic region, bordered by CDCA7 and SP3. The genomic and functional significance of this newly identified locus in COME/ROM pathogenesis requires additional investigation.
1811
Wells QS, Becker JR, Su YR, Mosley JD, Weeke P, D'Aoust L, Ausborn NL, Ramirez AH, Pfotenhauer JP, Naftilan AJ, Markham L, Exil V, Roden DM, Hong CC. Whole exome sequencing identifies a causal RBM20 mutation in a large pedigree with familial dilated cardiomyopathy. Circ Cardiovasc Genet 2013; 6:317-26. [PMID: 23861363 DOI: 10.1161/circgenetics.113.000011] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.5] [Indexed: 01/11/2023]
Abstract
BACKGROUND Whole exome sequencing is a powerful technique for Mendelian disease gene discovery. However, variant prioritization remains a challenge. We applied whole exome sequencing to identify the causal variant in a large family with familial dilated cardiomyopathy of unknown pathogenesis. METHODS AND RESULTS A large family with autosomal dominant, familial dilated cardiomyopathy was identified. Exome capture and sequencing were performed in 3 remotely related, affected subjects predicted to share <0.1% of their genomes by descent. Shared variants were filtered for rarity, evolutionary conservation, and predicted functional significance, and remaining variants were filtered against 71 locally generated exomes. Variants were also prioritized using the Variant Annotation Analysis and Search Tool. Final candidates were validated by Sanger sequencing and tested for segregation. There were 664 shared heterozygous nonsense, missense, or splice site variants, of which 26 were rare (minor allele frequency ≤0.001 or not reported) in 2 public databases. Filtering against internal exomes reduced the number of candidates to 2, and of these, a single variant (c.1907 G>A) in RBM20, segregated with disease status and was absent in unaffected internal reference exomes. Bioinformatic prioritization with Variant Annotation Analysis and Search Tool supported this result. CONCLUSIONS Whole exome sequencing of remotely related dilated cardiomyopathy subjects from a large, multiplex family, followed by systematic filtering, identified a causal RBM20 mutation without the need for linkage analysis.
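The successive filtering steps described in this abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' pipeline: the `Variant` record, its field names, the toy variants, and the use of a single summary allele frequency (rather than two separate public databases) are all hypothetical. Only the functional classes and the frequency cutoff (minor allele frequency ≤0.001 or not reported) come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional, Set

# Functional classes and rarity cutoff as quoted in the abstract.
FUNCTIONAL_EFFECTS = {"nonsense", "missense", "splice_site"}
MAF_CUTOFF = 0.001


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one variant shared by the sequenced subjects."""
    name: str
    effect: str
    heterozygous_in_all_affected: bool
    public_maf: Optional[float]  # None means "not reported" publicly


def is_rare(variant: Variant) -> bool:
    """Rare = not reported, or minor allele frequency at or below the cutoff."""
    return variant.public_maf is None or variant.public_maf <= MAF_CUTOFF


def filter_candidates(
    variants: List[Variant], internal_exome_variants: Set[str]
) -> List[Variant]:
    """Keep shared heterozygous functional variants that are rare publicly
    and absent from the locally generated reference exomes."""
    return [
        v for v in variants
        if v.heterozygous_in_all_affected
        and v.effect in FUNCTIONAL_EFFECTS
        and is_rare(v)
        and v.name not in internal_exome_variants
    ]


# Toy input; names and frequencies are made up for the example.
shared = [
    Variant("var_A", "missense", True, None),
    Variant("var_B", "missense", True, 0.05),      # too common
    Variant("var_C", "synonymous", True, None),    # not a functional class
    Variant("var_D", "nonsense", True, 0.0005),    # seen in internal exomes
]
candidates = filter_candidates(shared, internal_exome_variants={"var_D"})
print([v.name for v in candidates])
```

Surviving candidates would still need the validation and segregation testing that the abstract describes; those steps are outside this sketch.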
Affiliation(s)
- Quinn S Wells
- Division of Cardiovascular Medicine, Vanderbilt University School of Medicine, Nashville, TN, USA
1812
Lee S, Teslovich T, Boehnke M, Lin X. General framework for meta-analysis of rare variants in sequencing association studies. Am J Hum Genet 2013; 93:42-53. [PMID: 23768515 PMCID: PMC3710762 DOI: 10.1016/j.ajhg.2013.05.010] [Citation(s) in RCA: 169] [Impact Index Per Article: 15.4] [Received: 02/04/2013] [Revised: 04/19/2013] [Accepted: 05/14/2013] [Indexed: 12/22/2022] Open
Abstract
We propose a general statistical framework for meta-analysis of gene- or region-based multimarker rare variant association tests in sequencing association studies. In genome-wide association studies, single-marker meta-analysis has been widely used to increase statistical power by combining results via regression coefficients and standard errors from different studies. In analysis of rare variants in sequencing studies, region-based multimarker tests are often used to increase power. We propose meta-analysis methods for commonly used gene- or region-based rare variants tests, such as burden tests and variance component tests. Because estimation of regression coefficients of individual rare variants is often unstable or not feasible, the proposed method avoids this difficulty by calculating score statistics instead that only require fitting the null model for each study and then aggregating these score statistics across studies. Our proposed meta-analysis rare variant association tests are conducted based on study-specific summary statistics, specifically score statistics for each variant and between-variant covariance-type (linkage disequilibrium) relationship statistics for each gene or region. The proposed methods are able to incorporate different levels of heterogeneity of genetic effects across studies and are applicable to meta-analysis of multiple ancestry groups. We show that the proposed methods are essentially as powerful as joint analysis by directly pooling individual level genotype data. We conduct extensive simulations to evaluate the performance of our methods by varying levels of heterogeneity across studies, and we apply the proposed methods to meta-analysis of rare variant effects in a multicohort study of the genetics of blood lipid levels.
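The core idea in this abstract, aggregating per-study score statistics and between-variant covariance summaries instead of regression coefficients, can be sketched in Python. The example below is a minimal fixed-effects, burden-type statistic assuming homogeneous genetic effects across studies; it is not the authors' software, the function names and toy numbers are hypothetical, and the heterogeneity models and variance component tests mentioned in the abstract are not covered.

```python
import math
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def combine_studies(
    scores: List[Vector], covariances: List[Matrix]
) -> Tuple[Vector, Matrix]:
    """Sum per-variant score statistics and their covariance matrices
    across studies (all studies must list the same variants in order)."""
    m = len(scores[0])
    total_score = [sum(s[j] for s in scores) for j in range(m)]
    total_cov = [
        [sum(c[i][j] for c in covariances) for j in range(m)]
        for i in range(m)
    ]
    return total_score, total_cov


def burden_meta_test(
    scores: List[Vector], covariances: List[Matrix], weights: Vector
) -> Tuple[float, float]:
    """Burden-type statistic Q = (w'S)^2 / (w' Phi w), referred to a
    chi-square distribution with 1 degree of freedom under the null."""
    total_score, total_cov = combine_studies(scores, covariances)
    m = len(weights)
    weighted_score = sum(w * s for w, s in zip(weights, total_score))
    variance = sum(
        weights[i] * total_cov[i][j] * weights[j]
        for i in range(m)
        for j in range(m)
    )
    q = weighted_score ** 2 / variance
    # Upper tail of chi-square(1): P(X > q) = erfc(sqrt(q / 2)).
    p_value = math.erfc(math.sqrt(q / 2.0))
    return q, p_value


# Toy summary statistics for two studies and two variants (made-up numbers).
study_scores = [[1.0, 2.0], [0.5, 1.5]]
study_covs = [[[2.0, 0.5], [0.5, 1.0]], [[1.0, 0.0], [0.0, 2.0]]]
q_stat, p_val = burden_meta_test(study_scores, study_covs, weights=[1.0, 1.0])
print(q_stat, p_val)
```

Because only the null model is fitted in each study, the per-study inputs here are score statistics and covariance summaries, which is what lets the combination step avoid unstable per-variant coefficient estimates.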
Affiliation(s)
- Seunggeun Lee
  - Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA
- Tanya M. Teslovich
  - Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
- Michael Boehnke
  - Department of Biostatistics and Center for Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA
- Xihong Lin
  - Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA
|
1813
|
Fedorova SA, Reidla M, Metspalu E, Metspalu M, Rootsi S, Tambets K, Trofimova N, Zhadanov SI, Kashani BH, Olivieri A, Voevoda MI, Osipova LP, Platonov FA, Tomsky MI, Khusnutdinova EK, Torroni A, Villems R. Autosomal and uniparental portraits of the native populations of Sakha (Yakutia): implications for the peopling of Northeast Eurasia. BMC Evol Biol 2013; 13:127. [PMID: 23782551 PMCID: PMC3695835 DOI: 10.1186/1471-2148-13-127] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.0] [Received: 02/08/2013] [Accepted: 06/10/2013] [Indexed: 01/20/2023] Open
Abstract
BACKGROUND Sakha--an area connecting South and Northeast Siberia--is significant for understanding the history of peopling of Northeast Eurasia and the Americas. Previous studies have shown a genetic contiguity between Siberia and East Asia and the key role of South Siberia in the colonization of Siberia. RESULTS We report the results of a high-resolution phylogenetic analysis of 701 mtDNAs and 318 Y chromosomes from five native populations of Sakha (Yakuts, Evenks, Evens, Yukaghirs and Dolgans) and of the analysis of more than 500,000 autosomal SNPs of 758 individuals from 55 populations, including 40 previously unpublished samples from Siberia. Phylogenetically terminal clades of East Asian mtDNA haplogroups C and D and Y-chromosome haplogroups N1c, N1b and C3, constituting the core of the gene pool of the native populations from Sakha, connect Sakha and South Siberia. Analysis of autosomal SNP data confirms the genetic continuity between Sakha and South Siberia. Maternal lineages D5a2a2, C4a1c, C4a2, C5b1b and the Yakut-specific STR sub-clade of Y-chromosome haplogroup N1c can be linked to a migration of Yakut ancestors, while the paternal lineage C3c was most likely carried to Sakha by the expansion of the Tungusic people. MtDNA haplogroups Z1a1b and Z1a3, present in Yukaghirs, Evens and Dolgans, show traces of different and probably more ancient migration(s). Analysis of both haploid loci and autosomal SNP data revealed only minor genetic components shared between Sakha and the extreme Northeast Siberia. Although the major part of West Eurasian maternal and paternal lineages in Sakha could originate from recent admixture with East Europeans, mtDNA haplogroups H8, H20a and HV1a1a, as well as Y-chromosome haplogroup J, more probably reflect an ancient gene flow from West Eurasia through Central Asia and South Siberia. 
CONCLUSIONS Our high-resolution phylogenetic dissection of mtDNA and Y-chromosome haplogroups as well as analysis of autosomal SNP data suggests that Sakha was colonized by repeated expansions from South Siberia with minor gene flow from the Lower Amur/Southern Okhotsk region and/or Kamchatka. The minor West Eurasian component in Sakha attests to both recent and ongoing admixture with East Europeans and an ancient gene flow from West Eurasia.
Affiliation(s)
- Sardana A Fedorova
  - Department of Molecular Genetics, Yakut Research Center of Complex Medical Problems, Russian Academy of Medical Sciences and North-Eastern Federal University, Yakutsk, Russia
  - Department of Evolutionary Biology, University of Tartu, Tartu, Estonia
- Maere Reidla
  - Department of Evolutionary Biology, University of Tartu, Tartu, Estonia
- Ene Metspalu
  - Department of Evolutionary Biology, University of Tartu, Tartu, Estonia
- Natalya Trofimova
  - Institute of Biochemistry and Genetics, Ufa Scientific Center, Russian Academy of Sciences, Ufa, Russia
- Sergey I Zhadanov
  - Department of Anthropology, University of Pennsylvania, Philadelphia, USA
- Anna Olivieri
  - Dipartimento di Biologia e Biotecnologie, Università di Pavia, Pavia, Italy
- Mikhail I Voevoda
  - Institute of Internal Medicine, Siberian Branch of Russian Academy of Medical Sciences, Novosibirsk, Russia
- Ludmila P Osipova
  - Institute of Genetics and Cytology, Siberian Branch of Russian Academy of Sciences, Novosibirsk, Russia
- Fedor A Platonov
  - Institute of Health, North-East Federal University, Yakutsk, Russia
- Mikhail I Tomsky
  - Department of Molecular Genetics, Yakut Research Center of Complex Medical Problems, Russian Academy of Medical Sciences and North-Eastern Federal University, Yakutsk, Russia
- Elza K Khusnutdinova
  - Institute of Biochemistry and Genetics, Ufa Scientific Center, Russian Academy of Sciences, Ufa, Russia
  - Department of Genetics and Fundamental Medicine, Bashkir State University, Ufa, Russia
- Antonio Torroni
  - Dipartimento di Biologia e Biotecnologie, Università di Pavia, Pavia, Italy
- Richard Villems
  - Department of Evolutionary Biology, University of Tartu, Tartu, Estonia
  - Estonian Biocentre, Tartu, Estonia
  - Estonian Academy of Sciences, Tallinn, Estonia
|
1814
|
Martignetti JA, Tian L, Li D, Ramirez MCM, Camacho-Vanegas O, Camacho SC, Guo Y, Zand DJ, Bernstein AM, Masur SK, Kim CE, Otieno FG, Hou C, Abdel-Magid N, Tweddale B, Metry D, Fournet JC, Papp E, McPherson EW, Zabel C, Vaksmann G, Morisot C, Keating B, Sleiman PM, Cleveland JA, Everman DB, Zackai E, Hakonarson H. Mutations in PDGFRB cause autosomal-dominant infantile myofibromatosis. Am J Hum Genet 2013; 92:1001-7. [PMID: 23731542 DOI: 10.1016/j.ajhg.2013.04.024] [Citation(s) in RCA: 138] [Impact Index Per Article: 12.5] [Received: 03/01/2013] [Revised: 04/19/2013] [Accepted: 04/30/2013] [Indexed: 01/30/2023] Open
Abstract
Infantile myofibromatosis (IM) is a disorder of mesenchymal proliferation characterized by the development of nonmetastasizing tumors in the skin, muscle, bone, and viscera. Occurrence within families across multiple generations is suggestive of an autosomal-dominant (AD) inheritance pattern, but autosomal-recessive (AR) modes of inheritance have also been proposed. We performed whole-exome sequencing (WES) in members of nine unrelated families clinically diagnosed with AD IM to identify the genetic origin of the disorder. In eight of the families, we identified one of two disease-causing mutations, c.1978C>A (p.Pro660Thr) and c.1681C>T (p.Arg561Cys), in PDGFRB. Intriguingly, one family did not have either of these PDGFRB mutations but all affected individuals had a c.4556T>C (p.Leu1519Pro) mutation in NOTCH3. Our studies suggest that mutations in PDGFRB are a cause of IM and highlight NOTCH3 as a candidate gene. Further studies of the crosstalk between PDGFRB and NOTCH pathways may offer new opportunities to identify mutations in other genes that result in IM; such studies are also a necessary first step toward understanding the mechanisms of both tumor growth and regression and toward targeted treatment.
Affiliation(s)
- John A Martignetti
  - Department of Genetics and Genomic Sciences, Mount Sinai School of Medicine, New York, NY 10029, USA; Department of Pediatrics, Mount Sinai School of Medicine, New York, NY 10029, USA; Department of Oncological Sciences, Mount Sinai School of Medicine, New York, NY 10029, USA.
|
1815
|
Imputing amino acid polymorphisms in human leukocyte antigens. PLoS One 2013; 8:e64683. [PMID: 23762245 PMCID: PMC3675122 DOI: 10.1371/journal.pone.0064683] [Citation(s) in RCA: 473] [Impact Index Per Article: 43.0] [Received: 02/14/2013] [Accepted: 04/17/2013] [Indexed: 12/14/2022] Open
Abstract
DNA sequence variation within human leukocyte antigen (HLA) genes mediates susceptibility to a wide range of human diseases. The complex genetic structure of the major histocompatibility complex (MHC) makes it difficult, however, to collect genotyping data in large cohorts. Long-range linkage disequilibrium between HLA loci and SNP markers across the MHC region offers an alternative approach through imputation to interrogate HLA variation in existing GWAS data sets. Here we describe a computational strategy, SNP2HLA, to impute classical alleles and amino acid polymorphisms at class I (HLA-A, -B, -C) and class II (-DPA1, -DPB1, -DQA1, -DQB1, and -DRB1) loci. To characterize performance of SNP2HLA, we constructed two European ancestry reference panels, one based on data collected in HapMap-CEPH pedigrees (90 individuals) and another based on data collected by the Type 1 Diabetes Genetics Consortium (T1DGC, 5,225 individuals). We imputed HLA alleles in an independent data set from the British 1958 Birth Cohort (N = 918) with gold standard four-digit HLA types and SNPs genotyped using the Affymetrix GeneChip 500 K and Illumina Immunochip microarrays. We demonstrate that the sample size of the reference panel, rather than SNP density of the genotyping platform, is critical to achieve high imputation accuracy. Using the larger T1DGC reference panel, the average accuracy at four-digit resolution is 94.7% using the low-density Affymetrix GeneChip 500 K, and 96.7% using the high-density Illumina Immunochip. For amino acid polymorphisms within HLA genes, we achieve 98.6% and 99.3% accuracy using the Affymetrix GeneChip 500 K and Illumina Immunochip, respectively. Finally, we demonstrate how imputation and association testing at amino acid resolution can facilitate fine-mapping of primary MHC association signals, giving a specific example from type 1 diabetes.
|
1816
|
Morrison J. Characterization and correction of error in genome-wide IBD estimation for samples with population structure. Genet Epidemiol 2013; 37:635-41. [PMID: 23740691 DOI: 10.1002/gepi.21737] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 11/28/2012] [Revised: 04/12/2013] [Accepted: 04/17/2013] [Indexed: 12/30/2022]
Abstract
The proportion of the genome that is shared identical by descent (IBD) between pairs of individuals is often estimated in studies involving genome-wide SNP data. These estimates can be used to check pedigrees, estimate heritability, and adjust association analyses. We focus on the method of moments technique as implemented in PLINK [Purcell et al., 2007] and other software that estimates the proportions of the genome at which two individuals share 0, 1, or 2 alleles IBD. This technique is based on the assumption that the study sample is drawn from a single, homogeneous, randomly mating population. This assumption is violated if pedigree founders are drawn from multiple populations or include admixed individuals. In the presence of population structure, the method of moments estimator has an inflated variance and can be biased because it relies on sample-based allele frequency estimates. In the case of the PLINK estimator, which truncates genome-wide sharing estimates at zero and one to generate biologically interpretable results, the bias is most often towards over-estimation of relatedness between ancestrally similar individuals. Using simulated pedigrees, we are able to demonstrate and quantify the behavior of the PLINK method of moments estimator under different population structure conditions. We also propose a simple method based on SNP pruning for improving genome-wide IBD estimates when the assumption of a single, homogeneous population is violated.
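One ingredient of the bias described above, truncation at zero, can be illustrated with a toy calculation. The sketch below is not PLINK's estimator: it only assumes the standard summary PI_HAT = P(IBD=2) + 0.5 × P(IBD=1), and the four noisy estimates for truly unrelated pairs are made-up values chosen to average exactly zero before truncation.

```python
def pi_hat(z1, z2):
    """Proportion of the genome shared IBD, from P(IBD=1) and P(IBD=2)."""
    return z2 + 0.5 * z1


def truncate(value, low=0.0, high=1.0):
    """Clamp an estimate to the biologically interpretable range [low, high]."""
    return max(low, min(high, value))


def mean(values):
    return sum(values) / len(values)


# Hypothetical untruncated sharing estimates for four unrelated pairs:
# symmetric noise around the true value of zero.
raw = [-0.02, 0.02, -0.01, 0.01]
truncated = [truncate(v) for v in raw]

# The raw estimates average to zero, but clamping the negative ones
# to zero pulls the average of the truncated estimates above zero.
raw_mean = mean(raw)
truncated_mean = mean(truncated)
```

Truncation therefore shifts the average estimate upward for unrelated pairs even when the underlying noise is symmetric; the abstract's additional point about sample-based allele frequencies is not modelled here.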
Affiliation(s)
- Jean Morrison
  - Department of Biostatistics, University of Washington, Seattle, Washington 98195-7232, USA.
|
1817
|
Johnson EO, Hancock DB, Levy JL, Gaddis NC, Saccone NL, Bierut LJ, Page GP. Imputation across genotyping arrays for genome-wide association studies: assessment of bias and a correction strategy. Hum Genet 2013; 132:509-22. [PMID: 23334152 PMCID: PMC3628082 DOI: 10.1007/s00439-013-1266-7] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Received: 07/18/2012] [Accepted: 01/07/2013] [Indexed: 12/20/2022]
Abstract
A great promise of publicly sharing genome-wide association data is the potential to create composite sets of controls. However, studies often use different genotyping arrays, and imputation to a common set of SNPs has shown substantial bias: a problem which has no broadly applicable solution. Based on the idea that using differing genotyped SNP sets as inputs creates differential imputation errors and thus bias in the composite set of controls, we examined the degree to which each of the following occurs: (1) imputation based on the union of genotyped SNPs (i.e., SNPs available on one or more arrays) results in bias, as evidenced by spurious associations (type 1 error) between imputed genotypes and arbitrarily assigned case/control status; (2) imputation based on the intersection of genotyped SNPs (i.e., SNPs available on all arrays) does not evidence such bias; and (3) imputation quality varies by the size of the intersection of genotyped SNP sets. Imputations were conducted in European Americans and African Americans with reference to HapMap phase II and III data. Imputation based on the union of genotyped SNPs across the Illumina 1M and 550v3 arrays showed spurious associations for 0.2 % of SNPs: ~2,000 false positives per million SNPs imputed. Biases remained problematic for very similar arrays (550v1 vs. 550v3) and were substantial for dissimilar arrays (Illumina 1M vs. Affymetrix 6.0). In all instances, imputing based on the intersection of genotyped SNPs (as few as 30 % of the total SNPs genotyped) eliminated such bias while still achieving good imputation quality.
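The two input strategies compared in this abstract amount to simple set operations on the SNPs each array genotypes. The sketch below is illustrative only: the array names and SNP identifiers are hypothetical placeholders, and the function `imputation_inputs` is not from the paper.

```python
def imputation_inputs(arrays):
    """Return (union, intersection) of genotyped SNP IDs across arrays.

    arrays: dict mapping an array name to the set of SNP IDs it genotypes.
    """
    snp_sets = list(arrays.values())
    if not snp_sets:
        return set(), set()
    union = set().union(*snp_sets)              # SNPs on one or more arrays
    intersection = set.intersection(*snp_sets)  # SNPs on all arrays
    return union, intersection


# Hypothetical SNP content for two arrays.
arrays = {
    "array_A": {"snp1", "snp2", "snp3", "snp4"},
    "array_B": {"snp2", "snp3", "snp5"},
}
union, intersection = imputation_inputs(arrays)
shared_fraction = len(intersection) / len(union)
```

Restricting the imputation input to `intersection` gives every sample the same genotyped SNP set, which is the condition under which the abstract reports that the spurious associations disappeared.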
Affiliation(s)
- Eric O Johnson
  - Behavioral Health Epidemiology Program, RTI International, 3040 Cornwallis Road, PO Box 12194, Research Triangle Park, NC 27709-12194, USA.
|
1818
|
Frazier-Wood AC, Manichaikul A, Aslibekyan S, Borecki IB, Goff DC, Hopkins PN, Lai CQ, Ordovas JM, Post WS, Rich SS, Sale MM, Siscovick D, Straka RJ, Tiwari HK, Tsai MY, Rotter JI, Arnett DK. Genetic variants associated with VLDL, LDL and HDL particle size differ with race/ethnicity. Hum Genet 2013; 132:405-13. [PMID: 23263444 PMCID: PMC3600091 DOI: 10.1007/s00439-012-1256-1] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Received: 04/24/2012] [Accepted: 11/30/2012] [Indexed: 10/27/2022]
Abstract
Specific constellations of lipoprotein particle features, reflected as differences in mean lipoprotein particle diameters, are associated with risk of insulin resistance (IR) and cardiovascular disease (CVD). The associations of lipid profiles with disease risk differ by race/ethnicity, but the reason for this is not clear. We aimed to examine whether there were additional genetic differences between racial/ethnic groups on lipoprotein profile. Genotypes were assessed using the Affymetrix 6.0 array in 817 related Caucasian participants of the Genetics of Lipid Lowering Drugs and Diet Network (GOLDN). Association analysis was conducted on fasting mean particle diameters using linear models, adjusted for age, sex and study center as fixed effects, and pedigree as a random effect. Replication of associations reaching P < 1.97 × 10^-5 (the level at which we achieved at least 80% power to replicate SNP-phenotype associations) was conducted in the Caucasian population of the Multi-Ethnic Study of Atherosclerosis (MESA; N = 2,430). Variants which replicated across both Caucasian populations were subsequently tested for association in the African-American (N = 1,594), Chinese (N = 758), and Hispanic (N = 1,422) populations of MESA. Variants in the APOB gene region were significantly associated with mean VLDL diameter in GOLDN, and in the Caucasian and Hispanic populations of MESA, while variation in the hepatic lipase (LIPC) gene was associated with mean HDL diameter in both Caucasian populations only. Our findings suggest that the genetic underpinnings of mean lipoprotein diameter differ by race/ethnicity. As lipoprotein diameters are modifiable, this may lead to new strategies to modify lipoprotein profiles during the reduction of IR that are sensitive to race/ethnicity.
Affiliation(s)
- Alexis C Frazier-Wood
  - Department of Epidemiology, University of Alabama at Birmingham, Birmingham, AL, USA.
|
1819
|
A pruning strategy of reference panels for fast SNP genotype imputation. BioChip Journal 2013. [DOI: 10.1007/s13206-013-7102-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
|
1820
|
Christie MR, Tennessen JA, Blouin MS. Bayesian parentage analysis with systematic accountability of genotyping error, missing data and false matching. Bioinformatics 2013; 29:725-32. [PMID: 23365409 DOI: 10.1093/bioinformatics/btt039] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Indexed: 01/16/2023]
Abstract
MOTIVATION The goal of any parentage analysis is to identify as many parent-offspring relationships as possible, while minimizing incorrect assignments. Existing methods can achieve these ends, but they require additional information in the form of demographic data, thousands of markers and/or estimates of genotyping error rates. For many non-model systems, it is simply not practical, cost-effective or logistically feasible to obtain this information. Here, we develop a Bayesian parentage method that only requires the sampled genotypes to account for genotyping error, missing data and false matches. RESULTS Extensive testing with microsatellite and SNP datasets reveals that our Bayesian parentage method reliably controls for the number of false assignments, irrespective of the genotyping error rate. When the number of loci is limiting, our approach maximizes the number of correct assignments by accounting for the frequencies of shared alleles. Comparisons with exclusion and likelihood-based methods on an empirical salmon dataset revealed that our Bayesian method had the highest ratio of correct to incorrect assignments.
Affiliation(s)
- Mark R Christie
  - Department of Zoology, Oregon State University, Corvallis, OR 97331-2914, USA.
|
1821
|
Vernot B, Stergachis AB, Maurano MT, Vierstra J, Neph S, Thurman RE, Stamatoyannopoulos JA, Akey JM. Personal and population genomics of human regulatory variation. Genome Res 2012; 22:1689-97. [PMID: 22955981 PMCID: PMC3431486 DOI: 10.1101/gr.134890.111] [Citation(s) in RCA: 91] [Impact Index Per Article: 8.3] [Indexed: 11/24/2022]
Abstract
The characteristics and evolutionary forces acting on regulatory variation in humans remain elusive because of the difficulty in defining functionally important noncoding DNA. Here, we combine genome-scale maps of regulatory DNA marked by DNase I hypersensitive sites (DHSs) from 138 cell and tissue types with whole-genome sequences of 53 geographically diverse individuals in order to better delimit the patterns of regulatory variation in humans. We estimate that individuals likely harbor many more functionally important variants in regulatory DNA compared with protein-coding regions, although they are likely to have, on average, smaller effect sizes. Moreover, we demonstrate that there is significant heterogeneity in the level of functional constraint in regulatory DNA among different cell types. We also find marked variability in functional constraint among transcription factor motifs in regulatory DNA, with sequence motifs for major developmental regulators, such as HOX proteins, exhibiting levels of constraint comparable to protein-coding regions. Finally, we perform a genome-wide scan of recent positive selection and identify hundreds of novel substrates of adaptive regulatory evolution that are enriched for biologically interesting pathways such as melanogenesis and adipocytokine signaling. These data and results provide new insights into patterns of regulatory variation in individuals and populations and demonstrate that a large proportion of functionally important variation lies beyond the exome.
Affiliation(s)
- Benjamin Vernot
  - Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
|
1822
|
Hancock DB, Levy JL, Gaddis NC, Bierut LJ, Saccone NL, Page GP, Johnson EO. Assessment of genotype imputation performance using 1000 Genomes in African American studies. PLoS One 2012; 7:e50610. [PMID: 23226329 PMCID: PMC3511547 DOI: 10.1371/journal.pone.0050610] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Received: 07/25/2012] [Accepted: 10/26/2012] [Indexed: 11/19/2022] Open
Abstract
Genotype imputation, used in genome-wide association studies to expand coverage of single nucleotide polymorphisms (SNPs), has performed poorly in African Americans compared to less admixed populations. Overall, imputation has typically relied on HapMap reference haplotype panels from Africans (YRI), European Americans (CEU), and Asians (CHB/JPT). The 1000 Genomes project offers a wider range of reference populations, such as African Americans (ASW), but their imputation performance has had limited evaluation. Using 595 African Americans genotyped on Illumina's HumanHap550v3 BeadChip, we compared imputation results from four software programs (IMPUTE2, BEAGLE, MaCH, and MaCH-Admix) and three reference panels consisting of different combinations of 1000 Genomes populations (February 2012 release): (1) 3 specifically selected populations (YRI, CEU, and ASW); (2) 8 populations of diverse African (AFR) or European (EUR) descent; and (3) all 14 available populations (ALL). Based on chromosome 22, we calculated three performance metrics: (1) concordance (percentage of masked genotyped SNPs with imputed and true genotype agreement); (2) imputation quality score (IQS; concordance adjusted for chance agreement, which is particularly informative for low minor allele frequency [MAF] SNPs); and (3) average r2hat (estimated correlation between the imputed and true genotypes, for all imputed SNPs). Across the reference panels, IMPUTE2 and MaCH had the highest concordance (91%-93%), but IMPUTE2 had the highest IQS (81%-83%) and average r2hat (0.68 using YRI+ASW+CEU, 0.62 using AFR+EUR, and 0.55 using ALL). Imputation quality for most programs was reduced by the addition of more distantly related reference populations, due entirely to the introduction of low frequency SNPs (MAF≤2%) that are monomorphic in the more closely related panels. While imputation was optimized by using IMPUTE2 with reference to the ALL panel (average r2hat = 0.86 for SNPs with MAF>2%), use of the ALL panel for African American studies requires careful interpretation of the population specificity and imputation quality of low frequency SNPs.
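The difference between concordance and a chance-adjusted score can be seen on a single rare SNP. The sketch below assumes hard-call genotypes coded 0, 1, 2 and computes a kappa-style score, (observed agreement − chance agreement) / (1 − chance agreement), as a stand-in for IQS; the function name is hypothetical and the paper's exact computation (for example, its handling of genotype probabilities) may differ.

```python
from collections import Counter


def concordance_and_iqs(true_genotypes, imputed_genotypes):
    """Return (concordance, chance-adjusted score) for one SNP.

    Genotypes are hard calls coded 0, 1 or 2. The adjusted score is
    None when chance agreement is 1 and the ratio is undefined.
    """
    n = len(true_genotypes)
    if n == 0 or n != len(imputed_genotypes):
        raise ValueError("genotype lists must be non-empty and equal length")
    # Observed agreement: fraction of samples with matching genotypes.
    observed = sum(t == i for t, i in zip(true_genotypes, imputed_genotypes)) / n
    # Chance agreement: product of marginal genotype frequencies, summed.
    true_counts = Counter(true_genotypes)
    imputed_counts = Counter(imputed_genotypes)
    chance = sum(true_counts[g] * imputed_counts[g] for g in true_counts) / (n * n)
    if chance == 1.0:
        return observed, None
    return observed, (observed - chance) / (1.0 - chance)


# A rare variant imputed as all-reference: nine of ten calls agree.
true_calls = [0] * 9 + [1]
imputed_calls = [0] * 10
concordance, adjusted = concordance_and_iqs(true_calls, imputed_calls)
```

In this toy case the concordance is high while the adjusted score is zero, because always guessing the common genotype agrees with the truth just as often by chance. This is why the abstract describes IQS as particularly informative for low-MAF SNPs.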
Affiliation(s)
- Dana B Hancock
  - Behavioral Health Epidemiology Program, Research Triangle Institute International, Research Triangle Park, North Carolina, United States of America.
|
1823
|
Browning SR, Browning BL. Identity-by-descent-based heritability analysis in the Northern Finland Birth Cohort. Hum Genet 2012; 132:129-38. [PMID: 23052944 PMCID: PMC3543768 DOI: 10.1007/s00439-012-1230-y] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Received: 04/09/2012] [Accepted: 09/15/2012] [Indexed: 01/07/2023]
Abstract
For most complex traits, only a small proportion of heritability is explained by statistically significant associations from genome-wide association studies (GWAS). In order to determine how much heritability can potentially be explained through larger GWAS, several different approaches for estimating total narrow-sense heritability from GWAS data have recently been proposed. These methods include variance components with relatedness estimates from allele-sharing, variance components with relatedness estimates from identity-by-descent (IBD) segments, and regression of phenotypic correlation on relatedness estimates from IBD segments. These methods have not previously been compared on real or simulated data. We analyze the narrow-sense heritability of nine metabolic traits in the Northern Finland Birth Cohort (NFBC) using these methods. We find substantial estimated heritability for several traits, including LDL cholesterol (54 % heritability), HDL cholesterol (46 % heritability), and fasting glucose levels (39 % heritability). Estimates of heritability from the regression-based approach are much lower than variance component estimates in these data, which may be due to the presence of strong population structure. We also investigate the accuracy of the competing approaches using simulated phenotypes based on genotype data from the NFBC. The simulation results substantiate the downward bias of the regression-based approach in the presence of population structure.
Affiliation(s)
- Sharon R Browning
  - Department of Biostatistics, University of Washington, Seattle, WA, USA.
|
1824
|
Staples J, Nickerson DA, Below JE. Utilizing graph theory to select the largest set of unrelated individuals for genetic analysis. Genet Epidemiol 2012; 37:136-41. [PMID: 22996348 DOI: 10.1002/gepi.21684] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.5] [Received: 06/08/2012] [Revised: 08/10/2012] [Accepted: 08/17/2012] [Indexed: 01/24/2023]
Abstract
Many statistical analyses of genetic data rely on the assumption of independence among samples. Consequently, relatedness is either modeled in the analysis or samples are removed to "clean" the data of any pairwise relatedness above a tolerated threshold. Current methods do not maximize the number of unrelated individuals retained for further analysis, and this is a needless loss of resources. We report a novel application of graph theory that identifies the maximum set of unrelated samples in any dataset given a user-defined threshold of relatedness as well as all networks of related samples. We have implemented this method into an open source program called Pedigree Reconstruction and Identification of a Maximum Unrelated Set, PRIMUS. We show that PRIMUS outperforms the three existing methods, allowing researchers to retain up to 50% more unrelated samples. A unique strength of PRIMUS is its ability to weight the maximum clique selection using additional criteria (e.g. affected status and data missingness). PRIMUS is a permanent solution to identifying the maximum number of unrelated samples for a genetic analysis.
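The underlying graph problem can be sketched as follows: treat samples as nodes, connect any pair whose relatedness exceeds the threshold, and look for the largest set of nodes with no edges between them. The code below is a small exact search written for illustration; it is not PRIMUS's algorithm, it does not implement the weighting by affected status or missingness, and the function name and input layout are hypothetical. Its running time grows exponentially in the worst case, so it only suits small family networks.

```python
def max_unrelated_set(samples, kinship, threshold):
    """Largest subset of samples with no pairwise relatedness above threshold.

    samples:   iterable of sample IDs (strings)
    kinship:   dict mapping (sample_a, sample_b) -> relatedness estimate
    threshold: pairs with a value above this are treated as related
    """
    related = {s: set() for s in samples}
    for (a, b), value in kinship.items():
        if value > threshold:
            related[a].add(b)
            related[b].add(a)

    def solve(candidates):
        if not candidates:
            return set()
        # Pick the candidate with the most relatives still in play.
        v = max(sorted(candidates), key=lambda s: len(related[s] & candidates))
        if not (related[v] & candidates):
            # No related pairs remain, so every candidate can be kept.
            return set(candidates)
        # Branch: either drop v, or keep v and drop all of its relatives.
        without_v = solve(candidates - {v})
        with_v = {v} | solve(candidates - {v} - related[v])
        return with_v if len(with_v) >= len(without_v) else without_v

    return solve(set(related))


# Hypothetical family: A is related to B, C and D, who are mutually unrelated.
kinship = {("A", "B"): 0.25, ("A", "C"): 0.25, ("A", "D"): 0.25, ("B", "C"): 0.0}
unrelated = max_unrelated_set(["A", "B", "C", "D"], kinship, threshold=0.1)
```

In this example, dropping the single hub sample A retains three samples, whereas keeping A would force the removal of B, C and D. That gap is the kind of avoidable sample loss the abstract refers to.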
Affiliation(s)
- Jeffrey Staples
  - Department of Genome Sciences, The University of Washington, Seattle, WA 98195, USA
|
1825
|
Lalli MA, Garcia G, Madrigal L, Arcos-Burgos M, Arcila ML, Kosik KS, Lopera F. Exploratory data from complete genomes of familial Alzheimer disease age-at-onset outliers. Hum Mutat 2012; 33:1630-4. [PMID: 22829467 DOI: 10.1002/humu.22167] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 03/13/2012] [Accepted: 07/07/2012] [Indexed: 11/08/2022]
Abstract
Identifying genes that modify the age at onset (AAO) of Alzheimer disease and targeting them pharmacologically represent a potential treatment strategy. In this exploratory study, we sequenced the complete genomes of six individuals with familial Alzheimer disease due to the autosomal dominant mutation p.Glu280Ala in PSEN1 (MIM# 104311; NM_000021.3:c.839A>C). The disease and its AAO are highly heritable, motivating our search for genetic variants that modulate AAO. The median AAO of dementia in carriers of the mutant allele is 49 years. Extreme phenotypic outliers for AAO in this genetically isolated population with limited environmental variance are likely to harbor onset modifying genetic variants. A narrow distribution of AAO in this kindred suggests large effect sizes of genetic determinants of AAO in these outliers. Identity by descent (IBD) analysis and a combination of bioinformatics filters have suggested several candidate variants for AAO modifiers. Future work and replication studies on these variants may provide mechanistic insights into the etiopathology of Alzheimer disease.
Affiliation(s)
- Matthew A Lalli
  - Neuroscience Research Institute, University of California at Santa Barbara, CA, USA
|
1826
|
Thornton T, Tang H, Hoffmann T, Ochs-Balcom H, Caan B, Risch N. Estimating kinship in admixed populations. Am J Hum Genet 2012; 91:122-38. [PMID: 22748210 PMCID: PMC3397261 DOI: 10.1016/j.ajhg.2012.05.024] [Citation(s) in RCA: 135] [Impact Index Per Article: 11.3] [Received: 02/22/2012] [Revised: 04/29/2012] [Accepted: 05/31/2012] [Indexed: 12/21/2022] Open
Abstract
Genome-wide association studies (GWASs) are commonly used for the mapping of genetic loci that influence complex traits. A problem that is often encountered in both population-based and family-based GWASs is that of identifying cryptic relatedness and population stratification because it is well known that failure to appropriately account for both pedigree and population structure can lead to spurious association. A number of methods have been proposed for identifying relatives in samples from homogeneous populations. A strong assumption of population homogeneity, however, is often untenable, and many GWASs include samples from structured populations. Here, we consider the problem of estimating relatedness in structured populations with admixed ancestry. We propose a method, REAP (relatedness estimation in admixed populations), for robust estimation of identity by descent (IBD)-sharing probabilities and kinship coefficients in admixed populations. REAP appropriately accounts for population structure and ancestry-related assortative mating by using individual-specific allele frequencies at SNPs that are calculated on the basis of ancestry derived from whole-genome analysis. In simulation studies with related individuals and admixture from highly divergent populations, we demonstrate that REAP gives accurate IBD-sharing probabilities and kinship coefficients. We apply REAP to the Mexican Americans in Los Angeles, California (MXL) population sample of release 3 of phase III of the International Haplotype Map Project; in this sample, we identify third- and fourth-degree relatives who have not previously been reported. We also apply REAP to the African American and Hispanic samples from the Women's Health Initiative SNP Health Association Resource (WHI-SHARe) study, in which hundreds of pairs of cryptically related individuals have been identified.
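The idea of individual-specific allele frequencies can be sketched with a simple moment estimator. The code below is an assumption-laden illustration rather than the REAP software: it takes ancestry proportions and ancestral-population allele frequencies as given, forms each person's expected allele frequency as their ancestry-weighted average, and averages a standardized genotype cross-product over SNPs. The function names are hypothetical, and the paper's exact estimator may differ in detail.

```python
from math import sqrt


def individual_frequencies(ancestry, population_freqs):
    """Ancestry-weighted allele frequency at each SNP for one individual.

    ancestry:         ancestry proportions, one per ancestral population
    population_freqs: one list of per-SNP allele frequencies per population
    """
    n_snps = len(population_freqs[0])
    return [
        sum(a * freqs[s] for a, freqs in zip(ancestry, population_freqs))
        for s in range(n_snps)
    ]


def kinship_estimate(geno_i, geno_j, mu_i, mu_j):
    """Moment estimate of the kinship coefficient for one pair.

    geno_i, geno_j: allele counts (0, 1 or 2) per SNP
    mu_i, mu_j:     individual-specific allele frequencies per SNP
    """
    terms = []
    for gi, gj, mi, mj in zip(geno_i, geno_j, mu_i, mu_j):
        scale = 4.0 * sqrt(mi * (1.0 - mi) * mj * (1.0 - mj))
        if scale == 0.0:
            continue  # SNP is uninformative for this pair
        terms.append((gi - 2.0 * mi) * (gj - 2.0 * mj) / scale)
    if not terms:
        raise ValueError("no informative SNPs")
    return sum(terms) / len(terms)


# Hypothetical individual with equal ancestry from two populations.
mu = individual_frequencies([0.5, 0.5], [[0.2, 0.8], [0.6, 0.4]])
```

Centering each genotype on the person's own expected frequency, rather than on a single sample-wide frequency, is what keeps shared ancestry from being counted as recent relatedness in this kind of estimator.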
Affiliation(s)
- Timothy Thornton
- Department of Biostatistics, University of Washington, Seattle, WA 98195, USA
| | - Hua Tang
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Thomas J. Hoffmann
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94143, USA
- Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA 94107, USA
| | - Heather M. Ochs-Balcom
- Department of Social and Preventive Medicine, University at Buffalo, Buffalo, NY 14214, USA
| | - Bette J. Caan
- Division of Research, Kaiser Permanente, Oakland, CA 94612, USA
| | - Neil Risch
- Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94143, USA
- Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA 94107, USA
- Division of Research, Kaiser Permanente, Oakland, CA 94612, USA
|
1827
|
Tennessen JA, Bigham AW, O'Connor TD, Fu W, Kenny EE, Gravel S, McGee S, Do R, Liu X, Jun G, Kang HM, Jordan D, Leal SM, Gabriel S, Rieder MJ, Abecasis G, Altshuler D, Nickerson DA, Boerwinkle E, Sunyaev S, Bustamante CD, Bamshad MJ, Akey JM. Evolution and functional impact of rare coding variation from deep sequencing of human exomes. Science 2012; 337:64-9. [PMID: 22604720 PMCID: PMC3708544 DOI: 10.1126/science.1219240] [Citation(s) in RCA: 1230] [Impact Index Per Article: 102.5] [Indexed: 12/11/2022]
Abstract
As a first step toward understanding how rare variants contribute to risk for complex diseases, we sequenced 15,585 human protein-coding genes to an average median depth of 111× in 2440 individuals of European (n = 1351) and African (n = 1088) ancestry. We identified over 500,000 single-nucleotide variants (SNVs), the majority of which were rare (86% with a minor allele frequency less than 0.5%), previously unknown (82%), and population-specific (82%). On average, 2.3% of the 13,595 SNVs each person carried were predicted to affect protein function of ~313 genes per genome, and ~95.7% of SNVs predicted to be functionally important were rare. This excess of rare functional variants is due to the combined effects of explosive, recent accelerated population growth and weak purifying selection. Furthermore, we show that large sample sizes will be required to associate rare variants with complex traits.
Affiliation(s)
- Jacob A. Tennessen
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Abigail W. Bigham
- Department of Pediatrics, University of Washington, Seattle, WA 98195, USA
| | - Timothy D. O'Connor
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Wenqing Fu
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Eimear E. Kenny
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Simon Gravel
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Sean McGee
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Ron Do
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- The Center for Human Genetic Research, Massachusetts General Hospital, Boston, MA 02114, USA
| | - Xiaoming Liu
- Human Genetics Center, University of Texas Health Sciences Center at Houston, Houston, TX 77030, USA
| | - Goo Jun
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Hyun Min Kang
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Daniel Jordan
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | - Suzanne M. Leal
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Stacey Gabriel
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Mark J. Rieder
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Goncalo Abecasis
- Department of Biostatistics, University of Michigan, Ann Arbor, MI 48109, USA
| | | | | | - Eric Boerwinkle
- Human Genetics Center, University of Texas Health Sciences Center at Houston, Houston, TX 77030, USA
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
| | - Shamil Sunyaev
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
| | | | - Michael J. Bamshad
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Department of Pediatrics, University of Washington, Seattle, WA 98195, USA
| | - Joshua M. Akey
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
|
1828
|
Pajewski NM, Shrestha S, Quinn CP, Parker SD, Wiener H, Aissani B, McKinney BA, Poland GA, Edberg JC, Kimberly RP, Tang J, Kaslow RA. A genome-wide association study of host genetic determinants of the antibody response to Anthrax Vaccine Adsorbed. Vaccine 2012; 30:4778-84. [PMID: 22658931 DOI: 10.1016/j.vaccine.2012.05.032] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 02/27/2012] [Revised: 04/20/2012] [Accepted: 05/14/2012] [Indexed: 11/16/2022]
Abstract
Several lines of evidence have supported a host genetic contribution to vaccine response, but genome-wide assessments for specific determinants have been sparse. Here we describe a genome-wide association study (GWAS) of protective antigen-specific antibody (AbPA) responses among 726 European-Americans who received Anthrax Vaccine Adsorbed (AVA) as part of a clinical trial. After quality control, 736,996 SNPs were tested for association with the AbPA response to 3 or 4 AVA vaccinations given over a 6-month period. No SNP achieved the threshold of genome-wide significance (p = 5 × 10⁻⁸), but suggestive associations (p < 1 × 10⁻⁵) were observed for SNPs in or near the class II region of the major histocompatibility complex (MHC), in the promoter region of SPSB1, and adjacent to MEX3C. Multivariable regression modeling suggested that much of the association signal within the MHC corresponded to previously identified HLA DR-DQ haplotypes involving component HLA-DRB1 alleles of *15:01, *01:01, or *01:02. We estimated the proportion of additive genetic variance explained by common SNP variation for the AbPA response after the 6-month vaccination. This analysis indicated a significant, albeit imprecisely estimated, contribution of variation tagged by common polymorphisms (p = 0.032). Future studies will be required to replicate these findings in European Americans and to further elucidate the host genetic factors underlying variable immune response to AVA.
Affiliation(s)
- Nicholas M Pajewski
- Department of Biostatistical Sciences, Wake Forest University Health Sciences, Winston Salem, NC 27157-1063, USA.
|
1829
|
Edwards TL, Li C. Optimized selection of unrelated subjects for whole-genome sequencing studies of rare high-penetrance alleles. Genet Epidemiol 2012; 36:472-9. [PMID: 22623060 DOI: 10.1002/gepi.21641] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/27/2011] [Revised: 03/05/2012] [Accepted: 04/11/2012] [Indexed: 01/13/2023]
Abstract
Sequencing studies using whole-genome or exome scans are still more expensive than genome-wide association studies on a per-subject basis. As a result, only a subset of subjects from a larger study will be selected for sequencing. To perform an agnostic investigation of the entire genome, subjects may be selected that capture independent ancestral lineages, i.e., founder genomes, and thus avoid redundant information from regions that were inherited identical by descent (IBD) from a common ancestor. We present SampleSeq2 that can be used to select a subset of optimally unrelated subjects with minimal IBD sharing. It also can be used to estimate the number, G(T), of founder chromosomes in a sample or select the minimum number of subjects that will carry a target G(T). We evaluated SampleSeq2 compared to a random draw of a small number of subjects both by simulation and using the Anabaptist genealogy. SampleSeq2 provided an increase in G(T) relative to a random draw across a range of small sample sizes. This increase in founder chromosomes improves the power of association tests, mitigates the effect of cryptic relatedness on parameter estimates, increases the total yield of alleles from sequencing, and minimizes the average size of regions shared IBD around disease alleles in cases.
Affiliation(s)
- Todd L Edwards
- Vanderbilt Epidemiology Center, Division of Epidemiology, Department of Medicine, Vanderbilt University, Nashville, TN 37212-0700, USA
|
1830
|
Phillips C, García-Magariños M, Salas A, Carracedo A, Lareu MV. SNPs as Supplements in Simple Kinship Analysis or as Core Markers in Distant Pairwise Relationship Tests: When Do SNPs Add Value or Replace Well-Established and Powerful STR Tests? Transfus Med Hemother 2012; 39:202-210. [PMID: 22851936 DOI: 10.1159/000338857] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Received: 12/06/2012] [Accepted: 03/03/2012] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND: Genetic tests for kinship testing routinely reach likelihoods that provide virtual proof of the claimed relationship by typing microsatellites, commonly consisting of 12-15 standard forensic short tandem repeats (STRs). Single nucleotide polymorphisms (SNPs) have also been applied to kinship testing, but these binary markers are required in greater numbers than multiple-allele STRs. However, SNPs offer certain advantageous characteristics not found in STRs, including much higher mutational stability, good performance typing highly degraded DNA, and the ability to be readily up-scaled to very high marker numbers reaching over a million loci. This article outlines kinship testing applications where SNPs markedly improve the genetic data obtained. In particular, we explore the minimum number of SNPs that will be required to confirm pairwise relationship claims in deficient pedigrees that typify missing persons' identification or war grave investigations, where commonly few surviving relatives are available for comparison and the DNA is highly degraded.
METHODS: We describe the application of SNPs alongside STRs when incomplete profiles or allelic instability in STRs create ambiguous results, we review the use of high density SNP arrays when the relationship claim is very distant, and we outline simulations of kinship analyses with STRs supplemented with SNPs in order to estimate the practical limit of pairwise relationships that can be differentiated from random unrelated pairs from the same population.
RESULTS: The minimum number of SNPs for robust statistical inference of parent-offspring relationships through to those of second cousins (S-3-3) is estimated for both simple, single multiplex SNP sets and for subsets of million-SNP arrays.
CONCLUSIONS: There is considerable scope for resolving ambiguous STR results and for improving the statistical power of kinship analysis by adding small-scale SNP sets, but where the pedigree is deficient the pairwise relationships must be relatively close. For more distant relationships it is possible to reduce chip-based SNP arrays from the million+ markers down to ∼7,000. However, such numbers indicate that current genotyping approaches will not be able to deliver sufficient data to resolve distant pairwise relationships from the limited DNA typical of the most challenging identification cases.
Affiliation(s)
- Christopher Phillips
- Forensic Genetics Unit, Institute of Legal Medicine, University of Santiago de Compostela, Santiago de Compostela, Galicia, Spain
|
1831
|
Analysis of a claimed distant relationship in a deficient pedigree using high density SNP data. Forensic Sci Int Genet 2012; 6:350-3. [DOI: 10.1016/j.fsigen.2011.07.011] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 04/14/2011] [Revised: 06/16/2011] [Accepted: 07/19/2011] [Indexed: 11/19/2022]
|
1832
|
DNA microarray as a tool in establishing genetic relatedness—Current status and future prospects. Forensic Sci Int Genet 2012; 6:322-9. [PMID: 21813350 DOI: 10.1016/j.fsigen.2011.07.007] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 03/21/2011] [Revised: 06/12/2011] [Accepted: 07/05/2011] [Indexed: 11/21/2022]
|
1833
|
Manichaikul A, Palmas W, Rodriguez CJ, Peralta CA, Divers J, Guo X, Chen WM, Wong Q, Williams K, Kerr KF, Taylor KD, Tsai MY, Goodarzi MO, Sale MM, Diez-Roux AV, Rich SS, Rotter JI, Mychaleckyj JC. Population structure of Hispanics in the United States: the multi-ethnic study of atherosclerosis. PLoS Genet 2012; 8:e1002640. [PMID: 22511882 PMCID: PMC3325201 DOI: 10.1371/journal.pgen.1002640] [Citation(s) in RCA: 67] [Impact Index Per Article: 5.6] [Received: 09/16/2011] [Accepted: 02/20/2012] [Indexed: 01/06/2023] Open
Abstract
Using ~60,000 SNPs selected for minimal linkage disequilibrium, we perform population structure analysis of 1,374 unrelated Hispanic individuals from the Multi-Ethnic Study of Atherosclerosis (MESA), with self-identification corresponding to Central America (n = 93), Cuba (n = 50), the Dominican Republic (n = 203), Mexico (n = 708), Puerto Rico (n = 192), and South America (n = 111). By projection of principal components (PCs) of ancestry to samples from the HapMap phase III and the Human Genome Diversity Panel (HGDP), we show the first two PCs quantify the Caucasian, African, and Native American origins, while the third and fourth PCs bring out an axis that aligns with known South-to-North geographic location of HGDP Native American samples and further separates MESA Mexican versus Central/South American samples along the same axis. Using k-means clustering computed from the first four PCs, we define four subgroups of the MESA Hispanic cohort that show close agreement with self-identification, labeling the clusters as primarily Dominican/Cuban, Mexican, Central/South American, and Puerto Rican. To demonstrate our recommendations for genetic analysis in the MESA Hispanic cohort, we present pooled and stratified association analysis of triglycerides for selected SNPs in the LPL and TRIB1 gene regions, previously reported in GWAS of triglycerides in Caucasians but as yet unconfirmed in Hispanic populations. We report statistically significant evidence for genetic association in both genes, and we further demonstrate the importance of considering population substructure and genetic heterogeneity in genetic association studies performed in the United States Hispanic population.
Affiliation(s)
- Ani Manichaikul
- Center for Public Health Genomics, University of Virginia, Charlottesville, Virginia, United States of America.
|
1834
|
Manichaikul A, Chen WM, Williams K, Wong Q, Sale MM, Pankow JS, Tsai MY, Rotter JI, Rich SS, Mychaleckyj JC. Analysis of family- and population-based samples in cohort genome-wide association studies. Hum Genet 2012; 131:275-87. [PMID: 21805149 PMCID: PMC3369696 DOI: 10.1007/s00439-011-1071-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 02/23/2011] [Accepted: 07/09/2011] [Indexed: 01/22/2023]
Abstract
Cohort studies typically sample unrelated individuals from a population, although family members of index cases may also be recruited to investigate shared familial risk factors. Recruitment of family members may be incomplete or ancillary to the main cohort, resulting in a mixed sample of independent family units, including unrelated singletons and multiplex families. Multiple methods are available to perform genome-wide association (GWA) analysis of binary or continuous traits in families, but it is unclear whether methods known to perform well on ascertained pedigrees, sibships, or trios are appropriate in analysis of a mixed unrelated cohort and family sample. We present simulation studies based on Multi-Ethnic Study of Atherosclerosis (MESA) pedigree structures to compare the performance of several popular methods of GWA analysis for both quantitative and dichotomous traits in cohort studies. We evaluate approaches suitable for analysis of families, and combined the best performing methods with population-based samples either by meta-analysis, or by pooled analysis of family- and population-based samples (mega-analysis), comparing type 1 error and power. We further assess practical considerations, such as availability of software and ability to incorporate covariates in statistical modeling, and demonstrate our recommended approaches through quantitative and binary trait analysis of HDL cholesterol (HDL-C) in 2,553 MESA family- and population-based African-American samples. Our results suggest linear modeling approaches that accommodate family-induced phenotypic correlation (e.g., variance-component model for quantitative traits or generalized estimating equations for dichotomous traits) perform best in the context of combined family- and population-based cohort GWAS.
Affiliation(s)
- Ani Manichaikul
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA
- Department of Public Health Sciences, Division of Biostatistics and Epidemiology, University of Virginia, Charlottesville, VA
| | - Wei-Min Chen
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA
- Department of Public Health Sciences, Division of Biostatistics and Epidemiology, University of Virginia, Charlottesville, VA
| | - Kayleen Williams
- Collaborative Health Studies Coordinating Center, University of Washington, Seattle, Washington
| | - Quenna Wong
- Collaborative Health Studies Coordinating Center, University of Washington, Seattle, Washington
| | - Michèle M. Sale
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA
- Department of Medicine and Department of Biochemistry and Molecular Genetics, University of Virginia, Charlottesville, VA
| | - James S. Pankow
- Division of Epidemiology and Community Health, University of Minnesota, Minneapolis, MN
| | - Michael Y. Tsai
- Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN
| | - Jerome I. Rotter
- Medical Genetics Institute, Cedars-Sinai Medical Center, Los Angeles, CA
| | - Stephen S. Rich
- Center for Public Health Genomics, University of Virginia, Charlottesville, VA
|
1835
|
Lenstra JA, Groeneveld LF, Eding H, Kantanen J, Williams JL, Taberlet P, Nicolazzi EL, Sölkner J, Simianer H, Ciani E, Garcia JF, Bruford MW, Ajmone-Marsan P, Weigend S. Molecular tools and analytical approaches for the characterization of farm animal genetic diversity. Anim Genet 2012; 43:483-502. [DOI: 10.1111/j.1365-2052.2011.02309.x] [Citation(s) in RCA: 86] [Impact Index Per Article: 7.2] [Accepted: 09/15/2011] [Indexed: 12/30/2022]
Affiliation(s)
- J. A. Lenstra
- Faculty of Veterinary Medicine; Utrecht University; Utrecht; The Netherlands
| | - L. F. Groeneveld
- Institute of Farm Animal Genetics; Friedrich-Loeffler-Institut; Hoeltystr. 10; 31535; Neustadt; Germany
| | - H. Eding
- Animal Evaluations Unit; CRV; Arnhem; The Netherlands
| | - J. Kantanen
- Biotechnology and Food Research; MTT Agrifood Research Finland; FI-31600; Jokioinen; Finland
| | - J. L. Williams
- Parco Tecnologico Padano; via Einstein; 2600; Lodi; Italy
| | - P. Taberlet
- Laboratoire d'Ecologie Alpine; Université Joseph Fourier; BP 53; Grenoble; France
| | - E. L. Nicolazzi
- Istituto di Zootecnica and BioDNA Research Centre; Università Cattolica del Sacro Cuore; Piacenza; Italy
| | - J. Sölkner
- Department of Sustainable Agricultural Systems; Animal Breeding Group; BOKU - University of Natural Resources and Life Sciences; Vienna; Austria
| | - H. Simianer
- Department of Animal Sciences; Animal Breeding and Genetics Group; Georg-August-University Göttingen; 37075; Göttingen; Germany
| | - E. Ciani
- Department of General and Environmental Physiology; University of Bari “Aldo Moro”; Bari; Italy
| | - J. F. Garcia
- Universidade Estadual Paulista; Araçatuba; Brazil
| | - M. W. Bruford
- Organisms and Environment Division; School of Biosciences; Cardiff University; Cardiff; UK
| | - P. Ajmone-Marsan
- Istituto di Zootecnica and BioDNA Research Centre; Università Cattolica del Sacro Cuore; Piacenza; Italy
| | - S. Weigend
- Institute of Farm Animal Genetics; Friedrich-Loeffler-Institut; Hoeltystr. 10; 31535; Neustadt; Germany
|
1836
|
Yang X, Xu S. Identification of close relatives in the HUGO Pan-Asian SNP database. PLoS One 2011; 6:e29502. [PMID: 22242128 PMCID: PMC3248454 DOI: 10.1371/journal.pone.0029502] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 09/05/2011] [Accepted: 11/29/2011] [Indexed: 12/27/2022] Open
Abstract
The HUGO Pan-Asian SNP Consortium has recently released a genome-wide dataset, which consists of 1,719 DNA samples collected from 71 Asian populations. For studies of human population genetics such as genetic structure and migration history, this provided the most comprehensive large-scale survey of genetic variation to date in East and Southeast Asia. However, although considered in the analysis, close relatives were not clearly reported in the original paper. Here we performed a systematic analysis of genetic relationships among individuals from the Pan-Asian SNP (PASNP) database and identified 3 pairs of monozygotic twins or duplicate samples, 100 pairs of first-degree relatives, and 161 pairs of second-degree relatives. Based on the relationships inferred in this study, we suggest three standardized subsets with different levels of unrelated individuals (denoted PASNP1716, PASNP1640, and PASNP1583, respectively) for future applications of the samples in most types of population-genetics studies. In addition, we provide gender information for the PASNP samples, which was not included in the original dataset, based on analysis of X chromosome data.
Affiliation(s)
- Xiong Yang
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences and Max Planck Society (CAS-MPG) Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Shuhua Xu
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences and Max Planck Society (CAS-MPG) Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
|
1837
|
Inaoka Y, Tajima A, Tamura T, Satoh F, Osawa M. Kinship analysis based on SNP data from microarray assay. Forensic Sci Int Genet Suppl Ser 2011. [DOI: 10.1016/j.fsigss.2011.08.134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/16/2022]
|
1838
|
Chen WM, Manichaikul A, Rich SS. Identifying variants that contribute to linkage for dichotomous and quantitative traits in extended pedigrees. BMC Proc 2011; 5 Suppl 9:S68. [PMID: 22373516 PMCID: PMC3287907 DOI: 10.1186/1753-6561-5-s9-s68] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022] Open
Abstract
Compared to genome-wide association analysis, linkage analysis is less influenced by allelic heterogeneity. The use of linkage information in large families should provide a great opportunity to identify less frequent variants. We perform a linkage scan for both dichotomous and quantitative traits in eight extended families. For the dichotomous trait, we identified one linkage region on chromosome 4q. For quantitative traits, we identified two regions on chromosomes 4q and 6p for Q1 and one region on chromosome 6q for Q2. To identify variants that contribute to these linkage signals, we performed standard association analysis in genomic regions of interest. We also screened less frequent variants in the linkage region based on the risk ratio and phenotypic distribution among carriers. Two rare variants at VEGFC and one common variant on chromosome 4q conferred the greatest risk for the dichotomous trait. We identified two rare variants on chromosomes 4q (VEGFC) and 6p (VEGFA) that explain 12.4% of the total phenotypic variance of trait Q1. We also identified four variants (including one at VNN3) on chromosome 6q that are able to drop the linkage LOD from 3.7 to 1.0. These results suggest that the use of classical linkage and association methods in large families can provide a useful approach to identifying variants that are responsible for diseases and complex traits in families.
Affiliation(s)
- Wei-Min Chen
- Center for Public Health Genomics, University of Virginia, West Complex, 6th Floor, Suite 6111, PO Box 800717, University of Virginia, Charlottesville, VA 22908, USA.
|
1839
|
Li MH, Strandén I, Tiirikka T, Sevón-Aimonen ML, Kantanen J. A comparison of approaches to estimate the inbreeding coefficient and pairwise relatedness using genomic and pedigree data in a sheep population. PLoS One 2011; 6:e26256. [PMID: 22114661 PMCID: PMC3220595 DOI: 10.1371/journal.pone.0026256] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.9] [Received: 04/18/2011] [Accepted: 09/23/2011] [Indexed: 11/19/2022] Open
Abstract
Genome-wide SNP data provide a powerful tool to estimate pairwise relatedness among individuals and individual inbreeding coefficient. The aim of this study was to compare methods for estimating the two parameters in a Finnsheep population based on genome-wide SNPs and genealogies, separately. This study included ninety-nine Finnsheep in Finland that differed in coat colours (white, black, brown, grey, and black/white spotted) and were from a large pedigree comprising 319,119 animals. All the individuals were genotyped with the Illumina Ovine SNP50K BeadChip by the International Sheep Genomics Consortium. We identified three genetic subpopulations that corresponded approximately with the coat colours (grey, white, and black and brown) of the sheep. We detected a significant subdivision among the colour types (F(ST) = 5.4%, P < 0.05). We applied robust algorithms for the genomic estimation of individual inbreeding (F(SNP)) and pairwise relatedness (Φ(SNP)) as implemented in the programs KING and PLINK, respectively. Estimates of the two parameters from pedigrees (F(PED) and Φ(PED)) were computed using the RelaX2 program. Values of the two parameters estimated from genomic and genealogical data were mostly consistent, in particular for the highly inbred animals (e.g. inbreeding coefficient F > 0.0625) and pairs of closely related animals (e.g. the full- or half-sibs). Nevertheless, we also detected differences in the two parameters between the approaches, particularly with respect to the grey Finnsheep. This could be due to the smaller sample size and relative incompleteness of the pedigree for them. We conclude that the genome-wide genomic data will provide useful information on a per sample or pairwise-samples basis in cases of complex genealogies or in the absence of genealogical data.
Affiliation(s)
- Meng-Hua Li
- Biotechnology and Food Research, MTT Agrifood Research Finland, Jokioinen, Finland
- Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
| | - Ismo Strandén
- Biotechnology and Food Research, MTT Agrifood Research Finland, Jokioinen, Finland
| | - Timo Tiirikka
- Biotechnology and Food Research, MTT Agrifood Research Finland, Jokioinen, Finland
| | | | - Juha Kantanen
- Biotechnology and Food Research, MTT Agrifood Research Finland, Jokioinen, Finland
|
1840
|
Kyriazopoulou-Panagiotopoulou S, Kashef Haghighi D, Aerni SJ, Sundquist A, Bercovici S, Batzoglou S. Reconstruction of genealogical relationships with applications to Phase III of HapMap. Bioinformatics 2011; 27:i333-41. [PMID: 21685089 PMCID: PMC3117348 DOI: 10.1093/bioinformatics/btr243] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: Accurate inference of genealogical relationships between pairs of individuals is paramount in association studies, forensics and evolutionary analyses of wildlife populations. Current methods for relationship inference consider only a small set of close relationships and have limited to no power to distinguish between relationships with the same number of meioses separating the individuals under consideration (e.g. aunt-niece versus niece-aunt or first cousins versus great aunt-niece).
RESULTS: We present CARROT (ClAssification of Relationships with ROTations), a novel framework for relationship inference that leverages linkage information to differentiate between rotated relationships, that is, between relationships with the same number of common ancestors and the same number of meioses separating the individuals under consideration. We demonstrate that CARROT clearly outperforms existing methods on simulated data. We also applied CARROT on four populations from Phase III of the HapMap Project and detected previously unreported pairs of third- and fourth-degree relatives.
AVAILABILITY: Source code for CARROT is freely available at http://carrot.stanford.edu.
CONTACT: sofiakp@stanford.edu.
|
1841
|
Day-Williams AG, Blangero J, Dyer TD, Lange K, Sobel EM. Linkage analysis without defined pedigrees. Genet Epidemiol 2011; 35:360-70. [PMID: 21465549 DOI: 10.1002/gepi.20584] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.6] [Received: 07/01/2010] [Revised: 02/11/2011] [Accepted: 03/01/2011] [Indexed: 01/13/2023]
Abstract
The need to collect accurate and complete pedigree information has been a drawback of family-based linkage and association studies. Even in case-control studies, investigators should be aware of, and condition on, familial relationships. In single nucleotide polymorphism (SNP) genome scans, relatedness can be directly inferred from the genetic data rather than determined through interviews. Various methods of estimating relatedness have previously been implemented, most notably in PLINK. We present new fast and accurate algorithms for estimating global and local kinship coefficients from dense SNP genotypes. These algorithms require only a single pass through the SNP genotype data. We also show that these estimates can be used to cluster individuals into pedigrees. With these estimates in hand, quantitative trait locus linkage analysis proceeds via traditional variance components methods without any prior relationship information. We demonstrate the success of our algorithms on simulated and real data sets. Our procedures make linkage analysis as easy as a typical genomewide association study.
Affiliation(s)
- Aaron G Day-Williams
- Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095-7088, USA
|