1
Halachev M, Meynert A, Taylor MS, Vitart V, Kerr SM, Klaric L, Aitman TJ, Haley CS, Prendergast JG, Pugh C, Hume DA, Harris SE, Liewald DC, Deary IJ, Semple CA, Wilson JF. Increased ultra-rare variant load in an isolated Scottish population impacts exonic and regulatory regions. PLoS Genet 2019; 15:e1008480. [PMID: 31765389 PMCID: PMC6901239 DOI: 10.1371/journal.pgen.1008480] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 06/12/2019] [Revised: 12/09/2019] [Accepted: 10/15/2019] [Indexed: 01/03/2023] Open
Abstract
Human population isolates provide a snapshot of the impact of historical demographic processes on population genetics. Such data facilitate studies of the functional impact of rare sequence variants on biomedical phenotypes, as strong genetic drift can result in higher frequencies of variants that are otherwise rare. We present the first whole genome sequencing (WGS) study of the VIKING cohort, a representative collection of samples from the isolated Shetland population in northern Scotland, and explore how its genetic characteristics compare to a mainland Scottish population. Our analyses reveal the strong contributions played by the founder effect and genetic drift in shaping genomic variation in the VIKING cohort. About one tenth of all high-quality variants discovered are unique to the VIKING cohort or are seen at frequencies at least ten fold higher than in more cosmopolitan control populations. Multiple lines of evidence also suggest relaxation of purifying selection during the evolutionary history of the Shetland isolate. We demonstrate enrichment of ultra-rare VIKING variants in exonic regions and for the first time we also show that ultra-rare variants are enriched within regulatory regions, particularly promoters, suggesting that gene expression patterns may diverge relatively rapidly in human isolates.
Affiliation(s)
- Mihail Halachev
- MRC Human Genetics Unit, MRC IGMM, University of Edinburgh, Crewe Road, Edinburgh, United Kingdom
- Alison Meynert
- MRC Human Genetics Unit, MRC IGMM, University of Edinburgh, Crewe Road, Edinburgh, United Kingdom
- Martin S. Taylor
- MRC Human Genetics Unit, MRC IGMM, University of Edinburgh, Crewe Road, Edinburgh, United Kingdom
- Veronique Vitart
- MRC Human Genetics Unit, MRC IGMM, University of Edinburgh, Crewe Road, Edinburgh, United Kingdom
- Shona M. Kerr
- MRC Human Genetics Unit, MRC IGMM, University of Edinburgh, Crewe Road, Edinburgh, United Kingdom
- Lucija Klaric
- MRC Human Genetics Unit, MRC IGMM, University of Edinburgh, Crewe Road, Edinburgh, United Kingdom
- Timothy J. Aitman
- Centre for Genomic and Experimental Medicine, MRC IGMM, University of Edinburgh, Crewe Road, Edinburgh, United Kingdom
- Chris S. Haley
- MRC Human Genetics Unit, MRC IGMM, University of Edinburgh, Crewe Road, Edinburgh, United Kingdom
- The Roslin Institute, University of Edinburgh, Easter Bush, Midlothian, United Kingdom
- James G. Prendergast
- The Roslin Institute, University of Edinburgh, Easter Bush, Midlothian, United Kingdom
- Carys Pugh
- Centre for Clinical Brain Sciences, Division of Psychiatry, University of Edinburgh, Royal Edinburgh Hospital, Edinburgh, United Kingdom
- David A. Hume
- Mater Research Institute, University of Queensland, Woolloongabba, Australia
- Sarah E. Harris
- Centre for Cognitive Ageing and Cognitive Epidemiology, Department of Psychology, School of Philosophy, Psychology and Language Sciences, University of Edinburgh, George Square, Edinburgh, United Kingdom
- David C. Liewald
- Centre for Cognitive Ageing and Cognitive Epidemiology, Department of Psychology, School of Philosophy, Psychology and Language Sciences, University of Edinburgh, George Square, Edinburgh, United Kingdom
- Ian J. Deary
- Centre for Cognitive Ageing and Cognitive Epidemiology, Department of Psychology, School of Philosophy, Psychology and Language Sciences, University of Edinburgh, George Square, Edinburgh, United Kingdom
- Colin A. Semple
- MRC Human Genetics Unit, MRC IGMM, University of Edinburgh, Crewe Road, Edinburgh, United Kingdom
- James F. Wilson
- MRC Human Genetics Unit, MRC IGMM, University of Edinburgh, Crewe Road, Edinburgh, United Kingdom
- Centre for Global Health Research, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Teviot Place, Edinburgh, United Kingdom
2
An actionable KCNH2 Long QT Syndrome variant detected by sequence and haplotype analysis in a population research cohort. Sci Rep 2019; 9:10964. [PMID: 31358886 PMCID: PMC6662790 DOI: 10.1038/s41598-019-47436-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 02/06/2019] [Accepted: 07/17/2019] [Indexed: 12/12/2022] Open
Abstract
The Viking Health Study Shetland is a population-based research cohort of 2,122 volunteer participants with ancestry from the Shetland Isles in northern Scotland. The high kinship and detailed phenotype data support a range of approaches for associating rare genetic variants, enriched in this isolate population, with quantitative traits and diseases. As an exemplar, the c.1750G > A; p.Gly584Ser variant within the coding sequence of the KCNH2 gene implicated in Long QT Syndrome (LQTS), which occurred once in 500 whole genome sequences from this population, was investigated. Targeted sequencing of the KCNH2 gene in family members of the initial participant confirmed the presence of the sequence variant and identified two further members of the same family pedigree who shared the variant. Investigation of these three related participants for whom single nucleotide polymorphism (SNP) array genotypes were available allowed a unique shared haplotype of 1.22 Mb to be defined around this locus. Searching across the full cohort for this haplotype uncovered two additional apparently unrelated individuals with no known genealogical connection to the original kindred. All five participants with the defined haplotype were shown to share the rare variant by targeted Sanger sequencing. If this result were verified in a healthcare setting, it would be considered clinically actionable, and has been actioned in relatives ascertained independently through clinical presentation. The General Practitioners of four study participants with the rare variant were alerted to the research findings by letters outlining the phenotype (prolonged electrocardiographic QTc interval). A lack of detectable haplotype sharing between c.1750G > A; p.Gly584Ser chromosomes from previously reported individuals from Finland and those in this study from Shetland suggests that this mutation has arisen more than once in human history. 
This study showcases the potential value of isolate population-based research resources for genomic medicine. It also illustrates some challenges around communication of actionable findings in research participants in this context.
3
Martin AR, Karczewski KJ, Kerminen S, Kurki MI, Sarin AP, Artomov M, Eriksson JG, Esko T, Genovese G, Havulinna AS, Kaprio J, Konradi A, Korányi L, Kostareva A, Männikkö M, Metspalu A, Perola M, Prasad RB, Raitakari O, Rotar O, Salomaa V, Groop L, Palotie A, Neale BM, Ripatti S, Pirinen M, Daly MJ. Haplotype Sharing Provides Insights into Fine-Scale Population History and Disease in Finland. Am J Hum Genet 2018; 102:760-775. [PMID: 29706349 DOI: 10.1016/j.ajhg.2018.03.003] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Received: 11/10/2017] [Accepted: 02/28/2018] [Indexed: 01/23/2023] Open
Abstract
Finland provides unique opportunities to investigate population and medical genomics because of its adoption of unified national electronic health records, detailed historical and birth records, and serial population bottlenecks. We assembled a comprehensive view of recent population history (≤100 generations), the timespan during which most rare-disease-causing alleles arose, by comparing pairwise haplotype sharing from 43,254 Finns to that of 16,060 Swedes, Estonians, Russians, and Hungarians from geographically and linguistically adjacent countries with different population histories. We find much more extensive sharing in Finns, with at least one ≥ 5 cM tract on average between pairs of unrelated individuals. By coupling haplotype sharing with fine-scale birth records from more than 25,000 individuals, we find that although haplotype sharing broadly decays with geographical distance, there are pockets of excess haplotype sharing; individuals from northeast Finland typically share several-fold more of their genome in identity-by-descent segments than individuals from southwest regions. We estimate recent effective population-size changes through time across regions of Finland, and we find that there was more continuous gene flow as Finns migrated from southwest to northeast between the early- and late-settlement regions than was dichotomously described previously. Lastly, we show that haplotype sharing is locally enriched by an order of magnitude among pairs of individuals sharing rare alleles and especially among pairs sharing rare disease-causing variants. Our work provides a general framework for using haplotype sharing to reconstruct an integrative view of recent population history and gain insight into the evolutionary origins of rare variants contributing to disease.
Affiliation(s)
- Alicia R Martin
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA.
- Konrad J Karczewski
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA
- Sini Kerminen
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki 00014, Finland
- Mitja I Kurki
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki 00014, Finland; Psychiatric and Neurodevelopmental Genetics Unit, Department of Psychiatry, Massachusetts General Hospital, Boston, MA 02114, USA
- Antti-Pekka Sarin
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki 00014, Finland; National Institute for Health and Welfare of Finland, Helsinki 00271, Finland
- Mykyta Artomov
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA
- Johan G Eriksson
- National Institute for Health and Welfare of Finland, Helsinki 00271, Finland; Folkhälsan Research Center, Helsinki 00290, Finland; Department of General Practice and Primary Health Care, University of Helsinki and Helsinki University Hospital, Helsinki 00014, Finland
- Tõnu Esko
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Estonian Genome Center, University of Tartu, Tartu 50090, Estonia
- Giulio Genovese
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA
- Aki S Havulinna
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki 00014, Finland; National Institute for Health and Welfare of Finland, Helsinki 00271, Finland
- Jaakko Kaprio
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki 00014, Finland; Department of Public Health, University of Helsinki, Helsinki 00014, Finland
- Alexandra Konradi
- Almazov National Medical Research Centre, Saint Petersburg 197341, Russia; National Research University of Information Technologies, Mechanics, and Optics, Saint Petersburg 197101, Russia
- László Korányi
- Heart Center Foundation, Drug Research Centre, Balatonfured H-8230, Hungary
- Anna Kostareva
- Almazov National Medical Research Centre, Saint Petersburg 197341, Russia; National Research University of Information Technologies, Mechanics, and Optics, Saint Petersburg 197101, Russia
- Minna Männikkö
- Center for Life Course Health Research, Faculty of Medicine, University of Oulu, Oulu 90014, Finland
- Andres Metspalu
- Estonian Genome Center, University of Tartu, Tartu 50090, Estonia
- Markus Perola
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki 00014, Finland; Estonian Genome Center, University of Tartu, Tartu 50090, Estonia; Research Centre of Applied and Preventive Cardiovascular Medicine, University of Turku, Turku University Hospital, Turku 20520, Finland
- Rashmi B Prasad
- Lund University Diabetes Centre, Department of Clinical Sciences, Lund University CRC, Skåne University Hospital Malmö, SE-205 02, Malmö, Sweden
- Olli Raitakari
- Research Centre of Applied and Preventive Cardiovascular Medicine, University of Turku, Turku University Hospital, Turku 20520, Finland; Department of Clinical Physiology and Nuclear Medicine, Turku University Hospital, Turku 20520, Finland
- Oxana Rotar
- Almazov National Medical Research Centre, Saint Petersburg 197341, Russia
- Veikko Salomaa
- National Institute for Health and Welfare of Finland, Helsinki 00271, Finland
- Leif Groop
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki 00014, Finland; Lund University Diabetes Centre, Department of Clinical Sciences, Lund University CRC, Skåne University Hospital Malmö, SE-205 02, Malmö, Sweden
- Aarno Palotie
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki 00014, Finland; Psychiatric and Neurodevelopmental Genetics Unit, Department of Psychiatry, Massachusetts General Hospital, Boston, MA 02114, USA
- Benjamin M Neale
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA
- Samuli Ripatti
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki 00014, Finland; Department of Public Health, University of Helsinki, Helsinki 00014, Finland
- Matti Pirinen
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki 00014, Finland; Department of Public Health, University of Helsinki, Helsinki 00014, Finland; Helsinki Institute for Information Technology and Department of Mathematics and Statistics, University of Helsinki, 00014 Helsinki, Finland
- Mark J Daly
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA 02114, USA; Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA; Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki 00014, Finland.
4
Torkamaneh D, Boyle B, Belzile F. Efficient genome-wide genotyping strategies and data integration in crop plants. Theor Appl Genet 2018; 131:499-511. [PMID: 29352324 DOI: 10.1007/s00122-018-3056-z] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 10/04/2017] [Accepted: 01/12/2018] [Indexed: 05/21/2023]
Abstract
Next-generation sequencing (NGS) has revolutionized plant and animal research by providing powerful genotyping methods. This review describes and discusses the advantages, challenges and, most importantly, solutions to facilitate data processing, the handling of missing data, and cross-platform data integration. Next-generation sequencing technologies provide powerful and flexible genotyping methods to plant breeders and researchers. These methods offer a wide range of applications from genome-wide analysis to routine screening with a high level of accuracy and reproducibility. Furthermore, they provide a straightforward workflow to identify, validate, and screen genetic variants in a short time with a low cost. NGS-based genotyping methods include whole-genome re-sequencing, SNP arrays, and reduced representation sequencing, which are widely applied in crops. The main challenges facing breeders and geneticists today are how to choose an appropriate genotyping method and how to integrate genotyping data sets obtained from various sources. Here, we review and discuss the advantages and challenges of several NGS methods for genome-wide genetic marker development and genotyping in crop plants. We also discuss how imputation methods can be used both to fill in missing data in genotypic data sets and to integrate data sets obtained using different genotyping tools. It is our hope that this synthetic view of genotyping methods will help geneticists and breeders to integrate these NGS-based methods in crop plant breeding and research.
Affiliation(s)
- Davoud Torkamaneh
- Département de Phytologie, Université Laval, Québec City, QC, Canada
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec City, QC, Canada
- Brian Boyle
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec City, QC, Canada
- François Belzile
- Département de Phytologie, Université Laval, Québec City, QC, Canada
- Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec City, QC, Canada
5
Herzig AF, Nutile T, Babron MC, Ciullo M, Bellenguez C, Leutenegger AL. Strategies for phasing and imputation in a population isolate. Genet Epidemiol 2018; 42:201-213. [PMID: 29319195 DOI: 10.1002/gepi.22109] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 05/26/2017] [Revised: 11/16/2017] [Accepted: 11/16/2017] [Indexed: 11/05/2022]
Abstract
In the search for genetic associations with complex traits, population isolates offer the advantage of reduced genetic and environmental heterogeneity. In addition, cost-efficient next-generation association approaches have been proposed in these populations where only a subsample of representative individuals is sequenced and then genotypes are imputed into the rest of the population. Gene mapping in such populations thus requires high-quality genetic imputation and preliminary phasing. To identify an effective study design, we compare by simulation a range of phasing and imputation software and strategies. We simulated 1,115,604 variants on chromosome 10 for 477 members of the large complex pedigree of Campora, a village within the established isolate of Cilento in southern Italy. We assessed the phasing performance of identical by descent based software ALPHAPHASE and SLRP, LD-based software SHAPEIT2, SHAPEIT3, and BEAGLE, and new software EAGLE that combines both methodologies. For imputation we compared IMPUTE2, IMPUTE4, MINIMAC3, BEAGLE, and new software PBWT. Genotyping errors and missing genotypes were simulated to observe their effects on the performance of each software. Highly accurate phased data were achieved by all software with SHAPEIT2, SHAPEIT3, and EAGLE2 providing the most accurate results. MINIMAC3, IMPUTE4, and IMPUTE2 all performed strongly as imputation software and our study highlights the considerable gain in imputation accuracy provided by a genome sequenced reference panel specific to the population isolate.
Affiliation(s)
- Anthony Francis Herzig
- Université Paris-Diderot, Sorbonne Paris Cité, U946, Paris, France; Inserm, U946, Genetic Variation and Human Diseases, Paris, France
- Teresa Nutile
- Institute of Genetics and Biophysics A. Buzzati-Traverso-CNR, Naples, Italy
- Marie-Claude Babron
- Université Paris-Diderot, Sorbonne Paris Cité, U946, Paris, France; Inserm, U946, Genetic Variation and Human Diseases, Paris, France
- Marina Ciullo
- Institute of Genetics and Biophysics A. Buzzati-Traverso-CNR, Naples, Italy; IRCCS Neuromed, Pozzilli, Isernia, Italy
- Céline Bellenguez
- Inserm, U1167, RID-AGE-Risk Factors and Molecular Determinants of Aging-Related Diseases, Lille, France; Institut Pasteur de Lille, Lille, France; Université de Lille, U1167-Excellence Laboratory LabEx DISTALZ, Lille, France
- Anne-Louise Leutenegger
- Université Paris-Diderot, Sorbonne Paris Cité, U946, Paris, France; Inserm, U946, Genetic Variation and Human Diseases, Paris, France
6
Jeroncic A, Memari Y, Ritchie GR, Hendricks AE, Kolb-Kokocinski A, Matchan A, Vitart V, Hayward C, Kolcic I, Glodzik D, Wright AF, Rudan I, Campbell H, Durbin R, Polašek O, Zeggini E, Boraska Perica V. Whole-exome sequencing in an isolated population from the Dalmatian island of Vis. Eur J Hum Genet 2016; 24:1479-87. [PMID: 27049301 PMCID: PMC4950961 DOI: 10.1038/ejhg.2016.23] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 06/13/2015] [Revised: 02/07/2016] [Accepted: 02/17/2016] [Indexed: 12/14/2022] Open
Abstract
We have whole-exome sequenced 176 individuals from the isolated population of the island of Vis in Croatia in order to describe exonic variation architecture. We found 290 577 single nucleotide variants (SNVs), 65% of which are singletons, low frequency or rare variants. A total of 25 430 (9%) SNVs are novel, previously not catalogued in NHLBI GO Exome Sequencing Project, UK10K-Generation Scotland, 1000Genomes Project, ExAC or NCBI Reference Assembly dbSNP. The majority of these variants (76%) are singletons. Comparable to data obtained from UK10K-Generation Scotland that were sequenced and analysed using the same protocols, we detected an enrichment of potentially damaging variants (non-synonymous and loss-of-function) in the low frequency and common variant categories. On average 115 (range 93–140) genotypes with loss-of-function variants, 23 (15–34) of which were homozygous, were identified per person. The landscape of loss-of-function variants across an exome revealed that variants mainly accumulated in genes on the xenobiotic-related pathways, the majority of which coded for enzymes. The frequency of loss-of-function variants was additionally increased in Vis runs of homozygosity regions where variants mainly affected signalling pathways. This work confirms the isolate status of the Vis population by means of whole-exome sequence and reveals the pattern of loss-of-function mutations, which resembles the trails of adaptive evolution that were found in other species. By cataloguing the exomic variants and describing the allelic structure of the Vis population, this study will serve as a valuable resource for future genetic studies of human diseases, population genetics and evolution in this population.
Affiliation(s)
- Ana Jeroncic
- Department of Research in Biomedicine and Health, University of Split School of Medicine, Split, Croatia
- Yasin Memari
- Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
- Audrey E Hendricks
- Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK; Department of Mathematical and Statistical Sciences, University of Colorado, Denver, CO, USA
- Veronique Vitart
- MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, UK
- Caroline Hayward
- MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, UK
- Ivana Kolcic
- Department of Public Health, University of Split School of Medicine, Split, Croatia
- Dominik Glodzik
- MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, UK
- Alan F Wright
- MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, UK
- Igor Rudan
- Centre for Global Health Research, University of Edinburgh, Edinburgh, UK
- Harry Campbell
- Centre for Global Health Research, University of Edinburgh, Edinburgh, UK
- Ozren Polašek
- Department of Public Health, University of Split School of Medicine, Split, Croatia; Centre for Global Health Research, University of Edinburgh, Edinburgh, UK
- Vesna Boraska Perica
- Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK; Department of Medical Biology, University of Split School of Medicine, Split, Croatia
7
Livne OE, Han L, Alkorta-Aranburu G, Wentworth-Sheilds W, Abney M, Ober C, Nicolae DL. PRIMAL: Fast and accurate pedigree-based imputation from sequence data in a founder population. PLoS Comput Biol 2015; 11:e1004139. [PMID: 25735005 PMCID: PMC4348507 DOI: 10.1371/journal.pcbi.1004139] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 05/29/2014] [Accepted: 01/19/2015] [Indexed: 12/31/2022] Open
Abstract
Founder populations and large pedigrees offer many well-known advantages for genetic mapping studies, including cost-efficient study designs. Here, we describe PRIMAL (PedigRee IMputation ALgorithm), a fast and accurate pedigree-based phasing and imputation algorithm for founder populations. PRIMAL incorporates both existing and original ideas, such as a novel indexing strategy of Identity-By-Descent (IBD) segments based on clique graphs. We were able to impute the genomes of 1,317 South Dakota Hutterites, who had genome-wide genotypes for ~300,000 common single nucleotide variants (SNVs), from 98 whole genome sequences. Using a combination of pedigree-based and LD-based imputation, we were able to assign 87% of genotypes with >99% accuracy over the full range of allele frequencies. Using the IBD cliques we were also able to infer the parental origin of 83% of alleles, and genotypes of deceased recent ancestors for whom no genotype information was available. This imputed data set will enable us to better study the relative contribution of rare and common variants on human phenotypes, as well as parental origin effect of disease risk alleles in >1,000 individuals at minimal cost.
Affiliation(s)
- Oren E. Livne
- Department of Human Genetics, The University of Chicago, Chicago, Illinois, United States of America
- Lide Han
- Department of Human Genetics, The University of Chicago, Chicago, Illinois, United States of America
- Gorka Alkorta-Aranburu
- Department of Human Genetics, The University of Chicago, Chicago, Illinois, United States of America
- William Wentworth-Sheilds
- Department of Human Genetics, The University of Chicago, Chicago, Illinois, United States of America
- Mark Abney
- Department of Human Genetics, The University of Chicago, Chicago, Illinois, United States of America
- Carole Ober
- Department of Human Genetics, The University of Chicago, Chicago, Illinois, United States of America
- Dan L. Nicolae
- Department of Human Genetics, The University of Chicago, Chicago, Illinois, United States of America
- Departments of Medicine, and Statistics, The University of Chicago, Chicago, Illinois, United States of America
8
Karafet TM, Bulayeva KB, Bulayev OA, Gurgenova F, Omarova J, Yepiskoposyan L, Savina OV, Veeramah KR, Hammer MF. Extensive genome-wide autozygosity in the population isolates of Daghestan. Eur J Hum Genet 2015; 23:1405-12. [PMID: 25604856 DOI: 10.1038/ejhg.2014.299] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 06/19/2014] [Revised: 12/09/2014] [Accepted: 12/19/2014] [Indexed: 01/01/2023] Open
Abstract
Isolated populations are valuable resources for mapping disease genes, as inbreeding increases genome-wide homozygosity and enhances the ability to map disease alleles on a genetically uniform background within a relatively homogenous environment. The populations of Daghestan are thought to have resided in the Caucasus Mountains for hundreds of generations and are characterized by a high prevalence of certain complex diseases. To explore the extent to which their unique population history led to increased levels of inbreeding, we genotyped >550 000 autosomal single-nucleotide polymorphisms (SNPs) in a set of 14 population isolates speaking Nakh-Daghestanian (ND) languages. The ND-speaking populations showed greatly elevated coefficients of inbreeding, very high numbers and long lengths of Runs of Homozygosity, and elevated linkage disequilibrium compared with surrounding groups from the Caucasus, the Near East, Europe, Central and South Asia. These results are consistent with the hypothesis that most ND-speaking groups descend from a common ancestral population that fragmented into a series of genetic isolates in the Daghestanian highlands. They have subsequently maintained a long-term small effective population size as a result of constant inbreeding and very low levels of gene flow. Given these findings, Daghestanian population isolates are likely to be useful for mapping genes associated with complex diseases.
Affiliation(s)
- Tatiana M Karafet
- ARL Division of Biotechnology, University of Arizona, Tucson, AZ, USA
- Kazima B Bulayeva
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Oleg A Bulayev
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Farida Gurgenova
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Jamilia Omarova
- Vavilov Institute of General Genetics, Russian Academy of Sciences, Moscow, Russia
- Levon Yepiskoposyan
- Institute of Molecular Biology, National Academy of Sciences, Yerevan, Armenia
- Olga V Savina
- ARL Division of Biotechnology, University of Arizona, Tucson, AZ, USA
- Michael F Hammer
- ARL Division of Biotechnology, University of Arizona, Tucson, AZ, USA
9
Strong effects of genetic and lifestyle factors on biomarker variation and use of personalized cutoffs. Nat Commun 2014; 5:4684. [PMID: 25147954 PMCID: PMC4143927 DOI: 10.1038/ncomms5684] [Citation(s) in RCA: 124] [Impact Index Per Article: 12.4] [Received: 03/12/2014] [Accepted: 07/14/2014] [Indexed: 12/14/2022] Open
Abstract
Ideal biomarkers used for disease diagnosis should display deviating levels in affected individuals only and be robust to factors unrelated to the disease. Here we show the impact of genetic, clinical and lifestyle factors on circulating levels of 92 protein biomarkers for cancer and inflammation, using a population-based cohort of 1,005 individuals. For 75% of the biomarkers, the levels are significantly heritable, and genome-wide association studies identify 16 novel loci and replicate 2 previously known loci with strong effects on one or several of the biomarkers, with P-values down to 4.4 × 10⁻⁵⁸. Integrative analysis attributes as much as 56.3% of the observed variance to non-disease factors. We propose that information on the biomarker-specific profile of major genetic, clinical and lifestyle factors should be used to establish personalized clinical cutoffs, and that this would increase the sensitivity of using biomarkers for prediction of clinical end points. Protein biomarkers could play an important role in the diagnosis and management of diseases. Here the authors investigate the impact of genetic, clinical and lifestyle factors on 92 protein biomarkers for cancer and inflammation and suggest that personalized biomarker thresholds should be used in cancer management.
|
10
|
Abstract
The use of genetically isolated populations can empower next-generation association studies. In this review, we discuss the advantages of this approach and review study design and analytical considerations of genetic association studies focusing on isolates. We cite successful examples of using population isolates in association studies and outline potential ways forward.
|
11
|
O'Connell J, Gurdasani D, Delaneau O, Pirastu N, Ulivi S, Cocca M, Traglia M, Huang J, Huffman JE, Rudan I, McQuillan R, Fraser RM, Campbell H, Polasek O, Asiki G, Ekoru K, Hayward C, Wright AF, Vitart V, Navarro P, Zagury JF, Wilson JF, Toniolo D, Gasparini P, Soranzo N, Sandhu MS, Marchini J. A general approach for haplotype phasing across the full spectrum of relatedness. PLoS Genet 2014; 10:e1004234. [PMID: 24743097 PMCID: PMC3990520 DOI: 10.1371/journal.pgen.1004234] [Citation(s) in RCA: 381] [Impact Index Per Article: 38.1] [Received: 05/15/2013] [Accepted: 01/27/2014] [Indexed: 01/20/2023]
Abstract
Many existing cohorts contain a range of relatedness between genotyped individuals, either by design or by chance. Haplotype estimation in such cohorts is a central step in many downstream analyses. Using genotypes from six cohorts from isolated populations and two cohorts from non-isolated populations, we have investigated the performance of different phasing methods designed for nominally 'unrelated' individuals. We find that SHAPEIT2 produces much lower switch error rates in all cohorts compared to other methods, including those designed specifically for isolated populations. In particular, when large amounts of IBD sharing are present, SHAPEIT2 infers close to perfect haplotypes. Based on these results we have developed a general strategy for phasing cohorts with any level of implicit or explicit relatedness between individuals. First, SHAPEIT2 is run ignoring all explicit family information. We then apply a novel HMM method (duoHMM) to combine the SHAPEIT2 haplotypes with any family information to infer the inheritance pattern of each meiosis at all sites across each chromosome. This allows the correction of switch errors and the detection of recombination events and genotyping errors. We show that the method detects numbers of recombination events that align very well with expectations based on genetic maps, and that it infers far fewer spurious recombination events than Merlin. The method can also detect genotyping errors and infer recombination events in otherwise uninformative families, such as trios and duos. The detected recombination events can be used in association scans for recombination phenotypes. The method provides a simple and unified approach to haplotype estimation that will be of interest to researchers in the fields of human, animal and plant genetics.
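The switch error rate used above to compare phasing methods can be made concrete with a small sketch. This follows the common definition (the fraction of consecutive heterozygous-site pairs whose relative phase disagrees with the truth) and is not taken from SHAPEIT2 or duoHMM; the function name `switch_error_rate` and the 0/1 allele coding are assumptions, and the sketch assumes the estimated genotypes are correct so that only phase can differ.

```python
from typing import Sequence


def switch_error_rate(
    true_h1: Sequence[int],
    true_h2: Sequence[int],
    est_h1: Sequence[int],
) -> float:
    """Switch error rate of one estimated haplotype against a known truth.

    true_h1, true_h2: the two true haplotypes, coded as 0/1 alleles per site.
    est_h1: one of the two estimated haplotypes (the other is its complement
            at heterozygous sites, assuming genotypes are called correctly).
    """
    if not (len(true_h1) == len(true_h2) == len(est_h1)):
        raise ValueError("all haplotypes must have the same length")

    # Phase is only defined at heterozygous sites.
    het_sites = [i for i, (a, b) in enumerate(zip(true_h1, true_h2)) if a != b]
    if len(het_sites) < 2:
        return 0.0  # no pair of heterozygous sites, so no switch is possible

    # True where the estimated haplotype carries the allele of true_h1.
    orientation = [est_h1[i] == true_h1[i] for i in het_sites]

    # A switch is a change of orientation between neighbouring heterozygous sites.
    switches = sum(1 for prev, cur in zip(orientation, orientation[1:]) if prev != cur)
    return switches / (len(het_sites) - 1)
```

An estimate that equals either true haplotype end to end scores 0, because only changes in orientation along the chromosome are counted, not the arbitrary labelling of the two haplotypes.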
Affiliation(s)
- Jared O'Connell: Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom; Department of Statistics, University of Oxford, Oxford, United Kingdom
- Deepti Gurdasani: Wellcome Trust Sanger Institute, Hinxton, United Kingdom; Department of Public Health and Primary Care, University of Cambridge, Cambridge, United Kingdom
- Olivier Delaneau: Department of Statistics, University of Oxford, Oxford, United Kingdom
- Nicola Pirastu: Institute for Maternal and Child Health - IRCCS Burlo Garofolo, University of Trieste, Trieste, Italy
- Sheila Ulivi: Institute for Maternal and Child Health - IRCCS Burlo Garofolo, Trieste, Italy
- Massimiliano Cocca: Division of Genetics and Cell Biology, San Raffaele Scientific Institute, Milano, Italy
- Michela Traglia: Division of Genetics and Cell Biology, San Raffaele Scientific Institute, Milano, Italy
- Jie Huang: Wellcome Trust Sanger Institute, Hinxton, United Kingdom
- Jennifer E Huffman: MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, United Kingdom
- Igor Rudan: Centre for Population Health Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Ruth McQuillan: Centre for Population Health Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Ross M Fraser: Centre for Population Health Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Harry Campbell: Centre for Population Health Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Ozren Polasek: Faculty of Medicine, University of Split, Split, Croatia
- Gershim Asiki: Medical Research Council/Uganda Virus Research Institute (MRC/UVRI), Uganda Research Unit on AIDS, Entebbe, Uganda
- Kenneth Ekoru: Laboratoire Génomique, Bioinformatique, et Applications (EA4627), Conservatoire National des Arts et Métiers, Paris, France
- Caroline Hayward: MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, United Kingdom
- Alan F Wright: MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, United Kingdom
- Veronique Vitart: MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, United Kingdom
- Pau Navarro: MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, United Kingdom
- Jean-Francois Zagury: Laboratoire Génomique, Bioinformatique, et Applications (EA4627), Conservatoire National des Arts et Métiers, Paris, France
- James F Wilson: Centre for Population Health Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Daniela Toniolo: Division of Genetics and Cell Biology, San Raffaele Scientific Institute, Milano, Italy
- Paolo Gasparini: Institute for Maternal and Child Health - IRCCS Burlo Garofolo, University of Trieste, Trieste, Italy
- Nicole Soranzo: Wellcome Trust Sanger Institute, Hinxton, United Kingdom
- Manjinder S Sandhu: Wellcome Trust Sanger Institute, Hinxton, United Kingdom; Department of Public Health and Primary Care, University of Cambridge, Cambridge, United Kingdom
- Jonathan Marchini: Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom; Department of Statistics, University of Oxford, Oxford, United Kingdom
|
12
|
Joshi PK, Prendergast J, Fraser RM, Huffman JE, Vitart V, Hayward C, McQuillan R, Glodzik D, Polašek O, Hastie ND, Rudan I, Campbell H, Wright AF, Haley CS, Wilson JF, Navarro P. Local exome sequences facilitate imputation of less common variants and increase power of genome wide association studies. PLoS One 2013; 8:e68604. [PMID: 23874685 PMCID: PMC3712964 DOI: 10.1371/journal.pone.0068604] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 01/23/2013] [Accepted: 05/30/2013] [Indexed: 11/18/2022]
Abstract
The analysis of less common variants in genome-wide association studies promises to elucidate complex trait genetics but is hampered by low power to reliably detect association. We show that addition of population-specific exome sequence data to global reference data allows more accurate imputation, particularly of less common SNPs (minor allele frequency 1-10%) in two very different European populations. The imputation improvement corresponds to an increase in effective sample size of 28-38%, for SNPs with a minor allele frequency in the range 1-3%.
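The link between imputation accuracy and effective sample size stated above can be sketched with the standard approximation that an association test on an imputed SNP behaves like a test on N·r² directly genotyped samples, where r² is the squared correlation between imputed and true genotypes. This is an assumption for illustration and may not be the exact calculation used by the authors; the function names and the example r² values in the usage note are hypothetical.

```python
def effective_sample_size(n_samples: int, imputation_r2: float) -> float:
    """Approximate effective sample size for an imputed SNP: N * r^2."""
    if not 0.0 <= imputation_r2 <= 1.0:
        raise ValueError("imputation_r2 must lie in [0, 1]")
    return n_samples * imputation_r2


def relative_gain(r2_before: float, r2_after: float) -> float:
    """Fractional increase in effective sample size when r^2 improves.

    The sample count cancels, so the gain depends only on the ratio of r^2 values.
    """
    if r2_before <= 0.0:
        raise ValueError("r2_before must be positive")
    return r2_after / r2_before - 1.0
```

For example, under this approximation a hypothetical improvement in r² from 0.50 to 0.65 gives `relative_gain(0.50, 0.65)` of about 0.30, that is, a 30% larger effective sample size without genotyping any additional individuals.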
Affiliation(s)
- Peter K. Joshi: Centre for Population Health Sciences, University of Edinburgh, Edinburgh, Scotland, United Kingdom
- James Prendergast: MRC Human Genetics Unit, University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Ross M. Fraser: Centre for Population Health Sciences, University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Jennifer E. Huffman: MRC Human Genetics Unit, University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Veronique Vitart: MRC Human Genetics Unit, University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Caroline Hayward: MRC Human Genetics Unit, University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Ruth McQuillan: Centre for Population Health Sciences, University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Dominik Glodzik: Centre for Population Health Sciences, University of Edinburgh, Edinburgh, Scotland, United Kingdom; MRC Human Genetics Unit, University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Ozren Polašek: Department of Public Health, University of Split, Split, Croatia; Centre for Global Health, University of Split, Split, Croatia
- Nicholas D. Hastie: MRC Human Genetics Unit, University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Igor Rudan: Centre for Population Health Sciences, University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Harry Campbell: Centre for Population Health Sciences, University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Alan F. Wright: MRC Human Genetics Unit, University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Chris S. Haley: MRC Human Genetics Unit, University of Edinburgh, Edinburgh, Scotland, United Kingdom; Roslin Institute, University of Edinburgh, Scotland, United Kingdom
- James F. Wilson: Centre for Population Health Sciences, University of Edinburgh, Edinburgh, Scotland, United Kingdom; MRC Human Genetics Unit, University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Pau Navarro: MRC Human Genetics Unit, University of Edinburgh, Edinburgh, Scotland, United Kingdom
|