351
|
Ancestral Origins and Genetic History of Tibetan Highlanders. Am J Hum Genet 2016; 99:580-594. [PMID: 27569548 DOI: 10.1016/j.ajhg.2016.07.002] [Citation(s) in RCA: 129] [Impact Index Per Article: 16.1] [Received: 05/09/2016] [Accepted: 07/01/2016] [Indexed: 12/30/2022]
Abstract
The origin of Tibetans remains one of the most contentious puzzles in history, anthropology, and genetics. Analyses of deeply sequenced (30×-60×) genomes of 38 Tibetan highlanders and 39 Han Chinese lowlanders, together with available data on archaic and modern humans, allow us to comprehensively characterize the ancestral makeup of Tibetans and uncover their origins. Non-modern human sequences compose ∼6% of the Tibetan gene pool and form unique haplotypes in some genomic regions, where Denisovan-like, Neanderthal-like, ancient-Siberian-like, and unknown ancestries are entangled and elevated. The shared ancestry of Tibetan-enriched sequences dates back to ∼62,000-38,000 years ago, predating the Last Glacial Maximum (LGM) and representing early colonization of the plateau. Nonetheless, most of the Tibetan gene pool is of modern human origin and diverged from that of Han Chinese ∼15,000 to ∼9,000 years ago, which can be largely attributed to post-LGM arrivals. Analysis of ∼200 contemporary populations showed that Tibetans share ancestry with populations from East Asia (∼82%), Central Asia and Siberia (∼11%), South Asia (∼6%), and western Eurasia and Oceania (∼1%). Our results support that Tibetans arose from a mixture of multiple ancestral gene pools but that their origins are much more complicated and ancient than previously suspected. We provide compelling evidence of the co-existence of Paleolithic and Neolithic ancestries in the Tibetan gene pool, indicating a genetic continuity between pre-historical highland-foragers and present-day Tibetans. In particular, highly differentiated sequences harbored in highlanders' genomes were most likely inherited from pre-LGM settlers of multiple ancestral origins (SUNDer) and maintained in high frequency by natural selection.
|
352
|
Genome-wide scans reveal variants at EDAR predominantly affecting hair straightness in Han Chinese and Uyghur populations. Hum Genet 2016; 135:1279-1286. [PMID: 27487801 DOI: 10.1007/s00439-016-1718-y] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 06/12/2016] [Accepted: 07/23/2016] [Indexed: 10/21/2022]
Abstract
Hair straightness/curliness is one of the most conspicuous features of human variation and is particularly diverse among populations. A recent genome-wide scan found common variants in the Trichohyalin (TCHH) gene that are associated with hair straightness in Europeans, but different genes might affect this phenotype in other populations. By sampling 2899 Han Chinese, we performed the first genome-wide scan of hair straightness in East Asians, and found EDAR (rs3827760) as the predominant gene (P = 4.67 × 10⁻¹⁶), accounting for 3.66% of the total variance. The candidate gene approach did not find further significant associations, suggesting that hair straightness may be affected by a large number of genes with subtle effects. Notably, genetic variants associated with hair straightness in Europeans are generally low in frequency in Han Chinese, and vice versa. To evaluate the relative contribution of these variants, we performed a second genome-wide scan in 709 samples from the Uyghur, an admixed population with both eastern and western Eurasian ancestries. In Uyghurs, both EDAR (rs3827760: P = 1.92 × 10⁻¹²) and TCHH (rs11803731: P = 1.46 × 10⁻³) are associated with hair straightness, but EDAR (OR 0.415) has a greater effect than TCHH (OR 0.575). We found no significant interaction between EDAR and TCHH (P = 0.645), suggesting that these two genes affect hair straightness through different mechanisms. Furthermore, haplotype analysis indicates that TCHH is not subject to selection. While EDAR is under strong selection in East Asia, it does not appear to be subject to selection after the admixture in Uyghurs. These results suggest that hair straightness is unlikely to be a trait under selection.
|
353
|
Morris DL, Sheng Y, Zhang Y, Wang YF, Zhu Z, Tombleson P, Chen L, Cunninghame Graham DS, Bentham J, Roberts AL, Chen R, Zuo X, Wang T, Wen L, Yang C, Liu L, Yang L, Li F, Huang Y, Yin X, Yang S, Rönnblom L, Fürnrohr BG, Voll RE, Schett G, Costedoat-Chalumeau N, Gaffney PM, Lau YL, Zhang X, Yang W, Cui Y, Vyse TJ. Genome-wide association meta-analysis in Chinese and European individuals identifies ten new loci associated with systemic lupus erythematosus. Nat Genet 2016; 48:940-946. [PMID: 27399966 PMCID: PMC4966635 DOI: 10.1038/ng.3603] [Citation(s) in RCA: 235] [Impact Index Per Article: 29.4] [Received: 12/16/2015] [Accepted: 06/01/2016] [Indexed: 12/14/2022]
Abstract
Systemic lupus erythematosus (SLE; OMIM 152700) is a genetically complex autoimmune disease. Genome-wide association studies (GWASs) have identified more than 50 loci as robustly associated with the disease in single ancestries, but genome-wide transancestral studies have not been conducted. We combined three GWAS data sets from Chinese (1,659 cases and 3,398 controls) and European (4,036 cases and 6,959 controls) populations. A meta-analysis of these studies showed that over half of the published SLE genetic associations are present in both populations. A replication study in Chinese (3,043 cases and 5,074 controls) and European (2,643 cases and 9,032 controls) subjects found ten previously unreported SLE loci. Our study provides further evidence that the majority of genetic risk polymorphisms for SLE are contained within the same regions across both populations. Furthermore, a comparison of risk allele frequencies and genetic risk scores suggested that the increased prevalence of SLE in non-Europeans (including Asians) has a genetic basis.
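As a rough illustration of how per-study GWAS estimates can be pooled, the sketch below applies a standard inverse-variance fixed-effect meta-analysis to one variant. This is a generic textbook method and is not claimed to be the procedure used in the study above; the function name and the example effect sizes are hypothetical.

```python
import math
from typing import Sequence, Tuple


def fixed_effect_meta(betas: Sequence[float], ses: Sequence[float]) -> Tuple[float, float, float, float]:
    """Pool per-study log odds ratios with inverse-variance weights.

    Returns (pooled beta, pooled standard error, z score, two-sided p-value).
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and of equal length")
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    # Two-sided p-value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Hypothetical per-ancestry estimates for a single variant (illustrative only).
pooled_beta, pooled_se, z_score, p_value = fixed_effect_meta([0.2, 0.4], [0.1, 0.1])
```

With equal standard errors the pooled estimate is simply the mean of the two study estimates, and the pooled standard error shrinks by a factor of the square root of two.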
Affiliation(s)
- David L Morris
- Division of Genetics and Molecular Medicine, King's College London, London, UK
| | - Yujun Sheng
- Department of Dermatology, No. 1 Hospital, Anhui Medical University, Hefei, Anhui, China
- Key Laboratory of Dermatology, Ministry of Education, Anhui Medical University, Hefei, Anhui, China
- Department of Dermatology, China-Japan Friendship Hospital, Beijing, China
| | - Yan Zhang
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong
| | - Yong-Fei Wang
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong
| | - Zhengwei Zhu
- Department of Dermatology, No. 1 Hospital, Anhui Medical University, Hefei, Anhui, China
- Key Laboratory of Dermatology, Ministry of Education, Anhui Medical University, Hefei, Anhui, China
| | - Philip Tombleson
- Division of Genetics and Molecular Medicine, King's College London, London, UK
| | - Lingyan Chen
- Division of Genetics and Molecular Medicine, King's College London, London, UK
| | | | - James Bentham
- Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, London, UK
| | - Amy L Roberts
- Division of Genetics and Molecular Medicine, King's College London, London, UK
| | - Ruoyan Chen
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong
| | - Xianbo Zuo
- Department of Dermatology, No. 1 Hospital, Anhui Medical University, Hefei, Anhui, China
- Key Laboratory of Dermatology, Ministry of Education, Anhui Medical University, Hefei, Anhui, China
| | - Tingyou Wang
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong
| | - Leilei Wen
- Department of Dermatology, No. 1 Hospital, Anhui Medical University, Hefei, Anhui, China
- Key Laboratory of Dermatology, Ministry of Education, Anhui Medical University, Hefei, Anhui, China
| | - Chao Yang
- Department of Dermatology, No. 1 Hospital, Anhui Medical University, Hefei, Anhui, China
- Key Laboratory of Dermatology, Ministry of Education, Anhui Medical University, Hefei, Anhui, China
| | - Lu Liu
- Department of Dermatology, No. 1 Hospital, Anhui Medical University, Hefei, Anhui, China
- Key Laboratory of Dermatology, Ministry of Education, Anhui Medical University, Hefei, Anhui, China
| | - Lulu Yang
- Department of Dermatology, No. 1 Hospital, Anhui Medical University, Hefei, Anhui, China
- Key Laboratory of Dermatology, Ministry of Education, Anhui Medical University, Hefei, Anhui, China
| | - Feng Li
- Department of Dermatology, No. 1 Hospital, Anhui Medical University, Hefei, Anhui, China
- Key Laboratory of Dermatology, Ministry of Education, Anhui Medical University, Hefei, Anhui, China
| | - Yuanbo Huang
- Department of Dermatology, No. 1 Hospital, Anhui Medical University, Hefei, Anhui, China
- Key Laboratory of Dermatology, Ministry of Education, Anhui Medical University, Hefei, Anhui, China
| | - Xianyong Yin
- Department of Dermatology, No. 1 Hospital, Anhui Medical University, Hefei, Anhui, China
- Key Laboratory of Dermatology, Ministry of Education, Anhui Medical University, Hefei, Anhui, China
| | - Sen Yang
- Department of Dermatology, No. 1 Hospital, Anhui Medical University, Hefei, Anhui, China
- Key Laboratory of Dermatology, Ministry of Education, Anhui Medical University, Hefei, Anhui, China
| | - Lars Rönnblom
- Department of Medical Sciences, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Barbara G Fürnrohr
- Department of Internal Medicine 3, University of Erlangen-Nuremberg, Erlangen, Germany
- Institute for Clinical Immunology, University of Erlangen-Nuremberg, Erlangen, Germany
- Division of Genetic Epidemiology, Medical University Innsbruck, Innsbruck, Austria
- Division of Biological Chemistry, Medical University Innsbruck, Innsbruck, Austria
| | - Reinhard E Voll
- Department of Internal Medicine 3, University of Erlangen-Nuremberg, Erlangen, Germany
- Institute for Clinical Immunology, University of Erlangen-Nuremberg, Erlangen, Germany
- Department of Rheumatology, University Hospital Freiburg, Freiburg, Germany
- Department of Rheumatology and Clinical Immunology, University Hospital Freiburg, Freiburg, Germany
- Centre for Chronic Immunodeficiency, University Hospital Freiburg, Freiburg, Germany
| | - Georg Schett
- Department of Internal Medicine 3, University of Erlangen-Nuremberg, Erlangen, Germany
- Institute for Clinical Immunology, University of Erlangen-Nuremberg, Erlangen, Germany
| | - Nathalie Costedoat-Chalumeau
- AP-HP, Hôpital Cochin, Centre de référence maladies auto-immunes et systémiques rares, Paris, France
- Université Paris Descartes-Sorbonne Paris Cité, Paris, France
| | - Patrick M Gaffney
- Arthritis and Clinical Immunology Program, Oklahoma Medical Research Foundation, Oklahoma City, Oklahoma, USA
| | - Yu Lung Lau
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong
- The University of Hong Kong Shenzhen Hospital, Shenzhen, China
| | - Xuejun Zhang
- Department of Dermatology, No. 1 Hospital, Anhui Medical University, Hefei, Anhui, China
- Key Laboratory of Dermatology, Ministry of Education, Anhui Medical University, Hefei, Anhui, China
- Department of Dermatology, Huashan Hospital of Fudan University, Shanghai, China
| | - Wanling Yang
- Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong
| | - Yong Cui
- Department of Dermatology, No. 1 Hospital, Anhui Medical University, Hefei, Anhui, China
- Key Laboratory of Dermatology, Ministry of Education, Anhui Medical University, Hefei, Anhui, China
- Department of Dermatology, China-Japan Friendship Hospital, Beijing, China
| | - Timothy J Vyse
- Division of Genetics and Molecular Medicine, King's College London, London, UK
- Division of Immunology, Infection and Inflammatory Disease, King's College London, London, UK
|
354
|
A thrifty variant in CREBRF strongly influences body mass index in Samoans. Nat Genet 2016; 48:1049-1054. [PMID: 27455349 PMCID: PMC5069069 DOI: 10.1038/ng.3620] [Citation(s) in RCA: 157] [Impact Index Per Article: 19.6] [Received: 12/07/2015] [Accepted: 06/15/2016] [Indexed: 12/14/2022]
Abstract
Samoans are a unique founder population with a high prevalence of obesity, making them well suited for identifying new genetic contributors to obesity. We conducted a genome-wide association study (GWAS) in 3,072 Samoans, discovered a variant, rs12513649, strongly associated with body mass index (BMI) (P = 5.3 × 10⁻¹⁴), and replicated the association in 2,102 additional Samoans (P = 1.2 × 10⁻⁹). Targeted sequencing identified a strongly associated missense variant, rs373863828 (p.Arg457Gln), in CREBRF (meta P = 1.4 × 10⁻²⁰). Although this variant is extremely rare in other populations, it is common in Samoans (frequency of 0.259), with an effect size much larger than that of any other known common BMI risk variant (1.36-1.45 kg/m² per copy of the risk-associated allele). In comparison to wild-type CREBRF, the Arg457Gln variant when overexpressed selectively decreased energy use and increased fat storage in an adipocyte cell model. These data, in combination with evidence of positive selection of the allele encoding p.Arg457Gln, support a 'thrifty' variant hypothesis as a factor in human obesity.
|
355
|
Schulz CA, Christensson A, Ericson U, Almgren P, Hindy G, Nilsson PM, Struck J, Bergmann A, Melander O, Orho-Melander M. High Level of Fasting Plasma Proenkephalin-A Predicts Deterioration of Kidney Function and Incidence of CKD. J Am Soc Nephrol 2016; 28:291-303. [PMID: 27401687 DOI: 10.1681/asn.2015101177] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 10/28/2015] [Accepted: 05/20/2016] [Indexed: 11/03/2022]
Abstract
High levels of proenkephalin-A (pro-ENK) have been associated with decreased eGFR in an acute setting. Here, we examined whether pro-ENK levels predict CKD and decline of renal function in a prospective cohort of 2568 participants without CKD (eGFR>60 ml/min per 1.73 m²) at baseline. During a mean follow-up of 16.6 years, 31.7% of participants developed CKD. Participants with baseline pro-ENK levels in the highest tertile had significantly greater yearly mean decline of eGFR (Ptrend<0.001) and rise of cystatin C (Ptrend=0.01) and creatinine (Ptrend<0.001) levels. Furthermore, compared with participants in the lowest tertile, participants in the highest tertile of baseline pro-ENK concentration had increased CKD incidence (odds ratio, 1.51; 95% confidence interval, 1.18 to 1.94) when adjusted for multiple factors. Adding pro-ENK to a model of conventional risk factors in net reclassification improvement analysis resulted in reclassification of 14.14% of participants. Genome-wide association analysis in 4150 participants of the same cohort revealed the strongest association of pro-ENK levels with rs1012178 near the PENK gene, where the minor T-allele associated with a 0.057 pmol/L higher pro-ENK level per allele (P=4.67×10⁻²¹). Furthermore, the T-allele associated with a 19% increased risk of CKD per allele (P=0.03) and a significant decrease in the instrumental variable estimator for eGFR (P<0.01) in a Mendelian randomization analysis. In conclusion, circulating plasma pro-ENK level predicts incident CKD and may aid in identifying subjects in need of primary preventive regimens. Additionally, the Mendelian randomization analysis suggests a causal relationship between pro-ENK level and deterioration of kidney function over time.
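For readers unfamiliar with how an odds ratio and its 95% confidence interval are formed, the sketch below computes an unadjusted odds ratio with a Wald interval from a 2×2 table. It is a simplification: the estimate reported above was adjusted for multiple factors, presumably in a regression model, and the counts used here are hypothetical and not taken from the study.

```python
import math
from typing import Tuple


def odds_ratio_wald_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed with outcome,   b: exposed without outcome
    c: unexposed with outcome, d: unexposed without outcome
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (highest vs lowest tertile), for illustration only.
or_value, ci_low, ci_high = odds_ratio_wald_ci(20, 80, 10, 90)
```

An interval that excludes 1, as in the abstract's 1.18 to 1.94, corresponds to a statistically significant association at the chosen level.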
Affiliation(s)
- Christina-Alexandra Schulz
- Department of Clinical Sciences, University Hospital Malmo Clinical Research Center, Lund University, Malmo, Sweden
| | - Anders Christensson
- Department of Clinical Sciences, University Hospital Malmo Clinical Research Center, Lund University, Malmo, Sweden
| | - Ulrika Ericson
- Department of Clinical Sciences, University Hospital Malmo Clinical Research Center, Lund University, Malmo, Sweden
| | - Peter Almgren
- Department of Clinical Sciences, University Hospital Malmo Clinical Research Center, Lund University, Malmo, Sweden
| | - George Hindy
- Department of Clinical Sciences, University Hospital Malmo Clinical Research Center, Lund University, Malmo, Sweden
| | - Peter M Nilsson
- Department of Clinical Sciences, University Hospital Malmo Clinical Research Center, Lund University, Malmo, Sweden
| | | | - Andreas Bergmann
- Sphingotec GmbH, Hennigsdorf, Germany; Waltraut Bergmann Foundation, Hohen Neuendorf, Germany
| | - Olle Melander
- Department of Clinical Sciences, University Hospital Malmo Clinical Research Center, Lund University, Malmo, Sweden
| | - Marju Orho-Melander
- Department of Clinical Sciences, University Hospital Malmo Clinical Research Center, Lund University, Malmo, Sweden
|
356
|
Alston C, Compton A, Formosa L, Strecker V, Oláhová M, Haack T, Smet J, Stouffs K, Diakumis P, Ciara E, Cassiman D, Romain N, Yarham J, He L, De Paepe B, Vanlander A, Seneca S, Feichtinger R, Płoski R, Rokicki D, Pronicka E, Haller R, Van Hove J, Bahlo M, Mayr J, Van Coster R, Prokisch H, Wittig I, Ryan M, Thorburn D, Taylor R. Biallelic Mutations in TMEM126B Cause Severe Complex I Deficiency with a Variable Clinical Phenotype. Am J Hum Genet 2016; 99:217-27. [PMID: 27374774 PMCID: PMC5005451 DOI: 10.1016/j.ajhg.2016.05.021] [Citation(s) in RCA: 49] [Impact Index Per Article: 6.1] [Received: 03/04/2016] [Accepted: 05/18/2016] [Indexed: 11/22/2022]
Abstract
Complex I deficiency is the most common biochemical phenotype observed in individuals with mitochondrial disease. With 44 structural subunits and over 10 assembly factors, it is unsurprising that complex I deficiency is associated with clinical and genetic heterogeneity. Massively parallel sequencing (MPS) technologies including custom, targeted gene panels or unbiased whole-exome sequencing (WES) are hugely powerful in identifying the underlying genetic defect in a clinical diagnostic setting, yet many individuals remain without a genetic diagnosis. These individuals might harbor mutations in poorly understood or uncharacterized genes, and their diagnosis relies upon characterization of these orphan genes. Complexome profiling recently identified TMEM126B as a component of the mitochondrial complex I assembly complex alongside proteins ACAD9, ECSIT, NDUFAF1, and TIMMDC1. Here, we describe the clinical, biochemical, and molecular findings in six cases of mitochondrial disease from four unrelated families affected by biallelic (c.635G>T [p.Gly212Val] and/or c.401delA [p.Asn134Ilefs*2]) TMEM126B variants. We provide functional evidence to support the pathogenicity of these TMEM126B variants, including evidence of founder effects for both variants, and establish defects within this gene as a cause of complex I deficiency in association with either pure myopathy in adulthood or, in one individual, a severe multisystem presentation (chronic renal failure and cardiomyopathy) in infancy. Functional experimentation including viral rescue and complexome profiling of subject cell lines has confirmed TMEM126B as the tenth complex I assembly factor associated with human disease and validates the importance of both genome-wide sequencing and proteomic approaches in characterizing disease-associated genes whose physiological roles have been previously undetermined.
|
357
|
Staples J, Witherspoon D, Jorde L, Nickerson D, Below J, Huff C, Huff CD. PADRE: Pedigree-Aware Distant-Relationship Estimation. Am J Hum Genet 2016; 99:154-62. [PMID: 27374771 DOI: 10.1016/j.ajhg.2016.05.020] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 01/21/2016] [Accepted: 05/16/2016] [Indexed: 10/21/2022]
Abstract
Accurate estimation of shared ancestry is an important component of many genetic studies; current prediction tools accurately estimate pairwise genetic relationships up to the ninth degree. Pedigree-aware distant-relationship estimation (PADRE) combines relationship likelihoods generated by estimation of recent shared ancestry (ERSA) with likelihoods from family networks reconstructed by pedigree reconstruction and identification of a maximum unrelated set (PRIMUS), improving the power to detect distant relationships between pedigrees. Using PADRE, we estimated relationships from simulated pedigrees and three extended pedigrees, correctly predicting 20% more fourth- through ninth-degree simulated relationships than when using ERSA alone. By leveraging pedigree information, PADRE can even identify genealogical relationships between individuals who are genetically unrelated. For example, although 95% of 13th-degree relatives are genetically unrelated, in simulations, PADRE correctly predicted 50% of 13th-degree relationships to within one degree of relatedness. The improvement in prediction accuracy was consistent between simulated and actual pedigrees. We also applied PADRE to the HapMap3 CEU samples and report new cryptic relationships and validation of previously described relationships between families. PADRE greatly expands the range of relationships that can be estimated by using genetic data in pedigrees.
Affiliation(s)
- Chad D Huff
- Department of Epidemiology, The University of Texas M.D. Anderson Cancer Center, Houston, TX 77030, USA
|
358
|
Loh PR, Palamara PF, Price AL. Fast and accurate long-range phasing in a UK Biobank cohort. Nat Genet 2016; 48:811-6. [PMID: 27270109 PMCID: PMC4925291 DOI: 10.1038/ng.3571] [Citation(s) in RCA: 221] [Impact Index Per Article: 27.6] [Received: 09/03/2015] [Accepted: 04/22/2016] [Indexed: 01/01/2023]
Abstract
Recent work has leveraged the extensive genotyping of the Icelandic population to perform long-range phasing (LRP), enabling accurate imputation and association analysis of rare variants in target samples typed on genotyping arrays. Here we develop a fast and accurate LRP method, Eagle, that extends this paradigm to populations with much smaller proportions of genotyped samples by harnessing long (>4-cM) identical-by-descent (IBD) tracts shared among distantly related individuals. We applied Eagle to N ≈ 150,000 samples (0.2% of the British population) from the UK Biobank, and we determined that it is 1-2 orders of magnitude faster than existing methods while achieving similar or better phasing accuracy (switch error rate ≈ 0.3%, corresponding to perfect phase in a majority of 10-Mb segments). We also observed that, when used within an imputation pipeline, Eagle prephasing improved downstream imputation accuracy in comparison to prephasing in batches using existing methods, as necessary to achieve comparable computational cost.
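The switch error rate quoted above is a common measure of phasing accuracy. The sketch below shows one conventional way to compute it for a single individual: count how often the inferred phase flips relative to the truth between consecutive heterozygous sites. This is an illustrative definition, not code from Eagle, and the function name and toy genotypes are hypothetical.

```python
from typing import List, Sequence, Tuple

Site = Tuple[int, int]  # (allele on first haplotype, allele on second haplotype)


def switch_error_rate(truth: Sequence[Site], inferred: Sequence[Site]) -> float:
    """Fraction of consecutive heterozygous-site pairs whose relative phase is flipped."""
    if len(truth) != len(inferred):
        raise ValueError("truth and inferred must cover the same sites")
    orientation: List[bool] = []
    for t, i in zip(truth, inferred):
        if t[0] == t[1]:
            continue  # homozygous sites carry no phase information
        if sorted(t) != sorted(i):
            raise ValueError("genotypes differ between truth and inferred")
        # True when the inferred haplotype order matches the true order at this site.
        orientation.append(tuple(t) == tuple(i))
    if len(orientation) < 2:
        return 0.0
    # A switch occurs whenever the orientation changes between adjacent het sites.
    switches = sum(a != b for a, b in zip(orientation, orientation[1:]))
    return switches / (len(orientation) - 1)


# Toy example: four heterozygous sites with one phase switch after the second.
true_phase = [(0, 1), (1, 0), (0, 0), (0, 1), (1, 0)]
estimated = [(0, 1), (1, 0), (0, 0), (1, 0), (0, 1)]
rate = switch_error_rate(true_phase, estimated)
```

Swapping the two haplotypes everywhere is not an error under this definition, because only the relative phase between neighbouring heterozygous sites is scored.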
Affiliation(s)
- Po-Ru Loh
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
| | - Pier Francesco Palamara
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
| | - Alkes L Price
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
|
359
|
Utsunomiya YT, Milanesi M, Utsunomiya ATH, Ajmone-Marsan P, Garcia JF. GHap: an R package for genome-wide haplotyping. Bioinformatics 2016; 32:2861-2. [DOI: 10.1093/bioinformatics/btw356] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 02/12/2016] [Accepted: 05/31/2016] [Indexed: 11/13/2022]
|
360
|
Abstract
The UK Biobank (UKB) has recently released genotypes on 152,328 individuals together with extensive phenotypic and lifestyle information. We present a new phasing method, SHAPEIT3, that can handle such biobank-scale datasets and results in switch error rates as low as ~0.3%. The method exhibits O(N log N) scaling in sample size (N), enabling fast and accurate phasing of even larger cohorts.
|
361
|
Adhikari K, Fuentes-Guajardo M, Quinto-Sánchez M, Mendoza-Revilla J, Camilo Chacón-Duque J, Acuña-Alonzo V, Jaramillo C, Arias W, Lozano RB, Pérez GM, Gómez-Valdés J, Villamil-Ramírez H, Hunemeier T, Ramallo V, Silva de Cerqueira CC, Hurtado M, Villegas V, Granja V, Gallo C, Poletti G, Schuler-Faccini L, Salzano FM, Bortolini MC, Canizales-Quinteros S, Cheeseman M, Rosique J, Bedoya G, Rothhammer F, Headon D, González-José R, Balding D, Ruiz-Linares A. A genome-wide association scan implicates DCHS2, RUNX2, GLI3, PAX1 and EDAR in human facial variation. Nat Commun 2016; 7:11616. [PMID: 27193062 PMCID: PMC4874031 DOI: 10.1038/ncomms11616] [Citation(s) in RCA: 116] [Impact Index Per Article: 14.5] [Received: 07/03/2015] [Accepted: 04/14/2016] [Indexed: 12/28/2022]
Abstract
We report a genome-wide association scan for facial features in ∼6,000 Latin Americans. We evaluated 14 traits on an ordinal scale and found significant association (P values<5 × 10⁻⁸) at single-nucleotide polymorphisms (SNPs) in four genomic regions for three nose-related traits: columella inclination (4q31), nose bridge breadth (6p21) and nose wing breadth (7p13 and 20p11). In a subsample of ∼3,000 individuals we obtained quantitative traits related to 9 of the ordinal phenotypes and, also, a measure of nasion position. Quantitative analyses confirmed the ordinal-based associations, identified SNPs in 2q12 associated with chin protrusion, and replicated the reported association of nasion position with SNPs in PAX3. The strongest association in 2q12, 4q31, 6p21 and 7p13 was observed for SNPs in the EDAR, DCHS2, RUNX2 and GLI3 genes, respectively. Associated SNPs in 20p11 extend to PAX1. Consistent with the effect of EDAR on chin protrusion, we documented alterations of mandible length in mice with modified Edar function. Humans show great diversity in facial appearance and this variation is highly heritable. Here, Andres Ruiz-Linares and colleagues examine facial features in admixed Latin Americans and identify genome-wide associations for 14 facial traits, including four gene loci (RUNX2, GLI3, DCHS2 and PAX1) influencing nose morphology.
Affiliation(s)
- Kaustubh Adhikari
- Department of Genetics, Evolution and Environment, UCL Genetics Institute, University College London, London WC1E 6BT, UK
| | - Macarena Fuentes-Guajardo
- Department of Genetics, Evolution and Environment, UCL Genetics Institute, University College London, London WC1E 6BT, UK; Departamento de Tecnología Médica, Facultad de Ciencias de la Salud, Universidad de Tarapacá, Arica 1000009, Chile
| | - Mirsha Quinto-Sánchez
- Centro Nacional Patagónico, CONICET, Unidad de Diversidad, Sistematica y Evolucion, Puerto Madryn U912OACD, Argentina
| | - Javier Mendoza-Revilla
- Department of Genetics, Evolution and Environment, UCL Genetics Institute, University College London, London WC1E 6BT, UK; Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima 31, Perú
| | - Juan Camilo Chacón-Duque
- Department of Genetics, Evolution and Environment, UCL Genetics Institute, University College London, London WC1E 6BT, UK
| | - Victor Acuña-Alonzo
- Department of Genetics, Evolution and Environment, UCL Genetics Institute, University College London, London WC1E 6BT, UK; Laboratorio de Genética Molecular, Escuela Nacional de Antropologia e Historia, México City 14030, México
| | - Claudia Jaramillo
- GENMOL (Genética Molecular), Universidad de Antioquia, Medellín 5001000, Colombia
| | - William Arias
- GENMOL (Genética Molecular), Universidad de Antioquia, Medellín 5001000, Colombia
| | - Rodrigo Barquera Lozano
- Laboratorio de Genética Molecular, Escuela Nacional de Antropologia e Historia, México City 14030, México; Unidad de Genómica de Poblaciones Aplicada a la Salud, Facultad de Química, UNAM-Instituto Nacional de Medicina Genómica, México City 4510, México
| | - Gastón Macín Pérez
- Laboratorio de Genética Molecular, Escuela Nacional de Antropologia e Historia, México City 14030, México; Unidad de Genómica de Poblaciones Aplicada a la Salud, Facultad de Química, UNAM-Instituto Nacional de Medicina Genómica, México City 4510, México
| | - Jorge Gómez-Valdés
- Departamento de Anatomía, Facultad de Medicina, Universidad Nacional Autónoma de México (UNAM), México City 04510, México
| | - Hugo Villamil-Ramírez
- Unidad de Genómica de Poblaciones Aplicada a la Salud, Facultad de Química, UNAM-Instituto Nacional de Medicina Genómica, México City 4510, México
| | - Tábita Hunemeier
- Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brasil
| | - Virginia Ramallo
- Centro Nacional Patagónico, CONICET, Unidad de Diversidad, Sistematica y Evolucion, Puerto Madryn U912OACD, Argentina; Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brasil
| | - Caio C Silva de Cerqueira
- Centro Nacional Patagónico, CONICET, Unidad de Diversidad, Sistematica y Evolucion, Puerto Madryn U912OACD, Argentina; Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brasil
| | - Malena Hurtado
- Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima 31, Perú
| | - Valeria Villegas
- Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima 31, Perú
| | - Vanessa Granja
- Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima 31, Perú
| | - Carla Gallo
- Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima 31, Perú
| | - Giovanni Poletti
- Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima 31, Perú
| | - Lavinia Schuler-Faccini
- Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brasil
| | - Francisco M Salzano
- Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brasil
| | - Maria-Cátira Bortolini
- Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brasil
| | - Samuel Canizales-Quinteros
- Unidad de Genómica de Poblaciones Aplicada a la Salud, Facultad de Química, UNAM-Instituto Nacional de Medicina Genómica, México City 4510, México
| | - Michael Cheeseman
- Division of Developmental Biology, The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian EH25 9RG, UK
| | - Javier Rosique
- Departamento de Antropología, Universidad de Antioquia, Medellín 5001000, Colombia
| | - Gabriel Bedoya
- GENMOL (Genética Molecular), Universidad de Antioquia, Medellín 5001000, Colombia
| | | | - Denis Headon
- Division of Developmental Biology, The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian EH25 9RG, UK
| | - Rolando González-José
- Centro Nacional Patagónico, CONICET, Unidad de Diversidad, Sistemática y Evolución, Puerto Madryn U912OACD, Argentina
| | - David Balding
- Department of Genetics, Evolution and Environment, UCL Genetics Institute, University College London, London WC1E 6BT, UK
- Schools of BioSciences and Mathematics and Statistics, University of Melbourne, Melbourne, Victoria 3010, Australia
| | - Andrés Ruiz-Linares
- Department of Genetics, Evolution and Environment, UCL Genetics Institute, University College London, London WC1E 6BT, UK
|
362
|
Cook JP, Morris AP. Multi-ethnic genome-wide association study identifies novel locus for type 2 diabetes susceptibility. Eur J Hum Genet 2016; 24:1175-80. [PMID: 27189021 PMCID: PMC4947384 DOI: 10.1038/ejhg.2016.17] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.8] [Received: 09/11/2015] [Revised: 12/21/2015] [Accepted: 02/01/2016] [Indexed: 12/16/2022] Open
Abstract
Genome-wide association studies (GWAS) have traditionally been undertaken in homogeneous populations from the same ancestry group. However, with the increasing availability of GWAS in large-scale multi-ethnic cohorts, we have evaluated a framework for detecting association of genetic variants with complex traits, allowing for population structure, and developed a powerful test of heterogeneity in allelic effects between ancestry groups. We have applied the methodology to identify and characterise loci associated with susceptibility to type 2 diabetes (T2D) using GWAS data from the Resource for Genetic Epidemiology on Adult Health and Aging, a large multi-ethnic population-based cohort, created for investigating the genetic and environmental basis of age-related diseases. We identified a novel locus for T2D susceptibility at genome-wide significance (P < 5 × 10⁻⁸) that maps to TOMM40-APOE, a region previously implicated in lipid metabolism and Alzheimer's disease. We have also confirmed previous reports that single-nucleotide polymorphisms at the TCF7L2 locus demonstrate the greatest extent of heterogeneity in allelic effects between ethnic groups, with the lowest risk observed in populations of East Asian ancestry.
Affiliation(s)
- James P Cook
- Department of Biostatistics, University of Liverpool, Liverpool, UK
| | - Andrew P Morris
- Department of Biostatistics, University of Liverpool, Liverpool, UK
- Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK
|
363
|
Multiallelic copy number variation in the complement component 4A (C4A) gene is associated with late-stage age-related macular degeneration (AMD). J Neuroinflammation 2016; 13:81. [PMID: 27090374 PMCID: PMC4835888 DOI: 10.1186/s12974-016-0548-0] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 01/15/2016] [Accepted: 04/11/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Age-related macular degeneration (AMD) is the leading cause of vision loss in Western societies with a strong genetic component. Candidate gene studies as well as genome-wide association studies strongly implicated genetic variations in complement genes to be involved in disease risk. So far, no association of AMD with complement component 4 (C4) has been reported, probably due to the complex nature of the C4 locus on chromosome 6. METHODS We used multiplex ligation-dependent probe amplification (MLPA) to determine the copy number of the C4 gene as well as of both relevant isoforms, C4A and C4B, and assessed their association with AMD using logistic regression models. RESULTS Here, we report on the analysis of 2645 individuals (1536 probands and 1109 unaffected controls), across three different centers, for multiallelic copy number variation (CNV) at the C4 locus. We find strong statistical significance for association of increased copy number of C4A (OR 0.81 (0.73; 0.89); P = 4.4 × 10⁻⁵), with the effect most pronounced in individuals over 78 years (OR 0.67 (0.55; 0.81)) and females (OR 0.77 (0.68; 0.87)). Furthermore, this association is independent of known AMD-associated risk variants in the nearby CFB/C2 locus, particularly in females and in individuals over 78 years. CONCLUSIONS Our data strengthen the notion that complement dysregulation plays a crucial role in AMD etiology, an important finding for early intervention strategies and future therapeutics. In addition, for the first time, we provide evidence that multiallelic CNVs are associated with AMD pathology.
|
364
|
Transcript Isoform Variation Associated with Cytosine Modification in Human Lymphoblastoid Cell Lines. Genetics 2016; 203:985-95. [PMID: 27029734 DOI: 10.1534/genetics.115.185504] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 12/04/2015] [Accepted: 03/27/2016] [Indexed: 11/18/2022] Open
Abstract
Cytosine modification on DNA is variable among individuals, which could correlate with gene expression variation. The effect of cytosine modification on interindividual transcript isoform variation (TIV), however, remains unclear. In this study, we assessed the extent of cytosine modification-specific TIV in lymphoblastoid cell lines (LCLs) derived from unrelated individuals of European and African descent. Our study detected cytosine modification-specific TIVs for 17% of the analyzed genes at a 5% false discovery rate. Forty-five percent of the TIV-associated cytosine modifications correlated with the overall gene expression levels as well, with the corresponding CpG sites overrepresented in transcript initiation sites, transcription factor binding sites, and distinct histone modification peaks, suggesting that alternative isoform transcription underlies the TIVs. Our analysis also revealed 33% of the TIV-associated cytosine modifications that affected specific exons, with the corresponding CpG sites overrepresented in exon/intron junctions, splicing branching points, and transcript termination sites, implying that the TIVs are attributable to alternative splicing or transcription termination. Genetic and epigenetic regulation of TIV shared target preference but exerted independent effects on 61% of the common exon targets. Cytosine modification-specific TIVs detected from LCLs were differentially enriched in those detected from various tissues in The Cancer Genome Atlas, indicating their developmental dependency. Genes containing cytosine modification-specific TIVs were enriched in pathways of cancers and metabolic disorders. Our study demonstrated a prominent effect of cytosine modification variation on the transcript isoform spectrum over gross transcript abundance and revealed epigenetic contributions to diseases that were mediated through cytosine modification-specific TIV.
|
365
|
Bukowicki M, Franssen SU, Schlötterer C. High rates of phasing errors in highly polymorphic species with low levels of linkage disequilibrium. Mol Ecol Resour 2016; 16:874-82. [PMID: 26929272 DOI: 10.1111/1755-0998.12516] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 10/20/2015] [Revised: 01/27/2016] [Accepted: 02/08/2016] [Indexed: 12/01/2022]
Abstract
Short read sequencing of diploid individuals does not permit the direct inference of the sequence on each of the two homologous chromosomes. Although various phasing software packages exist, they were primarily tailored for and tested on human data, which differ from other species in factors that influence phasing, such as SNP density, amounts of linkage disequilibrium (LD) and sample sizes. Despite becoming increasingly popular for other species, the reliability of phasing in non-human data has not been evaluated to a sufficient extent. We scrutinized the phasing accuracy for Drosophila melanogaster, a species with high polymorphism levels and reduced LD relative to humans. We phased two D. melanogaster populations and compared the results to the known haplotypes. The performance increased with size of the reference panel and was highest when the reference panel and phased individuals were from the same population. Full genomic SNP data and inclusion of sequence read information also improved phasing. Despite humans and Drosophila having similar switch error rates between polymorphic sites, the distances between switch errors were much shorter in Drosophila with only fragments <300-1500 bp being correctly phased with ≥95% confidence. This suggests that the higher SNP density cannot compensate for the higher recombination rate in D. melanogaster. Furthermore, we show that populations that have gone through demographic events such as bottlenecks can be phased with higher accuracy. Our results highlight that statistically phased data are particularly error prone in species with large population sizes or populations lacking suitable reference panels.
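The switch error rate mentioned in this abstract can be illustrated with a minimal sketch. It assumes a common textbook definition (the fraction of adjacent heterozygous-site pairs whose relative phase is wrong) and is not taken from the authors' own pipeline; the function name `switch_errors` and the toy haplotypes are hypothetical.

```python
def switch_errors(true_hap, inferred_hap):
    """Return (switch error rate, positions of switch errors).

    Both arguments list the 0/1 alleles carried by ONE haplotype of a
    diploid individual, restricted to heterozygous sites in genomic order.
    Assumption: a switch error is counted between two adjacent sites when
    the inferred haplotype flips from matching the true haplotype to
    matching its complement, or the other way round.
    """
    if len(true_hap) != len(inferred_hap):
        raise ValueError("haplotypes must cover the same sites")
    if len(true_hap) < 2:
        return 0.0, []
    # True where the inferred allele agrees with the true haplotype,
    # False where it agrees with the complementary haplotype.
    agrees = [t == i for t, i in zip(true_hap, inferred_hap)]
    switches = [k for k in range(1, len(agrees)) if agrees[k] != agrees[k - 1]]
    return len(switches) / (len(agrees) - 1), switches


# Toy example with six heterozygous sites (hypothetical data).
truth = [0, 1, 1, 0, 1, 0]
phased = [0, 1, 0, 1, 0, 0]
rate, positions = switch_errors(truth, phased)
print(rate, positions)
```

The distances between consecutive entries of `positions`, once mapped back to base-pair coordinates, would give the lengths of correctly phased fragments that the abstract compares between humans and Drosophila.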
Affiliation(s)
- Marek Bukowicki
- Institut für Populationsgenetik, Vetmeduni Vienna, 1210 Wien, Veterinärplatz 1, Austria
| | - Susanne U Franssen
- Institut für Populationsgenetik, Vetmeduni Vienna, 1210 Wien, Veterinärplatz 1, Austria
| | - Christian Schlötterer
- Institut für Populationsgenetik, Vetmeduni Vienna, 1210 Wien, Veterinärplatz 1, Austria
|
366
|
Boitard S, Rodríguez W, Jay F, Mona S, Austerlitz F. Inferring Population Size History from Large Samples of Genome-Wide Molecular Data - An Approximate Bayesian Computation Approach. PLoS Genet 2016; 12:e1005877. [PMID: 26943927 PMCID: PMC4778914 DOI: 10.1371/journal.pgen.1005877] [Citation(s) in RCA: 102] [Impact Index Per Article: 12.8] [Received: 03/31/2015] [Accepted: 01/27/2016] [Indexed: 12/02/2022] Open
Abstract
Inferring the ancestral dynamics of effective population size is a long-standing question in population genetics, which can now be tackled much more accurately thanks to the massive genomic data available in many species. Several promising methods that take advantage of whole-genome sequences have been recently developed in this context. However, they can only be applied to rather small samples, which limits their ability to estimate recent population size history. Besides, they can be very sensitive to sequencing or phasing errors. Here we introduce a new approximate Bayesian computation approach named PopSizeABC that allows estimating the evolution of the effective population size through time, using a large sample of complete genomes. This sample is summarized using the folded allele frequency spectrum and the average zygotic linkage disequilibrium at different bins of physical distance, two classes of statistics that are widely used in population genetics and can be easily computed from unphased and unpolarized SNP data. Our approach provides accurate estimations of past population sizes, from the very first generations before present back to the expected time to the most recent common ancestor of the sample, as shown by simulations under a wide range of demographic scenarios. When applied to samples of 15 or 25 complete genomes in four cattle breeds (Angus, Fleckvieh, Holstein and Jersey), PopSizeABC revealed a series of population declines, related to historical events such as domestication or modern breed creation. We further highlight that our approach is robust to sequencing errors, provided summary statistics are computed from SNPs with common alleles.

Molecular data sampled from extant individuals contains considerable information about their demographic history. In particular, one classical question in population genetics is to reconstruct past population size changes from such data. Relating these changes to various climatic, geological or anthropogenic events allows characterizing the main factors driving genetic diversity and can have major outcomes for conservation. Until recently, mostly very simple histories, including one or two population size changes, could be estimated from genetic data. This has changed with the sequencing of entire genomes in many species, and several methods allow now inferring complex histories consisting of several tens of population size changes. However, analyzing entire genomes, while accounting for recombination, remains a statistical and numerical challenge. These methods, therefore, can only be applied to small samples with a few diploid genomes. We overcome this limitation by using an approximate estimation approach, where observed genomes are summarized using a small number of statistics related to allele frequencies and linkage disequilibrium. In contrast to previous approaches, we show that our method allows us to reconstruct also the most recent part (the last 100 generations) of the population size history. As an illustration, we apply it to large samples of whole-genome sequences in four cattle breeds.
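As a rough illustration of why the folded allele frequency spectrum needs neither phased nor polarized data, the sketch below computes it from unphased diploid genotype counts. It is a generic illustration, not the PopSizeABC implementation; the function name `folded_sfs` and the toy genotypes are hypothetical.

```python
from collections import Counter


def folded_sfs(genotypes):
    """Folded allele frequency spectrum from unphased diploid genotypes.

    genotypes: list of SNPs; each SNP is a list with one entry per
    individual giving the number of copies (0, 1 or 2) of an arbitrarily
    chosen allele. Returns a list whose entry k-1 is the number of SNPs
    with minor allele count k, for k = 1..n (n = number of individuals).
    """
    if not genotypes:
        return []
    n = len(genotypes[0])
    counts = Counter()
    for snp in genotypes:
        copies = sum(snp)
        # Folding: use the count of the rarer allele, so the result does
        # not depend on which allele was labelled as the counted one.
        minor = min(copies, 2 * n - copies)
        if minor > 0:  # skip sites that are monomorphic in the sample
            counts[minor] += 1
    return [counts[k] for k in range(1, n + 1)]


# Toy data: five sites genotyped in three diploid individuals.
toy = [[0, 1, 0], [2, 2, 1], [1, 1, 1], [0, 0, 0], [2, 1, 0]]
print(folded_sfs(toy))
```

Swapping the counted allele at any site (replacing each genotype g by 2 - g) leaves the output unchanged, which is the sense in which no ancestral-state information is required.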
Affiliation(s)
- Simon Boitard
- Institut de Systématique, Évolution, Biodiversité ISYEB - UMR 7205 - CNRS & MNHN & UPMC & EPHE, Ecole Pratique des Hautes Etudes, Sorbonne Universités, Paris, France
- GABI, INRA, AgroParisTech, Université Paris-Saclay, Jouy-en-Josas, France
| | - Willy Rodríguez
- UMR CNRS 5219, Institut de Mathématiques de Toulouse, Université de Toulouse, Toulouse, France
| | - Flora Jay
- UMR 7206 Eco-anthropologie et Ethnobiologie, Muséum National d’Histoire Naturelle, CNRS, Université Paris Diderot, Paris, France
- LRI, Paris-Sud University, CNRS UMR 8623, Orsay, France
| | - Stefano Mona
- Institut de Systématique, Évolution, Biodiversité ISYEB - UMR 7205 - CNRS & MNHN & UPMC & EPHE, Ecole Pratique des Hautes Etudes, Sorbonne Universités, Paris, France
| | - Frédéric Austerlitz
- UMR 7206 Eco-anthropologie et Ethnobiologie, Muséum National d’Histoire Naturelle, CNRS, Université Paris Diderot, Paris, France
|
367
|
Simonti CN, Vernot B, Bastarache L, Bottinger E, Carrell DS, Chisholm RL, Crosslin DR, Hebbring SJ, Jarvik GP, Kullo IJ, Li R, Pathak J, Ritchie MD, Roden DM, Verma SS, Tromp G, Prato JD, Bush WS, Akey JM, Denny JC, Capra JA. The phenotypic legacy of admixture between modern humans and Neandertals. Science 2016; 351:737-41. [PMID: 26912863 DOI: 10.1126/science.aad2149] [Citation(s) in RCA: 163] [Impact Index Per Article: 20.4] [Indexed: 12/12/2022]
Abstract
Many modern human genomes retain DNA inherited from interbreeding with archaic hominins, such as Neandertals, yet the influence of this admixture on human traits is largely unknown. We analyzed the contribution of common Neandertal variants to over 1000 electronic health record (EHR)-derived phenotypes in ~28,000 adults of European ancestry. We discovered and replicated associations of Neandertal alleles with neurological, psychiatric, immunological, and dermatological phenotypes. Neandertal alleles together explained a significant fraction of the variation in risk for depression and skin lesions resulting from sun exposure (actinic keratosis), and individual Neandertal alleles were significantly associated with specific human phenotypes, including hypercoagulation and tobacco use. Our results establish that archaic admixture influences disease risk in modern humans, provide hypotheses about the effects of hundreds of Neandertal haplotypes, and demonstrate the utility of EHR data in evolutionary analyses.
Affiliation(s)
- Corinne N Simonti
- Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, USA
| | - Benjamin Vernot
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Lisa Bastarache
- Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA
| | | | - David S Carrell
- Department of Medicine (Medical Genetics), University of Washington Medical Center, Seattle, WA, USA
| | - Rex L Chisholm
- Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - David R Crosslin
- Department of Genome Sciences, University of Washington, Seattle, WA, USA. Department of Medicine (Medical Genetics), University of Washington Medical Center, Seattle, WA, USA
| | - Scott J Hebbring
- Center for Human Genetics, Marshfield Clinic, Marshfield, WI, USA
| | - Gail P Jarvik
- Department of Genome Sciences, University of Washington, Seattle, WA, USA. Department of Medicine (Medical Genetics), University of Washington Medical Center, Seattle, WA, USA
| | - Iftikhar J Kullo
- Division of Cardiovascular Diseases, Mayo Clinic, Rochester, MN, USA
| | - Rongling Li
- Division of Genomic Medicine, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Jyotishman Pathak
- Division of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
| | - Marylyn D Ritchie
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, USA. Biomedical and Translational Informatics, Geisinger Health System, Danville, PA, USA
| | - Dan M Roden
- Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, USA. Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA. Department of Medicine, Vanderbilt University, Nashville, TN, USA. Department of Pharmacology, Vanderbilt University, Nashville, TN, USA
| | - Shefali S Verma
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, USA
| | - Gerard Tromp
- Weis Center for Research, Geisinger Health System, Danville, PA, USA. Division of Molecular Biology and Human Genetics, Department of Biomedical Sciences, Faculty of Health Science, Stellenbosch University, Tygerberg, South Africa
| | - Jeffrey D Prato
- Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA
| | - William S Bush
- Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH, USA
| | - Joshua M Akey
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Joshua C Denny
- Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, USA. Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA. Department of Medicine, Vanderbilt University, Nashville, TN, USA
| | - John A Capra
- Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, USA. Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, USA. Department of Biological Sciences, Vanderbilt University, Nashville, TN, USA. Center for Quantitative Sciences, Vanderbilt University, Nashville, TN, USA
|
368
|
DeLorenze GN, Nelson CL, Scott WK, Allen AS, Ray GT, Tsai AL, Quesenberry CP, Fowler VG. Polymorphisms in HLA Class II Genes Are Associated With Susceptibility to Staphylococcus aureus Infection in a White Population. J Infect Dis 2016; 213:816-23. [PMID: 26450422 PMCID: PMC4747615 DOI: 10.1093/infdis/jiv483] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Received: 07/21/2015] [Accepted: 09/30/2015] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND Staphylococcus aureus can cause life-threatening infections. Human susceptibility to S. aureus infection may be influenced by host genetic variation. METHODS A genome-wide association study (GWAS) in a large health plan-based cohort included biologic specimens from 4701 culture-confirmed S. aureus cases and 45 344 matched controls; 584 535 single-nucleotide polymorphisms (SNPs) were genotyped on an array specific to individuals of European ancestry. Coverage was increased by imputation of >25 million common SNPs, using the 1000 Genomes Reference panel. In addition, human leukocyte antigen (HLA) serotypes were also imputed. RESULTS Logistic regression analysis, performed under the assumption of an additive genetic model, revealed several imputed SNPs (eg, rs115231074: odds ratio [OR], 1.22 [P = 1.3 × 10⁻¹⁰]; rs35079132: OR, 1.24 [P = 3.8 × 10⁻⁸]) achieving genome-wide significance on chromosome 6 in the HLA class II region. One adjacent genotyped SNP was nearly genome-wide significant (rs4321864: OR, 1.13; P = 8.8 × 10⁻⁸). These polymorphisms are located near the genes encoding HLA-DRA and HLA-DRB1. Results of further logistic regression analysis, in which the most significant GWAS SNPs were conditioned on HLA-DRB1*04 serotype, showed additional support for the strength of association between HLA class II genetic variants and S. aureus infection. CONCLUSIONS Our study results are the first reported evidence of human genetic susceptibility to S. aureus infection.
Affiliation(s)
| | | | - William K Scott
- John P. Hussman Institute for Human Genomics Dr. John T. Macdonald Foundation Department of Human Genetics, University of Miami Miller School of Medicine, Florida
| | - Andrew S Allen
- Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, North Carolina
| | - G Thomas Ray
- Division of Research, Kaiser Permanente Northern California, Oakland
| | - Ai-Lin Tsai
- Division of Research, Kaiser Permanente Northern California, Oakland
| | | | - Vance G Fowler
- Duke Clinical Research Institute Division of Infectious Diseases, Duke University Medical Center
|
369
|
Adhikari K, Fontanil T, Cal S, Mendoza-Revilla J, Fuentes-Guajardo M, Chacón-Duque JC, Al-Saadi F, Johansson JA, Quinto-Sanchez M, Acuña-Alonzo V, Jaramillo C, Arias W, Barquera Lozano R, Macín Pérez G, Gómez-Valdés J, Villamil-Ramírez H, Hunemeier T, Ramallo V, Silva de Cerqueira CC, Hurtado M, Villegas V, Granja V, Gallo C, Poletti G, Schuler-Faccini L, Salzano FM, Bortolini MC, Canizales-Quinteros S, Rothhammer F, Bedoya G, Gonzalez-José R, Headon D, López-Otín C, Tobin DJ, Balding D, Ruiz-Linares A. A genome-wide association scan in admixed Latin Americans identifies loci influencing facial and scalp hair features. Nat Commun 2016; 7:10815. [PMID: 26926045 PMCID: PMC4773514 DOI: 10.1038/ncomms10815] [Citation(s) in RCA: 121] [Impact Index Per Article: 15.1] [Received: 07/13/2015] [Accepted: 01/25/2016] [Indexed: 12/20/2022] Open
Abstract
We report a genome-wide association scan in over 6,000 Latin Americans for features of scalp hair (shape, colour, greying, balding) and facial hair (beard thickness, monobrow, eyebrow thickness). We found 18 signals of association reaching genome-wide significance (P values 5 × 10⁻⁸ to 3 × 10⁻¹¹⁹), including 10 novel associations. These include novel loci for scalp hair shape and balding, and the first reported loci for hair greying, monobrow, eyebrow and beard thickness. A newly identified locus influencing hair shape includes a Q30R substitution in the Protease Serine S1 family member 53 (PRSS53). We demonstrate that this enzyme is highly expressed in the hair follicle, especially the inner root sheath, and that the Q30R substitution affects enzyme processing and secretion. The genome regions associated with hair features are enriched for signals of selection, consistent with proposals regarding the evolution of human hair.
Affiliation(s)
- Kaustubh Adhikari
- Department of Genetics, Evolution and Environment, and UCL Genetics Institute, University College London, London WC1E 6BT, UK
| | - Tania Fontanil
- Departamento de Bioquímica y Biología Molecular, IUOPA, Universidad de Oviedo, Oviedo 33006, Spain
| | - Santiago Cal
- Departamento de Bioquímica y Biología Molecular, IUOPA, Universidad de Oviedo, Oviedo 33006, Spain
| | - Javier Mendoza-Revilla
- Department of Genetics, Evolution and Environment, and UCL Genetics Institute, University College London, London WC1E 6BT, UK
- Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima, 31, Perú
| | - Macarena Fuentes-Guajardo
- Department of Genetics, Evolution and Environment, and UCL Genetics Institute, University College London, London WC1E 6BT, UK
- Departamento de Tecnología Médica, Facultad de Ciencias de la Salud, Universidad de Tarapacá, Arica 1000009, Chile
| | - Juan-Camilo Chacón-Duque
- Department of Genetics, Evolution and Environment, and UCL Genetics Institute, University College London, London WC1E 6BT, UK
| | - Farah Al-Saadi
- Department of Genetics, Evolution and Environment, and UCL Genetics Institute, University College London, London WC1E 6BT, UK
| | - Jeanette A. Johansson
- Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian EH25 9RG, UK
| | | | - Victor Acuña-Alonzo
- Department of Genetics, Evolution and Environment, and UCL Genetics Institute, University College London, London WC1E 6BT, UK
- National Institute of Anthropology and History, México 4510, México
| | - Claudia Jaramillo
- GENMOL (Genética Molecular), Universidad de Antioquia, Medellín 5001000, Colombia
| | - William Arias
- GENMOL (Genética Molecular), Universidad de Antioquia, Medellín 5001000, Colombia
| | - Rodrigo Barquera Lozano
- National Institute of Anthropology and History, México 4510, México
- Unidad de Genómica de Poblaciones Aplicada a la Salud, Facultad de Química, UNAM-Instituto Nacional de Medicina Genómica, México 4510, México
| | - Gastón Macín Pérez
- National Institute of Anthropology and History, México 4510, México
- Unidad de Genómica de Poblaciones Aplicada a la Salud, Facultad de Química, UNAM-Instituto Nacional de Medicina Genómica, México 4510, México
| | | | - Hugo Villamil-Ramírez
- Unidad de Genómica de Poblaciones Aplicada a la Salud, Facultad de Química, UNAM-Instituto Nacional de Medicina Genómica, México 4510, México
| | - Tábita Hunemeier
- Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brasil
| | - Virginia Ramallo
- Centro Nacional Patagónico, CONICET, Puerto Madryn U9129ACD, Argentina
- Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brasil
| | - Caio C. Silva de Cerqueira
- Centro Nacional Patagónico, CONICET, Puerto Madryn U9129ACD, Argentina
- Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brasil
| | - Malena Hurtado
- Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima, 31, Perú
| | - Valeria Villegas
- Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima, 31, Perú
| | - Vanessa Granja
- Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima, 31, Perú
| | - Carla Gallo
- Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima, 31, Perú
| | - Giovanni Poletti
- Laboratorios de Investigación y Desarrollo, Facultad de Ciencias y Filosofía, Universidad Peruana Cayetano Heredia, Lima, 31, Perú
| | - Lavinia Schuler-Faccini
- Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brasil
| | - Francisco M. Salzano
- Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brasil
| | - Maria-Cátira Bortolini
- Departamento de Genética, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brasil
| | - Samuel Canizales-Quinteros
- Unidad de Genómica de Poblaciones Aplicada a la Salud, Facultad de Química, UNAM-Instituto Nacional de Medicina Genómica, México 4510, México
| | | | - Gabriel Bedoya
- GENMOL (Genética Molecular), Universidad de Antioquia, Medellín 5001000, Colombia
| | | | - Denis Headon
- Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian EH25 9RG, UK
| | - Carlos López-Otín
- Departamento de Bioquímica y Biología Molecular, IUOPA, Universidad de Oviedo, Oviedo 33006, Spain
| | - Desmond J. Tobin
- Centre for Skin Sciences, Faculty of Life Sciences, University of Bradford, Bradford BD7 1DP, UK
| | - David Balding
- Department of Genetics, Evolution and Environment, and UCL Genetics Institute, University College London, London WC1E 6BT, UK
- Schools of BioSciences and Mathematics and Statistics, University of Melbourne, Melbourne 3010, Australia
| | - Andrés Ruiz-Linares
- Department of Genetics, Evolution and Environment, and UCL Genetics Institute, University College London, London WC1E 6BT, UK
|
370
|
Genetic variants near MLST8 and DHX57 affect the epigenetic age of the cerebellum. Nat Commun 2016; 7:10561. [PMID: 26830004 PMCID: PMC4740877 DOI: 10.1038/ncomms10561] [Citation(s) in RCA: 57] [Impact Index Per Article: 7.1] [Received: 07/17/2015] [Accepted: 12/29/2015] [Indexed: 12/17/2022] Open
Abstract
DNA methylation (DNAm) levels lend themselves for defining an epigenetic biomarker of aging known as the ‘epigenetic clock’. Our genome-wide association study (GWAS) of cerebellar epigenetic age acceleration identifies five significant (P < 5.0 × 10⁻⁸) SNPs in two loci: 2p22.1 (inside gene DHX57) and 16p13.3 near gene MLST8 (a subunit of mTOR complex 1 and 2). We find that the SNP in 16p13.3 has a cis-acting effect on the expression levels of MLST8 (P = 6.9 × 10⁻¹⁸) in most brain regions. In cerebellar samples, the SNP in 2p22.1 has a cis-effect on DHX57 (P = 4.4 × 10⁻⁵). Gene sets found by our GWAS analysis of cerebellar age acceleration exhibit significant overlap with those of Alzheimer's disease (P = 4.4 × 10⁻¹⁵), age-related macular degeneration (P = 6.4 × 10⁻⁶), and Parkinson's disease (P = 2.6 × 10⁻⁴). Overall, our results demonstrate the utility of a new paradigm for understanding aging and age-related diseases: it will be fruitful to use epigenetic tissue age as endophenotype in GWAS.

This genome-wide association study identifies five significant SNPs in two loci which are associated with the epigenetic age of post-mortem cerebellar tissue according to a DNA methylation based biomarker of human aging.
|
371
|
Levine ME, Lu AT, Bennett DA, Horvath S. Epigenetic age of the pre-frontal cortex is associated with neuritic plaques, amyloid load, and Alzheimer's disease related cognitive functioning. Aging (Albany NY) 2015; 7:1198-211. [PMID: 26684672 PMCID: PMC4712342 DOI: 10.18632/aging.100864] [Citation(s) in RCA: 298] [Impact Index Per Article: 33.1] [Indexed: 01/04/2023]
Abstract
There is an urgent need to develop molecular biomarkers of brain age in order to advance our understanding of age-related neurodegeneration. Recently, we developed a highly accurate epigenetic biomarker of tissue age (known as the epigenetic clock) which is based on DNA methylation levels. Here we use n=700 dorsolateral prefrontal cortex (DLPFC) samples from Caucasian subjects of the Religious Order Study and the Rush Memory and Aging Project to examine the association between epigenetic age and Alzheimer's disease (AD)-related cognitive decline, and AD-related neuropathological markers. Epigenetic age acceleration of DLPFC is correlated with several neuropathological measurements including diffuse plaques (r=0.12, p=0.0015), neuritic plaques (r=0.11, p=0.0036), and amyloid load (r=0.091, p=0.016). Further, it is associated with a decline in global cognitive functioning (β=-0.500, p=0.009), episodic memory (β=-0.411, p=0.009) and working memory (β=-0.405, p=0.011) among individuals with AD. The neuropathological markers may mediate the association between epigenetic age and cognitive decline. Genetic complex trait analysis (GCTA) revealed that epigenetic age acceleration is heritable (h²=0.41) and has significant genetic correlations with diffuse plaques (r=0.24, p=0.010) and possibly working memory (r=-0.35, p=0.065). Overall, these results suggest that the epigenetic clock may lend itself as a molecular biomarker of brain age.
Affiliation(s)
- Morgan E. Levine
- 1 Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA; 2 Center for Neurobehavioral Genetics, University of California Los Angeles, Los Angeles, CA 90095, USA
| | - Ake T. Lu
- 1 Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA
| | - David A. Bennett
- 3 Rush Alzheimer’s Disease Center, Rush University Medical Center, Chicago, IL 60612, USA; 4 Department of Neurological Sciences, Rush University Medical Center, Chicago, IL 60612, USA
| | - Steve Horvath
- 1 Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA 90095, USA; 5 Biostatistics, School of Public Health, University of California Los Angeles, Los Angeles, CA 90095, USA
| |
|
372
|
Traylor M, Anderson CD, Hurford R, Bevan S, Markus HS. Oxidative phosphorylation and lacunar stroke: Genome-wide enrichment analysis of common variants. Neurology 2015; 86:141-5. [PMID: 26674331 PMCID: PMC4731691 DOI: 10.1212/wnl.0000000000002260] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 02/16/2015] [Accepted: 09/08/2015] [Indexed: 11/15/2022] Open
Abstract
OBJECTIVE We investigated whether oxidative phosphorylation (OXPHOS) abnormalities were associated with lacunar stroke, hypothesizing that these would be more strongly associated in patients with multiple lacunar infarcts and leukoaraiosis (LA). METHODS In 1,012 MRI-confirmed lacunar stroke cases and 964 age-matched controls recruited from general practice surgeries, we investigated associations between common genetic variants within the OXPHOS pathway and lacunar stroke using a permutation-based enrichment approach. Cases were phenotyped using MRI into those with multiple infarcts or LA (MLI/LA) and those with isolated lacunar infarcts (ILI) based on the number of subcortical infarcts and degree of LA, using the Fazekas grading. Using gene-level association statistics, we tested for enrichment of genes in the OXPHOS pathway with all lacunar stroke and the 2 subtypes. RESULTS There was a specific association with strong evidence of enrichment in the top 1% of genes in the MLI/LA subtype (p = 0.0017) but not in the ILI subtype (p = 1). Genes in the top percentile for the all lacunar stroke analysis were not significantly enriched (p = 0.07). CONCLUSIONS Our results implicate the OXPHOS pathway in the pathogenesis of lacunar stroke, and show the association is specific to patients with the MLI/LA subtype. They show that MRI-based subtyping of lacunar stroke can provide insights into disease pathophysiology, and imply that different radiologic subtypes of lacunar stroke have distinct underlying pathophysiologic processes.
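The permutation-based enrichment idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' procedure: the function name `enrichment_p_value`, the gene-level p-value dictionary, and the use of random gene sets of equal size as the null are all assumptions made for illustration, and a real analysis would typically also account for gene length and linkage disequilibrium.

```python
import random


def enrichment_p_value(gene_scores, pathway_genes, top_fraction=0.01,
                       n_permutations=10_000, seed=0):
    """One-sided permutation p-value for pathway genes among top-ranked genes.

    gene_scores maps gene name -> gene-level association p-value (hypothetical input).
    """
    rng = random.Random(seed)
    genes = list(gene_scores)
    # Rank genes by association strength (smaller p-value = stronger).
    ranked = sorted(genes, key=gene_scores.get)
    n_top = max(1, int(len(ranked) * top_fraction))
    top = set(ranked[:n_top])
    pathway = [g for g in pathway_genes if g in gene_scores]
    observed = sum(g in top for g in pathway)
    # Null distribution: random gene sets of the same size as the pathway.
    hits = 0
    for _ in range(n_permutations):
        sample = rng.sample(genes, len(pathway))
        if sum(g in top for g in sample) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return observed, (hits + 1) / (n_permutations + 1)
```

With 1% as the `top_fraction`, the function counts how many pathway genes fall in the top percentile and asks how often a random gene set of the same size does at least as well.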
Affiliation(s)
- Matthew Traylor
- From Clinical Neurosciences (M.T., R.H., H.S.M.), University of Cambridge, UK; School of Life Science (S.B.), University of Lincoln, UK; and the Center for Human Genetic Research (C.D.A.), Department of Neurology, Massachusetts General Hospital, Boston.
| | - Christopher D Anderson
- From Clinical Neurosciences (M.T., R.H., H.S.M.), University of Cambridge, UK; School of Life Science (S.B.), University of Lincoln, UK; and the Center for Human Genetic Research (C.D.A.), Department of Neurology, Massachusetts General Hospital, Boston
| | - Robert Hurford
- From Clinical Neurosciences (M.T., R.H., H.S.M.), University of Cambridge, UK; School of Life Science (S.B.), University of Lincoln, UK; and the Center for Human Genetic Research (C.D.A.), Department of Neurology, Massachusetts General Hospital, Boston
| | - Steve Bevan
- From Clinical Neurosciences (M.T., R.H., H.S.M.), University of Cambridge, UK; School of Life Science (S.B.), University of Lincoln, UK; and the Center for Human Genetic Research (C.D.A.), Department of Neurology, Massachusetts General Hospital, Boston
| | - Hugh S Markus
- From Clinical Neurosciences (M.T., R.H., H.S.M.), University of Cambridge, UK; School of Life Science (S.B.), University of Lincoln, UK; and the Center for Human Genetic Research (C.D.A.), Department of Neurology, Massachusetts General Hospital, Boston
| |
|
373
|
Benton MC, Lea RA, Macartney-Coxson D, Bellis C, Carless MA, Curran JE, Hanna M, Eccles D, Chambers GK, Blangero J, Griffiths LR. Serum bilirubin concentration is modified by UGT1A1 haplotypes and influences risk of type-2 diabetes in the Norfolk Island genetic isolate. BMC Genet 2015; 16:136. [PMID: 26628212 PMCID: PMC4667444 DOI: 10.1186/s12863-015-0291-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 06/02/2015] [Accepted: 11/02/2015] [Indexed: 02/06/2023] Open
Abstract
Background Located in the Pacific Ocean between Australia and New Zealand, the unique population isolate of Norfolk Island has been shown to exhibit increased prevalence of metabolic disorders (type-2 diabetes, cardiovascular disease) compared to mainland Australia. We investigated this well-established genetic isolate, utilising its unique genomic structure to increase the ability to detect related genetic markers. A pedigree-based genome-wide association study of 16 routinely collected blood-based clinical traits in 382 Norfolk Island individuals was performed. Results A striking association peak was located at chromosome 2q37.1 for both total bilirubin and direct bilirubin, with 29 SNPs reaching statistical significance (P < 1.84 × 10⁻⁷). Strong linkage disequilibrium was observed across a 200 kb region spanning the UDP-glucuronosyltransferase family, including UGT1A1, an enzyme known to metabolise bilirubin. Given the epidemiological literature suggesting a negative association between CVD risk and serum bilirubin, we further explored potential associations using stepwise multivariate regression, revealing a significant association between direct bilirubin concentration and type-2 diabetes risk. In the Norfolk Island cohort increased direct bilirubin was associated with a 28 % reduction in type-2 diabetes risk (OR: 0.72, 95 % CI: 0.57-0.91, P = 0.005). When adjusted for genotypic effects the overall model was validated, with the adjusted model predicting a 30 % reduction in type-2 diabetes risk with increasing direct bilirubin concentrations (OR: 0.70, 95 % CI: 0.53-0.89, P = 0.0001). Conclusions In summary, a pedigree-based GWAS of blood-based clinical traits in the Norfolk Island population has identified variants within the UDPGT family directly associated with serum bilirubin levels, which in turn are implicated in a reduced risk of developing type-2 diabetes within this population.
Electronic supplementary material The online version of this article (doi:10.1186/s12863-015-0291-z) contains supplementary material, which is available to authorized users.
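The percentage reductions quoted in the abstract above follow directly from the reported odds ratios. A small worked sketch (the function name is hypothetical) makes the arithmetic explicit. Strictly, an odds ratio describes a change in odds, which approximates a change in risk only when the outcome is uncommon.

```python
def percent_change_in_odds(odds_ratio):
    """Percent change in odds implied by an odds ratio (negative = reduction)."""
    return (odds_ratio - 1.0) * 100.0


# Odds ratios as reported in the abstract (unadjusted and genotype-adjusted models).
for label, odds_ratio in [("unadjusted", 0.72), ("genotype-adjusted", 0.70)]:
    change = percent_change_in_odds(odds_ratio)
    print(f"{label}: OR {odds_ratio:.2f} -> {change:+.0f}% change in odds")
```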
Affiliation(s)
- M C Benton
- Genomics Research Centre, Institute of Health and Biomedical Innovation, Queensland University of Technology, Kelvin Grove, QLD, 4059, Australia.
| | - R A Lea
- Genomics Research Centre, Institute of Health and Biomedical Innovation, Queensland University of Technology, Kelvin Grove, QLD, 4059, Australia.
| | - D Macartney-Coxson
- Kenepuru Science Centre, Institute of Environmental Science and Research, Wellington, 5240, New Zealand.
| | - C Bellis
- Genomics Research Centre, Institute of Health and Biomedical Innovation, Queensland University of Technology, Kelvin Grove, QLD, 4059, Australia; Texas Biomedical Research Institute, San Antonio, TX, 78227-5301, USA.
| | - M A Carless
- Texas Biomedical Research Institute, San Antonio, TX, 78227-5301, USA.
| | - J E Curran
- Texas Biomedical Research Institute, San Antonio, TX, 78227-5301, USA.
| | - M Hanna
- Genomics Research Centre, Institute of Health and Biomedical Innovation, Queensland University of Technology, Kelvin Grove, QLD, 4059, Australia.
| | - D Eccles
- Genomics Research Centre, Institute of Health and Biomedical Innovation, Queensland University of Technology, Kelvin Grove, QLD, 4059, Australia.
| | - G K Chambers
- School of Biological Sciences, Victoria University of Wellington, Wellington, 6140, New Zealand.
| | - J Blangero
- South Texas Diabetes and Obesity Institute, University of Texas, Rio Grande Valley School of Medicine, Brownsville, TX, 78520, USA.
| | - L R Griffiths
- Genomics Research Centre, Institute of Health and Biomedical Innovation, Queensland University of Technology, Kelvin Grove, QLD, 4059, Australia.
| |
|
374
|
Conjunctival fibrosis and the innate barriers to Chlamydia trachomatis intracellular infection: a genome wide association study. Sci Rep 2015; 5:17447. [PMID: 26616738 PMCID: PMC4663496 DOI: 10.1038/srep17447] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 07/29/2015] [Accepted: 10/29/2015] [Indexed: 01/26/2023] Open
Abstract
Chlamydia trachomatis causes both trachoma and sexually transmitted infections. These diseases have similar pathology and potentially similar genetic predisposing factors. We aimed to identify polymorphisms and pathways associated with pathological sequelae of ocular Chlamydia trachomatis infections in The Gambia. We report a discovery phase genome-wide association study (GWAS) of scarring trachoma (1090 cases, 1531 controls) that identified 27 SNPs with strong, but not genome-wide significant, association with disease (5 × 10⁻⁶ > P > 5 × 10⁻⁸). The most strongly associated SNP (rs111513399, P = 5.38 × 10⁻⁷) fell within a gene (PREX2) with homology to factors known to facilitate chlamydial entry to the host cell. Pathway analysis of GWAS data was significantly enriched for mitotic cell cycle processes (P = 0.001), the immune response (P = 0.00001) and for multiple cell surface receptor signalling pathways. New analyses of published transcriptome data sets from Gambia, Tanzania and Ethiopia also revealed that the same cell cycle and immune response pathways were enriched at the transcriptional level in various disease states. Although unconfirmed, the data suggest that genetic associations with chlamydial scarring disease may be focussed on processes relating to the immune response, the host cell cycle and cell surface receptor signalling.
|
375
|
The role of common genetic variation in educational attainment and income: evidence from the National Child Development Study. Sci Rep 2015; 5:16509. [PMID: 26561353 PMCID: PMC4642349 DOI: 10.1038/srep16509] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 05/27/2015] [Accepted: 10/14/2015] [Indexed: 11/16/2022] Open
Abstract
We investigated the role of common genetic variation in educational attainment and household income. We used data from 5,458 participants of the National Child Development Study to estimate: 1) the associations of rs9320913, rs11584700 and rs4851266 with socioeconomic position and educational phenotypes; and 2) the univariate chip-heritability of each phenotype, and the genetic correlation between each phenotype and educational attainment at age 16. The three SNPs were associated with most measures of educational attainment. Common genetic variation contributed to 6 of 14 socioeconomic background phenotypes, and 17 of 29 educational phenotypes. We found evidence of genetic correlations between educational attainment at age 16 and 4 of 14 social background and 8 of 28 educational phenotypes. This suggests common genetic variation contributes both to differences in educational attainment and its relationship with other phenotypes. However, we remain cautious that cryptic population structure, assortative mating, and dynastic effects may influence these associations.
|
376
|
Seldin MF, Alkhairy OK, Lee AT, Lamb JA, Sussman J, Pirskanen-Matell R, Piehl F, Verschuuren JJGM, Kostera-Pruszczyk A, Szczudlik P, McKee D, Maniaol AH, Harbo HF, Lie BA, Melms A, Garchon HJ, Willcox N, Gregersen PK, Hammarstrom L. Genome-Wide Association Study of Late-Onset Myasthenia Gravis: Confirmation of TNFRSF11A and Identification of ZBTB10 and Three Distinct HLA Associations. Mol Med 2015; 21:769-781. [PMID: 26562150 DOI: 10.2119/molmed.2015.00232] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Received: 11/04/2015] [Accepted: 11/09/2015] [Indexed: 01/05/2023] Open
Abstract
To investigate the genetics of late-onset myasthenia gravis (LOMG), we conducted a genome-wide association study with imputation of >6 million single nucleotide polymorphisms (SNPs) in 532 LOMG cases (anti-acetylcholine receptor [AChR] antibody positive; onset age ≥50 years) and 2,128 controls matched for sex and population substructure. The data confirm reported TNFRSF11A associations (rs4574025, P = 3.9 × 10⁻⁷, odds ratio [OR] 1.42) and identify a novel candidate gene, ZBTB10, achieving genome-wide significance (rs6998967, P = 8.9 × 10⁻¹⁰, OR 0.53). Several other SNPs showed suggestive significance including rs2476601 (P = 6.5 × 10⁻⁶, OR 1.62) encoding the PTPN22 R620W variant noted in early-onset myasthenia gravis (EOMG) and other autoimmune diseases. In contrast, EOMG-associated SNPs in TNIP1 showed no association in LOMG, nor did other loci suggested for EOMG. Many SNPs within the major histocompatibility complex (MHC) region showed strong associations in LOMG, but with smaller effect sizes than in EOMG (highest OR ~2 versus ~6 in EOMG). Moreover, the strongest associations were in opposite directions from EOMG, including an OR of 0.54 for DQA1*05:01 in LOMG (P = 5.9 × 10⁻¹²) versus 2.82 in EOMG (P = 3.86 × 10⁻⁴⁵). Association and conditioning studies for the MHC region showed three distinct and largely independent association peaks for LOMG corresponding to (a) MHC class II (highest attenuation when conditioning on DQA1), (b) HLA-A and (c) MHC class III SNPs. Conditioning studies of human leukocyte antigen (HLA) amino acid residues also suggest potential functional correlates. Together, these findings emphasize the value of subgrouping myasthenia gravis patients for clinical and basic investigations and imply distinct predisposing mechanisms in LOMG.
Affiliation(s)
- Michael F Seldin
- Department of Biochemistry and Molecular Medicine, and Department of Medicine, University of California, Davis, California, United States of America
| | - Omar K Alkhairy
- Division of Clinical Immunology, Karolinska Institutet at Karolinska University Hospital Huddinge, Stockholm, Sweden
| | - Annette T Lee
- The Robert S. Boas Center for Genomics and Human Genetics, Feinstein Institute for Medical Research, North Shore-LIJ Health System, Manhasset, New York, United States of America
| | - Janine A Lamb
- Centre for Integrated Genomic Medical Research, Manchester Academic Health Science Centre, University of Manchester, Manchester, United Kingdom
| | - Jon Sussman
- Department of Neurology, Greater Manchester Neuroscience Centre, Manchester, United Kingdom
| | | | - Fredrik Piehl
- Department of Neurology, Karolinska University Hospital Solna, Stockholm, Sweden
| | | | | | - Piotr Szczudlik
- Department of Neurology, Medical University of Warsaw, Warsaw, Poland
| | - David McKee
- Department of Neurology, Greater Manchester Neuroscience Centre, Manchester, United Kingdom
| | - Angelina H Maniaol
- Department of Neurology, Oslo University Hospital, Ullevål, Oslo, Norway
| | - Hanne F Harbo
- Department of Neurology, Oslo University Hospital and University of Oslo, Oslo, Norway
| | - Benedicte A Lie
- Department of Medical Genetics, University of Oslo and Oslo University Hospital, Oslo, Norway
| | - Arthur Melms
- Department of Neurology, Tübingen University Medical Center, Tübingen, Germany, and Neurologische Klinik, Universitätsklinikum Erlangen, Erlangen, Germany
| | | | - Nicholas Willcox
- Nuffield Department of Clinical Neurosciences, Weatherall Institute for Molecular Medicine, University of Oxford, Oxford, United Kingdom
| | - Peter K Gregersen
- The Robert S. Boas Center for Genomics and Human Genetics, Feinstein Institute for Medical Research, North Shore-LIJ Health System, Manhasset, New York, United States of America
| | - Lennart Hammarstrom
- Division of Clinical Immunology, Karolinska Institutet at Karolinska University Hospital Huddinge, Stockholm, Sweden
| |
|
377
|
Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, Korbel JO, Marchini JL, McCarthy S, McVean GA, Abecasis GR. A global reference for human genetic variation. Nature 2015; 526:68-74. [PMID: 26432245 PMCID: PMC4750478 DOI: 10.1038/nature15393] [Citation(s) in RCA: 10844] [Impact Index Per Article: 1204.9] [Received: 05/12/2015] [Accepted: 08/20/2015] [Indexed: 12/04/2022]
Abstract
The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6 million short insertions/deletions (indels), and 60,000 structural variants), all phased onto high-quality haplotypes. This resource includes >99% of SNP variants with a frequency of >1% for a variety of ancestries. We describe the distribution of genetic variation across the global sample, and discuss the implications for common disease studies. Results for the final phase of the 1000 Genomes Project are presented including whole-genome sequencing, targeted exome sequencing, and genotyping on high-density SNP arrays for 2,504 individuals across 26 populations, providing a global reference data set to support biomedical genetics. The 1000 Genomes Project has sought to comprehensively catalogue human genetic variation across populations, providing a valuable public genomic resource. The data obtained so far have found applications ranging from association studies and fine mapping studies to the filtering of likely neutral variants in rare-disease cohorts. The authors now report on the final phase of the project, phase 3, which covers previously uncharacterized areas of human genetic diversity in terms of the populations sampled and categories of characterized variation. The sample now includes more than 2,500 individuals from 26 global populations, with low coverage whole-genome and deep exome sequencing, as well as dense microarray genotyping. 
They find that while most common variants are shared across populations, rarer variants are often restricted to closely related populations. The authors also demonstrate the use of the phase 3 dataset as a reference panel for imputation to improve the resolution in genetic association studies.
|
378
|
Coronary risk in relation to genetic variation in MEOX2 and TCF15 in a Flemish population. BMC Genet 2015; 16:116. [PMID: 26428460 PMCID: PMC4591634 DOI: 10.1186/s12863-015-0272-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 07/10/2015] [Accepted: 09/11/2015] [Indexed: 01/07/2023] Open
Abstract
Background In mice MEOX2/TCF15 heterodimers are highly expressed in heart endothelial cells and are involved in the transcriptional regulation of lipid transport. In a general population, we investigated whether genetic variation in these genes predicted coronary heart disease (CHD). Results In 2027 participants randomly recruited from a Flemish population (51.0 % women; mean age 43.6 years), we genotyped six SNPs in MEOX2 and four in TCF15. Over 15.2 years (median), CHD, myocardial infarction, coronary revascularisation and ischaemic cardiomyopathy occurred in 106, 53, 78 and 22 participants, respectively. For SNPs, we contrasted CHD risk in minor-allele heterozygotes and homozygotes (variant) vs. major-allele homozygotes (reference) and, for haplotypes, in carriers (variant) vs. non-carriers. In multivariable-adjusted analyses with correction for multiple testing, CHD risk was associated with MEOX2 SNPs (P ≤ 0.049), but not with TCF15 SNPs (P ≥ 0.29). The MEOX2 GTCCGC haplotype (frequency 16.5 %) was associated with the sex- and age-standardised CHD incidence (5.26 vs. 3.03 events per 1000 person-years; P = 0.036); the multivariable-adjusted hazard ratio [HR] of CHD was 1.78 (95 % confidence interval, 1.25–2.56; P = 0.0054). For myocardial infarction, coronary revascularisation, and ischaemic cardiomyopathy, the corresponding HRs were 1.96 (1.16–3.31), 1.87 (1.20–2.91) and 3.16 (1.41–7.09), respectively. The MEOX2 GTCCGC haplotype significantly improved the prediction of CHD over and beyond traditional risk factors and was associated with similar population-attributable risk as smoking (18.7 % vs. 16.2 %). Conclusions Genetic variation in MEOX2, but not TCF15, is a strong predictor of CHD. Further experimental studies should elucidate the underlying molecular mechanisms. Electronic supplementary material The online version of this article (doi:10.1186/s12863-015-0272-2) contains supplementary material, which is available to authorized users.
|
379
|
Abstract
Large population studies of immune system genes are essential for characterizing their role in diseases, including autoimmune conditions. Of key interest are a group of genes encoding the killer cell immunoglobulin-like receptors (KIRs), which have known and hypothesized roles in autoimmune diseases, resistance to viruses, reproductive conditions, and cancer. These genes are highly polymorphic, which makes typing expensive and time consuming. Consequently, despite their importance, KIRs have been little studied in large cohorts. Statistical imputation methods developed for other complex loci (e.g., human leukocyte antigen [HLA]) on the basis of SNP data provide an inexpensive high-throughput alternative to direct laboratory typing of these loci and have enabled important findings and insights for many diseases. We present KIR*IMP, a method for imputation of KIR copy number. We show that KIR*IMP is highly accurate and thus allows the study of KIRs in large cohorts and enables detailed investigation of the role of KIRs in human disease.
|
380
|
Kanterakis A, Deelen P, van Dijk F, Byelas H, Dijkstra M, Swertz MA. Molgenis-impute: imputation pipeline in a box. BMC Res Notes 2015; 8:359. [PMID: 26286716 PMCID: PMC4541731 DOI: 10.1186/s13104-015-1309-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 07/07/2014] [Accepted: 07/30/2015] [Indexed: 12/12/2022] Open
Abstract
Background Genotype imputation is an important procedure in current genomic analysis such as genome-wide association studies, meta-analyses and fine mapping. Although high quality tools are available that perform the steps of this process, considerable effort and expertise are required to set up and run a best practice imputation pipeline, particularly for larger genotype datasets, where imputation has to scale out in parallel on computer clusters. Results Here we present MOLGENIS-impute, an ‘imputation in a box’ solution that seamlessly and transparently automates the setup and running of all the steps of the imputation process. These steps include genome build liftover (liftovering), genotype phasing with SHAPEIT2, quality control, sample and chromosomal chunking/merging, and imputation with IMPUTE2. MOLGENIS-impute builds on MOLGENIS-compute, a simple pipeline management platform for submission and monitoring of bioinformatics tasks in High Performance Computing (HPC) environments like local/cloud servers, clusters and grids. All the required tools, data and scripts are downloaded and installed in a single step. Researchers with diverse backgrounds and expertise have tested MOLGENIS-impute at different locations and imputed over 30,000 samples so far using the 1000 Genomes Project and new Genome of the Netherlands data as the imputation reference. The tests have been performed on PBS/SGE clusters, cloud VMs and in a grid HPC environment. Conclusions MOLGENIS-impute gives priority to the ease of setting up, configuring and running an imputation. It has minimal dependencies and wraps the pipeline in a simple command line interface, without sacrificing flexibility to adapt or limiting the options of underlying imputation tools. It does not require knowledge of a workflow system or programming, and is targeted at researchers who just want to apply best practices in imputation via simple commands.
It is built on the MOLGENIS compute workflow framework to enable customization with additional computational steps or it can be included in other bioinformatics pipelines. It is available as open source from: https://github.com/molgenis/molgenis-imputation. Electronic supplementary material The online version of this article (doi:10.1186/s13104-015-1309-3) contains supplementary material, which is available to authorized users.
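The chromosomal chunking step mentioned in the abstract above can be pictured with a short sketch. It is not taken from MOLGENIS-impute: the function name `chunk_chromosome` and the 5 Mb default window are assumptions chosen only to show how a chromosome might be split into intervals that can be imputed in parallel and merged afterwards.

```python
def chunk_chromosome(length_bp, chunk_bp=5_000_000):
    """Return 1-based inclusive (start, end) windows covering a chromosome.

    chunk_bp is an illustrative default, not a value from the paper.
    """
    if length_bp <= 0 or chunk_bp <= 0:
        raise ValueError("lengths must be positive")
    return [(start, min(start + chunk_bp - 1, length_bp))
            for start in range(1, length_bp + 1, chunk_bp)]


# A hypothetical 12 Mb chromosome yields two full windows and one short one.
for start, end in chunk_chromosome(12_000_000):
    print(f"{start}-{end}")
```

Each window could then be handed to a separate cluster job, which is the kind of scale-out the abstract describes.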
Affiliation(s)
- Alexandros Kanterakis
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen and University of Groningen, Genetics, UMCG, PO Box 30 001, 9700 RB, Groningen, The Netherlands.
| | - Patrick Deelen
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen and University of Groningen, Genetics, UMCG, PO Box 30 001, 9700 RB, Groningen, The Netherlands.
| | - Freerk van Dijk
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen and University of Groningen, Genetics, UMCG, PO Box 30 001, 9700 RB, Groningen, The Netherlands.
| | - Heorhiy Byelas
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen and University of Groningen, Genetics, UMCG, PO Box 30 001, 9700 RB, Groningen, The Netherlands.
| | - Martijn Dijkstra
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen and University of Groningen, Genetics, UMCG, PO Box 30 001, 9700 RB, Groningen, The Netherlands.
| | - Morris A Swertz
- Department of Genetics, Genomics Coordination Center, University Medical Center Groningen and University of Groningen, Genetics, UMCG, PO Box 30 001, 9700 RB, Groningen, The Netherlands.
| |
|
381
|
Li W, Fu G, Rao W, Xu W, Ma L, Guo S, Song Q. GenomeLaser: fast and accurate haplotyping from pedigree genotypes. Bioinformatics 2015; 31:3984-7. [PMID: 26286810 DOI: 10.1093/bioinformatics/btv452] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 04/24/2015] [Accepted: 07/28/2015] [Indexed: 01/12/2023] Open
Abstract
We present a software tool called GenomeLaser that determines the haplotypes of each person from unphased high-throughput genotypes in family pedigrees. This method features high accuracy, chromosome-range phasing distance, linear computing, flexible pedigree types and flexible genetic marker types. AVAILABILITY AND IMPLEMENTATION http://www.4dgenome.com/software/genomelaser.html.
Affiliation(s)
- Wenzhi Li
- Department of Neurosurgery, First Affiliated Hospital of Medical School, Xi'an Jiaotong University, Xi'an, Shaanxi, 710061 China, Cardiovascular Research Institute and Department of Medicine, Morehouse School of Medicine, Atlanta, GA, 30310 USA
| | - Guoxing Fu
- 4DGENOME Inc, Atlanta, GA, 30033 USA
| | | | - Wei Xu
- Cardiovascular Research Institute and Department of Medicine, Morehouse School of Medicine, Atlanta, GA, 30310 USA
| | - Li Ma
- Cardiovascular Research Institute and Department of Medicine, Morehouse School of Medicine, Atlanta, GA, 30310 USA, 4DGENOME Inc, Atlanta, GA, 30033 USA
| | - Shiwen Guo
- Department of Neurosurgery, First Affiliated Hospital of Medical School, Xi'an Jiaotong University, Xi'an, Shaanxi, 710061 China
| | - Qing Song
- Cardiovascular Research Institute and Department of Medicine, Morehouse School of Medicine, Atlanta, GA, 30310 USA, 4DGENOME Inc, Atlanta, GA, 30033 USA and Center of Big Data and Bioinformatics, First Affiliated Hospital of Medical School, Xi'an Jiaotong University, Xi'an, Shaanxi, 710061 China
| |
|
382
|
Maclean CA, Chue Hong NP, Prendergast JGD. hapbin: An Efficient Program for Performing Haplotype-Based Scans for Positive Selection in Large Genomic Datasets. Mol Biol Evol 2015; 32:3027-9. [PMID: 26248562 PMCID: PMC4651233 DOI: 10.1093/molbev/msv172] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.7] [Received: 08/03/2015] [Accepted: 07/27/2015] [Indexed: 11/19/2022] Open
Abstract
Understanding how the genome is shaped by selective processes forms an integral part of modern biology. However, as genomic datasets continue to grow larger it is becoming increasingly difficult to apply traditional statistics for detecting signatures of selection to these cohorts. There is therefore a pressing need for the development of the next generation of computational and analytical tools for detecting signatures of selection in large genomic datasets. Here, we present hapbin, an efficient multithreaded implementation of extended haplotype homozygosity-based statistics for detecting selection, which is up to 3,400 times faster than the current fastest implementations of these algorithms.
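For readers unfamiliar with the statistic named in the abstract above, the following is a minimal sketch of extended haplotype homozygosity (EHH) in its textbook form: among haplotypes carrying a given core allele, the probability that two randomly chosen haplotypes are identical over the interval from the core to a given marker. It is a plain, single-threaded illustration and makes no claim about how hapbin itself is implemented; the function name and the string encoding of haplotypes are assumptions.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core, core_allele, end):
    """EHH of haplotypes carrying core_allele, extended from index core to index end.

    haplotypes is a list of equal-length strings of alleles (illustrative encoding).
    """
    lo, hi = sorted((core, end))
    # Keep only carriers of the core allele, restricted to the interval of interest.
    carriers = [h[lo:hi + 1] for h in haplotypes if h[core] == core_allele]
    n = len(carriers)
    if n < 2:
        return 0.0
    # Identical haplotype segments form groups; count pairs within each group.
    groups = Counter(carriers)
    return sum(comb(count, 2) for count in groups.values()) / comb(n, 2)


haps = ["1100", "1100", "1101", "1010", "0110"]
for end in range(4):
    print(end, ehh(haps, core=0, core_allele="1", end=end))
```

EHH starts at 1 at the core and decays as the interval grows and recombination or mutation breaks up shared haplotypes. Statistics built on it typically integrate this decay curve over distance.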
Affiliation(s)
- Colin A Maclean
- EPCC, School of Physics and Astronomy, University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Neil P Chue Hong
- EPCC, School of Physics and Astronomy, University of Edinburgh, Edinburgh, Scotland, United Kingdom
383
Multicohort analysis of the maternal age effect on recombination. Nat Commun 2015; 6:7846. [PMID: 26242864 PMCID: PMC4580993 DOI: 10.1038/ncomms8846] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 12/19/2014] [Accepted: 06/18/2015] [Indexed: 11/09/2022]
Abstract
Several studies have reported that the number of crossovers increases with maternal age in humans, but others have found the opposite. Resolving the true effect has implications for understanding the maternal age effect on aneuploidies. Here, we revisit this question in the largest sample to date using single nucleotide polymorphism (SNP)-chip data, comprising over 6,000 meioses from nine cohorts. We develop and fit a hierarchical model to allow for differences between cohorts and between mothers. We estimate that over 10 years, the expected number of maternal crossovers increases by 2.1% (95% credible interval (0.98%, 3.3%)). Our results are not consistent with the larger positive and negative effects previously reported in smaller cohorts. We see heterogeneity between cohorts that is likely due to chance effects in smaller samples, or possibly to confounders, emphasizing that care should be taken when interpreting results from any specific cohort about the effect of maternal age on recombination.
384
VanRaden PM, Sun C, O'Connell JR. Fast imputation using medium or low-coverage sequence data. BMC Genet 2015; 16:82. [PMID: 26168789 PMCID: PMC4501077 DOI: 10.1186/s12863-015-0243-7] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Received: 03/17/2015] [Accepted: 06/29/2015] [Indexed: 12/23/2022]
Abstract
Background: Accurate genotype imputation can greatly reduce costs and increase benefits by combining whole-genome sequence data of varying read depth and array genotypes of varying densities. For large populations, an efficient strategy chooses the two haplotypes most likely to form each genotype and updates posterior allele probabilities from prior probabilities within those two haplotypes as each individual’s sequence is processed. Directly using allele read counts can improve imputation accuracy and reduce computation compared with calling or computing genotype probabilities first and then imputing. Results: A new algorithm was implemented in findhap (version 4) software and tested using simulated bovine and actual human sequence data with different combinations of reference population size, sequence read depth and error rate. Read depths of ≥8× may be desired for direct investigation of sequenced individuals, but for a given total cost, sequencing more individuals at read depths of 2× to 4× gave more accurate imputation from array genotypes. Imputation accuracy improved further if reference individuals had both low-coverage sequence and high-density (HD) microarray data, and remained high even with a read error rate of 16%. With read depths of ≤4×, findhap (version 4) had higher accuracy than Beagle (version 4); computing time was up to 400 times faster with findhap than with Beagle. For 10,000 sequenced individuals plus 250 with HD array genotypes to test imputation, findhap used 7 hours, 10 processors and 50 GB of memory for 1 million loci on one chromosome. Computing times increased in proportion to population size but less than proportional to number of variants. Conclusions: Simultaneous genotype calling from low-coverage sequence data and imputation from array genotypes of various densities is done very efficiently within findhap by updating allele probabilities within the two haplotypes for each individual. Accuracy of genotype calling and imputation were high with both simulated bovine and actual human genomes reduced to low-coverage sequence and HD microarray data. More efficient imputation allows geneticists to locate and test effects of more DNA variants from more individuals and to include those in future prediction and selection.
Affiliation(s)
- Paul M VanRaden
- Animal Genomics and Improvement Laboratory, Agricultural Research Service, United States Department of Agriculture, Beltsville, MD, 20705-2350, USA.
- Chuanyu Sun
- National Association of Animal Breeders, Columbia, Missouri, 65205, USA.
385
Hartati H, Utsunomiya YT, Sonstegard TS, Garcia JF, Jakaria J, Muladno M. Evidence of Bos javanicus x Bos indicus hybridization and major QTLs for birth weight in Indonesian Peranakan Ongole cattle. BMC Genet 2015; 16:75. [PMID: 26141727 PMCID: PMC4491226 DOI: 10.1186/s12863-015-0229-5] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 04/15/2015] [Accepted: 06/10/2015] [Indexed: 11/10/2022]
Abstract
Background: Peranakan Ongole (PO) is a major Indonesian Bos indicus breed that derives from animals imported from India in the late 19th century. Early imports were followed by hybridization with the Bos javanicus subspecies of cattle. Here, we used genomic data to partition the ancestry components of PO cattle and map loci implicated in birth weight. Results: We found that B. javanicus contributes about 6-7% to the average breed composition of PO cattle. Only two nearly fixed B. javanicus haplotypes were identified, suggesting that most of the B. javanicus variants are segregating under drift or by the action of balancing selection. The zebu component of the PO genome was estimated to derive from at least two distinct ancestral pools. Additionally, well-known loci underlying body size in other beef cattle breeds, such as the PLAG1 region on chromosome 14, were found to also affect birth weight in PO cattle. Conclusions: This study is the first attempt to characterize PO at the genome level, and contributes evidence of successful, stabilized B. indicus x B. javanicus hybridization. Additionally, previously described loci implicated in body size in worldwide beef cattle breeds also affect birth weight in PO cattle.
Affiliation(s)
- Hartati Hartati
- Beef Cattle Research Station, Indonesian Agency for Agricultural Research and Development, Ministry of Agriculture, Jln. Pahlawan no. 2 Grati, Pasuruan, East Java, 16784, Indonesia.
- Yuri Tani Utsunomiya
- Faculdade de Ciências Agrárias e Veterinárias, UNESP - Univ Estadual Paulista, Jaboticabal, São Paulo, 14884-900, Brazil.
- Tad Stewart Sonstegard
- ARS-USDA - Agricultural Research Service - United States Department of Agriculture, Animal Genomics and Improvement Laboratory, Beltsville, MD, 20705, USA.
- José Fernando Garcia
- Faculdade de Ciências Agrárias e Veterinárias, UNESP - Univ Estadual Paulista, Jaboticabal, São Paulo, 14884-900, Brazil; Faculdade de Medicina Veterinária de Araçatuba, UNESP - Univ Estadual Paulista, Araçatuba, São Paulo, 16050-680, Brazil.
- Jakaria Jakaria
- Faculty of Animal Science, Bogor Agriculture University, Jln. Agatis kampus IPB Dramaga, Bogor, 16680, Indonesia.
- Muladno Muladno
- Faculty of Animal Science, Bogor Agriculture University, Jln. Agatis kampus IPB Dramaga, Bogor, 16680, Indonesia.
386
Vockley CM, Guo C, Majoros WH, Nodzenski M, Scholtens DM, Hayes MG, Lowe WL, Reddy TE. Massively parallel quantification of the regulatory effects of noncoding genetic variation in a human cohort. Genome Res 2015; 25:1206-14. [PMID: 26084464 PMCID: PMC4510004 DOI: 10.1101/gr.190090.115] [Citation(s) in RCA: 71] [Impact Index Per Article: 7.9] [Received: 01/26/2015] [Accepted: 06/15/2015] [Indexed: 12/30/2022]
Abstract
We report a novel high-throughput method to empirically quantify individual-specific regulatory element activity at the population scale. The approach combines targeted DNA capture with a high-throughput reporter gene expression assay. As demonstration, we measured the activity of more than 100 putative regulatory elements from 95 individuals in a single experiment. In agreement with previous reports, we found that most genetic variants have weak effects on distal regulatory element activity. Because haplotypes are typically maintained within but not between assayed regulatory elements, the approach can be used to identify causal regulatory haplotypes that likely contribute to human phenotypes. Finally, we demonstrate the utility of the method to functionally fine map causal regulatory variants in regions of high linkage disequilibrium identified by expression quantitative trait loci (eQTL) analyses.
Affiliation(s)
- Christopher M Vockley
- Department of Cell Biology, Duke University Medical School, Durham, North Carolina 27710, USA; Center for Genomic and Computational Biology, Duke University Medical School, Durham, North Carolina 27710, USA
- Cong Guo
- Center for Genomic and Computational Biology, Duke University Medical School, Durham, North Carolina 27710, USA; University Program in Genetics and Genomics, Duke University, Durham, North Carolina 27710, USA
- William H Majoros
- Center for Genomic and Computational Biology, Duke University Medical School, Durham, North Carolina 27710, USA; Program in Computational Biology and Bioinformatics, Duke University, Durham, North Carolina 27710, USA
- Michael Nodzenski
- Department of Preventive Medicine, Division of Biostatistics, Northwestern University Feinberg School of Medicine, Chicago, Illinois 60611, USA
- Denise M Scholtens
- Department of Preventive Medicine, Division of Biostatistics, Northwestern University Feinberg School of Medicine, Chicago, Illinois 60611, USA
- M Geoffrey Hayes
- Division of Endocrinology, Metabolism and Molecular Medicine, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois 60611, USA
- William L Lowe
- Division of Endocrinology, Metabolism and Molecular Medicine, Department of Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois 60611, USA
- Timothy E Reddy
- Program in Computational Biology and Bioinformatics, Duke University, Durham, North Carolina 27710, USA; Department of Biostatistics and Bioinformatics, Duke University Medical School, Durham, North Carolina 27710, USA
387
Lee D, Bigdeli TB, Williamson VS, Vladimirov VI, Riley BP, Fanous AH, Bacanu SA. DISTMIX: direct imputation of summary statistics for unmeasured SNPs from mixed ethnicity cohorts. Bioinformatics 2015; 31:3099-104. [PMID: 26059716 PMCID: PMC4576696 DOI: 10.1093/bioinformatics/btv348] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 09/23/2014] [Accepted: 05/29/2015] [Indexed: 01/09/2023]
Abstract
Motivation: To increase the signal resolution for large-scale meta-analyses of genome-wide association studies, genotypes at unmeasured single nucleotide polymorphisms (SNPs) are commonly imputed using large multi-ethnic reference panels. However, the ever increasing size and ethnic diversity of both reference panels and cohorts makes genotype imputation computationally challenging for moderately sized computer clusters. Moreover, genotype imputation requires subject-level genetic data, which unlike summary statistics provided by virtually all studies, is not publicly available. While there are much less demanding methods which avoid the genotype imputation step by directly imputing SNP statistics, e.g. Directly Imputing summary STatistics (DIST) proposed by our group, their implicit assumptions make them applicable only to ethnically homogeneous cohorts. Results: To decrease computational and access requirements for the analysis of cosmopolitan cohorts, we propose DISTMIX, which extends DIST capabilities to the analysis of mixed ethnicity cohorts. The method uses a relevant reference panel to directly impute unmeasured SNP statistics based only on statistics at measured SNPs and estimated/user-specified ethnic proportions. Simulations show that the proposed method adequately controls the Type I error rates. The 1000 Genomes panel imputation of summary statistics from the ethnically diverse Psychiatric Genetic Consortium Schizophrenia Phase 2 suggests that, when compared to genotype imputation methods, DISTMIX offers comparable imputation accuracy for only a fraction of computational resources. Availability and implementation: DISTMIX software, its reference population data, and usage examples are publicly available at http://code.google.com/p/distmix. Contact: dlee4@vcu.edu Supplementary information: Supplementary Data are available at Bioinformatics online.
Affiliation(s)
- Donghyung Lee
- Department of Psychiatry, Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA 23298, USA
- T Bernard Bigdeli
- Department of Psychiatry, Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA 23298, USA
- Vernell S Williamson
- Department of Psychiatry, Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA 23298, USA
- Vladimir I Vladimirov
- Department of Psychiatry, Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA 23298, USA, Center for Biomarker Research & Personalized Medicine, Virginia Commonwealth University, Richmond, VA 23298, USA and Lieber Institute for Brain Development, Johns Hopkins University, Baltimore, MD 21205, USA
- Brien P Riley
- Department of Psychiatry, Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA 23298, USA
- Ayman H Fanous
- Department of Psychiatry, Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA 23298, USA
- Silviu-Alin Bacanu
- Department of Psychiatry, Virginia Institute for Psychiatric and Behavioral Genetics, Virginia Commonwealth University, Richmond, VA 23298, USA
388
Leveraging Identity-by-Descent for Accurate Genotype Inference in Family Sequencing Data. PLoS Genet 2015; 11:e1005271. [PMID: 26043085 PMCID: PMC4456389 DOI: 10.1371/journal.pgen.1005271] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 05/28/2014] [Accepted: 05/12/2015] [Indexed: 12/23/2022]
Abstract
Sequencing family DNA samples provides an attractive alternative to population based designs to identify rare variants associated with human disease due to the enrichment of causal variants in pedigrees. Previous studies showed that genotype calling accuracy can be improved by modeling family relatedness compared to standard calling algorithms. Current family-based variant calling methods use sequencing data on single variants and ignore the identity-by-descent (IBD) sharing along the genome. In this study we describe a new computational framework to accurately estimate the IBD sharing from the sequencing data, and to utilize the inferred IBD among family members to jointly call genotypes in pedigrees. Through simulations and application to real data, we showed that IBD can be reliably estimated across the genome, even at very low coverage (e.g. 2X), and genotype accuracy can be dramatically improved. Moreover, the improvement is more pronounced for variants with low frequencies, especially at low to intermediate coverage (e.g. 10X to 20X), making our approach effective in studying rare variants in cost-effective whole genome sequencing in pedigrees. We hope that our tool is useful to the research community for identifying rare variants for human disease through family-based sequencing.
To identify disease variants that occur less frequently in the population, sequencing families in which multiple individuals are affected is more powerful due to the enrichment of causal variants. An important step in such studies is to infer individual genotypes from sequencing data. Existing methods do not utilize full familial transmission information and therefore result in reduced accuracy of inferred genotypes. In this study we describe a new method that infers shared genetic materials among family members and then incorporates the shared genomic information in a novel algorithm that can accurately infer genotypes. Our method is particularly advantageous when inferring low frequency variants with fewer sequence data, making it effective in analyzing genome-wide sequence data. We implemented the algorithm in a computationally efficient tool to facilitate cost-effective sequencing in families for identifying disease genetic variants.
389
The Kalash genetic isolate: ancient divergence, drift, and selection. Am J Hum Genet 2015; 96:775-83. [PMID: 25937445 PMCID: PMC4570283 DOI: 10.1016/j.ajhg.2015.03.012] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 02/09/2015] [Accepted: 03/26/2015] [Indexed: 02/05/2023]
Abstract
The Kalash represent an enigmatic isolated population of Indo-European speakers who have been living for centuries in the Hindu Kush mountain ranges of present-day Pakistan. Previous Y chromosome and mitochondrial DNA markers provided no support for their claimed Greek descent following Alexander III of Macedon's invasion of this region, and analysis of autosomal loci provided evidence of a strong genetic bottleneck. To understand their origins and demography further, we genotyped 23 unrelated Kalash samples on the Illumina HumanOmni2.5M-8 BeadChip and sequenced one male individual at high coverage on an Illumina HiSeq 2000. Comparison with published data from ancient hunter-gatherers and European farmers showed that the Kalash share genetic drift with the Paleolithic Siberian hunter-gatherers and might represent an extremely drifted ancient northern Eurasian population that also contributed to European and Near Eastern ancestry. Since the split from other South Asian populations, the Kalash have maintained a low long-term effective population size (2,319-2,603) and experienced no detectable gene flow from their geographic neighbors in Pakistan or from other extant Eurasian populations. The mean time of divergence between the Kalash and other populations currently residing in this region was estimated to be 11,800 (95% confidence interval = 10,600-12,600) years ago, and thus they represent present-day descendants of some of the earliest migrants into the Indian sub-continent from West Asia.
390
Xue Y, Prado-Martinez J, Sudmant PH, Narasimhan V, Ayub Q, Szpak M, Frandsen P, Chen Y, Yngvadottir B, Cooper DN, de Manuel M, Hernandez-Rodriguez J, Lobon I, Siegismund HR, Pagani L, Quail MA, Hvilsom C, Mudakikwa A, Eichler EE, Cranfield MR, Marques-Bonet T, Tyler-Smith C, Scally A. Mountain gorilla genomes reveal the impact of long-term population decline and inbreeding. Science 2015; 348:242-245. [PMID: 25859046 PMCID: PMC4668944 DOI: 10.1126/science.aaa3952] [Citation(s) in RCA: 243] [Impact Index Per Article: 27.0] [Received: 11/30/2014] [Accepted: 03/03/2015] [Indexed: 12/30/2022]
Abstract
Mountain gorillas are an endangered great ape subspecies and a prominent focus for conservation, yet we know little about their genomic diversity and evolutionary past. We sequenced whole genomes from multiple wild individuals and compared the genomes of all four Gorilla subspecies. We found that the two eastern subspecies have experienced a prolonged population decline over the past 100,000 years, resulting in very low genetic diversity and an increased overall burden of deleterious variation. A further recent decline in the mountain gorilla population has led to extensive inbreeding, such that individuals are typically homozygous at 34% of their sequence, leading to the purging of severely deleterious recessive mutations from the population. We discuss the causes of their decline and the consequences for their future survival.
Affiliation(s)
- Yali Xue
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
- Javier Prado-Martinez
- Institut de Biologia Evolutiva (CSIC/UPF), Parque de Investigación Biomédica de Barcelona (PRBB), Barcelona, Catalonia 08003, Spain
- Peter H. Sudmant
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Vagheesh Narasimhan
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
- Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge CB3 0WA, UK
- Qasim Ayub
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
- Michal Szpak
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
- Peter Frandsen
- Department of Biology, University of Copenhagen, DK-2200 Copenhagen N, Denmark
- Yuan Chen
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
- Bryndis Yngvadottir
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
- David N. Cooper
- Institute of Medical Genetics, Cardiff University, Cardiff CF14 4XN, UK
- Marc de Manuel
- Institut de Biologia Evolutiva (CSIC/UPF), Parque de Investigación Biomédica de Barcelona (PRBB), Barcelona, Catalonia 08003, Spain
- Jessica Hernandez-Rodriguez
- Institut de Biologia Evolutiva (CSIC/UPF), Parque de Investigación Biomédica de Barcelona (PRBB), Barcelona, Catalonia 08003, Spain
- Irene Lobon
- Institut de Biologia Evolutiva (CSIC/UPF), Parque de Investigación Biomédica de Barcelona (PRBB), Barcelona, Catalonia 08003, Spain
- Hans R. Siegismund
- Department of Biology, University of Copenhagen, DK-2200 Copenhagen N, Denmark
- Luca Pagani
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
- Department of Biological, Geological and Environmental Sciences, University of Bologna, 40134 Bologna, Italy
- Michael A. Quail
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
- Christina Hvilsom
- Research and Conservation, Copenhagen Zoo, DK-2000 Frederiksberg, Denmark
- Evan E. Eichler
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, Seattle, WA 91895, USA
- Michael R. Cranfield
- Gorilla Doctors, Karen C. Drayer Wildlife Health Center, University of California, Davis, CA 95616, USA
- Tomas Marques-Bonet
- Institut de Biologia Evolutiva (CSIC/UPF), Parque de Investigación Biomédica de Barcelona (PRBB), Barcelona, Catalonia 08003, Spain
- Centro Nacional de Análisis Genómico (Parc Cientific de Barcelona), Baldiri Reixac 4, 08028 Barcelona, Spain
- Chris Tyler-Smith
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
- Aylwyn Scally
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, UK
391
Haplotype phasing and inheritance of copy number variants in nuclear families. PLoS One 2015; 10:e0122713. [PMID: 25853576 PMCID: PMC4390228 DOI: 10.1371/journal.pone.0122713] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 09/09/2014] [Accepted: 02/12/2015] [Indexed: 11/19/2022]
Abstract
DNA copy number variants (CNVs) that alter the copy number of a particular DNA segment in the genome play an important role in human phenotypic variability and disease susceptibility. A number of CNVs overlapping with genes have been shown to confer risk to a variety of human diseases, thus highlighting the relevance of addressing the variability of CNVs at a higher resolution. So far, it has not been possible to deterministically infer the allelic composition of different haplotypes present within the CNV regions. We have developed a novel computational method, called PiCNV, which resolves the haplotype sequence composition within CNV regions in nuclear families based on SNP genotyping microarray data. The algorithm can i) phase normal and CNV-carrying haplotypes in the copy number variable regions, ii) resolve the allelic copies of rearranged DNA sequence within the haplotypes and iii) infer the heritability of identified haplotypes in trios or larger nuclear families. To our knowledge this is the first program available that can deterministically phase null, mono-, di-, tri- and tetraploid genotypes in CNV loci. We applied our method to study the composition and inheritance of haplotypes in CNV regions of 30 HapMap Yoruban trios and 34 Estonian families. For 93.6% of the CNV loci, PiCNV unambiguously phased normal and CNV-carrying haplotypes and followed their transmission in the corresponding families. Furthermore, allelic composition analysis identified the co-occurrence of alternative allelic copies within 66.7% of haplotypes carrying copy number gains. We also observed less frequent transmission of CNV-carrying haplotypes from parents to children compared to normal haplotypes and identified the emergence of several de novo deletions and duplications in the offspring.
392
Utsunomiya YT, Pérez O'Brien AM, Sonstegard TS, Sölkner J, Garcia JF. Genomic data as the "hitchhiker's guide" to cattle adaptation: tracking the milestones of past selection in the bovine genome. Front Genet 2015; 6:36. [PMID: 25713583 PMCID: PMC4322753 DOI: 10.3389/fgene.2015.00036] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 10/31/2014] [Accepted: 01/26/2015] [Indexed: 11/13/2022]
Abstract
The bovine species have witnessed and played a major role in the drastic socio-economical changes that shaped our culture over the last 10,000 years. During this journey, cattle "hitchhiked" on human development and colonized the world, facing strong selective pressures such as dramatic environmental changes and disease challenge. Consequently, hundreds of specialized cattle breeds emerged and spread around the globe, making up a rich spectrum of genomic resources. Their DNA still carries the scars left from adapting to this wide range of conditions, and we are now empowered with data and analytical tools to track the milestones of past selection in their genomes. In this review paper, we provide a summary of the reconstructed demographic events that shaped cattle diversity, offer a critical synthesis of popular methodologies applied to the search for signatures of selection (SS) in genomic data, and give examples of recent SS studies in cattle. Then, we outline the potential and challenges of the application of SS analysis in cattle, and discuss the future directions in this field.
Affiliation(s)
- Yuri T Utsunomiya
- Departamento de Medicina Veterinária Preventiva e Reprodução Animal, Faculdade de Ciências Agrárias e Veterinárias, Universidade Estadual Paulista (UNESP) Jaboticabal, São Paulo, Brazil
- Ana M Pérez O'Brien
- Division of Livestock Sciences, Department of Sustainable Agricultural Systems, University of Natural Resources and Life Sciences (BOKU) Vienna, Austria
- Tad S Sonstegard
- Animal Genomics and Improvement Laboratory, Agricultural Research Service, United States Department of Agriculture Beltsville, MD, USA
- Johann Sölkner
- Division of Livestock Sciences, Department of Sustainable Agricultural Systems, University of Natural Resources and Life Sciences (BOKU) Vienna, Austria
- José F Garcia
- Departamento de Medicina Veterinária Preventiva e Reprodução Animal, Faculdade de Ciências Agrárias e Veterinárias, Universidade Estadual Paulista (UNESP) Jaboticabal, São Paulo, Brazil; Laboratório de Bioquímica e Biologia Molecular Animal, Departamento de Apoio, Saúde e Produção Animal, Faculdade de Medicina Veterinária de Araçatuba, Universidade Estadual Paulista (UNESP) Araçatuba, São Paulo, Brazil
393
Patterson M, Marschall T, Pisanti N, van Iersel L, Stougie L, Klau GW, Schönhuth A. WhatsHap: Weighted Haplotype Assembly for Future-Generation Sequencing Reads. J Comput Biol 2015; 22:498-509. [PMID: 25658651 DOI: 10.1089/cmb.2014.0157] [Citation(s) in RCA: 211] [Impact Index Per Article: 23.4] [Indexed: 02/02/2023]
Abstract
The human genome is diploid, which requires assigning heterozygous single nucleotide polymorphisms (SNPs) to the two copies of the genome. The resulting haplotypes, lists of SNPs belonging to each copy, are crucial for downstream analyses in population genetics. Currently, statistical approaches, which are oblivious to direct read information, constitute the state-of-the-art. Haplotype assembly, which addresses phasing directly from sequencing reads, suffers from the fact that sequencing reads of the current generation are too short to serve the purposes of genome-wide phasing. While future-technology sequencing reads will contain sufficient amounts of SNPs per read for phasing, they are also likely to suffer from higher sequencing error rates. Currently, no haplotype assembly approaches exist that allow for taking both increasing read length and sequencing error information into account. Here, we suggest WhatsHap, the first approach that yields provably optimal solutions to the weighted minimum error correction problem in runtime linear in the number of SNPs. WhatsHap is a fixed parameter tractable (FPT) approach with coverage as the parameter. We demonstrate that WhatsHap can handle datasets of coverage up to 20×, and that 15× are generally enough for reliably phasing long reads, even at significantly elevated sequencing error rates. We also find that the switch and flip error rates of the haplotypes we output are favorable when comparing them with state-of-the-art statistical phasers.
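To make the weighted minimum error correction (wMEC) objective mentioned in this abstract concrete, the sketch below scores every bipartition of a few toy reads by brute force. It is not WhatsHap's algorithm, which the abstract describes as fixed-parameter tractable in coverage; exhaustive search is used here only because it is easy to verify on tiny inputs. The read encoding (a dict from site index to an `(allele, weight)` pair), the function names, and the assumption that the two haplotypes are complementary at every site are all choices made for this example.

```python
from itertools import product


def column_cost(reads, sides, pos):
    """Minimum total weight of entries to flip at site `pos`.

    Assumes the two haplotypes carry opposite alleles at every site
    (an all-heterozygous setting assumed for this sketch).
    """
    w = {(side, allele): 0 for side in (0, 1) for allele in (0, 1)}
    for read, side in zip(reads, sides):
        if pos in read:
            allele, weight = read[pos]
            w[(side, allele)] += weight
    # Either haplotype 0 carries allele 0 (and haplotype 1 allele 1), or the reverse.
    return min(w[(0, 1)] + w[(1, 0)], w[(0, 0)] + w[(1, 1)])


def wmec_bruteforce(reads):
    """Return (cost, sides) of a minimum-cost assignment of reads to two haplotypes."""
    positions = sorted({p for read in reads for p in read})
    best = None
    for sides in product((0, 1), repeat=len(reads)):  # exponential in the number of reads
        cost = sum(column_cost(reads, sides, p) for p in positions)
        if best is None or cost < best[0]:
            best = (cost, sides)
    return best


# Toy reads: site index -> (observed allele, confidence weight).
toy_reads = [
    {0: (0, 10), 1: (0, 10)},
    {0: (1, 10), 1: (1, 10)},
    {0: (0, 10), 1: (1, 2)},  # low-weight conflict at site 1
]
print(wmec_bruteforce(toy_reads))
```

In the toy data the third read agrees with the first at site 0 with high weight and disagrees at site 1 with low weight, so the cheapest solution groups those two reads and pays only for the low-weight entry.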
Affiliation(s)
- Murray Patterson
- Laboratoire de Biométrie et Biologie Évolutive (LBBE: UMR CNRS 5558), Université de Lyon 1, Villeurbanne, France
- Tobias Marschall
- Center for Bioinformatics, Saarland University, Saarbrücken, Germany; Max Planck Institute for Informatics, Saarbrücken, Germany
- Nadia Pisanti
- Department of Computer Science, University of Pisa, Italy; Erable Team, INRIA
- Leen Stougie
- VU University, Amsterdam, The Netherlands; Erable Team, INRIA
- Gunnar W Klau
- VU University, Amsterdam, The Netherlands; Erable Team, INRIA
Collapse
|
394
|
Druet T, Georges M. LINKPHASE3: an improved pedigree-based phasing algorithm robust to genotyping and map errors. Bioinformatics 2015; 31:1677-9. [PMID: 25573918 DOI: 10.1093/bioinformatics/btu859] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 09/12/2014] [Accepted: 12/23/2014] [Indexed: 11/12/2022]
Abstract
Many applications in genetics require haplotype reconstruction. We present a phasing program designed for large half-sib families (as observed in plants and animals) that is robust to genotyping and map errors. We demonstrate that it is more efficient than previous versions and other programs, particularly in the presence of genotyping errors.
Affiliation(s)
- Tom Druet: Unit of Animal Genomics, GIGA-R, University of Liège (B34), 1 avenue de l'Hôpital, B-4000, Liège, Belgium
- Michel Georges: Unit of Animal Genomics, GIGA-R, University of Liège (B34), 1 avenue de l'Hôpital, B-4000, Liège, Belgium
|
395
|
Olsson AH, Volkov P, Bacos K, Dayeh T, Hall E, Nilsson EA, Ladenvall C, Rönn T, Ling C. Genome-wide associations between genetic and epigenetic variation influence mRNA expression and insulin secretion in human pancreatic islets. PLoS Genet 2014; 10:e1004735. [PMID: 25375650 PMCID: PMC4222689 DOI: 10.1371/journal.pgen.1004735] [Citation(s) in RCA: 131] [Impact Index Per Article: 13.1] [Received: 02/05/2014] [Accepted: 09/05/2014] [Indexed: 12/29/2022] Open
Abstract
Genetic and epigenetic mechanisms may interact and together affect biological processes and disease development. However, most previous studies have investigated genetic and epigenetic mechanisms independently, and studies examining their interactions throughout the human genome are lacking. To identify genetic loci that interact with the epigenome, we performed the first genome-wide DNA methylation quantitative trait locus (mQTL) analysis in human pancreatic islets. We related 574,553 single nucleotide polymorphisms (SNPs) with genome-wide DNA methylation data of 468,787 CpG sites targeting 99% of RefSeq genes in islets from 89 donors. We identified 67,438 SNP-CpG pairs in cis, corresponding to 36,783 SNPs (6.4% of tested SNPs) and 11,735 CpG sites (2.5% of tested CpGs), and 2,562 significant SNP-CpG pairs in trans, corresponding to 1,465 SNPs (0.3% of tested SNPs) and 383 CpG sites (0.08% of tested CpGs), showing significant associations after correction for multiple testing. These include reported diabetes loci, e.g. ADCY5, KCNJ11, HLA-DQA1, INS, PDX1 and GRB10. CpGs of significant cis-mQTLs were overrepresented in the gene body and outside of CpG islands. Follow-up analyses further identified mQTLs associated with gene expression and insulin secretion in human islets. Causal inference test (CIT) identified SNP-CpG pairs where DNA methylation in human islets is the potential mediator of the genetic association with gene expression or insulin secretion. Functional analyses further demonstrated that identified candidate genes (GPX7, GSTT1 and SNX19) directly affect key biological processes such as proliferation and apoptosis in pancreatic β-cells. Finally, we found direct correlations between DNA methylation of 22,773 (4.9%) CpGs with mRNA expression of 4,876 genes, where 90% of the correlations were negative when CpGs were located in the region surrounding transcription start site. 
Our study demonstrates for the first time how genome-wide genetic and epigenetic variation interacts to influence gene expression, islet function and potential diabetes risk in humans.
Inter-individual variation in genetics and epigenetics affects biological processes and disease susceptibility. However, most studies have investigated genetic and epigenetic mechanisms independently, and to uncover novel mechanisms affecting disease susceptibility there is a clear need to study interactions between these factors on a genome-wide scale. To identify novel loci affecting islet function and potentially diabetes, we performed the first genome-wide methylation quantitative trait locus (mQTL) analysis in human pancreatic islets including DNA methylation of 468,787 CpG sites located throughout the genome. Our results showed that DNA methylation of 11,735 CpGs in 4,504 unique genes is regulated by genetic factors located in cis (67,438 SNP-CpG pairs). Furthermore, significant mQTLs cover previously reported diabetes loci including KCNJ11, INS, HLA, PDX1 and GRB10. We also found mQTLs associated with gene expression and insulin secretion in human islets. By performing causality inference tests (CIT), we identified CpGs where DNA methylation potentially mediates the genetic impact on gene expression and insulin secretion. Our functional follow-up experiments further demonstrated that identified mQTLs/genes (GPX7, GSTT1 and SNX19) directly affect pancreatic β-cell function. Together, our study provides a detailed map of genome-wide associations between genetic and epigenetic variation, which affect gene expression and insulin secretion in human pancreatic islets.
Affiliation(s)
- Anders H. Olsson: Department of Clinical Sciences, Epigenetics and Diabetes, Lund University Diabetes Centre, Clinical Research Centre, Malmö, Sweden
- Petr Volkov: Department of Clinical Sciences, Epigenetics and Diabetes, Lund University Diabetes Centre, Clinical Research Centre, Malmö, Sweden
- Karl Bacos: Department of Clinical Sciences, Epigenetics and Diabetes, Lund University Diabetes Centre, Clinical Research Centre, Malmö, Sweden
- Tasnim Dayeh: Department of Clinical Sciences, Epigenetics and Diabetes, Lund University Diabetes Centre, Clinical Research Centre, Malmö, Sweden
- Elin Hall: Department of Clinical Sciences, Epigenetics and Diabetes, Lund University Diabetes Centre, Clinical Research Centre, Malmö, Sweden
- Emma A. Nilsson: Department of Clinical Sciences, Epigenetics and Diabetes, Lund University Diabetes Centre, Clinical Research Centre, Malmö, Sweden
- Claes Ladenvall: Department of Clinical Sciences, Diabetes and Endocrinology, Lund University Diabetes Centre, Clinical Research Centre, Malmö, Sweden
- Tina Rönn: Department of Clinical Sciences, Epigenetics and Diabetes, Lund University Diabetes Centre, Clinical Research Centre, Malmö, Sweden
- Charlotte Ling: Department of Clinical Sciences, Epigenetics and Diabetes, Lund University Diabetes Centre, Clinical Research Centre, Malmö, Sweden
|
396
|
Ham S, Roh TY. A Follow-up Association Study of Genetic Variants for Bone Mineral Density in a Korean Population. Genomics Inform 2014; 12:114-20. [PMID: 25317110 PMCID: PMC4196375 DOI: 10.5808/gi.2014.12.3.114] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/30/2014] [Revised: 08/14/2014] [Accepted: 08/19/2014] [Indexed: 12/25/2022] Open
Abstract
Bone mineral density (BMD) is one of the quantitative traits that are genetically inherited and affected by various factors. Over the past years, genome-wide association studies (GWASs) have searched for many genetic loci that influence BMD. A recent meta-analysis of 17 GWASs for BMD of the femoral neck and lumbar spine is the largest GWAS for BMD to date and offers 64 single-nucleotide polymorphisms (SNPs) in 56 associated loci. We investigated these BMD loci in a Korean population called Korea Association REsource (KARE) to assess their validity in an independent study. The KARE population contains genotypes from 8,842 individuals, and their BMD levels were measured at the distal radius (BMD-RT) and midshaft tibia (BMD-TT). Thirteen genomic loci among 56 loci were significantly associated with BMD variations, and 3 loci were involved in known biological pathways related to BMD. In order to find putative functional variants, nearby SNPs in linkage disequilibrium were annotated, and their possible functional effects were predicted. These findings suggest that tens of variants, not a single factor, may contribute to the genetic architecture of BMD and have an important role regardless of ethnic group, and they highlight the importance of replication studies in GWASs to validate genuine loci for BMD variation.
Affiliation(s)
- Seokjin Ham: Department of Life Sciences, POSTECH, Pohang 790-784, Korea
- Tae-Young Roh: Department of Life Sciences, POSTECH, Pohang 790-784, Korea; Division of Integrative Biosciences and Biotechnology, POSTECH, Pohang 790-784, Korea
|
397
|
Abstract
Genomic information reported as haplotypes rather than genotypes will be increasingly important for personalized medicine. Current technologies generate diploid sequence data that is rarely resolved into its constituent haplotypes. Furthermore, paradigms for thinking about genomic information are based on interpreting genotypes rather than haplotypes. Nevertheless, haplotypes have historically been useful in contexts ranging from population genetics to disease-gene mapping efforts. The main approaches for phasing genomic sequence data are molecular haplotyping, genetic haplotyping, and population-based inference. Long-read sequencing technologies are enabling longer molecular haplotypes, and decreases in the cost of whole-genome sequencing are enabling the sequencing of whole-chromosome genetic haplotypes. Hybrid approaches combining high-throughput short-read assembly with strategic approaches that enable physical or virtual binning of reads into haplotypes are enabling multi-gene haplotypes to be generated from single individuals. These techniques can be further combined with genetic and population approaches. Here, we review advances in whole-genome haplotyping approaches and discuss the importance of haplotypes for genomic medicine. Clinical applications include diagnosis by recognition of compound heterozygosity and by phasing regulatory variation to coding variation. Haplotypes, which are more specific than less complex variants such as single nucleotide variants, also have applications in prognostics and diagnostics, in the analysis of tumors, and in typing tissue for transplantation. Future advances will include technological innovations, the application of standard metrics for evaluating haplotype quality, and the development of databases that link haplotypes to disease.
Affiliation(s)
- Gustavo Glusman: Institute for Systems Biology, Terry Avenue North, Seattle, WA 98109 USA
- Hannah C Cox: Institute for Systems Biology, Terry Avenue North, Seattle, WA 98109 USA
- Jared C Roach: Institute for Systems Biology, Terry Avenue North, Seattle, WA 98109 USA
|
398
|
A rare variant in APOC3 is associated with plasma triglyceride and VLDL levels in Europeans. Nat Commun 2014; 5:4871. [PMID: 25225788 PMCID: PMC4167609 DOI: 10.1038/ncomms5871] [Citation(s) in RCA: 58] [Impact Index Per Article: 5.8] [Received: 03/07/2014] [Accepted: 07/30/2014] [Indexed: 02/02/2023] Open
Abstract
The analysis of rich catalogues of genetic variation from population-based sequencing provides an opportunity to screen for functional effects. Here we report a rare variant in APOC3 (rs138326449-A, minor allele frequency ~0.25% (UK)) associated with plasma triglyceride (TG) levels (-1.43 s.d. (s.e.=0.27) per minor allele, P-value=8.0 × 10⁻⁸) discovered in 3,202 individuals with low read-depth, whole-genome sequence. We replicate this in 12,831 participants from five additional samples of Northern and Southern European origin (-1.0 s.d. (s.e.=0.173), P-value=7.32 × 10⁻⁹). This is consistent with an effect between 0.5 and 1.5 mmol l⁻¹ dependent on population. We show that a single predicted splice donor variant is responsible for association signals and is independent of known common variants. Analyses suggest an independent relationship between rs138326449 and high-density lipoprotein (HDL) levels. This represents one of the first examples of a rare, large effect variant identified from whole-genome sequencing at a population scale.
|
399
|
Chen W, Schaid DJ. PedBLIMP: extending linear predictors to impute genotypes in pedigrees. Genet Epidemiol 2014; 38:531-41. [PMID: 25044249 DOI: 10.1002/gepi.21838] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 04/09/2014] [Revised: 05/15/2014] [Accepted: 05/19/2014] [Indexed: 12/13/2022]
Abstract
Recently, Wen and Stephens (Wen and Stephens [2010] Ann Appl Stat 4(3):1158-1182) proposed a linear predictor, called BLIMP, that uses conditional multivariate normal moments to impute genotypes with accuracy similar to current state-of-the-art methods. One novelty is that it regularized the estimated covariance matrix based on a model from population genetics. We extended multivariate moments to impute genotypes in pedigrees. Our proposed method, PedBLIMP, utilizes both the linkage-disequilibrium (LD) information estimated from external panel data and the pedigree structure or identity-by-descent (IBD) information. The proposed method was evaluated on a pedigree design where some individuals were genotyped with dense markers and the rest with sparse markers. We found that incorporating the pedigree/IBD information can improve imputation accuracy compared to BLIMP. Because rare variants usually have low LD with other single-nucleotide polymorphisms (SNPs), incorporating pedigree/IBD information largely improved imputation accuracy for rare variants. We also compared PedBLIMP with IMPUTE2 and GIGI. Results show that when sparse markers are in a certain density range, our method can outperform both IMPUTE2 and GIGI.
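The "conditional multivariate normal moments" mentioned in this abstract can be illustrated with the standard conditional-mean formula E[x_u | x_t] = mu_u + Sigma_ut Sigma_tt^{-1} (x_t - mu_t), where t indexes typed markers and u untyped ones. The sketch below is only an illustration of that formula in plain Python. It omits the population-genetic regularization of the covariance matrix and the pedigree/IBD information that the abstract describes, and the function names `solve` and `impute_conditional_mean`, as well as the toy means and covariances, are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def impute_conditional_mean(mu, sigma, typed, untyped, observed):
    """Return E[x_u | x_t] for each untyped marker u.

    mu: per-marker mean genotype; sigma: marker covariance matrix;
    typed / untyped: lists of marker indices; observed: genotypes at the
    typed markers, in the same order as `typed`.
    """
    sigma_tt = [[sigma[i][j] for j in typed] for i in typed]
    resid = [observed[k] - mu[i] for k, i in enumerate(typed)]
    weights = solve(sigma_tt, resid)  # Sigma_tt^{-1} (x_t - mu_t)
    return [mu[u] + sum(sigma[u][i] * w for i, w in zip(typed, weights))
            for u in untyped]


# Hypothetical toy panel: marker 1 is untyped and correlated with marker 0.
mu = [1.0, 1.0, 1.0]
sigma = [[0.5, 0.4, 0.0],
         [0.4, 0.5, 0.0],
         [0.0, 0.0, 0.5]]
print(impute_conditional_mean(mu, sigma, typed=[0, 2], untyped=[1],
                              observed=[2.0, 0.0]))
```

In the toy panel the imputed dosage at marker 1 is pulled above its mean because the correlated typed marker 0 was observed above its own mean, while the uncorrelated marker 2 contributes nothing.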
Affiliation(s)
- Wenan Chen: Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, United States of America
|
400
|
Phasing at any level of relatedness. Nat Methods 2014. [DOI: 10.1038/nmeth.2978] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
|