Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Browning SR. Missing data imputation and haplotype phase inference for genome-wide association studies. Hum Genet 2008;124:439-50. [PMID: 18850115 DOI: 10.1007/s00439-008-0568-7] [Citation(s) in RCA: 98] [Impact Index Per Article: 6.1] [Received: 03/18/2008] [Accepted: 09/29/2008] [Indexed: 01/25/2023]

Cited by Other Article(s)

Guenther DT, Follett J, Amouri R, Sassi SB, Hentati F, Farrer MJ. The Evolution of Genetic Variability at the LRRK2 Locus. Genes (Basel) 2024;15:878. [PMID: 39062657 PMCID: PMC11275506 DOI: 10.3390/genes15070878] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2024] [Revised: 06/28/2024] [Accepted: 07/01/2024] [Indexed: 07/28/2024]

Yeon J, Le NT, Heo J, Sim SC. Low-density SNP markers with high prediction accuracy of genomic selection for bacterial wilt resistance in tomato. Front Plant Sci 2024;15:1402693. [PMID: 38872894 PMCID: PMC11169939 DOI: 10.3389/fpls.2024.1402693] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Accepted: 05/07/2024] [Indexed: 06/15/2024]

Yun JS, Jung SH, Lee SN, Jung SM, Won HH, Kim D, Choi JA. Polygenic risk score-based phenome-wide association for glaucoma and its impact on disease susceptibility in two large biobanks. J Transl Med 2024;22:355. [PMID: 38622600 PMCID: PMC11020996 DOI: 10.1186/s12967-024-05152-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Accepted: 04/01/2024] [Indexed: 04/17/2024]

Abstract

BACKGROUND

Glaucoma is a leading cause of irreversible blindness worldwide. Considerable uncertainty remains regarding the association between a variety of phenotypes and the genetic risk of glaucoma, as well as the impact they exert on glaucoma development.

METHODS

We investigated the associations of genetic liability for primary open angle glaucoma (POAG) with a wide range of potential risk factors and assessed its impact on the risk of incident glaucoma. The phenome-wide association study (PheWAS) approach was applied to determine the association of POAG polygenic risk score (PRS) with a wide range of phenotypes in 377,852 participants from the UK Biobank study and 43,623 participants from the Penn Medicine Biobank study, all of European ancestry. Participants were stratified into four risk tiers: low, intermediate, high, and very high risk. Cox proportional hazard models assessed the relationship of POAG PRS and ocular factors with new glaucoma events.

RESULTS

In both the discovery and replication sets of the PheWAS, a higher genetic predisposition to POAG was specifically correlated with ocular disease phenotypes. The POAG PRS exhibited correlations with low corneal hysteresis, refractive error, and ocular hypertension, demonstrating a strong association with the onset of glaucoma. Individuals carrying a high genetic burden exhibited a 9.20-fold, 11.88-fold, and 28.85-fold increase in glaucoma incidence when associated with low corneal hysteresis, high myopia, and elevated intraocular pressure, respectively.

CONCLUSION

Genetic susceptibility to POAG primarily influences ocular conditions, with limited systemic associations. Notably, the baseline polygenic risk for POAG robustly associates with new glaucoma events, revealing a large combined effect of genetic and ocular risk factors on glaucoma incidence.
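
The abstract above relies on a polygenic risk score and a four-tier stratification without giving details. As a rough illustration only: a PRS is conventionally a weighted sum of effect-allele dosages, and tiers can be assigned by percentile bands. The function names and the percentile cut-offs (20, 80, 95) below are hypothetical and are not taken from the study.

```python
import numpy as np

def polygenic_risk_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1 or 2) for each individual.

    dosages: (n_individuals, n_variants) array
    weights: (n_variants,) array of per-variant effect sizes
    """
    dosages = np.asarray(dosages, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return dosages @ weights

def assign_risk_tier(scores, cutoffs=(20, 80, 95)):
    """Label each score by percentile band.

    The cut-offs are hypothetical; the abstract does not state the ones used.
    """
    scores = np.asarray(scores, dtype=float)
    edges = np.percentile(scores, cutoffs)
    labels = np.array(["low", "intermediate", "high", "very high"])
    # searchsorted returns 0..3, the index of the band each score falls into
    return labels[np.searchsorted(edges, scores, side="right")]
```

With these assumed cut-offs, the lowest 20% of scores are labelled low, the next 60% intermediate, the next 15% high, and the top 5% very high.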

Jung SH, Lee YC, Shivakumar M, Kim J, Yun JS, Park WY, Won HH, Kim D. Association between genetic risk and adherence to healthy lifestyle for developing age-related hearing loss. BMC Med 2024;22:141. [PMID: 38532472 DOI: 10.1186/s12916-024-03364-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Accepted: 03/18/2024] [Indexed: 03/28/2024]

Abstract

BACKGROUND

Previous studies have shown that lifestyle/environmental factors could accelerate the development of age-related hearing loss (ARHL). However, there has not yet been a study investigating the joint association among genetics, lifestyle/environmental factors, and adherence to healthy lifestyle for risk of ARHL. We aimed to assess the association between ARHL genetic variants, lifestyle/environmental factors, and adherence to healthy lifestyle as pertains to risk of ARHL.

METHODS

This case-control study included 376,464 European individuals aged 40 to 69 years, enrolled between 2006 and 2010 in the UK Biobank (UKBB). As a replication set, we also included a total of 26,523 individuals considered of European ancestry and 9834 individuals considered of African-American ancestry through the Penn Medicine Biobank (PMBB). The polygenic risk score (PRS) for ARHL was derived from a sensorineural hearing loss genome-wide association study from the FinnGen Consortium and categorized as low, intermediate, high, and very high. We selected lifestyle/environmental factors that have been previously studied in association with hearing loss. A composite healthy lifestyle score was determined using seven selected lifestyle behaviors and one environmental factor.

RESULTS

Of the 376,464 participants in the UKBB, 87,066 (23.1%) were ARHL cases and 289,398 (76.9%) were controls. Individuals with a very high PRS for ARHL had a 49% higher risk of ARHL than those with a low PRS (adjusted OR, 1.49; 95% CI, 1.36-1.62; P < .001), a finding that was replicated in the PMBB cohort. A very poor lifestyle was also associated with risk of ARHL (adjusted OR, 3.03; 95% CI, 2.75-3.35; P < .001). These risk factors showed joint effects on the risk of ARHL. Conversely, adherence to a healthy lifestyle in relation to hearing mostly attenuated the risk of ARHL even in individuals with very high PRS (adjusted OR, 0.21; 95% CI, 0.09-0.52; P < .001).

CONCLUSIONS

Our findings demonstrated a significant joint association between genetic and lifestyle factors regarding ARHL. In addition, our analysis suggested that lifestyle adherence in individuals with high genetic risk could reduce the risk of ARHL.
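
The odds ratios and confidence intervals quoted in the results can be sanity-checked against their reported P values. The sketch below assumes the intervals are Wald-type, that is, symmetric on the log-odds scale with a critical value of 1.96; the abstract does not say how they were computed, so this is an assumption, and the function names are illustrative.

```python
import math

def log_or_standard_error(ci_low, ci_high, z_crit=1.96):
    """Back out the standard error of log(OR) from a 95% CI (Wald assumption)."""
    return (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)

def wald_z(odds_ratio, ci_low, ci_high):
    """Wald z statistic for the null hypothesis OR = 1."""
    return math.log(odds_ratio) / log_or_standard_error(ci_low, ci_high)

def two_sided_p(z):
    """Two-sided P value under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# Values quoted in the abstract: (OR, lower CI bound, upper CI bound)
for odds_ratio, low, high in [(1.49, 1.36, 1.62), (3.03, 2.75, 3.35), (0.21, 0.09, 0.52)]:
    z = wald_z(odds_ratio, low, high)
    print(f"OR {odds_ratio}: z = {z:.2f}, below .001: {two_sided_p(z) < 0.001}")
```

Under this assumption all three estimates give two-sided P values below .001, in line with the thresholds reported in the abstract.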

Lee YC, Jung SH, Shivakumar M, Cha S, Park WY, Won HH, Eun YG, Penn Medicine Biobank, Kim D. Polygenic risk score-based phenome-wide association study of head and neck cancer across two large biobanks. BMC Med 2024;22:120. [PMID: 38486201 PMCID: PMC10941505 DOI: 10.1186/s12916-024-03305-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2023] [Accepted: 02/15/2024] [Indexed: 03/17/2024]

Abstract

BACKGROUND

Numerous observational studies have highlighted associations of genetic predisposition of head and neck squamous cell carcinoma (HNSCC) with diverse risk factors, but these findings are constrained by design limitations of observational studies. In this study, we utilized a phenome-wide association study (PheWAS) approach, incorporating a polygenic risk score (PRS) derived from a wide array of genomic variants, to systematically investigate phenotypes associated with genetic predisposition to HNSCC. Furthermore, we validated our findings across heterogeneous cohorts, enhancing the robustness and generalizability of our results.

METHODS

We derived PRSs for HNSCC and its subgroups, oropharyngeal cancer and oral cancer, using large-scale genome-wide association study summary statistics from the Genetic Associations and Mechanisms in Oncology Network. We conducted a comprehensive investigation, leveraging genotyping data and electronic health records from 308,492 individuals in the UK Biobank and 38,401 individuals in the Penn Medicine Biobank (PMBB), and subsequently performed PheWAS to elucidate the associations between PRS and a wide spectrum of phenotypes.

RESULTS

The HNSCC PRS showed significant associations with phenotypes related to tobacco use disorder (OR, 1.06; 95% CI, 1.05-1.08; P = 3.50 × 10⁻¹⁵), alcoholism (OR, 1.06; 95% CI, 1.04-1.09; P = 6.14 × 10⁻⁹), alcohol-related disorders (OR, 1.08; 95% CI, 1.05-1.11; P = 1.09 × 10⁻⁸), emphysema (OR, 1.11; 95% CI, 1.06-1.16; P = 5.48 × 10⁻⁶), chronic airway obstruction (OR, 1.05; 95% CI, 1.03-1.07; P = 2.64 × 10⁻⁵), and cancer of bronchus (OR, 1.08; 95% CI, 1.04-1.13; P = 4.68 × 10⁻⁵). These findings were replicated in the PMBB cohort, and sensitivity analyses, including the exclusion of HNSCC cases and the major histocompatibility complex locus, confirmed the robustness of these associations. Additionally, we identified significant associations between HNSCC PRS and lifestyle factors related to smoking and alcohol consumption.

CONCLUSIONS

The study demonstrated the potential of PRS-based PheWAS in revealing associations between genetic risk factors for HNSCC and various phenotypic traits. The findings emphasized the importance of considering genetic susceptibility in understanding HNSCC and highlighted shared genetic bases between HNSCC and other health conditions and lifestyles.

Childebayeva A, Zavala EI. Review: Computational analysis of human skeletal remains in ancient DNA and forensic genetics. iScience 2023;26:108066. [PMID: 37927550 PMCID: PMC10622734 DOI: 10.1016/j.isci.2023.108066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2023]

Kriaridou C, Tsairidou S, Fraslin C, Gorjanc G, Looseley ME, Johnston IA, Houston RD, Robledo D. Evaluation of low-density SNP panels and imputation for cost-effective genomic selection in four aquaculture species. Front Genet 2023;14:1194266. [PMID: 37252666 PMCID: PMC10213886 DOI: 10.3389/fgene.2023.1194266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2023] [Accepted: 04/26/2023] [Indexed: 05/31/2023]

Abstract

Genomic selection can accelerate genetic progress in aquaculture breeding programmes, particularly for traits measured on siblings of selection candidates. However, it is not widely implemented in most aquaculture species, and remains expensive due to high genotyping costs. Genotype imputation is a promising strategy that can reduce genotyping costs and facilitate the broader uptake of genomic selection in aquaculture breeding programmes. Genotype imputation can predict ungenotyped SNPs in populations genotyped at low density (LD), using a reference population genotyped at high density (HD). In this study, we used datasets of four aquaculture species (Atlantic salmon, turbot, common carp and Pacific oyster), phenotyped for different traits, to investigate the efficacy of genotype imputation for cost-effective genomic selection. The four datasets had been genotyped at HD, and eight LD panels (300-6,000 SNPs) were generated in silico. SNPs were selected to be: i) evenly distributed according to physical position, ii) selected to minimise the linkage disequilibrium between adjacent SNPs, or iii) randomly selected. Imputation was performed with three different software packages (AlphaImpute2, FImpute v.3 and findhap v.4). The results revealed that FImpute v.3 was faster and achieved higher imputation accuracies. Imputation accuracy increased with increasing panel density for both SNP selection methods, reaching correlations greater than 0.95 in the three fish species and 0.80 in Pacific oyster. In terms of genomic prediction accuracy, the LD and the imputed panels performed similarly, reaching values very close to the HD panels, except in the Pacific oyster dataset, where the LD panel performed better than the imputed panel. In the fish species, when LD panels were used for genomic prediction without imputation, selection of markers based on either physical or genetic distance (instead of randomly) resulted in a high prediction accuracy, whereas imputation achieved near maximal prediction accuracy independently of the LD panel, showing higher reliability. Our results suggest that, in fish species, well-selected LD panels may achieve near maximal genomic selection prediction accuracy, and that the addition of imputation will result in maximal accuracy independently of the LD panel. These strategies represent effective and affordable methods to incorporate genomic selection into most aquaculture settings.
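
Two ideas in this abstract lend themselves to a small illustration: building an in silico LD panel from evenly spaced physical positions, and scoring imputation by the correlation between true and imputed genotypes. The sketch below is a simplified stand-in, not the authors' pipeline. Both function names are hypothetical, and pooling all genotypes into a single Pearson correlation is an assumption, since the abstract does not say whether correlations were computed per SNP or per individual.

```python
import numpy as np

def evenly_spaced_panel(positions, n_markers):
    """Return indices of the SNPs closest to n_markers evenly spaced targets.

    May return fewer than n_markers indices if several targets share
    the same nearest SNP.
    """
    positions = np.asarray(positions, dtype=float)
    order = np.argsort(positions)
    sorted_pos = positions[order]
    targets = np.linspace(sorted_pos[0], sorted_pos[-1], n_markers)
    # Candidate neighbour to the right of each target, kept within bounds
    right = np.clip(np.searchsorted(sorted_pos, targets), 1, len(sorted_pos) - 1)
    left = right - 1
    # Keep whichever neighbour is closer (ties go to the left neighbour)
    pick_left = (targets - sorted_pos[left]) <= (sorted_pos[right] - targets)
    nearest = np.where(pick_left, left, right)
    return np.unique(order[nearest])

def imputation_accuracy(true_genotypes, imputed_genotypes):
    """Pearson correlation between true and imputed genotype dosages."""
    t = np.asarray(true_genotypes, dtype=float).ravel()
    i = np.asarray(imputed_genotypes, dtype=float).ravel()
    return float(np.corrcoef(t, i)[0, 1])
```

In a masking experiment of the kind described, the HD genotypes outside the selected panel would be hidden, imputed by the chosen software, and then compared with the held-back values using a measure such as `imputation_accuracy`.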

Wienbrandt L, Ellinghaus D. EagleImp: fast and accurate genome-wide phasing and imputation in a single tool. Bioinformatics 2022;38:4999-5006. [PMID: 36130053 PMCID: PMC9665855 DOI: 10.1093/bioinformatics/btac637] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2022] [Revised: 09/15/2022] [Accepted: 09/19/2022] [Indexed: 12/24/2022]

Abstract

MOTIVATION

Reference-based phasing and genotype imputation algorithms have been developed with sublinear theoretical runtime behaviour, but runtimes are still high in practice when large genome-wide reference datasets are used.

RESULTS

We developed EagleImp, a software tool based on the methods used in the existing tools Eagle2 and PBWT, which allows accurate and accelerated phasing and imputation in a single tool through algorithmic and technical improvements and new features. We compared the accuracy and runtime of EagleImp with Eagle2, PBWT and prominent imputation servers using whole-genome sequencing data from the 1000 Genomes Project, the Haplotype Reference Consortium and simulated data with 1 million reference genomes. EagleImp was 2-30 times faster (depending on the single or multiprocessor configuration selected and the size of the reference panel) than Eagle2 combined with PBWT, with the same or better phasing and imputation quality in all tested scenarios. For common variants investigated in typical genome-wide association studies, EagleImp provided the same or higher imputation accuracy than the Sanger Imputation Service, Michigan Imputation Server and the newly developed TOPMed Imputation Server, despite their larger (not publicly available) reference panels. Additional features include automated chromosome splitting and memory management at runtime to avoid job aborts, fast reading and writing of large files and various user-configurable algorithm and output options. Due to the technical optimizations, EagleImp can perform fast and accurate reference-based phasing and imputation and is ready for future large reference panels in the order of 1 million genomes.

AVAILABILITY AND IMPLEMENTATION

EagleImp is implemented in C++ and freely available for download at https://github.com/ikmb/eagleimp.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Speck A, Trouvé JP, Enjalbert J, Geffroy V, Joets J, Moreau L. Genetic Architecture of Powdery Mildew Resistance Revealed by a Genome-Wide Association Study of a Worldwide Collection of Flax (Linum usitatissimum L.). Front Plant Sci 2022;13:871633. [PMID: 35812909 PMCID: PMC9263915 DOI: 10.3389/fpls.2022.871633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2022] [Accepted: 04/22/2022] [Indexed: 06/15/2023]

Ausmees K, Sanchez-Quinto F, Jakobsson M, Nettelblad C. An empirical evaluation of genotype imputation of ancient DNA. G3 (Bethesda) 2022;12:6575448. [PMID: 35482488 PMCID: PMC9157144 DOI: 10.1093/g3journal/jkac089] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/28/2022] [Accepted: 04/05/2022] [Indexed: 12/12/2022]

Genome-wide analysis reveals associations between climate and regional patterns of adaptive divergence and dispersal in American pikas. Heredity (Edinb) 2021;127:443-454. [PMID: 34537819 PMCID: PMC8551249 DOI: 10.1038/s41437-021-00472-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/31/2021] [Revised: 09/06/2021] [Accepted: 09/06/2021] [Indexed: 02/07/2023]

Abstract

Understanding the role of adaptation in species' responses to climate change is important for evaluating the evolutionary potential of populations and informing conservation efforts. Population genomics provides a useful approach for identifying putative signatures of selection and the underlying environmental factors or biological processes that may be involved. Here, we employed a population genomic approach within a space-for-time study design to investigate the genetic basis of local adaptation and reconstruct patterns of movement across rapidly changing environments in a thermally sensitive mammal, the American pika (Ochotona princeps). Using genotypic data at 49,074 single-nucleotide polymorphisms (SNPs), we analyzed patterns of genome-wide diversity, structure, and migration along three independent elevational transects located at the northern extent (Tweedsmuir South Provincial Park, British Columbia, Canada) and core (North Cascades National Park, Washington, USA) of the Cascades lineage. We identified 899 robust outlier SNPs within- and among-transects. Of those annotated to genes with known function, many were linked with cellular processes related to climate stress including ATP-binding, ATP citrate synthase activity, ATPase activity, hormone activity, metal ion-binding, and protein-binding. Moreover, we detected evidence for contrasting patterns of directional migration along transects across geographic regions that suggest an increased propensity for American pikas to disperse among lower elevation populations at higher latitudes where environments are generally cooler. Ultimately, our data indicate that fine-scale demographic patterns and adaptive processes may vary among populations of American pikas, providing an important context for evaluating biotic responses to climate change in this species and other alpine-adapted mammals.

Meger J, Ulaszewski B, Burczyk J. Genomic signatures of natural selection at phenology-related genes in a widely distributed tree species Fagus sylvatica L. BMC Genomics 2021;22:583. [PMID: 34332553 PMCID: PMC8325806 DOI: 10.1186/s12864-021-07907-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/25/2021] [Accepted: 07/20/2021] [Indexed: 11/17/2022]

Abstract

BACKGROUND

Diversity among phenology-related genes is predicted to be a contributing factor in local adaptations seen in widely distributed plant species that grow in climatically variable geographic areas, such as forest trees. European beech (Fagus sylvatica L.) is widespread, and is one of the most important broadleaved tree species in Europe; however, its potential for adaptation to climate change is a matter of uncertainty, and little is known about the molecular basis of climate change-relevant traits like bud burst.

RESULTS

We explored single nucleotide polymorphisms (SNP) at candidate genes related to bud burst in beech individuals sampled across 47 populations from Europe. SNP diversity was monitored for 380 candidate genes using a sequence capture approach, providing 2909 unlinked SNP loci. We used two complementary analytical methods to find loci significantly associated with geographic variables, climatic variables (expressed as principal components), or phenotypic variables (spring and autumn phenology, height, survival). Redundancy analysis (RDA) was used to detect candidate markers across two spatial scales (entire study area and within subregions). We revealed 201 candidate SNPs at the broadest scale, 53.2% of which were associated with phenotypic variables. Additive polygenic scores, which provide a measure of the cumulative signal across significant candidate SNPs, were correlated with a climate variable (first principal component, PC1) related to temperature and precipitation availability, and spring phenology. However, different genotype-environment associations were identified within Southeastern Europe as compared to the entire geographic range of European beech.

CONCLUSIONS

Environmental conditions play important roles as drivers of genetic diversity of phenology-related genes that could influence local adaptation in European beech. Selection in beech favors genotypes with earlier bud burst under warmer and wetter habitats within its range; however, selection pressures may differ across spatial scales.

Jenkins CA, Schofield EC, Mellersh CS, De Risio L, Ricketts SL. Improving the resolution of canine genome-wide association studies using genotype imputation: A study of two breeds. Anim Genet 2021;52:703-713. [PMID: 34252218 PMCID: PMC8514152 DOI: 10.1111/age.13117] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/15/2020] [Revised: 05/07/2021] [Accepted: 06/24/2021] [Indexed: 01/08/2023]

Charon C, Allodji R, Meyer V, Deleuze JF. Impact of pre- and post-variant filtration strategies on imputation. Sci Rep 2021;11:6214. [PMID: 33737531 PMCID: PMC7973508 DOI: 10.1038/s41598-021-85333-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/14/2020] [Accepted: 02/22/2021] [Indexed: 01/04/2023]

Genome-wide haplotype association study in imaging genetics using whole-brain sulcal openings of 16,304 UK Biobank subjects. Eur J Hum Genet 2021;29:1424-1437. [PMID: 33664500 PMCID: PMC8440755 DOI: 10.1038/s41431-021-00827-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/17/2020] [Revised: 12/18/2020] [Accepted: 02/04/2021] [Indexed: 11/29/2022]

Negisho K, Shibru S, Pillen K, Ordon F, Wehner G. Genetic diversity of Ethiopian durum wheat landraces. PLoS One 2021;16:e0247016. [PMID: 33596260 PMCID: PMC7888639 DOI: 10.1371/journal.pone.0247016] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 08/06/2020] [Accepted: 01/30/2021] [Indexed: 01/27/2023]

Abstract

Genetic diversity and population structure assessment in crops is essential for marker trait association, marker assisted breeding and crop germplasm conservation. We analyzed a set of 285 durum wheat accessions comprising 215 Ethiopian durum wheat landraces, 10 released durum wheat varieties, 10 advanced durum wheat lines from Ethiopia, and 50 durum wheat lines from CIMMYT. We investigated the genetic diversity and population structure for the complete panel and, separately, for the 215 landraces, based on 11,919 SNP markers with known physical positions. The whole panel was clustered into two populations representing on the one hand mainly the landraces, and on the other hand mainly released, advanced and CIMMYT lines. Further population structure analysis of the landraces uncovered 4 subgroups, emphasizing the high degree of genetic diversity within Ethiopian durum landraces. Population structure based AMOVA for both sets unveiled significant (P < 0.001) variation between populations and within populations. Total variation within population accessions (81%, 76%) was higher than total variation between populations (19%, 24%) for both sets. Based on the population structure analysis, genetic differentiation (FST) was 0.19 for the whole set and 0.24 for the Ethiopian landraces, and gene flow (Nm) was 1.04 and 0.81, respectively, indicating high genetic differentiation and limited gene flow. Diversity indices verify that the landrace panel was more diverse (I = 0.7, He = 0.46, uHe = 0.46) than the advanced lines (I = 0.6, He = 0.42, uHe = 0.42). Similarly, differences within the landrace clusters were observed. In summary, a high genetic diversity within Ethiopian durum wheat landraces was detected, which may be a target for national and international wheat improvement programs to exploit valuable traits for biotic and abiotic stresses.
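
The FST and Nm values quoted in this abstract are linked by a standard approximation. A minimal sketch, assuming Wright's island-model relation Nm ≈ (1/FST − 1)/4, follows. The abstract does not state which estimator the authors used, so the function name and the choice of formula are assumptions.

```python
def gene_flow_from_fst(fst):
    """Approximate number of migrants per generation (Nm) from FST.

    Uses the island-model relation Nm = (1/FST - 1) / 4, which is an
    assumed formula and not necessarily the authors' estimator.
    """
    if not 0.0 < fst < 1.0:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1.0 / fst - 1.0) / 4.0

for label, fst in [("whole set", 0.19), ("Ethiopian landraces", 0.24)]:
    print(f"{label}: FST = {fst:.2f}, Nm = {gene_flow_from_fst(fst):.2f}")
```

With the rounded FST values this gives roughly 1.07 and 0.79, close to the reported 1.04 and 0.81. The small gap would be expected if the published Nm was computed from unrounded FST values, though that explanation is itself a guess.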

Scott MF, Ladejobi O, Amer S, Bentley AR, Biernaskie J, Boden SA, Clark M, Dell'Acqua M, Dixon LE, Filippi CV, Fradgley N, Gardner KA, Mackay IJ, O'Sullivan D, Percival-Alwyn L, Roorkiwal M, Singh RK, Thudi M, Varshney RK, Venturini L, Whan A, Cockram J, Mott R. Multi-parent populations in crops: a toolbox integrating genomics and genetic mapping with breeding. Heredity (Edinb) 2020;125:396-416. [PMID: 32616877 PMCID: PMC7784848 DOI: 10.1038/s41437-020-0336-6] [Citation(s) in RCA: 78] [Impact Index Per Article: 19.5] [Received: 01/27/2020] [Revised: 06/16/2020] [Accepted: 06/16/2020] [Indexed: 11/21/2022]

Affiliation(s)

Michael F Scott: UCL Genetics Institute, Gower Street, London, WC1E 6BT, UK
Olufunmilayo Ladejobi: UCL Genetics Institute, Gower Street, London, WC1E 6BT, UK
Samer Amer: University of Reading, Reading, RG6 6AH, UK; Faculty of Agriculture, Alexandria University, Alexandria, 23714, Egypt
Alison R Bentley: The John Bingham Laboratory, NIAB, 93 Lawrence Weaver Road, Cambridge, CB3 0LE, UK
Jay Biernaskie: Department of Plant Sciences, University of Oxford, South Parks Road, Oxford, OX1 3RB, UK
Scott A Boden: School of Agriculture, Food and Wine, University of Adelaide, Glen Osmond, SA, 5064, Australia
Matt Clark: Natural History Museum, London, UK
Matteo Dell'Acqua: Institute of Life Sciences, Scuola Superiore Sant'Anna, Pisa, Italy
Laura E Dixon: Faculty of Biological Sciences, University of Leeds, Leeds, LS2 9JT, UK
Carla V Filippi: Instituto de Agrobiotecnología y Biología Molecular (IABIMO), INTA-CONICET, Nicolas Repetto y Los Reseros s/n, 1686, Hurlingham, Buenos Aires, Argentina
Nick Fradgley: The John Bingham Laboratory, NIAB, 93 Lawrence Weaver Road, Cambridge, CB3 0LE, UK
Keith A Gardner: The John Bingham Laboratory, NIAB, 93 Lawrence Weaver Road, Cambridge, CB3 0LE, UK
Ian J Mackay: SRUC, West Mains Road, Kings Buildings, Edinburgh, EH9 3JG, UK
Donal O'Sullivan: University of Reading, Reading, RG6 6AH, UK
Lawrence Percival-Alwyn: The John Bingham Laboratory, NIAB, 93 Lawrence Weaver Road, Cambridge, CB3 0LE, UK
Manish Roorkiwal: Center of Excellence in Genomics and Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
Rakesh Kumar Singh: International Center for Biosaline Agriculture, Academic City, Dubai, United Arab Emirates
Mahendar Thudi: Center of Excellence in Genomics and Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
Rajeev Kumar Varshney: Center of Excellence in Genomics and Systems Biology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
Luca Venturini: Natural History Museum, London, UK
Alex Whan: CSIRO, GPO Box 1700, Canberra, ACT, 2601, Australia
James Cockram: The John Bingham Laboratory, NIAB, 93 Lawrence Weaver Road, Cambridge, CB3 0LE, UK
Richard Mott: UCL Genetics Institute, Gower Street, London, WC1E 6BT, UK

Cubry P, Pidon H, Ta KN, Tranchant-Dubreuil C, Thuillet AC, Holzinger M, Adam H, Kam H, Chrestin H, Ghesquière A, François O, Sabot F, Vigouroux Y, Albar L, Jouannic S. Genome Wide Association Study Pinpoints Key Agronomic QTLs in African Rice Oryza glaberrima. Rice (N Y) 2020;13:66. [PMID: 32936396 PMCID: PMC7494698 DOI: 10.1186/s12284-020-00424-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/08/2020] [Accepted: 08/31/2020] [Indexed: 05/08/2023]

Soifer L, Fong NL, Yi N, Ireland AT, Lam I, Sooknah M, Paw JS, Peluso P, Concepcion GT, Rank D, Hastie AR, Jojic V, Ruby JG, Botstein D, Roy MA. Fully Phased Sequence of a Diploid Human Genome Determined de Novo from the DNA of a Single Individual. G3 (Bethesda) 2020;10:2911-2925. [PMID: 32631951 PMCID: PMC7466960 DOI: 10.1534/g3.119.400995] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 12/12/2019] [Accepted: 06/26/2020] [Indexed: 12/17/2022]

Akdemir D, Knox R, Isidro y Sánchez J. Combining Partially Overlapping Multi-Omics Data in Databases Using Relationship Matrices. Front Plant Sci 2020;11:947. [PMID: 32765543 PMCID: PMC7381228 DOI: 10.3389/fpls.2020.00947] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/16/2020] [Accepted: 06/10/2020] [Indexed: 05/08/2023]

Magdy T, Kuo HH, Burridge PW. Precise and Cost-Effective Nanopore Sequencing for Post-GWAS Fine-Mapping and Causal Variant Identification. iScience 2020;23:100971. [PMID: 32203907 PMCID: PMC7096756 DOI: 10.1016/j.isci.2020.100971] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/30/2019] [Revised: 01/13/2020] [Accepted: 03/05/2020] [Indexed: 01/01/2023]

Zhang F, Wang Y, Mukiibi R, Chen L, Vinsky M, Plastow G, Basarab J, Stothard P, Li C. Genetic architecture of quantitative traits in beef cattle revealed by genome wide association studies of imputed whole genome sequence variants: I: feed efficiency and component traits. BMC Genomics 2020;21:36. [PMID: 31931702 PMCID: PMC6956504 DOI: 10.1186/s12864-019-6362-1] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 06/21/2019] [Accepted: 12/02/2019] [Indexed: 01/27/2023]

Abstract

BACKGROUND

Genome wide association studies (GWAS) on residual feed intake (RFI) and its component traits including daily dry matter intake (DMI), average daily gain (ADG), and metabolic body weight (MWT) were conducted in a population of 7573 animals from multiple beef cattle breeds based on 7,853,211 imputed whole genome sequence variants. The GWAS results were used to elucidate genetic architectures of the feed efficiency related traits in beef cattle.

RESULTS

The DNA variant allele substitution effects approximated a bell-shaped distribution for all the traits while the distribution of additive genetic variances explained by single DNA variants followed a scaled inverse chi-squared distribution to a greater extent. With a threshold of P-value < 1.00E-05, 16, 72, 88, and 116 lead DNA variants on multiple chromosomes were significantly associated with RFI, DMI, ADG, and MWT, respectively. In addition, lead DNA variants with potentially large pleiotropic effects on DMI, ADG, and MWT were found on chromosomes 6, 14 and 20. On average, missense, 3'UTR, 5'UTR, and other regulatory region variants exhibited larger allele substitution effects in comparison to other functional classes. Intergenic and intron variants captured smaller proportions of additive genetic variance per DNA variant. Instead 3'UTR and synonymous variants explained a greater amount of genetic variance per DNA variant for all the traits examined while missense, 5'UTR and other regulatory region variants accounted for relatively more additive genetic variance per sequence variant for RFI and ADG, respectively. In total, 25 to 27 enriched cellular and molecular functions were identified with lipid metabolism and carbohydrate metabolism being the most significant for the feed efficiency traits.

CONCLUSIONS

RFI is controlled by many DNA variants with relatively small effects whereas DMI, ADG, and MWT are influenced by a few DNA variants with large effects and many DNA variants with small effects. Nucleotide polymorphisms in regulatory region and synonymous functional classes play a more important role per sequence variant in determining variation of the feed efficiency traits. The genetic architecture as revealed by the GWAS of the imputed 7,853,211 DNA variants will improve our understanding of the genetic control of feed efficiency traits in beef cattle.

Sure independence screening in the presence of missing data. Stat Pap (Berl) 2019. [DOI: 10.1007/s00362-019-01115-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022]

Statistical methods for genome-wide association studies. Semin Cancer Biol 2019;55:53-60. [DOI: 10.1016/j.semcancer.2018.04.008] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 12/02/2017] [Revised: 04/27/2018] [Accepted: 04/28/2018] [Indexed: 12/12/2022]

Ghoreishifar SM, Moradi-Shahrbabak H, Moradi-Shahrbabak M, Nicolazzi EL, Williams JL, Iamartino D, Nejati-Javaremi A. Accuracy of imputation of single-nucleotide polymorphism marker genotypes for water buffaloes (Bubalus bubalis) using different reference population sizes and imputation tools. Livest Sci 2018. [DOI: 10.1016/j.livsci.2018.08.009] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 01/09/2023]

Wu Y, Hormozdiari F, Joo JWJ, Eskin E. Improving Imputation Accuracy by Inferring Causal Variants in Genetic Studies. J Comput Biol 2018;26:1203-1213. [PMID: 30272994 DOI: 10.1089/cmb.2018.0139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open

Song Q, Xu W, Li W, He S, Liu J, Wang G, Ma L. Accurate haplotype imputation with individualized ancestry-adjusted reference panels. Genomics 2018;110:329-335. [PMID: 29198611 DOI: 10.1016/j.ygeno.2017.11.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/18/2017] [Revised: 11/01/2017] [Accepted: 11/29/2017] [Indexed: 11/23/2022]

Das S, Abecasis GR, Browning BL. Genotype Imputation from Large Reference Panels. Annu Rev Genomics Hum Genet 2018;19:73-96. [PMID: 29799802 DOI: 10.1146/annurev-genom-083117-021602] [Citation(s) in RCA: 110] [Impact Index Per Article: 18.3] [Indexed: 11/09/2022]

Liao B, Wang X, Zhu W, Li X, Cai L, Chen H. New multilocus linkage disequilibrium measure for tag SNP selection. J Bioinform Comput Biol 2017;15:1750001. [DOI: 10.1142/s0219720017500019] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/18/2022]

Genotype Imputation Methods and Their Effects on Genomic Predictions in Cattle. ACTA ACUST UNITED AC 2017. [DOI: 10.1007/s40362-017-0041-x] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Indexed: 12/25/2022]

Wang J, Shete S. Testing Departure from Hardy-Weinberg Proportions. Methods Mol Biol 2017;1666:83-115. [PMID: 28980243 DOI: 10.1007/978-1-4939-7274-6_6] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 06/07/2023]

Xiang T, Christensen OF, Vitezica ZG, Legarra A. Genomic evaluation by including dominance effects and inbreeding depression for purebred and crossbred performance with an application in pigs. Genet Sel Evol 2016;48:92. [PMID: 27887565 PMCID: PMC5123321 DOI: 10.1186/s12711-016-0271-4] [Citation(s) in RCA: 60] [Impact Index Per Article: 7.5] [Received: 04/11/2016] [Accepted: 11/15/2016] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Improved performance of crossbred animals is partly due to heterosis. One of the major genetic bases of heterosis is dominance, but it is seldom used in pedigree-based genetic evaluation of livestock. Recently, a trivariate genomic best linear unbiased prediction (GBLUP) model including dominance was developed, which can distinguish purebreds from crossbred animals explicitly. The objectives of this study were: (1) methodological, to show that inclusion of marker-based inbreeding accounts for directional dominance and inbreeding depression in purebred and crossbred animals, to revisit variance components of additive and dominance genetic effects using this model, and to develop marker-based estimators of genetic correlations between purebred and crossbred animals and of correlations of allele substitution effects between breeds; (2) to evaluate the impact of accounting for dominance effects and inbreeding depression on predictive ability for total number of piglets born (TNB) in a pig dataset composed of two purebred populations and their crossbreds. We also developed an equivalent model that makes the estimation of variance components tractable.

RESULTS

For TNB in Danish Landrace and Yorkshire populations and their reciprocal crosses, the estimated proportions of dominance genetic variance to additive genetic variance ranged from 5 to 11%. Genetic correlations between breeding values for purebred and crossbred performances for TNB ranged from 0.79 to 0.95 for Landrace and from 0.43 to 0.54 for Yorkshire across models. The estimated correlation of allele substitution effects between Landrace and Yorkshire was low for purebred performances, but high for crossbred performances. Predictive ability for crossbred animals was similar with or without dominance. The inbreeding depression effect increased predictive ability and the estimated inbreeding depression parameter was more negative for Landrace than for Yorkshire animals and was in between for crossbred animals.

CONCLUSIONS

Methodological developments led to closed-form estimators of inbreeding depression, variance components and correlations that can be easily interpreted in a quantitative genetics context. Our results confirm that genetic correlations of breeding values between purebred and crossbred performances within breed are positive and moderate. Inclusion of dominance in the GBLUP model does not improve predictive ability for crossbred animals, whereas inclusion of inbreeding depression does.

SparRec: An effective matrix completion framework of missing data imputation for GWAS. Sci Rep 2016;6:35534. [PMID: 27762341 PMCID: PMC5071878 DOI: 10.1038/srep35534] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 03/31/2016] [Accepted: 09/30/2016] [Indexed: 11/08/2022] Open

Brandariz SP, González Reymúndez A, Lado B, Malosetti M, Garcia AAF, Quincke M, von Zitzewitz J, Castro M, Matus I, del Pozo A, Castro AJ, Gutiérrez L. Ascertainment bias from imputation methods evaluation in wheat. BMC Genomics 2016;17:773. [PMID: 27716058 PMCID: PMC5050639 DOI: 10.1186/s12864-016-3120-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 02/24/2016] [Accepted: 09/23/2016] [Indexed: 02/01/2023] Open

Abstract

BACKGROUND

Whole-genome genotyping techniques like Genotyping-by-sequencing (GBS) are being used for genetic studies such as Genome-Wide Association (GWAS) and Genomewide Selection (GS), where different strategies for imputation have been developed. Nevertheless, imputation error may lead to poor performance (i.e. lower power or a higher false positive rate) in analyses such as GWAS, where complete data are not required and markers are tested one at a time. The aim of this study was to compare the performance of GWAS analysis for Quantitative Trait Loci (QTL) of major and minor effect using different imputation methods when no reference panel is available in a wheat GBS panel.

RESULTS

In this study, we compared the power and false positive rate of dissecting quantitative traits for imputed and not-imputed marker score matrices in: (1) a complete molecular marker barley panel array, and (2) a GBS wheat panel with missing data. We found that there is an ascertainment bias in imputation method comparisons. Simulating over a complete matrix and creating missing data at random proved that imputation methods have a poorer performance. Furthermore, we found that when QTL were simulated with imputed data, the imputation methods performed better than the not-imputed ones. On the other hand, when QTL were simulated with not-imputed data, the not-imputed method and one of the imputation methods performed better for dissecting quantitative traits. Moreover, larger differences between imputation methods were detected for QTL of major effect than QTL of minor effect. We also compared the different marker score matrices for GWAS analysis in a real wheat phenotype dataset, and we found minimal differences indicating that imputation did not improve the GWAS performance when a reference panel was not available.

CONCLUSIONS

Poorer performance was found in GWAS analysis of a wheat GBS panel when an imputed marker score matrix was used and no reference panel was available.

Grünwald NJ, McDonald BA, Milgroom MG. Population Genomics of Fungal and Oomycete Pathogens. ANNUAL REVIEW OF PHYTOPATHOLOGY 2016;54:323-46. [PMID: 27296138 DOI: 10.1146/annurev-phyto-080614-115913] [Citation(s) in RCA: 53] [Impact Index Per Article: 6.6] [Indexed: 05/03/2023]

Hormozdiari F, Kang EY, Bilow M, Ben-David E, Vulpe C, McLachlan S, Lusis AJ, Han B, Eskin E. Imputing Phenotypes for Genome-wide Association Studies. Am J Hum Genet 2016;99:89-103. [PMID: 27292110 PMCID: PMC5005435 DOI: 10.1016/j.ajhg.2016.04.013] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 01/20/2016] [Accepted: 04/28/2016] [Indexed: 01/23/2023] Open

Palmer C, Pe’er I. Bias Characterization in Probabilistic Genotype Data and Improved Signal Detection with Multiple Imputation. PLoS Genet 2016;12:e1006091. [PMID: 27310603 PMCID: PMC4910998 DOI: 10.1371/journal.pgen.1006091] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 11/12/2015] [Accepted: 05/09/2016] [Indexed: 11/22/2022] Open

Abstract

Missing data are an unavoidable component of modern statistical genetics. Different array or sequencing technologies cover different single nucleotide polymorphisms (SNPs), leading to a complicated mosaic pattern of missingness where both individual genotypes and entire SNPs are sporadically absent. Such missing data patterns cannot be ignored without introducing bias, yet cannot be inferred exclusively from nonmissing data. In genome-wide association studies, the accepted solution to missingness is to impute missing data using external reference haplotypes. The resulting probabilistic genotypes may be analyzed in the place of genotype calls. A general-purpose paradigm, called Multiple Imputation (MI), is known to model uncertainty in many contexts, yet it is not widely used in association studies. Here, we undertake a systematic evaluation of existing imputed data analysis methods and MI. We characterize biases related to uncertainty in association studies, and find that bias is introduced both at the imputation level, when imputation algorithms generate inconsistent genotype probabilities, and at the association level, when analysis methods inadequately model genotype uncertainty. We find that MI performs at least as well as existing methods or in some cases much better, and provides a straightforward paradigm for adapting existing genotype association methods to uncertain data.

Genetic research has focused on the analysis of datapoints that are assumed to be deterministically known. However, the majority of current, high throughput data is only probabilistically known, and proper methods for handling such uncertain genotypes are limited. Here, we build on existing theory from the field of statistics to introduce a general framework for handling probabilistic genotype data obtained through genotype imputation. This framework, called Multiple Imputation, matches or improves upon existing methods for handling uncertainty in basic analysis of genetic association. As opposed to such methods, our work furthermore extends to more advanced analysis, such as mixed-effects models, with no additional complication. Importantly, it generates posterior probabilities of association that are intrinsically weighted by the certainty of the underlying data, a feature unmatched by other existing methods. Multiple Imputation is also fully compatible with meta-analysis. Finally, our analysis of probabilistic genotype data brings into focus the accuracy and unreliability of imputation’s estimated probabilities. Taken together, these results substantially increase the utility of imputed genotypes in statistical genetics, and may have strong implications for analysis of sequencing data moving forward.

New Genetic Approaches to AD: Lessons from APOE-TOMM40 Phylogenetics. Curr Neurol Neurosci Rep 2016;16:48. [DOI: 10.1007/s11910-016-0643-8] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Indexed: 11/26/2022]

Xiang T, Nielsen B, Su G, Legarra A, Christensen OF. Application of single-step genomic evaluation for crossbred performance in pig. J Anim Sci 2016;94:936-48. [DOI: 10.2527/jas.2015-9930] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Indexed: 01/28/2023] Open

Boettger LM, Salem RM, Handsaker RE, Peloso GM, Kathiresan S, Hirschhorn JN, McCarroll SA. Recurring exon deletions in the HP (haptoglobin) gene contribute to lower blood cholesterol levels. Nat Genet 2016;48:359-66. [PMID: 26901066 PMCID: PMC4811681 DOI: 10.1038/ng.3510] [Citation(s) in RCA: 76] [Impact Index Per Article: 9.5] [Received: 08/11/2015] [Accepted: 01/20/2016] [Indexed: 02/08/2023]

Wijsman EM. Family-based approaches: design, imputation, analysis, and beyond. BMC Genet 2016;17 Suppl 2:9. [PMID: 26866700 PMCID: PMC4895701 DOI: 10.1186/s12863-015-0318-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 11/18/2022] Open

Li W, Xu W, Fu G, Ma L, Richards J, Rao W, Bythwood T, Guo S, Song Q. High-accuracy haplotype imputation using unphased genotype data as the references. Gene 2015;572:279-84. [PMID: 26232609 PMCID: PMC5373555 DOI: 10.1016/j.gene.2015.07.082] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 02/17/2015] [Revised: 06/24/2015] [Accepted: 07/28/2015] [Indexed: 12/19/2022]

Li W, Xu W, Li Q, Ma L, Song Q. References for Haplotype Imputation in the Big Data Era. ACTA ACUST UNITED AC 2015;4. [PMID: 27274952 PMCID: PMC4888899 DOI: 10.4172/2168-9547.1000143] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/11/2022]

Wang Y, Wylie T, Stothard P, Lin G. Whole genome SNP genotype piecemeal imputation. BMC Bioinformatics 2015;16:340. [PMID: 26498158 PMCID: PMC4619096 DOI: 10.1186/s12859-015-0770-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 01/19/2015] [Accepted: 10/09/2015] [Indexed: 11/10/2022] Open

Abstract

Background

Despite ongoing reductions in the cost of sequencing technologies, whole genome SNP genotype imputation is often used as an alternative for obtaining abundant SNP genotypes for genome wide association studies. Several existing genotype imputation methods can be efficient for this purpose, while achieving various levels of imputation accuracy. Recent empirical results have shown that the two-step imputation may improve accuracy by imputing the low density genotyped study animals to a medium density array first and then to the target density. We are interested in building a series of staircase arrays that lead the low density array to the high density array or even the whole genome, such that genotype imputation along these staircases can achieve the highest accuracy.

Results

For genotype imputation from a lower density to a higher density, we first show how to select untyped SNPs to construct a medium density array. Subsequently, we determine for each selected SNP those untyped SNPs to be imputed in the add-one two-step imputation, and lastly how the clusters of imputed genotype are pieced together as the final imputation result. We design extensive empirical experiments using several hundred sequenced and genotyped animals to demonstrate that our novel two-step piecemeal imputation always achieves an improvement compared to the one-step imputation by the state-of-the-art methods Beagle and FImpute. Using the two-step piecemeal imputation, we present some preliminary success on whole genome SNP genotype imputation for genotyped animals via a series of staircase arrays.

Conclusions

From a low SNP density to the whole genome, intermediate pseudo-arrays can be computationally constructed by selecting the most informative SNPs for untyped SNP genotype imputation. Such pseudo-array staircases are able to impute more accurately than the classic one-step imputation.

Xiang T, Ma P, Ostersen T, Legarra A, Christensen OF. Imputation of genotypes in Danish purebred and two-way crossbred pigs using low-density panels. Genet Sel Evol 2015;47:54. [PMID: 26122927 PMCID: PMC4486706 DOI: 10.1186/s12711-015-0134-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 07/28/2014] [Accepted: 06/13/2015] [Indexed: 01/30/2023] Open

Abstract

Background

Genotype imputation is commonly used as an initial step in genomic selection since the accuracy of genomic selection does not decline if accurately imputed genotypes are used instead of actual genotypes but for a lower cost. Performance of imputation has rarely been investigated in crossbred animals and, in particular, in pigs. The extent and pattern of linkage disequilibrium differ in crossbred versus purebred animals, which may impact the performance of imputation. In this study, first we compared different scenarios of imputation from 5 K to 8 K single nucleotide polymorphisms (SNPs) in genotyped Danish Landrace and Yorkshire and crossbred Landrace-Yorkshire datasets and, second, we compared imputation from 8 K to 60 K SNPs in genotyped purebred and simulated crossbred datasets. All imputations were done using software Beagle version 3.3.2. Then, we investigated the reasons that could explain the differences observed.

Results

Genotype imputation performs as well in crossbred animals as in purebred animals when both parental breeds are included in the reference population. When the size of the reference population is very large, it is not necessary to use a reference population that combines the two breeds to impute the genotypes of purebred animals because a within-breed reference population can provide a very high level of imputation accuracy (correct rate ≥ 0.99, correlation ≥ 0.95). However, to ensure that similar imputation accuracies are obtained for crossbred animals, a reference population that combines both parental purebred animals is required. Imputation accuracies are higher when a larger proportion of haplotypes are shared between the reference population and the validation (imputed) populations.

Conclusions

The results from both real data and pedigree-based simulated data demonstrate that genotype imputation from low-density panels to medium-density panels is highly accurate in both purebred and crossbred pigs. In crossbred pigs, combining the parental purebred animals in the reference population is necessary to obtain high imputation accuracy.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-015-0134-4) contains supplementary material, which is available to authorized users.

Verma SS, de Andrade M, Tromp G, Kuivaniemi H, Pugh E, Namjou-Khales B, Mukherjee S, Jarvik GP, Kottyan LC, Burt A, Bradford Y, Armstrong GD, Derr K, Crawford DC, Haines JL, Li R, Crosslin D, Ritchie MD. Imputation and quality control steps for combining multiple genome-wide datasets. Front Genet 2014;5:370. [PMID: 25566314 PMCID: PMC4263197 DOI: 10.3389/fgene.2014.00370] [Citation(s) in RCA: 97] [Impact Index Per Article: 9.7] [Received: 04/02/2014] [Accepted: 10/03/2014] [Indexed: 12/16/2022] Open

Affiliation(s)

Shefali S Verma Department of Biochemistry and Molecular Biology, Center for Systems Genomics, The Pennsylvania State University Pennsylvania, PA, USA
Mariza de Andrade Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic Rochester, MN, USA
Gerard Tromp The Sigfried and Janet Weis Center for Research, Geisinger Health System Danville, PA, USA
Helena Kuivaniemi The Sigfried and Janet Weis Center for Research, Geisinger Health System Danville, PA, USA
Elizabeth Pugh Center for Inherited Disease Research, Johns Hopkins University Baltimore, MD, USA
Bahram Namjou-Khales Cincinnati Children's Hospital Medical Center Cincinnati, OH, USA
Shubhabrata Mukherjee Department of Medicine, University of Washington Seattle, WA, USA
Gail P Jarvik Department of Medicine, University of Washington Seattle, WA, USA
Leah C Kottyan Cincinnati Children's Hospital Medical Center Cincinnati, OH, USA
Amber Burt Department of Medicine, University of Washington Seattle, WA, USA
Yuki Bradford Department of Biochemistry and Molecular Biology, Center for Systems Genomics, The Pennsylvania State University Pennsylvania, PA, USA
Gretta D Armstrong Department of Biochemistry and Molecular Biology, Center for Systems Genomics, The Pennsylvania State University Pennsylvania, PA, USA
Kimberly Derr The Sigfried and Janet Weis Center for Research, Geisinger Health System Danville, PA, USA
Dana C Crawford Center for Human Genetics Research, Vanderbilt University Nashville, TN, USA ; Department of Epidemiology and Biostatistics, Case Western University Cleveland, OH, USA
Jonathan L Haines Department of Epidemiology and Biostatistics, Case Western University Cleveland, OH, USA
Rongling Li Division of Genomic Medicine, National Human Genome Research Institute Bethesda, MD, USA
David Crosslin Department of Medicine, University of Washington Seattle, WA, USA
Marylyn D Ritchie Department of Biochemistry and Molecular Biology, Center for Systems Genomics, The Pennsylvania State University Pennsylvania, PA, USA

Zeng P, Zhao Y, Qian C, Zhang L, Zhang R, Gou J, Liu J, Liu L, Chen F. Statistical analysis for genome-wide association study. J Biomed Res 2014;29:285-97. [PMID: 26243515 PMCID: PMC4547377 DOI: 10.7555/jbr.29.20140007] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.7] [Received: 01/15/2014] [Revised: 06/07/2014] [Accepted: 09/27/2014] [Indexed: 12/19/2022] Open

Kauwe JSK, Bailey MH, Ridge PG, Perry R, Wadsworth ME, Hoyt KL, Staley LA, Karch CM, Harari O, Cruchaga C, Ainscough BJ, Bales K, Pickering EH, Bertelsen S, Fagan AM, Holtzman DM, Morris JC, Goate AM. Genome-wide association study of CSF levels of 59 alzheimer's disease candidate proteins: significant associations with proteins involved in amyloid processing and inflammation. PLoS Genet 2014;10:e1004758. [PMID: 25340798 PMCID: PMC4207667 DOI: 10.1371/journal.pgen.1004758] [Citation(s) in RCA: 96] [Impact Index Per Article: 9.6] [Received: 04/28/2014] [Accepted: 09/16/2014] [Indexed: 01/25/2023] Open

Abstract

Cerebrospinal fluid (CSF) 42 amino acid species of amyloid beta (Aβ42) and tau levels are strongly correlated with the presence of Alzheimer's disease (AD) neuropathology including amyloid plaques and neurodegeneration and have been successfully used as endophenotypes for genetic studies of AD. Additional CSF analytes may also serve as useful endophenotypes that capture other aspects of AD pathophysiology. Here we have conducted a genome-wide association study of CSF levels of 59 AD-related analytes. All analytes were measured using the Rules Based Medicine Human DiscoveryMAP Panel, which includes analytes relevant to several disease-related processes. Data from two independently collected and measured datasets, the Knight Alzheimer's Disease Research Center (ADRC) and Alzheimer's Disease Neuroimaging Initiative (ADNI), were analyzed separately, and combined results were obtained using meta-analysis. We identified genetic associations with CSF levels of 5 proteins (Angiotensin-converting enzyme (ACE), Chemokine (C-C motif) ligand 2 (CCL2), Chemokine (C-C motif) ligand 4 (CCL4), Interleukin 6 receptor (IL6R) and Matrix metalloproteinase-3 (MMP3)) with study-wide significant p-values (p<1.46×10⁻¹⁰) and significant, consistent evidence for association in both the Knight ADRC and the ADNI samples. These proteins are involved in amyloid processing and pro-inflammatory signaling. SNPs associated with ACE, IL6R and MMP3 protein levels are located within the coding regions of the corresponding structural gene. The SNPs associated with CSF levels of CCL4 and CCL2 are located in known chemokine binding proteins. The genetic associations reported here are novel and suggest mechanisms for genetic control of CSF and plasma levels of these disease-related proteins. Significant SNPs in ACE and MMP3 also showed association with AD risk. Our findings suggest that these proteins/pathways may be valuable therapeutic targets for AD. Robust associations in cognitively normal individuals suggest that these SNPs also influence regulation of these proteins more generally and may therefore be relevant to other diseases.

The use of quantitative endophenotypes from cerebrospinal fluid has led to the identification of several genetic variants that alter risk or rate of progression of Alzheimer's disease. Here we have analyzed the levels of 58 disease-related proteins in the cerebrospinal fluid for association with millions of variants across the human genome. We have identified significant, replicable associations with 5 analytes, Angiotensin-converting enzyme, Chemokine (C-C motif) ligand 2, Chemokine (C-C motif) ligand 4, Interleukin 6 receptor and Matrix metalloproteinase-3. Our results suggest that these variants play a regulatory role in the respective protein levels and are relevant to the inflammatory and amyloid processing pathways. Variants associated with ACE and those associated with MMP3 levels also show association with risk for Alzheimer's disease in the expected directions. These associations are consistent in cerebrospinal fluid and plasma and in samples with only cognitively normal individuals, suggesting that they are relevant in the regulation of these protein levels beyond the context of Alzheimer's disease.

Affiliation(s)

John S. K. Kauwe Department of Biology, Brigham Young University, Provo, Utah, United States of America
Matthew H. Bailey Department of Biology, Brigham Young University, Provo, Utah, United States of America
Perry G. Ridge Department of Biology, Brigham Young University, Provo, Utah, United States of America
Rachel Perry Department of Biology, Brigham Young University, Provo, Utah, United States of America
Mark E. Wadsworth Department of Biology, Brigham Young University, Provo, Utah, United States of America
Kaitlyn L. Hoyt Department of Biology, Brigham Young University, Provo, Utah, United States of America
Lyndsay A. Staley Department of Biology, Brigham Young University, Provo, Utah, United States of America
Celeste M. Karch Department of Psychiatry, Washington University School of Medicine, St Louis, Missouri, United States of America Hope Center for Neurological Disorders, Washington University School of Medicine, St Louis, Missouri, United States of America
Oscar Harari Department of Psychiatry, Washington University School of Medicine, St Louis, Missouri, United States of America
Carlos Cruchaga Department of Psychiatry, Washington University School of Medicine, St Louis, Missouri, United States of America Hope Center for Neurological Disorders, Washington University School of Medicine, St Louis, Missouri, United States of America
Benjamin J. Ainscough The Genome Institute, Washington University School of Medicine, St Louis, Missouri, United States of America
Kelly Bales Neuroscience Research Unit, Worldwide Research and Development, Pfizer Inc., Groton, Connecticut, United States of America
Eve H. Pickering Neuroscience Research Unit, Worldwide Research and Development, Pfizer Inc., Groton, Connecticut, United States of America
Sarah Bertelsen Department of Psychiatry, Washington University School of Medicine, St Louis, Missouri, United States of America
the Alzheimer's Disease Neuroimaging Initiative
Anne M. Fagan Hope Center for Neurological Disorders, Washington University School of Medicine, St Louis, Missouri, United States of America Knight Alzheimer's Disease Research Center, Washington University School of Medicine, St Louis, Missouri, United States of America Department of Neurology, Washington University School of Medicine, St Louis, Missouri, United States of America
David M. Holtzman Hope Center for Neurological Disorders, Washington University School of Medicine, St Louis, Missouri, United States of America Knight Alzheimer's Disease Research Center, Washington University School of Medicine, St Louis, Missouri, United States of America Department of Neurology, Washington University School of Medicine, St Louis, Missouri, United States of America Department of Developmental Biology, Washington University School of Medicine, St Louis, Missouri, United States of America
John C. Morris Hope Center for Neurological Disorders, Washington University School of Medicine, St Louis, Missouri, United States of America Knight Alzheimer's Disease Research Center, Washington University School of Medicine, St Louis, Missouri, United States of America Department of Neurology, Washington University School of Medicine, St Louis, Missouri, United States of America Department of Pathology and Immunology, Washington University School of Medicine, St Louis, Missouri, United States of America
Alison M. Goate Department of Psychiatry, Washington University School of Medicine, St Louis, Missouri, United States of America Hope Center for Neurological Disorders, Washington University School of Medicine, St Louis, Missouri, United States of America Knight Alzheimer's Disease Research Center, Washington University School of Medicine, St Louis, Missouri, United States of America Department of Neurology, Washington University School of Medicine, St Louis, Missouri, United States of America Department of Genetics, Washington University School of Medicine, St Louis, Missouri, United States of America

Impact of pre-imputation SNP-filtering on genotype imputation results. BMC Genet 2014;15:88. [PMID: 25112433 PMCID: PMC4236550 DOI: 10.1186/s12863-014-0088-5] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 02/25/2014] [Accepted: 07/18/2014] [Indexed: 11/10/2022] Open

Abstract

Background

Imputation of partially missing or unobserved genotypes is an indispensable tool for SNP data analyses. However, research and understanding of the impact of initial SNP-data quality control on imputation results is still limited. In this paper, we aim to evaluate the effect of different strategies of pre-imputation quality filtering on the performance of the widely used imputation algorithms MaCH and IMPUTE.

Results

We considered three scenarios: imputation of partially missing genotypes with usage of an external reference panel, without usage of an external reference panel, as well as imputation of completely un-typed SNPs using an external reference panel. We first created various datasets applying different SNP quality filters and masking certain percentages of randomly selected high-quality SNPs. We imputed these SNPs and compared the results between the different filtering scenarios by using established and newly proposed measures of imputation quality. While the established measures assess certainty of imputation results, our newly proposed measures focus on the agreement with true genotypes. These measures showed that pre-imputation SNP-filtering might be detrimental regarding imputation quality. Moreover, the strongest drivers of imputation quality were in general the burden of missingness and the number of SNPs used for imputation. We also found that using a reference panel always improves imputation quality of partially missing genotypes. MaCH performed slightly better than IMPUTE2 in most of our scenarios. Again, these results were more pronounced when using our newly defined measures of imputation quality.

Conclusion

Even moderate filtering has a detrimental effect on imputation quality. Therefore, little or no SNP filtering prior to imputation appears to be the best strategy for imputing small to moderately sized datasets. Our results also showed that for these datasets, MaCH performs slightly better than IMPUTE2 in most scenarios, at the cost of increased computing time.

Efficiency of haplotype-based methods to fine-map QTLs and embryonic lethal variants affecting fertility: Illustration with a deletion segregating in Nordic Red cattle. Livest Sci 2014. [DOI: 10.1016/j.livsci.2014.04.030] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/15/2022]