1
Gomez-Raya L, Gómez Izquierdo E, de Mercado de la Peña E, Garcia-Ruiz F, Rauw WM. First-degree relationships and genotyping errors deciphered by a high-density SNP array in a Duroc × Iberian pig cross. BMC Genom Data 2022; 23:14. [PMID: 35177001 PMCID: PMC8851823 DOI: 10.1186/s12863-022-01025-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2021] [Accepted: 01/06/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Two individuals with a first-degree relationship share about 50 percent of their alleles. A parent and its offspring cannot be homozygous for alternative alleles at the same locus (genetic exclusion). METHODS Applying the concept of genetic exclusion to HD arrays typed in animals for experimental purposes or genomic selection allows estimation of the rate of rejection of first-degree relationships, defined as the rate at which two individuals typed for a large number of Single Nucleotide Polymorphisms (SNPs) do not share at least one allele. An Expectation-Maximization algorithm is applied to estimate parentage. In addition, genotyping errors are estimated in true parent-offspring relationships. Samples from nine candidate Duroc sires and 55 Iberian dams producing 214 Duroc × Iberian barrows were typed with the HD porcine Affymetrix array. RESULTS We were able to establish paternity and maternity of 75 and 85 piglets, respectively. The rate of rejection in true parent-offspring relationships was estimated as 0.000735. This is a lower bound of the genotyping error, since the rate of rejection depends on allele frequencies. After accounting for allele frequencies, our estimate of the genotyping error is 0.6%. A total of 7,744 SNPs were rejected in five or more true parent-offspring relationships, which facilitates identification of "problematic" SNPs with inconsistent inheritance in multiple parent-offspring relationships. CONCLUSIONS This study shows that animal experiments and routine genotyping in genomic selection allow first-degree relationships to be established or verified, and genotyping errors to be estimated, for each batch of animals or experiment.
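The rate of rejection described in this abstract can be illustrated with a minimal sketch. It assumes genotypes coded as counts of one allele (0, 1, 2) with `None` for a missing call; the function name `rejection_rate` and the toy genotypes are hypothetical and are not taken from the paper, and the sketch covers only the opposing-homozygote count, not the Expectation-Maximization step or the allele-frequency correction.

```python
def rejection_rate(geno_a, geno_b):
    """Fraction of jointly typed SNPs at which two individuals share no allele.

    Genotypes are allele counts (0, 1, 2); None marks a missing call.
    A SNP is "rejected" when one individual is 0 and the other is 2,
    i.e. they are homozygous for alternative alleles.
    """
    typed = 0
    rejected = 0
    for a, b in zip(geno_a, geno_b):
        if a is None or b is None:
            continue  # skip SNPs not called in both individuals
        typed += 1
        if {a, b} == {0, 2}:
            rejected += 1
    if typed == 0:
        raise ValueError("no SNP typed in both individuals")
    return rejected / typed


# Hypothetical toy data: eight SNPs for a candidate sire and a piglet.
sire = [0, 1, 2, 2, 0, 1, None, 2]
piglet = [1, 1, 2, 0, 0, 2, 1, 1]
rate = rejection_rate(sire, piglet)
print(f"rejection rate: {rate:.4f}")
```

For a true parent-offspring pair this rate should be close to zero, with any non-zero value attributable to genotyping error; for unrelated pairs it is expected to be much larger.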
Affiliation(s)
- L Gomez-Raya
- Departamento de Mejora Genética Animal, Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Ctra. de La Coruña km 7.5, 28040, Madrid, Spain
- E Gómez Izquierdo
- Centro de Pruebas de Porcino, Instituto Tecnológico Agrario Junta de Castilla y León (ITACyL), Ctra Riaza-Toro S/N, 40353, Hontalbilla, Spain
- E de Mercado de la Peña
- Departamento de Reproducción Animal, Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Avda. Puerta de Hierro s/n, 28040, Madrid, Spain
- F Garcia-Ruiz
- Departamento de Mejora Genética Animal, Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Ctra. de La Coruña km 7.5, 28040, Madrid, Spain
- W M Rauw
- Departamento de Mejora Genética Animal, Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Ctra. de La Coruña km 7.5, 28040, Madrid, Spain
2
Rafter P, Gormley IC, Parnell AC, Naderi S, Berry DP. The Contribution of Copy Number Variants and Single Nucleotide Polymorphisms to the Additive Genetic Variance of Carcass Traits in Cattle. Front Genet 2021; 12:761503. [PMID: 34795696 PMCID: PMC8593468 DOI: 10.3389/fgene.2021.761503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2021] [Accepted: 10/04/2021] [Indexed: 11/13/2022] Open
Abstract
The relative contributions of copy number variants (CNVs) and single nucleotide polymorphisms (SNPs) to the additive genetic variance of carcass traits in cattle are not well understood. A detailed understanding of the relative importance of CNVs in cattle may have implications for the study design of both genomic predictions and genome-wide association studies. The first objective of the present study was to quantify the relative contributions of CNV data and SNP genotype data to the additive genetic variance of carcass weight, fat, and conformation for 945 Charolais, 923 Holstein-Friesian, and 974 Limousin sires. The second objective was to jointly consider SNP and CNV data in a least absolute shrinkage and selection operator (LASSO) regression model to identify genomic regions associated with carcass weight, fat, and conformation within each of the three breeds separately. A genomic relationship matrix (GRM) based on just CNV data did not capture any variance in the three carcass traits when jointly evaluated with a SNP-derived GRM. In the LASSO regression analysis, a total of 987 SNPs and 18 CNVs were associated with at least one of the three carcass traits in at least one of the three breeds. The quantitative trait loci (QTLs) corresponding to the associated SNPs and CNVs overlapped with several candidate genes, including previously reported candidate genes such as MSTN and RSAD2 and several potential novel candidate genes such as ACTN2 and THOC1. The results of the LASSO regression analysis demonstrated that CNVs can be used to detect associations with carcass traits that were not detected using the set of SNPs available in the present study. Therefore, the CNVs and SNPs available in the present study were not redundant forms of genomic data.
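As an illustration of what a SNP-derived GRM is, the sketch below builds one with the commonly used VanRaden-style centring and scaling. This is an assumption for illustration only: the abstract does not state how the study's GRMs were constructed, and the function name `snp_grm` and the toy genotypes are hypothetical.

```python
def snp_grm(genotypes):
    """Genomic relationship matrix from allele counts (VanRaden-style sketch).

    genotypes: list of individuals, each a list of allele counts (0, 1, 2)
    with no missing values. Allele frequencies are estimated from the
    sample itself, which is an assumption of this sketch.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Sample allele frequency of the counted allele at each SNP.
    freqs = [sum(ind[j] for ind in genotypes) / (2 * n) for j in range(m)]
    # Scaling factor: 2 * sum of p(1 - p) over SNPs.
    scale = 2 * sum(p * (1 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all markers are monomorphic")
    # Centre each genotype by twice the allele frequency.
    z = [[ind[j] - 2 * freqs[j] for j in range(m)] for ind in genotypes]
    return [
        [sum(z[a][j] * z[b][j] for j in range(m)) / scale for b in range(n)]
        for a in range(n)
    ]


# Hypothetical toy data: three individuals typed at three SNPs.
toy = [[0, 1, 2], [1, 1, 0], [2, 0, 1]]
grm = snp_grm(toy)
for row in grm:
    print([round(x, 3) for x in row])
```

Because the genotypes are centred with sample allele frequencies, each row of the resulting matrix sums to zero, which is a convenient sanity check on small examples.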
Affiliation(s)
- Pierce Rafter
- Animal & Grassland Research and Innovation Centre, Fermoy, Ireland; School of Mathematics and Statistics, University College Dublin, Dublin, Ireland
- Saeid Naderi
- Irish Cattle Breeding Federation, Bandon, Ireland
- Donagh P Berry
- Animal & Grassland Research and Innovation Centre, Fermoy, Ireland
3
Hou L, Sun N, Mane S, Sayward F, Rajeevan N, Cheung KH, Cho K, Pyarajan S, Aslan M, Miller P, Harvey PD, Gaziano JM, Concato J, Zhao H. Impact of genotyping errors on statistical power of association tests in genomic analyses: A case study. Genet Epidemiol 2016; 41:152-162. [PMID: 28019059 DOI: 10.1002/gepi.22027] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 03/25/2016] [Revised: 08/15/2016] [Accepted: 10/10/2016] [Indexed: 12/13/2022]
Abstract
A key step in genomic studies is to assess high throughput measurements across millions of markers for each participant's DNA, either using microarrays or sequencing techniques. Accurate genotype calling is essential for downstream statistical analysis of genotype-phenotype associations, and next generation sequencing (NGS) has recently become a more common approach in genomic studies. How the accuracy of variant calling in NGS-based studies affects downstream association analysis has not, however, been studied using empirical data in which both microarrays and NGS were available. In this article, we investigate the impact of variant calling errors on the statistical power to identify associations between single nucleotides and disease, and on associations between multiple rare variants and disease. Both differential and nondifferential genotyping errors are considered. Our results show that the power of burden tests for rare variants is strongly influenced by the specificity in variant calling, but is rather robust with regard to sensitivity. By using the variant calling accuracies estimated from a substudy of a Cooperative Studies Program project conducted by the Department of Veterans Affairs, we show that the power of association tests is mostly retained with commonly adopted variant calling pipelines. An R package, GWAS.PC, is provided to accommodate power analysis that takes account of genotyping errors (http://zhaocenter.org/software/).
Affiliation(s)
- Lin Hou
- Clinical Epidemiology Research Center (CERC), Veterans Affairs (VA) Cooperative Studies Program, VA Connecticut Healthcare System, West Haven, Connecticut, United States of America; Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, United States of America
- Ning Sun
- Clinical Epidemiology Research Center (CERC), Veterans Affairs (VA) Cooperative Studies Program, VA Connecticut Healthcare System, West Haven, Connecticut, United States of America; Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, United States of America
- Shrikant Mane
- Department of Genetics, Yale University School of Medicine, New Haven, Connecticut, United States of America
- Fred Sayward
- Clinical Epidemiology Research Center (CERC), Veterans Affairs (VA) Cooperative Studies Program, VA Connecticut Healthcare System, West Haven, Connecticut, United States of America; Center for Medical Informatics, Yale University School of Medicine, New Haven, Connecticut, United States of America
- Nallakkandi Rajeevan
- Clinical Epidemiology Research Center (CERC), Veterans Affairs (VA) Cooperative Studies Program, VA Connecticut Healthcare System, West Haven, Connecticut, United States of America; Center for Medical Informatics, Yale University School of Medicine, New Haven, Connecticut, United States of America
- Kei-Hoi Cheung
- Clinical Epidemiology Research Center (CERC), Veterans Affairs (VA) Cooperative Studies Program, VA Connecticut Healthcare System, West Haven, Connecticut, United States of America; Center for Medical Informatics, Yale University School of Medicine, New Haven, Connecticut, United States of America
- Kelly Cho
- Massachusetts Area Veterans Epidemiology Research and Information Center (MAVERIC), VA Cooperative Studies Program, VA Boston Healthcare System, Boston, Massachusetts, United States of America; Department of Medicine, Harvard University School of Medicine, Boston, Massachusetts, United States of America
- Saiju Pyarajan
- Massachusetts Area Veterans Epidemiology Research and Information Center (MAVERIC), VA Cooperative Studies Program, VA Boston Healthcare System, Boston, Massachusetts, United States of America; Department of Medicine, Harvard University School of Medicine, Boston, Massachusetts, United States of America
- Mihaela Aslan
- Clinical Epidemiology Research Center (CERC), Veterans Affairs (VA) Cooperative Studies Program, VA Connecticut Healthcare System, West Haven, Connecticut, United States of America; Department of Medicine, Yale University School of Medicine, New Haven, Connecticut, United States of America
- Perry Miller
- Clinical Epidemiology Research Center (CERC), Veterans Affairs (VA) Cooperative Studies Program, VA Connecticut Healthcare System, West Haven, Connecticut, United States of America; Center for Medical Informatics, Yale University School of Medicine, New Haven, Connecticut, United States of America
- Philip D Harvey
- Bruce W. Carter Miami Veterans Affairs (VA) Medical Center, Miami, Florida, United States of America; Department of Psychiatry, University of Miami Miller School of Medicine, Miami, Florida, United States of America
- J Michael Gaziano
- Massachusetts Area Veterans Epidemiology Research and Information Center (MAVERIC), VA Cooperative Studies Program, VA Boston Healthcare System, Boston, Massachusetts, United States of America; Department of Medicine, Harvard University School of Medicine, Boston, Massachusetts, United States of America
- John Concato
- Clinical Epidemiology Research Center (CERC), Veterans Affairs (VA) Cooperative Studies Program, VA Connecticut Healthcare System, West Haven, Connecticut, United States of America; Department of Medicine, Yale University School of Medicine, New Haven, Connecticut, United States of America
- Hongyu Zhao
- Clinical Epidemiology Research Center (CERC), Veterans Affairs (VA) Cooperative Studies Program, VA Connecticut Healthcare System, West Haven, Connecticut, United States of America; Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, United States of America
4
Eynard SE, Windig JJ, Hiemstra SJ, Calus MPL. Whole-genome sequence data uncover loss of genetic diversity due to selection. Genet Sel Evol 2016; 48:33. [PMID: 27080121 PMCID: PMC4831198 DOI: 10.1186/s12711-016-0210-4] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 10/13/2015] [Accepted: 03/23/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Whole-genome sequence (WGS) data give access to more complete structural genetic information of individuals, including rare variants, not fully covered by single nucleotide polymorphism chips. We used WGS to investigate the amount of genetic diversity remaining after selection using optimal contribution (OC), considering different methods to estimate the relationships used in OC. OC was applied to minimise average relatedness of the selection candidates and thus minimise the loss of genetic diversity in a conservation strategy, e.g. for establishment of gene bank collections. Furthermore, OC was used to maximise average genetic merit of the selection candidates at a given level of relatedness, similar to a genetic improvement strategy. In this study, we used data from 277 bulls from the 1000 bull genomes project. We measured genetic diversity as the number of variants still segregating after selection using WGS data, and compared strategies that targeted conservation of rare (minor allele frequency <5 %) versus common variants. RESULTS When OC without restriction on the number of selected individuals was applied, loss of variants was minimal and most individuals were selected, which is often unfeasible in practice. When 20 individuals were selected, the number of segregating rare variants was reduced by 29 % for the conservation strategy, and by 34 % for the genetic improvement strategy. The overall number of segregating variants was reduced by 30 % when OC was restricted to selecting five individuals, for both conservation and genetic improvement strategies. For common variants, this loss was about 15 %, while it was much higher, 72 %, for rare variants. Fewer rare variants were conserved with the genetic improvement strategy compared to the conservation strategy. CONCLUSIONS The use of WGS for genetic diversity quantification revealed that selection results in considerable losses of genetic diversity for rare variants. Using WGS instead of SNP chip data to estimate relationships slightly reduced the loss of rare variants, while using 50 K SNP chip data was sufficient to conserve common variants. The loss of rare variants could be mitigated by a few percent (up to 8 %) depending on which method is chosen to estimate relationships from WGS data.
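The diversity measure used in this abstract, the number of variants still segregating after selection, can be sketched as follows. The sketch assumes genotypes coded as allele counts (0, 1, 2) with no missing calls; the function name `segregating_variants` and the toy data are hypothetical, and the optimal contribution step that chooses the selected individuals is not reproduced here.

```python
def segregating_variants(genotypes, selected):
    """Indices of variants with both alleles present among selected individuals.

    genotypes: list of individuals, each a list of allele counts (0, 1, 2).
    selected: indices of the individuals retained after selection.
    """
    n_variants = len(genotypes[0])
    max_count = 2 * len(selected)
    kept = []
    for j in range(n_variants):
        total = sum(genotypes[i][j] for i in selected)
        # A variant segregates if it is neither lost (0) nor fixed (max_count).
        if 0 < total < max_count:
            kept.append(j)
    return kept


# Hypothetical toy data: four individuals typed at five variants.
toy = [
    [0, 1, 2, 0, 0],
    [0, 0, 2, 1, 0],
    [1, 0, 2, 0, 0],
    [0, 2, 1, 0, 0],
]
before = segregating_variants(toy, [0, 1, 2, 3])
after = segregating_variants(toy, [0, 1])
loss = 1 - len(after) / len(before)
print(before, after, f"loss = {loss:.0%}")
```

In this toy example the variants that stop segregating are those carried by only one unselected individual, which mirrors why rare variants are lost faster than common ones when few individuals are selected.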
Affiliation(s)
- Sonia E Eynard
- Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, P.O. Box 338, 6700 AH, Wageningen, The Netherlands; GABI, INRA, AgroParisTech, Université Paris-Saclay, 78350, Jouy-en-Josas, France; Centre for Genetic Resources, the Netherlands, Wageningen UR, P.O. Box 338, 3700 AH, Wageningen, The Netherlands
- Jack J Windig
- Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, P.O. Box 338, 6700 AH, Wageningen, The Netherlands; Centre for Genetic Resources, the Netherlands, Wageningen UR, P.O. Box 338, 3700 AH, Wageningen, The Netherlands
- Sipke J Hiemstra
- Centre for Genetic Resources, the Netherlands, Wageningen UR, P.O. Box 338, 3700 AH, Wageningen, The Netherlands
- Mario P L Calus
- Animal Breeding and Genomics Centre, Wageningen UR Livestock Research, P.O. Box 338, 6700 AH, Wageningen, The Netherlands