1
Genomic evidence for the suitability of Göttingen Minipigs with a rare seizure phenotype as a model for human epilepsy. Neurogenetics 2024; 25:103-117. [PMID: 38383918 PMCID: PMC11076379 DOI: 10.1007/s10048-024-00750-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Accepted: 02/02/2024] [Indexed: 02/23/2024]
Abstract
Epilepsy is a complex genetic disorder that affects about 2% of the global population. Although the frequency and severity of epileptic seizures can be reduced by a range of pharmacological interventions, there are no disease-modifying treatments for epilepsy. The development of new and more effective drugs is hindered by a lack of suitable animal models. Available rodent models may not recapitulate all key aspects of the disease. Spontaneous epileptic convulsions were observed in a few Göttingen Minipigs (GMPs), which may provide a valuable alternative animal model for the characterisation of epilepsy-type diseases and for testing new treatments. We have characterised affected GMPs at the genome level and have taken advantage of primary fibroblast cultures to validate the functional impact of fixed genetic variants at the transcriptome level. We found numerous genes connected to calcium metabolism that have not been associated with epilepsy before, such as ADORA2B, CAMK1D, ITPKB, MCOLN2, MYLK, NFATC3, PDGFD, and PHKB. Our results have identified two transcription factor genes, EGR3 and HOXB6, as potential key regulators of CACNA1H, which was previously linked to epilepsy-type disorders in humans. Our findings provide the first set of conclusive results to support the use of affected subsets of GMPs as an alternative and more reliable model system to study human epilepsy. Further neurological and pharmacological validation of the suitability of GMPs as an epilepsy model is therefore warranted.
2
Optimization of breeding program design through stochastic simulation with kernel regression. G3 (BETHESDA, MD.) 2023; 13:jkad217. [PMID: 37742059 PMCID: PMC10700053 DOI: 10.1093/g3journal/jkad217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2023] [Revised: 07/29/2023] [Accepted: 09/02/2023] [Indexed: 09/25/2023]
Abstract
In recent years, breeding programs have increased significantly in size and complexity, with various highly interdependent parameters and many contrasting breeding goals. As a result, resource allocation in these programs has become more complex, and deriving an optimal breeding strategy has become increasingly challenging. To address this, a common practice is to reduce the optimization problem to a set of scenarios that differ only in a few parameters and can therefore be analyzed in detail. The goal of this article is to provide a framework for the numerical optimization of breeding programs that goes beyond the simple comparison of scenarios. For this, we first determine the space of potential breeding programs only limited by basic constraints like the budget and housing capacities. Subsequently, the goal is to identify the optimal breeding program by finding the parametrization that maximizes the target function by combining different breeding goals. To assess the value of the target function for a parametrization, we propose using stochastic simulations and the subsequent use of a kernel regression method to cope with the stochasticity of simulation outcomes. This procedure is performed iteratively to narrow down the most promising areas of the search space and perform more and more simulations in these areas of interest. In a simplified example applied to a dairy cattle program, our proposed framework has shown its ability to identify an optimal breeding strategy that aligns with a target function aiming at genetic gain and genetic diversity conservation limited by budget constraints.
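The idea of smoothing noisy simulation outcomes with kernel regression can be sketched as follows. This is a minimal illustration only, assuming a single tuning parameter, a Gaussian kernel, and a Nadaraya-Watson estimator; the function names, the toy target function, and the bandwidth are hypothetical and are not taken from the article.

```python
import numpy as np

def kernel_smooth(x_sim, y_sim, x_query, bandwidth):
    """Nadaraya-Watson estimate of the target function at x_query."""
    x_sim = np.asarray(x_sim, dtype=float)
    y_sim = np.asarray(y_sim, dtype=float)
    x_query = np.atleast_1d(np.asarray(x_query, dtype=float))
    # Gaussian weights: simulations close to a query point count more.
    diff = (x_query[:, None] - x_sim[None, :]) / bandwidth
    weights = np.exp(-0.5 * diff**2)
    return (weights @ y_sim) / weights.sum(axis=1)

def toy_simulation(x, rng):
    """Hypothetical stand-in for a stochastic breeding simulation.

    The noise-free target peaks at x = 0.6; each call adds random noise.
    """
    return -(x - 0.6) ** 2 + rng.normal(scale=0.05, size=np.shape(x))

rng = np.random.default_rng(1)
x_sim = rng.uniform(0.0, 1.0, size=400)   # sampled parametrizations
y_sim = toy_simulation(x_sim, rng)        # noisy target values

grid = np.linspace(0.0, 1.0, 101)
smoothed = kernel_smooth(x_sim, y_sim, grid, bandwidth=0.1)
best_x = grid[np.argmax(smoothed)]        # most promising parametrization
```

In an iterative scheme of the kind the abstract describes, further simulations would then be concentrated around `best_x` and the smoothing repeated.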
3
How economic weights translate into genetic and phenotypic progress, and vice versa. Genet Sel Evol 2023; 55:38. [PMID: 37291496 DOI: 10.1186/s12711-023-00807-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2022] [Accepted: 04/27/2023] [Indexed: 06/10/2023]
Abstract
BACKGROUND This paper highlights the relationships between economic weights, genetic progress, and phenotypic progress in genomic breeding programs that aim at generating genetic progress in complex, i.e., multi-trait, breeding objectives via a combination of estimated breeding values for different trait complexes. RESULTS Based on classical selection index theory in combination with quantitative genetic models, we provide a methodological framework for calculating expected genetic and phenotypic progress for all components of a complex breeding objective. We further provide an approach to study the sensitivity of the system to modifications, e.g., to changes in the economic weights. We propose a novel approach to derive the covariance structure of the stochastic errors of estimated breeding values from the observed correlations of estimated breeding values. We define 'realized economic weights' as those weights that would coincide with the observed composition of the genetic trend and show how they can be calculated. The suggested methodology is illustrated with an index that aims at achieving a breeding goal composed of six trait complexes, which was applied in German Holstein cattle breeding until 2021. CONCLUSIONS Based on the presented results, the main conclusions are (i) the composition of the observed genetic progress matches the expectations well, with predictions being slightly better when the covariance of estimation errors is taken into account; (ii) the composition of the expected phenotypic trend deviates significantly from the expected genetic trend due to the differences in trait heritabilities; and (iii) the realized economic weights derived from the observed genetic trend deviate substantially from the predefined ones, in one case even with a reversed sign. Further results highlight the implications of the change to a modified breeding goal based on the example of a new index comprising eight, partly new, trait complexes, which has been used since 2021 in the German Holstein breeding program. The proposed framework and the analytical tools and software provided will be useful to define more rational and generally accepted breeding objectives in the future.
4
Ghat: an R package for identifying adaptive polygenic traits. G3 (BETHESDA, MD.) 2023; 13:jkac319. [PMID: 36454082 PMCID: PMC9911052 DOI: 10.1093/g3journal/jkac319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2022] [Revised: 01/21/2022] [Accepted: 11/14/2022] [Indexed: 12/03/2022]
Abstract
Identifying selection on polygenic complex traits in crops and livestock is important for understanding evolution and helps prioritize important characteristics for breeding. Quantitative trait loci (QTL) that contribute to polygenic trait variation often exhibit small or infinitesimal effects. This hinders the ability to detect QTL controlling polygenic traits because enormously high statistical power is needed for their detection. Recently, we circumvented this challenge by introducing a method to identify selection on complex traits by evaluating the relationship between genome-wide changes in allele frequency and estimates of effect size. The approach involves calculating a composite statistic across all markers that captures this relationship, followed by implementing a linkage disequilibrium-aware permutation test to evaluate if the observed pattern differs from that expected due to drift during evolution and population stratification. In this manuscript, we describe "Ghat," an R package developed to implement this method to test for selection on polygenic traits. We demonstrate the package by applying it to test for polygenic selection on 15 published European wheat traits including yield, biomass, quality, morphological characteristics, and disease resistance traits. Moreover, we applied Ghat to different simulated populations with different breeding histories and genetic architectures. The results highlight the power of Ghat to identify selection on complex traits. The Ghat package is accessible on CRAN, the Comprehensive R Archive Network, and on GitHub.
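The general idea of a composite statistic relating allele frequency change to effect size can be illustrated with a short sketch. This is not the Ghat package or its API: it is written in Python rather than R, it assumes the statistic is the sum over markers of frequency change times estimated effect, and it uses a plain permutation of effects that ignores linkage disequilibrium, whereas the abstract describes an LD-aware test. All names and the toy data are hypothetical.

```python
import numpy as np

def composite_statistic(freq_change, effects):
    """Sum over markers of allele-frequency change times effect size."""
    return float(np.dot(freq_change, effects))

def permutation_pvalue(freq_change, effects, n_perm=2000, seed=0):
    """Two-sided p-value from shuffling effects across markers.

    Simplification: markers are treated as independent (no LD correction).
    """
    rng = np.random.default_rng(seed)
    observed = composite_statistic(freq_change, effects)
    null = np.array([
        composite_statistic(freq_change, rng.permutation(effects))
        for _ in range(n_perm)
    ])
    p_value = (np.sum(np.abs(null) >= abs(observed)) + 1) / (n_perm + 1)
    return observed, p_value

# Toy data: frequency changes that partly track the marker effects,
# as expected when selection has favoured the trait-increasing alleles.
rng = np.random.default_rng(7)
effects = rng.normal(size=500)
freq_change = 0.02 * effects + rng.normal(scale=0.02, size=500)

observed, p_value = permutation_pvalue(freq_change, effects)
```

A positive statistic with a small p-value indicates that alleles with positive effects tended to increase in frequency more than expected by chance under this simplified null.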
5
Fourth Report on Chicken Genes and Chromosomes 2022. Cytogenet Genome Res 2023; 162:405-528. [PMID: 36716736 DOI: 10.1159/000529376] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/16/2023] [Accepted: 01/22/2023] [Indexed: 02/01/2023]
6
Genomic prediction using information across years with epistatic models and dimension reduction via haplotype blocks. PLoS One 2023; 18:e0282288. [PMID: 37000811 PMCID: PMC10065328 DOI: 10.1371/journal.pone.0282288] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2021] [Accepted: 02/12/2023] [Indexed: 04/01/2023]
Abstract
The importance of accurate genomic prediction of phenotypes in plant breeding is undeniable, as higher prediction accuracy can increase selection responses. In this regard, epistasis models have been shown to be capable of increasing the prediction accuracy, although their high computational load is challenging. In this study, we investigated the predictive ability obtained in additive and epistasis models when utilizing haplotype blocks versus pruned sets of SNPs by including phenotypic information from the last growing season. This was done by considering a single biological trait in two growing seasons (2017 and 2018) as separate traits in a multi-trait model. Thus, bivariate variants of the Genomic Best Linear Unbiased Prediction (GBLUP) as an additive model, Epistatic Random Regression BLUP (ERRBLUP) and selective Epistatic Random Regression BLUP (sERRBLUP) as epistasis models were compared with respect to their prediction accuracies for the second year. The prediction accuracies of bivariate GBLUP, ERRBLUP and sERRBLUP were assessed with eight phenotypic traits for 471/402 doubled haploid lines in the European maize landrace Kemater Landmais Gelb/Petkuser Ferdinand Rot. The results indicate that the obtained prediction accuracies are similar when utilizing a pruned set of SNPs or haplotype blocks, while utilizing haplotype blocks reduces the computational load significantly compared to the pruned sets of SNPs. The number of interactions considered in the model was reduced from 323.5/456.4 million for the pruned SNP panel to 4.4/5.5 million in the haplotype block dataset for Kemater and Petkuser landraces, respectively. Since the computational load scales linearly with the number of parameters in the model, this leads to a reduction in computational time of 98.9% from 13.5 hours for the pruned set of markers to 9 minutes for the haplotype block dataset.
We further investigated the impact of genomic correlation, phenotypic correlation and trait heritability as factors affecting the bivariate models' prediction accuracy, identifying the genomic correlation between years as the most influential one. As the computational load is substantially reduced while the accuracy of genomic prediction is unchanged, the framework proposed here, which uses haplotype blocks in sERRBLUP, provides a solution for the practical implementation of sERRBLUP in real breeding programs. Furthermore, our results indicate that sERRBLUP is suitable not only for prediction across different locations, but also for prediction across growing seasons.
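The reduction figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated above; the helper name `percent_reduction` is introduced here for illustration.

```python
def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before

# Computing time: 13.5 hours (converted to minutes) versus 9 minutes.
time_reduction = percent_reduction(13.5 * 60, 9)

# Number of pairwise interactions, in millions, per landrace.
kemater_reduction = percent_reduction(323.5, 4.4)
petkuser_reduction = percent_reduction(456.4, 5.5)

print(round(time_reduction, 1))      # 98.9
print(round(kemater_reduction, 1))   # 98.6
print(round(petkuser_reduction, 1))  # 98.8
```

The drop in the number of interactions (about 98.6% and 98.8%) is close to the reported 98.9% drop in computing time, which is consistent with the statement that the load scales linearly with the number of parameters.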
7
learnMET: an R package to apply machine learning methods for genomic prediction using multi-environment trial data. G3 GENES|GENOMES|GENETICS 2022; 12:6705235. [PMID: 36124944 PMCID: PMC9635651 DOI: 10.1093/g3journal/jkac226] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/26/2022] [Accepted: 07/29/2022] [Indexed: 12/04/2022]
Abstract
We introduce the R package learnMET, developed as a flexible framework to enable a collection of analyses on multi-environment trial breeding data with machine learning-based models. learnMET allows the combination of genomic information with environmental data such as climate and/or soil characteristics. Notably, the package offers the possibility of incorporating weather data from field weather stations, or of retrieving global meteorological datasets from a NASA database. Daily weather data can be aggregated over specific periods of time based on naive (for instance, nonoverlapping 10-day windows) or phenological approaches. Different machine learning methods for genomic prediction are implemented, including gradient-boosted decision trees, random forests, stacked ensemble models, and multilayer perceptrons. These prediction models can be evaluated in a user-friendly way via a collection of cross-validation schemes that mimic typical scenarios encountered by plant breeders working with multi-environment trial experimental data. The package is published under an MIT license and accessible on GitHub.
8
Epigenetic Regulation of Phenotypic Sexual Plasticity Inducing Skewed Sex Ratio in Zebrafish. Front Cell Dev Biol 2022; 10:880779. [PMID: 35912111 PMCID: PMC9334531 DOI: 10.3389/fcell.2022.880779] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2022] [Accepted: 06/21/2022] [Indexed: 11/13/2022]
Abstract
The plasticity of sexual phenotype in response to environmental conditions results in biased sex ratios, and their variation has an effect on population dynamics. Epigenetic modifications can modulate sex ratio variation in species where sex is determined by genetic and environmental factors. However, the role of epigenetic mechanisms underlying skewed sex ratios is far from being clear and is still an object of debate in evolutionary developmental biology. In this study, we used zebrafish as a model animal to investigate the effect of DNA methylation on sex ratio variation in sex-biased families in response to environmental temperature. Two sex-biased families with a significant difference in sex ratio were selected for genome-wide DNA methylation analysis using reduced representation bisulfite sequencing (RRBS). The results showed significant genome-wide methylation differences between male-biased and female-biased families, with a greater number of methylated CpG sites in testes than in ovaries. Likewise, pronounced differences between testes and ovaries were identified within both families, where the male-biased family exhibited a higher number of methylated sites than the female-biased family. The effect of temperature showed more methylated positions at the high incubation temperature than at the control temperature. We found differential methylation of many reproduction-related genes (e.g., sox9a, nr5a2, lhx8a, gata4) and genes involved in epigenetic mechanisms (e.g., dnmt3bb.1, dimt1l, hdac11, h1m) in both families. We conclude that epigenetic modifications can influence the sex ratio variation in zebrafish families and may generate skewed sex ratios, which could have a negative consequence for population fitness in species with a genotype-environment interaction sex-determining system under rapid environmental changes.
9
Increasing calling accuracy, coverage, and read-depth in sequence data by the use of haplotype blocks. PLoS Genet 2021; 17:e1009944. [PMID: 34941872 PMCID: PMC8699914 DOI: 10.1371/journal.pgen.1009944] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/11/2021] [Accepted: 11/13/2021] [Indexed: 01/16/2023]
Abstract
High-throughput genotyping of large numbers of lines remains a key challenge in plant genetics, requiring geneticists and breeders to find a balance between data quality and the number of genotyped lines under a variety of different existing genotyping technologies when resources are limited. In this work, we are proposing a new imputation pipeline (“HBimpute”) that can be used to generate high-quality genomic data from low read-depth whole-genome-sequence data. The key idea of the pipeline is the use of haplotype blocks from the software HaploBlocker to identify locally similar lines and subsequently use the reads of all locally similar lines in the variant calling for a specific line. The effectiveness of the pipeline is showcased on a dataset of 321 doubled haploid lines of a European maize landrace, which were sequenced at 0.5X read-depth. The overall imputing error rates are cut in half compared to state-of-the-art software like BEAGLE and STITCH, while the average read-depth is increased to 83X, thus enabling the calling of copy number variation. The usefulness of the obtained imputed data panel is further evaluated by comparing the performance of sequence data in common breeding applications to that of genomic data generated with a genotyping array. For both genome-wide association studies and genomic prediction, results are on par or even slightly better than results obtained with high-density array data (600k). In particular for genomic prediction, we observe slightly higher data quality for the sequence data compared to the 600k array in the form of higher prediction accuracies. This occurred specifically when reducing the data panel to the set of overlapping markers between sequence and array, indicating that sequencing data can benefit from the same marker ascertainment as used in the array process to increase the quality and usability of genomic data. 
High-throughput genotyping of large numbers of lines remains a key challenge in plant genetics and breeding. Cost, precision, and throughput must be balanced to achieve optimal efficiency given available technologies and finite resources. Although genotyping arrays are still considered the gold standard in high-throughput quantitative genetics, recent advances in sequencing provide new opportunities. Both the quality and cost of genomic data generated based on sequencing are highly dependent on the used read-depth. In this work, we propose a new imputation pipeline (“HBimpute”) that uses haplotype blocks to detect individuals of the same genetic origin and subsequently uses all reads of those individuals in the variant calling. Thus, the obtained virtual read-depth is artificially increased, leading to higher calling accuracy, coverage, and the ability to call copy number variation based on low read-depth sequencing data. To conclude, our approach makes sequencing a cost-competitive alternative to genotyping arrays with the added benefit of allowing the calling of structural variation.
10
Prediction of Maize Phenotypic Traits With Genomic and Environmental Predictors Using Gradient Boosting Frameworks. FRONTIERS IN PLANT SCIENCE 2021; 12:699589. [PMID: 34880880 PMCID: PMC8647909 DOI: 10.3389/fpls.2021.699589] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 04/23/2021] [Accepted: 10/15/2021] [Indexed: 05/26/2023]
Abstract
The development of crop varieties with stable performance in future environmental conditions represents a critical challenge in the context of climate change. Environmental data collected at the field level, such as soil and climatic information, can be relevant to improve predictive ability in genomic prediction models by describing more precisely genotype-by-environment interactions, which represent a key component of the phenotypic response for complex crop agronomic traits. Modern predictive modeling approaches can efficiently handle various data types and are able to capture complex nonlinear relationships in large datasets. In particular, machine learning techniques have gained substantial interest in recent years. Here we examined the predictive ability of machine learning-based models for two phenotypic traits in maize using data collected by the Maize Genomes to Fields (G2F) Initiative. The data we analyzed consisted of multi-environment trials (METs) dispersed across the United States and Canada from 2014 to 2017. An assortment of soil- and weather-related variables was derived and used in prediction models alongside genotypic data. Linear random effects models were compared to a linear regularized regression method (elastic net) and to two nonlinear gradient boosting methods based on decision tree algorithms (XGBoost, LightGBM). These models were evaluated under four prediction problems: (1) tested and new genotypes in a new year; (2) only unobserved genotypes in a new year; (3) tested and new genotypes in a new site; (4) only unobserved genotypes in a new site. Accuracy in forecasting grain yield performance of new genotypes in a new year was improved by up to 20% over the baseline model by including environmental predictors with gradient boosting methods. For plant height, an enhancement of predictive ability could neither be observed by using machine learning-based methods nor by using detailed environmental information. 
An investigation of key environmental factors using gradient boosting frameworks also revealed that temperature at flowering stage, frequency and amount of water received during the vegetative and grain filling stage, and soil organic matter content appeared as important predictors for grain yield in our panel of environments.
11
Accounting for epistasis improves genomic prediction of phenotypes with univariate and bivariate models across environments. TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2021; 134:2913-2930. [PMID: 34115154 PMCID: PMC8354961 DOI: 10.1007/s00122-021-03868-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/08/2020] [Accepted: 05/24/2021] [Indexed: 06/12/2023]
Abstract
The accuracy of genomic prediction of phenotypes can be increased by including the top-ranked pairwise SNP interactions into the prediction model. We compared the predictive ability of various prediction models for a maize dataset derived from 910 doubled haploid lines from two European landraces (Kemater Landmais Gelb and Petkuser Ferdinand Rot), which were tested at six locations in Germany and Spain. The compared models were Genomic Best Linear Unbiased Prediction (GBLUP) as an additive model, Epistatic Random Regression BLUP (ERRBLUP) accounting for all pairwise SNP interactions, and selective Epistatic Random Regression BLUP (sERRBLUP) accounting for a selected subset of pairwise SNP interactions. These models have been compared in both univariate and bivariate statistical settings for predictions within and across environments. Our results indicate that modeling all pairwise SNP interactions into the univariate/bivariate model (ERRBLUP) is not superior in predictive ability to the respective additive model (GBLUP). However, incorporating only a selected subset of interactions with the highest effect variances in univariate/bivariate sERRBLUP can increase predictive ability significantly compared to the univariate/bivariate GBLUP. Overall, bivariate models consistently outperform univariate models in predictive ability. Across all studied traits, locations and landraces, the increase in prediction accuracy from univariate GBLUP to univariate sERRBLUP ranged from 5.9 to 112.4 percent, with an average increase of 47 percent. For bivariate models, the change ranged from -0.3 to + 27.9 percent comparing the bivariate sERRBLUP to the bivariate GBLUP, with an average increase of 11 percent. This considerable increase in predictive ability achieved by sERRBLUP may be of interest for "sparse testing" approaches in which only a subset of the lines/hybrids of interest is observed at each location.
12
Genotypic and Dietary Effects on Egg Quality of Local Chicken Breeds and Their Crosses Fed with Faba Beans. Animals (Basel) 2021; 11:1947. [PMID: 34210033 PMCID: PMC8300114 DOI: 10.3390/ani11071947] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/10/2021] [Revised: 06/24/2021] [Accepted: 06/25/2021] [Indexed: 12/28/2022]
Abstract
The quality of chicken eggs is an important criterion for food safety and the consumers' choice at the point of sale. Several studies have shown that egg quality can be influenced by the chickens' genotype and by the composition of the diet. The present study aimed to evaluate the effect of faba beans as a substitute for soybeans in the diet of chickens originating from traditional low-performance breeds in comparison with high-performing laying type hens and their crosses on egg quality parameters. Chickens of six different genotypes were fed either with a feed mix containing 20% faba beans with high or low vicine contents or, as a control, a feed mix containing soybeans. The genotypes studied were the local breeds Vorwerkhuhn and Bresse Gauloise, as well as commercial White Rock parent hens and their crosses. Yolk weight, Haugh units, yolk and shell color, the frequency of blood and meat spots and the composition of the eggs were significantly influenced by the genotype. The feeding of faba beans had an effect on yolk and shell color, Haugh units and shell portion, while there was no significant influence on the frequency of blood and meat spots.
13
In Silico Prediction of Transcription Factor Collaborations Underlying Phenotypic Sexual Dimorphism in Zebrafish (Danio rerio). Genes (Basel) 2021; 12:873. [PMID: 34200177 PMCID: PMC8227731 DOI: 10.3390/genes12060873] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2021] [Revised: 06/02/2021] [Accepted: 06/05/2021] [Indexed: 11/17/2022]
Abstract
The transcriptional regulation of gene expression in higher organisms is essential for different cellular and biological processes. These processes are controlled by transcription factors and their combinatorial interplay, which are crucial for complex genetic programs and transcriptional machinery. The regulation of sex-biased gene expression plays a major role in phenotypic sexual dimorphism in many species, causing dimorphic gene expression patterns between the two sexes. The role of transcription factors (TFs) in gene regulatory mechanisms has so far not been studied for sex determination and sex-associated colour patterning in zebrafish with respect to phenotypic sexual dimorphism. To address this open biological issue, we applied bioinformatics approaches for identifying the predicted TF pairs based on their binding sites for sex and colour genes in zebrafish. In this study, we identified 25 (e.g., STAT6-GATA4; JUN-GATA4; SOX9-JUN) and 14 (e.g., IRF-STAT6; SOX9-JUN; STAT6-GATA4) potentially cooperating TFs based on their binding patterns in promoter regions for sex determination and colour pattern genes in zebrafish, respectively. The comparison between identified TFs for sex and colour genes revealed that several predicted TF pairs (e.g., STAT6-GATA4; JUN-SOX9) are common to both phenotypes and may play a pivotal role in phenotypic sexual dimorphism in zebrafish.
14
Abstract
Background Population genetic studies based on genotyped single nucleotide polymorphisms (SNPs) are influenced by a non-random selection of the SNPs included in the used genotyping arrays. The resulting bias in the estimation of allele frequency spectra and population genetics parameters like heterozygosity and genetic distances relative to whole genome sequencing (WGS) data is known as SNP ascertainment bias. Full correction for this bias requires detailed knowledge of the array design process, which is often not available in practice. This study suggests an alternative approach to mitigate ascertainment bias of a large set of genotyped individuals by using information of a small set of sequenced individuals via imputation without the need for prior knowledge on the array design. Results The strategy was first tested by simulating additional ascertainment bias with a set of 1566 chickens from 74 populations that were genotyped for the positions of the Affymetrix Axiom™ 580 k Genome-Wide Chicken Array. Imputation accuracy was shown to be consistently higher for populations used for SNP discovery during the simulated array design process. Reference sets of at least one individual per population in the study set led to a strong correction of ascertainment bias for estimates of expected and observed heterozygosity, Wright’s Fixation Index and Nei’s Standard Genetic Distance. In contrast, unbalanced reference sets (overrepresentation of populations compared to the study set) introduced a new bias towards the reference populations. Finally, the array genotypes were imputed to WGS by utilization of reference sets of 74 individuals (one per population) to 98 individuals (additional commercial chickens) and compared with a mixture of individually and pooled sequenced populations. 
The imputation reduced the slope between heterozygosity estimates of array data and WGS data from 1.94 to 1.26 when using the smaller balanced reference panel and to 1.44 when using the larger but unbalanced reference panel. This generally supported the results from simulation but was less favorable, advocating for a larger reference panel when imputing to WGS. Conclusions The results highlight the potential of using imputation for mitigation of SNP ascertainment bias but also underline the need for unbiased reference sets. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-021-07663-6.
15
Genetic diversity in global chicken breeds in relation to their genetic distances to wild populations. Genet Sel Evol 2021; 53:36. [PMID: 33853523 PMCID: PMC8048360 DOI: 10.1186/s12711-021-00628-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/24/2020] [Accepted: 03/30/2021] [Indexed: 12/03/2022]
Abstract
Background Migration of a population from its founder population is expected to cause a reduction of its genetic diversity and facilitates differentiation between the population and its founder population, as predicted by the theory of genetic isolation by distance. Consistent with that theory, a model of expansion from a single founder predicts that patterns of genetic diversity in populations can be explained well by their geographic expansion from their founders, which is correlated with genetic differentiation. Methods To investigate this in chicken, we estimated the relationship between the genetic diversity of 160 domesticated chicken populations and their genetic distances to wild chicken populations. Results Our results show a strong inverse relationship, i.e. 88.6% of the variation in the overall genetic diversity of domesticated chicken populations was explained by their genetic distance to the wild populations. We also investigated whether the patterns of genetic diversity of different types of single nucleotide polymorphisms (SNPs) and genes are similar to that of the overall genome. Among the SNP classes, the non-synonymous SNPs deviated most from the overall genome. However, genetic distance to the wild chicken still explained more variation in domesticated chicken diversity across all SNP classes, which ranged from 83.0 to 89.3%. Conclusions Genetic distance between domesticated chicken populations and their wild relatives can predict the genetic diversity of the domesticated populations. On the one hand, genes with little genetic variation across populations, regardless of the genetic distance to the wild population, are associated with major functions such as brain development. Changes in such genes may be detrimental to the species. On the other hand, genetic diversity seems to change at a faster rate within genes that are associated with e.g. protein transport and protein and lipid metabolic processes. 
In general, such genes may be flexible to changes according to the populations’ needs. These results contribute to the knowledge of the evolutionary patterns of different functional genomic regions in the chicken. Supplementary Information The online version contains supplementary material available at 10.1186/s12711-021-00628-z.
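The "variation explained" figures above are coefficients of determination from regressing diversity on genetic distance. A minimal sketch of that calculation, assuming a simple linear fit; the function name and the toy values are hypothetical and are not data from the study:

```python
import numpy as np

def r_squared(x, y):
    """Slope and coefficient of determination for a simple linear regression of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    return slope, 1.0 - ss_res / ss_tot

# Hypothetical toy values (assumption): one point per domesticated population
distance_to_wild = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
diversity = [0.34, 0.31, 0.29, 0.25, 0.24, 0.20]

slope, r2 = r_squared(distance_to_wild, diversity)
print(f"slope={slope:.3f}, R^2={r2:.3f}")
```

A negative slope with a high R² corresponds to the strong inverse relationship described in the abstract.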
|
16
|
Abstract
Single nucleotide polymorphisms (SNPs), genotyped with arrays, have become a widely used marker type in population genetic analyses over the last 10 years. However, compared to whole genome re-sequencing data, arrays are known to lack a substantial proportion of globally rare variants and tend to be biased towards variants present in populations involved in the development process of the respective array. This affects population genetic estimators and is known as SNP ascertainment bias. We investigated factors contributing to ascertainment bias in array development by redesigning the Axiom™ Genome-Wide Chicken Array in silico and evaluating changes in allele frequency spectra and heterozygosity estimates in a stepwise manner. A sequential reduction of rare alleles during the development process was shown. This was mainly caused by the identification of SNPs in a limited set of populations and a within-population selection of common SNPs when aiming for equidistant spacing. These effects were shown to be less severe with a larger discovery panel. Additionally, a generally massive overestimation of expected heterozygosity for the ascertained SNP sets was shown. In the case of the original array, this overestimation was 24% higher for populations involved in the discovery process than for populations not involved. The same was observed after the SNP discovery step in the redesign. However, an unequal contribution of populations during the SNP selection can mask this effect but also adds uncertainty. Finally, we make suggestions for the design of specialized arrays for large scale projects where whole genome re-sequencing techniques are still too expensive.
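Why selecting common SNPs inflates expected heterozygosity can be illustrated with a small sketch. It assumes biallelic loci, uses the standard per-locus expectation 2p(1-p), and mimics ascertainment with a plain minor-allele-frequency threshold; the threshold and the frequencies are hypothetical and much simpler than the array design steps described above:

```python
import numpy as np

def expected_heterozygosity(freqs):
    """Mean expected heterozygosity 2p(1-p) over biallelic loci."""
    p = np.asarray(freqs, dtype=float)
    return float(np.mean(2.0 * p * (1.0 - p)))

def ascertain(freqs, min_maf):
    """Keep only loci whose minor allele frequency is at least min_maf."""
    p = np.asarray(freqs, dtype=float)
    maf = np.minimum(p, 1.0 - p)
    return p[maf >= min_maf]

# Hypothetical allele frequencies with many rare variants (assumption)
freqs = np.array([0.01, 0.02, 0.03, 0.05, 0.10, 0.25, 0.40, 0.50])

he_all = expected_heterozygosity(freqs)                           # sequence-like set
he_array = expected_heterozygosity(ascertain(freqs, min_maf=0.10))  # array-like set
print(he_all, he_array)
```

Because the rare variants are dropped, the array-like set gives a higher mean than the full set, which is the direction of the bias reported in the abstract.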
|
17
|
Phenotype Prediction Under Epistasis. Methods Mol Biol 2021; 2212:105-120. [PMID: 33733353 DOI: 10.1007/978-1-0716-0947-7_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/13/2023]
Abstract
Reliable methods of phenotype prediction from genomic data play an increasingly important role in many areas of plant and animal breeding. Thus, developing methods that enhance prediction accuracy is of major interest. Here, we provide three methods for this purpose: (1) Genomic Best Linear Unbiased Prediction (GBLUP) as a model accounting only for additive SNP effects; (2) Epistatic Random Regression BLUP (ERRBLUP) as a full epistatic model which incorporates all pairwise SNP interactions; and (3) selective Epistatic Random Regression BLUP (sERRBLUP) as an epistatic model which incorporates a subset of pairwise SNP interactions selected based on their absolute effect sizes or their effect variances, which are computed from solutions of the ERRBLUP model. We compared the predictive ability obtained from GBLUP, ERRBLUP, and sERRBLUP with genotypes from a publicly available wheat dataset and respective simulated phenotypes. Results showed that sERRBLUP provides a substantial increase in prediction accuracy compared to the other methods when the optimal proportion of SNP interactions is kept in the model, especially when that proportion is selected based on the SNP interaction effect sizes. All methods described here are implemented in the R-package EpiGP, which is able to process large-scale genomic data in a computationally efficient way.
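For orientation, the additive baseline (GBLUP) can be sketched in a few lines. This is a generic illustration and not the EpiGP interface: it assumes genotypes coded as 0/1/2 allele counts, a VanRaden-type relationship matrix, a known variance ratio `lam`, and the sample mean as the only fixed effect. All function names and the toy data are hypothetical.

```python
import numpy as np

def vanraden_g(genotypes):
    """Additive genomic relationship matrix from 0/1/2 counts (individuals x SNPs)."""
    m = np.asarray(genotypes, dtype=float)
    p = m.mean(axis=0) / 2.0               # allele frequency per SNP
    z = m - 2.0 * p                        # centred genotype matrix
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))

def gblup(g, y, lam):
    """Genomic predictions with an assumed known ratio lam = sigma_e^2 / sigma_g^2."""
    y = np.asarray(y, dtype=float)
    mu = y.mean()                          # simplification: mean as the only fixed effect
    n = len(y)
    return mu + g @ np.linalg.solve(g + lam * np.eye(n), y - mu)

# Hypothetical toy data (assumption): 6 individuals, 20 SNPs
rng = np.random.default_rng(0)
genotypes = rng.integers(0, 3, size=(6, 20))
phenotypes = rng.standard_normal(6)

G = vanraden_g(genotypes)
pred = gblup(G, phenotypes, lam=1.0)
print(pred)
```

The epistatic models in the abstract replace the additive relationship matrix with one built from pairwise SNP interactions; that step is not shown here.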
|
18
|
MoBPSweb: A web-based framework to simulate and compare breeding programs. G3 (BETHESDA, MD.) 2021; 11:jkab023. [PMID: 33712818 PMCID: PMC8022963 DOI: 10.1093/g3journal/jkab023] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/11/2020] [Accepted: 01/11/2021] [Indexed: 11/13/2022]
Abstract
In this study, we introduce a new web-based simulation framework ("MoBPSweb") that combines a unified language to describe breeding programs with the simulation software MoBPS, standing for "Modular Breeding Program Simulator." Thereby, MoBPSweb provides a flexible environment to log, simulate, evaluate, and compare breeding programs. Inputs can be provided via modules ranging from a Vis.js-based environment for "drawing" the breeding program to a variety of modules to provide phenotype information, economic parameters, and other relevant information. Similarly, results of the simulation study can be extracted and compared to other scenarios via output modules (e.g., observed phenotypes, the accuracy of breeding value estimation, inbreeding rates), while all simulations and downstream analysis are executed in the highly efficient R-package MoBPS.
|
19
|
Harvest Moon: Some personal thoughts on past and future directions in animal breeding research. J Anim Breed Genet 2021; 138:135-136. [PMID: 33543808 DOI: 10.1111/jbg.12538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
|
20
|
|
21
|
A unifying concept of animal breeding programmes. J Anim Breed Genet 2021; 138:137-150. [PMID: 33486850 DOI: 10.1111/jbg.12534] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/16/2020] [Revised: 10/22/2020] [Accepted: 12/13/2020] [Indexed: 11/26/2022]
Abstract
Modern animal breeding programmes are constantly evolving with advances in breeding theory, biotechnology and genetics. Surprisingly, there seems to be no generally accepted succinct definition of what exactly a breeding programme is, nor is there a unified language to describe breeding programmes in a comprehensive, unambiguous and reproducible way. In this work, we try to fill this gap by suggesting a general definition of breeding programmes that also pertains to cases where genetic progress is not achieved through selection, but, for example, through transgenic technologies, or the aim is not to generate genetic progress, but, for example, to maintain genetic diversity. The key idea of the underlying concept is to represent a breeding programme in modular form as a directed graph that is composed of nodes and edges, where nodes represent cohorts of breeding units, usually individuals, and edges represent breeding activities, like "selection" or "reproduction." We claim that by defining a comprehensive set of nodes and edges, it is possible to represent any breeding programme of arbitrary complexity by such a graph, which thus comprises a full description of the breeding programme. This concept is implemented in a web-based tool (MoBPSweb, available at www.mobps.de) and has a link to the R-package MoBPS (Modular Breeding Program Simulator) to simulate the described breeding programmes. The approach is illustrated by showcasing three different breeding programmes of increasing complexity. The concept allows a formal description of breeding programmes, which is requested, for example, in legal regulations of the European Union, but so far cannot be provided in a standardized format. In the discussion, we point out potential limitations of the concept and argue that the general approach can be easily extended to account for novel breeding technologies, to breeding of crops or experimental species, but also to modelling diversity dynamics in natural populations.
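The graph representation described above, with cohorts as nodes and breeding activities as edges, can be sketched with plain data classes. The class names, fields and the example programme are hypothetical illustrations and do not reflect the MoBPSweb data format:

```python
from dataclasses import dataclass, field

@dataclass
class Cohort:
    """Node: a group of breeding units, usually individuals."""
    name: str
    size: int

@dataclass
class BreedingAction:
    """Directed edge: an activity leading from one cohort to another."""
    source: str
    target: str
    action: str  # e.g. "selection" or "reproduction"

@dataclass
class BreedingProgramme:
    cohorts: dict = field(default_factory=dict)
    actions: list = field(default_factory=list)

    def add_cohort(self, name, size):
        self.cohorts[name] = Cohort(name, size)

    def add_action(self, source, target, action):
        if source not in self.cohorts or target not in self.cohorts:
            raise KeyError("both cohorts must exist before they are connected")
        self.actions.append(BreedingAction(source, target, action))

    def successors(self, name):
        """Activities that start at the given cohort."""
        return [(a.action, a.target) for a in self.actions if a.source == name]

# Hypothetical two-step programme (assumption): select boars, then reproduce
programme = BreedingProgramme()
programme.add_cohort("boars", 100)
programme.add_cohort("selected_boars", 10)
programme.add_cohort("offspring", 500)
programme.add_action("boars", "selected_boars", "selection")
programme.add_action("selected_boars", "offspring", "reproduction")
print(programme.successors("boars"))
```

Storing the programme as explicit nodes and edges is what makes the description reproducible: the same graph can be logged, compared and handed to a simulator.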
|
22
|
The Modular Breeding Program Simulator (MoBPS) allows efficient simulation of complex breeding programs. ANIMAL PRODUCTION SCIENCE 2021. [DOI: 10.1071/an21076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
Abstract
Context
Breeding programs aim at improving the genetic characteristics of livestock populations with respect to productivity, fitness and adaptation, while controlling negative effects such as inbreeding or health and welfare issues. As breeding is affected by a variety of interdependent factors, the analysis of the effect of certain breeding actions and the optimisation of a breeding program are highly complex tasks.
Aims
This study was conducted to display the potential of using stochastic simulation to analyse, evaluate and compare breeding programs and to show how the Modular Breeding Program Simulator (MoBPS) simulation framework can further enhance this.
Methods
In this study, a simplified version of the breeding program of Göttingen Minipigs was simulated to analyse the impact of genotyping and optimum contribution selection in regard to both genetic gain and diversity. The software MoBPS was used as the backend simulation software and was extended to allow for a more realistic modelling of pig breeding programs. Among others, extensions include the simulation of phenotypes with discrete observations (e.g. teat count), variable litter sizes, and a breeding value estimation in the associated R-package miraculix that utilises a graphics processing unit.
Key results
Genotyping with the subsequent use of genomic best linear unbiased prediction (GBLUP) led to substantial increases in genetic gain (15.3%) compared with a pedigree-based BLUP, while reducing the increase of inbreeding by 24.8%. The additional use of optimum genetic selection was shown to be favourable compared with the plain selection of top boars. The use of graphics processing unit-based breeding value estimation with known heritability was ~100 times faster than the state-of-the-art R-package rrBLUP.
Conclusions
The results regarding the effect of both genotyping and optimal contribution selection are in line with well established results. Paired with additional new features such as the modelling of discrete phenotypes and adaptable litter sizes, this confirms MoBPS to be a unique tool for the realistic modelling of modern breeding programs.
Implications
The MoBPS framework provides a powerful tool for scientists and breeders to perform stochastic simulations to optimise the practical design of modern breeding programs to secure standardised breeding of high-quality animals and answer associated research questions.
|
23
|
ANOVA-HD: Analysis of variance when both input and output layers are high-dimensional. PLoS One 2020; 15:e0243251. [PMID: 33315963 PMCID: PMC7735570 DOI: 10.1371/journal.pone.0243251] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/22/2020] [Accepted: 11/17/2020] [Indexed: 11/21/2022] Open
Abstract
Modern genomic data sets often involve multiple data-layers (e.g., DNA-sequence, gene expression), each of which itself can be high-dimensional. The biological processes underlying these data-layers can lead to intricate multivariate association patterns. We propose and evaluate two methods to determine the proportion of variance of an output data set that can be explained by an input data set when both data panels are high dimensional. Our approach uses random-effects models to estimate the proportion of variance of vectors in the linear span of the output set that can be explained by regression on the input set. We consider a method based on an orthogonal basis (Eigen-ANOVA) and one that uses random vectors (Monte Carlo ANOVA, MC-ANOVA) in the linear span of the output set. Using simulations, we show that the MC-ANOVA method gave nearly unbiased estimates. Estimates produced by Eigen-ANOVA were also nearly unbiased, except when the shared variance was very high (e.g., >0.9). We demonstrate the potential insight that can be obtained from the use of MC-ANOVA and Eigen-ANOVA by applying these two methods to the study of multi-locus linkage disequilibrium in chicken (Gallus gallus) genomes and to the assessment of inter-dependencies between gene expression, methylation, and copy-number-variants in data from breast cancer tumors from humans (Homo sapiens). Our analyses reveal that in chicken breeding populations ~50,000 evenly-spaced SNPs are enough to fully capture the span of whole-genome-sequencing genomes. In the study of multi-omic breast cancer data, we found that the span of copy-number-variants can be fully explained using either methylation or gene expression data and that roughly 74% of the variance in gene expression can be predicted from methylation data.
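The Monte Carlo idea can be sketched as follows: draw random vectors in the linear span of the output set, regress each on the input set, and average the proportion of variance explained. Note the swap: this sketch uses ordinary least squares in a low-dimensional toy setting, whereas the abstract describes random-effects models for the high-dimensional case. The function name and the toy data are hypothetical.

```python
import numpy as np

def mc_r2(x, y, n_draws=200, seed=0):
    """Average R^2 of random linear combinations of y's columns regressed on x (OLS)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean(axis=0)                 # centre inputs
    yc = y - y.mean(axis=0)                 # centre outputs
    r2 = []
    for _ in range(n_draws):
        v = yc @ rng.standard_normal(yc.shape[1])   # random vector in span(y)
        beta, *_ = np.linalg.lstsq(xc, v, rcond=None)
        resid = v - xc @ beta
        r2.append(1.0 - (resid @ resid) / (v @ v))
    return float(np.mean(r2))

# Hypothetical toy data (assumption): 50 samples, 3 inputs, 4 outputs
rng = np.random.default_rng(1)
x = rng.standard_normal((50, 3))
y_explained = x @ rng.standard_normal((3, 4))   # outputs fully determined by inputs
y_noise = rng.standard_normal((50, 4))          # outputs unrelated to inputs

r2_explained = mc_r2(x, y_explained)
r2_noise = mc_r2(x, y_noise)
print(r2_explained, r2_noise)
```

With many more predictors than samples, OLS would fit every vector perfectly, which is one reason a shrinkage or random-effects estimator is needed in the real application.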
|
24
|
Using Local Convolutional Neural Networks for Genomic Prediction. Front Genet 2020; 11:561497. [PMID: 33281867 PMCID: PMC7689358 DOI: 10.3389/fgene.2020.561497] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 05/12/2020] [Accepted: 10/12/2020] [Indexed: 11/18/2022] Open
Abstract
The prediction of breeding values and phenotypes is of central importance for both livestock and crop breeding. In this study, we analyze the use of artificial neural networks (ANN) and, in particular, local convolutional neural networks (LCNN) for genomic prediction, as a region-specific filter corresponds much better with our prior knowledge of the genetic architecture of traits than traditional convolutional neural networks. Model performances are evaluated on a simulated maize data panel (n = 10,000; p = 34,595) and real Arabidopsis data (n = 2,039; p = 180,000) for a variety of traits based on their predictive ability. The baseline LCNN, containing one local convolutional layer (kernel size: 10) and two fully connected layers with 64 nodes each, outperforms commonly proposed ANNs (multi-layer perceptrons and convolutional neural networks) for basically all considered traits. For traits with high heritability and a large training population, as present in the simulated data, LCNNs even outperform state-of-the-art methods like genomic best linear unbiased prediction (GBLUP), Bayesian models and extended GBLUP, indicated by an increase in predictive ability of up to 24%. However, for small training populations, these state-of-the-art methods outperform all considered ANNs. Nevertheless, the LCNN still outperforms all other considered ANNs by around 10%. Minor improvements to the tested baseline network architecture of the LCNN were obtained by increasing the kernel size and reducing the stride, whereas the number of subsequent fully connected layers and their node sizes had negligible impact. Although gains in predictive ability were obtained for large scale data sets by using LCNNs, the practical use of ANNs comes with additional problems, such as the need to genotype all considered individuals and the lack of estimates of heritability and reliability. Furthermore, breeding values are additive by design, whereas ANN-based estimates are not. 
However, ANNs also come with new opportunities, as networks can easily be extended to account for additional inputs (omics, weather etc.) and outputs (multi-trait models), and computing time increases linearly with the number of individuals. With advances in high-throughput phenotyping and cheaper genotyping, ANNs can become a valid alternative for genomic prediction.
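The difference between a local and a traditional convolutional layer is that each genomic window gets its own filter instead of sharing one set of weights. A minimal forward-pass sketch in NumPy, assuming a single input channel and a ReLU activation; the function name, shapes and activation are illustrative assumptions, not the network code used in the study:

```python
import numpy as np

def local_conv1d(x, weights, bias, stride):
    """Locally connected 1D layer: window i uses its own filter weights[i].

    x: (n_markers,), weights: (n_windows, kernel_size), bias: (n_windows,)
    """
    n_windows, kernel_size = weights.shape
    expected = (x.shape[0] - kernel_size) // stride + 1
    if n_windows != expected:
        raise ValueError(f"expected {expected} windows, got {n_windows}")
    out = np.empty(n_windows)
    for i in range(n_windows):
        start = i * stride
        out[i] = x[start:start + kernel_size] @ weights[i] + bias[i]
    return np.maximum(out, 0.0)  # ReLU (assumed activation)

# Hypothetical toy input (assumption): 30 markers, kernel size 10, stride 10
markers = np.ones(30)
filters = np.ones((3, 10))
offsets = np.zeros(3)

activations = local_conv1d(markers, filters, offsets, stride=10)
print(activations)
```

A shared-weight convolution would use one row of `filters` for all windows; the local variant has one row per window, so the parameter count grows with the number of markers.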
|
25
|
|
26
|
Genome-wide detection of signatures of selection in indicine and Brazilian locally adapted taurine cattle breeds using whole-genome re-sequencing data. BMC Genomics 2020; 21:624. [PMID: 32917133 PMCID: PMC7488563 DOI: 10.1186/s12864-020-07035-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 03/30/2020] [Accepted: 08/27/2020] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND The cattle introduced by European conquerors during the Brazilian colonization period were exposed to a process of natural selection in different types of biomes throughout the country, leading to the development of locally adapted cattle breeds. In this study, whole-genome re-sequencing data from indicine and Brazilian locally adapted taurine cattle breeds were used to detect genomic regions under selective pressure. Within-population and cross-population statistics were combined separately in a single score using the de-correlated composite of multiple signals (DCMS) method. Putative sweep regions were revealed by assessing the top 1% of the empirical distribution generated by the DCMS statistics. RESULTS A total of 33,328,447 biallelic SNPs with an average read depth of 12.4X passed the hard filtering process and were used to assess putative sweep regions. Admixture has occurred in some locally adapted taurine populations due to the introgression of exotic breeds. The genomic inbreeding coefficient based on runs of homozygosity (ROH) concurred with the populations' historical background. Signatures of selection retrieved from the DCMS statistics provided a comprehensive set of putative candidate genes and revealed QTLs disclosing cattle production traits and adaptation to the challenging environments. Additionally, several candidate regions overlapped with previous regions under selection described in the literature for other cattle breeds. CONCLUSION The current study reported putative sweep regions that can provide important insights to better understand the selective forces shaping the genome of the indicine and Brazilian locally adapted taurine cattle breeds. Such regions likely harbor traces of the natural selection pressures to which these populations have been exposed and may elucidate footprints for adaptation to the challenging climatic conditions.
|
27
|
Pan-genomic open reading frames: A potential supplement of single nucleotide polymorphisms in estimation of heritability and genomic prediction. PLoS Genet 2020; 16:e1008995. [PMID: 32833967 PMCID: PMC7470747 DOI: 10.1371/journal.pgen.1008995] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/27/2019] [Revised: 09/03/2020] [Accepted: 07/15/2020] [Indexed: 11/19/2022] Open
Abstract
Pan-genomic open reading frames (ORFs) potentially carry protein-coding gene or coding variant information in a population. In this study, we suggest that pan-genomic ORFs are promising for use in the estimation of heritability and genomic prediction. A Saccharomyces cerevisiae dataset with whole-genome SNPs, pan-genomic ORFs, and the copy numbers of those ORFs is used to test the effectiveness of ORF data as a predictor in three prediction models for 35 traits. Our results show that the ORF-based heritability can capture more genetic effects than SNP-based heritability for all traits. Compared to SNP-based genomic prediction (GBLUP), pan-genomic ORF-based genomic prediction (OBLUP) is distinctly more accurate for all traits, and the predictive abilities on average are more than doubled across all traits. For four traits, prediction based on ORF copy number (CBLUP) is more accurate than OBLUP. When using different numbers of isolates in training sets in ORF-based prediction, the predictive abilities for all traits increased as more isolates were added to the training sets, suggesting that with very large training sets the prediction accuracy will be in the range of the square root of the heritability. We conclude that pan-genomic ORFs have the potential to be a supplement of single nucleotide polymorphisms in estimation of heritability and genomic prediction.
|
28
|
Egg Production and Bone Stability of Local Chicken Breeds and Their Crosses Fed with Faba Beans. Animals (Basel) 2020; 10:E1480. [PMID: 32842714 PMCID: PMC7552325 DOI: 10.3390/ani10091480] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/17/2020] [Revised: 08/20/2020] [Accepted: 08/20/2020] [Indexed: 11/16/2022] Open
Abstract
Poultry production is raising public concerns regarding the practice of culling day-old chicks and the importation of soy from overseas for feedstuff. Therefore, an alternative approach to poultry production was tested. In two consecutive experiments, two traditional chicken breeds, Vorwerkhuhn and Bresse Gauloise, and White Rock as a commercial layer genotype, as well as crossbreds thereof, were fed diets containing either 20% vicin-rich or vicin-poor faba beans, thus addressing both subjects of debate. Hen performance traits and bone stability were recorded. All parameters were considerably influenced by the genotype, with White Rock showing the significantly highest (p < 0.05) laying performance (99.4% peak production) and mean egg weights (56.6 g) of the purebreds, but the lowest bone breaking strength (tibiotarsus 197.2 N, humerus 230.2 N). Regarding crossbreds, the Bresse Gauloise × White Rock cross performed best (peak production 98.1%, mean egg weight 58.0 g). However, only limited dietary effects were found, as only the feeding of 20% vicin-rich faba beans led to a significant reduction of egg weights of at most 1.1 g (p < 0.05) and to a significant reduction of the shell stability in the crossbred genotypes. In terms of dual-purpose usage, crossing of Bresse Gauloise with White Rock seems to be the most promising variant studied here.
|
29
|
Abstract
The R-package MoBPS provides a computationally efficient and flexible framework to simulate complex breeding programs and compare their economic and genetic impact. Simulations are performed at the level of individuals. MoBPS utilizes a highly efficient implementation with bit-wise data storage and matrix multiplications from the associated R-package miraculix, which allows large-scale populations to be handled. Individual haplotypes are not stored but instead automatically derived based on points of recombination and mutations. The modular structure of MoBPS allows rather coarse simulations, as needed to generate founder populations, to be combined with very detailed modeling of today's complex breeding programs, making use of all available biotechnologies. MoBPS provides pre-implemented functions for common breeding practices such as optimum genetic contributions and single-step GBLUP but also allows the user to replace certain steps with personalized and/or self-written solutions.
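Deriving a haplotype from recombination points instead of storing it can be sketched as follows. The segment format (half-open intervals tagged with a founder), the 0/1 allele coding and the function name are assumptions made for illustration and are not the internal MoBPS data structure:

```python
def derive_haplotype(founder_haplotypes, segments, mutations=()):
    """Rebuild a haplotype from founder segments and mutated positions.

    founder_haplotypes: dict mapping founder id to a list of 0/1 alleles
    segments: list of (start, end, founder_id), half-open, covering all markers in order
    mutations: marker positions whose allele is flipped after recombination
    """
    haplotype = []
    for start, end, founder in segments:
        haplotype.extend(founder_haplotypes[founder][start:end])
    for pos in mutations:
        haplotype[pos] = 1 - haplotype[pos]
    return haplotype

# Hypothetical founders and recombination history (assumption)
founders = {"A": [0, 0, 0, 0, 0, 0], "B": [1, 1, 1, 1, 1, 1]}
segments = [(0, 2, "A"), (2, 5, "B"), (5, 6, "A")]  # recombination points at 2 and 5

haplotype = derive_haplotype(founders, segments, mutations=[0])
print(haplotype)  # [1, 0, 1, 1, 1, 0]
```

Only the founders, a handful of breakpoints and the mutation list need to be kept per individual, which is far smaller than a full marker vector when the number of markers is large.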
|
30
|
Relationship between Bone Stability and Egg Production in Genetically Divergent Chicken Layer Lines. Animals (Basel) 2020; 10:ani10050850. [PMID: 32423072 PMCID: PMC7278460 DOI: 10.3390/ani10050850] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 04/15/2020] [Revised: 05/07/2020] [Accepted: 05/11/2020] [Indexed: 11/16/2022] Open
Abstract
Impaired animal welfare due to skeletal disorders is likely one of the greatest issues currently facing the egg production industry. Reduced bone stability in laying hens is frequently attributed to long-term selection for increased egg production. The present study sought to analyse the relationship between bone stability traits and egg production. The study comprised four purebred layer lines, differing in their phylogenetic origin and performance level, providing extended insight into the phenotypic variability in bone characteristics in laying hens. Data collection included basic production parameters, bone morphometry, bone mineral density (BMD) and bone breaking strength (BBS) of the tibiotarsus and humerus. Using a multifactorial model and regression analyses, BMD proved to be of outstanding importance for bone stability. Only for the tibiotarsus were morphometric parameters and the bone weight associated with BBS. Within the chicken lines, no effect of total eggshell production on BBS or BMD could be detected, suggesting that a high egg yield itself is not necessarily a risk for poor bone health. Considering the complexity of osteoporosis, the estimated genetic parameters confirmed the importance of genetics in addressing the challenge of improving bone strength in layers.
|
31
|
Abstract
BACKGROUND The Göttingen Minipig (GMP) is the smallest commercially available minipig breed under a controlled breeding scheme and is globally bred in five isolated colonies. The genetic isolation harbors the risk of stratification, which might compromise the identity of the breed and its usability as an animal model for biomedical and human disease. We conducted whole genome re-sequencing of two DNA-pools per colony to assess genomic differentiation within and between colonies. We added publicly available samples from 13 various pig breeds and discovered overall about 32 M loci, ~16 M thereof variable in GMPs. Individual samples were virtually pooled breed-wise. FST between virtual and DNA pools, a phylogenetic tree, principal component analysis (PCA) and evaluation of functional SNP classes were conducted. An F-test was performed to reveal significantly differentiated allele frequencies between colonies. Variation within a colony was quantified as expected heterozygosity. RESULTS Phylogeny and PCA showed that the GMP is easily discriminable from all other breeds, but that there is also differentiation between the GMP colonies. Dependent on the contrast between GMP colonies, 4 to 8% of all loci had significantly different allele frequencies. Functional annotation revealed that functionally non-neutral loci are less prone to differentiation. Annotation of highly differentiated loci revealed a couple of deleterious mutations in genes with putative effects in the GMPs. CONCLUSION Differentiation and annotation results suggest that the underlying mechanisms are drift events rather than directed selection and are limited to neutral genome regions. Animal exchange seems not yet necessary. The Relliehausen colony appears to be the genetically most unique GMP sub-population and could be a valuable resource if animal exchange is required to maintain uniformity of the GMP.
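Differentiation between two pools can be summarised from their allele frequencies. The sketch below uses a simple heterozygosity-based estimator, FST = 1 - HS/HT, computed as a ratio of sums over loci; this is one of several possible estimators and is an assumption, since the abstract does not state which one was used. The function name and the frequencies are hypothetical.

```python
import numpy as np

def fst(p1, p2):
    """Heterozygosity-based FST between two populations from allele frequencies."""
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    hs = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0  # mean within-pool He
    p_bar = (p1 + p2) / 2.0
    ht = 2.0 * p_bar * (1.0 - p_bar)                             # He of the merged pool
    # Assumes at least one locus is polymorphic overall, otherwise ht.sum() is zero
    return float(1.0 - hs.sum() / ht.sum())

# Hypothetical allele frequencies in two colonies (assumption)
fst_same = fst([0.2, 0.5], [0.2, 0.5])    # identical pools
fst_fixed = fst([0.0, 1.0], [1.0, 0.0])   # alternative alleles fixed
fst_mild = fst([0.2, 0.5], [0.3, 0.6])    # slightly diverged pools
print(fst_same, fst_fixed, fst_mild)
```

The within-pool term `hs` is the same expected heterozygosity that the abstract uses to quantify variation within a colony.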
|
32
|
Improving Imputation Quality in BEAGLE for Crop and Livestock Data. G3 (BETHESDA, MD.) 2020; 10:177-188. [PMID: 31676508 PMCID: PMC6945036 DOI: 10.1534/g3.119.400798] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 03/12/2019] [Accepted: 10/31/2019] [Indexed: 12/14/2022]
Abstract
Imputation is one of the key steps in the preprocessing and quality control protocol of any genetic study. Most imputation algorithms were originally developed for use in human genetics and thus are optimized for a high level of genetic diversity. Different versions of BEAGLE were evaluated on genetic datasets of doubled haploids of two European maize landraces, a commercial breeding line and a diversity panel in chicken, respectively, with different levels of genetic diversity and structure, which can be taken into account in BEAGLE by parameter tuning. Especially for phasing, BEAGLE 5.0 outperformed the newest version (5.1), which in turn also led to improved imputation. Earlier versions were far more dependent on the adaptation of parameters in all our tests. For all versions, the parameter ne (effective population size) had a major effect on the error rate for imputation of ungenotyped markers, reducing error rates by up to 98.5%. Further improvement was obtained by tuning the parameters affecting the structure of the haplotype cluster that is used to initialize the underlying Hidden Markov Model of BEAGLE. The number of markers with extremely high error rates for the maize datasets was more than halved by the use of a flint reference genome (F7, PE0075 etc.) instead of the commonly used B73. On average, error rates for imputation of ungenotyped markers were reduced by 8.5% by excluding genetically distant individuals from the reference panel for the chicken diversity panel. To optimize imputation accuracy, one has to find a balance between representing as much of the genetic diversity as possible and avoiding the introduction of noise by including genetically distant individuals.
|
33
|
Efficient phenotypic sex classification of zebrafish using machine learning methods. Ecol Evol 2019; 9:13332-13343. [PMID: 31871648 PMCID: PMC6912926 DOI: 10.1002/ece3.5788] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/23/2019] [Revised: 09/09/2019] [Accepted: 09/17/2019] [Indexed: 12/14/2022] Open
Abstract
Sex determination in zebrafish by manual approaches according to current guidelines relies on human observation. These guidelines for sex recognition have proven to be subjective and highly labor-intensive. To address this problem, we present a methodology to automatically classify the phenotypic sex using two machine learning methods: Deep Convolutional Neural Networks (DCNNs) based on the whole fish appearance and Support Vector Machine (SVM) based on caudal fin coloration. Machine learning techniques in sex classification provide potential efficiency with the advantage of automatization and robustness in the prediction process. Furthermore, since developmental plasticity can be influenced by environmental conditions, we have investigated the impact of elevated water temperature during embryogenesis on sex and sex-related differences in color intensity of adult zebrafish. The estimated color intensity based on SVM was then applied to detect the association between coloration and body weight and length. Phenotypic sex classifications using machine learning methods resulted in a high degree of association with the real sex in nontreated animals. In temperature-induced animals, DCNNs reached a performance of 100%, whereas 20% of males were misclassified using SVM due to a lower color intensity. Furthermore, a positive association between color intensity and body weight and length was observed in males. Our study demonstrates that high ambient temperature leads to a lower color intensity in male animals and a positive association of male caudal fin coloration with body weight and length, which appears to play a significant role in sexual attraction. The software developed for sex classification in this study is readily applicable to other species with sex-linked visible phenotypic differences.
|
34
|
HaploBlocker: Creation of Subgroup-Specific Haplotype Blocks and Libraries. Genetics 2019; 212:1045-1061. [PMID: 31152070 PMCID: PMC6707469 DOI: 10.1534/genetics.119.302283] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 05/06/2019] [Accepted: 05/30/2019] [Indexed: 11/18/2022] Open
Abstract
The concept of haplotype blocks has been shown to be useful in genetics. Fields of application range from the detection of regions under positive selection to statistical methods that make use of dimension reduction. We propose a novel approach ("HaploBlocker") for defining and inferring haplotype blocks that focuses on linkage instead of the commonly used population-wide measures of linkage disequilibrium. We define a haplotype block as a sequence of genetic markers that has a predefined minimum frequency in the population, and only haplotypes with a similar sequence of markers are considered to carry that block, effectively screening a dataset for group-wise identity-by-descent. From these haplotype blocks, we construct a haplotype library that represents a large proportion of genetic variability with a limited number of blocks. Our method is implemented in the associated R-package HaploBlocker, and provides flexibility not only to optimize the structure of the obtained haplotype library for subsequent analyses, but also to handle datasets of different marker density and genetic diversity. By using haplotype blocks instead of single nucleotide polymorphisms (SNPs), local epistatic interactions can be naturally modeled, and the reduced number of parameters enables a wide variety of new methods for further genomic analyses such as genomic prediction and the detection of selection signatures. We illustrate our methodology with a dataset comprising 501 doubled haploid lines in a European maize landrace genotyped at 501,124 SNPs. With the suggested approach, we identified 2991 haplotype blocks with an average length of 2685 SNPs that together represent 94% of the dataset.
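The block definition above (a marker sequence that reaches a minimum frequency, carried only by haplotypes that match it) can be illustrated with a strongly simplified sketch. It scans fixed, non-overlapping windows and requires exact matches, which is an assumption made for brevity; it is not the HaploBlocker algorithm and omits its merging and extension of blocks. The function name and the toy haplotypes are hypothetical.

```python
from collections import Counter

def frequent_window_blocks(haplotypes, window, min_count):
    """Return (start, alleles, count) for window sequences carried by >= min_count haplotypes."""
    n_markers = len(haplotypes[0])
    blocks = []
    for start in range(0, n_markers - window + 1, window):
        counts = Counter(tuple(h[start:start + window]) for h in haplotypes)
        for alleles, count in counts.items():
            if count >= min_count:
                blocks.append((start, alleles, count))
    return blocks

# Hypothetical haplotypes (assumption): 4 lines, 4 markers each
haplotypes = [
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
]

blocks = frequent_window_blocks(haplotypes, window=2, min_count=2)
print(blocks)  # [(0, (0, 0), 3), (2, (1, 1), 2)]
```

Each retained block stands for a group of haplotypes that share the same local sequence, which is how a small library of blocks can represent a large share of the variation in a dataset.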
|
35
|
The SYNBREED chicken diversity panel: a global resource to assess chicken diversity at high genomic resolution. BMC Genomics 2019; 20:345. [PMID: 31064348 PMCID: PMC6505202 DOI: 10.1186/s12864-019-5727-9] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Received: 12/10/2018] [Accepted: 04/23/2019] [Indexed: 01/13/2023]
Abstract
BACKGROUND Since domestication, chickens did not only disperse into the different parts of the world but they have also undergone significant genomic changes in this process. Many breeds, strains or lines have been formed and those represent the diversity of the species. However, other than the natural evolutionary forces, management practices (including those that threaten the persistence of genetic diversity) following domestication have shaped the genetic make-up of and diversity between today's chicken breeds. As part of the SYNBREED project, samples from a wide variety of chicken populations have been collected across the globe and were genotyped with a high density SNP array. The panel consists of the wild type, commercial layers and broilers, indigenous village/local type and fancy chicken breeds. The SYNBREED chicken diversity panel (SCDP) is made available to serve as a public basis to study the genetic structure of chicken diversity. In the current study we analyzed the genetic diversity between and within the populations in the SCDP, which is important for making informed decisions for effective management of farm animal genetic resources. RESULTS Many of the fancy breeds cover a wide spectrum and clustered with other breeds of similar supposed origin as shown by the phylogenetic tree and principal component analysis. However, the fancy breeds as well as the highly selected commercial layer lines have reduced genetic diversity within the population, with the average observed heterozygosity estimates lower than 0.205 across their breeds' categories and the average proportion of polymorphic loci lower than 0.680. We show that there is still a lot of genetic diversity preserved within the wild and less selected African, South American and some local Asian and European breeds with the average observed heterozygosity greater than 0.225 and the average proportion of polymorphic loci larger than 0.720 within their breeds' categories. 
CONCLUSIONS It is important that such highly diverse breeds are maintained for the sustainability and flexibility of future chicken breeding. This diversity panel provides opportunities for exploitation for further chicken molecular genetic studies. With the possibility to further expand, it constitutes a very useful community resource for chicken genetic diversity research.
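The two within-population diversity measures quoted in this abstract, observed heterozygosity and the proportion of polymorphic loci, can be illustrated with a minimal sketch. It assumes genotypes coded as 0/1/2 allele counts in an individuals-by-SNPs array; the function names and the toy data are hypothetical and are not taken from the SCDP study.

```python
import numpy as np


def observed_heterozygosity(genotypes):
    """Share of genotype calls that are heterozygous (coded 1)."""
    return float(np.mean(genotypes == 1))


def proportion_polymorphic(genotypes):
    """Share of loci at which both alleles are observed in the sample."""
    p = genotypes.mean(axis=0) / 2.0  # allele frequency per locus
    return float(np.mean((p > 0.0) & (p < 1.0)))


# Toy data (made up): 4 individuals x 4 SNPs; the third SNP is fixed.
toy_genotypes = np.array([
    [0, 1, 2, 0],
    [1, 1, 2, 0],
    [2, 1, 2, 0],
    [1, 1, 2, 1],
])

print(observed_heterozygosity(toy_genotypes))
print(proportion_polymorphic(toy_genotypes))
```

In this toy sample 7 of the 16 genotype calls are heterozygous and 3 of the 4 loci are polymorphic. Real analyses would also need to handle missing calls, which this sketch ignores.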
|
36
|
Genetic mechanism underlying sexual plasticity and its association with colour patterning in zebrafish (Danio rerio). BMC Genomics 2019; 20:341. [PMID: 31060508 PMCID: PMC6503382 DOI: 10.1186/s12864-019-5722-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 12/22/2018] [Accepted: 04/22/2019] [Indexed: 02/07/2023]
Abstract
BACKGROUND Elevated water temperature, as is expected through climate change, leads to masculinization in fish species with sexual plasticity, resulting in changes in population dynamics. These changes are one important ecological consequence, contributing to the risk of extinction in small and inbred fish populations under natural conditions, due to male-biased sex ratio. Here we investigated the effect of elevated water temperature during embryogenesis on sex ratio and sex-biased gene expression profiles between two different tissues, namely gonad and caudal fin of adult zebrafish males and females, to gain new insights into the molecular mechanisms underlying sex determination (SD) and colour patterning related to sexual attractiveness. RESULTS Our study demonstrated sex ratio imbalances with 25.5% more males under the high-temperature condition, resulting from gonadal masculinization. The result of transcriptome analysis showed a significantly upregulated expression of male SD genes (e.g. dmrt1, amh, cyp11c1 and sept8b) and downregulation of female SD genes (e.g. zp2.1, vtg1, cyp19a1a and bmp15) in male gonads compared to female gonads. Contrary to expectations, we found highly differential expression of colour pattern (CP) genes in the gonads, suggesting the 'neofunctionalisation' of those genes in the zebrafish reproduction system. However, in the caudal fin, no differential expression of CP genes was identified, suggesting the observed differences in colouration between males and females in adult fish may be due to post-transcriptional regulation of key enzymes involved in pigment synthesis and distribution. CONCLUSIONS Our study demonstrates a male-biased sex ratio under the high-temperature condition and supports a polygenic SD (PSD) system in laboratory zebrafish. We identify a subset of pathways (tight junction, gap junction and apoptosis), enriched for SD and CP genes, which appear to be co-regulated in the same pathway, providing evidence for involvement of those genes in the regulation of phenotypic sexual dimorphism in zebrafish.
|
37
|
Genetics of adaptation in modern chicken. PLoS Genet 2019; 15:e1007989. [PMID: 31034467 PMCID: PMC6508745 DOI: 10.1371/journal.pgen.1007989] [Citation(s) in RCA: 61] [Impact Index Per Article: 12.2] [Received: 07/19/2018] [Revised: 05/09/2019] [Accepted: 01/28/2019] [Indexed: 11/17/2022]
Abstract
We carried out whole genome resequencing of 127 chickens including red jungle fowl and multiple populations of commercial broilers and layers to perform a systematic screening of adaptive changes in modern chicken (Gallus gallus domesticus). We uncovered >21 million high quality SNPs of which 34% are newly detected variants. This panel comprises >115,000 predicted amino-acid altering substitutions as well as 1,100 SNPs predicted to be stop-gain or -loss, several of which reach high frequencies. Signatures of selection were investigated through analyses of both fixation and differentiation to reveal selective sweeps that may have had prominent roles during domestication and breed development. Contrasting wild and domestic chicken we confirmed selection at the BCO2 and TSHR loci and identified 34 putative sweeps co-localized with ALX1, KITLG, EPGR, IGF1, DLK1, JPT2, CRAMP1, and GLI3, among others. Analysis of enrichment between groups of wild vs. commercials and broilers vs. layers revealed a further panel of candidate genes including CORIN, SKIV2L2 implicated in pigmentation and LEPR, MEGF10 and SPEF2, suggestive of production-oriented selection. SNPs with marked allele frequency differences between wild and domestic chicken showed a highly significant deficiency in the proportion of amino-acid altering mutations (P < 2.5×10⁻⁶). The results contribute to the understanding of major genetic changes that took place during the evolution of modern chickens and in poultry breeding.
|
38
|
Abstract
Gene expression profiles potentially hold valuable information for the prediction of breeding values and phenotypes. In this study, the utility of transcriptome data for phenotype prediction was tested with 185 inbred lines of Drosophila melanogaster for nine traits in two sexes. We incorporated the transcriptome data into genomic prediction via two methods: GTBLUP and GRBLUP, both combining single nucleotide polymorphisms (SNPs) and transcriptome data. The genotypic data was used to construct the common additive genomic relationship, which was used in genomic best linear unbiased prediction (GBLUP) or jointly in a linear mixed model with a transcriptome-based linear kernel (GTBLUP), or with a transcriptome-based Gaussian kernel (GRBLUP). We studied the predictive ability of the models and discuss a concept of "omics-augmented broad sense heritability" for the multi-omics era. For most traits, GRBLUP and GBLUP provided similar predictive abilities, but GRBLUP explained more of the phenotypic variance. There was only one trait (olfactory perception to Ethyl Butyrate in females) in which the predictive ability of GRBLUP (0.23) was significantly higher than the predictive ability of GBLUP (0.21). Our results suggest that accounting for transcriptome data has the potential to improve genomic predictions if transcriptome data can be included on a larger scale.
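The three relationship matrices named in this abstract (an additive genomic relationship matrix, a transcriptome-based linear kernel and a transcriptome-based Gaussian kernel) can be sketched as follows. This is only an illustration under stated assumptions: the scaling choices, the `bandwidth` parameter, the function names and the random toy data are hypothetical and are not the authors' implementation.

```python
import numpy as np


def additive_grm(genotypes):
    """Additive genomic relationship matrix from 0/1/2 allele counts."""
    p = genotypes.mean(axis=0) / 2.0          # allele frequency per SNP
    centered = genotypes - 2.0 * p            # centre each SNP column
    return centered @ centered.T / (2.0 * np.sum(p * (1.0 - p)))


def standardize(expression):
    """Scale each transcript to mean 0 and variance 1 across lines."""
    return (expression - expression.mean(axis=0)) / expression.std(axis=0)


def linear_kernel(expression):
    """Linear kernel between lines, averaged over transcripts."""
    z = standardize(expression)
    return z @ z.T / z.shape[1]


def gaussian_kernel(expression, bandwidth=1.0):
    """Gaussian kernel on squared Euclidean distances between lines."""
    z = standardize(expression)
    sq_norms = np.sum(z ** 2, axis=1)
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (z @ z.T)
    sq_dist = np.maximum(sq_dist, 0.0)        # guard against rounding below zero
    return np.exp(-bandwidth * sq_dist / z.shape[1])


# Toy data (random, for illustration only): 6 inbred lines.
rng = np.random.default_rng(0)
genotypes = rng.choice([0.0, 2.0], size=(6, 50))   # homozygous calls only
expression = rng.normal(size=(6, 20))

G = additive_grm(genotypes)
T_lin = linear_kernel(expression)
T_rbf = gaussian_kernel(expression)
```

In a model of the kind the abstract describes, `G` would enter alone (GBLUP) or jointly with `T_lin` (GTBLUP) or `T_rbf` (GRBLUP) as covariance structures of random effects in a linear mixed model; fitting that model is outside the scope of this sketch.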
|
39
|
Analysis of porcine body size variation using re-sequencing data of miniature and large pigs. BMC Genomics 2018; 19:687. [PMID: 30231878 PMCID: PMC6146782 DOI: 10.1186/s12864-018-5009-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 03/26/2018] [Accepted: 08/14/2018] [Indexed: 12/30/2022]
Abstract
Background Domestication has led to substantial phenotypic and genetic variation in domestic animals. In pigs, the size of so-called minipigs differs by one order of magnitude compared to breeds of large body size. We used biallelic SNPs identified from re-sequencing data to compare various publicly available wild and domestic populations against two minipig breeds to gain better understanding of the genetic background of the extensive body size variation. We combined two complementary measures, expected heterozygosity and the composite likelihood ratio test implemented in “SweepFinder”, to identify signatures of selection in Minipigs. We intersected these sweep regions with a measure of differentiation, namely FST, to remove regions of low variation across pigs. An extraordinarily large sweep between 52 and 61 Mb on chromosome X was separately analyzed based on SNP-array data of F2 individuals from a cross of Goettingen Minipigs and large pigs. Results Selective sweep analysis identified putative sweep regions for growth, and subsequent gene annotation provided a comprehensive set of putative candidate genes. A long swept haplotype on chromosome X, descending from the Goettingen Minipig founders, was associated with a reduction of adult body length by 3% in F2 cross-breds. Conclusion The resulting set of genes in putative sweep regions implies that the genetic background of body size variation in pigs is polygenic rather than mono- or oligogenic. Identified genes suggest that alterations in metabolic functions and a possible insulin resistance contribute to miniaturization. A size QTL located within the sweep on chromosome X, with an estimated effect of 3% on body length, is comparable to the largest known in pigs or other species. The androgen receptor AR, previously known to influence pig performance and carcass traits, is the most obvious potential candidate gene within this region.
|
40
|
Special Issue: Quantitative and statistical genetics-papers in honour of Daniel Gianola. J Anim Breed Genet 2018; 134:173-174. [PMID: 28508484 DOI: 10.1111/jbg.12279] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
|
41
|
|
42
|
Genome-wide scan reveals population stratification and footprints of recent selection in Nelore cattle. Genet Sel Evol 2018; 50:22. [PMID: 29720080 PMCID: PMC5930444 DOI: 10.1186/s12711-018-0381-2] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 05/30/2017] [Accepted: 02/20/2018] [Indexed: 12/11/2022]
Abstract
Background This study aimed at (1) assessing the genomic stratification of experimental lines of Nelore cattle that have experienced different selection regimes for growth traits, and (2) identifying genomic regions that have undergone recent selection. We used a sample of 763 animals genotyped with the Illumina BovineHD BeadChip, among which 674 animals originated from two lines that are maintained under directional selection for increased yearling body weight and 89 animals from a control line that is maintained under stabilizing selection. Results Multidimensional analysis of the genomic dissimilarity matrix and admixture analysis revealed a substantial level of population stratification between the directional selection lines and the stabilizing selection control line. Two of the three tests used to detect selection signatures (FST, XP-EHH and iHS) revealed six candidate regions with indications of selection, which strongly indicates truly positive signals. The set of identified candidate genes included several genes with roles that are functionally related to growth metabolism, such as COL14A1, CPT1C, CRH, TBC1D1, and XKR4. Conclusions The current study identified genetic stratification that resulted from almost four decades of divergent selection in an experimental Nelore population, and highlighted autosomal genomic regions that present patterns of recent selection. Our findings provide a basis for a better understanding of the metabolic mechanism that underlies the growth traits, which are modified by selection for yearling body weight.
|
43
|
The effect of the H⁻¹ scaling factors τ and ω on the structure of H in the single-step procedure. Genet Sel Evol 2018; 50:16. [PMID: 29653506 PMCID: PMC5899415 DOI: 10.1186/s12711-018-0386-x] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 05/18/2017] [Accepted: 03/27/2018] [Indexed: 01/12/2023]
Abstract
Background The single-step covariance matrix $\mathbf{H}$ combines the pedigree-based relationship matrix $\mathbf{A}$ with the more accurate information on realized relatedness of genotyped individuals represented by the genomic relationship matrix $\mathbf{G}$. In particular, to improve convergence behavior of iterative approaches and to reduce inflation, two weights $\tau$ and $\omega$ have been introduced in the definition of $\mathbf{H}^{-1}$, which blend the inverse of a part of $\mathbf{A}$ with the inverse of $\mathbf{G}$. Since the definition of this blending is based on the equation describing $\mathbf{H}^{-1}$, its impact on the structure of $\mathbf{H}$ is not obvious. In a joint discussion, we considered the question of the shape of $\mathbf{H}$ for non-trivial $\tau$ and $\omega$. Results Here, we present the general matrix $\mathbf{H}$ as a function of these parameters and discuss its structure and properties. Moreover, we screen for optimal values of $\tau$ and $\omega$ with respect to predictive ability, inflation and iterations up to convergence on a well investigated, publicly available wheat data set. Conclusion Our results may help the reader to develop a better understanding for the effects of changes of $\tau$ and $\omega$ on the covariance model. In particular, we give theoretical arguments that as a general tendency, inflation will be reduced by increasing $\tau$ or by decreasing $\omega$.
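For orientation, the weighted inverse that this abstract refers to is commonly written as below. This is a sketch in the usual single-step notation rather than a quotation from the paper; the submatrix symbol $\mathbf{A}_{22}$ (the block of $\mathbf{A}$ belonging to the genotyped individuals, here ordered last) is an assumption of that notation and does not appear in the abstract.

```latex
\[
\mathbf{H}_{\tau,\omega}^{-1}
  = \mathbf{A}^{-1}
  + \begin{pmatrix}
      \mathbf{0} & \mathbf{0} \\
      \mathbf{0} & \tau\,\mathbf{G}^{-1} - \omega\,\mathbf{A}_{22}^{-1}
    \end{pmatrix}
\]
```

Setting $\tau = \omega = 1$ gives the unweighted single-step inverse, which is why the abstract speaks of non-trivial $\tau$ and $\omega$ for all other choices.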
|
44
|
Efficiency of different strategies to mitigate ascertainment bias when using SNP panels in diversity studies. BMC Genomics 2018; 19:22. [PMID: 29304727 PMCID: PMC5756397 DOI: 10.1186/s12864-017-4416-9] [Citation(s) in RCA: 63] [Impact Index Per Article: 10.5] [Received: 06/16/2017] [Accepted: 12/22/2017] [Indexed: 12/30/2022]
Abstract
Background Single nucleotide polymorphism (SNP) panels have been widely used to study genomic variations within and between populations. Methods of SNP discovery have been a matter of debate for their potential of introducing ascertainment bias, and genetic diversity results obtained from the SNP genotype data can be misleading. We used a total of 42 chicken populations where both individual genotyped array data and pool whole genome resequencing (WGS) data were available. We compared allele frequency distributions and genetic diversity measures (expected heterozygosity (He), fixation index (FST) values, genetic distances and principal components analysis (PCA)) between the two data types. With the array data, we applied different filtering options (SNPs polymorphic in samples of two Gallus gallus wild populations, linkage disequilibrium (LD) based pruning and minor allele frequency (MAF) filtering, and combinations thereof) to assess their potential to mitigate the ascertainment bias. Results Rare SNPs were underrepresented in the array data. Array data consistently overestimated He compared to WGS data, however, with a similar ranking of the breeds, as demonstrated by Spearman’s rank correlations ranging between 0.956 and 0.985. LD based pruning resulted in a reduced overestimation of He compared to the other filters and slightly improved the relationship with the WGS results. The raw array data and those with polymorphic SNPs in the wild samples underestimated pairwise FST values between breeds which had low FST (<0.15) in the WGS, and overestimated this parameter for high WGS FST (>0.15). LD based pruned data underestimated FST in a consistent manner. The genetic distance matrix from LD pruned data was more closely related to that of WGS than the other array versions. PCA was rather robust in all array versions, since the population structure on the PCA plot was generally well captured in comparison to the WGS data. 
Conclusions Among the tested filtering strategies, LD based pruning accounted for the effects of ascertainment bias most effectively, producing results that are most comparable to those obtained from WGS data, and is therefore recommended for practical use.
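The link between SNP selection and expected heterozygosity (He) discussed in this abstract can be illustrated with a minimal sketch. It assumes 0/1/2 allele counts, computes He per locus as 2p(1 - p) without any sample-size correction, and uses a minor allele frequency (MAF) filter as a stand-in for a panel that lacks rare SNPs; the function names, the threshold and the toy data are hypothetical and not taken from the study.

```python
import numpy as np


def allele_frequencies(genotypes):
    """Frequency of the counted allele per SNP from 0/1/2 genotype codes."""
    return genotypes.mean(axis=0) / 2.0


def expected_heterozygosity(genotypes):
    """Mean of 2p(1 - p) over loci (no small-sample correction)."""
    p = allele_frequencies(genotypes)
    return float(np.mean(2.0 * p * (1.0 - p)))


def maf_filter(genotypes, min_maf=0.05):
    """Keep only SNPs whose minor allele frequency reaches min_maf."""
    p = allele_frequencies(genotypes)
    maf = np.minimum(p, 1.0 - p)
    return genotypes[:, maf >= min_maf]


# Toy data (made up): 4 individuals x 4 SNPs, one fixed and one rare SNP.
toy_genotypes = np.array([
    [0, 1, 2, 0],
    [1, 1, 2, 0],
    [2, 1, 2, 0],
    [1, 1, 2, 1],
])

he_all = expected_heterozygosity(toy_genotypes)
he_common = expected_heterozygosity(maf_filter(toy_genotypes, min_maf=0.2))
print(he_all, he_common)
```

With these toy data He is 0.3046875 over all four SNPs and 0.5 once the fixed and the rare SNP are dropped, which mirrors the direction of the bias the abstract reports for array data in which rare SNPs are underrepresented.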
|
45
|
Phenotypic and genetic relationships between age at first calving, its component traits, and survival of heifers up to second calving. J Dairy Sci 2017; 101:425-432. [PMID: 29128222 DOI: 10.3168/jds.2017-12957] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 03/30/2017] [Accepted: 09/19/2017] [Indexed: 11/19/2022]
Abstract
The aim of this study was to answer the question whether models for genetic evaluations of longevity should include a correction for age at first calving (AFC). For this purpose, phenotypic and genetic relationships between AFC, its component traits age at first insemination (AFI) and interval from first to last insemination (FLI), and survival of different periods of the first lactation (S1: 0 to 49 d, S2: 50 to 249 d, S3: 250 d to second calving) were investigated. Data of 721,919 German Holstein heifers, being inseminated for the first time during the years from 2003 to 2012, were used for the analyses. Phenotypic correlations of AFI, FLI, and AFC to S1 to S3 were negative. Mean estimated heritabilities were 0.239 (AFI), 0.007 (FLI), and 0.103 (AFC) and 0.023 (S1), 0.016 (S2), and 0.028 (S3) on the observed scale. The genetic correlation between AFI and FLI was close to zero. Genetic correlations between AFI and the survival traits were -0.08 (S1), -0.02 (S2), and -0.10 (S3); those between FLI and the survival traits were -0.14 (S1), -0.20 (S2), and -0.44 (S3); and those between AFC and the survival traits were -0.09 (S1), -0.06 (S2), and -0.20 (S3). Some of these genetic correlations were different from zero, which suggests that correcting for AFC in genetic evaluations for longevity in dairy cows might remove functional genetic variance and should be reconsidered.
|
46
|
Liver transcriptome analysis reveals important factors involved in the metabolic adaptation of the transition cow. J Dairy Sci 2017; 100:9311-9323. [DOI: 10.3168/jds.2016-12454] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 12/14/2016] [Accepted: 07/20/2017] [Indexed: 11/19/2022]
|
47
|
Assessing the degree of stratification between closely related Holstein-Friesian populations. J Appl Genet 2017; 58:521-526. [PMID: 28986737 PMCID: PMC5655691 DOI: 10.1007/s13353-017-0409-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 04/17/2017] [Revised: 09/20/2017] [Accepted: 09/22/2017] [Indexed: 12/18/2022]
Abstract
Genomic information is an important part of the routine evaluation of dairy cattle and has made animals genotyped using single nucleotide polymorphism (SNP) microarrays widely available. We analyzed 2243 Polish and 2294 German Holstein-Friesian bulls genotyped using the Illumina BovineSNP50 BeadChip. For each bull, estimated breeding values (EBVs) calculated from national routine genetic evaluation were available for production traits and for somatic cell score (SCS). Separately for each population, we estimated SNP haplotypes, pairwise linkage disequilibrium (LD), and SNP effects. The SNP genetic covariance between both populations was estimated using a bivariate mixed model. The average LD was lower in the Polish than in the German population and, with increasing genomic distance, LD decays 1.7 times more rapidly in German than in Polish cattle. The comparison of SNP allele frequencies for base populations estimated separately using Polish and German data revealed a very good agreement. The comparison of genetic effects corresponding to various window lengths defined in bp revealed a systematic pattern: regardless of the length of the compared region, few significant differences were found for production traits, while many were observed for SCS. For each trait, the German population had much higher SNP variances than the Polish population and the genetic covariance estimates were all positive. Depending on traits’ inheritance mode, the additive genetic variation can be stored in many genes following the infinitesimal model (like for SCS) or distributed between genes with high effects and the polygenic “background” (like for production traits). Accounting for those differences has implications for the prospective international genomic evaluation.
|
48
|
Empirical comparison between different methods for genomic prediction of number of piglets born alive in moderate sized breeding populations. J Anim Sci 2017; 95:1434-1443. [PMID: 28464085 DOI: 10.2527/jas.2016.0991] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 11/13/2022]
Abstract
Currently used multi-step methods to incorporate genomic information in the prediction of breeding values (BV) implicitly involve many assumptions which, if violated, may result in loss of information, inaccuracies and bias. To overcome this, single-step genomic best linear unbiased prediction (ssGBLUP) was proposed combining pedigree, phenotype and genotype of all individuals for genetic evaluation. Our objective was to implement ssGBLUP for genomic predictions in pigs and to compare the accuracy of ssGBLUP with that of multi-step methods with empirical data of moderately sized pig breeding populations. Different predictions were performed: conventional parent average (PA), direct genomic value (DGV) calculated with genomic BLUP (GBLUP), a GEBV obtained by blending the DGV with PA, and ssGBLUP. Data comprised individuals from a German Landrace (LR) and Large White (LW) population. The trait 'number of piglets born alive' (NBA) was available for 182,054 litters of 41,090 LR sows and 15,750 litters from 4534 LW sows. The pedigree contained 174,021 animals, of which 147,461 (26,560) animals were LR (LW) animals. In total, 526 LR and 455 LW animals were genotyped with the Illumina PorcineSNP60 BeadChip. After quality control and imputation, 495 LR (424 LW) animals with 44,368 (43,678) SNP on 18 autosomes remained for the analysis. Predictive abilities, i.e., correlations between de-regressed proofs and genomic BV, were calculated with a five-fold cross validation and with a forward prediction for young genotyped validation animals born after 2011. Generally, predictive abilities for LR were rather small (0.08 for GBLUP, 0.19 for GEBV and 0.18 for ssGBLUP). For LW, ssGBLUP had the greatest predictive ability (0.45). For both breeds, assessment of reliabilities for young genotyped animals indicated that genomic prediction outperforms PA with ssGBLUP providing greater reliabilities (0.40 for LR and 0.32 for LW) than GEBV (0.35 for LR and 0.29 for LW). 
Grouping of animals according to information sources revealed that genomic prediction had the highest potential benefit for genotyped animals without their own phenotype. Although ssGBLUP did not generally outperform GBLUP or GEBV, the results suggest that ssGBLUP can be a useful and conceptually convincing approach for practical genomic prediction of NBA in moderately sized LR and LW populations.
|
49
|
Accuracy of genomic breeding values revisited: Assessment of two established approaches and a novel one to determine the accuracy in two-step genomic prediction. J Anim Breed Genet 2017; 134:242-255. [PMID: 28508487 DOI: 10.1111/jbg.12273] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/24/2016] [Accepted: 03/17/2017] [Indexed: 11/28/2022]
Abstract
Selection decisions in genomic selection schemes are made based on genomic breeding values (GBV) of candidates. Thus, the accuracy of GBV is a relevant parameter, as it reflects the stability of prediction and the possibility that the GBV might change when more information becomes available. Accuracy of genomic prediction defined as the correlation between GBV and true breeding values (TBV), however, is difficult to assess, considering TBV of the candidates are not available in reality. In previous studies, several methods were proposed to assess the accuracy of GBV including methods using population parameters or parameters inferred from mixed-model equations. In practice, most approaches tended to overestimate the accuracy of genomic prediction. We thus tested approaches used in previous studies in order to assess the magnitude of bias. Analyses were performed based on simulated data under a variety of scenarios mimicking different livestock breeding programmes. Furthermore, we proposed a novel method and tested it both with simulated data and in a real Holstein data set. The new method provided a better prediction for the accuracy of GBV in the simulated scenarios.
|
50
|
A reaction norm sire model to study the effect of metabolic challenge in early lactation on the functional longevity of dairy cows. J Dairy Sci 2017; 100:3742-3753. [DOI: 10.3168/jds.2016-12031] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 09/22/2016] [Accepted: 01/20/2017] [Indexed: 11/19/2022]
|