1. Boggio GM, Monteiro HF, Lima FS, Figueiredo CC, Bisinotto RS, Santos JEP, Mion B, Schenkel FS, Ribeiro ES, Weigel KA, Rosa GJM, Peñagaricano F. Investigating relationships between the host genome, rumen microbiome, and dairy cow feed efficiency using mediation analysis with structural equation modeling. J Dairy Sci 2024:S0022-0302(24)00938-X. [PMID: 38908714 DOI: 10.3168/jds.2024-24675] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Accepted: 05/21/2024] [Indexed: 06/24/2024]
Abstract
The rumen microbiome is crucial for converting feed into absorbable nutrients used for milk synthesis, and the efficiency of this process directly impacts the profitability and sustainability of the dairy industry. Recent studies have found that the rumen microbial composition explains part of the variation in feed efficiency traits, including dry matter intake, milk energy, and residual feed intake. The main goal of this study was to reveal relationships between the host genome, rumen microbiome, and dairy cow feed efficiency using structural equation models. Our specific objectives were to (i) infer the mediation effects of the rumen microbiome on feed efficiency traits, (ii) estimate the direct and total heritability of feed efficiency traits, and (iii) calculate the direct and total breeding values of feed efficiency traits. Data consisted of dry matter intake, milk energy, and residual feed intake records, SNP genotype data, and 16S rRNA rumen microbial abundances from 448 mid-lactation Holstein cows from 2 research farms. We implemented structural equation models such that the host genome directly affects the phenotype (GP → P) and the rumen microbiome (GM → M), while the microbiome affects the phenotype (M → P), partially mediating the effect of the host genome on the phenotype (G → M → P). We found that 7 to 30% of microbes within the rumen microbial community had structural coefficients different from zero. We classified these microbes into 3 groups that could have different uses in dairy farming. Microbes with heritability <0.10 but significant causal effects on feed efficiency are attractive for external interventions. On the other hand, 2 groups of microbes with heritability ≥0.10, significant causal effects, and genetic covariances and causal effects with the same or opposite sign to feed efficiency are attractive for selective breeding, improving or decreasing the trait heritability and response to selection, respectively.
In general, the inclusion of the different microbes in genomic models tends to decrease the trait heritability rather than increase it, ranging from -15% to +5%, depending on the microbial group and phenotypic trait. Our findings provide more understanding to target rumen microbes that can be manipulated, either through selection or management interventions, to improve feed efficiency traits.
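The mediation structure described in this abstract (a direct path G → P plus a path through the microbiome, G → M → P) can be illustrated with a minimal sketch. This is not the authors' genomic structural equation model: it assumes a single mediator, ordinary least squares, and simulated data, and the names `ols`, `mediation_effects`, `g`, `m`, and `p` are hypothetical.

```python
import numpy as np

def ols(y, X):
    """Least-squares coefficients of y on X, with the intercept first."""
    X1 = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(X1, y, rcond=None)[0]

def mediation_effects(g, m, p):
    """Split the effect of g on p into a direct part and a part mediated by m."""
    a = ols(m, g)[1]                          # G -> M
    coef = ols(p, np.column_stack([g, m]))
    direct, b = coef[1], coef[2]              # G -> P given M, and M -> P given G
    total = ols(p, g)[1]                      # G -> P when M is left out
    return {"direct": direct, "indirect": a * b, "total": total}

# Simulated data only; the path coefficients below are arbitrary assumptions.
rng = np.random.default_rng(0)
n = 500
g = rng.normal(size=n)                        # hypothetical host genomic score
m = 0.5 * g + rng.normal(size=n)              # hypothetical microbe abundance
p = 0.3 * g + 0.4 * m + rng.normal(size=n)    # hypothetical phenotype
effects = mediation_effects(g, m, p)
```

With least squares and one mediator, the total effect equals the direct effect plus the product of the two mediated path coefficients, which is the sense in which the microbiome "partially mediates" the host genomic effect.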
Affiliation(s)
- Hugo F Monteiro
- Department of Population Health and Reproduction, University of California, Davis 95616
- Fabio S Lima
- Department of Population Health and Reproduction, University of California, Davis 95616
- Caio C Figueiredo
- Department of Veterinary Clinical Sciences, Washington State University, Pullman 99163
- Rafael S Bisinotto
- Department of Large Animal Clinical Sciences, University of Florida, Gainesville 32610
- José E P Santos
- Department of Animal Sciences, University of Florida, Gainesville 32611
- Bruna Mion
- Department of Animal Biosciences, University of Guelph, Guelph N1G-2W1
- Flavio S Schenkel
- Department of Animal Biosciences, University of Guelph, Guelph N1G-2W1
- Eduardo S Ribeiro
- Department of Animal Biosciences, University of Guelph, Guelph N1G-2W1
- Kent A Weigel
- Department of Animal and Dairy Sciences, University of Wisconsin, Madison 53706
- Guilherme J M Rosa
- Department of Animal and Dairy Sciences, University of Wisconsin, Madison 53706
2. Momen M, Bhatta M, Hussain W, Yu H, Morota G. Modeling multiple phenotypes in wheat using data-driven genomic exploratory factor analysis and Bayesian network learning. Plant Direct 2021; 5:e00304. [PMID: 33532691 PMCID: PMC7833463 DOI: 10.1002/pld3.304] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/04/2020] [Revised: 12/03/2020] [Accepted: 12/16/2020] [Indexed: 06/12/2023]
Abstract
Inferring trait networks from a large volume of genetically correlated diverse phenotypes such as yield, architecture, and disease resistance can provide information on the manner in which complex phenotypes are interrelated. However, studies on statistical methods tailored to multidimensional phenotypes are limited, whereas numerous methods are available for evaluating the massive number of genetic markers. Factor analysis operates at the level of latent variables predicted to generate observed responses. The objectives of this study were to illustrate the manner in which data-driven exploratory factor analysis can map observed phenotypes into a smaller number of latent variables and infer a genomic latent factor network using 45 agro-morphological, disease, and grain mineral phenotypes measured in synthetic hexaploid wheat lines (Triticum aestivum L.). In total, eight latent factors including grain yield, architecture, flag leaf-related traits, grain minerals, yellow rust, two types of stem rust, and leaf rust were identified as common sources of the observed phenotypes. The genetic component of the factor scores for each latent variable was fed into a Bayesian network to obtain a trait structure reflecting the genetic interdependency among traits. Three directed paths were consistently identified by two Bayesian network algorithms. Flag leaf-related traits influenced leaf rust, and yellow rust and stem rust influenced grain yield. Additional paths that were identified included flag leaf-related traits to minerals and minerals to architecture. This study shows that data-driven exploratory factor analysis can reveal smaller dimensional common latent phenotypes that are likely to give rise to numerous observed field phenotypes without relying on prior biological knowledge. 
The inferred genomic latent factor structure from the Bayesian network provides insights for plant breeding to simultaneously improve multiple traits, as an intervention on one trait will affect the values of focal phenotypes in an interrelated complex trait system.
Affiliation(s)
- Mehdi Momen
- Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
- Madhav Bhatta
- Department of Agronomy, University of Wisconsin‐Madison, Madison, WI, USA
- Waseem Hussain
- International Rice Research Institute, Los Banos, Philippines
- Haipeng Yu
- Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
- Gota Morota
- Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
3. Tiezzi F, Fix J, Schwab C, Shull C, Maltecca C. Gut microbiome mediates host genomic effects on phenotypes: a case study with fat deposition in pigs. Comput Struct Biotechnol J 2020; 19:530-544. [PMID: 33510859 PMCID: PMC7809165 DOI: 10.1016/j.csbj.2020.12.038] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 09/10/2020] [Revised: 12/22/2020] [Accepted: 12/23/2020] [Indexed: 01/02/2023]
Abstract
A large number of studies have highlighted the importance of gut microbiome composition in shaping fat deposition in mammals. Several studies have also highlighted how the host genome controls the abundance of certain species that make up the gut microbiota. We propose a systematic approach to infer how the host genome can control the gut microbiome, which in turn contributes to the host phenotype determination. We implemented a mediation test that can be applied to measured and latent dependent variables to describe fat deposition in swine (Sus scrofa). In this study, we identify several host genomic features having microbiome-mediated effects on fat deposition. This demonstrates how the host genome can affect the phenotypic trait by inducing a change in gut microbiome composition that leads to a change in the phenotype. Host genomic variants identified through our analysis are different from the ones detected in a traditional genome-wide association study. In addition, the use of latent dependent variables allows for the discovery of additional host genomic features that do not show a significant effect on the measured variables. Microbiome-mediated host genomic effects can help understand the genetic determination of fat deposition. Since their contribution to the overall genetic variance is usually not included in association studies, they can contribute to filling the missing heritability gap and provide further insights into the host genome – gut microbiome interplay. Further studies should focus on the portability of these effects to other populations as well as their preservation when pro-/pre-/anti-biotics are used (i.e. remediation).
Key Words
- BEL, Weight of the belly cut
- BF1, Backfat depth measured in vivo at the age of 118.1±1.16 d
- BF2, Backfat depth measured in vivo at the age of 145.9±1.53 d
- BF3, Backfat depth measured in vivo at the age of 174.3±1.43 d
- BF4, Backfat depth measured in vivo at the age of 196.6±8.03 d
- BFt, Backfat measured post mortem (after slaughter at 196.6±8.03 d)
- Causal effect
- FATg, Latent variable built on BF1, BF2, and BF3
- FATt, Latent variable built on BF4, BFt, and BEL
- Fat deposition
- G, host genomic features, represented in this study by SNP
- Gut microbiome
- Latent variables
- M, gut microbiome features, represented in this study by OTU
- Mod1, Model 1, used to estimate the total effect of G on P. Reported in Fig. 1a
- Mod1L, Model 1L, used to estimate the total effect of G on Π
- Mod2, Model 2, used to estimate the effect of M on P. Reported in Fig. 1b
- Mod2L, Model 2L, used to estimate the effect of M on Π
- Mod3, Model 3, used to estimate the effect of G on M. Reported in Fig. S1
- Mod4, Model 4, used to estimate the direct and mediated effects of G on P. Reported in Fig. 1c
- Mod4L, Model 4L, used to estimate the direct and mediated effects of G on Π. Reported in Fig. 1d
- OTU, Operational Taxonomic Units
- P, Phenotype recorded on the host
- S2a, S2b, S3a, S3b, S3c, Gut microbiome OTU selected as mediator variables. See Table 2
- SEM, Structural equation model
- SNP, Single Nucleotide Polymorphism marker
- Π, Latent variable built on the P variables
Affiliation(s)
- Francesco Tiezzi
- Department of Animal Science, North Carolina State University, Raleigh, NC, USA
- Justin Fix
- Acuity Ag Solutions, LLC, Carlyle, IL 62230, USA
- Clint Schwab
- Acuity Ag Solutions, LLC, Carlyle, IL 62230, USA; The Maschhoffs, LLC, Carlyle, IL 62230, USA
- Christian Maltecca
- Department of Animal Science, North Carolina State University, Raleigh, NC, USA
4. Kruijer W, Behrouzi P, Bustos-Korts D, Rodríguez-Álvarez MX, Mahmoudi SM, Yandell B, Wit E, van Eeuwijk FA. Reconstruction of Networks with Direct and Indirect Genetic Effects. Genetics 2020; 214:781-807. [PMID: 32015018 PMCID: PMC7153926 DOI: 10.1534/genetics.119.302949] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 11/27/2019] [Accepted: 01/02/2020] [Indexed: 12/29/2022]
Abstract
Genetic variance of a phenotypic trait can originate from direct genetic effects, or from indirect effects, i.e., through genetic effects on other traits, affecting the trait of interest. This distinction is often of great importance, for example, when trying to improve crop yield and simultaneously control plant height. As suggested by Sewall Wright, assessing contributions of direct and indirect effects requires knowledge of (1) the presence or absence of direct genetic effects on each trait, and (2) the functional relationships between the traits. Because experimental validation of such relationships is often unfeasible, it is increasingly common to reconstruct them using causal inference methods. However, most current methods require all genetic variance to be explained by a small number of quantitative trait loci (QTL) with fixed effects. Only a few authors have considered the "missing heritability" case, where contributions of many undetectable QTL are modeled with random effects. Usually, these are treated as nuisance terms that need to be eliminated by taking residuals from a multi-trait mixed model (MTM). But fitting such an MTM is challenging, and it is impossible to infer the presence of direct genetic effects. Here, we propose an alternative strategy, where genetic effects are formally included in the graph. This has important advantages: (1) genetic effects can be directly incorporated in causal inference, implemented via our PCgen algorithm, which can analyze many more traits; and (2) we can test the existence of direct genetic effects, and improve the orientation of edges between traits. Finally, we show that reconstruction is much more accurate if individual plant or plot data are used, instead of genotypic means. We have implemented the PCgen-algorithm in the R-package pcgen.
Affiliation(s)
- Willem Kruijer
- Biometris, Wageningen University and Research, 6708 PB Wageningen, Netherlands
- Pariya Behrouzi
- Biometris, Wageningen University and Research, 6708 PB Wageningen, Netherlands
- María Xosé Rodríguez-Álvarez
- BCAM - Basque Center for Applied Mathematics, 48009 Bilbao, Spain
- IKERBASQUE, Basque Foundation for Science, 48013 Bilbao, Spain
- Seyed Mahdi Mahmoudi
- Faculty of Mathematics, Statistics and Computer Science, Semnan University, 35131-19111 Semnan, Iran
- Brian Yandell
- University of Wisconsin-Madison, Wisconsin 53706-1510
- Ernst Wit
- Università della Svizzera italiana, 6900 Lugano, Switzerland
- Fred A van Eeuwijk
- Biometris, Wageningen University and Research, 6708 PB Wageningen, Netherlands
5. Maltecca C, Bergamaschi M, Tiezzi F. The interaction between microbiome and pig efficiency: A review. J Anim Breed Genet 2019; 137:4-13. [PMID: 31576623 DOI: 10.1111/jbg.12443] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 07/19/2019] [Revised: 09/05/2019] [Accepted: 09/06/2019] [Indexed: 12/22/2022]
Abstract
The existence of genetic control over the abundance of particular taxa and the link of these to energy balance and growth has been documented in model organisms and humans as well as several livestock species. Preliminary evidence of the same mechanisms is currently under investigation in pigs. Future research should expand these results and elicit the extent of genetic control of the gut microbiome population in swine and its relationship with growth efficiency. The quest for a more efficient pig at the interface between the host and its metagenome rests on the central hypothesis that the gut microbiome is an essential component of the variability of growth in all living organisms. Swine do not escape this general rule, and the identification of the significance of the interaction between host and its gut microbiota in the growth process could be a game-changer in the achievement of sustainable and efficient lean meat production. Standard sampling protocols, sequencing techniques, bioinformatic pipelines and methods of analysis will be paramount for the portability of results across experiments and populations. Likewise, characterizing and accounting for temporal and spatial variability will be a necessary step if microbiome is to be utilized routinely as an aid to selection.
Affiliation(s)
- Christian Maltecca
- Department of Animal Science, North Carolina State University, Raleigh, NC, USA
- Matteo Bergamaschi
- Department of Animal Science, North Carolina State University, Raleigh, NC, USA
- Francesco Tiezzi
- Department of Animal Science, North Carolina State University, Raleigh, NC, USA
6. Genomic Bayesian Confirmatory Factor Analysis and Bayesian Network To Characterize a Wide Spectrum of Rice Phenotypes. G3: Genes, Genomes, Genetics 2019; 9:1975-1986. [PMID: 30992319 PMCID: PMC6553530 DOI: 10.1534/g3.119.400154] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 11/18/2022]
Abstract
With the advent of high-throughput phenotyping platforms, plant breeders have a means to assess many traits for large breeding populations. However, understanding the genetic interdependencies among high-dimensional traits in a statistically robust manner remains a major challenge. Since multiple phenotypes likely share mutual relationships, elucidating the interdependencies among economically important traits can better inform breeding decisions and accelerate the genetic improvement of plants. The objective of this study was to leverage confirmatory factor analysis and graphical modeling to elucidate the genetic interdependencies among diverse agronomic traits in rice. We used a Bayesian network to depict conditional dependencies among phenotypes, which cannot be obtained by standard multi-trait analysis. We utilized Bayesian confirmatory factor analysis, which hypothesized that 48 observed phenotypes resulted from six latent variables including grain morphology, morphology, flowering time, physiology, yield, and morphological salt response. This was followed by studying the genetics of each latent variable, also known as a factor, using single nucleotide polymorphisms. Bayesian network structures involving the genomic component of six latent variables were established by fitting four algorithms (i.e., Hill Climbing, Tabu, Max-Min Hill Climbing, and General 2-Phase Restricted Maximization algorithms). Physiological components influenced the flowering time and grain morphology, and morphology and grain morphology influenced yield. In summary, we show that the Bayesian network coupled with factor analysis can provide an effective approach to understand the interdependence patterns among phenotypes and to predict the potential influence of external interventions or selection related to target traits in interrelated complex trait systems.
7. Bello NM, Ferreira VC, Gianola D, Rosa GJM. Conceptual framework for investigating causal effects from observational data in livestock. J Anim Sci 2018; 96:4045-4062. [PMID: 30107524 DOI: 10.1093/jas/sky277] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 05/25/2018] [Accepted: 07/03/2018] [Indexed: 01/07/2023]
Abstract
Understanding causal mechanisms among variables is critical to efficient management of complex biological systems such as animal agriculture production. The increasing availability of data from commercial livestock operations offers unique opportunities for attaining causal insight, despite the inherently observational nature of these data. Causal claims based on observational data are substantiated by recent theoretical and methodological developments in the rapidly evolving field of causal inference. Thus, the objectives of this review are as follows: 1) to introduce a unifying conceptual framework for investigating causal effects from observational data in livestock, 2) to illustrate its implementation in the context of the animal sciences, and 3) to discuss opportunities and challenges associated with this framework. Foundational to the proposed conceptual framework are graphical objects known as directed acyclic graphs (DAGs). As mathematical constructs and practical tools, DAGs encode putative structural mechanisms underlying causal models together with their probabilistic implications. The process of DAG elicitation and causal identification is central to any causal claims based on observational data. We further discuss necessary causal assumptions and associated limitations to causal inference. Last, we provide practical recommendations to facilitate implementation of causal inference from observational data in the context of the animal sciences.
Affiliation(s)
- Nora M Bello
- Department of Animal Sciences, University of Wisconsin-Madison, Madison, WI; Department of Statistics, Kansas State University, Manhattan, KS; Center for Outcomes Research and Epidemiology, Kansas State University, Manhattan, KS
- Vera C Ferreira
- Department of Animal Sciences, University of Wisconsin-Madison, Madison, WI
- Daniel Gianola
- Department of Animal Sciences, University of Wisconsin-Madison, Madison, WI; Department of Dairy Science, University of Wisconsin-Madison, Madison, WI; Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI
- Guilherme J M Rosa
- Department of Animal Sciences, University of Wisconsin-Madison, Madison, WI; Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI
8. Hu P, Jiao R, Jin L, Xiong M. Application of Causal Inference to Genomic Analysis: Advances in Methodology. Front Genet 2018; 9:238. [PMID: 30042787 PMCID: PMC6048229 DOI: 10.3389/fgene.2018.00238] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 03/15/2018] [Accepted: 06/14/2018] [Indexed: 11/20/2022]
Abstract
The current paradigm of genomic studies of complex diseases is association and correlation analysis. Despite significant progress in dissecting the genetic architecture of complex diseases by genome-wide association studies (GWAS), the identified genetic variants by GWAS can only explain a small proportion of the heritability of complex diseases. A large fraction of genetic variants is still hidden. Association analysis has limited power to unravel mechanisms of complex diseases. It is time to shift the paradigm of genomic analysis from association analysis to causal inference. Causal inference is an essential component for the discovery of mechanism of diseases. This paper will review the major platforms of the genomic analysis in the past and discuss the perspectives of causal inference as a general framework of genomic analysis. In genomic data analysis, we usually consider four types of associations: association of discrete variables (DNA variation) with continuous variables (phenotypes and gene expressions), association of continuous variables (expressions, methylations, and imaging signals) with continuous variables (gene expressions, imaging signals, phenotypes, and physiological traits), association of discrete variables (DNA variation) with binary trait (disease status) and association of continuous variables (gene expressions, methylations, phenotypes, and imaging signals) with binary trait (disease status). In this paper, we will review algorithmic information theory as a general framework for causal discovery and the recent development of statistical methods for causal inference on discrete data, and discuss the possibility of extending the association analysis of discrete variable with disease to the causal analysis for discrete variable and disease.
Affiliation(s)
- Pengfei Hu
- Ministry of Education Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, China
- Rong Jiao
- Department of Biostatistics and Data Science, School of Public Health, University of Texas Health Science Center at Houston, Houston, TX, United States
- Li Jin
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center for Genetics and Development, School of Life Sciences, Fudan University, Shanghai, China
- Human Phenome Institute, Fudan University, Shanghai, China
- Momiao Xiong
- Department of Biostatistics and Data Science, School of Public Health, University of Texas Health Science Center at Houston, Houston, TX, United States
9. Töpner K, Rosa GJM, Gianola D, Schön CC. Bayesian Networks Illustrate Genomic and Residual Trait Connections in Maize (Zea mays L.). G3 (Bethesda) 2017; 7:2779-2789. [PMID: 28637811 PMCID: PMC5555481 DOI: 10.1534/g3.117.044263] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 03/17/2017] [Accepted: 06/15/2017] [Indexed: 12/12/2022]
Abstract
Relationships among traits were investigated on the genomic and residual levels using novel methodology. This included inference on these relationships via Bayesian networks and an assessment of the networks with structural equation models. The methodology employed three steps. First, a Bayesian multiple-trait Gaussian model was fitted to the data to decompose phenotypic values into their genomic and residual components. Second, genomic and residual network structures among traits were learned from estimates of these two components. Network learning was performed using six different algorithmic settings for comparison, of which two were score-based and four were constraint-based approaches. Third, structural equation model analyses ranked the networks in terms of goodness of fit and predictive ability, and compared them with the standard multiple-trait fully recursive network. The methodology was applied to experimental data representing the European heterotic maize pools Dent and Flint (Zea mays L.). Inferences on genomic and residual trait connections were depicted separately as directed acyclic graphs. These graphs provide information beyond mere pairwise genetic or residual associations between traits, illustrating for example conditional independencies and hinting at potential causal links among traits. Network analysis suggested some genetic correlations as potentially spurious. Genomic and residual networks were compared between Dent and Flint.
Affiliation(s)
- Katrin Töpner
- Plant Breeding, TUM School of Life Sciences Weihenstephan, Technical University of Munich, 85354 Freising, Germany
- Institute for Advanced Study, Technical University of Munich, 85748 Garching, Germany
- Guilherme J M Rosa
- Institute for Advanced Study, Technical University of Munich, 85748 Garching, Germany
- Department of Animal Sciences, University of Wisconsin-Madison, Wisconsin 53706
- Daniel Gianola
- Institute for Advanced Study, Technical University of Munich, 85748 Garching, Germany
- Department of Animal Sciences, University of Wisconsin-Madison, Wisconsin 53706
- Chris-Carolin Schön
- Plant Breeding, TUM School of Life Sciences Weihenstephan, Technical University of Munich, 85354 Freising, Germany
- Institute for Advanced Study, Technical University of Munich, 85748 Garching, Germany
10. Ferreira VC, Thomas DL, Valente BD, Rosa GJM. Causal effect of prolificacy on milk yield in dairy sheep using propensity score. J Dairy Sci 2017; 100:8443-8450. [PMID: 28780093 DOI: 10.3168/jds.2017-12907] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/21/2017] [Accepted: 05/30/2017] [Indexed: 11/19/2022]
Abstract
In animal production, it is often important to investigate causal relationships among variables. The gold standard tool for such investigation is randomized experiments. However, randomized experiments may not always be feasible, possible, or cost effective or reflect real-world farm conditions. Sometimes it is necessary to infer effects from farm-recorded data. Inferring causal effects between variables from field data is challenging because the association between them may arise not only from the effect of one on another but also from confounding background factors. Propensity score (PS) methods address this issue by correcting for confounding in different levels of the causal variable, which allows unbiased inference of causal effects. Here the objective was to estimate the causal effect of prolificacy on milk yield (MY) in dairy sheep using PS based on matched samples. Data consisted of 4,319 records from 1,534 crossbred ewes. Confounders were lactation number (first, second, and third through sixth) and dairy breed composition (<0.5, 0.5-0.75, and >0.75 of East Friesian or Lacaune). The causal variable prolificacy was considered as 2 levels (single or multiple lambs at birth). The outcome MY represented the volume of milk produced in the whole lactation. Pairs of single- and multiple-birth ewes (1,166) with similar PS were formed. The matching process diminished major discrepancies in the distribution of prolificacy for each confounder variable indicating bias reduction (cutoff standardized bias = 20%). The causal effect was estimated as the average difference within pairs. The effect of prolificacy on MY per lactation was 20.52 L of milk with a simple matching estimator and 12.62 L after correcting for remaining biases. A core advantage of causal over probabilistic approaches is that they allow inference of how variables would react as a result of external interventions (e.g., changes in the production system). 
Therefore, results imply that management and decision-making practices increasing prolificacy would positively affect MY, which is important knowledge at the farm level. Farm-recorded data can be a valuable source of information given its low cost, and it reflects real-world herd conditions. In this context, PS methods can be extremely useful as an inference tool for investigating causal effects. In addition, PS analysis can be implemented as a preliminary evaluation or a hypothesis generator for future randomized trials (if the trait analyzed allows randomization).
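The matching estimator described in this abstract (pair units with similar propensity scores, then average the within-pair outcome differences) can be sketched as follows. This is an illustrative simplification, not the authors' procedure: it assumes the propensity score is the share of treated units within each confounder stratum, uses greedy 1:1 nearest-score matching with a caliper, and omits the bias correction mentioned in the abstract. The function name `matched_effect`, the `caliper` parameter, and the toy data are hypothetical.

```python
import numpy as np

def matched_effect(treated, outcome, strata, caliper=0.05):
    """Average within-pair outcome difference after 1:1 propensity score matching."""
    treated = np.asarray(treated, dtype=bool)
    outcome = np.asarray(outcome, dtype=float)
    strata = np.asarray(strata)
    # Propensity score: share of treated units in each confounder stratum.
    ps = np.empty(len(treated))
    for s in np.unique(strata):
        in_s = strata == s
        ps[in_s] = treated[in_s].mean()
    controls = list(np.flatnonzero(~treated))
    diffs = []
    for i in np.flatnonzero(treated):
        if not controls:
            break
        gaps = np.abs(ps[controls] - ps[i])
        k = int(np.argmin(gaps))              # closest unused control
        if gaps[k] <= caliper:
            diffs.append(outcome[i] - outcome[controls.pop(k)])
    return float(np.mean(diffs)), len(diffs)

# Toy data only: two strata with different shares of treated units.
strata = ["a"] * 4 + ["b"] * 4
treated = [1, 1, 0, 0, 1, 0, 0, 0]
outcome = [12, 14, 10, 10, 25, 20, 20, 20]
effect, n_pairs = matched_effect(treated, outcome, strata)
```

In the toy data the unmatched difference in group means is 1.0, while the matched estimate is larger, because treated units are concentrated in the stratum with the lower baseline outcome; matching compares units only within similar propensity scores.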
Affiliation(s)
- Vera C Ferreira
- Department of Animal Sciences, University of Wisconsin, Madison 53706
- David L Thomas
- Department of Animal Sciences, University of Wisconsin, Madison 53706
- Bruno D Valente
- Department of Animal Sciences, University of Wisconsin, Madison 53706; Department of Dairy Science, University of Wisconsin, Madison 53706
- Guilherme J M Rosa
- Department of Animal Sciences, University of Wisconsin, Madison 53706; Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison 53706
11. Xavier A, Hall B, Hearst AA, Cherkauer KA, Rainey KM. Genetic Architecture of Phenomic-Enabled Canopy Coverage in Glycine max. Genetics 2017; 206:1081-1089. [PMID: 28363978 PMCID: PMC5499164 DOI: 10.1534/genetics.116.198713] [Citation(s) in RCA: 57] [Impact Index Per Article: 8.1] [Received: 11/28/2016] [Accepted: 03/03/2017] [Indexed: 12/25/2022]
Abstract
Digital imagery can help to quantify seasonal changes in desirable crop phenotypes that can be treated as quantitative traits. Because limitations in precise and functional phenotyping restrain genetic improvement in the postgenomic era, imagery-based phenomics could become the next breakthrough to accelerate genetic gains in field crops. Whereas many phenomic studies focus on exploratory analysis of spectral data without obvious interpretative value, we used field images to directly measure soybean canopy development from phenological stage V2 to R5. Over 3 years, we collected imagery using ground and aerial platforms of a large and diverse nested association panel comprising 5555 lines. Genome-wide association analysis of canopy coverage across sampling dates detected a large quantitative trait locus (QTL) on soybean (Glycine max L. Merr.) chromosome 19. This QTL provided an increase in yield of 47.3 kg ha⁻¹. Variance component analysis indicated that a parameter, described as average canopy coverage, is a highly heritable trait (h² = 0.77) with a promising genetic correlation with grain yield (0.87), enabling indirect selection of yield via canopy development parameters. Our findings indicate that fast canopy coverage is an early season trait that is inexpensive to measure and has great potential for application in breeding programs focused on yield improvement. We recommend using the average canopy coverage in multiple trait schemes, especially for the early stages of the breeding pipeline (including progeny rows and preliminary yield trials), in which the large number of field plots makes collection of grain yield data challenging.
Affiliation(s)
- Alencar Xavier
- Department of Agronomy, Purdue University, West Lafayette, Indiana 47907
- Benjamin Hall
- Department of Agronomy, Purdue University, West Lafayette, Indiana 47907
- Anthony A Hearst
- Department of Agriculture and Biological Engineering, Purdue University, West Lafayette, Indiana 47907
- Keith A Cherkauer
- Department of Agriculture and Biological Engineering, Purdue University, West Lafayette, Indiana 47907
- Katy M Rainey
- Department of Agronomy, Purdue University, West Lafayette, Indiana 47907
12
Xavier A, Muir WM, Rainey KM. Assessing Predictive Properties of Genome-Wide Selection in Soybeans. G3 (Bethesda, Md.) 2016; 6:2611-6. [PMID: 27317786 PMCID: PMC4978914 DOI: 10.1534/g3.116.032268] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.8] [Received: 03/04/2016] [Accepted: 06/16/2016] [Indexed: 11/30/2022]
Abstract
Many economically important traits in plant breeding have low heritability or are difficult to measure. For these traits, genomic selection has attractive features and may boost genetic gains. Our goal was to evaluate alternative scenarios to implement genomic selection for yield components in soybean (Glycine max L. Merr.). We used a nested association panel with cross validation to evaluate the impacts of training population size, genotyping density, and prediction model on the accuracy of genomic prediction. Our results indicate that training population size was the factor most relevant to improvement in genome-wide prediction, with greatest improvement observed in training sets up to 2000 individuals. We discuss assumptions that influence the choice of the prediction model. Although alternative models had minor impacts on prediction accuracy, the most robust prediction model was the combination of reproducing kernel Hilbert space regression and BayesB. Higher genotyping density marginally improved accuracy. Our study finds that breeding programs seeking efficient genomic selection in soybeans would best allocate resources by investing in a representative training set.
Affiliation(s)
- Alencar Xavier
- Department of Agronomy, Purdue University, West Lafayette, Indiana 47907
- William M Muir
- Department of Animal Science, Purdue University, West Lafayette, Indiana 47907
- Katy Martin Rainey
- Department of Agronomy, Purdue University, West Lafayette, Indiana 47907